Add vision OCR via Ollama vision model with Tesseract fallback
Introduces the VISION_OCR_MODEL setting. When it is set (e.g. llama3.2-vision:11b), receipt images are transcribed by the Ollama vision model first, with Tesseract as the fallback. Also improves Tesseract preprocessing with adaptive binarisation (autocontrast + threshold at 140) for better accuracy on thermal receipts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
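The preprocessing named in the commit message (autocontrast followed by a fixed threshold at 140) can be illustrated with a minimal pure-Python sketch. It works on a flat list of 8-bit grayscale values rather than a real image, and the names `autocontrast` and `binarise` are hypothetical. The commit's actual preprocessing code is not shown here and may use an imaging library instead; treating values strictly above 140 as white is also an assumption.

```python
from typing import List

THRESHOLD = 140  # cut-off value named in the commit message


def autocontrast(pixels: List[int]) -> List[int]:
    """Linearly stretch grayscale values so the minimum maps to 0 and the maximum to 255."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # Flat input: there is no range to stretch, so return an unchanged copy.
        return list(pixels)
    scale = 255 / (hi - lo)
    return [round((p - lo) * scale) for p in pixels]


def binarise(pixels: List[int], threshold: int = THRESHOLD) -> List[int]:
    """Stretch contrast, then map each pixel to black (0) or white (255).

    Assumption: values strictly above the threshold become white.
    """
    return [255 if p > threshold else 0 for p in autocontrast(pixels)]


# A faded thermal print: text around 60, background around 130.
# Every raw value is below 140, so thresholding alone would turn it all black.
faded = [60, 65, 130, 125, 62, 128]
print(binarise(faded))  # [0, 0, 255, 255, 0, 255]
```

Stretching the contrast first is what lets a single fixed threshold separate text from background on low-contrast prints such as this one.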
@@ -16,6 +16,10 @@ class Settings(BaseSettings):
     ollama_model: str = 'activeblue-chat'
     ollama_timeout: int = 120
     ollama_max_concurrent: int = 2
+    # Set to a vision-capable model (e.g. llama3.2-vision:11b) to use
+    # vision OCR for receipt images instead of Tesseract. Leave empty
+    # to keep the Tesseract pipeline.
+    vision_ocr_model: str = ''

     # Anthropic / Claude
     anthropic_api_key: str = ''
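As a rough sketch of how the new setting might gate the OCR path: when `vision_ocr_model` is non-empty the vision model is tried first, and Tesseract is used otherwise. The function `transcribe_receipt` and the two callables passed to it are hypothetical stand-ins, not names from this commit, since the real Ollama and Tesseract calls are not part of this hunk. Falling back on an exception or on empty output is likewise an assumption.

```python
from typing import Callable


def transcribe_receipt(
    image: bytes,
    vision_ocr_model: str,
    vision_ocr: Callable[[bytes, str], str],
    tesseract_ocr: Callable[[bytes], str],
) -> str:
    """Return receipt text, preferring the vision model when one is configured."""
    if vision_ocr_model:
        try:
            text = vision_ocr(image, vision_ocr_model)
            if text.strip():
                return text
            # Assumption: blank output counts as a failure and triggers the fallback.
        except Exception:
            # Assumption: any vision-model error falls through to Tesseract.
            pass
    # Empty setting, error, or blank output: use the Tesseract pipeline.
    return tesseract_ocr(image)


# Usage with stub backends (hypothetical; real ones would call Ollama and Tesseract).
def fake_vision(image: bytes, model: str) -> str:
    return f"[{model}] TOTAL 12.50"


def fake_tesseract(image: bytes) -> str:
    return "TOTAL 12.50"


print(transcribe_receipt(b"...", "", fake_vision, fake_tesseract))
print(transcribe_receipt(b"...", "llama3.2-vision:11b", fake_vision, fake_tesseract))
```

Passing the backends in as callables keeps the selection logic testable without a running Ollama server or a Tesseract install.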