odoo-ai/requirements.txt
Carlos Garcia 69519393c1 Add EasyOCR engine for receipt image parsing
EasyOCR (deep-learning OCR) replaces Tesseract as the default engine for
receipt images. It handles phone photos, thermal paper, dot-matrix fonts,
and rotated images significantly better than Tesseract without requiring
manual preprocessing pipelines.

Key design decisions:
- OCR_ENGINE=easyocr (default) | tesseract — switchable via .env, no rebuild
- EasyOCR Reader is a module-level singleton: model loaded once per container
  start, not per receipt
- Falls back to Tesseract automatically if EasyOCR fails or returns < 20 chars
- EXIF rotation fix still applied before EasyOCR (phone photo orientation)
- Images resized to max 2000px width for speed before passing to EasyOCR
- _easyocr_to_text() groups detections into visual lines (y-overlap) and
  sorts left-to-right within each line for clean single-string output
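
The line-grouping step described above can be sketched roughly as follows. This is not the actual _easyocr_to_text() from the commit: the function name, the 0.5 overlap threshold, and the newline join are assumptions made for illustration. The input shape follows EasyOCR's readtext() output, a list of (bbox, text, confidence) where bbox is four [x, y] corner points.

```python
from typing import List, Sequence, Tuple

# One EasyOCR detection: (four [x, y] corner points, text, confidence).
Detection = Tuple[Sequence[Sequence[float]], str, float]


def easyocr_to_text(detections: List[Detection], min_overlap: float = 0.5) -> str:
    """Group detections into visual lines by y-overlap, then sort each line by x.

    Hypothetical sketch; min_overlap is an assumed threshold, not taken
    from the commit.
    """
    boxes = []
    for bbox, text, _conf in detections:
        xs = [point[0] for point in bbox]
        ys = [point[1] for point in bbox]
        boxes.append((min(ys), max(ys), min(xs), text))

    # Walk the boxes top to bottom, ordered by vertical centre.
    boxes.sort(key=lambda b: (b[0] + b[1]) / 2)

    lines = []  # each entry: [top, bottom, [(left, text), ...]]
    for top, bottom, left, text in boxes:
        if lines:
            line = lines[-1]
            overlap = min(bottom, line[1]) - max(top, line[0])
            smaller = min(bottom - top, line[1] - line[0])
            if smaller > 0 and overlap / smaller >= min_overlap:
                # Enough vertical overlap: same visual line.
                line[0] = min(line[0], top)
                line[1] = max(line[1], bottom)
                line[2].append((left, text))
                continue
        lines.append([top, bottom, [(left, text)]])

    # Sort left-to-right within each line, one output line per visual line.
    return "\n".join(
        " ".join(t for _, t in sorted(items, key=lambda item: item[0]))
        for _, _, items in lines
    )
```

Comparing against the smaller of the two box heights keeps a short token (a price, say) on the same line as a taller neighbour even when their boxes are slightly offset vertically.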

Revert: echo "OCR_ENGINE=tesseract" >> .env && docker compose up -d agent-service
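
For reference, the engine switch, singleton Reader, and Tesseract fallback listed in the design decisions might look something like the sketch below. The names get_reader and run_ocr, the ["en"] language list, gpu=False, and stripping whitespace before the 20-character check are all assumptions; the real module may differ.

```python
import os
from functools import lru_cache
from typing import Any, Callable, Optional

MIN_CHARS = 20  # fallback threshold named in the commit message


@lru_cache(maxsize=1)
def get_reader() -> Any:
    """Build the EasyOCR Reader once per process (hypothetical helper)."""
    import easyocr  # imported lazily so Tesseract-only setups do not load torch

    # Language list and gpu=False are assumptions for this sketch.
    return easyocr.Reader(["en"], gpu=False)


def run_ocr(
    image: Any,
    easyocr_fn: Callable[[Any], str],
    tesseract_fn: Callable[[Any], str],
    engine: Optional[str] = None,
) -> str:
    """Run the configured engine and fall back to Tesseract when needed."""
    engine = (engine or os.getenv("OCR_ENGINE", "easyocr")).strip().lower()
    if engine == "easyocr":
        try:
            text = easyocr_fn(image)
            if len(text.strip()) >= MIN_CHARS:
                return text
        except Exception:
            # Any EasyOCR failure falls through to Tesseract.
            pass
    return tesseract_fn(image)
```

Passing the two engines in as callables keeps the fallback logic testable without loading the EasyOCR model.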

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 01:22:22 -04:00

fastapi==0.115.0
uvicorn[standard]==0.30.6
pydantic==2.9.2
pydantic-settings==2.5.2
asyncpg==0.29.0
anthropic==0.34.2
httpx==0.27.2
alembic==1.13.3
sqlalchemy[asyncio]==2.0.35
json-log-formatter==0.5.2
python-dotenv==1.0.1
mcp==1.3.0
ollama==0.3.3
# Receipt parsing. The Tesseract engine (fallback for image OCR) also requires: apt install tesseract-ocr
pdfplumber==0.11.4
Pillow==10.4.0
pytesseract==0.3.13
# EasyOCR: deep-learning OCR, better on phone photos and difficult fonts.
# Set OCR_ENGINE=tesseract in .env to use Tesseract instead.
# Note: pulls in torch (~1.5GB), so make sure the image has the disk space.
# Unpinned, unlike the other entries; pin a version for reproducible builds.
easyocr
python-multipart==0.0.12
docker==7.1.0