8 Commits

Author SHA1 Message Date
Carlos Garcia
ec6b41943f fix: vision OCR JSON failures — add format='json' and repair fallback
Three receipts per batch were failing with JSONDecodeError (e.g.
"Expecting ':' delimiter: line 1 column 90") because activeblue-chat
(llama3.2-vision) occasionally outputs near-JSON with trailing commas,
single-quoted strings, or unquoted keys.

Two-layer fix:
1. Add format='json' to the Ollama chat call — Ollama JSON mode forces
   syntactically valid output at the sampler level, eliminating most
   structural errors.
2. Add _repair_json() fallback that runs on any remaining JSONDecodeError:
   strips trailing commas, converts single→double quotes, and quotes
   unquoted keys. If repair succeeds, the result is re-serialised as
   canonical JSON before being returned.

Also re-serialise with json.dumps() on success so the fast path in
_parse_receipt_text always receives clean, canonical JSON regardless of
whitespace or key ordering in the model's original output.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 22:24:50 -04:00
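The repair fallback described in ec6b41943f could be sketched roughly as below. This is a minimal illustration and not the project's actual `_repair_json()`: the name `repair_json` and the three regular expressions are assumptions, and the quote conversion is deliberately naive (it assumes single-quoted strings contain no embedded quotes).

```python
import json
import re


def repair_json(text: str) -> str:
    """Return canonical JSON for near-JSON input; raise ValueError if repair fails."""
    try:
        # Fast path: already valid, just re-serialise to a canonical form.
        return json.dumps(json.loads(text))
    except json.JSONDecodeError:
        pass

    fixed = text.strip()
    # 1. Drop trailing commas that sit right before a closing brace or bracket.
    fixed = re.sub(r",\s*([}\]])", r"\1", fixed)
    # 2. Convert single-quoted strings to double-quoted ones (naive: no embedded quotes).
    fixed = re.sub(r"'([^'\"]*)'", r'"\1"', fixed)
    # 3. Quote bare identifiers that appear in key position.
    fixed = re.sub(r'([{,]\s*)([A-Za-z_][A-Za-z0-9_]*)(\s*:)', r'\1"\2"\3', fixed)
    # json.JSONDecodeError is a subclass of ValueError, so callers can catch either.
    return json.dumps(json.loads(fixed))
```

Because the regular expressions do not understand string boundaries, a value that itself contains `, word:` could be altered, which is one reason to keep this step as a last resort behind `format='json'`.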
Carlos Garcia
9fa391c720 fix: reduce hallucination in receipt extraction — conservative prompts + date injection
Two sources of hallucinated values in receipt parsing:

1. The LLM extraction prompt had no explicit "don't guess" constraint, so
   when Tesseract produced garbled OCR text the LLM substituted plausible-
   looking values (wrong vendor names, wrong totals) instead of returning
   safe defaults.

2. The date field asked the LLM to extract the date from the OCR text even
   when date_hint (from the filename timestamp, e.g. 20260509_180857.jpg)
   was already available — a reliable signal that was being ignored.

expenses_agent._parse_receipt_text:
- LLM path: new prompt leads with "copy values EXACTLY, do NOT guess or
  infer"; adds "if OCR looks corrupted, return safe default rather than
  a more logical value"; injects date_hint directly as an authoritative
  value when available so the LLM never needs to extract the date.
- Vision fast path: normalise "null" string for date the same way as time;
  prefer date_hint over a null date returned by the vision model.

receipt_parser._ocr_image_vision:
- Vision prompt now leads with the same "copy exactly, do not guess"
  constraint and explicitly accepts null for date/time when not clearly
  visible, matching the conservative tone of the LLM extraction prompt.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 22:19:20 -04:00
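The date handling from 9fa391c720 can be illustrated with a small sketch. All function names and the prompt wording here are hypothetical; only the behaviour (treat the string "null" as missing, prefer `date_hint` over a null model date, inject `date_hint` into the prompt when available) comes from the commit message.

```python
from typing import Optional


def normalise_null(value: Optional[str]) -> Optional[str]:
    """Treat None, empty strings and the literal string "null" as missing."""
    if value is None:
        return None
    cleaned = value.strip()
    if cleaned == "" or cleaned.lower() == "null":
        return None
    return cleaned


def resolve_vision_date(model_date: Optional[str], date_hint: Optional[str]) -> Optional[str]:
    """Vision fast path: keep a real model date, otherwise fall back to the filename hint."""
    return normalise_null(model_date) or date_hint


def build_extraction_prompt(ocr_text: str, date_hint: Optional[str]) -> str:
    """LLM path: lead with the 'do not guess' rule and inject the date when it is known."""
    lines = [
        "Copy values EXACTLY as they appear; do NOT guess or infer.",
        "If the OCR text looks corrupted, return the safe default for that field.",
    ]
    if date_hint:
        # Assumed wording: the hint is presented as authoritative.
        lines.append(f"The receipt date is {date_hint}; use it as given.")
    else:
        lines.append("Extract the date only if clearly legible, otherwise use null.")
    lines.append("OCR text:")
    lines.append(ocr_text)
    return "\n".join(lines)
```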
Carlos Garcia
11cc261923 fix: vision OCR receipt extraction — skip second LLM call, fix total truncation
receipt_parser: change _ocr_image_vision() to extract structured JSON
{vendor,amount,date,time,category} directly from the image instead of
transcribing raw text, so the downstream LLM extraction step is
unnecessary and the two-step error-compounding is eliminated.

expenses_agent: add _match_category() helper to map vision category
labels to expense product names via substring/fuzzy match; add fast
path in _parse_receipt_text() that detects pre-extracted vision JSON
(text starts with '{') and skips the second LLM submit call entirely.
Fix text[:2000] truncation that discarded receipt totals — now keeps
first 1500 + last 1500 chars of long receipts so the grand total at
the bottom is always included.

tests: fix stale test_act_enters_awaiting_confirmation_on_first_pass
(confirmation gate was removed); add TestMatchCategory and three new
tests for the vision JSON fast path and LLM fallthrough.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 21:49:31 -04:00
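Three ideas from 11cc261923 lend themselves to a short sketch: the head-plus-tail truncation, the substring/fuzzy category match, and the fast-path check for pre-extracted JSON. The helper names, the `"\n...\n"` separator, and the 0.6 similarity cutoff below are assumptions for illustration, not the values used in the repository.

```python
import difflib
import json
from typing import Optional, Sequence


def keep_head_and_tail(text: str, head: int = 1500, tail: int = 1500) -> str:
    """Keep the start and end of long text so a total at the bottom survives."""
    if len(text) <= head + tail:
        return text
    return text[:head] + "\n...\n" + text[-tail:]


def match_category(label: str, product_names: Sequence[str]) -> Optional[str]:
    """Map a free-form category label to a product name: substring first, then fuzzy."""
    wanted = label.strip().lower()
    if not wanted:
        return None
    for name in product_names:
        lowered = name.lower()
        if lowered and (wanted in lowered or lowered in wanted):
            return name
    lowered_names = [name.lower() for name in product_names]
    close = difflib.get_close_matches(wanted, lowered_names, n=1, cutoff=0.6)
    if close:
        return product_names[lowered_names.index(close[0])]
    return None


def try_vision_fast_path(text: str) -> Optional[dict]:
    """Return the parsed dict when text is pre-extracted JSON, else None (LLM fallthrough)."""
    if not text.lstrip().startswith("{"):
        return None
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None
```

Returning `None` from the fast path, rather than raising, keeps the fallthrough to the LLM extraction step a plain conditional at the call site.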
Carlos Garcia
b76d01b64f Fix vision OCR response parsing for dict-returning ollama client versions
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-17 11:59:11 -04:00
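b76d01b64f has no body, but the kind of fix it names (handling client versions that return a plain dict as well as ones that return an object) might look like the sketch below. The helper name `extract_message_content` and the assumed response shape (`message` containing `content`) are illustrative, not taken from the repository.

```python
from types import SimpleNamespace
from typing import Any


def extract_message_content(response: Any) -> str:
    """Read message.content from either a dict-style or an attribute-style response."""
    if isinstance(response, dict):
        message = response.get("message") or {}
    else:
        message = getattr(response, "message", None)

    if isinstance(message, dict):
        content = message.get("content")
    else:
        content = getattr(message, "content", None)
    return content or ""


# Both shapes yield the same text.
dict_style = {"message": {"content": "TOTAL 42.90"}}
object_style = SimpleNamespace(message=SimpleNamespace(content="TOTAL 42.90"))
```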
Carlos Garcia
5b924e60de Add vision OCR via Ollama vision model with Tesseract fallback
Introduces VISION_OCR_MODEL setting. When set (e.g. llama3.2-vision:11b),
receipt images are transcribed by the Ollama vision model before falling
back to Tesseract. Also improves Tesseract preprocessing with adaptive
binarisation (autocontrast + threshold at 140) for better accuracy on
thermal receipts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-16 18:43:21 -04:00
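The vision-first, Tesseract-fallback order introduced in 5b924e60de can be sketched with the two OCR back ends passed in as callables. The function name `ocr_image` and its signature are hypothetical; passing `vision_ocr=None` stands in for an unset VISION_OCR_MODEL.

```python
from typing import Callable, Optional

OcrFn = Callable[[bytes], str]


def ocr_image(image_bytes: bytes, tesseract_ocr: OcrFn,
              vision_ocr: Optional[OcrFn] = None) -> str:
    """Try the vision model first when configured; fall back to Tesseract otherwise."""
    if vision_ocr is not None:
        try:
            text = vision_ocr(image_bytes)
            if text and text.strip():
                return text
        except Exception:
            # Any vision failure (timeout, model missing, bad response) triggers the fallback.
            pass
    return tesseract_ocr(image_bytes)
```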
Carlos Garcia
c2d1078d79 fix: improve OCR accuracy for rotated/sideways receipt photos
- Dockerfile: add tesseract-ocr-osd for orientation detection data
- receipt_parser: resize large phone photos to 1800px, convert to
  grayscale, sharpen before OCR; use psm 1 (auto + OSD) so rotated
  receipts are correctly oriented before text extraction
- expenses_agent: tighten amount extraction prompt to pick the FINAL
  total, not subtotal or tax line, reducing misreads like 42.90->409.00

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-16 01:51:29 -04:00
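The resize step in c2d1078d79 comes down to a size calculation, sketched here without an imaging library. It assumes "1800px" refers to the longest side and that smaller images are left alone; the commit message does not say either explicitly, and `fit_within` is a hypothetical name.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_side: int = 1800) -> Tuple[int, int]:
    """Scale (width, height) so the longest side is at most max_side, keeping aspect ratio."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # never upscale
    return (
        max(1, round(width * max_side / longest)),
        max(1, round(height * max_side / longest)),
    )
```

The resulting tuple is the kind of value an image library's resize call would take before the grayscale and sharpen steps.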
Carlos Garcia
ef6dad5a81 feat: OCR via tesseract, dedup, category selection for expense receipts
- Dockerfile: install tesseract-ocr so Pillow+pytesseract can OCR receipt images
- operational_store: JSON-serialize raw_data before passing to asyncpg JSONB
- receipt_parser: add SHA256 hash + date extracted from filename timestamps
- expenses_agent: deduplicate receipts by hash before creating expense records
- expenses_agent: fetch all expensable Odoo products, pass list to LLM for
  category selection (Meals, Flights, etc.) per receipt
- expenses_agent: pass date_hint from filename (e.g. 20260509_180857.jpg -> 2026-05-09)
  as fallback when OCR text is unavailable
- expenses_tools: add get_expense_products() to fetch all expensable products

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-16 01:40:32 -04:00
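The hashing, deduplication, and filename date hint from ef6dad5a81 can be sketched with the standard library alone. The function names and the `YYYYMMDD_HHMMSS` pattern are assumptions based on the example filename in the commit message.

```python
import hashlib
import re
from datetime import datetime
from typing import Iterable, List, Optional, Tuple

# Matches camera-style names such as 20260509_180857.jpg.
_TIMESTAMP = re.compile(r"(\d{8})_(\d{6})")


def date_hint_from_filename(filename: str) -> Optional[str]:
    """Return an ISO date parsed from a timestamped filename, or None."""
    match = _TIMESTAMP.search(filename)
    if not match:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d").date().isoformat()
    except ValueError:
        return None  # eight digits that are not a real calendar date


def dedupe_by_hash(files: Iterable[Tuple[str, bytes]]) -> List[Tuple[str, str]]:
    """Keep the first file for each SHA-256 digest; return (name, digest) pairs."""
    seen = set()
    unique = []
    for name, data in files:
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        unique.append((name, digest))
    return unique
```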
Carlos Garcia
4b7223a139 feat: file upload + expense report creation from Discuss attachments
- Discuss bot now reads ir.attachment from incoming messages; file-only
  messages are no longer silently dropped
- ZIP files are described (contents listed) and the bot asks a clarifying
  question before acting; the user's follow-up reply looks back for pending
  attachments so files don't need to be re-uploaded
- receipt_parser: extracts text from ZIP (recursive), JPG/PNG/etc (OCR),
  PDF (pdfplumber), HTML, TXT
- expenses_agent: full rewrite fixing broken method signatures; adds
  create_expense_sheet / create_expense / attach_receipt flow driven by
  LLM receipt parsing (Ollama, HIPAA-locked)
- master_agent: extra_context threads receipts + user_id into directives
- FastAPI /upload multipart endpoint; registered in main.py
- Odoo /ai/upload controller proxies files to agent service
- ab_ai_bot: dispatch_message_with_files() for multipart uploads

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-16 01:02:24 -04:00
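The "contents listed" step for ZIP uploads in 4b7223a139, including nested archives, might be sketched as follows. `list_zip_contents`, the `outer.zip/inner-file` path convention, and the depth limit of 3 are assumptions for illustration.

```python
import io
import zipfile
from typing import List


def list_zip_contents(data: bytes, prefix: str = "", max_depth: int = 3) -> List[str]:
    """List file paths inside a ZIP, descending into nested ZIPs up to max_depth."""
    names: List[str] = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            path = prefix + info.filename
            if info.filename.lower().endswith(".zip") and max_depth > 0:
                try:
                    nested = list_zip_contents(archive.read(info), path + "/", max_depth - 1)
                except zipfile.BadZipFile:
                    names.append(path)  # not a real archive, list it as a plain file
                else:
                    names.extend(nested)
            else:
                names.append(path)
    return names
```

The depth limit is a guard against archives that nest many levels deep.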