LAYAL CAFE ($2.80 instead of $42.90):
- Add (?!\s*tax) lookahead to _TOTAL_RE so "Total Taxes $2.80" is never
confused with the receipt total when OCR drops the "Taxes" word
- Change Pass 1 from matches[-1] to max() so the largest labeled amount
always wins, regardless of line order in the OCR output
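
A minimal sketch of the two changes above, for illustration only. The pattern below is a hypothetical stand-in (the real _TOTAL_RE is not shown in this log), and pick_total is an assumed helper name:

```python
import re

# Hypothetical stand-in for _TOTAL_RE; the project's actual pattern may differ.
# The (?!\s*tax) lookahead rejects "Total Tax ..." and "Total Taxes ...".
_TOTAL_RE = re.compile(
    r"\btotal\b(?!\s*tax)[^\d\n]*\$?\s*(\d+(?:,\d{3})*\.\d{2})",
    re.IGNORECASE,
)


def pick_total(ocr_text):
    """Return the largest labeled total in the OCR text, or None if none match."""
    amounts = [float(m.replace(",", "")) for m in _TOTAL_RE.findall(ocr_text)]
    # max() instead of amounts[-1]: line order in the OCR output no longer matters.
    return max(amounts) if amounts else None
```

With max(), a stray "Total $2.80" line that appears after the real total no longer overrides it.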
United Airlines (vendor read as Subway, amount $0, wrong date):
- Add OSD-based rotation correction in receipt_parser.py: after EXIF
transpose, ask Tesseract's orientation-detection engine (--psm 0) what
angle to rotate by. This covers receipts photographed lying sideways,
where EXIF metadata cannot help
- Add month-name date patterns (DD MON YYYY / MON DD YYYY) to
_extract_date_from_text for airline/hotel receipts that print dates
like "05 MAY 2026" instead of "05/07/26"
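
The month-name patterns could look roughly like the sketch below. The regexes and the helper name extract_month_name_date are assumptions for illustration; the actual patterns in _extract_date_from_text are not shown in this log:

```python
import re
from datetime import date

_MONTHS = {
    name: number
    for number, name in enumerate(
        ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
         "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"],
        start=1,
    )
}
_MON = "|".join(_MONTHS)

# DD MON YYYY, e.g. "05 MAY 2026"; full month names are tolerated via [A-Z]*.
_DD_MON_YYYY = re.compile(
    rf"\b(\d{{1,2}})\s+({_MON})[A-Z]*\.?,?\s+(\d{{4}})\b", re.IGNORECASE
)
# MON DD YYYY, e.g. "May 7, 2026".
_MON_DD_YYYY = re.compile(
    rf"\b({_MON})[A-Z]*\.?\s+(\d{{1,2}}),?\s+(\d{{4}})\b", re.IGNORECASE
)


def extract_month_name_date(text):
    """Return the first month-name date as YYYY-MM-DD, or None if absent/invalid."""
    m = _DD_MON_YYYY.search(text)
    if m:
        day, mon, year = m.group(1), m.group(2), m.group(3)
    else:
        m = _MON_DD_YYYY.search(text)
        if not m:
            return None
        mon, day, year = m.group(1), m.group(2), m.group(3)
    try:
        return date(int(year), _MONTHS[mon.upper()], int(day)).isoformat()
    except ValueError:  # e.g. "31 FEB 2026"
        return None
```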
85 tests, all passing.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The llama3.2-vision model was producing unreliable structured data
(wrong vendors, amounts, dates), making expense reports worse than
with Tesseract + LLM extraction. Removes _ocr_image_vision(), the
vision JSON fast path in _parse_receipt_text(), _match_category(),
and the vision_ocr_model config setting entirely.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three receipts per batch were failing with JSONDecodeError (e.g.
"Expecting ':' delimiter: line 1 column 90") because activeblue-chat
(llama3.2-vision) occasionally outputs near-JSON with trailing commas,
single-quoted strings, or unquoted keys.
Two-layer fix:
1. Add format='json' to the Ollama chat call — Ollama JSON mode forces
syntactically valid output at the sampler level, eliminating most
structural errors.
2. Add _repair_json() fallback that runs on any remaining JSONDecodeError:
strips trailing commas, converts single→double quotes, and quotes
unquoted keys. If repair succeeds, the result is re-serialised as
canonical JSON before being returned.
Also re-serialise with json.dumps() on success so the fast path in
_parse_receipt_text always receives clean, canonical JSON regardless of
whitespace or key ordering in the model's original output.
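
A rough sketch of the repair fallback, assuming simple regex rewrites; the real _repair_json() may differ. The name repair_json is hypothetical, and the quote rewrite is deliberately naive (it would mangle apostrophes inside already double-quoted strings):

```python
import json
import re


def repair_json(raw):
    """Return canonical JSON text for near-JSON input; raise JSONDecodeError if unrepairable."""
    try:
        # Success path: re-serialise so callers always get canonical JSON.
        return json.dumps(json.loads(raw))
    except json.JSONDecodeError:
        pass
    # 1. Strip trailing commas before a closing brace or bracket.
    fixed = re.sub(r",\s*([}\]])", r"\1", raw)
    # 2. Convert single-quoted strings to double-quoted ones (naive).
    fixed = re.sub(r"'([^'\\]*)'", r'"\1"', fixed)
    # 3. Quote bare identifiers used as keys.
    fixed = re.sub(r"([{,]\s*)([A-Za-z_]\w*)(\s*:)", r'\1"\2"\3', fixed)
    return json.dumps(json.loads(fixed))
```

If the second json.loads() still fails, the exception propagates so the caller can fall back to its own error handling.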
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two sources of hallucinated values in receipt parsing:
1. The LLM extraction prompt had no explicit "don't guess" constraint, so
when Tesseract produced garbled OCR text the LLM substituted plausible-
looking values (wrong vendor names, wrong totals) instead of returning
safe defaults.
2. The date field asked the LLM to extract the date from the OCR text even
when date_hint (from the filename timestamp, e.g. 20260509_180857.jpg)
was already available — a reliable signal that was being ignored.
expenses_agent._parse_receipt_text:
- LLM path: new prompt leads with "copy values EXACTLY, do NOT guess or
infer"; adds "if OCR looks corrupted, return safe default rather than
a more logical value"; injects date_hint directly as an authoritative
value when available so the LLM never needs to extract the date.
- Vision fast path: normalise "null" string for date the same way as time;
prefer date_hint over a null date returned by the vision model.
receipt_parser._ocr_image_vision:
- Vision prompt now leads with the same "copy exactly, do not guess"
constraint and explicitly accepts null for date/time when not clearly
visible, matching the conservative tone of the LLM extraction prompt.
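
The date handling on the vision fast path can be sketched as follows. resolve_date is a hypothetical helper name; the actual normalisation inside _parse_receipt_text is not shown in this log:

```python
def resolve_date(model_date, date_hint=None):
    """Prefer a usable model date; otherwise fall back to the filename hint."""
    if isinstance(model_date, str):
        model_date = model_date.strip()
        # The model sometimes returns the literal string "null" instead of JSON null.
        if model_date.lower() == "null":
            model_date = None
    # An empty string or None counts as "no date", so date_hint wins.
    return model_date or date_hint
```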
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
receipt_parser: change _ocr_image_vision() to extract structured JSON
{vendor,amount,date,time,category} directly from the image instead of
transcribing raw text, so the downstream LLM extraction step is
unnecessary and the two-step error-compounding is eliminated.
expenses_agent: add _match_category() helper to map vision category
labels to expense product names via substring/fuzzy match; add fast
path in _parse_receipt_text() that detects pre-extracted vision JSON
(text starts with '{') and skips the second LLM submit call entirely.
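
One plausible shape for the substring/fuzzy category match, using difflib from the standard library. The function name, the 0.6 cutoff, and the product names are assumptions for illustration, not the project's actual values:

```python
import difflib


def match_category(label, product_names, cutoff=0.6):
    """Map a free-form category label to one of product_names, or None."""
    needle = (label or "").strip().lower()
    if not needle:
        return None
    # Pass 1: case-insensitive substring match in either direction.
    for name in product_names:
        low = name.lower()
        if needle in low or low in needle:
            return name
    # Pass 2: fuzzy match to tolerate small misspellings.
    lowered = {name.lower(): name for name in product_names}
    close = difflib.get_close_matches(needle, list(lowered), n=1, cutoff=cutoff)
    return lowered[close[0]] if close else None
```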
Fix text[:2000] truncation that discarded receipt totals — now keeps
first 1500 + last 1500 chars of long receipts so the grand total at
the bottom is always included.
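
The head-plus-tail truncation amounts to something like this sketch. The helper name and the "..." separator are assumptions; only the 1500/1500 split comes from the change described above:

```python
def clip_receipt_text(text, head=1500, tail=1500):
    """Keep the first `head` and last `tail` characters of long OCR text."""
    if len(text) <= head + tail:
        return text
    # Slicing from len(text) - tail keeps the grand total at the bottom
    # of a long receipt in the text sent to the LLM.
    return text[:head] + "\n...\n" + text[len(text) - tail:]
```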
tests: fix stale test_act_enters_awaiting_confirmation_on_first_pass
(confirmation gate was removed); add TestMatchCategory and three new
tests for the vision JSON fast path and LLM fallthrough.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Introduces the VISION_OCR_MODEL setting. When set (e.g. llama3.2-vision:11b),
receipt images are transcribed by the Ollama vision model first, with
Tesseract as the fallback. Also improves Tesseract preprocessing with
adaptive binarisation (autocontrast + threshold at 140) for better
accuracy on thermal receipts.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Dockerfile: add tesseract-ocr-osd for orientation detection data
- receipt_parser: resize large phone photos to 1800px, convert to
grayscale, sharpen before OCR; use psm 1 (auto + OSD) so rotated
receipts are correctly oriented before text extraction
- expenses_agent: tighten amount extraction prompt to pick the FINAL
total, not subtotal or tax line, reducing misreads like 42.90->409.00
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Dockerfile: install tesseract-ocr so Pillow+pytesseract can OCR receipt images
- operational_store: JSON-serialize raw_data before passing to asyncpg JSONB
- receipt_parser: add SHA256 hash + date extracted from filename timestamps
- expenses_agent: deduplicate receipts by hash before creating expense records
- expenses_agent: fetch all expensable Odoo products, pass list to LLM for
category selection (Meals, Flights, etc.) per receipt
- expenses_agent: pass date_hint from filename (e.g. 20260509_180857.jpg -> 2026-05-09)
as fallback when OCR text is unavailable
- expenses_tools: add get_expense_products() to fetch all expensable products
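
The filename date hint and hash-based deduplication can be sketched as below. Both function names and the (filename, bytes) input shape are assumptions for illustration; the real receipt_parser and expenses_agent code is not shown in this log:

```python
import hashlib
import re
from datetime import datetime

# Matches camera-style timestamps such as 20260509_180857 in a filename.
_FILENAME_TS = re.compile(r"(?<!\d)(\d{8})_(\d{6})(?!\d)")


def date_hint_from_filename(filename):
    """Return YYYY-MM-DD parsed from a filename timestamp, or None."""
    m = _FILENAME_TS.search(filename)
    if not m:
        return None
    try:
        stamp = datetime.strptime(m.group(1) + m.group(2), "%Y%m%d%H%M%S")
    except ValueError:  # digits that do not form a real date/time
        return None
    return stamp.date().isoformat()


def dedupe_by_hash(receipts):
    """Keep the first (filename, sha256) per distinct file content."""
    seen = set()
    unique = []
    for filename, data in receipts:
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen:
            continue  # same bytes uploaded twice, possibly under another name
        seen.add(digest)
        unique.append((filename, digest))
    return unique
```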
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Discuss bot now reads ir.attachment from incoming messages; file-only
messages no longer silently dropped
- ZIP files are described (contents listed) and bot asks clarifying
question before acting; user's follow-up reply looks back for pending
attachments so files don't need to be re-uploaded
- receipt_parser: extracts text from ZIP (recursive), JPG/PNG/etc (OCR),
PDF (pdfplumber), HTML, TXT
- expenses_agent: full rewrite fixing broken method signatures; adds
create_expense_sheet / create_expense / attach_receipt flow driven by
LLM receipt parsing (Ollama, HIPAA-locked)
- master_agent: extra_context threads receipts + user_id into directives
- FastAPI /upload multipart endpoint; registered in main.py
- Odoo /ai/upload controller proxies files to agent service
- ab_ai_bot: dispatch_message_with_files() for multipart uploads
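
For the recursive ZIP handling in receipt_parser, a minimal sketch using only the standard library might look like this. The function name, the path-joining scheme, and the max_depth guard are assumptions, not the actual implementation:

```python
import io
import zipfile


def iter_zip_members(data, prefix="", max_depth=3):
    """Yield (path, bytes) for each file in a ZIP, descending into nested ZIPs."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            payload = zf.read(info)
            path = prefix + info.filename
            is_nested = (
                max_depth > 0
                and info.filename.lower().endswith(".zip")
                and zipfile.is_zipfile(io.BytesIO(payload))
            )
            if is_nested:
                # Recurse, using the nested archive's name as a path prefix.
                yield from iter_zip_members(payload, path + "/", max_depth - 1)
            else:
                yield path, payload
```

Each yielded payload could then be routed by extension to OCR, pdfplumber, or plain-text decoding. The depth limit is a precaution against deeply nested archives.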
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>