Gas station receipts (Costco, Shell, etc.) print "Total Sale $X.XX" — the
word "Sale" between "Total" and the amount prevented _TOTAL_RE from matching,
causing the Costco receipt to fall through to the max-scan heuristic and
return a garbled OCR value instead of the correct total.
Also add "Net Sale" and "Sale Total" variants for broader coverage.
79 tests, all passing.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add _is_likely_bank_statement(): if OCR text has ≥10 lines with dollar
amounts it is almost certainly a bank/card statement screenshot, not a
single receipt. Return skip=True so _act() skips it and adds a note to
the escalations list instead of creating a $1,699 expense line.
- Fix default product selection in _act(): prefer "Meals" over whatever
happens to be first in Odoo's expense product list ("Communication"),
so unrecognised receipts get a sensible fallback category.
- Improve LLM category prompt: remove hardcoded product names (airline →
Transport) that don't exist in every Odoo install; describe business
types semantically so the model picks from the actual available list.
- Mention skipped statements in the final summary message.
- 77 tests, all passing.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove card brands (VISA/MC/Amex) from _SKIP_LINE_RE so card-terminal
lines like "VISA USD$ 36.78" are no longer skipped
- Replace bottom-50% scan with full-text max scan (Pass 2): scans every
line in the receipt and returns the largest dollar amount, correctly
handling display-style receipts that show the charge at the top with
no label (e.g. LAYAL CAFE $40.10 before the item list)
- Update vendor LLM prompt to ask the model to correct OCR garbling
(e.g. "NeDonald's" → "McDonald's") and detect bank statements
- Add 4 new tests covering top-amount, card-terminal, max-beats-items,
and change-exclusion scenarios (71 tests, all passing)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Receipt quality: replace LLM amount/date extraction with regex.
LLM was hallucinating 2021/2022 dates and returning '198.40 USD' strings.
Amounts now use deterministic regex (Total:/Grand Total:/Amount Due:).
Dates: filename timestamp > OCR regex > today (no LLM date guessing).
LLM only asked for vendor name + product category.
Approval: fix GET /approval/pending 500 by using correct column
name 'started_at' instead of 'created_at' (which does not exist).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The llama3.2-vision model was producing unreliable structured data
(wrong vendors, amounts, dates) making expense reports worse than
Tesseract + LLM extraction. Removes _ocr_image_vision(), the
vision JSON fast path in _parse_receipt_text(), _match_category(),
and the vision_ocr_model config setting entirely.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
receipt_parser: change _ocr_image_vision() to extract structured JSON
{vendor,amount,date,time,category} directly from the image instead of
transcribing raw text, so the downstream LLM extraction step is
unnecessary and the two-step error-compounding is eliminated.
expenses_agent: add _match_category() helper to map vision category
labels to expense product names via substring/fuzzy match; add fast
path in _parse_receipt_text() that detects pre-extracted vision JSON
(text starts with '{') and skips the second LLM submit call entirely.
Fix text[:2000] truncation that discarded receipt totals — now keeps
first 1500 + last 1500 chars of long receipts so the grand total at
the bottom is always included.
tests: fix stale test_act_enters_awaiting_confirmation_on_first_pass
(confirmation gate was removed); add TestMatchCategory and three new
tests for the vision JSON fast path and LLM fallthrough.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds module-level label and cross-reference to the new doc.
TEST_EXPENSES_AGENT.md documents every test group, case, and the
real-world bug each test guards against (e.g. In-N-Out OCR mismatch).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
'Restaurant A' vs 'Restaurant Z' differ by 1 char so difflib scores
them at ~91% -- correctly above the 80% threshold. Use clearly
different vendors (Starbucks Coffee vs McDonalds Burger) instead.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>