fix: reduce hallucination in receipt extraction — conservative prompts + date injection

Two sources of hallucinated values in receipt parsing:

1. The LLM extraction prompt had no explicit "don't guess" constraint, so
   when Tesseract produced garbled OCR text the LLM substituted plausible-
   looking values (wrong vendor names, wrong totals) instead of returning
   safe defaults.

2. The prompt asked the LLM to extract the date from the OCR text even
   when date_hint (from the filename timestamp, e.g. 20260509_180857.jpg)
   was already available, so a reliable signal was being ignored.
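
For illustration only, a minimal sketch of how a date hint could be derived
from a filename such as 20260509_180857.jpg. The helper name
date_hint_from_filename and the regex are assumptions; the commit does not
show how date_hint is actually computed.

```python
import re
from datetime import datetime
from typing import Optional

# Hypothetical pattern: a YYYYMMDD_HHMMSS stamp anywhere in the filename.
_STAMP = re.compile(r"(\d{8})_(\d{6})")


def date_hint_from_filename(filename: str) -> Optional[str]:
    """Return YYYY-MM-DD if the filename carries a valid stamp, else None."""
    match = _STAMP.search(filename)
    if match is None:
        return None
    try:
        # Only the date part is needed; strptime rejects impossible dates.
        stamp = datetime.strptime(match.group(1), "%Y%m%d")
    except ValueError:
        return None
    return stamp.date().isoformat()
```

Returning None rather than a guessed date keeps the hint consistent with
the "don't guess" rule applied to the prompts.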

expenses_agent._parse_receipt_text:
- LLM path: new prompt leads with "copy values EXACTLY, do NOT guess or
  infer"; adds "if OCR looks corrupted, return safe default rather than
  a more logical value"; injects date_hint directly as an authoritative
  value when available so the LLM never needs to extract the date.
- Vision fast path: normalise "null" string for date the same way as time;
  prefer date_hint over a null date returned by the vision model.
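
A rough sketch of the vision fast-path behaviour described above: treat a
"null" string as a real null, then fall back to date_hint. The function
names and the set of null-like strings are assumptions, not the code in
this commit.

```python
from typing import Optional

# Assumed set of strings a model may return in place of a JSON null.
_NULL_STRINGS = {"", "null", "none", "n/a"}


def normalise_null(value: object) -> Optional[str]:
    """Map None and null-like strings to None; otherwise return stripped text."""
    if value is None:
        return None
    text = str(value).strip()
    return None if text.lower() in _NULL_STRINGS else text


def resolve_date(model_date: object, date_hint: Optional[str]) -> Optional[str]:
    """Use the model's date when present; fall back to date_hint when it is null."""
    cleaned = normalise_null(model_date)
    return cleaned if cleaned is not None else date_hint
```

For example, resolve_date("null", "2026-05-09") yields the hint, while a
concrete model date is passed through unchanged.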

receipt_parser._ocr_image_vision:
- Vision prompt now leads with the same "copy exactly, do not guess"
  constraint and explicitly accepts null for date/time when not clearly
  visible, matching the conservative tone of the LLM extraction prompt.
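
One possible way to consume the JSON object this prompt requests while
keeping the same conservative defaults (empty vendor, amount 0, null
date/time). parse_vision_reply is a hypothetical helper, and the "other"
fallback for category is an assumption; neither appears in the diff.

```python
import json
from typing import Any, Dict

# Category values listed in the vision prompt.
ALLOWED_CATEGORIES = {"meals", "fuel", "hotel", "office", "transport", "other"}


def parse_vision_reply(raw: str) -> Dict[str, Any]:
    """Parse the model reply; any missing or malformed field keeps its safe default."""
    result: Dict[str, Any] = {
        "vendor": "", "amount": 0.0, "date": None, "time": None,
        "category": "other",  # assumed fallback, not stated in the prompt
    }
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return result
    if not isinstance(data, dict):
        return result

    vendor = data.get("vendor")
    if isinstance(vendor, str):
        result["vendor"] = vendor.strip()

    amount = data.get("amount")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(amount, (int, float)) and not isinstance(amount, bool) and amount >= 0:
        result["amount"] = float(amount)

    for key in ("date", "time"):
        value = data.get(key)
        # A literal "null" string is treated the same as a JSON null.
        if isinstance(value, str) and value.strip() and value.strip().lower() != "null":
            result[key] = value.strip()

    category = data.get("category")
    if isinstance(category, str) and category.strip().lower() in ALLOWED_CATEGORIES:
        result["category"] = category.strip().lower()
    return result
```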

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Carlos Garcia
2026-05-20 22:19:20 -04:00
parent cc025695ac
commit 9fa391c720
2 changed files with 54 additions and 25 deletions


@@ -113,18 +113,24 @@ def _ocr_image_vision(data: bytes, filename: str, ollama_url: str, model: str) -
messages=[{
'role': 'user',
'content': (
-'This is a photo of a receipt. Extract these fields:\n'
-'- vendor: the store or restaurant name\n'
-'- amount: the FINAL total the customer paid. Look for a line '
-'labeled "Total", "Grand Total", "Amount Due", or "Balance Due". '
-'Do NOT use subtotal, tax, or tip. Return 0 if you cannot find '
-'a clear final total.\n'
-'- date: transaction date in YYYY-MM-DD format\n'
-'- time: transaction time in HH:MM 24-hour format, or null\n'
-'- category: one word describing the expense type — one of: '
-'meals, fuel, hotel, office, transport, other\n\n'
+'You are a receipt data extractor. '
+'Read this receipt image and extract the following fields. '
+'Copy values EXACTLY as printed — do NOT guess, infer, or '
+'invent values you cannot clearly see.\n\n'
+'Fields to extract:\n'
+'- vendor: the store or restaurant name exactly as printed; '
+'empty string if not clearly visible\n'
+'- amount: the FINAL total the customer paid; find a line '
+'labeled "Total", "Grand Total", "Amount Due", or "Balance Due"; '
+'copy the number exactly; do NOT use subtotal, tax, or tip; '
+'return 0 if no clearly labeled final total is visible\n'
+'- date: transaction date in YYYY-MM-DD format; '
+'null if not clearly visible\n'
+'- time: transaction time in HH:MM 24-hour format; '
+'null if not clearly visible\n'
+'- category: one of: meals, fuel, hotel, office, transport, other\n\n'
'Return ONLY a valid JSON object, no commentary, no markdown:\n'
-'{"vendor":"...","amount":0.00,"date":"YYYY-MM-DD",'
+'{"vendor":"...","amount":0.00,"date":"YYYY-MM-DD or null",'
'"time":"HH:MM or null","category":"..."}'
),
'images': [data],