fix: improve OCR accuracy for rotated/sideways receipt photos

- Dockerfile: add tesseract-ocr-osd for orientation detection data
- receipt_parser: resize large phone photos to 1800px, convert to grayscale, sharpen before OCR; use psm 1 (auto + OSD) so rotated receipts are correctly oriented before text extraction
- expenses_agent: tighten amount extraction prompt to pick the FINAL total, not subtotal or tax line, reducing misreads like 42.90->409.00

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-16 01:51:29 -04:00
parent 8a9d772b8e
commit c2d1078d79
3 changed files with 33 additions and 6 deletions
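
The receipt_parser changes named in the commit message are not part of the hunk shown here, so the following is only a rough sketch of what that preprocessing could look like. It assumes Pillow for image handling and the `tesseract` command-line tool for OCR, and it assumes "1800px" means the longest side. The function names (`target_size`, `preprocess`, `tesseract_command`) are hypothetical and do not come from the repository.

```python
from typing import List, Tuple

MAX_SIDE = 1800  # assumption: the 1800px limit applies to the longest side


def target_size(width: int, height: int, max_side: int = MAX_SIDE) -> Tuple[int, int]:
    """Return dimensions scaled down so the longest side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # small images are left untouched
    return round(width * max_side / longest), round(height * max_side / longest)


def preprocess(src_path: str, dst_path: str) -> None:
    """Resize, convert to grayscale, and sharpen an image before OCR."""
    # Imported lazily so the pure helpers work without Pillow installed.
    from PIL import Image, ImageFilter

    with Image.open(src_path) as img:
        img = img.resize(target_size(*img.size), Image.LANCZOS)
        img = img.convert("L")               # 8-bit grayscale
        img = img.filter(ImageFilter.SHARPEN)
        img.save(dst_path)


def tesseract_command(image_path: str) -> List[str]:
    """Build a tesseract call using page segmentation mode 1 (automatic + OSD)."""
    # --psm 1 needs the OSD trained data, which the Dockerfile change installs.
    return ["tesseract", image_path, "stdout", "--psm", "1"]
```

A caller would run `preprocess("photo.jpg", "clean.png")` and then pass `tesseract_command("clean.png")` to `subprocess.run(..., capture_output=True, text=True)` to obtain the text that feeds the extraction prompt.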
--- a/agent_service/agents/expenses_agent.py
+++ b/agent_service/agents/expenses_agent.py
@@ -220,11 +220,14 @@ class ExpensesAgent(BaseAgent):
            prompt = (
                'Extract expense details from the following receipt text. '
                'Return ONLY valid JSON with these keys:\n'
-                '"vendor" (string, merchant name),\n'
-                '"amount" (number, the total amount charged — look for "Total", "Amount Due", "Grand Total"),\n'
-                f'"date" (string YYYY-MM-DD, use {date_hint or today} if not found),\n'
+                '"vendor" (string, merchant or restaurant name),\n'
+                '"amount" (number — the FINAL total the customer paid; '
+                'this is labeled "Total", "Amount Due", "Grand Total", or the last dollar figure; '
+                'do NOT use subtotal, tax, or tip separately; '
+                'if multiple totals appear pick the largest one labeled as the final total),\n'
+                f'"date" (string YYYY-MM-DD, use {date_hint or today} if not found in text),\n'
                f'"product_name" (string, pick the best match from [{product_list}] or empty string).\n\n'
-                f'Receipt text (first 2000 chars):\n{text[:2000]}\n\nJSON only:'
+                f'Receipt text:\n{text[:2000]}\n\nJSON only:'
            )
        try:
            resp = await self._llm.submit(