EasyOCR (deep-learning OCR) replaces Tesseract as the default engine for
receipt images. It handles phone photos, thermal paper, dot-matrix fonts,
and rotated images significantly better than Tesseract without requiring
manual preprocessing pipelines.
Key design decisions:
- OCR_ENGINE=easyocr (default) | tesseract — switchable via .env, no rebuild
- EasyOCR Reader is a module-level singleton: model loaded once per container
start, not per receipt
- Falls back to Tesseract automatically if EasyOCR fails or returns < 20 chars
- EXIF rotation fix still applied before EasyOCR (phone photo orientation)
- Images resized to max 2000px width for speed before passing to EasyOCR
- _easyocr_to_text() groups detections into visual lines (y-overlap) and
sorts left-to-right within each line for clean single-string output
Revert: echo "OCR_ENGINE=tesseract" >> .env && docker compose up -d agent-service
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove "NeDonald's → McDonald's" from LLM vendor correction examples; the
example was biasing the model to return McDonald's for any ambiguous receipt
(Home Depot, Sergio's/HMSHost). Replace with neutral brand examples and add
an explicit instruction not to substitute a brand name absent from the OCR text.
- Add `net\s*fee` to _TOTAL_RE so MIA Parking kiosk receipts ("net fee: 150.00 USD")
are captured by Pass 1 rather than the max-scan which could pick a larger line.
- Add Step 5b grayscale fallback in receipt_parser: if all binarized PSM attempts
yield < 20 chars, retry OCR on the pre-binarization grayscale image. Fixes
dot-matrix and certain thermal-print fonts destroyed by the 160-threshold.
- Tests: 88 passing (test_net_fee_parking, test_vendor_prompt_does_not_contain_mcdonalds,
test_vendor_prompt_instructs_not_to_guess_absent_brand).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
LAYAL CAFE ($2.80 instead of $42.90):
- Add (?!\s*tax) lookahead to _TOTAL_RE so "Total Taxes $2.80" is never
confused with the receipt total when OCR drops the "Taxes" word
- Change Pass 1 from matches[-1] to max() so the largest labeled amount
always wins, regardless of line order in the OCR output
United Airlines (Subway/$0/wrong date):
- Add OSD-based rotation correction in receipt_parser.py: after EXIF
transpose, ask Tesseract's orientation-detection engine (--psm 0) what
angle to rotate; applies to receipts photographed lying sideways where
EXIF metadata cannot help
- Add month-name date patterns (DD MON YYYY / MON DD YYYY) to
_extract_date_from_text for airline/hotel receipts that print dates
like "05 MAY 2026" instead of "05/07/26"
85 tests, all passing.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The llama3.2-vision model was producing unreliable structured data
(wrong vendors, amounts, dates) making expense reports worse than
Tesseract + LLM extraction. Removes _ocr_image_vision(), the
vision JSON fast path in _parse_receipt_text(), _match_category(),
and the vision_ocr_model config setting entirely.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three receipts per batch were failing with JSONDecodeError (e.g.
"Expecting ':' delimiter: line 1 column 90") because activeblue-chat
(llama3.2-vision) occasionally outputs near-JSON with trailing commas,
single-quoted strings, or unquoted keys.
Two-layer fix:
1. Add format='json' to the Ollama chat call — Ollama JSON mode forces
syntactically valid output at the sampler level, eliminating most
structural errors.
2. Add _repair_json() fallback that runs on any remaining JSONDecodeError:
strips trailing commas, converts single→double quotes, and quotes
unquoted keys. If repair succeeds, the result is re-serialised as
canonical JSON before being returned.
Also re-serialise with json.dumps() on success so the fast path in
_parse_receipt_text always receives clean, canonical JSON regardless of
whitespace or key ordering in the model's original output.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two sources of hallucinated values in receipt parsing:
1. The LLM extraction prompt had no explicit "don't guess" constraint, so
when Tesseract produced garbled OCR text the LLM substituted plausible-
looking values (wrong vendor names, wrong totals) instead of returning
safe defaults.
2. The date field asked the LLM to extract the date from the OCR text even
when date_hint (from the filename timestamp, e.g. 20260509_180857.jpg)
was already available — a reliable signal that was being ignored.
expenses_agent._parse_receipt_text:
- LLM path: new prompt leads with "copy values EXACTLY, do NOT guess or
infer"; adds "if OCR looks corrupted, return safe default rather than
a more logical value"; injects date_hint directly as an authoritative
value when available so the LLM never needs to extract the date.
- Vision fast path: normalise "null" string for date the same way as time;
prefer date_hint over a null date returned by the vision model.
receipt_parser._ocr_image_vision:
- Vision prompt now leads with the same "copy exactly, do not guess"
constraint and explicitly accepts null for date/time when not clearly
visible, matching the conservative tone of the LLM extraction prompt.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
receipt_parser: change _ocr_image_vision() to extract structured JSON
{vendor,amount,date,time,category} directly from the image instead of
transcribing raw text, so the downstream LLM extraction step is
unnecessary and the two-step error-compounding is eliminated.
expenses_agent: add _match_category() helper to map vision category
labels to expense product names via substring/fuzzy match; add fast
path in _parse_receipt_text() that detects pre-extracted vision JSON
(text starts with '{') and skips the second LLM submit call entirely.
Fix text[:2000] truncation that discarded receipt totals — now keeps
first 1500 + last 1500 chars of long receipts so the grand total at
the bottom is always included.
tests: fix stale test_act_enters_awaiting_confirmation_on_first_pass
(confirmation gate was removed); add TestMatchCategory and three new
tests for the vision JSON fast path and LLM fallthrough.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1. OdooClient missing self._timeout — every _xmlrpc_call raised
AttributeError, making the odoo health check permanently fail.
Fix: set self._timeout = XMLRPC_TIMEOUT in __init__.
2. action_ping only accepted ollama=='ok' but health.py now returns
'warming' when the model is not yet hot in VRAM. Fix: treat
warming as passing so the bot goes online and the model loads
on the first real request.
3. /ai/approval/pending declared methods=['GET'] on a type='json'
route — Odoo JSON-RPC always POSTs, so every browser call got
405 METHOD NOT ALLOWED. Fix: change to methods=['POST'].
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- All specialist agents: handle_peer_request(request_type, params, directive_id)
replaces handle_peer_request(request: dict) so callers pass structured args
- ab_ai_bot: force-write bus_presence.status via SQL so Odoo 18 WebSocket presence
shows the correct colour immediately (ORM compute does not trigger on last_poll writes)
- odoo_client: wrap XML-RPC executor calls in asyncio.wait_for to enforce timeout
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a new specialist agent that gives the AI system control over its
own infrastructure:
- sysops_tools.py: docker SDK (ps/logs/restart) + git CLI (pull/status/log)
+ Odoo channel notifier for autonomous action broadcasts
- sysops_agent.py: BaseAgent subclass handling on-demand chat requests,
auto_heal() triggered by health failures, and sweep() for audits
- Background auto-heal loop (main.py): runs every 2 minutes, calls
_get_failing_systems() and triggers auto_heal() when degraded
- health.py: extracted _get_failing_systems() helper reused by both
the /health/detailed endpoint and the auto-heal loop
- docker-compose.yml: mount docker socket + /root/odoo workspace +
SSH keys for git authentication
- Dockerfile: add git to apt-get
- requirements.txt: add docker==7.1.0 Python SDK
Auto-heal behavior:
- Detects failing containers, restarts them, notifies all bot DM channels
- Ollama (192.168.2.9) is flagged as external and skipped
- On-demand via chat: "restart agent", "check logs", "pull latest code"
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- expenses_tools: remove 'date' from hr.expense.sheet field lists (Odoo 18
uses accounting_date; querying 'date' raised ValueError at runtime)
- master_system.txt: add few-shot routing examples so Llama 3.1 8B correctly
outputs agents=[] for general questions instead of defaulting to expenses_agent
- ab_ai_bot.py: increase bot presence last_poll offset from 90s to 10min so
the green dot stays on between cron runs (cron fires every ~5min in practice,
not every 20s as configured)
- ARCHITECTURE.md: full system documentation covering component layout, request
flow, LLM routing, agent registry, access control, health/presence mechanism,
known issues fixed today, and future self-healing concept
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Introduces VISION_OCR_MODEL setting. When set (e.g. llama3.2-vision:11b),
receipt images are transcribed by the Ollama vision model before falling
back to Tesseract. Also improves Tesseract preprocessing with adaptive
binarisation (autocontrast + threshold at 140) for better accuracy on
thermal receipts.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Dockerfile: add tesseract-ocr-osd for orientation detection data
- receipt_parser: resize large phone photos to 1800px, convert to
grayscale, sharpen before OCR; use psm 1 (auto + OSD) so rotated
receipts are correctly oriented before text extraction
- expenses_agent: tighten amount extraction prompt to pick the FINAL
total, not subtotal or tax line, reducing misreads like 42.90->409.00
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Dockerfile: install tesseract-ocr so Pillow+pytesseract can OCR receipt images
- operational_store: JSON-serialize raw_data before passing to asyncpg JSONB
- receipt_parser: add SHA256 hash + date extracted from filename timestamps
- expenses_agent: deduplicate receipts by hash before creating expense records
- expenses_agent: fetch all expensable Odoo products, pass list to LLM for
category selection (Meals, Flights, etc.) per receipt
- expenses_agent: pass date_hint from filename (e.g. 20260509_180857.jpg -> 2026-05-09)
as fallback when OCR text is unavailable
- expenses_tools: add get_expense_products() to fetch all expensable products
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- _synthesize: short-circuit on any single-agent report (avoids extra
Ollama call that can timeout); wrap multi-agent LLM call in try/except
- _update_memory: catch exceptions so DB/memory failures don't kill reply
- _log_directive_start: use 0 instead of NULL for channel_id (NOT NULL col)
- create_expense: drop 'description' field (not valid on hr.expense in Odoo 18)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Discuss bot now reads ir.attachment from incoming messages; file-only
messages no longer silently dropped
- ZIP files are described (contents listed) and bot asks clarifying
question before acting; user's follow-up reply looks back for pending
attachments so files don't need to be re-uploaded
- receipt_parser: extracts text from ZIP (recursive), JPG/PNG/etc (OCR),
PDF (pdfplumber), HTML, TXT
- expenses_agent: full rewrite fixing broken method signatures; adds
create_expense_sheet / create_expense / attach_receipt flow driven by
LLM receipt parsing (Ollama, HIPAA-locked)
- master_agent: extra_context threads receipts + user_id into directives
- FastAPI /upload multipart endpoint; registered in main.py
- Odoo /ai/upload controller proxies files to agent service
- ab_ai_bot: dispatch_message_with_files() for multipart uploads
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- ElearningTools: add create_course, update_course, publish_course,
add_section, create_slide, enroll_user write methods using OdooClient
- ElearningAgent: fix all BaseAgent method signatures (_plan/_gather/
_reason/_act/_report no longer take wrong positional args)
- Replace dead _dispatch_tool pattern with _tool_<name> methods so
BaseAgent._run_tool() can drive them via LLM tool calls in _loop()
- Add LLM-driven course creation in _reason(): when intent is create,
_loop() is called with a course-building system prompt and all tools;
the LLM calls create_course → add_section → create_slide → publish
- Fix handle_peer_request signature to match BaseAgent interface
- Fix AgentReport missing directive_id; fix SweepReport invalid kwargs
- Extend ELEARNING_TOOLS list with all new write-side tools
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Health endpoint called .ping() on both but neither implemented it,
causing ollama/odoo to always show as error and the bot to stay offline.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Some Odoo instances require the user's actual login/email for API key
auth rather than the __system__ special login. ODOO_USER defaults to
__system__ for standard Odoo 16+ installs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>