odoo-ai

Author	SHA1	Message	Date
Carlos Garcia	aea2fa02b8	expenses_agent: batch LLM calls + skip RAG to fix timeout on large uploads - auto_rag=False: skip PeerBus odoo_doc_agent call on every execute(); eliminates 30s Ollama semaphore contention before parsing even starts - _batch_parse_receipts(): Phase 1 regex (instant per-receipt: amount, date, bank-statement skip); Phase 2 single batched LLM call for all vendor+product_name instead of N individual calls; vision mode falls back to per-receipt calls (can't batch images); LLM fallback on bad JSON or wrong item count - _act() updated to use _batch_parse_receipts() - 7 new tests covering batch happy path, regex-only amounts, private-key cleanup, bank-statement skip, malformed-JSON fallback, wrong-count fallback, no-products short-circuit (99 tests total, all passing) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 06:36:22 -04:00
Carlos Garcia	69519393c1	Add EasyOCR engine for receipt image parsing EasyOCR (deep-learning OCR) replaces Tesseract as the default engine for receipt images. It handles phone photos, thermal paper, dot-matrix fonts, and rotated images significantly better than Tesseract without requiring manual preprocessing pipelines. Key design decisions: - OCR_ENGINE=easyocr (default) | tesseract; switchable via .env, no rebuild - EasyOCR Reader is a module-level singleton: model loaded once per container start, not per receipt - Falls back to Tesseract automatically if EasyOCR fails or returns < 20 chars - EXIF rotation fix still applied before EasyOCR (phone photo orientation) - Images resized to max 2000px width for speed before passing to EasyOCR - _easyocr_to_text() groups detections into visual lines (y-overlap) and sorts left-to-right within each line for clean single-string output Revert: echo "OCR_ENGINE=tesseract" >> .env && docker compose up -d agent-service Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 01:22:22 -04:00
Carlos Garcia	a736f3352b	Add vision LLM path for receipt vendor/category identification When RECEIPT_VISION_MODE=vision (default), uploaded receipt images are sent directly to the vision-capable LLM (llama3.2-vision via Ollama) instead of the OCR text excerpt. The model can read logos, stylised fonts, and layouts that Tesseract OCR mangles (Home Depot, HMSHost/Sergio's, etc.). Architecture: - amount + date: always from Tesseract regex (deterministic, never LLM) - vendor + category: vision LLM when image available, text LLM as fallback - Fallthrough: if vision call fails for any reason, text path is tried next - PDF/TXT/HTML receipts: always use text path (not visual media) Revert instantly without a rebuild: echo "RECEIPT_VISION_MODE=text" >> /root/odoo/odoo-ai/.env docker compose up -d agent-service config.py: add receipt_vision_mode setting (default 'vision') expenses_agent.py: _VISION_MIMETYPES, _get_vision_mode() helper, dual-path _parse_receipt_text (b64/mimetype params), _act() passes b64 tests: 92 passing — 4 new vision tests, 2 existing prompt tests pinned to text mode via _get_vision_mode patch Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 01:06:55 -04:00
Carlos Garcia	db06fede5f	Fix vendor mis-identification (McDonald's bias), MIA Parking amount, grayscale OCR fallback - Remove "NeDonald's → McDonald's" from LLM vendor correction examples; the example was biasing the model to return McDonald's for any ambiguous receipt (Home Depot, Sergio's/HMSHost). Replace with neutral brand examples and add an explicit instruction not to substitute a brand name absent from the OCR text. - Add `net\s*fee` to _TOTAL_RE so MIA Parking kiosk receipts ("net fee: 150.00 USD") are captured by Pass 1 rather than the max-scan which could pick a larger line. - Add Step 5b grayscale fallback in receipt_parser: if all binarized PSM attempts yield < 20 chars, retry OCR on the pre-binarization grayscale image. Fixes dot-matrix and certain thermal-print fonts destroyed by the 160-threshold. - Tests: 88 passing (test_net_fee_parking, test_vendor_prompt_does_not_contain_mcdonalds, test_vendor_prompt_instructs_not_to_guess_absent_brand). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 00:56:45 -04:00
Carlos Garcia	ece811cccb	fix(expenses): LAYAL CAFE $2.80 bug, United Airlines rotation & date LAYAL CAFE ($2.80 instead of $42.90): - Add (?!\s*tax) lookahead to _TOTAL_RE so "Total Taxes $2.80" is never confused with the receipt total when OCR drops the "Taxes" word - Change Pass 1 from matches[-1] to max() so the largest labeled amount always wins, regardless of line order in the OCR output United Airlines (Subway/$0/wrong date): - Add OSD-based rotation correction in receipt_parser.py: after EXIF transpose, ask Tesseract's orientation-detection engine (--psm 0) what angle to rotate; applies to receipts photographed lying sideways where EXIF metadata cannot help - Add month-name date patterns (DD MON YYYY / MON DD YYYY) to _extract_date_from_text for airline/hotel receipts that print dates like "05 MAY 2026" instead of "05/07/26" 85 tests, all passing. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 00:46:08 -04:00
Carlos Garcia	ce57d19528	fix(expenses): add 'Total Sale' and 'Net Sale' to labeled-total pattern Gas station receipts (Costco, Shell, etc.) print "Total Sale $X.XX" — the word "Sale" between "Total" and the amount prevented _TOTAL_RE from matching, causing the Costco receipt to fall through to the max-scan heuristic and return a garbled OCR value instead of the correct total. Also add "Net Sale" and "Sale Total" variants for broader coverage. 79 tests, all passing. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 00:36:44 -04:00
Carlos Garcia	77fab52475	fix(expenses): detect bank statements, fix default category, improve prompts - Add _is_likely_bank_statement(): if OCR text has ≥10 lines with dollar amounts it is almost certainly a bank/card statement screenshot, not a single receipt. Return skip=True so _act() skips it and adds a note to the escalations list instead of creating a $1,699 expense line. - Fix default product selection in _act(): prefer "Meals" over whatever happens to be first in Odoo's expense product list ("Communication"), so unrecognised receipts get a sensible fallback category. - Improve LLM category prompt: remove hardcoded product names (airline → Transport) that don't exist in every Odoo install; describe business types semantically so the model picks from the actual available list. - Mention skipped statements in the final summary message. - 77 tests, all passing. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 00:25:44 -04:00
Carlos Garcia	6287b3bcef	fix(expenses): improve receipt amount extraction and vendor naming - Remove card brands (VISA/MC/Amex) from _SKIP_LINE_RE so card-terminal lines like "VISA USD$ 36.78" are no longer skipped - Replace bottom-50% scan with full-text max scan (Pass 2): scans every line in the receipt and returns the largest dollar amount, correctly handling display-style receipts that show the charge at the top with no label (e.g. LAYAL CAFE $40.10 before the item list) - Update vendor LLM prompt to ask the model to correct OCR garbling (e.g. "NeDonald's" → "McDonald's") and detect bank statements - Add 4 new tests covering top-amount, card-terminal, max-beats-items, and change-exclusion scenarios (71 tests, all passing) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 00:11:03 -04:00
Carlos Garcia	1536d83376	Improve OCR preprocessing and amount extraction robustness Image preprocessing (receipt_parser.py): - Add ImageOps.exif_transpose() — fixes portrait photos stored with EXIF rotation metadata (most phone photos); without this Tesseract reads a rotated image and produces garbage - Upscale images < 600px wide for better character recognition - Raise binarization threshold 140→160 for faint thermal-print receipts - Try PSM 6 (single text block) before PSM 4, PSM 11 as fallbacks; PSM 6 is better suited to single-column receipt layout Amount extraction (expenses_agent.py): - Add Pass 2 bottom-of-receipt line scan when labeled Total: regex fails; reads lines bottom-to-top in the last 50% of text, skipping change/tip lines — handles 'T0TAL' OCR misread and amount-on-next-line layout - Add _SKIP_LINE_RE and _ANY_DOLLAR_RE module-level patterns - 8 new tests covering garbled total, change-skip, USD suffix, etc. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 23:33:38 -04:00
Carlos Garcia	f1a8add84b	Add OCR debug logging to diagnose receipt extraction quality Logs per-receipt: OCR text length, first 120 chars of OCR output, and final parsed vendor/amount/date/product_name. This will show whether Tesseract is producing usable text. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 23:23:13 -04:00
Carlos Garcia	e6c3d08990	Fix receipt parsing quality and approval endpoint Receipt quality: replace LLM amount/date extraction with regex. LLM was hallucinating 2021/2022 dates and returning '198.40 USD' strings. Amounts now use deterministic regex (Total:/Grand Total:/Amount Due:). Dates: filename timestamp > OCR regex > today (no LLM date guessing). LLM only asked for vendor name + product category. Approval: fix GET /approval/pending 500 by using correct column name 'started_at' instead of 'created_at' (which does not exist). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 23:02:11 -04:00
Carlos Garcia	0320591344	Remove vision OCR — use Tesseract-only pipeline for receipt parsing The llama3.2-vision model was producing unreliable structured data (wrong vendors, amounts, dates) making expense reports worse than Tesseract + LLM extraction. Removes _ocr_image_vision(), the vision JSON fast path in _parse_receipt_text(), _match_category(), and the vision_ocr_model config setting entirely. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 22:32:26 -04:00
Carlos Garcia	ec6b41943f	fix: vision OCR JSON failures — add format='json' and repair fallback Three receipts per batch were failing with JSONDecodeError (e.g. "Expecting ':' delimiter: line 1 column 90") because activeblue-chat (llama3.2-vision) occasionally outputs near-JSON with trailing commas, single-quoted strings, or unquoted keys. Two-layer fix: 1. Add format='json' to the Ollama chat call — Ollama JSON mode forces syntactically valid output at the sampler level, eliminating most structural errors. 2. Add _repair_json() fallback that runs on any remaining JSONDecodeError: strips trailing commas, converts single→double quotes, and quotes unquoted keys. If repair succeeds, the result is re-serialised as canonical JSON before being returned. Also re-serialise with json.dumps() on success so the fast path in _parse_receipt_text always receives clean, canonical JSON regardless of whitespace or key ordering in the model's original output. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 22:24:50 -04:00
Carlos Garcia	9fa391c720	fix: reduce hallucination in receipt extraction — conservative prompts + date injection Two sources of hallucinated values in receipt parsing: 1. The LLM extraction prompt had no explicit "don't guess" constraint, so when Tesseract produced garbled OCR text the LLM substituted plausible- looking values (wrong vendor names, wrong totals) instead of returning safe defaults. 2. The date field asked the LLM to extract the date from the OCR text even when date_hint (from the filename timestamp, e.g. 20260509_180857.jpg) was already available — a reliable signal that was being ignored. expenses_agent._parse_receipt_text: - LLM path: new prompt leads with "copy values EXACTLY, do NOT guess or infer"; adds "if OCR looks corrupted, return safe default rather than a more logical value"; injects date_hint directly as an authoritative value when available so the LLM never needs to extract the date. - Vision fast path: normalise "null" string for date the same way as time; prefer date_hint over a null date returned by the vision model. receipt_parser._ocr_image_vision: - Vision prompt now leads with the same "copy exactly, do not guess" constraint and explicitly accepts null for date/time when not clearly visible, matching the conservative tone of the LLM extraction prompt. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 22:19:20 -04:00
Carlos Garcia	cc025695ac	fix: prevent master agent asking for clarification when receipts are uploaded When a zip/image arrives via /upload, the LLM was classifying the message as needs_clarification=True (because the chat body was just a filename like "download (8).zip", not an instruction), and the early return on line 91 fired before the receipts safety guard on line 106, so the guard never executed. master_agent: move the receipts safety guard to BEFORE the needs_clarification early-return. If extra_context contains receipts, unconditionally set needs_clarification=False and ensure expenses_agent is in the agents list — the LLM cannot veto an upload with a question. upload router: normalize empty or filename-only messages (e.g. when the user drops a file in Discuss chat with no text) to "Create an expense report from these uploaded receipts." so the LLM intent classification also has a sensible string to work with. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 22:13:46 -04:00
Carlos Garcia	68b7b3f0f3	fix: add missing approval workflow columns to ab_directive_log (migration 002) /approval/pending was returning 500 UndefinedColumnError because the approval router and MCP get_pending_approvals tool both query columns (agent_name, action_type, description, context_data, approver_id, approval_note, updated_at) that were never added in the initial schema migration 001. Adds migration 002 to ALTER TABLE ab_directive_log with all seven missing columns (all nullable so existing rows are unaffected) and an index on updated_at for efficient polling. Deploy: after pulling on miaai, run: cd /root/odoo/odoo-ai && docker compose exec agent-service \ alembic -c agent_service/migrations/alembic.ini upgrade head Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 22:03:59 -04:00
Carlos Garcia	70145c9e04	fix: chat attachment detection — 3-method fallback + deferred retry ab_ai_mail.py: when a user sends a file via Odoo 18 Discuss, the zip was going through /dispatch (text-only) instead of /upload, causing the bot to respond "I'm unable to locate the zip file" because attachment_ids was empty in the message_post override. Root cause: Odoo 18 Discuss links file attachments to mail.message records via three different mechanisms depending on the upload path, and we only checked one (the Many2many relation table). Fixes: 1. Three-method attachment detection in message_post: - Method 1: result.attachment_ids (Many2many relation table) - Method 2: ir.attachment with res_model='mail.message' (Odoo 15+ style) - Method 3: attachment IDs parsed from href URLs in the HTML body 2. Deferred retry in _agent_thread: if att_data is still empty but a message_id is known, sleep 1s then re-read via a fresh DB cursor so we see data committed after message_post returned (timing race fix) 3. Skip zero-byte attachments and warn instead of silently using them 4. Pass message_id to the background thread (new kwarg, backward compat) 5. Add debug logging so future issues can be diagnosed from Odoo logs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 22:01:38 -04:00
Carlos Garcia	11cc261923	fix: vision OCR receipt extraction — skip second LLM call, fix total truncation receipt_parser: change _ocr_image_vision() to extract structured JSON {vendor,amount,date,time,category} directly from the image instead of transcribing raw text, so the downstream LLM extraction step is unnecessary and the two-step error-compounding is eliminated. expenses_agent: add _match_category() helper to map vision category labels to expense product names via substring/fuzzy match; add fast path in _parse_receipt_text() that detects pre-extracted vision JSON (text starts with '{') and skips the second LLM submit call entirely. Fix text[:2000] truncation that discarded receipt totals — now keeps first 1500 + last 1500 chars of long receipts so the grand total at the bottom is always included. tests: fix stale test_act_enters_awaiting_confirmation_on_first_pass (confirmation gate was removed); add TestMatchCategory and three new tests for the vision JSON fast path and LLM fallthrough. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 21:49:31 -04:00
Carlos Garcia	7a0aad3f37	fix: three bugs blocking bot presence and approval UI 1. OdooClient missing self._timeout — every _xmlrpc_call raised AttributeError, making the odoo health check permanently fail. Fix: set self._timeout = XMLRPC_TIMEOUT in __init__. 2. action_ping only accepted ollama=='ok' but health.py now returns 'warming' when the model is not yet hot in VRAM. Fix: treat warming as passing so the bot goes online and the model loads on the first real request. 3. /ai/approval/pending declared methods=['GET'] on a type='json' route — Odoo JSON-RPC always POSTs, so every browser call got 405 METHOD NOT ALLOWED. Fix: change to methods=['POST']. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 20:53:49 -04:00
Carlos Garcia	b23ab77ee9	fix: bot presence stays offline after vision model change ping() was calling ollama.AsyncClient.list() which parses /api/tags with ollama==0.3.3 pydantic models. Vision models carry metadata fields that 0.3.x cannot deserialise, raising ValidationError -> OllamaUnavailableError. This made the /health/detailed ollama field 'error: ...' instead of 'ok', so ab_ai_bot.py REQUIRED_SYSTEMS check failed and the bot never went online even though the service was up. Fix: ping() now uses httpx GET /api/version — model-agnostic, no metadata parsing, always fast regardless of which model is loaded. Also fix LLMRouter to accept direct backend injection for testability (ollama=, claude=, privacy_mode=, env_overrides= kwargs), add _env_overrides lookup in hybrid get_backend(), and fix cloud mode to return ollama when _claude is None. All 6 test_llm_router tests now pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 19:15:49 -04:00
tocmo0nlord	2f9791f925	Update CLAUDE.md	2026-05-20 21:26:34 +00:00
tocmo0nlord	e67dc06a22	CLAUDE.md	2026-05-20 21:21:45 +00:00
Carlos Garcia	564f1a9479	fix: raise Ollama timeout to 300s, add model pre-warming, improve health check - OllamaBackend enforces _MIN_TIMEOUT=300s (overrides OLLAMA_TIMEOUT env var) - warm_model() background task loads activeblue-chat into VRAM at startup - health/detailed reports "warming" vs "ok" via Ollama ps() API - README updated with May 2026 changes and test coverage details	2026-05-20 05:03:15 +00:00
Carlos Garcia	20a69313d7	Add comprehensive unit tests for all agent service components Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 04:00:45 +00:00
Carlos Garcia	6c22a9a128	feat: elearning_agent — reduce tools 14 → 8 so it registers at startup - Merge get_course_stats + get_enrolled_users + get_slide_completion → get_course_details - Fold publish_course into update_course via website_published param - Drop flag_low_completion (replaced by post_chatter_note) and suggest_next_course (still callable internally via peer-bus suggest_courses request) - elearning_tools: add get_course_details(), extend update_course() signature - ARCHITECTURE.md: mark elearning_agent as registered Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-19 23:02:51 -04:00
Carlos Garcia	233f461480	fix: align peer_bus signature, bot presence SQL, XML-RPC timeout - All specialist agents: handle_peer_request(request_type, params, directive_id) replaces handle_peer_request(request: dict) so callers pass structured args - ab_ai_bot: force-write bus_presence.status via SQL so Odoo 18 WebSocket presence shows the correct colour immediately (ORM compute does not trigger on last_poll writes) - odoo_client: wrap XML-RPC executor calls in asyncio.wait_for to enforce timeout Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-19 23:02:51 -04:00
Carlos Garcia	93f2a101fa	refactor: remove scripted file intercept — LLM owns all responses Previously ab_ai_mail.py intercepted file uploads before reaching the LLM and responded with a hardcoded clarification template. The LLM had no involvement in the file upload response. Changes: - ab_ai_mail.py: remove _post_file_clarification, _find_pending_attachments, _describe_zip, and the two-step pending-attachment lookup. All messages (text, files, or both) are dispatched to the agent service immediately. Files with no text pass an empty message — the LLM decides what to do. - upload.py: default message changed from hardcoded receipt instruction to '' so the LLM determines intent from file content. - master_agent._synthesize: always runs through the LLM for both single and multi-agent cases — no raw templates reach the user. - master_system.txt: add FILE UPLOADS routing rule so the LLM knows to route receipts to expenses_agent without asking for clarification. New flow: upload → parse → LLM classifies → agent acts → LLM synthesizes natural response → user sees it. Zero scripted intercepts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-19 21:05:38 -04:00
Carlos Garcia	0bd1810405	fix: create expense report immediately — remove broken confirmation gate The old flow required a "confirm" reply after showing a parsed-receipt table, but that follow-up dispatch call carries no receipts (they only exist in the /upload context). The confirmation gate was architecturally broken: the second turn would always create nothing. Fix: create the expense sheet immediately when receipts are present. Byte-exact and semantic duplicates are auto-skipped; the count of skipped items is reported in the success message. The report is always created in Odoo as a draft so users can review amounts and submit manually via Odoo > Expenses. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-19 20:58:47 -04:00
Carlos Garcia	8d1727b498	feat: sysops_agent — Docker/git self-management with auto-heal Adds a new specialist agent that gives the AI system control over its own infrastructure: - sysops_tools.py: docker SDK (ps/logs/restart) + git CLI (pull/status/log) + Odoo channel notifier for autonomous action broadcasts - sysops_agent.py: BaseAgent subclass handling on-demand chat requests, auto_heal() triggered by health failures, and sweep() for audits - Background auto-heal loop (main.py): runs every 2 minutes, calls _get_failing_systems() and triggers auto_heal() when degraded - health.py: extracted _get_failing_systems() helper reused by both the /health/detailed endpoint and the auto-heal loop - docker-compose.yml: mount docker socket + /root/odoo workspace + SSH keys for git authentication - Dockerfile: add git to apt-get - requirements.txt: add docker==7.1.0 Python SDK Auto-heal behavior: - Detects failing containers, restarts them, notifies all bot DM channels - Ollama (192.168.2.9) is flagged as external and skipped - On-demand via chat: "restart agent", "check logs", "pull latest code" Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-19 17:01:57 -04:00
Carlos Garcia	f4991dd920	fix: presence window 24h → 10min to match cron heartbeat Bot green dot stays on for 10 minutes after each successful health check (2× the ~5-min cron cycle). A failed check sets last_poll to 1 hour in the past, going offline immediately. If the cron stops entirely, the dot goes offline on its own after 10 minutes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-19 16:36:48 -04:00
Carlos Garcia	160f96a549	fix: override bus.presence._compute_status so bot shows online Odoo 18's _compute_status treats future last_poll as MORE disconnected (absolute delta). Override forces status='online' when last_poll > now, which is set 24h ahead by _sync_bot_user_presence when the health check passes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-19 16:27:40 -04:00
Carlos Garcia	eeea45b37f	fix: explicit per-system health checks gate online status action_ping now checks db, odoo, ollama, and master_agent individually. All four must report 'ok' for the bot to go online. Presence is updated immediately inside action_ping (not as a separate cron step), so every ping — whether from the cron or a manual button press — atomically checks all systems and sets the correct online/offline/error state. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-19 16:07:59 -04:00
Carlos Garcia	99cc19195a	fix: keep bot presence online for 24h instead of racing the 30s timer Set last_poll and last_presence 24h ahead when the service is confirmed online, so status stays 'online' until the cron explicitly marks it down. The previous 10min offset still expired between cron runs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-19 16:06:08 -04:00
Carlos Garcia	a0fc1396a9	fix: Odoo 18 field errors, routing quality, bot presence, and add architecture docs - expenses_tools: remove 'date' from hr.expense.sheet field lists (Odoo 18 uses accounting_date; querying 'date' raised ValueError at runtime) - master_system.txt: add few-shot routing examples so Llama 3.1 8B correctly outputs agents=[] for general questions instead of defaulting to expenses_agent - ab_ai_bot.py: increase bot presence last_poll offset from 90s to 10min so the green dot stays on between cron runs (cron fires every ~5min in practice, not every 20s as configured) - ARCHITECTURE.md: full system documentation covering component layout, request flow, LLM routing, agent registry, access control, health/presence mechanism, known issues fixed today, and future self-healing concept Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-19 15:47:48 -04:00
Carlos Garcia	b76d01b64f	Fix vision OCR response parsing for dict-returning ollama client versions Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-17 11:59:11 -04:00
Carlos Garcia	5b924e60de	Add vision OCR via Ollama vision model with Tesseract fallback Introduces VISION_OCR_MODEL setting. When set (e.g. llama3.2-vision:11b), receipt images are transcribed by the Ollama vision model before falling back to Tesseract. Also improves Tesseract preprocessing with adaptive binarisation (autocontrast + threshold at 140) for better accuracy on thermal receipts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 18:43:21 -04:00
Carlos Garcia	9f38fb013c	docs: label test file and add TEST_EXPENSES_AGENT.md Adds module-level label and cross-reference to the new doc. TEST_EXPENSES_AGENT.md documents every test group, case, and the real-world bug each test guards against (e.g. In-N-Out OCR mismatch). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 18:35:07 -04:00
Carlos Garcia	469025b6f2	test: fix bad vendor example in pass2 similarity test 'Restaurant A' vs 'Restaurant Z' differ by 1 char so difflib scores them at ~91% -- correctly above the 80% threshold. Use clearly different vendors (Starbucks Coffee vs McDonalds Burger) instead. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 18:32:38 -04:00
Carlos Garcia	1c5f6e7ca3	test: fix _ext import (only exists in ab_ai_mail, not receipt_parser) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 18:31:30 -04:00
Carlos Garcia	92ba6bd069	test: add requirements-test.txt for isolated test venv Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 18:14:05 -04:00
Carlos Garcia	6fcd830e6f	test: unit tests for expenses agent dedup, plan, act, and receipt parser - TestFindSemanticDuplicate: 18 cases covering Pass1 (amount match), Pass2 (OCR mismatch / high vendor similarity), time window, filenames, zero-amount exclusion, multi-candidate index correctness - test_plan_: keyword detection for confirm/skip/keep-all, mode routing - test_act_: confirmation gate, byte-dedup, no-employee escalation, confirmed creation with mocked Odoo tools - TestParseUpload: ZIP extraction, directory skipping, filename date parsing, SHA256 consistency, b64 round-trip - TestTextToHtml: escaping, newline to <br>, empty string Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 18:11:32 -04:00
Carlos Garcia	af1d27be89	feat: pre-creation confirmation step with inline duplicate warnings Before writing any expense records the bot now posts a numbered table of parsed vendor/amount/date for every receipt, with duplicate entries flagged inline. User replies 'confirm' (skips dups) or 'confirm, keep all'. This catches OCR amount misreads before they land in Odoo. Also removes the separate awaiting_dup_approval step; duplicate review is now part of the single confirmation table. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 16:54:25 -04:00
Carlos Garcia	12576ead1b	feat: two-pass dedup catches same-vendor OCR amount misreads Pass 1 unchanged: same date + amount within 0.05 + vendor similarity 60%. Pass 2 (new): same vendor (>= 80% similarity) + same date, regardless of amount, to catch receipts where OCR misread the total. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 16:48:51 -04:00
Carlos Garcia	774c0cc062	fix: tighten receipt amount extraction prompt to reduce OCR misreads Replaced 'pick the largest one' guidance with 'bottom-most total' and 'return 0 if no clear total found' to avoid picking line items or tips. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 16:47:48 -04:00
Carlos Garcia	bb1e93fabb	fix: widen actions_taken to list[Any] and improve bot error replies DispatchResponse declared actions_taken as list[dict] but agents return list[str], causing a 422 on every successful upload. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 16:31:45 -04:00
Carlos Garcia	cf3fe5e0a5	fix: await get_all() in registry router and align get_all key names The /registry/agents endpoint was 500 on every call because AgentRegistry.get_all() is async but was called without await. Also aligns get_all() dict keys (name, domain) with what the router reads. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 13:38:06 -04:00
Carlos Garcia	d87f3c3e99	Non-blocking agent dispatch: run LLM call in background thread message_post now returns immediately after collecting attachment data. The agent HTTP call and reply posting happen in a daemon thread, so Odoo commits the user's message and the browser confirms receipt right away -- instead of waiting 10+ seconds for Ollama to respond. File clarification (no LLM) still posts inline since it's instant. The background thread opens its own DB cursor to post the bot reply. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 12:27:03 -04:00
Carlos Garcia	7d260ca526	Fix HTML display: use plain text + _text_to_html for all bot messages All bot messages now built as plain text and converted via _text_to_html() which escapes content and converts newlines to <br>. This avoids raw HTML tags appearing literally in Odoo 18 Discuss. - _describe_zip: returns plain str (no Markup/HTML) - _post_file_clarification: builds plain text, posts via _text_to_html() - _find_pending_attachments: strip HTML before phrase matching - _text_to_html: new helper shared by clarification and agent replies Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 12:21:58 -04:00
Carlos Garcia	9e3fe974dc	Fix dup approval flow: preserve raw message, force expenses routing, fix HTML rendering - master_agent: thread raw user message into extra_context and peer_data so expenses_agent can check it directly without relying on LLM intent_summary - master_agent: when receipts are in extra_context always route to expenses_agent, so replies like 'skip duplicates' still trigger expense processing - expenses_agent: _plan() checks peer_data raw_message alongside task so skip/keep keywords are detected even when master rewrites the intent - ab_ai_mail: wrap clarification message HTML in Markup() so Odoo does not re-escape the tags; use <br> instead of <br/> - ab_ai_mail: convert agent plain-text replies newlines to <br> for proper line-break rendering in Discuss Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 11:55:46 -04:00
Carlos Garcia	462f63d11d	Add duplicate approval flow with time-based dedup - expenses_agent: extract transaction time (HH:MM) from OCR receipt text - expenses_agent: _find_semantic_duplicate uses time to rule out false positives (>30 min apart = different receipts) - expenses_agent: pause when duplicates found, set mode=awaiting_dup_approval, ask user before creating sheet - expenses_agent: _report formats approval message listing each dup pair with vendor/amount/date/times/filenames - ab_ai_mail: _find_pending_attachments recognises dup-approval bot message so ZIP re-attaches on user reply Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 02:07:37 -04:00
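
The dedup rules described across commits 462f63d11d, 12576ead1b, and 469025b6f2 (same date, amount within 0.05, vendor similarity thresholds of 60% and 80% scored with difflib, and a 30-minute time window) can be illustrated with a minimal sketch. This is not the repository's `_find_semantic_duplicate`; the `Receipt` dataclass, the helper names, and the exact order of checks are assumptions made for illustration only.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass
class Receipt:
    """Hypothetical container; the real agent's data shape is not shown in the log."""
    vendor: str
    amount: float
    date: str                   # ISO date, e.g. "2026-05-16"
    time: Optional[str] = None  # "HH:MM" when OCR found a transaction time


def vendor_similarity(a: str, b: str) -> float:
    """Case-insensitive difflib ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def minutes_apart(t1: Optional[str], t2: Optional[str]) -> Optional[int]:
    """Gap in minutes between two HH:MM strings, or None if either is missing."""
    if not t1 or not t2:
        return None
    h1, m1 = map(int, t1.split(":"))
    h2, m2 = map(int, t2.split(":"))
    return abs((h1 * 60 + m1) - (h2 * 60 + m2))


def _is_candidate(new: Receipt, old: Receipt) -> bool:
    """Same date, and not ruled out by being more than 30 minutes apart."""
    if new.date != old.date:
        return False
    gap = minutes_apart(new.time, old.time)
    return gap is None or gap <= 30


def find_semantic_duplicate(new: Receipt, existing: List[Receipt]) -> Optional[int]:
    """Return the index of the first likely duplicate in `existing`, else None."""
    # Pass 1: amount within 0.05 and vendor similarity of at least 60%.
    for idx, old in enumerate(existing):
        if (_is_candidate(new, old)
                and abs(new.amount - old.amount) <= 0.05
                and vendor_similarity(new.vendor, old.vendor) >= 0.60):
            return idx
    # Pass 2: vendor similarity of at least 80%, amount ignored (OCR may have misread it).
    for idx, old in enumerate(existing):
        if _is_candidate(new, old) and vendor_similarity(new.vendor, old.vendor) >= 0.80:
            return idx
    return None
```

Running Pass 1 over every candidate before Pass 2 means an exact-amount match is preferred over a vendor-only match when both exist; whether the actual agent orders its passes this way is not stated in the log.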

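The amount-extraction commits (e6c3d08990, 6287b3bcef, ce57d19528, ece811cccb, db06fede5f) describe a two-pass approach: Pass 1 takes the largest amount that follows a total-style label, and Pass 2 falls back to the largest dollar amount on any line that is not a change or tip line. The sketch below shows that idea under stated assumptions. The patterns are illustrative stand-ins written for this example, not the repository's `_TOTAL_RE`, `_SKIP_LINE_RE`, or `_ANY_DOLLAR_RE`, and `extract_total` is a hypothetical name.

```python
import re
from typing import Optional

# Illustrative label pattern: multi-word labels come first so they win over bare "total".
# The (?!\s*tax) lookahead keeps "Total Taxes $2.80" from being read as the total.
TOTAL_RE = re.compile(
    r"\b(?:grand\s+total|amount\s+due|total\s+sale|net\s+sale|sale\s+total"
    r"|net\s*fee|total)"
    r"(?!\s*tax)"
    r"\s*:?\s*(?:USD)?\s*\$?\s*(\d{1,6}(?:,\d{3})*\.\d{2})",
    re.IGNORECASE,
)
# Lines that mention change or a tip are ignored by the fallback scan.
SKIP_LINE_RE = re.compile(r"\b(?:change|tip)\b", re.IGNORECASE)
ANY_DOLLAR_RE = re.compile(r"\$?\s*(\d{1,6}(?:,\d{3})*\.\d{2})")


def _to_float(raw: str) -> float:
    """Convert '1,699.00' style strings to float."""
    return float(raw.replace(",", ""))


def extract_total(text: str) -> Optional[float]:
    """Return the receipt total from OCR text, or None if no amount is found."""
    # Pass 1: largest labeled amount, regardless of line order.
    labeled = [_to_float(m.group(1)) for m in TOTAL_RE.finditer(text)]
    if labeled:
        return max(labeled)
    # Pass 2: largest amount on any line that is not a change/tip line.
    amounts = []
    for line in text.splitlines():
        if SKIP_LINE_RE.search(line):
            continue
        amounts.extend(_to_float(m.group(1)) for m in ANY_DOLLAR_RE.finditer(line))
    return max(amounts) if amounts else None
```

For example, `extract_total("Subtotal $40.10\nTotal Taxes $2.80\nTotal $42.90")` picks the labeled total rather than the tax line, and a display-style receipt with no label falls through to the Pass 2 scan.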

117 Commits