odoo-ai

Author	SHA1	Message	Date
Carlos Garcia	6287b3bcef	fix(expenses): improve receipt amount extraction and vendor naming - Remove card brands (VISA/MC/Amex) from _SKIP_LINE_RE so card-terminal lines like "VISA USD$ 36.78" are no longer skipped - Replace bottom-50% scan with full-text max scan (Pass 2): scans every line in the receipt and returns the largest dollar amount, correctly handling display-style receipts that show the charge at the top with no label (e.g. LAYAL CAFE $40.10 before the item list) - Update vendor LLM prompt to ask the model to correct OCR garbling (e.g. "NeDonald's" → "McDonald's") and detect bank statements - Add 4 new tests covering top-amount, card-terminal, max-beats-items, and change-exclusion scenarios (71 tests, all passing) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 00:11:03 -04:00
Carlos Garcia	1536d83376	Improve OCR preprocessing and amount extraction robustness Image preprocessing (receipt_parser.py): - Add ImageOps.exif_transpose() — fixes portrait photos stored with EXIF rotation metadata (most phone photos); without this Tesseract reads a rotated image and produces garbage - Upscale images < 600px wide for better character recognition - Raise binarization threshold 140→160 for faint thermal-print receipts - Try PSM 6 (single text block) before PSM 4, PSM 11 as fallbacks; PSM 6 is better suited to single-column receipt layout Amount extraction (expenses_agent.py): - Add Pass 2 bottom-of-receipt line scan when labeled Total: regex fails; reads lines bottom-to-top in the last 50% of text, skipping change/tip lines — handles 'T0TAL' OCR misread and amount-on-next-line layout - Add _SKIP_LINE_RE and _ANY_DOLLAR_RE module-level patterns - 8 new tests covering garbled total, change-skip, USD suffix, etc. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 23:33:38 -04:00
Carlos Garcia	e6c3d08990	Fix receipt parsing quality and approval endpoint Receipt quality: replace LLM amount/date extraction with regex. LLM was hallucinating 2021/2022 dates and returning '198.40 USD' strings. Amounts now use deterministic regex (Total:/Grand Total:/Amount Due:). Dates: filename timestamp > OCR regex > today (no LLM date guessing). LLM only asked for vendor name + product category. Approval: fix GET /approval/pending 500 by using correct column name 'started_at' instead of 'created_at' (which does not exist). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 23:02:11 -04:00
Carlos Garcia	0320591344	Remove vision OCR — use Tesseract-only pipeline for receipt parsing The llama3.2-vision model was producing unreliable structured data (wrong vendors, amounts, dates) making expense reports worse than Tesseract + LLM extraction. Removes _ocr_image_vision(), the vision JSON fast path in _parse_receipt_text(), _match_category(), and the vision_ocr_model config setting entirely. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 22:32:26 -04:00
Carlos Garcia	11cc261923	fix: vision OCR receipt extraction — skip second LLM call, fix total truncation receipt_parser: change _ocr_image_vision() to extract structured JSON {vendor,amount,date,time,category} directly from the image instead of transcribing raw text, so the downstream LLM extraction step is unnecessary and the two-step error-compounding is eliminated. expenses_agent: add _match_category() helper to map vision category labels to expense product names via substring/fuzzy match; add fast path in _parse_receipt_text() that detects pre-extracted vision JSON (text starts with '{') and skips the second LLM submit call entirely. Fix text[:2000] truncation that discarded receipt totals — now keeps first 1500 + last 1500 chars of long receipts so the grand total at the bottom is always included. tests: fix stale test_act_enters_awaiting_confirmation_on_first_pass (confirmation gate was removed); add TestMatchCategory and three new tests for the vision JSON fast path and LLM fallthrough. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 21:49:31 -04:00
Carlos Garcia	20a69313d7	Add comprehensive unit tests for all agent service components Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 04:00:45 +00:00
Carlos Garcia	9f38fb013c	docs: label test file and add TEST_EXPENSES_AGENT.md Adds module-level label and cross-reference to the new doc. TEST_EXPENSES_AGENT.md documents every test group, case, and the real-world bug each test guards against (e.g. In-N-Out OCR mismatch). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 18:35:07 -04:00
Carlos Garcia	469025b6f2	test: fix bad vendor example in pass2 similarity test 'Restaurant A' vs 'Restaurant Z' differ by 1 char so difflib scores them at ~91% -- correctly above the 80% threshold. Use clearly different vendors (Starbucks Coffee vs McDonalds Burger) instead. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 18:32:38 -04:00
Carlos Garcia	1c5f6e7ca3	test: fix _ext import (only exists in ab_ai_mail, not receipt_parser) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 18:31:30 -04:00
Carlos Garcia	6fcd830e6f	test: unit tests for expenses agent dedup, plan, act, and receipt parser - TestFindSemanticDuplicate: 18 cases covering Pass1 (amount match), Pass2 (OCR mismatch / high vendor similarity), time window, filenames, zero-amount exclusion, multi-candidate index correctness - test_plan_: keyword detection for confirm/skip/keep-all, mode routing - test_act_: confirmation gate, byte-dedup, no-employee escalation, confirmed creation with mocked Odoo tools - TestParseUpload: ZIP extraction, directory skipping, filename date parsing, SHA256 consistency, b64 round-trip - TestTextToHtml: escaping, newline to <br>, empty string Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 18:11:32 -04:00
ActiveBlue Build	66b114cdcf	feat(mcp): add MCP gateway — 14 tools over SSE, all agent calls forced local Architecture: - agent_service/mcp/tools.py: 14 Tool definitions with JSON schemas dispatch, finance_query, accounting_query, crm_query, sales_query, project_query, elearning_query, expenses_query, employees_query, get_health, list_agents, trigger_sweep, get_pending_approvals, approve_directive - agent_service/mcp/server.py: mcp.Server with list_tools + call_tool handlers - agent_service/routers/mcp_router.py: Starlette routes at /mcp/sse + /mcp/messages - main.py: mounts MCP routes alongside existing FastAPI routers (graceful fallback if mcp not installed) Privacy guarantee (enforced in server.py, not by convention): - _force_local_context() sets llm_router._privacy_mode = 'local' before EVERY agent call - _restore_mode() restores original mode after the tool returns - HIPAA agents (finance, accounting, expenses, employees) were already Ollama-only; MCP adds a second enforcement layer for all 8 agents - MCP client (e.g. Claude Code CLI) receives only tool results — no LLM completions cross the boundary Usage (Claude Code CLI): claude mcp add --transport sse http://192.168.2.47:8001/mcp/sse or copy claude_mcp_config.json to ~/.claude/mcp_servers.json requirements.txt: added mcp==1.3.0 tests/test_mcp_server.py: 13 tests covering tool count, schemas, HIPAA labelling, privacy override Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 16:45:49 -04:00
ActiveBlue Build	7487fc73f9	feat(infra): add sweep coordinator, structured logging, test suite, and README Sweep coordinator (Step 16): - SweepCoordinator runs all 8 agents in parallel with 60s per-agent / 300s total timeout - Aggregates findings, actions, errors into SweepCoordinatorResult - Registered in FastAPI lifespan; triggered via POST /sweep Structured logging (Step 18): - logging_utils/structured.py: JSONFormatter emitting ts/level/logger/msg + custom fields - log_directive_event() for structured directive lifecycle logging - push_to_loki() async Loki push (graceful no-op if LOKI_URL unset) - configure_logging() replaces root handler at startup Tests (Steps 17+19): - conftest.py: mock_odoo, mock_pool, mock_llm fixtures - test_tool_validator.py: 9 tests covering validation, coercion, hallucination stripping - test_llm_router.py: 6 tests covering local/cloud/hybrid modes and HIPAA enforcement - test_peer_bus.py: 6 tests covering registration, timeout, depth, circular detection - test_finance_agent.py: 10 tests covering all 6 steps + sweep + peer request - test_memory_manager.py: 3 tests covering context build + hard cap enforcement - test_dispatch_router.py: 3 tests covering dispatch, rate limit, health endpoint - test_odoo_client.py: 4 tests covering search_read, write result, unlink warning - test_e2e_dispatch.py: 2 E2E tests - full dispatch cycle + peer bus communication README (Step 20): architecture diagram, privacy modes, quick start, env vars, structure Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 18:08:11 -04:00
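
The sweep coordinator entry (7487fc73f9) describes running all agents in parallel with a 60s per-agent and 300s total timeout, and aggregating findings and errors. The following is a minimal sketch of that timeout pattern using `asyncio`, not the repository's actual `SweepCoordinator`. The names `SweepResult`, `_run_one`, and `run_sweep` are hypothetical, and each agent is assumed to be a zero-argument coroutine function.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable, Dict

Agent = Callable[[], Awaitable[Any]]  # assumed agent shape, not the real interface


@dataclass
class SweepResult:
    """Hypothetical stand-in for the SweepCoordinatorResult named in the commit."""
    findings: Dict[str, Any] = field(default_factory=dict)
    errors: Dict[str, str] = field(default_factory=dict)


async def _run_one(name: str, agent: Agent, per_agent_timeout: float):
    """Run one agent under its own timeout; report failures instead of raising."""
    try:
        value = await asyncio.wait_for(agent(), timeout=per_agent_timeout)
        return name, value, None
    except asyncio.TimeoutError:
        return name, None, "timeout"
    except Exception as exc:  # one failing agent must not abort the sweep
        return name, None, repr(exc)


async def run_sweep(
    agents: Dict[str, Agent],
    per_agent_timeout: float = 60.0,
    total_timeout: float = 300.0,
) -> SweepResult:
    """Run every agent concurrently, bounded by per-agent and total timeouts."""
    result = SweepResult()
    tasks = [
        asyncio.create_task(_run_one(name, agent, per_agent_timeout), name=name)
        for name, agent in agents.items()
    ]
    if not tasks:
        return result

    done, pending = await asyncio.wait(tasks, timeout=total_timeout)

    # Anything still running after the total budget is cancelled and recorded.
    for task in pending:
        task.cancel()
        result.errors[task.get_name()] = "total timeout"
    await asyncio.gather(*pending, return_exceptions=True)

    for task in done:
        name, value, error = task.result()
        if error is None:
            result.findings[name] = value
        else:
            result.errors[name] = error
    return result
```

Catching exceptions inside `_run_one` keeps one slow or failing agent from hiding the results of the others, which is the behaviour the "aggregates findings, actions, errors" wording suggests.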

12 Commits
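
The three most recent expense commits (e6c3d08990, 1536d83376, 6287b3bcef) describe a two-pass amount extraction: a deterministic regex for labeled totals, then a full-text scan that returns the largest dollar amount while skipping change and tip lines. The sketch below illustrates that idea only. `_SKIP_LINE_RE` and `_ANY_DOLLAR_RE` are names taken from the commit messages, but the patterns shown here are assumptions, as are `_LABELED_TOTAL_RE`, `extract_amount`, and the requirement that Pass 2 amounts carry a `$` sign.

```python
import re
from typing import Optional

# Pass 1 (assumed pattern): labeled totals such as "Total: $36.78",
# "Grand Total 40.10", or "Amount Due: 12.00". The lookbehind stops
# "Subtotal" from being read as "Total".
_LABELED_TOTAL_RE = re.compile(
    r"(?<![A-Za-z])(?:grand\s+total|amount\s+due|total)"
    r"\s*:?\s*\$?\s*(\d+(?:,\d{3})*\.\d{2})",
    re.IGNORECASE,
)

# Pass 2 helpers (names from the commit log, patterns assumed).
# Card brands are deliberately NOT in the skip list, so a terminal line
# like "VISA USD$ 36.78" still counts.
_SKIP_LINE_RE = re.compile(
    r"\b(?:change|tip|gratuity|cash\s+tendered)\b", re.IGNORECASE
)
_ANY_DOLLAR_RE = re.compile(r"\$\s*(\d+(?:,\d{3})*\.\d{2})")


def _to_float(raw: str) -> float:
    """Convert '1,234.56' to 1234.56."""
    return float(raw.replace(",", ""))


def extract_amount(text: str) -> Optional[float]:
    """Return the receipt total, or None if no amount can be found."""
    # Pass 1: trust an explicitly labeled total when one exists.
    labeled = _LABELED_TOTAL_RE.search(text)
    if labeled:
        return _to_float(labeled.group(1))

    # Pass 2: scan every line and keep the largest dollar amount,
    # ignoring lines that describe change, tips, or cash handed over.
    amounts = []
    for line in text.splitlines():
        if _SKIP_LINE_RE.search(line):
            continue
        amounts.extend(_to_float(m) for m in _ANY_DOLLAR_RE.findall(line))
    return max(amounts) if amounts else None
```

Taking the maximum over the whole text, rather than only the bottom half, is what lets a receipt that prints the charge at the top (the LAYAL CAFE case in 6287b3bcef) resolve correctly; the skip list is what keeps a large "cash tendered" figure from winning.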