# odoo18-rag

Retrieval-Augmented Generation over the full Odoo 18 documentation. Built for the ActiveBlue AI agent stack.

## Stack

| Component | What it does |
|---|---|
| `scraper/` | Crawls odoo.com/documentation/18.0, outputs clean JSONL |
| `indexer/` | Chunks pages, embeds with `nomic-embed-text`, loads Qdrant |
| `api/` | FastAPI: `/ask`, `/ask/stream`, `/agent/ask`, `/health` |
| Qdrant | Vector database (Docker) |
| Ollama @ `miaai:11434` | Embeddings + generation (local, HIPAA-safe) |

## Quick start

```bash
# 1. Pull the embedding model on miaai
ollama pull nomic-embed-text

# 2. Start Qdrant + RAG API
docker compose up -d

# 3. Scrape the docs (~800 pages, ~20 min)
docker compose run --rm scraper

# 4. Index into Qdrant (~30-40 min)
docker compose run --rm indexer

# 5. Test
curl http://localhost:8000/health
curl -X POST http://localhost:8000/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "How do I run a payroll batch in Odoo 18?"}'
```

## Endpoints

| Method | Path | Description |
|---|---|---|
| GET | `/health` | Qdrant + Ollama connectivity |
| GET | `/stats` | Vector count, models in use |
| GET | `/modules` | List indexed Odoo modules |
| POST | `/ask` | Blocking answer + sources |
| POST | `/ask/stream` | SSE token stream |
| POST | `/agent/ask` | ActiveBlue PeerBus integration |

### Ask with module filter

```bash
curl -X POST http://localhost:8000/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "How do reordering rules work?", "module": "inventory"}'
```

### Streaming

```bash
curl -N -X POST http://localhost:8000/ask/stream \
  -H "Content-Type: application/json" \
  -d '{"question": "Explain the Quote-to-Cash workflow"}'
```

## Agent integration

```python
import asyncio

from api.odoo_rag_agent import OdooRagAgent


async def main() -> None:
    agent = OdooRagAgent(rag_url="http://localhost:8000")

    # Generic
    result = await agent.ask("How do I configure NACHA payments?")

    # Module-scoped
    result = await agent.ask_payroll("How do I generate a payslip batch?")
    result = await agent.ask_accounting("What is the chart of accounts?")
    result = await agent.ask_inventory("How does MTO work?")

    # Streaming
    async for token in agent.ask_stream("Explain the CRM pipeline"):
        print(token, end="", flush=True)

    # PeerBus
    response = await agent.handle_peer_message({
        "action": "ask",
        "payload": {"question": "How do I set up taxes?", "module": "accounting"},
        "request_id": "req-001",
    })


# The agent methods are coroutines, so they need a running event loop.
asyncio.run(main())
```

## Re-indexing

Odoo releases doc updates regularly. Re-index to stay current:

```bash
docker compose run --rm scraper
docker compose run --rm indexer python /app/indexer/indexer.py --reset
```

Or add a monthly cron on the host:

```cron
0 3 1 * * cd /opt/odoo18-rag && docker compose run --rm scraper && docker compose run --rm indexer python /app/indexer/indexer.py --reset
```

## Scraper options

```bash
# Single module only
docker compose run --rm scraper python /app/scraper/scraper.py --module accounting

# Quick test (first 50 pages)
docker compose run --rm scraper python /app/scraper/scraper.py --limit 50
```

## Environment variables

All settings are configurable via the `environment` section of `docker-compose.yml`:

| Variable | Default | Description |
|---|---|---|
| `OLLAMA_URL` | `http://miaai:11434` | Ollama endpoint |
| `QDRANT_URL` | `http://qdrant:6333` | Qdrant endpoint |
| `EMBED_MODEL` | `nomic-embed-text` | Embedding model |
| `GEN_MODEL` | `llama3.1` | Generation model |
| `COLLECTION_NAME` | `odoo18_docs` | Qdrant collection |
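
For scripts that run outside Docker Compose, the variables above could be resolved with the same defaults as in the following sketch. Only the variable names and default values come from the table; `Settings`, `DEFAULTS`, and `load_settings` are hypothetical names chosen for illustration and are not taken from this repository.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables in the table above."""
    ollama_url: str
    qdrant_url: str
    embed_model: str
    gen_model: str
    collection_name: str


# Defaults copied from the environment variables table.
DEFAULTS = {
    "OLLAMA_URL": "http://miaai:11434",
    "QDRANT_URL": "http://qdrant:6333",
    "EMBED_MODEL": "nomic-embed-text",
    "GEN_MODEL": "llama3.1",
    "COLLECTION_NAME": "odoo18_docs",
}


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read each variable from `env`, falling back to the documented default."""
    env = os.environ if env is None else env
    values = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    return Settings(
        # Trailing slashes are stripped so paths can be appended consistently
        # (an assumption of this sketch, not documented behaviour).
        ollama_url=values["OLLAMA_URL"].rstrip("/"),
        qdrant_url=values["QDRANT_URL"].rstrip("/"),
        embed_model=values["EMBED_MODEL"],
        gen_model=values["GEN_MODEL"],
        collection_name=values["COLLECTION_NAME"],
    )
```

Calling `load_settings()` with no argument reads the process environment; passing a plain dict instead makes it easy to check overrides without touching the real environment.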