odoo18-rag

Retrieval-Augmented Generation over the full Odoo 18 documentation. Built for the ActiveBlue AI agent stack.

Stack

Component             What it does
scraper/              Crawls odoo.com/documentation/18.0, outputs clean JSONL
indexer/              Chunks pages, embeds with nomic-embed-text, loads Qdrant
api/                  FastAPI: /ask, /ask/stream, /agent/ask, /health
Qdrant                Vector database (Docker)
Ollama @ miaai:11434  Embeddings + generation (local, HIPAA-safe)

Quick start

# 1. Pull the embedding model on miaai
ollama pull nomic-embed-text

# 2. Start Qdrant + RAG API
docker compose up -d

# 3. Scrape the docs (~800 pages, ~20 min)
docker compose run --rm scraper

# 4. Index into Qdrant (~30-40 min)
docker compose run --rm indexer

# 5. Test
curl http://localhost:8000/health
curl -X POST http://localhost:8000/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "How do I run a payroll batch in Odoo 18?"}'

Endpoints

Method  Path         Description
GET     /health      Qdrant + Ollama connectivity
GET     /stats       Vector count, models in use
GET     /modules     List indexed Odoo modules
POST    /ask         Blocking answer + sources
POST    /ask/stream  SSE token stream
POST    /agent/ask   ActiveBlue PeerBus integration

Ask with module filter

curl -X POST http://localhost:8000/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "How do reordering rules work?", "module": "inventory"}'
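The same request can be built from Python with only the standard library. This is a minimal sketch: the helper name build_ask_request and its base_url parameter are illustrative and not part of this repo, and the response schema is not documented here, so the sketch only constructs the request and does not parse a reply.

```python
import json
import urllib.request
from typing import Optional


def build_ask_request(
    question: str,
    module: Optional[str] = None,
    base_url: str = "http://localhost:8000",  # assumed local deployment
) -> urllib.request.Request:
    """Build a POST /ask request; `module` is only sent when provided."""
    payload = {"question": question}
    if module is not None:
        payload["module"] = module
    return urllib.request.Request(
        url=f"{base_url}/ask",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_ask_request("How do reordering rules work?", module="inventory")
```

With the API running, urllib.request.urlopen(req) sends the request; the body of the reply can then be decoded with json.loads.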

Streaming

curl -N -X POST http://localhost:8000/ask/stream \
  -H "Content-Type: application/json" \
  -d '{"question": "Explain the Quote-to-Cash workflow"}'
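To consume the stream from Python, the SSE framing has to be decoded into individual payloads. The sketch below assumes /ask/stream emits standard Server-Sent Events with data: fields; the exact event format of this API is not documented here, and iter_sse_data is an illustrative helper rather than code from this repo.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from an iterable of text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # One leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
    # Lenient choice: flush a trailing event even if the stream ended
    # without a closing blank line (a strict SSE parser would drop it).
    if buffer:
        yield "\n".join(buffer)


sample = ["data: Hello\n", "\n", ": ping\n", "data: wor\n", "data: ld\n", "\n"]
events = list(iter_sse_data(sample))
```

Against a live server, the lines would come from the decoded HTTP response body instead of the sample list.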

Agent integration

import asyncio

from api.odoo_rag_agent import OdooRagAgent


async def main():
    agent = OdooRagAgent(rag_url="http://localhost:8000")

    # Generic question, no module filter
    result = await agent.ask("How do I configure NACHA payments?")

    # Module-scoped helpers
    result = await agent.ask_payroll("How do I generate a payslip batch?")
    result = await agent.ask_accounting("What is the chart of accounts?")
    result = await agent.ask_inventory("How does MTO work?")

    # Streaming: tokens are printed as they arrive
    async for token in agent.ask_stream("Explain the CRM pipeline"):
        print(token, end="", flush=True)

    # PeerBus: handle a message in the bus envelope format
    response = await agent.handle_peer_message({
        "action": "ask",
        "payload": {"question": "How do I set up taxes?", "module": "accounting"},
        "request_id": "req-001"
    })


# The agent methods are coroutines, so they need a running event loop
asyncio.run(main())
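A small helper can keep PeerBus messages consistent when several callers build them. The sketch below mirrors only the three keys visible in the example above (action, payload, request_id); whether PeerBus accepts or requires other fields is not stated here, and make_peer_message and its generated request-id format are hypothetical.

```python
import uuid
from typing import Optional


def make_peer_message(
    question: str,
    module: Optional[str] = None,
    request_id: Optional[str] = None,
) -> dict:
    """Build an "ask" message in the shape shown in the PeerBus example."""
    payload = {"question": question}
    if module is not None:
        payload["module"] = module
    return {
        "action": "ask",
        "payload": payload,
        # Hypothetical default: a short random id when the caller gives none.
        "request_id": request_id or f"req-{uuid.uuid4().hex[:8]}",
    }


message = make_peer_message(
    "How do I set up taxes?", module="accounting", request_id="req-001"
)
```

The resulting dict can be passed to agent.handle_peer_message inside an async function.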

Re-indexing

Odoo releases doc updates regularly. Re-index to stay current:

docker compose run --rm scraper
docker compose run --rm indexer python /app/indexer/indexer.py --reset

Or add a monthly cron on the host:

0 3 1 * * cd /opt/odoo18-rag && docker compose run --rm scraper && docker compose run --rm indexer python /app/indexer/indexer.py --reset

Scraper options

# Single module only
docker compose run --rm scraper python /app/scraper/scraper.py --module accounting

# Quick test (first 50 pages)
docker compose run --rm scraper python /app/scraper/scraper.py --limit 50

Environment variables

All settings are configurable in the environment section of docker-compose.yml:

Variable         Default             Description
OLLAMA_URL       http://miaai:11434  Ollama endpoint
QDRANT_URL       http://qdrant:6333  Qdrant endpoint
EMBED_MODEL      nomic-embed-text    Embedding model
GEN_MODEL        llama3.1            Generation model
COLLECTION_NAME  odoo18_docs         Qdrant collection
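
For scripts that run outside the containers, the same variables can be resolved with the documented defaults as fallbacks. This is a sketch of one way to do it, not the repo's actual configuration code; load_settings and DEFAULTS are illustrative names.

```python
import os
from typing import Mapping, Optional

# Defaults copied from the table above.
DEFAULTS = {
    "OLLAMA_URL": "http://miaai:11434",
    "QDRANT_URL": "http://qdrant:6333",
    "EMBED_MODEL": "nomic-embed-text",
    "GEN_MODEL": "llama3.1",
    "COLLECTION_NAME": "odoo18_docs",
}


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return each setting from `env` (os.environ by default), else its default."""
    source = os.environ if env is None else env
    return {name: source.get(name, default) for name, default in DEFAULTS.items()}


settings = load_settings({"GEN_MODEL": "mistral"})
```

Passing an explicit mapping, as in the last line, makes the override behavior easy to check without touching the real environment.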