Initial commit: Odoo 18 RAG stack

Scraper, indexer, and FastAPI query service for Retrieval-Augmented Generation over Odoo 18 documentation. Uses Qdrant + Ollama (nomic-embed-text + llama3.1). Integrates with ActiveBlue PeerBus agent interface.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
127
README.md
Normal file
@@ -0,0 +1,127 @@
# odoo18-rag

Retrieval-Augmented Generation over the full Odoo 18 documentation.
Built for the ActiveBlue AI agent stack.

## Stack

| Component | What it does |
|---|---|
| `scraper/` | Crawls odoo.com/documentation/18.0, outputs clean JSONL |
| `indexer/` | Chunks pages, embeds with `nomic-embed-text`, loads Qdrant |
| `api/` | FastAPI: `/ask`, `/ask/stream`, `/agent/ask`, `/health` |
| Qdrant | Vector database (Docker) |
| Ollama @ `miaai:11434` | Embeddings + generation (local, HIPAA-safe) |
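
The `indexer/` row mentions chunking pages before embedding, but the README does not say how pages are split. The following is only a minimal sketch of one common approach, overlapping word windows. The name `chunk_words`, the window size, and the overlap are hypothetical and not taken from the repository.

```python
from typing import Iterator


def chunk_words(text: str, size: int = 200, overlap: int = 40) -> Iterator[str]:
    """Yield overlapping windows of `size` words, stepping by `size - overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    for start in range(0, len(words), step):
        yield " ".join(words[start:start + size])
        if start + size >= len(words):
            break  # this window already reached the end of the text
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of indexing some words twice.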

## Quick start

```bash
# 1. Pull the embedding and generation models on miaai (skip any already present)
ollama pull nomic-embed-text
ollama pull llama3.1

# 2. Start Qdrant + RAG API
docker compose up -d

# 3. Scrape the docs (~800 pages, ~20 min)
docker compose run --rm scraper

# 4. Index into Qdrant (~30-40 min)
docker compose run --rm indexer

# 5. Test
curl http://localhost:8000/health
curl -X POST http://localhost:8000/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "How do I run a payroll batch in Odoo 18?"}'
```
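
After `docker compose up -d`, the API may need a moment before `/health` answers. A script can poll for readiness instead of sleeping for a fixed time. This is a sketch under one assumption: that `/health` returns HTTP 200 once Qdrant and Ollama are reachable (the README does not document the response). `http_ok` and `wait_until_healthy` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, or an HTTP error status


def wait_until_healthy(
    check: Callable[[], bool],
    attempts: int = 30,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `check` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

For example, `wait_until_healthy(lambda: http_ok("http://localhost:8000/health"))` returns `True` as soon as one poll succeeds and `False` after 30 failed attempts.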

## Endpoints

| Method | Path | Description |
|---|---|---|
| GET | `/health` | Qdrant + Ollama connectivity |
| GET | `/stats` | Vector count, models in use |
| GET | `/modules` | List indexed Odoo modules |
| POST | `/ask` | Blocking answer + sources |
| POST | `/ask/stream` | SSE token stream |
| POST | `/agent/ask` | ActiveBlue PeerBus integration |

### Ask with module filter

```bash
curl -X POST http://localhost:8000/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "How do reordering rules work?", "module": "inventory"}'
```
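
The same request can be built from Python with only the standard library. This is a sketch: `build_ask_request` and `ask` are hypothetical helper names, the payload keys `question` and `module` come from the curl examples, and the structure of the JSON response is not documented here, so it is returned undecoded beyond `json.load`.

```python
import json
import urllib.request
from typing import Optional


def build_ask_request(
    question: str,
    module: Optional[str] = None,
    base_url: str = "http://localhost:8000",
) -> urllib.request.Request:
    """Build the POST /ask request; `module` is only sent when given."""
    payload = {"question": question}
    if module is not None:
        payload["module"] = module
    return urllib.request.Request(
        f"{base_url}/ask",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, module: Optional[str] = None) -> dict:
    """Send the request and return the decoded JSON body as-is."""
    request = build_ask_request(question, module)
    with urllib.request.urlopen(request, timeout=120) as resp:
        return json.load(resp)
```

Separating request construction from sending keeps the payload logic checkable without a running service.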

### Streaming

```bash
curl -N -X POST http://localhost:8000/ask/stream \
  -H "Content-Type: application/json" \
  -d '{"question": "Explain the Quote-to-Cash workflow"}'
```
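
A client that consumes the stream has to split it into events. The sketch below assumes `/ask/stream` emits standard server-sent events with `data:` lines separated by blank lines. The README only says "SSE token stream", so whether each `data:` value is a plain token or a JSON object is not known here. `iter_sse_data` is a hypothetical helper name.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete server-sent event in `lines`."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one optional space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) and ":" comments are ignored.
```

It can be fed with decoded lines from any HTTP client, for example `(raw.decode("utf-8") for raw in resp)` over a `urllib` response object.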

## Agent integration

```python
import asyncio

from api.odoo_rag_agent import OdooRagAgent


async def main() -> None:
    agent = OdooRagAgent(rag_url="http://localhost:8000")

    # Generic
    result = await agent.ask("How do I configure NACHA payments?")

    # Module-scoped
    result = await agent.ask_payroll("How do I generate a payslip batch?")
    result = await agent.ask_accounting("What is the chart of accounts?")
    result = await agent.ask_inventory("How does MTO work?")

    # Streaming
    async for token in agent.ask_stream("Explain the CRM pipeline"):
        print(token, end="", flush=True)

    # PeerBus
    response = await agent.handle_peer_message({
        "action": "ask",
        "payload": {"question": "How do I set up taxes?", "module": "accounting"},
        "request_id": "req-001",
    })


# `await` is only valid inside a coroutine, so drive the calls with asyncio.run().
asyncio.run(main())
```
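
When several callers send PeerBus messages, building the envelope in one place avoids key typos and duplicate request IDs. The sketch below reuses the three keys from the example above (`action`, `payload`, `request_id`). `build_peer_message` is a hypothetical helper, and it assumes any unique string is acceptable as a request ID.

```python
import uuid
from typing import Optional


def build_peer_message(
    question: str,
    module: Optional[str] = None,
    request_id: Optional[str] = None,
) -> dict:
    """Build an "ask" message with the same keys as the README example."""
    payload = {"question": question}
    if module is not None:
        payload["module"] = module
    return {
        "action": "ask",
        "payload": payload,
        # Assumption: the request ID only needs to be unique per message.
        "request_id": request_id or f"req-{uuid.uuid4().hex}",
    }
```

The result can be passed straight to `agent.handle_peer_message(...)`.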

## Re-indexing

Odoo releases doc updates regularly. Re-index to stay current:

```bash
docker compose run --rm scraper
docker compose run --rm indexer python /app/indexer/indexer.py --reset
```

Or schedule it with cron on the host (here at 03:00 on the 1st of each month):

```cron
0 3 1 * * cd /opt/odoo18-rag && docker compose run --rm scraper && docker compose run --rm indexer python /app/indexer/indexer.py --reset
```
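
If the one-line cron entry becomes hard to maintain, the same two steps can be driven from a small Python script that cron calls instead. This is a sketch only: `REINDEX_COMMANDS` and `run_in_order` are hypothetical names, and the commands and working directory are copied from the cron line above.

```python
import subprocess
from typing import Callable, List, Sequence

REINDEX_COMMANDS: List[List[str]] = [
    ["docker", "compose", "run", "--rm", "scraper"],
    ["docker", "compose", "run", "--rm", "indexer",
     "python", "/app/indexer/indexer.py", "--reset"],
]


def run_in_order(
    commands: Sequence[Sequence[str]],
    cwd: str = "/opt/odoo18-rag",
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> int:
    """Run each command in turn; stop at the first non-zero exit code, like `&&`."""
    for argv in commands:
        result = runner(list(argv), cwd=cwd)
        if result.returncode != 0:
            return result.returncode
    return 0
```

Stopping at the first failure matters here: if the scrape fails, the indexer is not run with `--reset` against stale or partial input.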

## Scraper options

```bash
# Single module only
docker compose run --rm scraper python /app/scraper/scraper.py --module accounting

# Quick test (first 50 pages)
docker compose run --rm scraper python /app/scraper/scraper.py --limit 50
```

## Environment variables

All are configurable via the `environment` section of `docker-compose.yml`:

| Variable | Default | Description |
|---|---|---|
| `OLLAMA_URL` | `http://miaai:11434` | Ollama endpoint |
| `QDRANT_URL` | `http://qdrant:6333` | Qdrant endpoint |
| `EMBED_MODEL` | `nomic-embed-text` | Embedding model |
| `GEN_MODEL` | `llama3.1` | Generation model |
| `COLLECTION_NAME` | `odoo18_docs` | Qdrant collection |
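
A service would typically read these variables once at startup and fall back to the defaults in the table. The sketch below illustrates that pattern. `Settings` and `load_settings` are hypothetical names, not necessarily how the repository's code reads its configuration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    ollama_url: str
    qdrant_url: str
    embed_model: str
    gen_model: str
    collection_name: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read each variable from `env`, falling back to the defaults in the table."""
    return Settings(
        ollama_url=env.get("OLLAMA_URL", "http://miaai:11434"),
        qdrant_url=env.get("QDRANT_URL", "http://qdrant:6333"),
        embed_model=env.get("EMBED_MODEL", "nomic-embed-text"),
        gen_model=env.get("GEN_MODEL", "llama3.1"),
        collection_name=env.get("COLLECTION_NAME", "odoo18_docs"),
    )
```

Passing a plain dict as `env` makes the defaults and overrides easy to check without touching the real environment.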