# AVC Phone Agent — Project Specification
> Claude Code authoritative reference. All architecture, security, and build decisions live here.
> Repo: `git.activeblue.net/tocmo0nlord/avc-phone-ai`
> Last updated: 2026-06-23 | Active Blue LLC

---

## Project Overview

**Name:** AVC Phone Agent
**Owner:** Active Blue LLC
**Client:** Advanced Vision Care (AVC) — multi-location ophthalmology/optometry practice (FL + TX)
**Agent name:** AVA (Advanced Vision Assistant)
**Purpose:** Automated AI phone agent that answers patient calls, books tentative appointments
into Odoo CRM with call recordings and transcripts attached, and self-improves via
Claude-powered transcript monitoring and a fine-tuning feedback loop.

---

## Existing Codebase — What to Keep, What to Change

The previous build at `/home/tocmo0nlord/avc-phone/` is a working foundation.
**Do not rewrite what works.** Apply only the changes documented in this section.

### Files and their status

| File | Status | Action |
|------|--------|--------|
| `bot.py` | Keep with one change | Swap Whisper STT for Deepgram Nova-2 |
| `server.py` | Keep with one change | Swap Auth Token for API Key Secret |
| `practice.py` | Keep as-is | No changes |
| `extract.py` | Keep as-is | No changes |
| `odoo_client.py` | Keep as-is | Already uses API key auth correctly |

### What is already solved — do not touch

**`EndCallProcessor` in `bot.py`** — AVC-side call termination is fully implemented.
Watches LLM text stream for closing keywords ("goodbye"), waits for TTS to finish via
`BotStoppedSpeakingFrame`, then pushes `EndTaskFrame` upstream. `TwilioFrameSerializer`
with `auto_hang_up` drops the carrier leg. This is correct. Zero changes.

**Mulaw 8kHz ↔ 16kHz conversion** — handled internally by `TwilioFrameSerializer`.
`PIPELINE_SAMPLE_RATE = 16000`, `WIRE_SAMPLE_RATE = 8000` are already set correctly.
No custom audio module needed.

**VAD tuned for telephony** — `confidence=0.5`, `min_volume=0.3` already loosened from
desktop defaults. These settings directly address the repeat-yourself problem on the
VAD side.

**Capacity gating** — `MAX_CONCURRENT_CALLS=2` with atomic slot reservation in
`server.py` prevents GPU thrashing. Keep it.

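The idea behind atomic slot reservation can be pictured with a small asyncio sketch. It is an illustration only, not the code in `server.py`: the `CallSlots` class, its `try_acquire`/`release` methods, and `handle_call` are hypothetical names. It assumes a single event loop, where a check-and-increment with no `await` in between cannot be interleaved.

```python
import asyncio


class CallSlots:
    """Hypothetical capacity gate: at most `limit` calls hold a slot at once."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.active = 0

    def try_acquire(self) -> bool:
        # No await between the check and the increment, so on one event
        # loop two incoming webhooks cannot both take the last slot.
        if self.active >= self.limit:
            return False
        self.active += 1
        return True

    def release(self) -> None:
        # Called when the call ends; never drop below zero.
        self.active = max(0, self.active - 1)


async def handle_call(slots: CallSlots, call_id: str) -> str:
    if not slots.try_acquire():
        return f"{call_id}: busy"
    try:
        await asyncio.sleep(0)  # stand-in for running the pipeline
        return f"{call_id}: answered"
    finally:
        slots.release()  # always free the slot, even if the pipeline fails


async def demo() -> list[str]:
    slots = CallSlots(limit=2)  # mirrors MAX_CONCURRENT_CALLS=2
    return await asyncio.gather(*(handle_call(slots, f"call-{i}") for i in range(3)))
```

Releasing in a `finally` block is the part that matters most for a gate like this, since a slot leaked by a crashed call would permanently reduce capacity.
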
**`AudioHeartbeat`** — diagnostic processor that distinguishes VAD failure from
transport stall. Keep it.

**Post-call extraction (`extract.py`)** — single JSON-mode completion after call ends.
Correctly uses `format: json`, uses verified Twilio caller-ID instead of trusting model
output, falls back to JSONL if Odoo is unreachable. Keep it.

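The two safeguards in the extraction step, trusting the verified caller ID over model output and falling back to JSONL, can be sketched as follows. This is not the code in `extract.py`: `persist_lead`, `send_to_odoo`, and the `phone` field name are hypothetical, and the sketch assumes an unreachable Odoo surfaces as an `OSError` (which covers `ConnectionError` and `TimeoutError`).

```python
import json
from pathlib import Path
from typing import Callable


def persist_lead(
    lead: dict,
    caller_id: str,
    send_to_odoo: Callable[[dict], int],
    fallback_path: Path,
) -> str:
    """Try Odoo first; if it is unreachable, append the lead to a JSONL file."""
    # Overwrite whatever number the model extracted with the verified caller ID.
    record = {**lead, "phone": caller_id}
    try:
        send_to_odoo(record)
        return "odoo"
    except OSError:
        # One JSON object per line, so the file can be replayed later.
        with fallback_path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")
        return "jsonl"
```

Returning which path was taken makes the fallback visible to the caller, so it can be logged rather than happening silently.
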
**Odoo integration (`odoo_client.py`)** — already uses `ODOO_API_KEY` for XML-RPC auth,
not password. Correct pattern. No changes.

---

## Change 1 — Swap Whisper STT for Deepgram Nova-2 (`bot.py`)

**Why:** Whisper buffers audio chunks before transcribing — 1-3 seconds of buffering
before the LLM sees any input. This is the primary cause of non-reply and the
repeat-yourself problem. Deepgram Nova-2 via Pipecat's native transport delivers
end-of-utterance events in under 300ms.

**Remove from `bot.py`:**
```python
# Remove this import
from pipecat.services.whisper.stt import WhisperSTTService

# Remove these env vars
WHISPER_MODEL = os.environ.get("WHISPER_MODEL", "base")
WHISPER_DEVICE = os.environ.get("WHISPER_DEVICE", "cuda")
WHISPER_COMPUTE = os.environ.get("WHISPER_COMPUTE", "float16")
WHISPER_HOTWORDS = os.environ.get("WHISPER_HOTWORDS", "...")

# Remove the entire HintedWhisperSTTService class
```

**Add to `bot.py`:**
```python
# Add import
from pipecat.services.deepgram.stt import DeepgramSTTService

# Add env var
DEEPGRAM_API_KEY = os.environ.get("DEEPGRAM_API_KEY", "")

# Replace stt instantiation in run_agent()
stt = DeepgramSTTService(
    api_key=DEEPGRAM_API_KEY,
    settings=DeepgramSTTService.Settings(
        model="nova-2",
        language="en-US",
        smart_format=True,
        punctuate=True,
        interim_results=False,  # final transcripts only — avoids double-firing
        # ms of silence before end-of-utterance fires; Deepgram documents
        # UtteranceEnd as requiring interim_results=True, so verify this pairing
        utterance_end_ms=1000,
    )
)
```

**Note on Whisper:** Remove from real-time pipeline only. Whisper large-v3 is retained
for post-call transcription in Phase 3 (`recording/transcriber.py`) where latency does
not matter and accuracy is more important than speed.

---

## Change 2 — Swap Auth Token for API Key Secret (`server.py`)

**Why:** `TWILIO_AUTH_TOKEN` is the master credential for the entire Twilio account.
A leak compromises every Twilio integration. A Standard API Key is scoped to this
application and revocable independently.

**Credential hierarchy:**
```
Twilio Account SID (not secret on its own)
├── Auth Token (master — Twilio console only, rotate quarterly)
└── API Key: avc-phone-agent-prod (Standard scope)
    ├── TWILIO_API_KEY_SID: SK...
    └── TWILIO_API_KEY_SECRET: (treat as a password)
```

**Create the API Key:**
1. Twilio console → Account → API Keys → Create new Standard key
2. Name it `avc-phone-agent-prod`
3. Copy SID (`SK...`) and Secret — Secret is shown once only

**Changes in `server.py`:**

Remove:
```python
TWILIO_AUTH_TOKEN = os.environ.get("TWILIO_AUTH_TOKEN")
```

Add:
```python
TWILIO_API_KEY_SID = os.environ.get("TWILIO_API_KEY_SID")
TWILIO_API_KEY_SECRET = os.environ.get("TWILIO_API_KEY_SECRET")
```

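In `_twilio_signature_ok()`, change the HMAC key. Twilio documents `X-Twilio-Signature`
as an HMAC-SHA1 keyed with the account Auth Token, so confirm against a live webhook
that validation still passes with the API Key Secret before relying on this change:
```python
# Before
digest = hmac.new(TWILIO_AUTH_TOKEN.encode(), payload.encode("utf-8"), hashlib.sha1).digest()

# After
digest = hmac.new(TWILIO_API_KEY_SECRET.encode(), payload.encode("utf-8"), hashlib.sha1).digest()
```
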
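For context, the scheme that produces `payload` and the final comparison can be sketched as a standalone function. This follows Twilio's documented signing procedure for form-encoded webhooks but is not the code in `server.py`: `compute_signature`, `signature_ok`, and the `signing_key` parameter are hypothetical names, and `signing_key` stands for whichever secret Twilio actually signs with.

```python
import base64
import hashlib
import hmac


def compute_signature(signing_key: str, url: str, params: dict[str, str]) -> str:
    """Base64 HMAC-SHA1 over the URL plus the sorted POST parameters."""
    # Full request URL, then each parameter name and value appended with
    # no separators, ordered by parameter name.
    payload = url + "".join(key + params[key] for key in sorted(params))
    digest = hmac.new(
        signing_key.encode("utf-8"), payload.encode("utf-8"), hashlib.sha1
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def signature_ok(
    signing_key: str, url: str, params: dict[str, str], header_value: str
) -> bool:
    expected = compute_signature(signing_key, url, params)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))
```

The URL must be the public one Twilio called (scheme, host, path, and query string), which matters behind a reverse proxy such as Traefik where the local request URL differs.
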
Update the guard condition:
```python
# Before
if TWILIO_VALIDATE and TWILIO_AUTH_TOKEN:

# After
if TWILIO_VALIDATE and TWILIO_API_KEY_SECRET:
```

Update the warning log:
```python
# Before
elif not TWILIO_AUTH_TOKEN:
    logger.warning("/voice signature validation DISABLED (no TWILIO_AUTH_TOKEN set)")

# After
elif not TWILIO_API_KEY_SECRET:
    logger.warning("/voice signature validation DISABLED (no TWILIO_API_KEY_SECRET set)")
```

In `TwilioFrameSerializer` instantiation. Twilio REST basic auth with an API Key uses
the key SID as the username, so verify that the `auto_hang_up` request still
authenticates while `account_sid` stays set to the Account SID:
```python
# Before
serializer = TwilioFrameSerializer(
    stream_sid=stream_sid,
    call_sid=call_sid,
    account_sid=TWILIO_ACCOUNT_SID,
    auth_token=TWILIO_AUTH_TOKEN,
)

# After
serializer = TwilioFrameSerializer(
    stream_sid=stream_sid,
    call_sid=call_sid,
    account_sid=TWILIO_ACCOUNT_SID,
    auth_token=TWILIO_API_KEY_SECRET,
)
```

**Key rotation procedure:**
1. Create new Standard API Key in Twilio console
2. Update `TWILIO_API_KEY_SID` + `TWILIO_API_KEY_SECRET` in `.env`
3. Restart the service — no rebuild needed
4. Verify one test call succeeds
5. Revoke old key in Twilio console

Rotate on: any suspected leak, any team member departure, quarterly as routine.

---

## Change 3 — Update `.env`

**Remove:**
```env
TWILIO_AUTH_TOKEN=
```

**Add:**
```env
TWILIO_API_KEY_SID=SK...
TWILIO_API_KEY_SECRET=
DEEPGRAM_API_KEY=
```

**Full `.env` reference:**
```env
# Twilio — Auth Token lives in Twilio console only, never on this server
TWILIO_ACCOUNT_SID=AC...
TWILIO_API_KEY_SID=SK...
TWILIO_API_KEY_SECRET=
TWILIO_PHONE_NUMBER=+1...

# STT: Deepgram (real-time, in-call only)
DEEPGRAM_API_KEY=
DEEPGRAM_MODEL=nova-2

# LLM: Ollama
OLLAMA_URL=http://127.0.0.1:11434/v1
OLLAMA_MODEL=activeblue-avc:latest
LLM_PROVIDER=ollama
LLM_TEMPERATURE=0.3
LLM_MAX_TOKENS=160

# Anthropic (optional LLM swap + monitoring + synthetic data)
ANTHROPIC_API_KEY=
ANTHROPIC_MODEL=claude-sonnet-4-6

# TTS: Kokoro
KOKORO_VOICE=af_heart
KOKORO_MODEL_DIR=/home/tocmo0nlord/pipecat-run/models

# Odoo
ODOO_URL=https://avc.activeblue.net
ODOO_DB=avc
ODOO_USER=
ODOO_API_KEY=
ODOO_TARGET=crm
ODOO_STAGE_ID=
ODOO_TEAM_ID=
ODOO_USER_ID=

# Server
PUBLIC_HOST=avc-phone.activeblue.net
PORT=8200
BIND_HOST=127.0.0.1
MAX_CONCURRENT_CALLS=2
STREAM_TOKEN=

# Call behaviour
AGENT_NAME=AVA
ENABLE_TOOLS=
VAD_CONFIDENCE=0.5
VAD_MIN_VOLUME=0.3
VAD_START_SECS=0.2
VAD_STOP_SECS=0.5

# Monitoring (Phase 4)
MONITORING_ENABLED=true
MONITORING_SCHEDULE=0 2 * * *

# A/B model routing (Phase 5 only)
AB_SPLIT_PERCENT=0
AB_MODEL_B=
```

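A startup check can catch a half-applied `.env` migration before the first call arrives. The sketch below is one possible shape, not existing project code: `REQUIRED_VARS`, `RETIRED_VARS`, `missing_env`, and `stale_env` are hypothetical names, and the choice of which variables count as required for Phase 1 is an assumption drawn from the changes listed in this section.

```python
import os
from collections.abc import Mapping

# Assumed minimum for Phase 1; extend as later phases add settings.
REQUIRED_VARS = (
    "TWILIO_ACCOUNT_SID",
    "TWILIO_API_KEY_SID",
    "TWILIO_API_KEY_SECRET",
    "DEEPGRAM_API_KEY",
)

# Variables that Change 3 removes and that should no longer be present.
RETIRED_VARS = ("TWILIO_AUTH_TOKEN",)


def missing_env(env: Mapping[str, str] = os.environ) -> list[str]:
    """Names of required variables that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def stale_env(env: Mapping[str, str] = os.environ) -> list[str]:
    """Names of retired variables that are still set."""
    return [name for name in RETIRED_VARS if env.get(name, "").strip()]


if __name__ == "__main__":
    problems = [f"missing {n}" for n in missing_env()]
    problems += [f"still set: {n}" for n in stale_env()]
    if problems:
        raise SystemExit("; ".join(problems))
```
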
---

## Model Configuration

### Current production model: `activeblue-avc:latest`

| Property | Value | Notes |
|----------|-------|-------|
| Base | `llama3.1:8b-instruct-q4_K_M` | Llama 3.1 8B, Q4_K_M quantization |
| ID | `366a6cc15bb7` | Rebuilt clean 2026-06-23 |
| Size | 4.9GB | Down from 8.7GB Q8_0 |
| VRAM usage | ~4.5GB | Leaves 11.5GB headroom on RTX 5080 |
| Context | 4096 tokens | Sufficient for any phone call |
| Temperature | 0.3 | Low — maximizes JSON schema compliance |
| Top-p | 0.9 | Standard |
| Adapter | None | 44-pair LoRA adapter discarded |

### Modelfile (rebuild reference)

```
FROM llama3.1:8b-instruct-q4_K_M

PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
PARAMETER num_ctx 4096
PARAMETER temperature 0.3
PARAMETER top_p 0.9

TEMPLATE "{{- range .Messages }}<|start_header_id|>{{ .Role }}<|end_header_id|>
{{ .Content }}<|eot_id|>
{{- end }}<|start_header_id|>assistant<|end_header_id|>
"
```

### Why Q4_K_M not Q8_0

Q8_0 consumed ~8.5GB VRAM for weights alone. Under telephony load this caused
inference latency spikes. Q4_K_M cuts weight VRAM to ~4.5GB with negligible quality
difference at 8B scale.

### Why no adapter

The 44-pair LoRA adapter was adding noise, not signal. The minimum viable dataset is
200+ pairs per intent category. The adapter is rebuilt in Phase 5 with 500+ pairs in
JSON output format.

### Ollama inventory (current)

```
activeblue-avc:latest          366a6cc15bb7    4.9GB    production
llama3.1:8b-instruct-q4_K_M    46e0c10c039e    4.9GB    base
nomic-embed-text:latest        0a109f422b47    274MB    embeddings
```

### Phase 5 training note

Axolotl pulls from HuggingFace in safetensors format, not Ollama GGUF:
```bash
# Phase 5 only — do not run now
huggingface-cli download meta-llama/Llama-3.1-8B-Instruct
# ~16GB on disk, separate from Ollama storage
```

---

## Build Phases

Claude Code must not scaffold Phase N+1 until the Phase N gate is marked complete.

### Phase 1 — Reliable call loop

**Goal:** Every utterance gets a response. Zero silent failures. AVC hangs up — not
the caller.

- [ ] Apply Change 1: swap Whisper for Deepgram in `bot.py`
- [ ] Apply Change 2: swap Auth Token for API Key Secret in `server.py`
- [ ] Apply Change 3: update `.env`
- [ ] Verify `EndCallProcessor` termination in Twilio call logs (AVC side, not caller)
- [ ] Verify `AudioHeartbeat` diagnostic logging active
- [ ] Verify `MAX_CONCURRENT_CALLS` capacity gating works

**Gate — all five must pass:**
1. 10 consecutive test calls — zero silent non-responses
2. Zero zombie pipeline instances after call ends (`docker stats`)
3. Call termination from AVC side confirmed in Twilio call logs
4. JSON parse failure rate visible in logs — measurable, not invisible
5. Response latency P95 under 3 seconds from STT end-of-utterance to first TTS audio

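Gate item 5 needs an agreed definition of P95. One simple option is the nearest-rank percentile, sketched here under the assumption that each sample is the number of seconds from STT end-of-utterance to first TTS audio for one turn. The function names `p95` and `latency_gate_passes` are hypothetical, and how the samples are collected from the logs is left open.

```python
def p95(samples: list[float]) -> float:
    """Nearest-rank 95th percentile: the smallest value that is >= 95% of samples."""
    if not samples:
        raise ValueError("no latency samples")
    ordered = sorted(samples)
    # ceil(0.95 * n) in integer arithmetic to avoid float rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]  # rank is 1-based


def latency_gate_passes(latencies_s: list[float], limit_s: float = 3.0) -> bool:
    """True when the P95 latency is strictly under the limit."""
    return p95(latencies_s) < limit_s
```

With only 10 test calls the nearest-rank P95 is simply the slowest turn in a small sample, so it is worth recording every turn of every call rather than one figure per call.
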
### Phase 2 — Accuracy (RAG + validation)

- [ ] Populate `rag/data/*.jsonl` with real AVC data (human task — see RAG section)
- [ ] ChromaDB RAG retriever wired into pipeline
- [ ] Response validator: JSON schema + factual cross-check + PHI leak scan
- [ ] Keyword blocklist (uncertainty phrases → handoff)
- [ ] Intent classifier routing
- [ ] Turn counter: max 3 failed turns before forced handoff + termination

**Gate:** 20 manual test calls, zero hallucinations on AVC-specific facts

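The keyword blocklist and the turn counter from the checklist above are both small pieces of state, and a rough sketch may help when scoping them. Everything here is assumed for illustration: the phrase list is a placeholder rather than the approved blocklist, `needs_handoff` and `TurnCounter` are hypothetical names, and failed turns are counted cumulatively per call rather than consecutively.

```python
from dataclasses import dataclass

# Placeholder phrases only; the real blocklist is a Phase 2 deliverable.
UNCERTAINTY_PHRASES = ("i'm not sure", "i think", "i believe", "probably")
MAX_FAILED_TURNS = 3


def needs_handoff(reply: str) -> bool:
    """True when the agent's reply contains a blocklisted uncertainty phrase."""
    text = reply.lower()
    return any(phrase in text for phrase in UNCERTAINTY_PHRASES)


@dataclass
class TurnCounter:
    """Tracks failed turns for one call."""

    failed: int = 0

    def record(self, turn_ok: bool) -> str:
        """Return "continue" or "handoff" after each agent turn."""
        if turn_ok:
            return "continue"
        self.failed += 1
        return "handoff" if self.failed >= MAX_FAILED_TURNS else "continue"
```

A turn could be marked as failed when the response validator rejects it or when `needs_handoff` fires, which keeps the two checks feeding a single decision point.
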
### Phase 3 — Booking

- [ ] Real-time calendar availability check (`odoo/calendar.py`)
- [ ] Whisper large-v3 post-call transcription (`recording/transcriber.py`)
- [ ] Recording + transcript attached to Odoo lead chatter
- [ ] Staff review flow confirmed in Odoo

**Gate:** Staff receives, reviews, and confirms a lead end-to-end

### Phase 4 — Monitoring

- [ ] Transcript index (`recordings/index.jsonl`)
- [ ] Claude monitoring job
- [ ] Dashboard: toggle, alert queue, one-click apply, playback, quality tagging

**Gate:** First monitoring run produces actionable suggestions

### Phase 5 — Fine-tuning

- [ ] Pull HuggingFace base (see model section)
- [ ] Synthetic data generation via Claude API in JSON output format
- [ ] Real call exporter using staff quality tags
- [ ] Axolotl QLoRA on RTX 5080
- [ ] Model registry + versioning + A/B routing

**Gate:** New model outperforms baseline over 50+ calls

---

## Repository Structure

```
avc-phone-ai/
├── CLAUDE.md                    ← this file
├── README.md
├── .env                         ← never committed
├── .env.example
├── .gitignore                   ← includes .env, recordings/, *.gguf
│
├── bot.py                       ← Pipecat pipeline (Phase 1 changes here)
├── server.py                    ← Twilio webhook server (Phase 1 changes here)
├── practice.py                  ← AVC facts + Odoo persistence
├── extract.py                   ← post-call appointment extraction
├── odoo_client.py               ← Odoo XML-RPC client
│
├── rag/                         ← Phase 2
│   ├── store.py
│   ├── loader.py
│   ├── retriever.py
│   └── data/
│       ├── avc_locations.jsonl
│       ├── avc_providers.jsonl
│       ├── avc_services.jsonl
│       ├── avc_hours.jsonl
│       ├── avc_insurance.jsonl
│       └── avc_faqs.jsonl
│
├── recording/                   ← Phase 3
│   ├── transcriber.py           ← Whisper large-v3 post-call only
│   └── storage.py
│
├── monitoring/                  ← Phase 4
│   ├── monitor.py
│   ├── analyzer.py
│   ├── diff_engine.py
│   ├── scheduler.py
│   └── dashboard/
│       ├── app.py
│       └── static/
│
├── training/                    ← Phase 5 stub
│   └── README.md
│
├── tests/
│   ├── test_bot.py
│   ├── test_server.py
│   ├── test_odoo_client.py
│   ├── test_extract.py
│   └── fixtures/
│       └── sample_transcripts.jsonl
│
├── scripts/
│   ├── deploy.sh
│   └── smoke_test.sh
│
├── avc-phone.service            ← existing systemd unit
└── traefik-avc-phone.yml        ← existing Traefik config
```

---

## Infrastructure

| Component | Host | Address | Notes |
|-----------|------|---------|-------|
| Pipecat pipeline | `miaai` | `10.10.1.221` | Python async, systemd |
| Ollama LLM | `miaai` | `http://127.0.0.1:11434/v1` | `activeblue-avc:latest` |
| ChromaDB (Phase 2) | `miaai` | `http://10.10.1.221:8001` | Docker volume |
| Twilio webhook | `miaai` | `https://avc-phone.activeblue.net` | Traefik + Let's Encrypt |
| Monitoring dashboard | `miaai` | `https://avc-monitor.activeblue.net` | internal only |
| Odoo CRM | — | `https://avc.activeblue.net` | XML-RPC, db: `avc` |
| Recordings | `miaai` | `/home/tocmo0nlord/avc-phone/recordings/` | local only |
| Gitea | — | `https://git.activeblue.net/tocmo0nlord/avc-phone-ai` | user: `tocmo0nlord` |

---

## RAG Store (Phase 2)

**Stack:** ChromaDB + `nomic-embed-text:latest` (already in Ollama)
**Collection:** `avc_knowledge`
**Retrieval:** Top-3 chunks per query on caller's current turn only

### JSONL record format

```json
{
  "id": "hours-kendall-weekday",
  "text": "The Kendall location is open Monday through Friday 8:00 AM to 5:00 PM.",
  "tags": ["hours", "kendall"],
  "last_updated": "2026-06-23"
}
```

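Since the data files are populated by hand, a strict loader that rejects malformed records early is cheap insurance. The following is a sketch of such a check against the record format above, not the contents of `rag/loader.py`: `parse_record` and `load_records` are hypothetical names, and treating all four fields as mandatory with unique `id` values is an assumption.

```python
import json
from collections.abc import Iterable
from datetime import date

# Field name -> expected JSON type, taken from the record format above.
REQUIRED_FIELDS = {"id": str, "text": str, "tags": list, "last_updated": str}


def parse_record(line: str) -> dict:
    """Parse one JSONL line and check it against the record format."""
    record = json.loads(line)
    if not isinstance(record, dict):
        raise ValueError("record must be a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(record.get(field), expected_type):
            raise ValueError(f"{field!r} missing or not {expected_type.__name__}")
    date.fromisoformat(record["last_updated"])  # raises ValueError on a bad date
    return record


def load_records(lines: Iterable[str]) -> dict[str, dict]:
    """Load records keyed by id, skipping blank lines and rejecting duplicates."""
    records: dict[str, dict] = {}
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        record = parse_record(line)
        if record["id"] in records:
            raise ValueError(f"line {number}: duplicate id {record['id']!r}")
        records[record["id"]] = record
    return records
```

Keying by `id` also gives the monitoring job a stable handle for the `record_id` it references in suggested changes.
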
### Data files — populated before Phase 2, not before Phase 1

| File | Content |
|------|---------|
| `avc_locations.jsonl` | Address, phone, fax, parking per location |
| `avc_providers.jsonl` | Name, title, specialty, locations, languages |
| `avc_services.jsonl` | Exam types, procedures |
| `avc_hours.jsonl` | Hours per location, holiday closures, after-hours |
| `avc_insurance.jsonl` | Accepted plans per location |
| `avc_faqs.jsonl` | Approved Q&A pairs |

**Note:** `practice.py` already contains real AVC location and insurance data scraped
from `advancedvisioncareflorida.com`. Use it as the seed for the JSONL files rather
than starting from scratch.

---

## Claude Monitoring (Phase 4)

### What it analyzes

- Facts stated by AVA contradicting RAG store
- System prompt violations
- Calls that should have been handoffs
- High failed turn counts — model or prompt signal
- RAG gaps (AVA said "I don't have that" — should it be added?)
- Phrasing that caused caller confusion

### Output schema

```json
{
  "call_sid": "CA...",
  "severity": "high",
  "issue_type": "factual_error",
  "description": "AVA stated Kendall closes at 6pm. RAG store says 5pm.",
  "suggested_action": "rag_update",
  "suggested_change": {
    "file": "rag/data/avc_hours.jsonl",
    "record_id": "hours-kendall-weekday",
    "field": "text",
    "old": "...open until 6pm...",
    "new": "...open until 5pm..."
  }
}
```

`suggested_action`: `rag_update` | `prompt_change` | `blocklist_add` | `flag_for_review`

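Because the monitoring output comes from a model, it is worth validating each suggestion before it reaches the alert queue. A minimal sketch follows, with some assumptions stated up front: only `"high"` appears as a severity in this spec, so the `"medium"` and `"low"` levels are guesses, and `validate_suggestion` and `alert_queue` are hypothetical names rather than functions in `monitoring/`.

```python
# "high" comes from the schema example; the other levels are assumed.
SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}
ALLOWED_ACTIONS = {"rag_update", "prompt_change", "blocklist_add", "flag_for_review"}


def validate_suggestion(suggestion: dict) -> dict:
    """Reject suggestions whose severity or action is outside the schema."""
    if suggestion.get("severity") not in SEVERITY_ORDER:
        raise ValueError(f"unknown severity: {suggestion.get('severity')!r}")
    if suggestion.get("suggested_action") not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {suggestion.get('suggested_action')!r}")
    return suggestion


def alert_queue(suggestions: list[dict]) -> list[dict]:
    """Valid suggestions ordered most severe first; ties keep arrival order."""
    valid = [validate_suggestion(s) for s in suggestions]
    return sorted(valid, key=lambda s: SEVERITY_ORDER[s["severity"]])
```

Python's `sorted` is stable, so suggestions of equal severity stay in the order the monitoring run produced them.
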
### Dashboard

FastAPI + HTML/JS at `https://avc-monitor.activeblue.net` (internal only).

| Feature | Description |
|---------|-------------|
| Enable/disable toggle | Pauses scheduler without redeployment |
| Alert queue | Suggestions sorted by severity |
| One-click apply | Applies change, commits via Gitea API to `avc-phone-ai` |
| Call playback | Audio + transcript side-by-side |
| Quality tagging | Staff tags calls from dashboard |
| Manual trigger | `POST /monitor/run` |

---

## Fine-Tuning Pipeline (Phase 5 — stub)

> Not scaffolded until Phase 4 complete and monitoring has run minimum two weeks.
> See `training/README.md` — populated at Phase 5 start.

- Synthetic data: Claude API generates Q&A in JSON output format — schema not style
- Real calls: staff-tagged `"good"` + corrected bad calls
- Target: 500+ pairs per intent before first Axolotl run
- QLoRA via Axolotl on RTX 5080, base: HuggingFace `meta-llama/Llama-3.1-8B-Instruct`
- Versioned Ollama models: `activeblue-avc:vN`
- A/B routing: promote when new version wins on booking + hallucination rate over 50+ calls

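The A/B routing item pairs with the `AB_SPLIT_PERCENT` and `AB_MODEL_B` settings in `.env`. One way to split traffic is to hash the call SID into a bucket, sketched here as an assumption rather than a decided design: `pick_model` is a hypothetical name, and hashing by call SID is just one option that keeps each call pinned to a single model for its whole duration.

```python
import hashlib


def pick_model(call_sid: str, split_percent: int, model_a: str, model_b: str) -> str:
    """Route roughly `split_percent` percent of calls to model B."""
    # With no candidate model or a zero split, everything stays on model A,
    # which matches the default AB_SPLIT_PERCENT=0.
    if not model_b or split_percent <= 0:
        return model_a
    # Hash the call SID so the same call always lands in the same bucket (0-99).
    digest = hashlib.sha256(call_sid.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return model_b if bucket < split_percent else model_a
```

A deterministic split also makes results reproducible afterwards, because the model that served any given call can be recomputed from its SID and the split setting in force at the time.
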
---

## HIPAA and Compliance

- AVA identifies as automated at call start — no exceptions
- No PHI in ChromaDB — practice information only
- Recordings on `miaai` only — no cloud storage
- Odoo API user: minimum permissions, not admin
- All endpoints HTTPS via Traefik
- `.env` never committed

---

## Deploy Script (`scripts/deploy.sh`)

```bash
#!/bin/bash
# Pull the latest code, refresh dependencies, and restart the service.
set -euo pipefail  # abort on errors, unset variables, and failed pipes

cd /home/tocmo0nlord/avc-phone
git pull origin main
pip install -r requirements.txt --quiet
systemctl restart avc-phone
systemctl status avc-phone --no-pager
echo "[deploy] Done."
```

---

## Development Conventions

- Python 3.13 (matches `miaai` miniconda environment)
- Async throughout — Pipecat is async-native
- `loguru` for all logging — already in use, keep consistent
- Structured log lines for all diagnostic events
- `python-dotenv` for local dev, env injection in prod
- Secrets never hardcoded
- Every module has `if __name__ == "__main__":` for isolated testing

---

## Key Dependencies (current)

```
pipecat-ai==1.3.0          # installed at /opt/miniconda3
pipecat-ai[deepgram]       # add for Phase 1 Deepgram swap
deepgram-sdk               # add for Phase 1
kokoro-tts                 # already installed
ollama                     # already installed
scipy / numpy              # already installed (pipecat deps)
chromadb                   # add for Phase 2
sentence-transformers      # add for Phase 2
anthropic                  # for monitoring + optional LLM swap
openai-whisper             # retained for post-call transcription only
fastapi / uvicorn          # already installed
loguru                     # already installed
httpx                      # already installed
```

---

## Open Items

- [ ] Create `avc-phone-agent-prod` Standard API Key in Twilio console
- [ ] Add `TWILIO_API_KEY_SID` + `TWILIO_API_KEY_SECRET` + `DEEPGRAM_API_KEY` to `.env`
- [ ] Confirm `ODOO_STAGE_ID`, `ODOO_TEAM_ID`, `ODOO_USER_ID` from live `avc` db
- [ ] Confirm AVA voice — `af_heart` is current default, confirm with AVC before go-live
- [ ] Populate `rag/data/*.jsonl` before Phase 2 (seed from `practice.py` data)
- [ ] Define Odoo confirmed appointment flow: lead → opportunity → calendar event
- [ ] Staff training on monitoring dashboard quality tagging

---

*Active Blue LLC | git.activeblue.net/tocmo0nlord/avc-phone-ai*