Initial commit: avc-phone-ai codebase + CLAUDE.md

tocmo0nlord
2026-06-23 22:38:22 +00:00
parent 4bf72b9616
commit c3c719b77e
16 changed files with 1491 additions and 0 deletions

README.md Normal file

@@ -0,0 +1,132 @@
# AVC Phone Agent — inbound optometry line (Pipecat + Twilio, fully local)
A real phone number that callers dial; the agent answers in voice, handles hours /
location / insurance / services questions, and **captures appointment requests** for
staff callback. All AI runs **locally on this box**:
```
caller ─▶ Twilio ─▶ wss (nginx TLS) ─▶ server.py ─▶ Pipecat pipeline:
Twilio Media Stream (8kHz µ-law)
Silero VAD ─▶ Whisper STT (GPU) ─▶ activeblue-avc (Ollama) ─▶ Kokoro TTS ─▶ back to caller
```
Inbound only. No cloud STT/TTS — audio stays on the machine except the Twilio carrier leg.
## Files
| File | Role |
|---|---|
| `server.py` | FastAPI: `POST /voice` (TwiML) + `WS /ws` (Twilio Media Stream) |
| `bot.py` | The per-call Pipecat pipeline (VAD→STT→LLM→TTS) + tool wiring |
| `practice.py` | **AVC business facts (PLACEHOLDERS — edit before go-live)** + appointment-capture tool |
| `odoo_client.py` | Writes captured requests into Odoo (CRM lead by default) via XML-RPC |
| `run.sh` | Launcher (reuses pipecat-run venv + sets CUDA lib path) |
| `avc-phone.service` | systemd unit (install on this box) |
| `deploy/setup-tls.sh` | One-shot: Let's Encrypt cert + nginx vhost install (run as root) |
| `deploy/nginx-*.conf` | nginx TLS reverse-proxy vhost + WebSocket-upgrade map |
| `traefik-avc-phone.yml` | Unused alternative (kept for a future multi-host/Traefik setup) |
| `.env.example` | Copy to `.env`, fill Twilio creds + public host + Odoo creds |
| `appointment_requests.jsonl` | Local fallback — only used if Odoo is unreachable/disabled |
## What's done vs. what YOU must supply
**Working / verified locally:**
- Pipeline assembles; all services construct (smoke-tested).
- GPU Whisper fixed — installed CUDA12 `cublas`+`cudnn` wheels into the venv; `run.sh`
sets `LD_LIBRARY_PATH` so faster-whisper finds them. Verified transcribe on GPU.
- Local model `activeblue-avc:latest` is the brain; Kokoro voice; appointment tool.
- **Odoo appointment integration wired + verified** against prod `db1`: a captured
request creates a `crm.lead` (callback to-do) via XML-RPC using the same API key the
`activeblue-agent` service uses. Verified create→read→delete (no residue left in db1).
If Odoo is unreachable or creds are blank, it falls back to `appointment_requests.jsonl`
and still confirms to the caller — a request is never lost.
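The Odoo-first, JSONL-fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the contents of `odoo_client.py`: the function names `create_crm_lead` and `capture_request`, the request keys, and the choice of `crm.lead` fields (`name`, `phone`, `description`) are assumptions. The XML-RPC endpoints and `execute_kw` call are the ones Odoo's external API exposes.

```python
import json
import time
import xmlrpc.client
from pathlib import Path
from typing import Callable, Optional


def create_crm_lead(url: str, db: str, user: str, api_key: str, values: dict) -> int:
    """Create a crm.lead through Odoo's external XML-RPC API and return its id."""
    common = xmlrpc.client.ServerProxy(f"{url}/xmlrpc/2/common")
    uid = common.authenticate(db, user, api_key, {})
    if not uid:
        raise PermissionError("Odoo rejected the credentials")
    models = xmlrpc.client.ServerProxy(f"{url}/xmlrpc/2/object")
    return models.execute_kw(db, uid, api_key, "crm.lead", "create", [values])


def capture_request(
    request: dict,
    send: Optional[Callable[[dict], int]],
    fallback: Path,
) -> str:
    """Try Odoo first; if it is disabled (send is None) or fails, append to JSONL."""
    if send is not None:
        try:
            lead_id = send({
                "name": f"Callback: {request['name']}",   # hypothetical lead title
                "phone": request["phone"],
                "description": request.get("reason", ""),
            })
            return f"odoo:{lead_id}"
        except Exception:
            pass  # Odoo unreachable or rejected the call: fall through to the file
    with fallback.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({**request, "captured_at": time.time()}) + "\n")
    return "jsonl"
```

In use, `send` would be something like `lambda v: create_crm_lead(url, "db1", user, key, v)` when creds are set and `None` when they are blank, so the caller gets a confirmation on either path.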
**You must supply (can't be done from this box):**
1. **Twilio account + a Voice phone number.**
2. **Port-forward 443** (and 80) from your router to this box, and run `deploy/setup-tls.sh`
for the nginx TLS reverse proxy (Twilio needs real TLS on 443 for the `wss` stream).
3. **Real AVC facts** in `practice.py` (hours, address, insurance, services, phone).
4. **Odoo creds in `.env`** (`ODOO_USER` + `ODOO_API_KEY`) to enable lead creation.
Set `ODOO_DB` (`db1` for prod) and `ODOO_TARGET` (`crm` lead, or `calendar` event).
Leave creds blank to disable Odoo and log to JSONL only.
## Setup
1. **Config**
```bash
cd /home/tocmo0nlord/avc-phone
cp .env.example .env # fill PUBLIC_HOST, TWILIO_ACCOUNT_SID, TWILIO_AUTH_TOKEN, STREAM_TOKEN, Odoo creds
$EDITOR practice.py # replace PLACEHOLDER hours/address/insurance/services
```
2. **Run it**
```bash
./run.sh # listens plain HTTP on :8200 (nginx terminates TLS)
curl localhost:8200/health # {"status":"ok",...}
```
3. **TLS reverse proxy (nginx, on this box).** No Traefik — `voip.activeblue.net` points
at your WAN IP (`66.23.239.222`) which NATs to this box (`10.10.1.221`). nginx is
already installed and only serving the default site, so we add a vhost for the domain.
**Twilio's `wss` media stream needs real TLS on 443**, so:
- **Forward `443` (and `80`) on your router → `10.10.1.221`.** (80 is for the
Let's Encrypt challenge + the http→https redirect; 443 is the actual traffic.)
- Run the one-shot setup (gets a Let's Encrypt cert, installs the vhost + ws map,
reloads nginx):
```bash
sudo bash deploy/setup-tls.sh
```
It uses `deploy/nginx-voip.activeblue.net.conf` (proxies 443 → `127.0.0.1:8200`,
forwards the `/ws` upgrade, 1-hour stream timeout) and `deploy/nginx-ws-upgrade.conf`.
- Verify publicly: `curl https://voip.activeblue.net/health`.
4. **Twilio number config** (console.twilio.com → your number → Voice):
- **A call comes in** → Webhook → `https://voip.activeblue.net/voice` → HTTP **POST**.
- Save. That's it — the TwiML we return tells Twilio to open the Media Stream to
`wss://voip.activeblue.net/ws`.
5. **Call the number.** You should hear the greeting and be able to talk to it.
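For step 4, the TwiML that `POST /voice` returns might look like the output of this minimal sketch. The helper name `voice_twiml` is hypothetical, and the real `server.py` may build the document differently (for example, it also has to convey the `STREAM_TOKEN`, which is not shown here). `<Connect><Stream>` is Twilio's TwiML for a bidirectional Media Stream.

```python
import xml.etree.ElementTree as ET


def voice_twiml(public_host: str) -> str:
    """Build TwiML telling Twilio to open a Media Stream to wss://<host>/ws."""
    response = ET.Element("Response")
    connect = ET.SubElement(response, "Connect")
    ET.SubElement(connect, "Stream", url=f"wss://{public_host}/ws")
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(voice_twiml("voip.activeblue.net"))
```

The response should be served with an XML content type so Twilio parses it as TwiML.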
## Security (built in)
- **Webhook signature validation:** `POST /voice` verifies Twilio's `X-Twilio-Signature`
(HMAC-SHA1 over the public URL + sorted POST params, keyed by `TWILIO_AUTH_TOKEN`).
Enforced automatically whenever `TWILIO_AUTH_TOKEN` is set. Verified against Twilio's
published reference vector. Unsigned/forged requests get `403`. Set `TWILIO_VALIDATE=false`
only for local testing.
- The signed URL must match exactly, so **`PUBLIC_HOST` must equal the host on the number's
webhook** (`https://$PUBLIC_HOST/voice`). If the nginx reverse proxy rewrites the host or path, signature validation fails.
- **Media-stream gate:** `/ws` can't carry a usable Twilio signature, so it's gated by a
shared `STREAM_TOKEN` embedded in the wss URL we hand Twilio. Bad/missing token → socket
closed. Set a stable `STREAM_TOKEN` in `.env` (`openssl rand -base64 24`).
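The two checks above can be sketched with the standard library alone. This is an illustration under assumptions, not the code in `server.py`: the function names are hypothetical, and how the stream token reaches the `/ws` handler is left out. The signature scheme follows the description above: the full URL, then each POST parameter's key and value concatenated in key-sorted order, HMAC-SHA1 keyed by the auth token, base64-encoded.

```python
import base64
import hashlib
import hmac
from typing import Dict, Optional


def twilio_signature(auth_token: str, url: str, params: Dict[str, str]) -> str:
    """Compute the value Twilio would send in X-Twilio-Signature."""
    payload = url + "".join(key + params[key] for key in sorted(params))
    digest = hmac.new(auth_token.encode("utf-8"), payload.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_request(auth_token: str, url: str, params: Dict[str, str], header: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    expected = twilio_signature(auth_token, url, params)
    return hmac.compare_digest(expected.encode("utf-8"), header.encode("utf-8"))


def stream_token_ok(expected: str, supplied: Optional[str]) -> bool:
    """Gate for /ws: reject a missing or wrong token, and an unset expected token."""
    if not expected or supplied is None:
        return False
    return hmac.compare_digest(expected.encode("utf-8"), supplied.encode("utf-8"))
```

A handler would respond `403` when `is_valid_request` returns `False` and close the socket when `stream_token_ok` does. Because the URL is part of the signed payload, any mismatch between `PUBLIC_HOST` and the configured webhook host changes the expected signature.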
## Run it as a service (systemd)
A unit is provided: `avc-phone.service` (runs as your user, `Restart=always`, ordered
after `ollama.service`). Install it from a terminal (needs sudo):
```bash
sudo cp /home/tocmo0nlord/avc-phone/avc-phone.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now avc-phone.service
systemctl status avc-phone.service # check it's running
journalctl -u avc-phone.service -f # follow logs
```
Restart after editing `.env` or `practice.py`: `sudo systemctl restart avc-phone.service`.
(No-sudo alternative: a `systemctl --user` unit + `loginctl enable-linger tocmo0nlord`;
the provided unit would need converting first.)
## Concurrency cap (built in)
`MAX_CONCURRENT_CALLS` (default **2**) bounds simultaneous live calls. The count tracks
active `/ws` pipelines (the real GPU consumers); when full, `/voice` speaks `BUSY_MESSAGE`
and hangs up **before any GPU work**, so in-progress calls are never degraded. A hard
reservation at `/ws` covers the rare race where two calls pass the `/voice` check at once. `/health` reports `active_calls`/`max_calls`
for monitoring. Tune the cap to your GPU headroom.
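One way to picture the cap is a small counter with a soft check for `/voice` and a hard reservation for `/ws`. The class below is a sketch under assumptions (the name `CallSlots` and its methods are hypothetical, and the real server may use asyncio primitives instead of a thread lock):

```python
import threading


class CallSlots:
    """Tracks live /ws pipelines against a fixed cap."""

    def __init__(self, max_calls: int = 2) -> None:
        self.max_calls = max_calls
        self._active = 0
        self._lock = threading.Lock()

    @property
    def active(self) -> int:
        with self._lock:
            return self._active

    def has_room(self) -> bool:
        """Soft check for /voice: decide whether to speak BUSY_MESSAGE."""
        return self.active < self.max_calls

    def try_reserve(self) -> bool:
        """Hard reservation for /ws: check and increment atomically."""
        with self._lock:
            if self._active >= self.max_calls:
                return False
            self._active += 1
            return True

    def release(self) -> None:
        """Call when a pipeline ends, ideally from a finally block."""
        with self._lock:
            self._active = max(0, self._active - 1)

    def health(self) -> dict:
        return {"status": "ok", "active_calls": self.active, "max_calls": self.max_calls}
```

The soft check alone is not enough because two webhooks can both see a free slot before either stream connects; the atomic `try_reserve` is what actually enforces the limit.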
## Known limits / next steps
- **Per-call Whisper load:** each call currently constructs its own Whisper model on the
GPU. Fine within the cap; a future optimization is sharing one warm Whisper instance
across calls to cut memory + first-utterance latency.
- **Latency:** first call after start pays one-time model loads (Whisper/Kokoro/Ollama).
Keep the process warm. Set `WHISPER_MODEL=tiny` if you need faster STT, at some cost in transcription accuracy.
- **Function-calling reliability:** `activeblue-avc` is an 8B fine-tune; tool-calling
may need prompt tuning. If it's flaky, we can fall back to a deterministic slot-filling
flow for appointment capture.
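The shared warm Whisper instance mentioned in the first bullet could follow a load-once pattern like the sketch below. It uses a stand-in loader rather than a real model, the `SharedModel` name is hypothetical, and whether one Whisper instance can safely serve concurrent calls is an assumption that would need checking before adopting this.

```python
import threading
from typing import Any, Callable, List


class SharedModel:
    """Load an expensive model once and hand the same instance to every call."""

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader
        self._lock = threading.Lock()
        self._loaded = False
        self._model: Any = None

    def get(self) -> Any:
        # Double-checked locking: only the first caller pays the load cost.
        if not self._loaded:
            with self._lock:
                if not self._loaded:
                    self._model = self._loader()
                    self._loaded = True
        return self._model


loads: List[str] = []


def fake_whisper_loader() -> dict:
    """Stand-in for constructing the real STT model on the GPU."""
    loads.append("loaded")
    return {"model": "stand-in"}


shared_stt = SharedModel(fake_whisper_loader)
```

Calling `shared_stt.get()` at process start would also move the one-time load out of the first caller's greeting.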