AVC Phone Agent — inbound optometry line (Pipecat + Twilio, fully local)
A real phone number that callers dial; the agent answers in voice, handles hours / location / insurance / services questions, and captures appointment requests for staff callback. All AI runs locally on this box:
caller ─▶ Twilio ─▶ wss (nginx TLS) ─▶ server.py ─▶ Pipecat pipeline:
Twilio Media Stream (8kHz µ-law)
│
▼
Silero VAD ─▶ Whisper STT (GPU) ─▶ activeblue-avc (Ollama) ─▶ Kokoro TTS ─▶ back to caller
Inbound only. No cloud STT/TTS — audio stays on the machine except the Twilio carrier leg.
Files
| File | Role |
|---|---|
| `server.py` | FastAPI: `POST /voice` (TwiML) + `WS /ws` (Twilio Media Stream) |
| `bot.py` | The per-call Pipecat pipeline (VAD→STT→LLM→TTS) + tool wiring |
| `practice.py` | AVC business facts (PLACEHOLDERS: edit before go-live) + appointment-capture tool |
| `odoo_client.py` | Writes captured requests into Odoo (CRM lead by default) via XML-RPC |
| `run.sh` | Launcher (reuses pipecat-run venv + sets CUDA lib path) |
| `avc-phone.service` | systemd unit (install on this box) |
| `deploy/setup-tls.sh` | One-shot: Let's Encrypt cert + nginx vhost install (run as root) |
| `deploy/nginx-*.conf` | nginx TLS reverse-proxy vhost + WebSocket-upgrade map |
| `traefik-avc-phone.yml` | Unused alternative (kept for a future multi-host/Traefik setup) |
| `.env.example` | Copy to `.env`, fill Twilio creds + public host + Odoo creds |
| `appointment_requests.jsonl` | Local fallback, only used if Odoo is unreachable/disabled |
What's done vs. what YOU must supply
Working / verified locally:
- Pipeline assembles; all services construct (smoke-tested).
- GPU Whisper fixed: installed CUDA12 `cublas` + `cudnn` wheels into the venv; `run.sh` sets `LD_LIBRARY_PATH` so faster-whisper finds them. Verified transcription on GPU.
- Local model `activeblue-avc:latest` is the brain; Kokoro voice; appointment tool.
- Odoo appointment integration wired + verified against prod `db1`: a captured request creates a `crm.lead` (callback to-do) via XML-RPC using the same API key the `activeblue-agent` service uses. Verified create→read→delete (no residue left in db1). If Odoo is unreachable or creds are blank, it falls back to `appointment_requests.jsonl` and still confirms to the caller, so a request is never lost.
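
The Odoo-first, JSONL-fallback behavior described above can be sketched roughly as follows. This is not the code in `odoo_client.py`; the function names, the `crm.lead` field mapping, and the return values are assumptions for illustration. Only the `/xmlrpc/2/common` and `/xmlrpc/2/object` endpoints and the `execute_kw` call come from Odoo's standard external API.

```python
import json
import xmlrpc.client
from pathlib import Path
from typing import Callable, Optional

FALLBACK_PATH = Path("appointment_requests.jsonl")  # file name from the Files table


def create_crm_lead(url: str, db: str, user: str, api_key: str, request: dict) -> int:
    """Create a crm.lead over Odoo's external XML-RPC API and return its id."""
    common = xmlrpc.client.ServerProxy(f"{url}/xmlrpc/2/common")
    uid = common.authenticate(db, user, api_key, {})
    if not uid:
        raise PermissionError("Odoo rejected the credentials")
    models = xmlrpc.client.ServerProxy(f"{url}/xmlrpc/2/object")
    values = {
        # Field mapping is an assumption, not taken from odoo_client.py.
        "name": f"Callback request: {request['name']}",
        "phone": request["phone"],
        "description": request.get("reason", ""),
    }
    return models.execute_kw(db, uid, api_key, "crm.lead", "create", [values])


def capture_request(
    request: dict,
    create: Optional[Callable[[dict], int]],
    fallback_path: Path = FALLBACK_PATH,
) -> str:
    """Try Odoo first; if it is disabled (None) or fails, append to the JSONL file."""
    if create is not None:
        try:
            return f"odoo:{create(request)}"
        except Exception:
            pass  # any Odoo failure falls through to the local file
    with fallback_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(request) + "\n")
    return "jsonl"
```

Passing the creator in as a callable keeps the fallback path testable without a live Odoo; in practice it would be something like `lambda r: create_crm_lead(url, db, user, key, r)`, or `None` when the creds are blank.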
You must supply (can't be done from this box):
- Twilio account + a Voice phone number.
- Port-forward 443 (and 80) from your router to this box, and run `deploy/setup-tls.sh` for the nginx TLS reverse proxy (Twilio needs real TLS on 443 for the `wss` stream).
- Real AVC facts in `practice.py` (hours, address, insurance, services, phone).
- Odoo creds in `.env` (`ODOO_USER` + `ODOO_API_KEY`) to enable lead creation. Set `ODOO_DB` (`db1` for prod) and `ODOO_TARGET` (`crm` lead, or `calendar` event). Leave creds blank to disable Odoo and log to JSONL only.
Setup
1. Config

       cd /home/tocmo0nlord/avc-phone
       cp .env.example .env      # fill PUBLIC_HOST, TWILIO_ACCOUNT_SID, TWILIO_AUTH_TOKEN
       $EDITOR practice.py       # replace PLACEHOLDER hours/address/insurance/services

2. Run it

       ./run.sh                     # listens plain HTTP on :8200 (nginx terminates TLS)
       curl localhost:8200/health   # {"status":"ok",...}

3. TLS reverse proxy (nginx, on this box). No Traefik: `voip.activeblue.net` points at your WAN IP (66.23.239.222), which NATs to this box (10.10.1.221). nginx is already installed and only serving the default site, so we add a vhost for the domain. Twilio's `wss` media stream needs real TLS on 443, so:
   - Forward `443` (and `80`) on your router → `10.10.1.221`. (80 is for the Let's Encrypt challenge + the http→https redirect; 443 is the actual traffic.)
   - Run the one-shot setup (gets a Let's Encrypt cert, installs the vhost + ws map, reloads nginx):

         sudo bash deploy/setup-tls.sh

     It uses `deploy/nginx-voip.activeblue.net.conf` (proxies 443 → `127.0.0.1:8200`, forwards the `/ws` upgrade, 1-hour stream timeout) and `deploy/nginx-ws-upgrade.conf`.
   - Verify publicly: `curl https://voip.activeblue.net/health`.

4. Twilio number config (console.twilio.com → your number → Voice):
   - A call comes in → Webhook → `https://voip.activeblue.net/voice` → HTTP POST.
   - Save. That's it: the TwiML we return tells Twilio to open the Media Stream to `wss://voip.activeblue.net/ws`.

5. Call the number. You should hear the greeting and be able to talk to it.
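
For orientation, the TwiML that the Twilio number config step relies on probably looks something like the output of the sketch below. The `voice_twiml` helper is hypothetical and is not taken from `server.py`; it assumes a bidirectional `<Connect><Stream>` response, and it leaves out the `STREAM_TOKEN` because this README does not say exactly how the token is embedded in the URL.

```python
from xml.sax.saxutils import quoteattr


def voice_twiml(public_host: str, ws_path: str = "/ws") -> str:
    """Build a TwiML response that tells Twilio to open a Media Stream to our WebSocket."""
    stream_url = f"wss://{public_host}{ws_path}"
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<Response><Connect>"
        f"<Stream url={quoteattr(stream_url)}/>"  # quoteattr escapes and quotes the URL
        "</Connect></Response>"
    )


if __name__ == "__main__":
    print(voice_twiml("voip.activeblue.net"))
```

`<Connect><Stream>` is used here rather than `<Start><Stream>` because the agent has to send audio back to the caller, which needs the bidirectional form.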
Security (built in)
- Webhook signature validation: `POST /voice` verifies Twilio's `X-Twilio-Signature` (HMAC-SHA1 over the public URL + sorted POST params, keyed by `TWILIO_AUTH_TOKEN`). Enforced automatically whenever `TWILIO_AUTH_TOKEN` is set. Verified against Twilio's published reference vector. Unsigned/forged requests get `403`. Set `TWILIO_VALIDATE=false` only for local testing.
  - The signed URL must match exactly, so `PUBLIC_HOST` must equal the host on the number's webhook (`https://$PUBLIC_HOST/voice`). If the nginx reverse proxy rewrites host/path, signatures fail.
- Media-stream gate: `/ws` can't carry a usable Twilio signature, so it's gated by a shared `STREAM_TOKEN` embedded in the wss URL we hand Twilio. Bad/missing token → socket closed. Set a stable `STREAM_TOKEN` in `.env` (`openssl rand -base64 24`).
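
The two checks above can be sketched with the standard library alone. This is an illustration of the scheme as described (HMAC-SHA1 over the URL plus the sorted POST params, base64-encoded), not the code in `server.py`; the function names are hypothetical. Twilio's own Python helper library ships a `RequestValidator` that does the same job.

```python
import base64
import hashlib
import hmac
from typing import Mapping, Optional


def twilio_signature(auth_token: str, url: str, params: Mapping[str, str]) -> str:
    """Compute the expected X-Twilio-Signature for a form-encoded POST."""
    # Payload: the full public URL followed by each key and value, keys sorted.
    payload = url + "".join(key + params[key] for key in sorted(params))
    digest = hmac.new(auth_token.encode("utf-8"), payload.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_request(auth_token: str, url: str, params: Mapping[str, str], header_signature: str) -> bool:
    """Constant-time comparison of the header against the expected signature."""
    expected = twilio_signature(auth_token, url, params)
    return hmac.compare_digest(expected.encode("utf-8"), header_signature.encode("utf-8"))


def stream_token_ok(expected_token: str, presented: Optional[str]) -> bool:
    """Gate for /ws: accept only when the presented token matches the shared secret."""
    if presented is None:
        return False
    return hmac.compare_digest(expected_token.encode("utf-8"), presented.encode("utf-8"))
```

Because the URL is part of the signed payload, a request signed for `https://...` fails validation if the server reconstructs the URL as `http://...`, which is why `PUBLIC_HOST` has to match the webhook exactly.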
Run it as a service (systemd)
A unit is provided: avc-phone.service (runs as your user, Restart=always, ordered
after ollama.service). Install (needs sudo — paste these in a ! shell or a terminal):
    sudo cp /home/tocmo0nlord/avc-phone/avc-phone.service /etc/systemd/system/
    sudo systemctl daemon-reload
    sudo systemctl enable --now avc-phone.service
    systemctl status avc-phone.service     # check it's running
    journalctl -u avc-phone.service -f     # follow logs
Restart after editing .env or practice.py: sudo systemctl restart avc-phone.service.
(No-sudo alternative: a systemctl --user unit + loginctl enable-linger tocmo0nlord —
ask and I'll convert it.)
Concurrency cap (built in)
MAX_CONCURRENT_CALLS (default 2) bounds simultaneous live calls. The count tracks
active /ws pipelines (the real GPU consumers); when full, /voice speaks BUSY_MESSAGE
and hangs up before any GPU work, so in-progress calls are never degraded. A hard
reservation at /ws covers the rare race. /health reports active_calls/max_calls
for monitoring. Tune the cap to your GPU headroom.
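
One way to picture the cap is as a small slot counter: `/voice` does a cheap check before answering, and `/ws` makes the binding reservation. The `CallSlots` class below is a hypothetical sketch of that idea, not the implementation in `server.py`; only the default of 2 and the `active_calls`/`max_calls` field names come from this README.

```python
import threading


class CallSlots:
    """Bounded counter of live calls (hypothetical sketch of the concurrency cap)."""

    def __init__(self, max_calls: int = 2) -> None:
        self.max_calls = max_calls
        self._active = 0
        self._lock = threading.Lock()

    def has_room(self) -> bool:
        """Cheap pre-check for /voice: speak BUSY_MESSAGE when this is False."""
        with self._lock:
            return self._active < self.max_calls

    def try_reserve(self) -> bool:
        """Hard reservation for /ws: covers two calls passing the pre-check at once."""
        with self._lock:
            if self._active >= self.max_calls:
                return False
            self._active += 1
            return True

    def release(self) -> None:
        """Call when a pipeline ends, ideally from a finally block."""
        with self._lock:
            self._active = max(0, self._active - 1)

    def health(self) -> dict:
        """Payload shape for /health monitoring."""
        with self._lock:
            return {"status": "ok", "active_calls": self._active, "max_calls": self.max_calls}
```

The pre-check and the reservation are kept separate on purpose: the count only moves when a pipeline actually starts, so a webhook hit that never opens a stream cannot leak a slot.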
Known limits / next steps
- Per-call Whisper load: each call currently constructs its own Whisper model on the GPU. Fine within the cap; a future optimization is sharing one warm Whisper instance across calls to cut memory + first-utterance latency.
- Latency: first call after start pays one-time model loads (Whisper/Kokoro/Ollama). Keep the process warm. Tune `WHISPER_MODEL=tiny` if you need faster STT.
- Function-calling reliability: `activeblue-avc` is an 8B fine-tune; tool-calling may need prompt tuning. If it's flaky, we can fall back to a deterministic slot-filling flow for appointment capture.