tocmo0nlord d7bfe2dbe8 Deterministic phone confirmation safety net + docs
EndCallProcessor now guarantees the callback number is confirmed on booking
calls: the 8B reads it back only ~half the time, so if a closing is reached on a
booking call (booking keyword seen) without the agent having spoken the number
(phone_marker absent from its replies), the hang-up is suppressed and a scripted
confirmation line (caller-ID spelled out) is injected as a TTSSpeakFrame first.
The agent's own readback satisfies the gate (no double-ask); info-only calls are
never asked for a number. Runtime-tested all four paths (inject / no-inject /
info-only / inject-then-end).

CLAUDE.md: document the safety net, the "never claim a booking" rule, the direct
phone-confirm phrasing, and the insurance "never say we accept" rule.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 15:52:22 +00:00

AVC Phone Agent — inbound optometry line (Pipecat + Twilio, fully local)

A real phone number that callers dial; the agent answers in voice, handles hours / location / insurance / services questions, and captures appointment requests for staff callback. All AI runs locally on this box:

caller ─▶ Twilio ─▶ wss (nginx TLS) ─▶ server.py ─▶ Pipecat pipeline:
          Twilio Media Stream (8kHz µ-law)
              │
              ▼
   Silero VAD ─▶ Whisper STT (GPU) ─▶ activeblue-avc (Ollama) ─▶ Kokoro TTS ─▶ back to caller

Inbound only. No cloud STT/TTS — audio stays on the machine except the Twilio carrier leg.
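
To make "8kHz µ-law" concrete, here is a rough sketch of what arrives on the socket and how it becomes PCM. In this project Pipecat's Twilio transport does the decoding, so nothing below is code from bot.py. The helper names are hypothetical, and the message shape assumed is Twilio's documented `media` event (a base64 payload of µ-law bytes).

```python
import base64
import json

BIAS = 0x84  # G.711 mu-law bias (132)


def ulaw_byte_to_pcm16(byte: int) -> int:
    """Decode one G.711 mu-law byte to a signed 16-bit PCM sample."""
    u = ~byte & 0xFF                 # mu-law bytes are transmitted bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


def decode_media_message(raw: str) -> list:
    """Return PCM samples for a 'media' event, or [] for any other event."""
    msg = json.loads(raw)
    if msg.get("event") != "media":
        return []
    payload = base64.b64decode(msg["media"]["payload"])
    return [ulaw_byte_to_pcm16(b) for b in payload]
```

Each payload byte is one sample, so 8000 bytes correspond to one second of caller audio.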

Files

File                         Role
server.py                    FastAPI: POST /voice (TwiML) + WS /ws (Twilio Media Stream)
bot.py                       The per-call Pipecat pipeline (VAD→STT→LLM→TTS) + tool wiring
practice.py                  AVC business facts (PLACEHOLDERS: edit before go-live) + appointment-capture tool
odoo_client.py               Writes captured requests into Odoo (CRM lead by default) via XML-RPC
run.sh                       Launcher (reuses pipecat-run venv + sets CUDA lib path)
avc-phone.service            systemd unit (install on this box)
deploy/setup-tls.sh          One-shot: Let's Encrypt cert + nginx vhost install (run as root)
deploy/nginx-*.conf          nginx TLS reverse-proxy vhost + WebSocket-upgrade map
traefik-avc-phone.yml        Unused alternative (kept for a future multi-host/Traefik setup)
.env.example                 Copy to .env, fill Twilio creds + public host + Odoo creds
appointment_requests.jsonl   Local fallback, only used if Odoo is unreachable/disabled

What's done vs. what YOU must supply

Working / verified locally:

  • Pipeline assembles; all services construct (smoke-tested).
  • GPU Whisper fixed — installed CUDA12 cublas+cudnn wheels into the venv; run.sh sets LD_LIBRARY_PATH so faster-whisper finds them. Verified transcribe on GPU.
  • Local model activeblue-avc:latest is the brain; Kokoro voice; appointment tool.
  • Odoo appointment integration wired + verified against prod db1: a captured request creates a crm.lead (callback to-do) via XML-RPC using the same API key the activeblue-agent service uses. Verified create→read→delete (no residue left in db1). If Odoo is unreachable or creds are blank, it falls back to appointment_requests.jsonl and still confirms to the caller — a request is never lost.
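
The Odoo-first, JSONL-fallback behavior can be sketched roughly as follows. This is an illustration under assumptions, not the code in odoo_client.py: `authenticate` and `execute_kw` are Odoo's external XML-RPC API, but the function names, the lead fields chosen, and the `saved_at` key are hypothetical.

```python
import json
import time
import xmlrpc.client
from pathlib import Path
from typing import Callable, Optional


def create_crm_lead(url: str, db: str, user: str, api_key: str, vals: dict) -> int:
    """Create a crm.lead through Odoo's external XML-RPC API; returns the new id."""
    common = xmlrpc.client.ServerProxy(f"{url}/xmlrpc/2/common")
    uid = common.authenticate(db, user, api_key, {})
    if not uid:
        raise PermissionError("Odoo authentication failed")
    models = xmlrpc.client.ServerProxy(f"{url}/xmlrpc/2/object")
    return models.execute_kw(db, uid, api_key, "crm.lead", "create", [vals])


def save_request(
    request: dict,
    create_lead: Optional[Callable[[dict], int]],
    fallback_path: Path,
) -> str:
    """Try Odoo first; on any failure (or when disabled) append to the JSONL file."""
    if create_lead is not None:
        try:
            lead_id = create_lead({
                "name": f"Callback request: {request.get('name', 'unknown caller')}",
                "phone": request.get("phone", ""),
                "description": request.get("reason", ""),
            })
            return f"odoo:{lead_id}"
        except Exception:
            pass  # Odoo unreachable or rejected the write: fall through to the file
    with fallback_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({**request, "saved_at": time.time()}) + "\n")
    return "jsonl"
```

Passing `create_lead=None` models the "creds blank" case; either way the caller can be told the request was recorded.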

You must supply (can't be done from this box):

  1. Twilio account + a Voice phone number.
  2. Port-forward 443 (and 80) from your router to this box, and run deploy/setup-tls.sh for the nginx TLS reverse proxy (Twilio needs real TLS on 443 for the wss stream).
  3. Real AVC facts in practice.py (hours, address, insurance, services, phone).
  4. Odoo creds in .env (ODOO_USER + ODOO_API_KEY) to enable lead creation. Set ODOO_DB (db1 for prod) and ODOO_TARGET (crm lead, or calendar event). Leave creds blank to disable Odoo and log to JSONL only.

Setup

  1. Config

    cd /home/tocmo0nlord/avc-phone
    cp .env.example .env        # fill PUBLIC_HOST, TWILIO_ACCOUNT_SID, TWILIO_AUTH_TOKEN
    $EDITOR practice.py         # replace PLACEHOLDER hours/address/insurance/services
    
  2. Run it

    ./run.sh                    # listens on plain HTTP :8200 (nginx terminates TLS)
    curl localhost:8200/health  # {"status":"ok",...}
    
  3. TLS reverse proxy (nginx, on this box). No Traefik — voip.activeblue.net points at your WAN IP (66.23.239.222) which NATs to this box (10.10.1.221). nginx is already installed and only serving the default site, so we add a vhost for the domain. Twilio's wss media stream needs real TLS on 443, so:

    • Forward 443 (and 80) on your router → 10.10.1.221. (80 is for the Let's Encrypt challenge + the http→https redirect; 443 is the actual traffic.)
    • Run the one-shot setup (gets a Let's Encrypt cert, installs the vhost + ws map, reloads nginx):
      sudo bash deploy/setup-tls.sh
      
      It uses deploy/nginx-voip.activeblue.net.conf (proxies 443 → 127.0.0.1:8200, forwards the /ws upgrade, 1-hour stream timeout) and deploy/nginx-ws-upgrade.conf.
    • Verify publicly: curl https://voip.activeblue.net/health.
  4. Twilio number config (console.twilio.com → your number → Voice):

    • A call comes in → Webhook → https://voip.activeblue.net/voice → HTTP POST.
    • Save. That's it — the TwiML we return tells Twilio to open the Media Stream to wss://voip.activeblue.net/ws.
  5. Call the number. You should hear the greeting and be able to talk to it.
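
For reference, the TwiML returned by /voice in step 4 looks roughly like what this sketch builds. `<Connect>` and `<Stream>` are standard TwiML verbs; the function name and its parameters are hypothetical, and the real handler in server.py may add a greeting or the stream token.

```python
import xml.etree.ElementTree as ET


def build_stream_twiml(public_host: str, ws_path: str = "/ws") -> str:
    """Return TwiML telling Twilio to open a Media Stream to our WebSocket."""
    response = ET.Element("Response")
    connect = ET.SubElement(response, "Connect")
    ET.SubElement(connect, "Stream", url=f"wss://{public_host}{ws_path}")
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>' + body


if __name__ == "__main__":
    print(build_stream_twiml("voip.activeblue.net"))
```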

Security (built in)

  • Webhook signature validation: POST /voice verifies Twilio's X-Twilio-Signature (HMAC-SHA1 over the public URL + sorted POST params, keyed by TWILIO_AUTH_TOKEN). Enforced automatically whenever TWILIO_AUTH_TOKEN is set. Verified against Twilio's published reference vector. Unsigned/forged requests get 403. Set TWILIO_VALIDATE=false only for local testing.
    • The signed URL must match exactly, so PUBLIC_HOST must equal the host on the number's webhook (https://$PUBLIC_HOST/voice). If the reverse proxy (nginx) rewrites the host or path, signatures fail.
  • Media-stream gate: /ws can't carry a usable Twilio signature, so it's gated by a shared STREAM_TOKEN embedded in the wss URL we hand Twilio. Bad/missing token → socket closed. Set a stable STREAM_TOKEN in .env (openssl rand -base64 24).
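
Both checks can be sketched in a few lines. The signature scheme follows the description above (HMAC-SHA1 over the URL plus the POST parameters sorted by name, base64-encoded); the function names are hypothetical and server.py may structure this differently.

```python
import base64
import hashlib
import hmac


def twilio_signature(auth_token: str, url: str, params: dict) -> str:
    """Compute the expected X-Twilio-Signature for a form-encoded POST."""
    # Twilio signs the full URL followed by each name+value, sorted by name.
    payload = url + "".join(key + params[key] for key in sorted(params))
    digest = hmac.new(auth_token.encode("utf-8"), payload.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_twilio_request(auth_token: str, url: str, params: dict, header_sig: str) -> bool:
    """Constant-time comparison of the header against the expected signature."""
    expected = twilio_signature(auth_token, url, params)
    return hmac.compare_digest(expected.encode("utf-8"), header_sig.encode("utf-8"))


def stream_token_ok(expected_token: str, provided_token: str) -> bool:
    """Gate for /ws: the token from the wss URL must match STREAM_TOKEN."""
    return hmac.compare_digest(expected_token.encode("utf-8"), provided_token.encode("utf-8"))
```

Because the URL is part of the signed payload, a request signed for one host fails validation if the server reconstructs a different host, which is why PUBLIC_HOST has to match the webhook exactly.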

Run it as a service (systemd)

A unit is provided: avc-phone.service (runs as your user, Restart=always, ordered after ollama.service). Install (needs sudo — paste these in a ! shell or a terminal):

sudo cp /home/tocmo0nlord/avc-phone/avc-phone.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now avc-phone.service
systemctl status avc-phone.service          # check it's running
journalctl -u avc-phone.service -f          # follow logs

Restart after editing .env or practice.py: sudo systemctl restart avc-phone.service. (No-sudo alternative: a systemctl --user unit + loginctl enable-linger tocmo0nlord — ask and I'll convert it.)

Concurrency cap (built in)

MAX_CONCURRENT_CALLS (default 2) bounds simultaneous live calls. The count tracks active /ws pipelines (the real GPU consumers); when full, /voice speaks BUSY_MESSAGE and hangs up before any GPU work, so in-progress calls are never degraded. A hard reservation at /ws covers the rare race. /health reports active_calls/max_calls for monitoring. Tune the cap to your GPU headroom.
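
A minimal sketch of such a cap, assuming a simple lock-protected counter. The class and method names are hypothetical, not those in server.py: `has_room` models the soft check at /voice, `try_acquire` the hard reservation at /ws, and `snapshot` the fields /health reports.

```python
import threading


class CallSlots:
    """Bounded counter of live calls."""

    def __init__(self, max_calls: int = 2):
        self.max_calls = max_calls
        self._active = 0
        self._lock = threading.Lock()

    def has_room(self) -> bool:
        """Soft check for /voice: is there capacity right now?"""
        with self._lock:
            return self._active < self.max_calls

    def try_acquire(self) -> bool:
        """Hard reservation for /ws: atomically take a slot or refuse."""
        with self._lock:
            if self._active >= self.max_calls:
                return False
            self._active += 1
            return True

    def release(self) -> None:
        """Free a slot when a call's pipeline ends."""
        with self._lock:
            if self._active > 0:
                self._active -= 1

    def snapshot(self) -> dict:
        """Values suitable for a health endpoint."""
        with self._lock:
            return {"active_calls": self._active, "max_calls": self.max_calls}
```

The two-step design matters because two calls can both pass `has_room` before either opens its socket; only `try_acquire` decides who actually gets the GPU.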

Known limits / next steps

  • Per-call Whisper load: each call currently constructs its own Whisper model on the GPU. Fine within the cap; a future optimization is sharing one warm Whisper instance across calls to cut memory + first-utterance latency.
  • Latency: the first call after start pays the one-time model loads (Whisper/Kokoro/Ollama). Keep the process warm. Set WHISPER_MODEL=tiny if you need faster STT, at some cost in transcription accuracy.
  • Function-calling reliability: activeblue-avc is an 8B fine-tune; tool-calling may need prompt tuning. If it's flaky, we can fall back to a deterministic slot-filling flow for appointment capture.
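
One possible shape for that deterministic fallback, sketched under assumptions: the slot names, prompts, and class are all hypothetical and nothing like this exists in the repo yet. The idea is that code, not the model, decides which question comes next.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical slots an appointment request would need before it is saved.
REQUIRED_SLOTS = ("name", "phone", "reason")

PROMPTS = {
    "name": "May I have your full name?",
    "phone": "What is the best number to call you back on?",
    "reason": "What would you like to be seen for?",
}


@dataclass
class AppointmentSlots:
    values: dict = field(default_factory=dict)

    def fill(self, slot: str, value: str) -> None:
        """Record a caller answer; blank answers leave the slot empty."""
        if slot in REQUIRED_SLOTS and value.strip():
            self.values[slot] = value.strip()

    def next_prompt(self) -> Optional[str]:
        """Return the question for the first missing slot, or None when complete."""
        for slot in REQUIRED_SLOTS:
            if slot not in self.values:
                return PROMPTS[slot]
        return None
```

When `next_prompt()` returns None, the collected `values` could be handed to the same appointment-capture path the tool call uses today.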