avc-phone-ai/.env.example
tocmo0nlord 32a3bb7136 Fix echo-induced silence with a half-duplex audio gate
A caller's reply was generated but never heard: 0.65s after the agent started
speaking, the VAD fired "user started speaking" (NO transcript) and broadcast an
interruption that cancelled the agent's audio -> ~24s of silence until the caller
spoke again. Cause: the agent's own TTS echoes back over the phone line and the
always-on VAD interruption treats it as a barge-in. (PipelineParams has no
allow_interruptions in this pipecat build — it was a silent no-op.)

Fix: HalfDuplexGate before the VAD withholds inbound audio while the bot speaks
(+ECHO_TAIL_SECS, default 0.5s), so echo can't trigger a false barge-in.
The call is now half-duplex (no mid-utterance barge-in); set HALF_DUPLEX=false to restore barge-in.
Runtime-tested the gate (pass idle / drop while speaking / drop in tail / resume).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 16:44:00 +00:00
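
The gating rule described in the commit message (pass while idle, drop while the bot speaks, drop during the echo tail, then resume) can be sketched as a small state machine. This is an illustration only, not the repository's HalfDuplexGate: the class below is standalone rather than a pipecat frame processor, and the method names (bot_started_speaking, bot_stopped_speaking, should_pass) and the injectable clock are assumptions made for the example.

```python
import time
from typing import Callable, Optional


class HalfDuplexGate:
    """Hypothetical sketch: decide whether inbound audio may reach the VAD."""

    def __init__(
        self,
        echo_tail_secs: float = 0.5,  # mirrors the ECHO_TAIL_SECS default
        clock: Callable[[], float] = time.monotonic,  # injectable for tests
    ) -> None:
        self.echo_tail_secs = echo_tail_secs
        self._clock = clock
        self._bot_speaking = False
        self._bot_stopped_at: Optional[float] = None

    def bot_started_speaking(self) -> None:
        # From here on, inbound audio is withheld.
        self._bot_speaking = True
        self._bot_stopped_at = None

    def bot_stopped_speaking(self) -> None:
        # Record the stop time so the echo tail can be measured from it.
        self._bot_speaking = False
        self._bot_stopped_at = self._clock()

    def should_pass(self) -> bool:
        """Return True if an inbound audio frame should be forwarded."""
        if self._bot_speaking:
            return False  # drop while the bot is speaking
        if self._bot_stopped_at is None:
            return True  # idle: the bot has not spoken yet
        # Drop during the tail, resume once it has elapsed.
        return self._clock() - self._bot_stopped_at >= self.echo_tail_secs
```

In a real pipeline the two state-change methods would presumably be driven by the bot's speaking start/stop events, and frames for which should_pass() is False would be discarded before they reach the VAD.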

# Copy to .env and fill in. run.sh auto-loads it.
# ── Public ingress (Twilio dials this back) ──────────────────────────────────
# Public hostname; nginx terminates TLS here and proxies to the app. Must match the
# Twilio webhook host (Twilio signs https://PUBLIC_HOST/voice).
PUBLIC_HOST=voip.activeblue.net
PORT=8200
# App bind address. Default 127.0.0.1 (nginx proxies in locally) — not exposed on LAN.
BIND_HOST=127.0.0.1
# ── Twilio ───────────────────────────────────────────────────────────────────
# From console.twilio.com. Used to auto-hang-up the carrier leg and (recommended)
# validate inbound webhook signatures. Twilio signs webhooks with the Auth Token, so
# signature validation must use the Auth Token (not an API Key Secret).
TWILIO_ACCOUNT_SID=ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TWILIO_AUTH_TOKEN=your_auth_token_here
# Inbound webhook signature validation is ON whenever TWILIO_AUTH_TOKEN is set.
# Set TWILIO_VALIDATE=false only for local testing without real Twilio requests.
TWILIO_VALIDATE=true
# Shared secret embedded in the Media Stream wss URL to gate /ws. Set a stable random
# value (e.g. `openssl rand -base64 24`); if blank, one is generated per process start.
STREAM_TOKEN=
# ── Odoo appointment integration ─────────────────────────────────────────────
# Leave ODOO_USER/ODOO_API_KEY blank to disable Odoo and log requests to JSONL only.
# Same creds the activeblue-agent container uses (docker inspect activeblue-agent).
# Verified working against db1 with ODOO_TARGET=crm.
ODOO_URL=http://localhost:8069
ODOO_DB=db1
ODOO_USER=mr.garcia09@gmail.com
ODOO_API_KEY=
# crm = callback lead (recommended) | calendar = tentative event
ODOO_TARGET=crm
# ── Capacity ─────────────────────────────────────────────────────────────────
# Max simultaneous calls (each uses GPU; Ollama serializes generation). Over-cap
# callers hear BUSY_MESSAGE and are hung up. Tune to your GPU headroom (2-3 typical).
MAX_CONCURRENT_CALLS=2
# BUSY_MESSAGE=Thank you for calling Advanced Vision Care. All of our lines are busy right now. Please call back in a few minutes. Goodbye.
# ── Models (defaults are fine) ───────────────────────────────────────────────
OLLAMA_MODEL=llama3.1:8b
OLLAMA_URL=http://127.0.0.1:11434/v1
# LLM provider: ollama (local, default) | anthropic (Claude API). Flip to A/B test Claude.
LLM_PROVIDER=ollama
ANTHROPIC_API_KEY=
# Default is the most capable model; for low-latency phone voice prefer claude-haiku-4-5
# (fastest) or claude-sonnet-4-6 (balance).
ANTHROPIC_MODEL=claude-opus-4-8
# ── STT: Whisper (faster-whisper, real-time in-call) ─────────────────────────
WHISPER_MODEL=base
WHISPER_DEVICE=cuda
WHISPER_COMPUTE=float16
# ── TTS: Kokoro ──────────────────────────────────────────────────────────────
KOKORO_VOICE=af_heart
KOKORO_MODEL_DIR=/home/tocmo0nlord/pipecat-run/models
# ── Call behaviour ───────────────────────────────────────────────────────────
AGENT_NAME=AVA
# How the name is SPOKEN (TTS only; logs/Odoo keep AGENT_NAME). "Eva" -> "EE-vuh".
AGENT_NAME_SPOKEN=Eva
# Grace pause after the goodbye before the carrier leg is dropped (seconds).
HANGUP_DELAY_SECS=4.0
# Half-duplex: ignore caller audio while the agent speaks (+ tail) so its own echo on the
# phone line can't trigger a false barge-in that cancels its reply. false = allow barge-in.
HALF_DUPLEX=true
ECHO_TAIL_SECS=0.5
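
For readers wiring these values into the app, here is one way the call-behaviour settings above might be parsed. It is a sketch under stated assumptions, not the project's loader: the names CallSettings, load_call_settings and _as_bool are hypothetical, and the fallback defaults simply mirror the example values in this file. The STREAM_TOKEN fallback follows the comment above (a fresh random token per process start when the variable is blank).

```python
import os
import secrets
from dataclasses import dataclass
from typing import Mapping


def _as_bool(raw: str) -> bool:
    # Treat common truthy spellings as True; anything else is False.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class CallSettings:
    half_duplex: bool
    echo_tail_secs: float
    hangup_delay_secs: float
    max_concurrent_calls: int
    stream_token: str


def load_call_settings(env: Mapping[str, str] = os.environ) -> CallSettings:
    """Hypothetical loader; defaults are assumed from the example file."""
    return CallSettings(
        half_duplex=_as_bool(env.get("HALF_DUPLEX", "true")),
        echo_tail_secs=float(env.get("ECHO_TAIL_SECS", "0.5")),
        hangup_delay_secs=float(env.get("HANGUP_DELAY_SECS", "4.0")),
        max_concurrent_calls=int(env.get("MAX_CONCURRENT_CALLS", "2")),
        # Blank or missing token: generate one for this process only.
        stream_token=env.get("STREAM_TOKEN", "").strip()
        or secrets.token_urlsafe(24),
    )
```

Passing a plain dict instead of os.environ keeps the parsing testable without touching the real environment, for example load_call_settings({"HALF_DUPLEX": "false"}).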