Follow-up test call: no more cancelled replies, but 3-5s response gaps on
turns the smart-turn model judged INCOMPLETE ("I'm due to my annual exam.") -
it waited the library-default 3s of silence before triggering the LLM. Build
the stop strategy explicitly with SmartTurnParams(stop_secs=1.5), env-tunable.
A caller who really does resume just yields a follow-up turn, which is safe
now that interruption broadcasts are off.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Historical calls showed the 8B re-asking for name/reason/phone it already had
("I already gave you my full name", the "I want an appointment" -> "what brings
you in?" loop) and VAD splitting one utterance into consecutive user turns.
- callstate.py: CallStateGroomer between agg.user() and the LLM. After each
agent turn (off the critical path) it extracts collected slots via one short
JSON-mode Ollama pass, then before each generation injects an ALREADY
COLLECTED / STILL NEEDED checklist into the system message and merges
VAD-fragmented consecutive user messages. Callback-type calls get an explicit
"no booking questions" line. CALL_STATE_TRACKING env (auto: on for ollama,
off for anthropic).
- bot.py prompt step 1: "I want an appointment" is the booking intent, not the
reason - ask the visit reason once, never twice.
- scripts/ab_replay.py: regression harness replaying the real failed calls.
llama3.1-8b raw = 3 failures; with CALL STATE = 0 failures across all
scenarios (chat latency 0.31s -> 0.55s median, well under the 3s gate).
Qwen3-14B A/B'd and rejected: no better raw, ~3s/turn, 11GB VRAM.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
After the phone confirmation a caller's "yes" wasn't picked up (silence) until
they repeated it louder. Logs: line was live and the half-duplex gate had
reopened, but VAD never fired for ~14s — the quick/quiet "yes" was below
threshold (min_volume 0.3, start_secs 0.2).
Now that HalfDuplexGate gates out the agent's echo while it speaks, VAD can be
sensitive without echo false-triggers (it only listens hard on the caller's
turn). Lowered min_volume 0.3->0.15, start_secs 0.2->0.1, and trimmed the echo
tail 0.5->0.25 so an answer right after the agent stops isn't dropped.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A caller's reply was generated but never heard: 0.65s after the agent started
speaking, the VAD fired "user started speaking" (NO transcript) and broadcast an
interruption that cancelled the agent's audio -> ~24s of silence until the caller
spoke again. Cause: the agent's own TTS echoes back the phone line and the
always-on VAD interruption treats it as a barge-in. (PipelineParams has no
allow_interruptions in this pipecat build — it was a silent no-op.)
Fix: HalfDuplexGate before the VAD withholds inbound audio while the bot speaks
(+ECHO_TAIL_SECS, default 0.5s), so echo can't trigger a false barge-in.
Half-duplex (no mid-utterance barge-in); HALF_DUPLEX=false to restore it.
Runtime-tested the gate (pass idle / drop while speaking / drop in tail / resume).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- .env.example: add AGENT_NAME_SPOKEN=Eva.
- CLAUDE.md: note the agent-name respelling (AVA -> Eva, "EE-vuh"), that the
caller-ID is injected pre-spelled (model mangles raw digits), and that the
phone is confirmed near the END of the call, not led with.
(.env itself is gitignored; AGENT_NAME_SPOKEN=Eva set there and live.)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
In-call (system prompt + per-call calendar injection):
- Gather full name (prompt asks for last name if only first given).
- Confirm the caller-ID number; if declined, use the number the caller gives.
- Ask for and LOG insurance only — never promise/confirm/deny coverage or
treatment based on it; staff verify on callback.
- Validate the requested date against an injected 45-day calendar (recomputed
per call since the server is long-running). Push back on impossible/mismatched
dates, e.g. "Monday lands on the sixth — would you like that date?".
- AGENT_NAME=AVA; 4s grace pause before hang-up (HANGUP_DELAY_SECS).
Logging (post-call extraction -> Odoo):
- Extract full name, phone_confirmed, chosen callback (caller-ID or alternate),
insurance, reason, and preferred time annotated with a resolved YYYY-MM-DD
date (today's date is fed to the extractor).
- odoo_client: insurance row on the lead note (log only — staff verify).
.gitignore: ignore rotated avc_run.log* files.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Deepgram and the Twilio Standard API Key were reverted per decision:
- bot.py: restore HintedWhisperSTTService (faster-whisper hotwords), default
model medium; remove DeepgramSTTService import + DEEPGRAM_API_KEY.
- server.py: restore TWILIO_AUTH_TOKEN for X-Twilio-Signature validation and
the serializer auto-hang-up. Twilio signs webhooks with the Auth Token, so
an API Key Secret cannot validate signatures.
- .env.example: back to TWILIO_AUTH_TOKEN + Whisper STT vars.
- .gitignore: ignore runtime *.log (avc_run.log).
OLLAMA_MODEL stays activeblue-avc:latest (the existing pulled tag).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>