A caller gave their insurance; AVA replied with a bare acknowledgment ("staff
will verify your coverage") and stopped without asking a follow-up question.
Both sides then waited, producing dead air: the pipeline sat idle with no
GPU/LLM activity, which matches the flat memory/wattage readings. The caller had
to break the silence with "what questions do you have?". Root cause: the
one-sentence brevity rule let AVA end a booking turn on a dead-end statement.
Fix: prompt now requires, until the booking is complete, that every turn end
with the next question — acknowledgment + next question in the same turn (e.g.
insurance ack -> immediately ask day/time). Verified 4/4. Documented in CLAUDE.md.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
EndCallProcessor now guarantees the callback number is confirmed on booking
calls. The 8B model reads it back only about half the time, so when a closing is
reached on a booking call (booking keyword seen) and the agent has not spoken
the number (phone_marker absent from its replies), the hang-up is suppressed and
a scripted confirmation line (caller-ID spelled out) is injected as a
TTSSpeakFrame first.
The agent's own readback satisfies the gate (no double-ask); info-only calls are
never asked for a number. Runtime-tested all four paths (inject / no-inject /
info-only / inject-then-end).
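A minimal sketch of the gate decision described above, written as plain Python
so the four paths are easy to follow. CallState, BOOKING_KEYWORDS, on_closing,
and the confirmation wording are hypothetical names for illustration; the real
EndCallProcessor wraps the returned line in a TTSSpeakFrame and owns the actual
keyword list and marker.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical keyword list, for illustration only.
BOOKING_KEYWORDS = ("appointment", "schedule", "book")


@dataclass
class CallState:
    caller_id_spoken: str   # caller ID already spelled out for TTS
    phone_marker: str       # substring showing the agent read the number back
    transcript: List[str] = field(default_factory=list)     # both sides
    agent_replies: List[str] = field(default_factory=list)  # agent only
    confirmation_injected: bool = False


def is_booking_call(state: CallState) -> bool:
    """True once any booking keyword has appeared in the call."""
    return any(k in line.lower() for line in state.transcript
               for k in BOOKING_KEYWORDS)


def number_was_spoken(state: CallState) -> bool:
    """True if the agent's own replies already contain the number."""
    return any(state.phone_marker in reply for reply in state.agent_replies)


def on_closing(state: CallState) -> Optional[str]:
    """Return a line to speak instead of hanging up, or None to hang up."""
    if state.confirmation_injected:   # inject-then-end: never ask twice
        return None
    if not is_booking_call(state):    # info-only calls are never asked
        return None
    if number_was_spoken(state):      # the agent's readback satisfies the gate
        return None
    state.confirmation_injected = True
    return (f"Before you go, I have your callback number as "
            f"{state.caller_id_spoken}. Is that right?")
```

Keeping the decision in a pure function like this makes each path testable
without a live call; only the caller needs to know about frames and hang-ups.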
CLAUDE.md: document the safety net, the "never claim a booking" rule, the direct
phone-confirm phrasing, and the insurance "never say we accept" rule.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Reason visibility: the reason WAS extracted ("disintegrated eyes") but only
lived in the Odoo description note. Add it to the post-call log line and to
the Odoo lead title so it's visible at a glance.
- Latency: timed the stages separately. Whisper takes ~0.1s, so the latency is
LLM-side. The ~3s tail was cold model reloads after Ollama's keep-alive expired.
server.py now warms + pins the model on startup (keep_alive=-1; ollama ps shows
UNTIL=Forever), removing cold first-turn stalls. Whisper size left alone (not
the bottleneck).
- CLAUDE.md: insurance rule (never suggest/guess the plan), latency note.
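A sketch of the warm + pin step, assuming it is done with a prompt-less request
to Ollama's /api/generate endpoint on the default local port. The helper names
(build_warmup_payload, warm_and_pin) are hypothetical; server.py may do this
differently.

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if the server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_warmup_payload(model: str) -> dict:
    """A request with no prompt loads the model; keep_alive=-1 keeps it loaded."""
    return {"model": model, "keep_alive": -1}


def warm_and_pin(model: str, url: str = OLLAMA_URL, timeout: float = 120.0) -> int:
    """Send the warm-up request and return the HTTP status code."""
    data = json.dumps(build_warmup_payload(model)).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

Called once at startup, for example warm_and_pin("activeblue-avc:latest"), this
pays the load cost before the first call arrives rather than on the first turn.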
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Document AVA's directed call script — reason first, location, caller info
(address by name), verify phone by readback near the end, wrap up with "anything
else?" — and the gated closing (Goodbye only after the anything-else question).
Note the 8B reliability ceiling on step ordering.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- .env.example: add AGENT_NAME_SPOKEN=Eva.
- CLAUDE.md: note the agent-name respelling (AVA -> Eva, "EE-vuh"), that the
caller-ID is injected pre-spelled (model mangles raw digits), and that the
phone is confirmed near the END of the call, not led with.
(.env itself is gitignored; AGENT_NAME_SPOKEN=Eva set there and live.)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Note in the Call Data Capture table that AVA confirms a matching office and
moves on rather than offering/comparing other offices — the fix for the
"I'm in Kendall" -> "Hollywood or Miami?" off-script behavior.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Document the TTS number-reading fix in the "already solved" section: phone
numbers, street numbers, and zips are spoken digit-by-digit (no "dash"/parens,
country code dropped); dates/times left natural. tts_normalize() holds the rules.
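A hedged sketch of the phone-number rule only, to show the shape of the
transformation. speak_phone_numbers, PHONE_RE, and the 3-3-4 grouping are
assumptions for illustration and are not the contents of tts_normalize();
street numbers and zips are not handled here.

```python
import re

DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]

# Optional +1 country code, then a 10-digit US-style number with optional
# parentheses and separators. The lookarounds avoid matching inside longer
# digit runs.
PHONE_RE = re.compile(
    r"(?<!\d)(?:\+?1[\s.-]?)?\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})(?!\d)"
)


def spell_digits(digits: str) -> str:
    """'305' -> 'three zero five'."""
    return " ".join(DIGIT_WORDS[int(d)] for d in digits)


def speak_phone_numbers(text: str) -> str:
    """Rewrite phone numbers digit-by-digit; leave all other text untouched."""
    def repl(match: "re.Match[str]") -> str:
        # The country code sits outside the capture groups, so it is dropped;
        # commas between groups give the TTS engine a short pause.
        return ", ".join(spell_digits(group) for group in match.groups())
    return PHONE_RE.sub(repl, text)
```

Because the pattern requires ten digits, times and dates such as "3:30" or
"May 5" pass through unchanged, in line with leaving them natural.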
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Long calls overflowed the 4096-token window mid-conversation, forcing Ollama to
truncate and re-evaluate the full context each turn, which caused multi-second
stalls and dead air. Rebuilt activeblue-avc:latest with num_ctx 8192 (rollback tag
activeblue-avc:pre-ctx8k). Combined with removing the 45-day calendar injection,
this keeps long calls well under the window. Doc: context row, Modelfile
reference, and a root-cause note.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Update the call-capture section to reflect the fix — AVA takes the day/time in
the caller's words and defers exact-date confirmation to staff; the 45-day
calendar injection and in-call date validation were removed after a real call
derailed and the 8B model proved unable to compute dates reliably. Post-call
resolved_date is best-effort/staff-verified only.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- New "Call Data Capture & Date Validation" section: the six captured fields
(full name, phone confirm/alternate, office, reason, insurance log-only,
validated preferred date/time), how each is logged, and the per-call calendar
injection that drives date pushback.
- EndCallProcessor note: HANGUP_DELAY_SECS grace pause; Phase 1 gate result.
- .env reference: add HANGUP_DELAY_SECS.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Reframe Change 1/2/3 to record the actual decisions instead of the trialed
swaps: Deepgram and the Twilio Standard API Key were both evaluated and
reverted. Document why the API Key cannot replace the Auth Token (Twilio signs
webhooks with the Auth Token). Update the .env reference, Phase 1 checklist,
dependencies, and open items accordingly; gate zombie-check uses ps/pgrep
(bare process, not Docker).
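To illustrate why the Auth Token is required, here is a sketch of Twilio's
documented webhook signature scheme for form-encoded POSTs: the full URL plus
the sorted key/value pairs is HMAC-SHA1 signed with the Auth Token and
base64-encoded into the X-Twilio-Signature header. The function names and sample
values are hypothetical; in production the validator from Twilio's helper
library is the safer choice.

```python
import base64
import hashlib
import hmac
from typing import Dict


def compute_twilio_signature(auth_token: str, url: str,
                             params: Dict[str, str]) -> str:
    """Recompute the signature Twilio would send for this request."""
    # Append each POST parameter as key+value, sorted by key, with no delimiters.
    payload = url + "".join(key + params[key] for key in sorted(params))
    digest = hmac.new(auth_token.encode("utf-8"),
                      payload.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_request(auth_token: str, url: str, params: Dict[str, str],
                     header_signature: str) -> bool:
    """Constant-time comparison against the X-Twilio-Signature header."""
    expected = compute_twilio_signature(auth_token, url, params)
    return hmac.compare_digest(expected, header_signature)
```

A signature computed with any other secret, such as an API Key secret, does not
match what Twilio sends, so webhook validation fails unless the server holds the
Auth Token.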
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds the full Gitea repo URL to the infrastructure table and the monitoring
dashboard line, and keeps the repository-structure tree root as avc-phone-ai
to match the rest of the doc.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Header, structure tree, and footer pointed at avc-phone-agent; the actual
Gitea repo is avc-phone-ai. The avc-phone-agent-prod Twilio API key name is
left unchanged.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>