Commit Graph

26 Commits

Author SHA1 Message Date
tocmo0nlord
94e2ca1902 Cut smart-turn INCOMPLETE wait 3s -> 1.5s (SMART_TURN_STOP_SECS)
Follow-up test call: no more cancelled replies, but 3-5s response gaps on
turns the smart-turn model judged INCOMPLETE ("I'm due to my annual exam.") -
it waited the library-default 3s of silence before triggering the LLM. Build
the stop strategy explicitly with SmartTurnParams(stop_secs=1.5), env-tunable.
A caller who really does resume just yields a follow-up turn, which is safe
now that interruption broadcasts are off.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-04 03:23:03 +00:00
tocmo0nlord
3ed63d8ea9 Fix dead-air: stop VAD interruption broadcasts under half-duplex
Live call diagnosis (recording + log): replies were generated in <1s but a
false VAD trigger (background noise, no transcript) fired 0.7s later, and the
aggregator's broadcast_interruption silently discarded the queued TTS audio.
Caller heard 20-35s of silence, said "Hello?", repeated themselves. The
HalfDuplexGate only closes while the bot is audibly speaking, so the window
between generation start and first wire audio was unprotected. SilenceWatchdog
never fired because the cancelled reply never emitted BotStoppedSpeaking.

With HALF_DUPLEX on, build the user aggregator with enable_interruptions=False
on both turn-start strategies: strict turn-taking, nothing is ever cancelled.
UserStartedSpeakingFrame still flows, so watchdog resets keep working.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-04 03:07:24 +00:00
tocmo0nlord
7b528eaed2 Greeting discloses AVA is automated (HIPAA item); never claim to be human
CLAUDE.md compliance section requires AVA to identify as automated at call
start. Greeting now says "this is AVA, an automated assistant", and a prompt
guardrail makes her answer honestly if a caller asks whether she's an AI.
Replay harness greeting kept in sync.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-04 02:30:54 +00:00
tocmo0nlord
a47f4b423c Fix re-asking: deterministic slot memory + user-turn merge + reason-loop prompt
Historical calls showed the 8B re-asking for name/reason/phone it already had
("I already gave you my full name", the "I want an appointment" -> "what brings
you in?" loop) and VAD splitting one utterance into consecutive user turns.

- callstate.py: CallStateGroomer between agg.user() and the LLM. After each
  agent turn (off the critical path) it extracts collected slots via one short
  JSON-mode Ollama pass, then before each generation injects an ALREADY
  COLLECTED / STILL NEEDED checklist into the system message and merges
  VAD-fragmented consecutive user messages. Callback-type calls get an explicit
  "no booking questions" line. CALL_STATE_TRACKING env (auto: on for ollama,
  off for anthropic).
- bot.py prompt step 1: "I want an appointment" is the booking intent, not the
  reason - ask the visit reason once, never twice.
- scripts/ab_replay.py: regression harness replaying the real failed calls.
  llama3.1-8b raw = 3 failures; with CALL STATE = 0 failures across all
  scenarios (chat latency 0.31s -> 0.55s median, well under the 3s gate).
  Qwen3-14B A/B'd and rejected: no better raw, ~3s/turn, 11GB VRAM.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-03 23:49:39 +00:00
tocmo0nlord
bae388420b Return phone pleasantries (answer "how are you?")
A caller opened with "how are you doing?" and AVA jumped straight to "what
brings you in," ignoring it, and the caller pushed back. Added a pleasantries
rule: if greeted or asked how it's doing, AVA warmly answers and asks back in
the same breath, then continues to helping — never ignores a greeting. Verified
3/3 on greeting openers. (Insurance-slip-on-callback accepted as a model-ceiling
item, no change.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 15:36:36 +00:00
tocmo0nlord
12bf494b5f Tighten callback flow: no booking questions, no re-asking
On a real order-status call the callback note saved correctly, but AVA leaked
booking behavior: it asked for insurance, asked the caller "what's the status of
your order" (which is what THEY were asking), and re-asked the name. Rewrote the
callback branch as explicit short steps: acknowledge it can't look it up; note
what they're asking (don't make them repeat it, never ask the caller for what
only staff can look up); get the name only if not already given; confirm the
callback number; promise a staff callback. Explicitly: no insurance/office/
day-time for callbacks. Verified clean on the order-status scenario.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 15:27:05 +00:00
tocmo0nlord
97e109ed89 Handle non-booking requests: take a message, log a callback note
A caller asking if their already-purchased frames were ready got railroaded
through the booking script and hung up. AVA had no path for requests it can't do
on the phone.

- Prompt: classify intent first — question (answer it), can't-do request (take
  name + a one-line note, confirm callback number, promise a staff callback;
  never force booking questions), or booking (the ordered steps).
- extract.py: request_type = appointment | callback | none. Callback gate needs
  a name or a request note. Records kind.
- practice.py / odoo_client.py: callbacks write a "📞 Callback request" lead
  (name, callback number, what they need) instead of an appointment card.

Verified the classifier: frames-status -> callback, booking -> appointment,
pure question -> none.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 14:46:23 +00:00
tocmo0nlord
7f34b06415 Stop spurious watchdog re-prompts after the call ends
The SilenceWatchdog armed a timer on the goodbye turn; it then fired ~silence_secs
later, after EndCallProcessor had already hung up — logging phantom "re-prompt"s
(3 of 5 in the last batch were after "Goodbye"). Now it stops for good on a
closing keyword (LLMFullResponseEnd) or an EndFrame/CancelFrame, so it never
re-arms once the call is closing. Real mid-call silences still re-prompt.
Runtime-tested: no reprompt after goodbye / endframe; reprompt on normal silence.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 00:14:53 +00:00
tocmo0nlord
a521dc168e Fix GPU OOM: share one Whisper model across calls (was leaking per call)
Calls were dropping right after answer with "CUDA failed with error out of
memory". Cause: each call constructed a new HintedWhisperSTTService -> new
ctranslate2 WhisperModel on the GPU, and that VRAM was never released when the
call ended. Over ~13 calls the python process grew to 9.7GB; with the pinned LLM
(6GB) the 16GB GPU filled (14 MiB free) and Whisper load failed on every call.

Fix: cache one WhisperModel per (model,device,compute) in _WHISPER_MODEL_CACHE
and reuse it across all calls; bake the fixed hotwords into the shared model's
transcribe() once (drops the racy per-call monkey-patch). VRAM now constant
(~6GB LLM + ~1.5GB Whisper). Verified: two instances share one model object;
GPU back to 6.0/16GB used after restart. Documented the VRAM budget.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 22:07:59 +00:00
tocmo0nlord
1cfdf562e2 Phone confirmation: state number, invite correction only (no "yes")
Call-recording analysis proved the repetitive post-phone silence is NOT volume:
the caller's reply was a full-energy sound right after the (long ~13s) phone
question, but VAD never registered it (no "user started speaking"), so the call
waited until the caller repeated it. Depending on catching a "yes" after a long
utterance is fragile (echo/gate timing).

Fix: stop requiring a "yes". AVA now states the number and invites a correction
only ("...; if that's not the best number, just let me know.") and flows on —
the caller only speaks to correct it. Updated the prompt step, the caller-ID
injection, and the deterministic EndCallProcessor line. Verified 4/4.

Docs: phone step, recording + watchdog entries, recording's post-gate limitation.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 17:57:54 +00:00
tocmo0nlord
80824a7ab0 Add call recording (stereo WAV) + wire silence re-prompt watchdog
Stop debugging silence by guesswork: AudioBufferProcessor records every call to
recordings/<ts>_<callsid>.wav (caller=left, agent=right) so calls can be reviewed
with actual audio. (We had no audio before — that was the real gap; the earlier
"too quiet" explanation was unsupported.)

SilenceWatchdog: after the agent finishes, if the caller is silent for
SILENCE_REPROMPT_SECS (7s) it re-prompts ("are you still there?"); after
MAX_REPROMPTS it closes gracefully. This directly breaks the dead-silence
pattern (e.g. the 14s gap after the phone confirmation) instead of waiting.
Runtime-tested both. .gitignore already excludes recordings/.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 17:46:07 +00:00
tocmo0nlord
b0df7fd5b0 Fix missed quiet "yes" after phone confirmation: more sensitive VAD
After the phone confirmation a caller's "yes" wasn't picked up (silence) until
they repeated it louder. Logs: line was live and the half-duplex gate had
reopened, but VAD never fired for ~14s — the quick/quiet "yes" was below
threshold (min_volume 0.3, start_secs 0.2).

Now that HalfDuplexGate gates out the agent's echo while it speaks, VAD can be
sensitive without echo false-triggers (it only listens hard on the caller's
turn). Lowered min_volume 0.3->0.15, start_secs 0.2->0.1, and trimmed the echo
tail 0.5->0.25 so an answer right after the agent stops isn't dropped.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 17:36:20 +00:00
tocmo0nlord
32a3bb7136 Fix echo-induced silence with a half-duplex audio gate
A caller's reply was generated but never heard: 0.65s after the agent started
speaking, the VAD fired "user started speaking" (NO transcript) and broadcast an
interruption that cancelled the agent's audio -> ~24s of silence until the caller
spoke again. Cause: the agent's own TTS echoes back the phone line and the
always-on VAD interruption treats it as a barge-in. (PipelineParams has no
allow_interruptions in this pipecat build — it was a silent no-op.)

Fix: HalfDuplexGate before the VAD withholds inbound audio while the bot speaks
(+ECHO_TAIL_SECS, default 0.5s), so echo can't trigger a false barge-in.
Half-duplex (no mid-utterance barge-in); HALF_DUPLEX=false to restore it.
Runtime-tested the gate (pass idle / drop while speaking / drop in tail / resume).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 16:44:00 +00:00
tocmo0nlord
ceea3d151c Fix mid-call silence: keep momentum after acknowledgments
A caller gave their insurance; AVA replied with a bare acknowledgment ("staff
will verify your coverage") and stopped, with no follow-up question. Both sides
then waited -> dead air (pipeline idle, no GPU/LLM activity, matching flat
memory/wattage). Caller had to break the silence with "what questions do you
have?". Root cause: the one-sentence brevity rule made AVA end a booking turn on
a dead-end statement.

Fix: prompt now requires, until the booking is complete, that every turn end
with the next question — acknowledgment + next question in the same turn (e.g.
insurance ack -> immediately ask day/time). Verified 4/4. Documented in CLAUDE.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 16:34:25 +00:00
tocmo0nlord
d7bfe2dbe8 Deterministic phone confirmation safety net + docs
EndCallProcessor now guarantees the callback number is confirmed on booking
calls: the 8B reads it back only ~half the time, so if a closing is reached on a
booking call (booking keyword seen) without the agent having spoken the number
(phone_marker absent from its replies), the hang-up is suppressed and a scripted
confirmation line (caller-ID spelled out) is injected as a TTSSpeakFrame first.
The agent's own readback satisfies the gate (no double-ask); info-only calls are
never asked for a number. Runtime-tested all four paths (inject / no-inject /
info-only / inject-then-end).

CLAUDE.md: document the safety net, the "never claim a booking" rule, the direct
phone-confirm phrasing, and the insurance "never say we accept" rule.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 15:52:22 +00:00
tocmo0nlord
1e0472e864 Never claim the appointment is confirmed; clean phone-confirm + insurance
Fixes from a test call:
- Contradiction: AVA said "staff will confirm" then later "we've got your
  appointment scheduled". Hardened the rule — never say booked/scheduled/set/
  confirmed (even in the recap); it's always a REQUEST staff confirm on callback.
  Wrap-up recaps as "I've noted your request...".
- Phone: it asked "may I read your number back?" then read it anyway. Now states
  it directly in one line ("I have your number as <number> - is that best?"),
  no permission ask, don't skip.
- Insurance: stop saying "we accept/take <plan>" (it said "we accept All State",
  which isn't even a listed plan) — just note it, staff verify.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 15:33:27 +00:00
tocmo0nlord
8b52097713 Stop insurance hallucination: never suggest or guess the plan
A caller trailed off ("My insurance plan is...") and AVA filled in "CarePlus",
which got logged to the lead. Tightened the insurance rule: ask open-endedly,
do NOT read out/suggest plan names from the accepted list, capture only what the
caller says, never fill in/complete/guess the plan, and ask them to repeat if
unclear. Verified 4/4 on the trail-off case (asks to repeat, no fabricated plan).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 04:13:50 +00:00
tocmo0nlord
9d65fa9aaa Give AVA a clear ordered call workflow
Replace loose "gather these details" with a directed script so the call has
clear direction:
  1. Reason first — what are they calling about
  2. Location — city/area, confirm the matching office
  3. Caller info — full name, then address them by name; insurance (log only),
     preferred day/time
  4. Verify phone near the end by reading it back
  5. Wrap up — recap, then "Is there anything else I can help you with?"

Closing hardened: "Goodbye" (which ends the call) is gated behind the
anything-else question, never said in the same turn as confirming details.
Be warm but direct; one short turn at a time.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 03:44:03 +00:00
tocmo0nlord
703c902d0f Fix phone readback, lead-with-number flow, and AVA pronunciation
- Phone: inject the caller-ID into the prompt already spelled digit-by-digit so
  the model repeats clean words instead of mangling raw digits (it had emitted
  "197-three five seven three..." -> Kokoro read "one hundred ninety-seven").
- Flow: stop leading with the phone number. Prompt now flows naturally and
  saves the callback-number confirmation for the END; the caller-ID line says
  not to recite it early. Verified 3/3 openings no longer recite the number.
- Name: Kokoro spelled all-caps "AVA" as "A-V-A". Respell to AGENT_NAME_SPOKEN
  (default "Ava") in TTS only; logs/Odoo keep AGENT_NAME. Override e.g.
  AGENT_NAME_SPOKEN=Eva for an "EE-vuh" sound.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 02:08:52 +00:00
tocmo0nlord
6010b136a7 Fix office selection: confirm the matching office, don't offer others
When a caller named a city matching an office ("I'm in Kendall"), AVA confirmed
Kendall then asked them to pick between unrelated offices ("Hollywood or
Miami?"), going off script. Tightened the prompt: on a city that matches an
office, confirm THAT office and move on; never offer/compare other offices or
ask the caller to choose; name the nearest only if nothing matches. Verified
3/3 on the failing scenario.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 01:30:35 +00:00
tocmo0nlord
92abe209f3 Fix SpokenKokoroTTSService.run_tts signature (broke all call audio)
run_tts is called as run_tts(self, text, context_id); the override only accepted
(self, text), so every utterance raised "takes 2 positional arguments but 3
were given" and produced no audio — callers heard nothing on every call since
the number-normalization change. Added context_id and pass it through. Verified
the service now emits audio (118KB for a sample) with digits normalized.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 01:13:45 +00:00
tocmo0nlord
1204d24340 Read phone numbers, street numbers, and zips digit-by-digit in TTS
Kokoro spoke "983-4969" as "nine hundred eighty-three dash forty-nine sixty-
nine". Added SpokenKokoroTTSService which normalizes text just before synthesis
(run_tts gets the full sentence): US phone patterns and 4-5 digit runs (street
numbers, zips) are spoken one digit at a time, country code dropped, no "dash"/
parens. Dates and times are left natural. Deterministic, so it's robust to
whatever the model emits.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-25 04:17:54 +00:00
tocmo0nlord
19728e1555 Fix bad-call regressions: drop in-call date computation, tighten replies
A real call derailed: AVA argued about today's date, parroted the canned date
example, hallucinated appointment availability, and rambled. Root cause was the
date-validation feature — the local 8B model computes appointment dates wrong
~5/5 in testing, so having it state/correct dates is a liability.

- DATES: capture & defer — AVA takes the day/time in the caller's own words,
  never computes/states/corrects the calendar date, never argues about today;
  staff confirm the exact date on callback. Removed the 45-day calendar
  injection and _date_context()/datetime use.
- Hardened the no-availability rule (no "openings", no "check availability",
  no "I'll book").
- Brevity: one short sentence per reply (two at most).

Post-call extractor still records a best-effort resolved date (staff-verified).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-25 03:20:22 +00:00
tocmo0nlord
b8c71b15c2 Capture full appointment details + validate dates in-call
In-call (system prompt + per-call calendar injection):
- Gather full name (prompt asks for last name if only first given).
- Confirm the caller-ID number; if declined, use the number the caller gives.
- Ask for and LOG insurance only — never promise/confirm/deny coverage or
  treatment based on it; staff verify on callback.
- Validate the requested date against an injected 45-day calendar (recomputed
  per call since the server is long-running). Push back on impossible/mismatched
  dates, e.g. "Monday lands on the sixth — would you like that date?".
- AGENT_NAME=AVA; 4s grace pause before hang-up (HANGUP_DELAY_SECS).

Logging (post-call extraction -> Odoo):
- Extract full name, phone_confirmed, chosen callback (caller-ID or alternate),
  insurance, reason, and preferred time annotated with a resolved YYYY-MM-DD
  date (today's date is fed to the extractor).
- odoo_client: insurance row on the lead note (log only — staff verify).

.gitignore: ignore rotated avc_run.log* files.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-25 03:00:35 +00:00
tocmo0nlord
5ed641255c Revert Phase 1 STT/auth swaps: stay on Whisper + Twilio Auth Token
Deepgram and the Twilio Standard API Key were reverted per decision:
- bot.py: restore HintedWhisperSTTService (faster-whisper hotwords), default
  model medium; remove DeepgramSTTService import + DEEPGRAM_API_KEY.
- server.py: restore TWILIO_AUTH_TOKEN for X-Twilio-Signature validation and
  the serializer auto-hang-up. Twilio signs webhooks with the Auth Token, so
  an API Key Secret cannot validate signatures.
- .env.example: back to TWILIO_AUTH_TOKEN + Whisper STT vars.
- .gitignore: ignore runtime *.log (avc_run.log).

OLLAMA_MODEL stays activeblue-avc:latest (the existing pulled tag).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-25 01:06:24 +00:00
tocmo0nlord
c3c719b77e Initial commit: avc-phone-ai codebase + CLAUDE.md 2026-06-23 22:38:22 +00:00