Phone confirmation: state number, invite correction only (no "yes")

Call-recording analysis showed that the recurring silence after the phone
question is NOT a volume problem: the caller's reply came through at full energy
right after the (long, ~13s) phone question, but VAD never registered it (no
"user started speaking"), so the call stalled until the caller repeated it.
Depending on catching a "yes" right after a long utterance is fragile
(echo/gate timing).

Fix: stop requiring a "yes". AVA now states the number and invites a correction
only ("...; if that's not the best number, just let me know.") and flows on —
the caller only speaks to correct it. Updated the prompt step, the caller-ID
injection, and the deterministic EndCallProcessor line. Verified 4/4.

Docs: phone step, recording + watchdog entries, recording's post-gate limitation.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
tocmo0nlord
2026-06-27 17:57:54 +00:00
parent 80824a7ab0
commit 1cfdf562e2
2 changed files with 27 additions and 14 deletions


@@ -64,6 +64,16 @@ echo false-triggers. Addresses the repeat-yourself / missed-short-answer problem
 **`AudioHeartbeat`** — diagnostic processor that distinguishes VAD failure from
 transport stall. Keep it.

+**Call recording (`AudioBufferProcessor`)** — every call is saved to `recordings/<ts>_<callSID>.wav`
+as **stereo** (caller = left, agent = right) for review/debugging. It sits at the end of the
+pipeline, so the caller channel is what the system *received* (post-`HalfDuplexGate`) — it does
+NOT capture caller audio that arrived while the agent was speaking (gated). `RECORD_CALLS=false`
+to disable. `recordings/` is gitignored.
+
+**`SilenceWatchdog` in `bot.py`** — if the caller goes silent after the agent finishes, it
+re-prompts ("are you still there?") after `SILENCE_REPROMPT_SECS` (7s), and after `MAX_REPROMPTS`
+closes gracefully. Backstop against dead air; `silence_secs` must stay > `HANGUP_DELAY_SECS`.
+
 **`HalfDuplexGate` in `bot.py`** — fixes echo-induced mid-call silence. In this pipecat build
 interruptions are VAD-driven and always on (`PipelineParams.allow_interruptions` does NOT exist
 — it's silently ignored). On a phone line the agent's own TTS echoes back, the VAD reads it as
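
The `SilenceWatchdog` entry added in this hunk reads as a small decision rule. The sketch below is illustrative only and is not the code in `bot.py`: `WatchdogState`, `check_config` and `next_action` are hypothetical names, and the values given for `MAX_REPROMPTS` and `HANGUP_DELAY_SECS` are placeholders, since the doc states only the 7s re-prompt window.

```python
from dataclasses import dataclass

SILENCE_REPROMPT_SECS = 7.0   # stated in the doc
MAX_REPROMPTS = 2             # placeholder: the doc does not give the value
HANGUP_DELAY_SECS = 3.0       # placeholder: the doc does not give the value


@dataclass
class WatchdogState:
    """Hypothetical per-call state: how many re-prompts have been sent."""
    reprompts_sent: int = 0


def check_config(silence_secs: float, hangup_delay_secs: float) -> None:
    """Enforce the documented constraint silence_secs > HANGUP_DELAY_SECS."""
    if silence_secs <= hangup_delay_secs:
        raise ValueError("silence_secs must stay > HANGUP_DELAY_SECS")


def next_action(state: WatchdogState, silent_for: float) -> str:
    """Decide what to do after `silent_for` seconds of caller silence,
    measured from the moment the agent finished speaking."""
    if silent_for < SILENCE_REPROMPT_SECS:
        return "wait"
    if state.reprompts_sent >= MAX_REPROMPTS:
        return "close"          # re-prompt budget used up: end gracefully
    state.reprompts_sent += 1
    return "reprompt"           # e.g. "are you still there?"


check_config(SILENCE_REPROMPT_SECS, HANGUP_DELAY_SECS)
```

With these placeholder values, a caller who stays silent gets two re-prompts and the third silent window closes the call.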
@@ -269,10 +279,12 @@ time, leading the call rather than waiting on the caller. Fixed order:
 2. **Location** — ask city/area, confirm the matching office (don't offer others — see office rule).
 3. **Caller info** — full name (ask last name if only a first is given), then **address the caller
 by name** from there on; insurance (log only); preferred day/time in their words.
-4. **Verify phone** — near the end, state the caller-ID back in one line ("I have your number
-as <number> is that the best number?"), no asking permission first; if not, use the number
-they give. Never raised earlier. **Backed by a deterministic safety net** — if the agent
-skips it, `EndCallProcessor` injects the confirmation before hang-up (see "already solved").
+4. **Confirm phone (no "yes" needed)** — near the end, STATE the caller-ID back and invite a
+correction *only* ("I have your number as <number>; if that's not the best number, just let me
+know."), then flow on. **No yes/no question, no waiting** — depending on catching a "yes" right
+after a long utterance kept failing (echo/gate timing; verified via call recording — the caller's
+reply was received but VAD never registered it). Caller speaks only to correct it. Still backed
+by the deterministic `EndCallProcessor` safety net (also a "let me know if wrong" statement).
 5. **Wrap up** — recap the booking **as a REQUEST** by name ("I've noted your request to come
 in…"), make clear staff will call to confirm, then ask **"Is there anything else I can help
 you with?"**
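
The confirm-phone step and its safety net can be sketched as follows. This is a minimal illustration under assumptions, not the code in `bot.py`: `spoken_phone` is a hypothetical stand-in for the real `_spoken_phone` helper (whose body is not shown in this commit), and its comma-separated, digit-by-digit output format is inferred from the `phone_marker` comment (`"nine seven three"`). Using the marker to test whether the agent already read the number back is likewise an assumption about how `EndCallProcessor` decides to inject the line.

```python
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def spoken_phone(number: str) -> str:
    """Spell a US number digit by digit, in comma-separated groups (3-3-4)."""
    digits = [c for c in number if c.isdigit()]
    if len(digits) == 11 and digits[0] == "1":
        digits = digits[1:]  # drop the US country code
    groups = [digits[:3], digits[3:6], digits[6:]]
    return ", ".join(
        " ".join(DIGIT_WORDS[int(d)] for d in group) for group in groups if group
    )


def confirm_line(spoken: str) -> str:
    """State the number and invite a correction only: no yes/no question."""
    return (f"I have your number as {spoken}; "
            "if that's not the best number, just let me know.")


def needs_safety_net(agent_transcript: str, spoken: str) -> bool:
    """True if the agent never said the first digit group (the marker)."""
    marker = spoken.split(",")[0].strip()  # e.g. "nine seven three"
    return marker not in agent_transcript.lower()


spoken = spoken_phone("+19735550123")
print(confirm_line(spoken))
```

Because the line is a statement rather than a question, the call never depends on VAD catching a short reply right after a long agent utterance.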
@@ -305,7 +317,7 @@ Replies are kept to one short sentence.
 | Field | In-call behavior | Logged as |
 |-------|------------------|-----------|
 | Full name | Asks for last name if only a first is given | `patient_name` / lead `contact_name` |
-| Phone | Confirmed **near the end** (not led with); reads back the caller-ID injected pre-spelled so it's said digit-by-digit and if the caller declines, uses the number they give | `callback_number` (+ `phone_confirmed`) |
+| Phone | Confirmed **near the end** (not led with); STATES the caller-ID back (injected pre-spelled, digit-by-digit) and invites a correction only — **no "yes" required**; uses a different number only if the caller gives one | `callback_number` (+ `phone_confirmed`) |
 | Office / city | Asks city/area; when the caller names a place that matches an office, **confirms that office and moves on** — never offers/compares other offices or asks them to choose; names the nearest only if nothing matches | folded into `reason` prefix |
 | Reason | Captured from the conversation | `reason` |
 | Insurance | **Log only, never suggest or guess** — asks open-endedly (no plan names read out), captures only what the caller says, never fills in/completes/guesses the plan (asks to repeat if unclear), never says "we accept/take" a plan, never promises/confirms/denies coverage or treatment even for a listed plan; staff verify on callback | `insurance` (note: "log only — staff to verify") |
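
The Phone row's "uses a different number only if the caller gives one" rule can be illustrated with a short sketch. It rests on assumptions: `resolve_callback_number` is a hypothetical helper, and treating "the caller gave a number" as "the reply contains digits" is a simplification chosen for the example, not the logic in `bot.py`. How `phone_confirmed` is set under the new flow is not shown in this commit, so the sketch leaves it out.

```python
from typing import Optional


def resolve_callback_number(caller_id: str, reply: Optional[str] = None) -> str:
    """Return the value to log as `callback_number`.

    The caller-ID stands unless the caller actually supplies another number;
    silence, or a stray "yes", changes nothing.
    """
    if reply:
        digits = "".join(ch for ch in reply if ch.isdigit())
        if digits:
            return digits  # the caller gave a different number
    return caller_id


print(resolve_callback_number("+19735550123"))                    # no reply
print(resolve_callback_number("+19735550123", "yes, that's it"))  # harmless "yes"
print(resolve_callback_number("+19735550123", "use 201-555-0199"))
```

Under this rule a missed or unregistered "yes" has no effect on the logged number, which is the failure mode the change is meant to remove.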

bot.py (19 changes)

@@ -152,10 +152,11 @@ SYSTEM_PROMPT = (
     "ask their last name). From this point on, address the caller by their name. Then ask their "
     "insurance (log only — see below) and their preferred day and time (in their own words — "
     "see the date rule below).\n"
-    " 4. VERIFY PHONE — near the end, state the callback number back in ONE line, exactly like: "
-    "'I have your number as <the number spelled out below> — is that the best number to reach "
-    "you?'. Do NOT ask permission first ('may I read your number back?') and do NOT skip this "
-    "step. If it's not right, use the number they give. Don't bring up the phone number before this.\n"
+    " 4. CONFIRM PHONE (no yes needed) — near the end, STATE the callback number back in one "
+    "line and invite a CORRECTION ONLY, exactly like: 'I have your number as <the number spelled "
+    "out below>; if that's not the best number, just let me know.' Do NOT ask a yes/no question, "
+    "do NOT ask permission, and do NOT wait for them to say 'yes' — flow straight into the wrap-up. "
+    "Only act on the phone number if they give you a different one. Don't bring it up before this.\n"
     " 5. WRAP UP — recap the booking as a REQUEST in one warm sentence (for example, 'I've "
     "noted your request to come in tomorrow afternoon at our Kendall office'), make clear a "
     "staff member will call back to CONFIRM it, then ASK IF THERE IS ANYTHING ELSE you can help "
@@ -593,9 +594,10 @@ async def run_agent(transport, caller_number=None, call_sid=None, do_capture=Tru
     if caller_number:
         caller_line = (
             f"\n\nCALLER ID: the caller's number on file, written so you read it digit by digit, "
-            f"is: {_spoken_phone(caller_number)}. When it's time to confirm it (near the end), say "
-            "it back exactly like that and ask if it's the best number; if they say no, use the "
-            "number they give. Do not say it any earlier in the call."
+            f"is: {_spoken_phone(caller_number)}. Near the end, state it back and invite a "
+            "correction only ('...; if that's not the best number, just let me know.') — do NOT "
+            "ask a yes/no question or wait for a 'yes'. Only change it if they give a different "
+            "number. Do not say it any earlier in the call."
         )
     else:
         caller_line = (
@@ -613,8 +615,7 @@ async def run_agent(transport, caller_number=None, call_sid=None, do_capture=Tru
     if caller_number:
         _spoken = _spoken_phone(caller_number)
         phone_confirm_line = (
-            f"Before you go, let me make sure I have the best number to reach you: "
-            f"{_spoken}. Is that correct?"
+            f"Also, I have your number as {_spoken}; if that's not the best number, just let me know."
         )
         phone_marker = _spoken.split(",")[0].strip()  # e.g. "nine seven three"
     else: