Deterministic phone confirmation safety net + docs

EndCallProcessor now guarantees the callback number is confirmed on booking calls: the 8B reads it back only ~half the time, so if a closing is reached on a booking call (booking keyword seen) without the agent having spoken the number (phone_marker absent from its replies), the hang-up is suppressed and a scripted confirmation line (caller-ID spelled out) is injected as a TTSSpeakFrame first. The agent's own readback satisfies the gate (no double-ask); info-only calls are never asked for a number.

Runtime-tested all four paths (inject / no-inject / info-only / inject-then-end).

CLAUDE.md: document the safety net, the "never claim a booking" rule, the direct phone-confirm phrasing, and the insurance "never say we accept" rule.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
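The four paths named in the commit message can be modelled without Pipecat. The following is a minimal sketch, not the code in bot.py: `ClosingGate` and `run` are hypothetical names, frames are replaced by plain method calls, and closing detection is assumed to be a lowercase substring match (the body of `_is_closing` is not shown in this diff).

```python
from dataclasses import dataclass
from typing import List, Optional

# Keyword tuples copied from the diff; everything else here is illustrative.
CLOSINGS = ("goodbye", "good-bye", "good bye", "adiós", "adios", "hasta luego")
BOOKING_KWS = ("appointment", "schedule", "book", "insurance", "what day",
               "what time", "come in", "preferred")


@dataclass
class ClosingGate:
    """Framework-free model of the phone-confirmation gate (hypothetical class)."""
    confirm_line: Optional[str] = None
    marker: str = ""

    def __post_init__(self) -> None:
        self.marker = (self.marker or "").lower()
        # No caller ID means there is nothing to confirm.
        self.confirmed = not self.confirm_line
        self.seen = ""             # everything the agent has said, lowercased
        self.buf = ""              # the reply currently being streamed
        self.pending_inject = False
        self.should_end = False

    def on_text(self, text: str) -> None:
        self.buf += text
        self.seen += text.lower()
        if self.marker and self.marker in self.seen:
            self.confirmed = True  # the agent read the number back itself

    def on_reply_end(self) -> None:
        # Assumption: a closing is any CLOSINGS entry found in the reply.
        if any(c in self.buf.lower() for c in CLOSINGS):
            booking = any(k in self.seen for k in BOOKING_KWS)
            if self.confirmed or not booking:
                self.should_end = True
            else:
                self.pending_inject = True
        self.buf = ""

    def on_bot_stopped(self) -> Optional[str]:
        """Return 'inject', 'hang_up', or None for this turn."""
        if self.pending_inject:
            self.pending_inject = False
            self.confirmed = True
            return "inject"
        if self.should_end:
            self.should_end = False
            return "hang_up"
        return None


def run(gate: ClosingGate, replies: List[str]) -> List[Optional[str]]:
    """Feed whole replies through the gate and collect the action after each."""
    actions = []
    for reply in replies:
        gate.on_text(reply)
        gate.on_reply_end()
        actions.append(gate.on_bot_stopped())
    return actions
```

In this model the inject path consumes the first closing, and only a later closing produces the hang-up, which corresponds to the inject-then-end case.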
31
CLAUDE.md
@@ -40,6 +40,14 @@ Watches LLM text stream for closing keywords ("goodbye"), waits for TTS to finis
 clipped, then pushes `EndTaskFrame` upstream. `TwilioFrameSerializer` with `auto_hang_up`
 drops the carrier leg. Verified working in the Phase 1 gate (4/4 clean hang-ups).
 
+It also **deterministically guarantees the callback number is confirmed** on booking calls:
+the 8B reads the number back only ~half the time, so if a closing is reached on a booking
+call (booking keyword seen) without the agent having spoken the number (`phone_marker` not
+seen in its replies), the hang-up is suppressed and a scripted confirmation line
+(`phone_confirm_line`, the caller-ID spelled out) is injected as a `TTSSpeakFrame` first.
+The agent's own readback satisfies the gate, so there's no double-ask in the common case;
+info-only calls (no booking keyword) are never asked for a number.
+
 **Mulaw 8kHz ↔ 16kHz conversion** — handled internally by `TwilioFrameSerializer`.
 `PIPELINE_SAMPLE_RATE = 16000`, `WIRE_SAMPLE_RATE = 8000` are already set correctly.
 No custom audio module needed.
@@ -250,18 +258,25 @@ time, leading the call rather than waiting on the caller. Fixed order:
 2. **Location** — ask city/area, confirm the matching office (don't offer others — see office rule).
 3. **Caller info** — full name (ask last name if only a first is given), then **address the caller
 by name** from there on; insurance (log only); preferred day/time in their words.
-4. **Verify phone** — near the end, read the caller-ID back digit-by-digit and ask if it's best;
-if not, use the number they give. Never raised earlier in the call.
-5. **Wrap up** — recap the booking by name, then ask **"Is there anything else I can help you
-with?"**
+4. **Verify phone** — near the end, state the caller-ID back in one line ("I have your number
+as <number> — is that the best number?"), no asking permission first; if not, use the number
+they give. Never raised earlier. **Backed by a deterministic safety net** — if the agent
+skips it, `EndCallProcessor` injects the confirmation before hang-up (see "already solved").
+5. **Wrap up** — recap the booking **as a REQUEST** by name ("I've noted your request to come
+in…"), make clear staff will call to confirm, then ask **"Is there anything else I can help
+you with?"**
 
+**Never claims a booking:** AVA must never say an appointment is "booked / scheduled / set /
+confirmed" — everything is a request staff confirm on callback. **Insurance:** never say "we
+accept/take" a plan (or invent one) — just note what the caller said; staff verify.
+
 **Closing is gated:** the word "Goodbye" ends the call (triggers `EndCallProcessor` → hang-up),
 so it is never said in the same turn as confirming details and never before the anything-else
 question — only after the caller says they need nothing more.
 
-> Reliability: this is prompt-driven on the local 8B, so order is followed well but not
-> perfectly — the phone-readback step in particular varies (sometimes reads back, sometimes
-> asks for the number), and it can re-ask a last name. Same model ceiling noted elsewhere.
+> Reliability: the script is prompt-driven on the local 8B (order followed well, not perfectly;
+> it can re-ask a last name). The phone-confirmation step is the exception — it's now
+> **guaranteed** by the deterministic `EndCallProcessor` safety net.
 
 ## Call Data Capture
 
@@ -277,7 +292,7 @@ Replies are kept to one short sentence.
 | Phone | Confirmed **near the end** (not led with); reads back the caller-ID — injected pre-spelled so it's said digit-by-digit — and if the caller declines, uses the number they give | `callback_number` (+ `phone_confirmed`) |
 | Office / city | Asks city/area; when the caller names a place that matches an office, **confirms that office and moves on** — never offers/compares other offices or asks them to choose; names the nearest only if nothing matches | folded into `reason` prefix |
 | Reason | Captured from the conversation | `reason` |
-| Insurance | **Log only, never suggest or guess** — asks open-endedly (no plan names read out), captures only what the caller says, never fills in/completes/guesses the plan (asks them to repeat if unclear), never promises/confirms/denies coverage or treatment even for a listed plan; staff verify on callback | `insurance` (note: "log only — staff to verify") |
+| Insurance | **Log only, never suggest or guess** — asks open-endedly (no plan names read out), captures only what the caller says, never fills in/completes/guesses the plan (asks to repeat if unclear), never says "we accept/take" a plan, never promises/confirms/denies coverage or treatment even for a listed plan; staff verify on callback | `insurance` (note: "log only — staff to verify") |
 | Preferred day & time | **Capture & defer** — taken in the caller's own words; AVA does not compute or correct the date | `preferred_time` + best-effort resolved `YYYY-MM-DD` |
 
 ### Dates — capture & defer (do NOT compute in-call)
72
bot.py
@@ -203,20 +203,35 @@ def _build_tools() -> ToolsSchema:
 
 
 class EndCallProcessor(FrameProcessor):
-    """Lets the agent hang up. MUST sit between the LLM and the TTS: there it sees her reply
-    text (LLMTextFrame, flowing downstream) AND the upstream copy of BotStoppedSpeakingFrame
-    the output transport emits. It accumulates each reply; if the finished reply contains a
-    closing ('goodbye'/'adiós'), it waits until she's done speaking, pauses HANGUP_DELAY_SECS
-    so the caller isn't clipped, then pushes EndTaskFrame upstream — the task ends and
-    TwilioFrameSerializer (auto_hang_up) drops the call."""
+    """Lets the agent hang up AND guarantees the callback number is confirmed once.
+
+    Sits between the LLM and the TTS: it sees reply text (LLMTextFrame, downstream) and the
+    upstream BotStoppedSpeakingFrame. On a closing ('goodbye'/'adiós') it waits for TTS to
+    finish, pauses HANGUP_DELAY_SECS so the caller isn't clipped, then pushes EndTaskFrame
+    (TwilioFrameSerializer auto_hang_up drops the call).
+
+    Deterministic phone confirmation: the prompt asks the agent to read the callback number
+    back, but the 8B skips it ~half the time. So if a closing is reached and the agent never
+    spoke the number this call (`phone_marker` not seen in its replies), we suppress the
+    hang-up and inject a scripted confirmation turn first — guaranteeing it happens exactly
+    once (the agent's own readback satisfies the gate, so no double-ask in the common case)."""
 
     _CLOSINGS = ("goodbye", "good-bye", "good bye", "adiós", "adios", "hasta luego")
+    # Only force phone confirmation when a booking was actually underway (not info-only calls).
+    _BOOKING_KWS = ("appointment", "schedule", "book", "insurance", "what day", "what time",
+                    "come in", "preferred")
 
-    def __init__(self):
+    def __init__(self, phone_confirm_line: str | None = None, phone_marker: str | None = None):
         super().__init__()
         self._buf = ""
         self._should_end = False
         self._end_task = None
+        self._phone_confirm_line = phone_confirm_line
+        self._phone_marker = (phone_marker or "").lower()
+        # Nothing to confirm (no caller ID) → treat as already handled.
+        self._phone_confirmed = not phone_confirm_line
+        self._assistant_seen = ""
+        self._pending_phone_inject = False
 
     @classmethod
     def _is_closing(cls, text: str) -> bool:
@@ -235,17 +250,31 @@ class EndCallProcessor(FrameProcessor):
         await super().process_frame(frame, direction)
         if isinstance(frame, LLMTextFrame):
             self._buf += frame.text
+            self._assistant_seen += frame.text.lower()
+            if self._phone_marker and self._phone_marker in self._assistant_seen:
+                self._phone_confirmed = True # the agent read the number back itself
         elif isinstance(frame, LLMFullResponseEndFrame):
             if self._is_closing(self._buf):
-                self._should_end = True
-                logger.info(f"{AGENT_NAME} signalled closing -- will hang up "
-                            f"{HANGUP_DELAY_SECS:.0f}s after she finishes speaking")
+                booking = any(k in self._assistant_seen for k in self._BOOKING_KWS)
+                if self._phone_confirmed or not booking:
+                    self._should_end = True
+                    logger.info(f"{AGENT_NAME} signalled closing -- will hang up "
+                                f"{HANGUP_DELAY_SECS:.0f}s after she finishes speaking")
+                else:
+                    # Booking call closing without the number confirmed — do it deterministically.
+                    self._pending_phone_inject = True
+                    logger.info(f"{AGENT_NAME} reached closing w/o phone confirmation -- injecting it")
             self._buf = ""
-        elif isinstance(frame, BotStoppedSpeakingFrame) and self._should_end:
-            self._should_end = False
-            # Schedule the teardown so we don't block the pipeline during the grace pause.
-            if self._end_task is None:
-                self._end_task = asyncio.create_task(self._hang_up_after_delay())
+        elif isinstance(frame, BotStoppedSpeakingFrame):
+            if self._pending_phone_inject:
+                self._pending_phone_inject = False
+                self._phone_confirmed = True
+                await self.push_frame(TTSSpeakFrame(self._phone_confirm_line), FrameDirection.DOWNSTREAM)
+            elif self._should_end:
+                self._should_end = False
+                # Schedule the teardown so we don't block the pipeline during the grace pause.
+                if self._end_task is None:
+                    self._end_task = asyncio.create_task(self._hang_up_after_delay())
         await self.push_frame(frame, direction)
 
 
@@ -455,7 +484,18 @@ async def run_agent(transport, caller_number=None, call_sid=None, do_capture=Tru
     context_kwargs["tools"] = _build_tools()
     context = LLMContext(**context_kwargs)
     agg = LLMContextAggregatorPair(context)
-    endcall = EndCallProcessor()
+    # Deterministic phone-confirmation safety net: if the agent reaches a closing without
+    # having read the caller-ID back, EndCallProcessor speaks this scripted line first.
+    if caller_number:
+        _spoken = _spoken_phone(caller_number)
+        phone_confirm_line = (
+            f"Before you go, let me make sure I have the best number to reach you: "
+            f"{_spoken}. Is that correct?"
+        )
+        phone_marker = _spoken.split(",")[0].strip() # e.g. "nine seven three"
+    else:
+        phone_confirm_line = phone_marker = None
+    endcall = EndCallProcessor(phone_confirm_line=phone_confirm_line, phone_marker=phone_marker)
 
     pipeline = Pipeline(
         [
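The body of `_spoken_phone` is not part of this diff. Judging only from how its result is used above (`split(",")[0]` yielding something like "nine seven three"), it plausibly returns the digits spelled out in comma-separated groups. The sketch below is an assumption along those lines: `spoken_phone_sketch` and `phone_marker_from` are hypothetical names, and the 3-3-4 grouping with US country-code stripping is a guess, not the repository's behaviour.

```python
import re

_DIGIT_WORDS = ("zero", "one", "two", "three", "four",
                "five", "six", "seven", "eight", "nine")


def spoken_phone_sketch(number: str) -> str:
    """Spell a caller ID digit by digit, grouped 3-3-4 when it has ten digits."""
    digits = re.sub(r"\D", "", number)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # assumption: drop a leading US country code
    if len(digits) == 10:
        groups = [digits[:3], digits[3:6], digits[6:]]
    else:
        groups = [digits]    # unknown shape: read it as a single group
    return ", ".join(
        " ".join(_DIGIT_WORDS[int(d)] for d in group) for group in groups if group
    )


def phone_marker_from(spoken: str) -> str:
    """Same derivation as in the diff: the first comma-separated group."""
    return spoken.split(",")[0].strip()
```

Under these assumptions the marker is the spelled-out area code, so the gate counts the number as confirmed only when the agent's reply contains those exact words in that order. A readback phrased as numerals or grouped differently would not match, and the scripted line would then be injected.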