avc-phone-ai/callstate.py at 3ed63d8ea9deb0961ae2dd0b164744d000714f6c

Files

tocmo0nlord a47f4b423c Fix re-asking: deterministic slot memory + user-turn merge + reason-loop prompt

Historical calls showed the 8B re-asking for name/reason/phone it already had
("I already gave you my full name", the "I want an appointment" -> "what brings
you in?" loop) and VAD splitting one utterance into consecutive user turns.

- callstate.py: CallStateGroomer between agg.user() and the LLM. After each
  agent turn (off the critical path) it extracts collected slots via one short
  JSON-mode Ollama pass, then before each generation injects an ALREADY
  COLLECTED / STILL NEEDED checklist into the system message and merges
  VAD-fragmented consecutive user messages. Callback-type calls get an explicit
  "no booking questions" line. CALL_STATE_TRACKING env (auto: on for ollama,
  off for anthropic).
- bot.py prompt step 1: "I want an appointment" is the booking intent, not the
  reason - ask the visit reason once, never twice.
- scripts/ab_replay.py: regression harness replaying the real failed calls.
  llama3.1-8b raw = 3 failures; with CALL STATE = 0 failures across all
  scenarios (chat latency 0.31s -> 0.55s median, well under the 3s gate).
  Qwen3-14B A/B'd and rejected: no better raw, ~3s/turn, 11GB VRAM.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-07-03 23:49:39 +00:00

10 KiB

Raw Blame History

View Raw

10 KiB Raw Blame History

10 KiB

Raw Blame History