- Reason extraction missed symptom-style reasons: a caller said "I'm actually
blind" and the lead logged reason=None (it caught "disintegrated eyes" before
but not this). Broadened the extractor's reason rule to capture the eye
problem/symptom as the reason, not just visit types. Verified 3/3 -> "vision
loss / blindness".
- server.py: move the LLM warmup/pin (keep_alive=-1) from the deprecated
on_event("startup") to a lifespan handler — silences the FastAPI deprecation
warning; model still shows ollama ps UNTIL=Forever.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Reason visibility: the reason WAS extracted ("disintegrated eyes") but only
lived in the Odoo description note. Add it to the post-call log line and to
the Odoo lead title so it's visible at a glance.
- Latency: split the timing — Whisper is ~0.1s, latency is LLM-side. The ~3s
tail was cold model reloads after Ollama's keep-alive expired. server.py now
warms + pins the model on startup (keep_alive=-1, ollama ps UNTIL=Forever),
removing cold first-turn stalls. Whisper size left alone (not the bottleneck).
- CLAUDE.md: insurance rule (never suggest/guess the plan), latency note.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Deepgram and the Twilio Standard API Key were reverted per decision:
- bot.py: restore HintedWhisperSTTService (faster-whisper hotwords), default
model medium; remove DeepgramSTTService import + DEEPGRAM_API_KEY.
- server.py: restore TWILIO_AUTH_TOKEN for X-Twilio-Signature validation and
the serializer auto-hang-up. Twilio signs webhooks with the Auth Token, so
an API Key Secret cannot validate signatures.
- .env.example: back to TWILIO_AUTH_TOKEN + Whisper STT vars.
- .gitignore: ignore runtime *.log (avc_run.log).
OLLAMA_MODEL stays activeblue-avc:latest (the existing pulled tag).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>