avc-phone-ai/odoo_client.py at d7bfe2dbe868294c0eec89335034d4cb36605c53

tocmo0nlord ba36ae6891 Log/surface the reason, pin LLM warm for latency, doc insurance rule

- Reason visibility: the reason WAS extracted ("disintegrated eyes") but only
  lived in the Odoo description note. Add it to the post-call log line and to
  the Odoo lead title so it's visible at a glance.
- Latency: timing each stage separately shows Whisper takes ~0.1s, so the
  latency is LLM-side. The ~3s tail came from cold model reloads after Ollama's
  keep-alive expired. server.py now warms and pins the model on startup
  (keep_alive=-1; ollama ps shows UNTIL=Forever), removing cold first-turn
  stalls. Whisper size left alone (not the bottleneck).
- CLAUDE.md: insurance rule (never suggest/guess the plan), latency note.
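
The warm-and-pin step in the latency item can be sketched as follows. This is a minimal illustration and not the code in server.py: the function names `build_warm_request` and `warm_model` and the model name in the usage note are hypothetical, and the URL assumes Ollama's default local port. It relies on two documented Ollama behaviours: a generate request with no prompt only loads the model, and `keep_alive=-1` keeps it loaded indefinitely.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_warm_request(model: str, url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a prompt-less generate request that loads `model` and pins it in memory."""
    payload = {
        "model": model,
        "keep_alive": -1,   # negative value: never unload the model
        "stream": False,    # ask for a single JSON response instead of a stream
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def warm_model(model: str, timeout: float = 120.0) -> bool:
    """Send the warm-up request; return True if Ollama answered with HTTP 200."""
    try:
        with urllib.request.urlopen(build_warm_request(model), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection failures, HTTP errors, and timeouts.
        return False
```

Calling `warm_model("some-model")` once at startup, before the first call is answered, moves the cold load out of the first conversational turn. Afterwards `ollama ps` should list the model with UNTIL set to Forever.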

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-27 04:24:10 +00:00

4.7 KiB
