Parse the per-call outcome (appointment / callback / none / skipped / incomplete)
from the new "Post-call <kind> saved" / "no actionable request" / "skipping card"
log lines. Add a per-call type column and a "By outcome type" tally, and split
the leads count into appointment + callback, so a mixed test batch is easy to
verify.
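
A minimal sketch of the outcome parsing described above, not the actual
score_calls.py code. Only the three quoted log fragments come from the real
logs. The regex, the helper names (classify_call, tally), the grouping of lines
per call, and the rule that a call with none of the markers counts as
"incomplete" are all assumptions for illustration.

```python
import re
from collections import Counter

# Assumed pattern: only the fragment "Post-call <kind> saved" is known from
# the logs; the surrounding line format is hypothetical.
_SAVED = re.compile(r"Post-call (appointment|callback) saved")


def classify_call(lines):
    """Return one outcome label for the log lines belonging to a single call."""
    for line in lines:
        match = _SAVED.search(line)
        if match:
            return match.group(1)  # "appointment" or "callback"
        if "no actionable request" in line:
            return "none"
        if "skipping card" in line:
            return "skipped"
    # Assumption: no outcome marker at all means the call never finished.
    return "incomplete"


def tally(calls):
    """Count outcomes over a mapping of call id -> list of log lines."""
    return Counter(classify_call(lines) for lines in calls.values())


if __name__ == "__main__":
    # Hypothetical sample lines, not real server output.
    sample = {
        "call-1": ["Post-call appointment saved"],
        "call-2": ["Post-call callback saved"],
        "call-3": ["no actionable request"],
    }
    counts = tally(sample)
    print(dict(counts), "leads:", counts["appointment"] + counts["callback"])
```

The first marker found in a call's lines decides the label, so a call that logs
more than one marker would need an explicit precedence rule.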
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Capacity gating verified deterministically: the atomic _reserve_call_slot grants
exactly MAX_CONCURRENT_CALLS (2) slots, refuses the 3rd, and frees a slot on
hangup; 10 simultaneous attempts are granted only 2 (no race), and /voice
returns BUSY + Hangup at cap. Marked the gate item done (the end-to-end 3-phone
test is optional).
Add scripts/score_calls.py: grades recent calls from the server log against the
Phase 1 gate (turns, LLM->TTS latency, AVC-side hangup, leads, watchdog
re-prompts, errors), for scoring the 10-call run once it has been placed.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>