The health endpoint called .ping() on both the ollama and odoo clients,
but neither client implemented it, so ollama/odoo always showed as error
and the bot stayed offline.
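A sketch of a tolerant probe (names hypothetical; the real fix may instead implement .ping() on the clients themselves): only call .ping() when the client provides it, so a missing method no longer reads as a hard error.

```python
import asyncio

async def backend_status(client) -> str:
    # Probe .ping() only when the client implements it; a client
    # without the method degrades to "unknown" instead of "error".
    ping = getattr(client, "ping", None)
    if ping is None:
        return "unknown"
    try:
        await ping()
        return "ok"
    except Exception:
        return "error"

class HealthyClient:
    async def ping(self):
        return True

class LegacyClient:  # no ping() at all, like the pre-fix clients
    pass
```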
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Routers calling /registry/agents raised AttributeError because
AgentRegistry.get_all() was not defined. Added a method returning all
registered agents with their active status, capabilities, and instance flags.
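A minimal sketch of the added method (entry fields and storage layout assumed; the real registry differs):

```python
from dataclasses import dataclass, field

@dataclass
class AgentEntry:  # hypothetical registry record
    active: bool = True
    capabilities: list = field(default_factory=list)
    instance: object = None

class AgentRegistry:
    def __init__(self):
        self._agents = {}

    def register(self, name, entry):
        self._agents[name] = entry

    def get_all(self):
        # One dict per registered agent: status, capabilities, and
        # whether a live instance is attached.
        return [
            {"name": name,
             "active": entry.active,
             "capabilities": entry.capabilities,
             "has_instance": entry.instance is not None}
            for name, entry in self._agents.items()
        ]
```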
The classifier was silently falling back to a clarification prompt every
time the LLM wrapped its JSON in markdown fences, prefixed it with
'json', or added surrounding prose. The bot then asked 'Could you
clarify what you need?' to every message regardless of clarity.
Now: strip code fences, slice to the first {...} block, and on parse
failure log the raw content (truncated) and treat the message as 'no
specialist agent' so the direct-answer fallback responds instead of
looping on clarification.
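The extraction step above can be sketched as a helper (name and exact regex are illustrative, not the committed code):

```python
import json
import logging
import re

def extract_json(raw: str):
    """Pull the first {...} block out of an LLM reply, tolerating
    markdown fences, a leading 'json' tag, and surrounding prose."""
    text = raw.strip()
    # Strip ``` fences and an optional json language tag.
    text = re.sub(r"^```(?:json)?\s*|\s*```$", "", text, flags=re.IGNORECASE)
    # Slice to the outermost {...} span.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        return json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        # Log the raw content truncated, then let the caller fall
        # back to the direct-answer path.
        logging.warning("classifier output unparsable: %.200s", text)
        return None
```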
Previously when the LLM classified a message as needing no specialist
agent, the dispatcher built zero directives and _synthesize returned
'No agent responses received.' Greetings, follow-up clarifications,
and general questions all fell into this dead end.
Now when intent.agents is empty and no clarification is needed, the
master makes a second LLM call with the recent conversation as context
and answers directly. Updated master_system.txt to steer the classifier
toward agents=[] for chitchat instead of forcing a clarification loop.
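The routing decision now has three outcomes instead of two; sketched here with hypothetical handler callables standing in for the real dispatcher:

```python
def route(intent: dict, clarify, direct_answer, dispatch):
    """Pick a handler from the classifier output. agents=[] with no
    clarification goes to the direct-answer path instead of
    dead-ending in 'No agent responses received.'"""
    if intent.get("agents"):
        return dispatch(intent["agents"])
    if intent.get("needs_clarification"):
        return clarify()
    return direct_answer()
```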
The f prefix only applied to the first fragment of an implicitly
concatenated string literal ('You don'...), so the {chr(44).join(...)}
placeholder in the later fragment was never evaluated and leaked into
chat output as literal text. Build the message with plain string
concatenation instead.
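A reconstruction of the bug shape (the exact original strings are assumed):

```python
missing = ["ollama", "odoo"]

# Buggy shape: the f prefix covers only the first fragment of an
# implicitly concatenated literal, so the placeholder in the second
# fragment is never evaluated and reaches chat output verbatim.
buggy = f"You don't have access to: " "{chr(44).join(missing)}"

# Fix: plain concatenation, no placeholders at all.
fixed = "You don't have access to: " + ", ".join(missing)
```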
User messages were only saved inside _update_memory at the end of a
successful directive. The clarification and access-denied branches
returned early without ever calling it, so when a clarification turn
asked 'what do you mean?' and the user replied, the original question
was missing from context — the bot looked at a transcript of nothing
but its own clarifying questions and asked yet another.
Save the user message at the top of handle_message so every branch
includes it. Drop the now-duplicate write from _update_memory.
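The save-first pattern, reduced to a toy (memory store and branch conditions are stand-ins for the real ones):

```python
class Memory:
    def __init__(self):
        self.log = []

    def save(self, role, text):
        self.log.append((role, text))

class MasterAgent:
    def __init__(self):
        self.memory = Memory()

    def handle_message(self, user_msg: str) -> str:
        # Persist the user's message first, so early-return branches
        # (clarification, access denied) still leave it in context.
        self.memory.save("user", user_msg)
        if not user_msg.strip():
            reply = "Could you clarify what you need?"
            self.memory.save("assistant", reply)
            return reply
        reply = "Handling: " + user_msg
        self.memory.save("assistant", reply)
        return reply
```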
ollama-python 0.3.x returns the response as a dict, while newer releases
return pydantic objects. The backend assumed objects (response.message)
and crashed with AttributeError on every dispatch. Use a helper that
accepts either shape so the code works across versions.
The prompt template contains a literal JSON example block ({"needs_clarification": ...})
which str.format() tried to interpret as format fields, raising KeyError on every
Discuss DM. Switch to .replace() so braces in the template are taken literally.
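The failure mode and fix in miniature (template text is illustrative):

```python
template = (
    'Classify the message. Respond as JSON, for example:\n'
    '{"needs_clarification": false, "agents": []}\n'
    'Message: {message}'
)

# str.format() treats every brace pair as a replacement field, so the
# literal JSON example raises KeyError on the fake field name.
try:
    template.format(message="hi")
    format_ok = True
except KeyError:
    format_ok = False

# .replace() substitutes only the one placeholder; literal braces survive.
prompt = template.replace("{message}", "hi")
```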
Without exc_info we only see the bare exception string, which has been
unhelpful for debugging Discuss DM failures (e.g. a KeyError whose
message is just a JSON key, with no clue where it was raised).
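The logging change amounts to passing exc_info=True so the traceback is attached to the record; demonstrated with a capturing handler (logger name assumed):

```python
import logging

class Capture(logging.Handler):
    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)

logger = logging.getLogger("agent_service")
logger.setLevel(logging.ERROR)
logger.propagate = False
handler = Capture()
logger.addHandler(handler)

try:
    {}["payload"]  # a KeyError whose message is just the missing key
except Exception as exc:
    # exc_info=True attaches the traceback, so the log shows where
    # the error was raised instead of just the bare key name.
    logger.error("Discuss DM handling failed: %s", exc, exc_info=True)
```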
Odoo's bot model serialises user_id as a string (str(uid)) over the
HTTP boundary, but the asyncpg memory queries ($1) expect an integer.
This caused "'str' object cannot be interpreted as an integer" on every
Discuss DM. Cast at the entry point so downstream stores get an int.
The router was calling handle_message(user_id, message, context, session_id)
but MasterAgent accepts (user_id, channel_id, message, directive_id) and
returns MasterResponse{response, status, ...} with no .reply or
.agent_reports fields. Discuss DMs to the bot crashed with TypeError.
Now the router:
- Derives directive_id from session_id (or generates one)
- Pulls channel_id out of req.context
- Maps MasterResponse.response -> DispatchResponse.reply
- Returns an empty agent_reports list (the field is reserved for future use;
per-agent reports aren't part of MasterResponse)
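The mapping in the list above, sketched with assumed field names (the real response models have more fields):

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class MasterResponse:  # shape per MasterAgent (fields assumed)
    response: str
    status: str

@dataclass
class DispatchResponse:  # router wire format (fields assumed)
    reply: str
    status: str
    agent_reports: list = field(default_factory=list)

def adapt(master, session_id=None):
    # session_id doubles as directive_id; generate one if absent.
    directive_id = session_id or uuid.uuid4().hex
    # MasterResponse.response -> DispatchResponse.reply;
    # agent_reports stays empty (reserved for future use).
    return directive_id, DispatchResponse(reply=master.response,
                                          status=master.status)
```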
Some Odoo instances require the user's actual login/email for API key
auth rather than the __system__ special login. ODOO_USER defaults to
__system__ for standard Odoo 16+ installs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- main.py: MemoryManager(pool=pool, llm=...) -> llm_router=...
Class signature is __init__(self, pool, llm_router=None).
- alembic.ini: script_location = migrations -> agent_service/migrations
When alembic runs from WORKDIR /app inside the container, 'migrations'
resolves to /app/migrations (missing). Correct path is
/app/agent_service/migrations where versions/ actually lives.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fixes all errors reported in docker compose logs agent-service:
1. config.py: add ollama_max_concurrent, claude_timeout, claude_max_concurrent
fields so LLMRouter(config=settings) can read them without AttributeError.
2. main.py - LLM router: drop manual OllamaBackend/ClaudeBackend construction;
call LLMRouter(config=settings, pg_pool=pool) to match class signature.
Fixes: OllamaBackend.__init__() unexpected kwarg 'base_url'.
3. main.py - DB: add 5-attempt retry with 2s backoff and redacted DSN logging.
Fixes: connection refused race on startup before Postgres accepts connections.
4. main.py - AgentRegistry: call AgentRegistry() with no args (class takes none),
then await agent_registry.load_from_odoo(odoo) to populate active agents.
Fixes: AgentRegistry.__init__() unexpected kwarg 'odoo'.
5. main.py - PeerBus: pass registry=agent_registry at construction; register
specialist agents on agent_registry (not peer_bus, which has no register()).
peer_bus.py: make directive_id optional (default None) — bus is a singleton
at startup; directive_id is only needed per-request.
Fixes: PeerBus.__init__() missing positional args 'registry' and 'directive_id'.
6. main.py - MasterAgent: drop unexpected peer_bus= kwarg from constructor call.
Fixes: MasterAgent.__init__() unexpected kwarg 'peer_bus'.
7. mcp_router.py: pass NotificationOptions() instance instead of None.
Fixes: AttributeError 'NoneType' has no attribute 'tools_changed' (was applied
in running container but not committed; now committed).
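Item 3's retry loop can be sketched like this (function and parameter names assumed; the real code uses asyncpg.create_pool and proper logging):

```python
import asyncio
import re

async def connect_with_retry(create_pool, dsn, attempts=5, backoff=2.0):
    # Redact the password before the DSN ever hits the logs.
    redacted = re.sub(r"://([^:/]+):[^@]+@", r"://\1:***@", dsn)
    for attempt in range(1, attempts + 1):
        try:
            return await create_pool(dsn)
        except OSError as exc:
            print(f"DB connect {attempt}/{attempts} to {redacted} failed: {exc}")
            if attempt == attempts:
                raise
            await asyncio.sleep(backoff)
```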
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>