Adds a new specialist agent that gives the AI system control over its
own infrastructure:
- sysops_tools.py: Docker SDK (ps/logs/restart) + git CLI (pull/status/log)
  + Odoo channel notifier for autonomous action broadcasts
- sysops_agent.py: BaseAgent subclass handling on-demand chat requests,
  auto_heal() triggered by health failures, and sweep() for audits
- Background auto-heal loop (main.py): runs every 2 minutes, calls
  _get_failing_systems() and triggers auto_heal() when degraded
- health.py: extracted _get_failing_systems() helper reused by both
  the /health/detailed endpoint and the auto-heal loop
- docker-compose.yml: mount the Docker socket + /root/odoo workspace +
  SSH keys for git authentication
- Dockerfile: add git to the apt-get install list
- requirements.txt: add docker==7.1.0 (Docker SDK for Python)
Auto-heal behavior:
- Detects failing containers, restarts them, notifies all bot DM channels
- Ollama (192.168.2.9) is flagged as external and skipped
- On-demand via chat: "restart agent", "check logs", "pull latest code"
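
The background loop described above can be sketched roughly as follows. This is a minimal illustration, not the code in main.py: the callables passed in stand in for _get_failing_systems() and auto_heal(), and EXTERNAL_SYSTEMS, interval_seconds and max_cycles are hypothetical names introduced only for this example.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, Optional

# Assumption: external systems are identified by name; the real check may differ.
EXTERNAL_SYSTEMS = {"ollama"}


async def auto_heal_loop(
    get_failing: Callable[[], Awaitable[Iterable[str]]],
    heal: Callable[[List[str]], Awaitable[None]],
    interval_seconds: float = 120.0,
    max_cycles: Optional[int] = None,
) -> None:
    """Poll for failing systems and heal the ones that are not external."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        cycles += 1
        try:
            # Skip systems flagged as external; they cannot be restarted locally.
            failing = [s for s in await get_failing() if s not in EXTERNAL_SYSTEMS]
            if failing:
                await heal(failing)
        except Exception as exc:
            # One failed cycle should not stop the loop.
            print(f"auto-heal cycle failed: {exc!r}")
        await asyncio.sleep(interval_seconds)
```

Catching exceptions inside the loop body is a design choice for this sketch: a transient error in one cycle is logged and the next cycle still runs.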
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The /registry/agents endpoint returned HTTP 500 on every call because
AgentRegistry.get_all() is async but was called without await.
Also aligns the get_all() dict keys (name, domain) with what the router reads.
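
A minimal sketch of this class of bug, assuming a simplified stand-in for AgentRegistry (the handler names and the stored data are hypothetical). Without await, the caller holds a coroutine object instead of a list, so using it fails; one way that surfaces is a TypeError when iterating.

```python
import asyncio


class AgentRegistry:
    """Simplified stand-in; the real class is not shown in this log."""

    def __init__(self):
        self._agents = {"expenses_agent": "expenses"}

    async def get_all(self):
        return [{"name": n, "domain": d} for n, d in self._agents.items()]


async def list_agents_buggy(registry):
    agents = registry.get_all()  # missing await: this is a coroutine object
    try:
        # Raises TypeError: a coroutine object is not iterable.
        return [a["name"] for a in agents]
    finally:
        agents.close()  # avoid a "never awaited" warning in this demo


async def list_agents_fixed(registry):
    agents = await registry.get_all()  # await yields the actual list
    return [a["name"] for a in agents]
```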
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The master agent was routing expense/receipt requests to finance_agent
instead of expenses_agent because only DB-registered agents appeared
in get_active_agents(). This adds auto-activation of all in-memory
registered agents with precise capability summaries so the LLM picks
the right specialist.
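
The auto-activation idea can be illustrated with a small sketch. It assumes that active DB records take precedence and that in-memory agents missing from that set are added as active; AgentInfo and the two-argument get_active_agents() signature are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class AgentInfo:
    name: str
    capabilities: str  # short summary shown to the routing LLM
    active: bool = True


def get_active_agents(
    db_agents: Iterable[AgentInfo], memory_agents: Iterable[AgentInfo]
) -> List[AgentInfo]:
    """Merge active DB agents with every in-memory registered agent."""
    merged = {a.name: a for a in db_agents if a.active}
    for agent in memory_agents:
        # Auto-activate in-memory agents that are not already in the active set.
        merged.setdefault(agent.name, AgentInfo(agent.name, agent.capabilities))
    return sorted(merged.values(), key=lambda a: a.name)
```

With a DB that only knows finance_agent and an in-memory registry that also holds expenses_agent, both specialists end up in the list the router sees.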
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Routers calling /registry/agents raised AttributeError because
AgentRegistry.get_all() was not defined. Adds the method, which returns
all registered agents with their active status, capabilities, and
instance flags.
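
One possible shape for such a method, as a sketch only: the register() helper, the internal storage, and the has_instance key are assumptions made for this example, and the real field names may differ.

```python
import asyncio
from typing import Any, Dict, List, Optional


class AgentRegistry:
    """Illustrative in-memory registry; not the project's actual class."""

    def __init__(self) -> None:
        self._agents: Dict[str, Dict[str, Any]] = {}

    def register(
        self,
        name: str,
        domain: str,
        capabilities: List[str],
        instance: Optional[object] = None,
        active: bool = True,
    ) -> None:
        self._agents[name] = {
            "domain": domain,
            "capabilities": capabilities,
            "instance": instance,
            "active": active,
        }

    async def get_all(self) -> List[Dict[str, Any]]:
        """Return one plain dict per registered agent, in registration order."""
        return [
            {
                "name": name,
                "domain": rec["domain"],
                "active": rec["active"],
                "capabilities": list(rec["capabilities"]),
                # Flag only; the instance itself is not exposed.
                "has_instance": rec["instance"] is not None,
            }
            for name, rec in self._agents.items()
        ]
```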