feat: sysops_agent — Docker/git self-management with auto-heal

Adds a new specialist agent that gives the AI system control over its
own infrastructure:

- sysops_tools.py: docker SDK (ps/logs/restart) + git CLI (pull/status/log)
  + Odoo channel notifier for autonomous action broadcasts
- sysops_agent.py: BaseAgent subclass handling on-demand chat requests,
  auto_heal() triggered by health failures, and sweep() for audits
- main.py: background auto-heal loop that runs every 2 minutes, calls
  _get_failing_systems(), and triggers auto_heal() when any system is degraded
- health.py: extracted _get_failing_systems() helper reused by both
  the /health/detailed endpoint and the auto-heal loop
- docker-compose.yml: mount the Docker socket, the /root/odoo workspace,
  and SSH keys for git authentication
- Dockerfile: install git via apt-get
- requirements.txt: add docker==7.1.0 Python SDK
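
The background loop described above could be sketched roughly as follows. This is a minimal illustration, not the code in main.py: it assumes both _get_failing_systems() and auto_heal() are exposed as async callables, and the names auto_heal_loop, interval, and max_cycles are hypothetical (max_cycles exists only so the loop can be exercised in a test).

```python
import asyncio
from typing import Awaitable, Callable, List, Optional

HEAL_INTERVAL_SECONDS = 120  # "every 2 minutes", per the commit message


async def auto_heal_loop(
    get_failing_systems: Callable[[], Awaitable[List[str]]],
    auto_heal: Callable[[List[str]], Awaitable[None]],
    interval: float = HEAL_INTERVAL_SECONDS,
    max_cycles: Optional[int] = None,  # hypothetical: None means run forever
) -> int:
    """Poll for failing systems and trigger auto_heal when any are reported.

    Returns the number of heal attempts that were triggered.
    """
    heals = 0
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        cycles += 1
        try:
            failing = await get_failing_systems()
            if failing:
                await auto_heal(failing)
                heals += 1
        except Exception as exc:
            # Assumption: one failed cycle should not stop the loop.
            print(f"auto-heal cycle failed: {exc!r}")
        await asyncio.sleep(interval)
    return heals
```

In a FastAPI service, a loop like this would typically be started once at application startup with asyncio.create_task(...); whether main.py does so is not shown in this commit view.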

Auto-heal behavior:
  - Detects failing containers, restarts them, notifies all bot DM channels
  - Ollama (192.168.2.9) is flagged as external and skipped
  - On-demand via chat: "restart agent", "check logs", "pull latest code"
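
The auto-heal decision described above might look something like this sketch. It assumes health-check names match container names exactly, which the commit does not confirm; plan_auto_heal, the restart and notify callables, and the "unknown" bucket are hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, Iterable, List

# Container names as listed in the SysOps agent prompt added by this commit.
MANAGED_CONTAINERS = frozenset(
    {"activeblue-agent", "activeblue-agent-db", "odoo-web-1", "odoo-db-1"}
)
# Ollama runs on a separate host and is skipped, per the commit message.
EXTERNAL_SYSTEMS = {"ollama": "192.168.2.9"}


def plan_auto_heal(failing: Iterable[str]) -> Dict[str, List[str]]:
    """Split failing systems into restartable, external, and unknown."""
    plan: Dict[str, List[str]] = {"restart": [], "external": [], "unknown": []}
    for name in failing:
        if name in EXTERNAL_SYSTEMS:
            plan["external"].append(name)
        elif name in MANAGED_CONTAINERS:
            plan["restart"].append(name)
        else:
            plan["unknown"].append(name)
    return plan


def auto_heal(
    failing: Iterable[str],
    restart: Callable[[str], None],  # hypothetical: restarts one container
    notify: Callable[[str], None],   # hypothetical: posts to the Odoo channels
) -> Dict[str, List[str]]:
    """Restart managed containers that are failing and broadcast what was done."""
    plan = plan_auto_heal(failing)
    for name in plan["restart"]:
        restart(name)
    if plan["restart"]:
        notify("Auto-heal restarted: " + ", ".join(plan["restart"]))
    return plan
```

Keeping the planning step separate from the side effects makes the skip rule for external systems easy to test without a Docker daemon.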

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Carlos Garcia
2026-05-19 17:01:57 -04:00
parent f4991dd920
commit 8d1727b498
11 changed files with 485 additions and 0 deletions


@@ -44,6 +44,9 @@ User: "how are sales this month?" or "show me the pipeline"
User: "what projects are overdue?"
-> {"needs_clarification": false, "clarification_question": null, "is_continuation": false, "agents": ["project_agent"], "intent_summary": "find overdue projects", "params": {}, "context_hints": []}
User: "restart the agent service" or "check the docker containers" or "pull the latest code" or "show me the agent logs"
-> {"needs_clarification": false, "clarification_question": null, "is_continuation": false, "agents": ["sysops_agent"], "intent_summary": "infrastructure operation", "params": {}, "context_hints": []}
Now classify the user's message in JSON only:
{
"needs_clarification": false,


@@ -0,0 +1,23 @@
You are the SysOps agent for ActiveBlue AI. You manage the Docker infrastructure
and git repository for the ActiveBlue AI system.

Managed containers:
activeblue-agent — the AI agent service (FastAPI, port 8001)
activeblue-agent-db — agent Postgres database
odoo-web-1 — the Odoo 18 application
odoo-db-1 — the Odoo Postgres database

Git repository: /workspace/odoo-ai (main branch)

Your responsibilities:
- Report container status clearly
- Restart containers when asked or when health checks fail
- Pull latest code from git when requested
- Show relevant log output when diagnosing issues
- Notify users in Odoo chat whenever you take autonomous actions

Rules:
- Only restart containers in the managed list above
- Never delete or stop containers permanently
- Always explain what you did and why
- If ollama is failing, report it as external (192.168.2.9) and outside your control
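
The "only restart containers in the managed list" rule can also be enforced in code rather than left to the prompt alone. The sketch below is an assumption about how such a guard might look, not the contents of sysops_tools.py: restart_managed_container is a hypothetical name, and the client parameter exists so a fake can be injected in tests. It uses the Docker SDK for Python calls docker.from_env(), containers.get(), Container.restart(), and Container.reload().

```python
from typing import Any, Optional

# Names taken from the "Managed containers" list in the prompt above.
MANAGED_CONTAINERS = frozenset(
    {"activeblue-agent", "activeblue-agent-db", "odoo-web-1", "odoo-db-1"}
)


def restart_managed_container(
    name: str, client: Optional[Any] = None, timeout: int = 10
) -> str:
    """Restart `name` if it is on the allowlist and return its new status."""
    if name not in MANAGED_CONTAINERS:
        raise PermissionError(f"{name!r} is not a managed container")
    if client is None:
        # Imported lazily so the guard can be tested without the SDK installed.
        import docker

        client = docker.from_env()
    container = client.containers.get(name)
    container.restart(timeout=timeout)
    container.reload()  # refresh cached attributes such as status
    return container.status
```

Rejecting unlisted names before touching the Docker client means a misclassified request (for example, one naming ollama) fails fast instead of reaching the daemon.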