# 🔧 08 - System Implementation Status

> How the multi-agent system actually works right now, as built.
> Last updated: 2026-02-15

---

## 1. Architecture Overview

**Multi-Instance Cluster:** 8 independent OpenClaw gateway processes, one per agent. Each has its own systemd service, Discord bot token, port, and state directory.

```
┌──────────────────────────────────────────────────────────────────┐
│                         T420 (clawdbot)                          │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │ OpenClaw Gateway - Mario (main instance)                   │  │
│  │ Port 18789 │ Slack: Antoine's personal workspace           │  │
│  │ State: ~/.openclaw/                                        │  │
│  └────────────────────────────────────────────────────────────┘  │
│                                                                  │
│  ┌──────────────── Atomizer Cluster ──────────────────────────┐  │
│  │                                                            │  │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐         │  │
│  │  │ Manager     │  │ Tech Lead   │  │ Secretary   │         │  │
│  │  │ :18800      │  │ :18804      │  │ :18808      │         │  │
│  │  │ Opus 4.6    │  │ Opus 4.6    │  │ Gemini 2.5  │         │  │
│  │  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘         │  │
│  │         │                │                │                │  │
│  │  ┌──────┴──────┐  ┌──────┴──────┐  ┌──────┴──────┐         │  │
│  │  │ Auditor     │  │ Optimizer   │  │Study Builder│         │  │
│  │  │ :18812      │  │ :18816      │  │ :18820      │         │  │
│  │  │ Gemini 2.5  │  │ Sonnet 4.5  │  │ Gemini 2.5  │         │  │
│  │  └─────────────┘  └─────────────┘  └─────────────┘         │  │
│  │                                                            │  │
│  │  ┌─────────────┐  ┌─────────────┐                          │  │
│  │  │ NX Expert   │  │ Webster     │                          │  │
│  │  │ :18824      │  │ :18828      │                          │  │
│  │  │ Sonnet 4.5  │  │ Gemini 2.5  │                          │  │
│  │  └─────────────┘  └─────────────┘                          │  │
│  │                                                            │  │
│  │  Inter-agent: hooks API (curl between ports)               │  │
│  │  Shared token: 31422bb39bc9e7a4d34f789d8a7cbc582dece8dd…   │  │
│  └────────────────────────────────────────────────────────────┘  │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌──────────────────────────────────────────────────────────────────┐
│  Discord: Atomizer-HQ Server                                     │
│  Guild: 1471858733452890132                                      │
│                                                                  │
│  📋 COMMAND: #ceo-office, #announcements, #daily-standup         │
│  🔧 ENGINEERING: #technical, #code-review, #fea-analysis, #nx    │
│  📊 OPERATIONS: #task-board, #meeting-notes, #reports            │
│  🔬 RESEARCH: #literature, #materials-data                       │
│  🏗️ PROJECTS: #active-projects                                   │
│  📚 KNOWLEDGE: #knowledge-base, #lessons-learned                 │
│  🤖 SYSTEM: #agent-logs, #inter-agent, #it-ops                   │
│                                                                  │
│  Each agent = its own Discord bot with unique name & avatar      │
└──────────────────────────────────────────────────────────────────┘
```

---

## 2. Why Multi-Instance (Not Single Gateway)

OpenClaw's native Discord provider (`@buape/carbon`) has a race condition bug when multiple bot tokens connect from one process. Since we need 8 separate bot accounts, we run 8 separate processes. Each process handles exactly one token, which bypasses the bug entirely.

**Advantages over previous bridge approach:**

- Native Discord streaming, threads, reactions, attachments
- Fault isolation: one agent crashing doesn't take down the others
- No middleware polling session files on disk
- Each agent appears as its own Discord user with independent presence

---

## 3. Port Map

| Agent | Port | Model | Notes |
|-------|------|-------|-------|
| Manager | 18800 | Opus 4.6 | Orchestrates, delegates. Heartbeat disabled (Discord delivery bug) |
| Tech Lead | 18804 | Opus 4.6 | Technical authority |
| Secretary | 18808 | Gemini 2.5 Pro | Task tracking, notes. Changed from Codex 2026-02-15 (OAuth expired) |
| Auditor | 18812 | Gemini 2.5 Pro | Quality review. Changed from Codex 2026-02-15 (OAuth expired) |
| Optimizer | 18816 | Sonnet 4.5 | Optimization work |
| Study Builder | 18820 | Gemini 2.5 Pro | Study setup. Changed from Codex 2026-02-15 (OAuth expired) |
| NX Expert | 18824 | Sonnet 4.5 | CAD/NX work |
| Webster | 18828 | Gemini 2.5 Pro | Research. Heartbeat disabled (Discord delivery bug) |

> **⚠️ Port spacing = 4.** OpenClaw uses port N and N+3 (browser service), so instance ports must be at least 4 apart. Never assign adjacent ports.

---

## 4. Systemd Setup

### Template Service

File: `~/.config/systemd/user/openclaw-atomizer@.service`

```ini
[Unit]
Description=OpenClaw Atomizer - %i
After=network.target

[Service]
Type=simple
ExecStart=/usr/bin/node /home/papa/.local/lib/node_modules/openclaw/dist/index.js gateway
Environment=PATH=/home/papa/.local/bin:/usr/local/bin:/usr/bin:/bin
Environment=HOME=/home/papa
Environment=OPENCLAW_STATE_DIR=/home/papa/atomizer/instances/%i
Environment=OPENCLAW_CONFIG_PATH=/home/papa/atomizer/instances/%i/openclaw.json
Environment=OPENCLAW_GATEWAY_TOKEN=31422bb39bc9e7a4d34f789d8a7cbc582dece8dd170dadd1
EnvironmentFile=/home/papa/atomizer/instances/%i/env
EnvironmentFile=/home/papa/atomizer/config/.discord-tokens.env
Restart=always
RestartSec=5
StartLimitIntervalSec=60
StartLimitBurst=5

[Install]
WantedBy=default.target
```

### Cluster Management Script

File: `~/atomizer/cluster.sh`

```bash
# Start all:
bash cluster.sh start

# Stop all:
bash cluster.sh stop

# Restart all:
bash cluster.sh restart

# Status:
bash cluster.sh status

# Logs:
bash cluster.sh logs [agent-name]
```

---

## 5. File System Layout

```
~/atomizer/
├── cluster.sh                    ← Cluster management script
├── config/
│   ├── .discord-tokens.env       ← All 8 bot tokens (env vars)
│   └── atomizer-discord.env      ← Legacy (can remove)
├── instances/                    ← Per-agent OpenClaw state
│   ├── manager/
│   │   ├── openclaw.json         ← Agent config (1 agent per instance)
│   │   ├── env                   ← Instance-specific env vars
│   │   └── agents/main/sessions/ ← Session data (auto-created)
│   ├── tech-lead/
│   ├── secretary/
│   ├── auditor/
│   ├── optimizer/
│   ├── study-builder/
│   ├── nx-expert/
│   └── webster/
├── workspaces/                   ← Agent workspaces (SOUL, AGENTS, memory)
│   ├── manager/
│   │   ├── SOUL.md
│   │   ├── AGENTS.md
│   │   ├── MEMORY.md
│   │   └── memory/
│   ├── secretary/
│   ├── technical-lead/
│   ├── auditor/
│   ├── optimizer/
│   ├── study-builder/
│   ├── nx-expert/
│   ├── webster/
│   └── shared/                   ← Shared context (CLUSTER.md, protocols)
└── tools/
    └── nxopen-mcp/               ← NX Open MCP server (for CAD)
```

**Key distinction:** `instances/` = OpenClaw runtime state (configs, sessions, SQLite). `workspaces/` = agent personality and memory (SOUL.md, AGENTS.md, etc.).

---

## 6. Inter-Agent Communication

### Delegation Skill (Primary Method)

Manager and Tech Lead use the `delegate` skill to assign tasks to other agents. The skill wraps the OpenClaw Hooks API with port mapping, auth, error handling, and logging.
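
As a rough illustration of what "wrapping the Hooks API" involves, here is a minimal Python sketch. It is not the actual skill, which is a shell script whose internals this document does not show. The port numbers come from the Port Map table, and the endpoint, headers, and payload fields follow the raw Hooks API call documented in this section. The names `AGENT_PORTS`, `build_hook_request`, and `delegate` are hypothetical, and the agent keys are assumed to match the instance directory names.

```python
import json
import urllib.request

# Port map copied from the Port Map table; keys assumed to match instance names.
AGENT_PORTS = {
    "manager": 18800,
    "tech-lead": 18804,
    "secretary": 18808,
    "auditor": 18812,
    "optimizer": 18816,
    "study-builder": 18820,
    "nx-expert": 18824,
    "webster": 18828,
}


def build_hook_request(agent, message, token, deliver=True):
    """Build (but do not send) a POST to the target agent's /hooks/agent endpoint."""
    if agent not in AGENT_PORTS:
        raise ValueError(f"unknown agent: {agent!r}")
    url = f"http://127.0.0.1:{AGENT_PORTS[agent]}/hooks/agent"
    payload = {"message": message, "deliver": deliver, "channel": "discord"}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def delegate(agent, message, token, deliver=True, timeout=10):
    """Send the request; return (ok, detail) instead of raising on failure."""
    try:
        request = build_hook_request(agent, message, token, deliver)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return True, f"HTTP {response.status}"
    except (ValueError, OSError) as exc:  # URLError and HTTPError subclass OSError
        return False, str(exc)
```

Separating request construction from sending keeps the port lookup and auth header testable without a running gateway. A call such as `delegate("webster", "Find CTE of Zerodur", token)` would only succeed against a live instance on port 18828.
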

**Location:** `/home/papa/atomizer/workspaces/shared/skills/delegate/`

**Installed on:** Manager, Tech Lead (symlinked from shared)

```bash
# Usage
bash /home/papa/atomizer/workspaces/shared/skills/delegate/delegate.sh <agent> "<task>" [options]

# Examples
delegate.sh webster "Find CTE of Zerodur Class 0 between 20-40°C"
delegate.sh nx-expert "Mesh the M2 mirror" --channel C0AEJV13TEU --deliver
delegate.sh auditor "Review thermal analysis" --no-deliver
```

**How it works:**

1. Looks up the target agent's port from a hardcoded port map
2. Checks if the target is running
3. POSTs to `http://127.0.0.1:PORT/hooks/agent` with the auth token
4. Target agent processes the task asynchronously in an isolated session
5. Response is delivered to Discord if `--deliver` is set

**Options:** `--channel <channel-id>`, `--deliver` (default), `--no-deliver`

### Delegation Authority

| Agent | Can Delegate To |
|-------|----------------|
| Manager | All agents |
| Tech Lead | All agents except Manager |
| All others | Cannot delegate; must request via Manager or Tech Lead |

### Hooks Protocol

All agents follow `/home/papa/atomizer/workspaces/shared/HOOKS-PROTOCOL.md`:

- Hook messages = **high-priority assignments**, processed before other work
- After completing tasks, agents **append** status to `shared/project_log.md`
- Only the Manager updates `shared/PROJECT_STATUS.md` (gatekeeper pattern)

### Raw Hooks API (Reference)

The delegate skill wraps this, but for reference:

```bash
curl -s -X POST http://127.0.0.1:PORT/hooks/agent \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer 31422bb39bc9e7a4d34f789d8a7cbc582dece8dd170dadd1" \
  -d '{"message": "your request here", "deliver": true, "channel": "discord"}'
```

### sessions_send / sessions_spawn

Agents configured with `agentToAgent.enabled: true` can use OpenClaw's built-in `sessions_send` and `sessions_spawn` tools to communicate within the same instance. Cross-instance communication requires the hooks API / delegate skill.

---

## 7. Current Status

### ✅ Working

- All 8 instances running as systemd services (auto-start on boot)
- Each agent has its own Discord bot identity (name, avatar, presence)
- Native Discord features: streaming, typing indicators, message chunking
- Agent workspaces with SOUL.md, AGENTS.md, MEMORY.md
- Hooks API enabled on all instances (Google Gemini + Anthropic auth configured)
- **Delegation skill deployed:** Manager and Tech Lead can delegate tasks to any agent via `delegate.sh`
- **Hooks protocol:** all agents know how to receive and prioritize delegated tasks
- **Gatekeeper pattern:** Manager owns PROJECT_STATUS.md; others append to project_log.md
- Cluster management via `cluster.sh`
- Estimated total RAM: ~4.2 GB for 8 instances

### ❌ Known Issues

- ~~**DELEGATE syntax is fake**~~ → ✅ RESOLVED (2026-02-14): Replaced with `delegate.sh` skill using hooks API
- **Discord "Ambiguous recipient" bug** (2026-02-15): The OpenClaw Discord plugin requires a `user:` or `channel:` prefix for message targets. When heartbeat tries to reply to a session that originated from a Discord DM, it uses the bare user ID, so delivery fails. **Workaround:** Heartbeat disabled on Manager + Webster. Other agents are unaffected (their sessions don't originate from Discord DMs). A proper fix requires an OpenClaw patch to auto-infer `user:` for known user IDs.
- **Codex OAuth expired** (2026-02-15): `refresh_token_reused` error caused by multiple instances racing to refresh the same shared Codex token. Secretary, Auditor, and Study-Builder switched to Gemini 2.5 Pro. To restore Codex: Antoine must re-run `codex login` via SSH tunnel, then run `~/atomizer/scripts/sync-codex-tokens.sh`.
- **No automated orchestration layer:** Manager delegates manually (but now has proper tooling to do so: orchestrate.sh, workflow engine)
- **5 agents not yet created:** Post-Processor, Reporter, Developer, Knowledge Base, IT (from the original 13-agent plan)
- **Windows execution bridge** (`atomizer_job_watcher.py`): exists but not connected end-to-end

---

## 8. Evolution History

| Date | Phase | What Changed |
|------|-------|-------------|
| 2026-02-07 | Phase 0 | Vision doc created, 13-agent plan designed |
| 2026-02-08 | Phase 0 | Single gateway (port 18790) running on Slack |
| 2026-02-13 | Discord Migration | Discord server created, 8 bot tokens obtained |
| 2026-02-14 (AM) | Bridge Attempt | discord-bridge.js built. It worked but was fragile (no streaming, polled session files) |
| 2026-02-14 (PM) | **Multi-Instance Cluster** | Pivoted to 8 independent OpenClaw instances. Bridge killed. Native Discord restored. |
| 2026-02-14 (PM) | **Delegation System** | Built `delegate.sh` skill, hooks protocol, gatekeeper pattern. Fake DELEGATE syntax replaced with real hooks API calls. Google Gemini auth added to all instances. |
| 2026-02-15 | **Orchestration Engine** | Phases 1-3 complete: synchronous delegation (`orchestrate.py`), smart routing (capability registry), hierarchical delegation (Tech-Lead + Optimizer can sub-delegate), YAML workflow engine with parallel execution + approval gates. See `10-ORCHESTRATION-ENGINE-PLAN.md`. |
| 2026-02-15 | **Stability Fixes** | Discord heartbeat delivery bug identified (ambiguous recipient). Codex OAuth token expired (refresh_token_reused). Heartbeat disabled on Manager + Webster. Secretary/Auditor/Study-Builder switched from Codex to Gemini 2.5 Pro. HEARTBEAT.md created for all agents. |

---

*Created: 2026-02-14 by Mario*

*This is the "as-built" document, updated as implementation evolves.*