docs: add HQ multi-agent framework documentation from PKM

- Project plan, agent roster, architecture, roadmap
- Decision log, full system plan, Discord setup/migration guides
- System implementation status (as-built)
- Cluster pivot history
- Orchestration engine plan (Phases 1-4)
- Webster and Auditor reviews
2026-02-15 21:44:07 +00:00
parent 3289a76e19
commit cf82de4f06
15 changed files with 6933 additions and 0 deletions
--- a/docs/hq/09-CLUSTER-PIVOT-HISTORY.md
+++ b/docs/hq/09-CLUSTER-PIVOT-HISTORY.md
@@ -0,0 +1,281 @@
+
+> **⚠️ HISTORICAL DOCUMENT** — This was the pivot strategy written during the bridge→cluster transition on 2026-02-14. The pivot has been executed. See [[P-Atomizer-Overhaul-Framework-Agentic/08-SYSTEM-IMPLEMENTATION-STATUS|08 — System Implementation Status]] for the current as-built state.
+> Note: This doc proposed Docker Compose, but we went with native systemd instead (no OpenClaw Docker image available).
+
+
+
+# 🔧 Strategic Pivot: From Discord-Bridge to Multi-Instance Cluster
+
+**Project:** Atomizer Overhaul Framework (Agentic)
+
+**Date:** 2026-02-14
+
+**Status:** Architecture Redesign
+
+**Owner:** Mario (Architect)
+
+---
+
+## 1. The Problem Statement: "The Middleware Trap"
+
+The current implementation uses a **Node.js Discord Bridge** to bypass a native OpenClaw bug (the "carbon race condition" when multiple tokens are handled by one process). While functional as a temporary fix, it introduces critical systemic weaknesses:
+
+1. **Fragile Interrogation:** The bridge "polls" JSONL session files on disk. This is prone to race conditions and I/O lag, and it breaks if the OpenClaw schema changes.
+    
+2. **Feature Stripping:** By acting as a middleman, the bridge kills **LLM Streaming**, **Discord Attachments**, **Reactions**, and **Thread Support**.
+    
+3. **Single Point of Failure:** If the "Manager" bot (the listener) or the bridge script fails, the entire 8-bot ecosystem goes offline.
+    
+4. **Sequential Processing:** The bridge handles messages one-by-one, preventing true parallel agentic collaboration.
+    
+
+---
+
+## 2. The Solution: Multi-Instance Micro-Service Architecture
+
+Instead of one gateway pretending to be 8 bots, we deploy **8 independent OpenClaw instances**. This treats each agent as a dedicated micro-service.
+
+### Key Advantages:
+
+- **Bypasses the Bug:** Each process handles exactly **one** Discord token. Because the race condition only occurs when one process handles multiple tokens, it cannot be triggered in this configuration.
+    
+- **Native Performance:** Restores real-time streaming, rich media handling, and native Discord UI features.
+    
+- **Fault Isolation:** If the "Webster" agent crashes, the "Tech-Lead" remains operational.
+    
+- **Hardware Efficiency:** Allows individual resource limits (RAM/CPU) per agent based on their LLM requirements.
+    
+
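+The per-agent limits mentioned above could be set directly on each service. A minimal sketch, assuming the Compose layout proposed in Section 3; the numbers are placeholders, not measured requirements:
+
+```yaml
+services:
+  manager:
+    cpus: "1.0"      # hypothetical CPU cap for this agent
+    mem_limit: 1g    # hypothetical memory cap for this agent
+  webster:
+    cpus: "0.5"
+    mem_limit: 512m
+```
+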
+---
+
+## 3. The New Infrastructure (T420 Setup)
+
+### A. Directory Structure
+
+Each agent keeps its own local state (SQLite, logs) to avoid database locking, but shares the project workspaces.
+
+```text
+~/atomizer/
+├── docker-compose.yml           # The new Orchestrator
+├── .env                         # All 8 Discord Tokens
+├── instances/                   # Private Agent State (SQLite, local logs)
+│   ├── manager/
+│   ├── tech-lead/
+│   └── ... (8 total)
+└── workspaces/                  # THE SHARED BRAIN (Project files)
+    ├── manager/                 # SOUL.md, MEMORY.md
+    ├── technical-lead/
+    └── shared_context/          # PROJECT_STATUS.md (Global State)
+```
+
+### B. The Orchestrator (`docker-compose.yml`)
+
+This replaces the systemd bridge and the single gateway service.
+
+```yaml
+# Base template for all agents. Extension fields (x-*) are conventionally
+# declared at the top level of the Compose file rather than under `services:`.
+x-agent-base: &agent-base
+  image: openclaw/openclaw:latest
+  restart: unless-stopped
+
+services:
+  manager:
+    <<: *agent-base
+    container_name: atom-manager
+    environment:
+      - DISCORD_TOKEN=${MANAGER_TOKEN}
+      - AGENT_CONFIG_PATH=/app/instances/manager/config.json
+    # The merge key (<<) is shallow: a service-level `volumes:` list replaces
+    # the template's list, so the shared mounts are listed per service.
+    volumes:
+      - ./workspaces:/app/workspaces
+      - ./skills:/app/skills
+      - ./instances/manager:/root/.openclaw
+
+  tech-lead:
+    <<: *agent-base
+    container_name: atom-tech-lead
+    environment:
+      - DISCORD_TOKEN=${TECH_LEAD_TOKEN}
+    volumes:
+      - ./workspaces:/app/workspaces
+      - ./skills:/app/skills
+      - ./instances/tech-lead:/root/.openclaw
+# ... (Repeat for all 8 agents)
+```
+
+---
+
+## 4. The "Shared Brain" Logic (Collaboration Protocol)
+
+To ensure agents don't work in silos, we implement a **File-Based Handshake** protocol via their System Prompts:
+
+1. **Read Before Speak:** "Before responding to any Discord message, read `workspaces/shared_context/PROJECT_STATUS.md` to check the current locks and active tasks."
+    
+2. **The Inbox Pattern:** To trigger another bot, an agent writes a file to `workspaces/[target-agent]/INBOX/task.json`.
+    
+3. **The Watcher:** Each agent uses the `fs-watcher` skill to monitor their `INBOX` folder, allowing them to "wake up" when another bot requests help.
+    
+
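+This document does not define a schema for the task file. As a purely illustrative sketch, a request from the manager to the tech lead might carry fields like the ones below. Every field name is hypothetical, and the fragment is shown as YAML for readability; the file on disk would hold the equivalent JSON in `task.json`.
+
+```yaml
+# workspaces/technical-lead/INBOX/task.json (hypothetical fields)
+from: manager
+to: technical-lead
+created_at: "2026-02-14T12:00:00Z"   # placeholder timestamp
+summary: Review the proposed volume layout
+```
+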
+---
+
+## 5. Transition Roadmap
+
+|**Phase**|**Action**|**Result**|
+|---|---|---|
+|**1. Decommission**|Stop `discord-bridge.js` and `openclaw-gateway-atomizer`.|Clean slate on T420.|
+|**2. Containerize**|Move agent configs into the Docker structure.|Isolated, stable environments.|
+|**3. Volume Mapping**|Link `~/atomizer/workspaces` to all 8 containers.|Shared project context established.|
+|**4. Prompt Update**|Inject the "Inbox" and "Status-First" rules into SOUL.md.|Agents become aware of the team.|
+|**5. Validation**|Test @mentioning Tech-Lead in a thread.|Native streaming & thread support verified.|
+
+---
+
+## 6. Final Assessment
+
+The move to **Micro-Instance OpenClaw** replaces the bridge workaround with one native OpenClaw process per agent. That restores streaming, attachments, reactions, and threads, confines a crash to a single agent, and removes the sequential bottleneck described in Section 1.
+
+
+
+
+---
+
+This document outlines the **"Cluster"** approach—moving from one broken process to eight stable ones.
+
+---
+
+# 📄 Atomizer-Architecture-Pivot.md
+
+````markdown
+# 🔧 STRATEGIC PIVOT: ATOMIZER MULTI-INSTANCE CLUSTER
+> **Date:** 2026-02-14
+> **Project:** Atomizer Overhaul Framework (Agentic)
+> **Status:** Architecture Redesign (Replacing Discord-Bridge.js)
+
+---
+
+## 1. THE PROBLEM: "The Middleware Trap"
+The current "Bridge" architecture is a bottleneck. By using a single Node.js script to poll session files:
+* **Latency:** No real-time streaming; users wait for full file writes.
+* **Fragility:** The bridge breaks if the OpenClaw `.jsonl` schema changes.
+* **Single Point of Failure:** If the Manager bot or Bridge process hangs, all 8 bots die.
+* **Feature Loss:** No Discord attachments, no native reactions, and broken thread support.
+
+## 2. THE SOLUTION: Micro-Instance Agent Cluster
+Instead of one gateway pretending to be 8 bots, we run **8 independent OpenClaw processes**.
+
+### Why this works:
+1.  **Bypasses the Bug:** The `@buape/carbon` crash only happens when one process handles multiple tokens. With one token per process, that condition never arises.
+2.  **Native Power:** Restores streaming, threads, and rich media.
+3.  **Shared Brain:** All instances mount the same physical workspace folder. They "see" each other's files in real-time.
+
+---
+
+## 3. TECHNICAL IMPLEMENTATION
+
+### A. Directory Structure (T420)
+```text
+~/atomizer/
+├── docker-compose.yml           # The Orchestrator
+├── .env                         # Store all 8 DISCORD_TOKENs here
+├── instances/                   # Private Agent State (SQLite, local logs)
+│   ├── manager/
+│   ├── tech-lead/
+│   └── secretary/ ...
+└── workspaces/                  # THE SHARED PROJECT FOLDERS
+    ├── manager/                 # SOUL.md, MEMORY.md
+    ├── technical-lead/
+    └── shared_context/          # PROJECT_STATUS.md (Global State)
+````
+
+### B. The Orchestrator (`docker-compose.yml`)
+
+Copy this into `~/atomizer/docker-compose.yml`. This allows you to manage all bots with one command: `docker-compose up -d`.
+
+```yaml
+# Template for all Atomizer Agents. Extension fields (x-*) are conventionally
+# declared at the top level of the Compose file rather than under `services:`.
+x-agent-base: &agent-base
+  image: openclaw/openclaw:latest
+  restart: unless-stopped
+
+services:
+  manager:
+    <<: *agent-base
+    container_name: atom-manager
+    environment:
+      - DISCORD_TOKEN=${MANAGER_TOKEN}
+    # The merge key (<<) is shallow: a service-level `volumes:` list replaces
+    # the template's list, so the shared mounts are listed per service.
+    volumes:
+      - ./workspaces:/app/workspaces
+      - ./skills:/app/skills
+      - ./instances/manager:/root/.openclaw
+
+  tech-lead:
+    <<: *agent-base
+    container_name: atom-tech-lead
+    environment:
+      - DISCORD_TOKEN=${TECH_LEAD_TOKEN}
+    volumes:
+      - ./workspaces:/app/workspaces
+      - ./skills:/app/skills
+      - ./instances/tech-lead:/root/.openclaw
+
+  # ... Repeat for: secretary, auditor, optimizer, study-builder, nx-expert, webster
+  # Use unique container_names and token environment variables for each.
+```
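+
+As one instance of the "repeat for each agent" step, a `webster` entry might look like the sketch below. `WEBSTER_TOKEN`, `atom-webster`, and the `./instances/webster` path are assumed names that follow the pattern of the two services shown; they are not taken from a working deployment.
+
+```yaml
+services:
+  webster:
+    <<: *agent-base                # reuses the image and restart policy
+    container_name: atom-webster   # assumed name, following the atom-* pattern
+    environment:
+      - DISCORD_TOKEN=${WEBSTER_TOKEN}   # assumed variable name in .env
+    # Lists are replaced, not merged, by the YAML merge key, so the shared
+    # mounts are listed alongside the agent's private state directory.
+    volumes:
+      - ./workspaces:/app/workspaces
+      - ./skills:/app/skills
+      - ./instances/webster:/root/.openclaw
+```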
+
+---
+
+## 4. THE COLLABORATION PROTOCOL (System Prompt)
+
+Inject this logic into the **SOUL.md** or System Prompt of every agent to ensure they act as a team, not 8 strangers.
+
+> ### 🤖 CLUSTER OPERATING RULES
+> 
+> 1. **Context Awareness:** You are part of the Atomizer Cluster. Your shared project data lives in `/app/workspaces/`.
+>     
+> 2. **State Check:** Before every response, read `/app/workspaces/shared_context/PROJECT_STATUS.md`.
+>     
+> 3. **Inter-Agent Triggering:** To request help from another agent, write a task file to `/app/workspaces/[agent-id]/INBOX/task.json`.
+>     
+> 4. **Discord Etiquette:** Only respond if @mentioned or if you are the designated owner of a Discord Thread. Use threads to isolate complex engineering tasks.
+>     
+
+---
+
+## 5. TRANSITION STEPS
+
+1. **Kill the Bridge:** `pm2 stop discord-bridge` or `systemctl stop discord-bridge`.
+    
+2. **Setup Docker:** Install Docker on the T420 if not present.
+    
+3. **Map Volumes:** Ensure `~/atomizer/workspaces` contains your existing agent data.
+    
+4. **Deploy:** Run `docker-compose up -d`.
+    
+5. **Test:** Mention `@Atomizer Tech Lead` in Discord. You should see it typing and streaming immediately.
+    
+
+---
+
+## 6. FINAL VERDICT
+
+This move converts the system from a "simulated" multi-bot setup to a **Native Agentic Ecosystem**. It is more robust, faster, and allows the LLMs to actually use the Discord features (like threads) to manage project complexity.
+