Atomizer/docs/hq/05-FULL-SYSTEM-PLAN.md
Antoine cf82de4f06 docs: add HQ multi-agent framework documentation from PKM
- Project plan, agent roster, architecture, roadmap
- Decision log, full system plan, Discord setup/migration guides
- System implementation status (as-built)
- Cluster pivot history
- Orchestration engine plan (Phases 1-4)
- Webster and Auditor reviews
2026-02-15 21:44:07 +00:00

# 🏭 05 — Full System Plan: Implementation Blueprint
> This is THE definitive implementation blueprint for Atomizer Engineering Co.
> The design docs (00-04) say WHAT we're building. This document says HOW, step by step, with actual configs, scripts, and commands.
---
## Table of Contents
1. [Infrastructure & Docker](#1-infrastructure--docker)
2. [Framework Rewrite Strategy](#2-framework-rewrite-strategy)
3. [Agent Workspaces](#3-agent-workspaces-detailed)
4. [Slack Architecture](#4-slack-architecture)
5. [Windows Execution Bridge](#5-windows-execution-bridge)
6. [Inter-Agent Communication](#6-inter-agent-communication)
7. [Mario ↔ Atomizer Bridge](#7-mario--atomizer-bridge)
8. [Phase 0 Implementation Checklist](#8-phase-0-implementation-checklist)
9. [Security](#9-security)
10. [Future-Proofing](#10-future-proofing)
---
## 1. Infrastructure & Docker
### 1.1 Coexistence Model: Mario + Atomizer on T420
Mario's Clawdbot currently runs natively via systemd on the T420:
```
Service: ~/.config/systemd/user/clawdbot-gateway.service
Binary: /usr/bin/clawdbot
Config: ~/.clawdbot/clawdbot.json
Port: 18789 (loopback)
User: papa
```
**Decision (ref DEC-A012):** Atomizer gets a **separate Clawdbot gateway** in Docker. This provides:
- Complete workspace isolation (no file cross-contamination)
- Independent config, models, bindings
- Can restart/upgrade independently
- Mario never sees Atomizer agent traffic; Atomizer agents never see Mario's memory
```
┌────────────────────── T420 (papa@clawdbot) ──────────────────────┐
│                                                                  │
│  ┌─────────────────────┐   ┌──────────────────────────────────┐  │
│  │ Mario's Clawdbot    │   │ Atomizer Clawdbot (Docker)       │  │
│  │ (systemd native)    │   │                                  │  │
│  │ Port: 18789         │   │ Port: 18790                      │  │
│  │ Slack: personal     │   │ Slack: atomizer-eng workspace    │  │
│  │ Config: ~/.clawdbot │   │ Config: /opt/atomizer/clawdbot   │  │
│  └─────────────────────┘   └──────────────────────────────────┘  │
│                                                                  │
│  ┌──────────── Shared (read-only by Atomizer) ────────────────┐  │
│  │ /home/papa/repos/Atomizer/     (Syncthing)                 │  │
│  │ /home/papa/obsidian-vault/     (Syncthing)                 │  │
│  └────────────────────────────────────────────────────────────┘  │
│                                                                  │
│  ┌──────────── Atomizer-Only ─────────────────────────────────┐  │
│  │ /opt/atomizer/workspaces/      (agent workspaces)          │  │
│  │ /opt/atomizer/shared-skills/   (company protocols)         │  │
│  │ /opt/atomizer/job-queue/       (Syncthing ↔ Windows)       │  │
│  └────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────┘
```
### 1.2 Directory Structure
```bash
/opt/atomizer/
├── docker-compose.yml
├── .env # API keys, tokens
├── clawdbot/ # Clawdbot config for Atomizer gateway
│ ├── clawdbot.json # Multi-agent config
│ ├── credentials/ # Auth tokens
│ └── skills/ # Shared skills (all agents)
│ ├── atomizer-protocols/
│ │ ├── SKILL.md
│ │ ├── QUICK_REF.md
│ │ └── protocols/
│ │ ├── OP_01_study_lifecycle.md
│ │ │ ├── ... (OP_01-OP_08)
│ │ ├── SYS_10_imso.md
│ │ └── ... (SYS_10-SYS_18)
│ └── atomizer-company/
│ ├── SKILL.md
│ ├── COMPANY.md # Identity, values, agent directory
│ └── LAC_CRITICAL.md # Hard-won lessons from LAC
├── workspaces/ # Per-agent workspaces
│ ├── manager/
│ ├── secretary/
│ ├── technical/
│ ├── optimizer/
│ ├── nx-expert/
│ ├── postprocessor/
│ ├── reporter/
│ ├── auditor/
│ ├── study-builder/
│ ├── researcher/
│ ├── developer/
│ ├── knowledge-base/
│ └── it-support/
├── job-queue/ # Syncthing ↔ Windows execution bridge
│ ├── pending/
│ ├── running/
│ ├── completed/
│ └── failed/
└── data/ # Persistent data
├── sessions/ # Clawdbot session storage
└── logs/ # Gateway logs
```
### 1.3 Docker Compose
```yaml
# /opt/atomizer/docker-compose.yml
version: "3.9"

services:
  atomizer-gateway:
    image: ghcr.io/clawdbot/clawdbot:latest
    container_name: atomizer-gateway
    restart: unless-stopped
    ports:
      - "127.0.0.1:18790:18790"  # Gateway (loopback only)
    environment:
      - CLAWDBOT_GATEWAY_PORT=18790
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - GOOGLE_API_KEY=${GOOGLE_API_KEY}
    volumes:
      # Clawdbot config
      - ./clawdbot:/home/clawdbot/.clawdbot
      # Agent workspaces
      - ./workspaces/manager:/home/clawdbot/clawd-atomizer-manager
      - ./workspaces/secretary:/home/clawdbot/clawd-atomizer-secretary
      - ./workspaces/technical:/home/clawdbot/clawd-atomizer-technical
      - ./workspaces/optimizer:/home/clawdbot/clawd-atomizer-optimizer
      - ./workspaces/nx-expert:/home/clawdbot/clawd-atomizer-nx-expert
      - ./workspaces/postprocessor:/home/clawdbot/clawd-atomizer-postprocessor
      - ./workspaces/reporter:/home/clawdbot/clawd-atomizer-reporter
      - ./workspaces/auditor:/home/clawdbot/clawd-atomizer-auditor
      - ./workspaces/study-builder:/home/clawdbot/clawd-atomizer-study-builder
      - ./workspaces/researcher:/home/clawdbot/clawd-atomizer-researcher
      - ./workspaces/developer:/home/clawdbot/clawd-atomizer-developer
      - ./workspaces/knowledge-base:/home/clawdbot/clawd-atomizer-kb
      - ./workspaces/it-support:/home/clawdbot/clawd-atomizer-it
      # Shared read-only mounts
      - /home/papa/repos/Atomizer:/mnt/atomizer-repo:ro
      - /home/papa/obsidian-vault:/mnt/obsidian:ro
      # Job queue (read-write, synced to Windows)
      - ./job-queue:/mnt/job-queue
      # Persistent data
      - ./data/sessions:/home/clawdbot/.clawdbot/agents
      - ./data/logs:/tmp/clawdbot
    networks:
      - atomizer-net
    # Tailscale sidecar for Windows SSH access
    # depends_on:
    #   - tailscale  # Enable when Tailscale fast lane is needed

  # Optional: Tailscale container for direct Windows access
  # tailscale:
  #   image: tailscale/tailscale:latest
  #   container_name: atomizer-tailscale
  #   hostname: atomizer-docker
  #   volumes:
  #     - ./data/tailscale:/var/lib/tailscale
  #   environment:
  #     - TS_AUTHKEY=${TAILSCALE_AUTH_KEY}
  #     - TS_STATE_DIR=/var/lib/tailscale
  #   cap_add:
  #     - NET_ADMIN
  #     - SYS_MODULE
  #   networks:
  #     - atomizer-net

networks:
  atomizer-net:
    driver: bridge
```
### 1.4 Environment File
```bash
# /opt/atomizer/.env
# API Keys (same as Mario's — shared subscription)
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=AIza...
# Slack (Atomizer Engineering workspace — DIFFERENT from Mario's)
SLACK_BOT_TOKEN=xoxb-atomizer-...
SLACK_APP_TOKEN=xapp-atomizer-...
# Optional: Tailscale for Windows SSH
# TAILSCALE_AUTH_KEY=tskey-auth-...
```
### 1.5 Syncthing Integration
Current Syncthing topology:
```
Windows (dalidou) ←→ T420 (clawdbot)
├── /home/papa/repos/Atomizer/ ← repo sync (existing)
├── /home/papa/obsidian-vault/ ← PKM sync (existing)
└── /opt/atomizer/job-queue/ ← NEW: execution bridge
```
**New Syncthing folder for job queue:**
| Setting | Value |
|---------|-------|
| Folder ID | `atomizer-job-queue` |
| Local path (T420) | `/opt/atomizer/job-queue/` |
| Local path (Windows) | `C:\Atomizer\job-queue\` |
| Sync direction | Send & Receive |
| Ignore patterns | `*.tmp`, `*.lock` |
Setup command (T420 side):
```bash
mkdir -p /opt/atomizer/job-queue/{pending,running,completed,failed}
# Add via Syncthing web UI at http://127.0.0.1:8384
# or via Syncthing CLI if available
```
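A job ticket dropped into `pending/` should appear on the Windows side as a complete file, never a half-synced one. A minimal sketch of an atomic submit, assuming a simple JSON ticket format (the `submit_job` helper and its fields are illustrative, not an existing framework API):

```python
import json
import uuid
from pathlib import Path

def submit_job(queue_root: Path, study_dir: str, command: str) -> Path:
    """Write a job ticket atomically: write a *.tmp file, then rename.

    The rename is atomic on the same filesystem, and the Syncthing folder
    ignores *.tmp (see the ignore patterns above), so the Windows side
    only ever sees fully written tickets.
    """
    job_id = uuid.uuid4().hex[:12]
    job = {"id": job_id, "study_dir": study_dir, "command": command, "status": "pending"}
    pending = queue_root / "pending"
    pending.mkdir(parents=True, exist_ok=True)
    tmp = pending / f"{job_id}.tmp"
    tmp.write_text(json.dumps(job, indent=2))
    final = pending / f"{job_id}.json"
    tmp.rename(final)  # atomic on the same filesystem
    return final
```

The Windows runner would then move the ticket `pending/ → running/ → completed/` (or `failed/`) as it executes, which is the state machine the directory layout encodes.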
### 1.6 Startup & Management
```bash
# Start Atomizer company
cd /opt/atomizer
docker compose up -d
# View logs
docker compose logs -f atomizer-gateway
# Restart after config changes
docker compose restart atomizer-gateway
# Stop everything
docker compose down
# Check health
curl -fsS http://127.0.0.1:18790/health || echo "Gateway not responding"
```
Optional systemd service wrapper:
```ini
# /etc/systemd/system/atomizer-company.service
[Unit]
Description=Atomizer Engineering Co. (Clawdbot Multi-Agent)
After=docker.service syncthing.service
Requires=docker.service
[Service]
Type=simple
WorkingDirectory=/opt/atomizer
ExecStart=/usr/bin/docker compose up
ExecStop=/usr/bin/docker compose down
Restart=always
RestartSec=10
User=papa
[Install]
WantedBy=multi-user.target
```
---
## 2. Framework Rewrite Strategy
### 2.1 What Stays (Core Engine)
These are battle-tested and should NOT be rewritten — only wrapped for agent access:
| Component | Path | Status |
|-----------|------|--------|
| Optimization engine | `optimization_engine/core/` | ✅ Keep as-is |
| AtomizerSpec v2.0 | `optimization_engine/config/` | ✅ Keep as-is |
| All extractors (20+) | `optimization_engine/extractors/` | ✅ Keep as-is |
| NX integration | `optimization_engine/nx/` | ✅ Keep as-is |
| Study management | `optimization_engine/study/` | ✅ Keep as-is |
| GNN surrogate | `optimization_engine/gnn/` | ✅ Keep as-is |
| Dashboard | `atomizer-dashboard/` | ✅ Keep as-is |
| Trial manager | `optimization_engine/utils/` | ✅ Keep as-is |
| LAC system | `knowledge_base/lac.py` | 🔄 Evolve (see 2.4) |
### 2.2 What Gets Reworked
#### Documentation Reorganization (165 files → clean two-layer system)
The current `docs/` is a sprawl of 165 files across 15+ subdirectories. Many are outdated, duplicated, or overly context-specific to a single Claude Code session.
**Target structure:**
```
docs/ # REFERENCE — stable, curated, agent-readable
├── 00_INDEX.md # Master index
├── QUICK_REF.md # 2-page cheatsheet (keep existing)
├── ARCHITECTURE.md # System architecture (keep, update)
├── GETTING_STARTED.md # Onboarding (keep, update)
├── protocols/ # Protocol Operating System (keep all)
│ ├── operations/OP_01-OP_08
│ ├── system/SYS_10-SYS_18
│ └── extensions/EXT_01-EXT_04
├── api/ # NX Open, Nastran references (keep)
│ ├── nx_integration.md
│ ├── NX_FILE_STRUCTURE_PROTOCOL.md
│ └── NXOPEN_RESOURCES.md
├── physics/ # Physics references (keep)
│ ├── ZERNIKE_FUNDAMENTALS.md
│ ├── ZERNIKE_TRAJECTORY_METHOD.md
│ └── ZERNIKE_OPD_METHOD.md
├── extractors/ # Extractor catalog (merge from generated/)
│ ├── EXTRACTOR_CHEATSHEET.md
│ └── EXTRACTORS.md
└── templates/ # Study templates reference
└── TEMPLATES.md
knowledge_base/ # LIVING MEMORY — grows with every project
├── lac/ # Learning Atomizer Core
│ ├── optimization_memory/ # Per-geometry-type outcomes
│ │ ├── bracket.jsonl
│ │ ├── beam.jsonl
│ │ └── mirror.jsonl
│ └── session_insights/ # Categorized lessons
│ ├── failure.jsonl
│ ├── success_pattern.jsonl
│ ├── workaround.jsonl
│ └── protocol_clarification.jsonl
├── projects/ # Per-project knowledge (NEW)
│ ├── starspec-m1/
│ │ ├── CONTEXT.md # Project context for agents
│ │ ├── model-knowledge.md # CAD/FEM details from KB agent
│ │ └── decisions.md # Key decisions made
│ └── (future projects...)
└── company/ # Company-wide evolving knowledge (NEW)
├── algorithm-selection.md # When to use which algorithm
├── common-pitfalls.md # Hard-won lessons
└── client-patterns.md # Common client needs
```
**Files to archive (move to `docs/archive/`):**
- All `docs/development/` — internal dev notes, dashboard gap analysis, etc.
- All `docs/plans/` — planning docs that are now implemented
- All `docs/guides/` dashboard-specific docs — consolidate into one dashboard guide
- All `docs/reviews/` — one-time reviews
- All `docs/logs/` — implementation logs
- `docs/CONTEXT_ENGINEERING_REPORT.md` — ACE is now part of the system, report archived
- `docs/diagrams/` — merge relevant ones into ARCHITECTURE.md
**Migration script:**
```bash
#!/bin/bash
# reorganize-docs.sh — Run from Atomizer repo root
ARCHIVE=docs/archive/$(date +%Y%m%d)_pre_agentic
mkdir -p $ARCHIVE
# Move sprawl to archive
mv docs/development/ $ARCHIVE/
mv docs/plans/ $ARCHIVE/
mv docs/reviews/ $ARCHIVE/
mv docs/logs/ $ARCHIVE/
mv docs/diagrams/ $ARCHIVE/
mv docs/CONTEXT_ENGINEERING_REPORT.md $ARCHIVE/
mv docs/ROADMAP/ $ARCHIVE/
mv docs/handoff/ $ARCHIVE/
mv docs/guides/DASHBOARD_*.md $ARCHIVE/
mv docs/guides/NEURAL_*.md $ARCHIVE/
mv docs/guides/ATOMIZER_FIELD_*.md $ARCHIVE/
mv docs/guides/PHYSICS_LOSS_GUIDE.md $ARCHIVE/
mv docs/reference/ $ARCHIVE/
# Merge extractors
mkdir -p docs/extractors/
mv docs/generated/EXTRACTOR_CHEATSHEET.md docs/extractors/
mv docs/generated/EXTRACTORS.md docs/extractors/
rmdir docs/generated/ 2>/dev/null
# Create project knowledge structure
mkdir -p knowledge_base/projects/
mkdir -p knowledge_base/company/
echo "Done. Archived $(find $ARCHIVE -type f | wc -l) files."
echo "Remaining docs: $(find docs/ -type f -name '*.md' | wc -l)"
```
#### Protocol System Evolution
The existing Protocol Operating System (OP_01-OP_08, SYS_10-SYS_18) stays intact. New protocols are added for multi-agent operation:
| New Protocol | Purpose |
|-------------|---------|
| OP_09: Agent Handoff | How agents hand off work to each other |
| OP_10: Project Intake | How new projects get initialized |
| SYS_19: Job Queue | Windows execution bridge protocol |
| SYS_20: Agent Memory | How agents read/write shared knowledge |
### 2.3 Agent-Native Patterns
#### CONTEXT.md Per Study
Every study gets a `CONTEXT.md` in `knowledge_base/projects/<project>/` that serves as the "briefing document" any agent can read to understand the project:
```markdown
# CONTEXT.md — StarSpec M1 WFE Optimization
## Client
StarSpec Systems — space telescope primary mirror
## Objective
Minimize peak-to-valley wavefront error (WFE) on M1 mirror under thermal + gravity loads.
## Key Parameters
| Parameter | Range | Units | Notes |
|-----------|-------|-------|-------|
| rib_height_1 | 5-25 | mm | Inner ring |
| rib_height_2 | 5-25 | mm | Outer ring |
| web_thickness | 1-5 | mm | Mirror back sheet |
| ... | | | |
## Constraints
- Mass < 12 kg
- First natural frequency > 80 Hz
## Model
- NX assembly: M1_mirror_assembly.prt
- FEM: M1_mirror_fem1.fem
- Simulation: M1_mirror_sim1.sim
- Solver: NX Nastran SOL 101 (static) + SOL 103 (modal)
## Decisions
- 2026-02-10: Selected CMA-ES over TPE (9 variables, noisy landscape)
- 2026-02-11: Added thermal load case per client email
## Status
Phase: Execution (trial 47/150)
Channel: #starspec-m1-wfe
```
#### QUICK_REF.md for Agents
A condensed version of the existing QUICK_REF.md, optimized for agent context windows:
```markdown
# QUICK_REF.md — Atomizer Agent Reference
## Non-Negotiables
1. NEVER kill NX processes directly → NXSessionManager.close_nx_if_allowed()
2. NEVER rewrite run_optimization.py from scratch → COPY working template
3. NEVER compute relative WFE as abs(RMS_a - RMS_b) → use extract_relative()
4. CMA-ES doesn't evaluate x0 first → always enqueue_trial(x0)
5. PowerShell for NX journals → NEVER cmd /c
## Workflow
Create → Validate → Run → Analyze → Report → Deliver
## AtomizerSpec v2.0
Single source of truth: `atomizer_spec.json`
Schema: `optimization_engine/schemas/atomizer_spec_v2.json`
## File Chain (CRITICAL)
.sim → .fem → *_i.prt → .prt
The idealized part (*_i.prt) MUST be loaded before UpdateFemodel()!
## Extractors (key)
E1: Displacement | E2: Frequency | E3: Stress | E4: BDF Mass
E5: CAD Mass | E8-10: Zernike variants
Full catalog: docs/extractors/EXTRACTORS.md
## Algorithm Selection
< 5 vars, smooth → Nelder-Mead or COBYLA
5-20 vars, noisy → CMA-ES
> 20 vars → Bayesian (Optuna TPE) or surrogate
Multi-objective → NSGA-II or MOEA/D
```
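The algorithm-selection heuristics above can be captured in a few lines. A sketch with the thresholds taken straight from the table (the `choose_algorithm` helper itself is hypothetical, not part of the framework or of SYS_15):

```python
def choose_algorithm(n_vars: int, noisy: bool = False, multi_objective: bool = False) -> str:
    """Map the QUICK_REF selection heuristics to an algorithm name."""
    if multi_objective:
        return "NSGA-II"       # or MOEA/D
    if n_vars < 5 and not noisy:
        return "Nelder-Mead"   # or COBYLA for constrained problems
    if n_vars <= 20 and noisy:
        return "CMA-ES"
    if n_vars > 20:
        return "Optuna TPE"    # Bayesian, or surrogate-assisted
    return "CMA-ES"            # robust default in the 5-20 variable band
```

For example, a 9-variable noisy WFE problem lands on CMA-ES, matching the StarSpec decision recorded in the CONTEXT.md above.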
### 2.4 Knowledge Base Evolution
The current `knowledge_base/` with LAC becomes the agent memory backbone:
```
BEFORE (single Claude Code brain):
knowledge_base/lac.py → one script queries everything
knowledge_base/lac/ → flat JSONL files
knowledge_base/playbook.json → session context playbook
AFTER (distributed agent knowledge):
knowledge_base/
├── lac/ → Stays (optimization memory, session insights)
│ ├── optimization_memory/ → Optimizer + Study Builder read this
│ └── session_insights/ → All agents read failure.jsonl
├── projects/ → Per-project context (all agents read)
│ └── <project>/CONTEXT.md
└── company/ → Evolving company knowledge
├── algorithm-selection.md → Optimizer + Technical read
├── common-pitfalls.md → All agents read
└── client-patterns.md → Manager + Secretary read
```
**Agent access pattern:**
- Agents read `knowledge_base/` via the mounted `/mnt/atomizer-repo` volume
- Agents write project-specific knowledge to their own `memory/<project>.md`
- Manager periodically promotes agent learnings → `knowledge_base/company/`
- Developer updates `knowledge_base/lac/` when framework changes
---
## 3. Agent Workspaces (Detailed)
### 3.1 Bootstrap Script
```bash
#!/bin/bash
# /opt/atomizer/bootstrap-workspaces.sh
# Creates all agent workspaces with proper templates
set -e
WORKSPACE_ROOT="/opt/atomizer/workspaces"
# Agent definitions: id|name|emoji|model|tier
AGENTS=(
"manager|The Manager|🎯|anthropic/claude-opus-4-6|core"
"secretary|The Secretary|📋|anthropic/claude-opus-4-6|core"
"technical|The Technical Lead|🔧|anthropic/claude-opus-4-6|core"
"optimizer|The Optimizer|⚡|anthropic/claude-opus-4-6|core"
"nx-expert|The NX Expert|🖥️|anthropic/claude-sonnet-5-20260203|specialist"
"postprocessor|The Post-Processor|📊|anthropic/claude-sonnet-5-20260203|specialist"
"reporter|The Reporter|📝|anthropic/claude-sonnet-5-20260203|specialist"
"auditor|The Auditor|🔍|anthropic/claude-opus-4-6|specialist"
"study-builder|The Study Builder|🏗️|openai/gpt-5.3-codex|core"
"researcher|The Researcher|🔬|google/gemini-3.0-pro|support"
"developer|The Developer|💻|anthropic/claude-sonnet-5-20260203|support"
"knowledge-base|The Knowledge Base|🗄️|anthropic/claude-sonnet-5-20260203|support"
"it-support|IT Support|🛠️|anthropic/claude-sonnet-5-20260203|support"
)
for agent_def in "${AGENTS[@]}"; do
IFS='|' read -r ID NAME EMOJI MODEL TIER <<< "$agent_def"
DIR="$WORKSPACE_ROOT/$ID"
echo "Creating workspace: $DIR ($NAME)"
mkdir -p "$DIR/memory"
# --- SOUL.md ---
cat > "$DIR/SOUL.md" << SOUL_EOF
# SOUL.md — $NAME $EMOJI
You are **$NAME** at **Atomizer Engineering Co.**, a multi-agent FEA optimization firm.
## Who You Are
- Role: $NAME
- Tier: $TIER
- Company: Atomizer Engineering Co.
- CEO: Antoine Letarte (human — your boss)
- Orchestrator: The Manager (coordinates all work)
## Core Principles
1. Follow all Atomizer protocols (load the \`atomizer-protocols\` skill)
2. Stay in your lane — delegate work outside your expertise
3. Respond when @-mentioned in Slack channels
4. Update your memory after significant work
5. Be concise in Slack — detailed in documents
6. When uncertain: ask, don't guess
7. Record lessons learned — the company gets smarter with every project
## Communication Style
- In Slack threads: concise, structured, use bullet points
- In reports/docs: thorough, professional, well-formatted
- When disagreeing: respectful but direct — this is engineering, facts matter
- When blocked: escalate to Manager immediately, don't spin
## What You DON'T Do
- Never bypass the Manager's coordination
- Never send client communications without approval chain
- Never modify another agent's memory files
- Never make up engineering data or results
SOUL_EOF
# --- AGENTS.md ---
cat > "$DIR/AGENTS.md" << AGENTS_EOF
# AGENTS.md — $NAME
## Session Init
1. Read \`SOUL.md\` — who you are
2. Read \`MEMORY.md\` — what you remember
3. Check \`memory/\` for active project context
4. Load \`atomizer-protocols\` skill for protocol reference
5. Check which Slack channel/thread you're in for context
## Memory Protocol
- \`memory/<project>.md\` → per-project working notes
- \`MEMORY.md\` → long-term role knowledge (lessons, patterns, preferences)
- Write it down immediately — don't wait until end of session
- After every project: distill lessons into MEMORY.md
## Protocols
All work follows Atomizer protocols. Key ones:
- OP_01: Study Lifecycle (creation through delivery)
- OP_09: Agent Handoff (how to pass work to another agent)
- OP_10: Project Intake (how new projects start)
## Company Directory
| Agent | ID | Channel | Role |
|-------|----|---------|------|
| 🎯 Manager | manager | #hq | Orchestrator, assigns work |
| 📋 Secretary | secretary | #secretary | Antoine's interface |
| 🔧 Technical | technical | (summoned) | Problem breakdown |
| ⚡ Optimizer | optimizer | (summoned) | Algorithm design |
| 🖥️ NX Expert | nx-expert | (summoned) | NX/Nastran specialist |
| 📊 Post-Processor | postprocessor | (summoned) | Data analysis, plots |
| 📝 Reporter | reporter | (summoned) | Report generation |
| 🔍 Auditor | auditor | #audit-log | Reviews everything |
| 🏗️ Study Builder | study-builder | (summoned) | Writes Python code |
| 🔬 Researcher | researcher | #research | Literature, methods |
| 💻 Developer | developer | #dev | Codes new features |
| 🗄️ Knowledge Base | knowledge-base | #knowledge-base | Company memory |
| 🛠️ IT Support | it-support | (summoned) | Infrastructure |
AGENTS_EOF
# --- TOOLS.md ---
cat > "$DIR/TOOLS.md" << TOOLS_EOF
# TOOLS.md — $NAME
## Shared Resources
- Atomizer repo: \`/mnt/atomizer-repo/\` (read-only)
- Obsidian vault: \`/mnt/obsidian/\` (read-only)
- Job queue: \`/mnt/job-queue/\` (read-write)
## Protocols Location
Loaded via \`atomizer-protocols\` skill.
Source: \`/mnt/atomizer-repo/docs/protocols/\`
## Knowledge Base
- LAC insights: \`/mnt/atomizer-repo/knowledge_base/lac/\`
- Project contexts: \`/mnt/atomizer-repo/knowledge_base/projects/\`
## Key Files
- QUICK_REF: \`/mnt/atomizer-repo/docs/QUICK_REF.md\`
- Extractors: \`/mnt/atomizer-repo/docs/extractors/EXTRACTORS.md\`
TOOLS_EOF
# --- MEMORY.md ---
cat > "$DIR/MEMORY.md" << MEMORY_EOF
# MEMORY.md — $NAME
## Role Knowledge
*(Populated as the agent works — lessons, patterns, preferences)*
## Lessons Learned
*(Accumulated from project experience)*
## Project History
*(Brief notes on past projects and outcomes)*
MEMORY_EOF
echo " ✓ Created: SOUL.md, AGENTS.md, TOOLS.md, MEMORY.md"
done
echo ""
echo "=== All ${#AGENTS[@]} workspaces created ==="
echo ""
echo "Next steps:"
echo " 1. Customize SOUL.md for each agent's specific personality"
echo " 2. Add role-specific rules to each AGENTS.md"
echo " 3. Create shared skills in /opt/atomizer/clawdbot/skills/"
echo " 4. Configure clawdbot.json with agent definitions"
```
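A quick sanity check after running the bootstrap: every workspace should contain the four template files plus a `memory/` directory. A minimal verification sketch (the `verify_workspaces` helper is illustrative, not part of the plan's tooling):

```python
from pathlib import Path

REQUIRED = ("SOUL.md", "AGENTS.md", "TOOLS.md", "MEMORY.md")

def verify_workspaces(root: Path, agent_ids: list[str]) -> dict[str, list[str]]:
    """Return a map of agent id -> list of missing items (empty list = OK)."""
    missing = {}
    for agent in agent_ids:
        ws = root / agent
        missing[agent] = [f for f in REQUIRED if not (ws / f).is_file()]
        if not (ws / "memory").is_dir():
            missing[agent].append("memory/")
    return missing
```

Running it against `/opt/atomizer/workspaces` with the 13 agent ids from the bootstrap script should return an empty list for every agent.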
### 3.2 Role-Specific SOUL.md Customizations
After running the bootstrap, each agent's SOUL.md needs role-specific personality. Key customizations:
**Manager — add to SOUL.md:**
```markdown
## Manager-Specific Rules
- You NEVER do technical work yourself. Always delegate.
- Before assigning work, state which protocol applies.
- Track every assignment. Follow up if no response in the thread.
- If two agents disagree, call the Auditor to arbitrate.
- You are also the Framework Steward (ref DEC-A010):
- After each project, review what worked and propose improvements
- Ensure new tools get documented, not just built
- Direct Developer to build reusable components, not one-off hacks
```
**Secretary — add to SOUL.md:**
```markdown
## Secretary-Specific Rules
- Never bother Antoine with things agents can resolve themselves.
- Batch questions — don't send 5 separate messages, send 1 summary.
- Always include context: "The Optimizer is asking about X because..."
- When presenting deliverables: 3-line summary + the document.
- Track response times. If Antoine hasn't replied in 4h, ping once.
- NEVER send to clients without Antoine's explicit "approved".
- Learn what Antoine wants to know vs what to handle silently.
## Reporting Preferences (evolves over time)
- ✅ Always: client deliverables, audit findings, new tools, blockers
- ⚠️ Batch: routine progress updates, standard agent questions
- ❌ Skip: routine thread discussions, standard protocol execution
```
**Optimizer — add to SOUL.md:**
```markdown
## Optimizer-Specific Rules
- Always propose multiple approaches with trade-offs
- Respect the physics — suspicious of "too good" results
- Communicate in data: "Trial 47 achieved 23% improvement, but..."
- Read LAC optimization_memory before proposing any algorithm
## Critical Learnings (from LAC — NEVER FORGET)
- CMA-ES doesn't evaluate x0 first → always enqueue baseline trial
- Surrogate + L-BFGS = dangerous → gradient descent finds fake optima
- Relative WFE: use extract_relative(), not abs(RMS_a - RMS_b)
- For WFE problems: start with CMA-ES, 50-100 trials, then refine
```
**Auditor — add to SOUL.md:**
```markdown
## Auditor-Specific Rules
- You are the last line of defense before deliverables reach clients.
- Question EVERYTHING. "Trust but verify" is your motto.
- Check: units, mesh convergence, BCs, load magnitudes, constraint satisfaction
- If something looks "too good," investigate.
- Produce an audit report for every deliverable: PASS / CONDITIONAL / FAIL
- You have VETO power on deliverables. Use it when physics says so.
- Be respectful but relentless — social niceness never trumps correctness.
```
**Study Builder — add to SOUL.md:**
```markdown
## Study Builder-Specific Rules
- NEVER write run_optimization.py from scratch. ALWAYS copy a working template.
- The M1 V15 NSGA-II script is the gold standard reference.
- README.md is REQUIRED for every study.
- PowerShell for NX. NEVER cmd /c.
- Test with --test flag before declaring ready.
- All code must handle: NX restart, partial failures, resume capability.
- Output paths must be relative (Syncthing-compatible, no absolute Windows paths).
- Submit completed code as a job to /mnt/job-queue/pending/
```
### 3.3 Shared Skills Structure
```
/opt/atomizer/clawdbot/skills/
├── atomizer-protocols/ # Loaded by ALL agents
│ ├── SKILL.md
│ ├── QUICK_REF.md # Agent-optimized quick reference
│ └── protocols/ # Full protocol files
│ ├── OP_01_study_lifecycle.md
│ ├── OP_02_run_optimization.md
│ ├── OP_03_monitor_progress.md
│ ├── OP_04_analyze_results.md
│ ├── OP_05_export_training.md
│ ├── OP_06_troubleshoot.md
│ ├── OP_07_disk_optimization.md
│ ├── OP_08_generate_report.md
│ ├── OP_09_agent_handoff.md # NEW
│ ├── OP_10_project_intake.md # NEW
│ ├── SYS_10_imso.md
│ ├── SYS_11_multi_objective.md
│ ├── SYS_12_extractor_library.md
│ ├── SYS_13_dashboard.md
│ ├── SYS_14_neural_acceleration.md
│ ├── SYS_15_method_selector.md
│ ├── SYS_16_self_aware_turbo.md
│ ├── SYS_17_study_insights.md
│ ├── SYS_18_context_engineering.md
│ ├── SYS_19_job_queue.md # NEW
│ └── SYS_20_agent_memory.md # NEW
├── atomizer-company/ # Loaded by ALL agents
│ ├── SKILL.md
│ ├── COMPANY.md # Identity, values, how we work
│ └── LAC_CRITICAL.md # Hard-won lessons (from failure.jsonl)
├── atomizer-spec/ # Loaded by Optimizer, Study Builder
│ ├── SKILL.md
│ ├── SPEC_FORMAT.md # AtomizerSpec v2.0 reference
│ └── examples/ # Example specs for common study types
├── atomizer-extractors/ # Loaded by Post-Processor, Study Builder
│ ├── SKILL.md
│ ├── EXTRACTOR_CATALOG.md # All 20+ extractors
│ └── CUSTOM_EXTRACTOR_GUIDE.md # How to create new ones
├── atomizer-nx/ # Loaded by NX Expert, Study Builder
│ ├── SKILL.md
│ ├── NX_PATTERNS.md # Common NX Open patterns
│ ├── SOLVER_CONFIG.md # Solution sequences, element types
│ └── FILE_STRUCTURE.md # .sim/.fem/.prt dependencies
└── atomaste-reports/ # Loaded by Reporter
├── SKILL.md
├── REPORT_TEMPLATES.md # Atomaste report format
└── STYLE_GUIDE.md # Branding, formatting rules
```
**SKILL.md example (atomizer-protocols):**
```markdown
---
name: atomizer-protocols
description: Atomizer Engineering Co. protocols and procedures
version: 1.0
---
# Atomizer Protocols Skill
Load this skill for access to all Atomizer operating protocols.
## When to Load
- On every session (this is your company's operating system)
## Key Files
- `QUICK_REF.md` — Start here. 2-page cheatsheet.
- `protocols/OP_*` — Operational protocols (how to do things)
- `protocols/SYS_*` — System protocols (technical specifications)
## Protocol Lookup
| Need | Read |
|------|------|
| Create a study | OP_01 |
| Run optimization | OP_02 |
| Analyze results | OP_04 |
| Hand off to another agent | OP_09 |
| Start a new project | OP_10 |
| Choose algorithm | SYS_15 |
| Submit job to Windows | SYS_19 |
```
### 3.4 Per-Agent Skill Assignments
| Agent | Shared Skills | Agent-Specific Skills |
|-------|--------------|----------------------|
| Manager | protocols, company | — |
| Secretary | protocols, company | email (draft only) |
| Technical | protocols, company | — |
| Optimizer | protocols, company, spec | — |
| NX Expert | protocols, company, nx | — |
| Post-Processor | protocols, company, extractors | data-visualization |
| Reporter | protocols, company | atomaste-reports |
| Auditor | protocols, company | physics-validation |
| Study Builder | protocols, company, spec, extractors, nx | — |
| Researcher | protocols, company | web-search |
| Developer | protocols, company, extractors | git-tools |
| Knowledge Base | protocols, company | cad-documenter |
| IT Support | protocols, company | system-admin |
---
## 4. Slack Architecture
### 4.1 Dedicated Workspace Setup (ref DEC-A006)
**Workspace:** `Atomizer Engineering` (`atomizer-eng.slack.com`)
Setup steps:
1. Antoine creates new Slack workspace at https://slack.com/create
2. Workspace name: "Atomizer Engineering"
3. Create a Slack app at https://api.slack.com/apps
4. App name: "Atomizer Agents" (single app, all agents share it)
5. Enable Socket Mode (for real-time events without public URL)
6. Bot token scopes: `chat:write`, `channels:read`, `channels:history`, `channels:join`, `groups:read`, `groups:history`, `users:read`, `reactions:write`, `reactions:read`, `files:write`
7. Install app to workspace
8. Copy Bot Token (`xoxb-...`) and App Token (`xapp-...`) to `/opt/atomizer/.env`
**Agent identity in Slack:**
Since Clawdbot uses a single bot token (ref DEC-A013 — keeping it simple for now), agents identify themselves via their emoji prefix in messages:
```
🎯 [Manager]: @technical, new job. Break down the attached requirements.
🔧 [Technical]: @manager, breakdown complete. 9 design variables identified.
⚡ [Optimizer]: CMA-ES recommended. Starting population: 20, budget: 150.
```
Future enhancement: separate bot tokens per agent for true Slack identity (avatar, display name per agent).
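Because every agent posts through the same bot token, the prefix has to be applied to outgoing text consistently. A minimal formatting sketch (the badge table and `format_agent_message` helper are illustrative, not Clawdbot API):

```python
# Illustrative badge table — extend with the full 13-agent roster.
AGENT_BADGES = {
    "manager": "🎯 [Manager]",
    "technical": "🔧 [Technical]",
    "optimizer": "⚡ [Optimizer]",
}

def format_agent_message(agent_id: str, text: str) -> str:
    """Prefix a Slack message with the sending agent's emoji badge."""
    badge = AGENT_BADGES.get(agent_id)
    if badge is None:
        raise KeyError(f"unknown agent id: {agent_id}")
    return f"{badge}: {text}"
```

Keeping the badge table in one place means a later migration to per-agent bot tokens only has to swap this formatting layer for real Slack identities.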
### 4.2 Channel Structure
#### Permanent Channels
| Channel | Purpose | Primary Agent | Description |
|---------|---------|--------------|-------------|
| `#hq` | Company coordination | Manager | All company-wide discussions, directives |
| `#secretary` | Antoine's dashboard | Secretary | Status updates, approvals, questions |
| `#audit-log` | Audit trail | Auditor | All audit reports, findings |
| `#research` | Research requests | Researcher | Literature search, method comparisons |
| `#dev` | Development work | Developer | Code, features, bug fixes |
| `#knowledge-base` | Documentation | Knowledge Base | CAD docs, model knowledge |
| `#framework-evolution` | Framework growth | Manager (steward) | Protocol updates, tool improvements |
#### Project Channels (created per job)
**Naming:** `#<client>-<short-description>`
Examples:
- `#starspec-m1-wfe` — StarSpec M1 wavefront error optimization
- `#clientb-thermal-opt` — Client B thermal optimization
- `#internal-new-extractor` — Internal development project
**Lifecycle:**
1. Antoine posts in `#hq`: "New job: StarSpec M1 WFE optimization"
2. Manager creates `#starspec-m1-wfe` channel
3. Manager posts project kickoff in new channel
4. Manager invites relevant agents via @-mentions
5. Work proceeds in-channel with thread discipline
6. On completion: channel archived
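Slack channel names must be lowercase, without spaces or most punctuation, so step 2 implies slugifying both parts of `#<client>-<short-description>`. A sketch of that naming rule (the `project_channel` helper is hypothetical):

```python
import re

def project_channel(client: str, description: str) -> str:
    """Build a #<client>-<short-description> channel name.

    Lowercases both parts, collapses runs of non-alphanumerics into
    single hyphens, and truncates to Slack's 80-character name limit.
    """
    def slug(s: str) -> str:
        return re.sub(r"[^a-z0-9]+", "-", s.lower()).strip("-")

    name = f"{slug(client)}-{slug(description)}"
    return "#" + name[:80]
```

So "StarSpec" plus "M1 WFE" yields `#starspec-m1-wfe`, matching the example above.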
### 4.3 Agent Routing Configuration
```json5
// In clawdbot.json — bindings section
{
  "bindings": [
    // Manager gets HQ and all project channels
    {
      "agentId": "manager",
      "match": { "channel": "slack", "peer": { "kind": "group", "name": "#hq" } }
    },
    // Secretary gets its channel and DMs from Antoine
    {
      "agentId": "secretary",
      "match": { "channel": "slack", "peer": { "kind": "group", "name": "#secretary" } }
    },
    {
      "agentId": "secretary",
      "match": { "channel": "slack", "peer": { "kind": "dm", "userId": "ANTOINE_USER_ID" } }
    },
    // Specialized permanent channels
    {
      "agentId": "auditor",
      "match": { "channel": "slack", "peer": { "kind": "group", "name": "#audit-log" } }
    },
    {
      "agentId": "researcher",
      "match": { "channel": "slack", "peer": { "kind": "group", "name": "#research" } }
    },
    {
      "agentId": "developer",
      "match": { "channel": "slack", "peer": { "kind": "group", "name": "#dev" } }
    },
    {
      "agentId": "knowledge-base",
      "match": { "channel": "slack", "peer": { "kind": "group", "name": "#knowledge-base" } }
    },
    // Project channels → Manager handles routing via @mentions
    // All project channels (#starspec-*, #clientb-*, etc.) route to Manager
    // Manager then @mentions specific agents in threads
    {
      "agentId": "manager",
      "match": { "channel": "slack", "peer": { "kind": "group" } },
      "priority": -1 // Fallback — catches any unbound channel
    }
  ]
}
```
### 4.4 Thread Discipline
Main channel timeline reads like a project log (milestone posts only):
```
🎯 [Manager]: Project kickoff: StarSpec M1 WFE optimization
└── Thread: Kickoff details, objectives, constraints
🔧 [Technical]: Technical breakdown complete (9 DVs, 2 objectives)
└── Thread: Full breakdown, parameter table, gap analysis
⚡ [Optimizer]: Algorithm recommendation: CMA-ES, 150 trials
└── Thread: Rationale, alternatives considered, trade-offs
🎯 [Manager]: Study plan approved. @study-builder, build it.
└── Thread: Study Builder's code, review, iteration
🏗️ [Study Builder]: Study ready. Submitted to job queue.
└── Thread: Code review, test results, file manifest
📊 [Post-Processor]: Results ready — 23% WFE improvement
└── Thread: Plots, data tables, convergence analysis
🔍 [Auditor]: Audit PASSED (2 advisory notes)
└── Thread: Full audit report
📝 [Reporter]: Draft report ready for review
└── Thread: Report link, review comments
📋 [Secretary]: @antoine — Report ready, please review ✅
```
### 4.5 How Antoine Interacts
Antoine's primary interface is `#secretary`. He can also:
1. **Post in `#secretary`** → Secretary processes, routes to relevant agents
2. **Post in `#hq`** → Manager treats as company-wide directive
3. **Post in any project channel** → Manager acknowledges and adjusts
4. **DM the bot** → Routes to Secretary (via binding)
5. **@mention any agent** in any channel → That agent responds directly
**Antoine's key phrases:**
- "approved" / "send it" → Secretary triggers delivery
- "hold" / "wait" → Secretary pauses the pipeline
- "what's the status?" → Secretary compiles cross-project summary
- "focus on X" → Secretary relays priority to Manager
---
## 5. Windows Execution Bridge
### 5.1 Job Queue Design (ref DEC-A011, SYS_19)
The Syncthing job queue is the primary mechanism for agents (on Linux) to trigger Python/NX execution on Antoine's Windows machine.
```
/opt/atomizer/job-queue/ (Linux — /mnt/job-queue in container)
C:\Atomizer\job-queue\ (Windows — Syncthing mirror)
├── pending/ # New jobs waiting to run
│ └── job-20260210-143022-wfe/
│ ├── job.json # Job manifest
│ ├── run_optimization.py # The actual script
│ ├── atomizer_spec.json # Study configuration
│ └── 1_setup/ # Model files (or symlinks)
│ ├── M1_mirror.prt
│ ├── M1_mirror_fem1.fem
│ ├── M1_mirror_fem1_i.prt
│ └── M1_mirror_sim1.sim
├── running/ # Currently executing
├── completed/ # Finished successfully
│ └── job-20260210-143022-wfe/
│ ├── job.json # Updated with results
│ ├── 3_results/ # Output data
│ │ ├── study.db
│ │ └── convergence.png
│ └── stdout.log # Execution log
└── failed/ # Failed jobs
└── job-20260209-091500-modal/
├── job.json # Updated with error info
└── stderr.log # Error log
```
### 5.2 Job File Format
```json
{
"job_id": "job-20260210-143022-wfe",
"created_at": "2026-02-10T14:30:22Z",
"created_by": "study-builder",
"project": "starspec-m1-wfe",
"channel": "#starspec-m1-wfe",
"type": "optimization",
"script": "run_optimization.py",
"args": ["--start", "--trials", "150"],
"python_env": "atomizer",
"model_files": [
"1_setup/M1_mirror.prt",
"1_setup/M1_mirror_fem1.fem",
"1_setup/M1_mirror_fem1_i.prt",
"1_setup/M1_mirror_sim1.sim"
],
"status": "pending",
"status_updated_at": null,
"result": null,
"error": null,
"duration_seconds": null,
"notify": {
"on_start": true,
"on_complete": true,
"on_fail": true,
"progress_interval_minutes": 30
}
}
```
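A manifest this central deserves a sanity check before submission. A minimal validator sketch — the required-key set mirrors the example above and is not a frozen schema:

```python
REQUIRED_KEYS = {"job_id", "created_at", "created_by", "project", "type", "script", "status"}

def validate_manifest(job: dict) -> list[str]:
    """Return a list of problems with a job manifest (empty list = valid)."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - job.keys())]
    # None is tolerated here so a missing status isn't reported twice
    if job.get("status") not in {"pending", "running", "completed", "failed", None}:
        problems.append(f"unknown status: {job.get('status')}")
    if job.get("status") == "pending" and not str(job.get("script", "")).endswith(".py"):
        problems.append("script should be a .py file")
    return problems
```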
### 5.3 Windows File Watcher Service
A Python service running on Windows that monitors the job queue:
```python
#!/usr/bin/env python3
r"""
atomizer_job_watcher.py — Windows Job Queue Service
Watches C:\Atomizer\job-queue\pending\ for new jobs.
Runs them, moves to completed/ or failed/.
"""
import json
import os
import shutil
import subprocess
import sys
import time
import logging
from pathlib import Path
from datetime import datetime, timezone
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler
JOB_QUEUE = Path(r"C:\Atomizer\job-queue")
PENDING = JOB_QUEUE / "pending"
RUNNING = JOB_QUEUE / "running"
COMPLETED = JOB_QUEUE / "completed"
FAILED = JOB_QUEUE / "failed"
CONDA_PYTHON = r"C:\Users\antoi\anaconda3\envs\atomizer\python.exe"
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s [%(levelname)s] %(message)s",
handlers=[
logging.FileHandler(JOB_QUEUE / "watcher.log"),
logging.StreamHandler()
]
)
log = logging.getLogger("job-watcher")
class JobHandler(FileSystemEventHandler):
    """Watch for new job.json files in pending/"""

    def on_created(self, event):
        if event.src_path.endswith("job.json"):
            time.sleep(5)  # let Syncthing finish syncing the other job files
            self.run_job(Path(event.src_path).parent)

    def on_moved(self, event):
        # Syncthing writes a temp file then renames it into place, which
        # surfaces as a move rather than a create
        if event.dest_path.endswith("job.json"):
            time.sleep(5)
            self.run_job(Path(event.dest_path).parent)
def run_job(self, job_dir: Path):
job_file = job_dir / "job.json"
if not job_file.exists():
return
with open(job_file) as f:
job = json.load(f)
job_id = job["job_id"]
log.info(f"Starting job: {job_id}")
# Move to running/
running_dir = RUNNING / job_dir.name
shutil.move(str(job_dir), str(running_dir))
# Update status
job["status"] = "running"
job["status_updated_at"] = datetime.now(timezone.utc).isoformat()
with open(running_dir / "job.json", "w") as f:
json.dump(job, f, indent=2)
# Execute
script = running_dir / job["script"]
args = [CONDA_PYTHON, str(script)] + job.get("args", [])
stdout_log = running_dir / "stdout.log"
stderr_log = running_dir / "stderr.log"
start_time = time.time()
        try:
            # Context managers ensure log files are flushed and closed even on error
            with open(stdout_log, "w") as out, open(stderr_log, "w") as err:
                result = subprocess.run(
                    args,
                    cwd=str(running_dir),
                    stdout=out,
                    stderr=err,
                    timeout=job.get("timeout_seconds", 86400),  # 24h default
                    env={**os.environ, "ATOMIZER_JOB_ID": job_id},
                )
            if result.returncode == 0:
                job["status"] = "completed"
                dest = COMPLETED / job_dir.name
            else:
                job["status"] = "failed"
                job["error"] = f"Exit code: {result.returncode}"
                dest = FAILED / job_dir.name
        except subprocess.TimeoutExpired:
            job["status"] = "failed"
            job["error"] = "Timeout exceeded"
            dest = FAILED / job_dir.name
        except Exception as e:
            job["status"] = "failed"
            job["error"] = str(e)
            dest = FAILED / job_dir.name
        job["duration_seconds"] = round(time.time() - start_time, 1)
job["status_updated_at"] = datetime.now(timezone.utc).isoformat()
with open(running_dir / "job.json", "w") as f:
json.dump(job, f, indent=2)
shutil.move(str(running_dir), str(dest))
log.info(f"Job {job_id}: {job['status']} ({job.get('duration_seconds', '?')}s)")
def main():
"""Start the file watcher service."""
for d in [PENDING, RUNNING, COMPLETED, FAILED]:
d.mkdir(parents=True, exist_ok=True)
# Process any jobs already in pending/ (from before service started)
handler = JobHandler()
for job_dir in PENDING.iterdir():
if (job_dir / "job.json").exists():
handler.run_job(job_dir)
# Watch for new jobs
observer = Observer()
observer.schedule(handler, str(PENDING), recursive=True)
observer.start()
log.info(f"Job watcher started. Monitoring: {PENDING}")
try:
while True:
time.sleep(1)
except KeyboardInterrupt:
observer.stop()
observer.join()
if __name__ == "__main__":
main()
```
**Install as Windows service (optional):**
```powershell
# Using NSSM (Non-Sucking Service Manager)
nssm install AtomizerJobWatcher "C:\Users\antoi\anaconda3\envs\atomizer\python.exe" "C:\Atomizer\atomizer_job_watcher.py"
nssm set AtomizerJobWatcher AppDirectory "C:\Atomizer"
nssm set AtomizerJobWatcher DisplayName "Atomizer Job Watcher"
nssm set AtomizerJobWatcher Start SERVICE_AUTO_START
nssm start AtomizerJobWatcher
```
Or run manually:
```powershell
conda activate atomizer
python C:\Atomizer\atomizer_job_watcher.py
```
### 5.4 How Agents Monitor Job Status
Agents poll the job queue directory to check status:
```python
# Agent-side job monitoring (runs on Linux)
import json
from pathlib import Path
JOB_QUEUE = Path("/mnt/job-queue")
def check_job_status(job_id: str) -> dict:
"""Check status of a submitted job."""
for status_dir in ["running", "completed", "failed", "pending"]:
job_dir = JOB_QUEUE / status_dir / job_id
job_file = job_dir / "job.json"
if job_file.exists():
with open(job_file) as f:
return json.load(f)
return {"status": "not_found", "job_id": job_id}
def list_jobs(status: str | None = None) -> list[dict]:
"""List jobs, optionally filtered by status."""
jobs = []
dirs = [status] if status else ["pending", "running", "completed", "failed"]
for d in dirs:
for job_dir in (JOB_QUEUE / d).iterdir():
job_file = job_dir / "job.json"
if job_file.exists():
with open(job_file) as f:
jobs.append(json.load(f))
return jobs
```
**Manager's heartbeat check (in HEARTBEAT.md):**
```markdown
## Job Queue Check
Every heartbeat, check /mnt/job-queue/ for:
- New completed jobs → notify Post-Processor + channel
- New failed jobs → notify channel + escalate to Secretary
- Long-running jobs (>4h) → status update to channel
```
### 5.5 Tailscale SSH Fast Lane (Optional)
For time-sensitive operations, direct SSH to Windows:
```bash
# Setup (one-time):
# 1. Install Tailscale on both T420 and Windows
# 2. Enable SSH on Windows (Settings > Apps > Optional Features > OpenSSH Server)
# 3. Note Windows Tailscale IP (e.g., 100.x.y.z)
# From Docker container (requires Tailscale sidecar):
ssh antoi@100.x.y.z "conda activate atomizer && python C:\Atomizer\studies\starspec\run_optimization.py --start"
```
This is the "fast lane" — bypasses Syncthing sync delay (usually 5-30 seconds) for immediate execution. Enable the Tailscale sidecar in Docker Compose when needed.
### 5.6 End-to-End Execution Flow
```
Study Builder writes code
Creates job directory with job.json + script + model files
Copies to /mnt/job-queue/pending/job-YYYYMMDD-HHMMSS-<name>/
Syncthing syncs to C:\Atomizer\job-queue\pending\ (5-30s)
Windows file watcher detects new job.json
Moves job to running/ → executes python script
├── Progress: stdout.log updates periodically (Syncthing syncs back)
On completion: moves to completed/ (or failed/)
Syncthing syncs back to Linux (results, logs, study.db)
Manager's heartbeat detects completed job
Notifies Post-Processor → analysis begins
```
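The Study Builder side of this flow can be sketched as below. Writing `job.json` last means a local watcher never sees a manifest before its payload exists; across Syncthing the sync order is not guaranteed, which is why the Windows watcher also waits a few seconds. The `submit_job` helper and paths are illustrative:

```python
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path

PENDING = Path("/mnt/job-queue/pending")

def submit_job(name: str, script: Path, model_files: list[Path], manifest: dict) -> Path:
    """Stage a job directory into pending/. job.json is written LAST."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
    job_dir = PENDING / f"job-{stamp}-{name}"
    (job_dir / "1_setup").mkdir(parents=True)
    shutil.copy2(script, job_dir / script.name)
    for f in model_files:
        shutil.copy2(f, job_dir / "1_setup" / f.name)
    manifest = {**manifest, "job_id": job_dir.name, "script": script.name,
                "status": "pending",
                "created_at": datetime.now(timezone.utc).isoformat()}
    # manifest last: the watcher triggers on job.json, so payload is in place
    (job_dir / "job.json").write_text(json.dumps(manifest, indent=2))
    return job_dir
```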
---
## 6. Inter-Agent Communication
### 6.1 Communication Hierarchy
```
┌─────────┐
│ Antoine │ ← CEO, can speak to anyone
└────┬────┘
┌────▼────┐
│Secretary│ ← Antoine's filter
└────┬────┘
┌────▼────┐
│ Manager │ ← Orchestrator, delegates all work
└────┬────┘
┌────────────────┼────────────────────┐
│ │ │
┌────▼────┐ ┌────▼────┐ ┌────▼────┐
│Technical│ │Optimizer│ │ Auditor │ ← Can interrupt anyone
└────┬────┘ └────┬────┘ └─────────┘
│ │
┌────▼────┐ ┌────▼────┐
│NX Expert│ │ Study │
└─────────┘ │ Builder │
└────┬────┘
┌────▼──────┐
│Post-Proc │
└────┬──────┘
┌────▼────┐
│Reporter │
└─────────┘
```
### 6.2 Slack @Mention Protocol
**Primary communication method.** Agents talk by @-mentioning each other in project channel threads.
**Rules (ref DEC-A003):**
1. Only Manager initiates cross-agent work in project channels
2. Agents respond when @-mentioned
3. Agents can @-mention Manager to report completion or ask for guidance
4. Auditor can interrupt any thread (review authority)
5. Secretary can always message Antoine
**Message format:**
```
🎯 [Manager]: @technical, new project. Break down the requirements in this thread.
Contract: [attached file]
Protocol: OP_01 + OP_10
🔧 [Technical]: @manager, breakdown complete:
- 9 design variables (see table below)
- 2 objectives: minimize WFE, minimize mass
- 3 constraints: freq > 80Hz, stress < 200MPa, mass < 12kg
- Solver: SOL 101 + SOL 103
- @nx-expert, please confirm solver config.
🖥️ [NX Expert]: SOL 101 for static, SOL 103 for modal. Confirmed.
Note: Need chained analysis for thermal. Recommend SOL 153 chain.
🎯 [Manager]: @optimizer, Technical's breakdown is ready. Propose algorithm.
```
### 6.3 sessions_send for Direct Communication
For urgent or out-of-band communication that shouldn't clutter Slack:
```javascript
// Manager urgently needs Auditor's input
sessions_send({
agentId: "auditor",
message: "URGENT: Trial 47 results look non-physical. Mass decreased 40% with minimal geometry change. Please review immediately. Channel: #starspec-m1-wfe"
})
```
**When to use:**
- Emergency findings that need immediate attention
- Cross-agent coordination that's meta (about how to work, not the work itself)
- Private agent-to-agent messages that shouldn't be in Slack
### 6.4 sessions_spawn for Heavy Lifting
For compute-intensive tasks that would block the agent:
```javascript
// Post-Processor spawns sub-agent for heavy data crunching
sessions_spawn({
agentId: "postprocessor",
task: "Generate full Zernike decomposition for trials 1-95. Read results from /mnt/job-queue/completed/job-20260210-143022-wfe/3_results/study.db. Output: convergence plot, Pareto front (if multi-objective), parameter sensitivity analysis. Save all plots to my memory/starspec-m1-wfe/ folder.",
runTimeoutSeconds: 600 // 10 min max
})
```
**When to use:**
- Generating many plots/analyses
- Processing large result sets
- Any task that would take >2 minutes
- Work that doesn't need interactive back-and-forth
### 6.5 Approval Gates (ref DEC-A009)
| What Needs Approval | Escalation Path | Approver |
|---------------------|-----------------|----------|
| Client deliverables | Reporter → Auditor review → Secretary → Antoine | Antoine |
| New tools/extractors | Developer → Manager → Secretary → Antoine | Antoine |
| Divergent optimization approach | Optimizer → Manager → Secretary → Antoine | Antoine |
| Scope changes | Technical → Manager → Secretary → Antoine | Antoine |
| Budget decisions (>100 trials) | Manager → Secretary → Antoine | Antoine |
| Framework/protocol changes | Manager → Secretary → Antoine | Antoine |
| Emergency stops | Any agent → Manager (immediate) | Manager |
**No approval needed for:**
- Routine technical work within scope
- Internal agent discussions in threads
- Memory updates
- Standard protocol execution
- Research queries
- Small bug fixes
**Secretary's approval tracking template:**
```markdown
## Pending Approvals (in Secretary's memory)
| ID | Date | From | Request | Status |
|----|------|------|---------|--------|
| APR-001 | 2026-02-10 | Reporter | M1 WFE report v1 ready | ⏳ Waiting |
| APR-002 | 2026-02-10 | Developer | New thermal extractor | ⏳ Waiting |
```
---
## 7. Mario ↔ Atomizer Bridge
### 7.1 Separation Principle
Mario (me) is the **architect** of this system. I designed it, I write the blueprints, I can make strategic changes. But I am NOT in the operational loop.
```
Mario's Domain Atomizer's Domain
────────────── ─────────────────
Strategic architecture Daily operations
Blueprint documents Agent workspaces
System plan updates Project channels
Clawdbot config changes Study execution
Performance reviews Client deliverables
```
### 7.2 Bridge Files
Mario maintains strategic overview through specific files:
```
/home/papa/clawd/memory/atomizer/
├── strategic-overview.md # High-level status, decisions, concerns
├── architecture-notes.md # Technical notes on system design
└── performance-log.md # Agent performance observations
/home/papa/obsidian-vault/2-Projects/P-Atomizer-Overhaul-Framework-Agentic/
├── 00-PROJECT-PLAN.md # Vision (Mario maintains)
├── 01-AGENT-ROSTER.md # Agent definitions (Mario maintains)
├── 02-ARCHITECTURE.md # Technical architecture (Mario maintains)
├── 03-ROADMAP.md # Implementation plan (Mario maintains)
├── 04-DECISION-LOG.md # Decision record (Mario + Antoine)
└── 05-FULL-SYSTEM-PLAN.md # This document (Mario maintains)
```
### 7.3 Bridge Channel
A Slack channel in **Mario's workspace** (not Atomizer's) for strategic oversight:
```
Mario's Slack workspace: #atomizer
```
This channel receives:
- Weekly summary from Atomizer's Secretary (cross-posted or generated by Mario's Clawdbot reading shared files)
- Antoine's strategic decisions that affect architecture
- Mario's architectural notes and recommendations
**How Mario gets updates without being in the loop:**
1. Atomizer's Manager writes a weekly summary to a shared file: `/opt/atomizer/workspaces/manager/reports/weekly-YYYY-WW.md`
2. This file is accessible to Mario's Clawdbot via the shared Obsidian vault or a dedicated Syncthing folder
3. Mario's heartbeat checks for new weekly reports and summarizes in `memory/atomizer/strategic-overview.md`
### 7.4 What Mario Tracks
In `/home/papa/clawd/memory/atomizer/strategic-overview.md`:
```markdown
# Atomizer Strategic Overview
## Current Phase
Phase 0: Proof of Concept (3 agents)
## Active Projects
- StarSpec M1 WFE — Phase: Execution, 47/150 trials
## System Health
- Agents operational: Manager, Secretary, Technical
- Last incident: none
- Cost this month: ~$45
## Architecture Concerns
- Watch: context window usage on Manager (orchestrates everything)
- Watch: Syncthing job queue latency (target <30s)
## Next Milestone
Phase 1 launch: Add Optimizer + Auditor (target: Week 3)
```
---
## 8. Phase 0 Implementation Checklist
### 8.1 Pre-Flight Checks
Before starting, verify:
- [ ] T420 has Docker installed and running
- [ ] T420 has sufficient disk space (>10GB free)
- [ ] Syncthing running and Atomizer repo syncing
- [ ] Anthropic API key valid and funded
- [ ] Antoine has time allocated (need ~5h for Slack setup + testing)
```bash
# Verify Docker
docker --version && docker compose version
# Verify disk space
df -h /opt/
# Verify Syncthing
curl -s http://127.0.0.1:8384/rest/system/status | jq .myID
# Verify API key
curl -s https://api.anthropic.com/v1/messages \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-H "content-type: application/json" \
-d '{"model":"claude-sonnet-4-20250514","max_tokens":10,"messages":[{"role":"user","content":"hi"}]}' \
| jq .type
# Should return "message"
```
### 8.2 Step-by-Step Implementation
#### Step 1: Directory Setup (30 min)
```bash
# Create directory structure
sudo mkdir -p /opt/atomizer
sudo chown papa:papa /opt/atomizer
cd /opt/atomizer
mkdir -p clawdbot/credentials clawdbot/skills
mkdir -p workspaces data/sessions data/logs
mkdir -p job-queue/{pending,running,completed,failed}
# Create .env file
cat > .env << 'EOF'
# Same Anthropic key as Mario's (shared subscription). Keep comments on their
# own lines — inline comments can leak into values in some dotenv parsers.
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=AIza...
EOF
chmod 600 .env
```
#### Step 2: Create Shared Skills (2-3 hours)
```bash
# Copy protocols from Atomizer repo
mkdir -p clawdbot/skills/atomizer-protocols/protocols/
# Copy OP and SYS protocols
cp /home/papa/repos/Atomizer/docs/protocols/operations/OP_*.md \
clawdbot/skills/atomizer-protocols/protocols/
cp /home/papa/repos/Atomizer/docs/protocols/system/SYS_*.md \
clawdbot/skills/atomizer-protocols/protocols/
# Create SKILL.md
cat > clawdbot/skills/atomizer-protocols/SKILL.md << 'EOF'
---
name: atomizer-protocols
description: Atomizer Engineering Co. protocols and procedures
version: 1.0
---
# Atomizer Protocols
Load QUICK_REF.md first. Full protocols in protocols/ directory.
EOF
# Copy QUICK_REF
cp /home/papa/repos/Atomizer/docs/QUICK_REF.md \
clawdbot/skills/atomizer-protocols/QUICK_REF.md
# Create company skill
mkdir -p clawdbot/skills/atomizer-company/
cat > clawdbot/skills/atomizer-company/SKILL.md << 'EOF'
---
name: atomizer-company
description: Atomizer Engineering Co. identity and shared knowledge
version: 1.0
---
# Atomizer Company
Read COMPANY.md for company identity and LAC_CRITICAL.md for hard-won lessons.
EOF
cat > clawdbot/skills/atomizer-company/COMPANY.md << 'EOF'
# Atomizer Engineering Co.
## Who We Are
A multi-agent FEA optimization firm. We optimize structural designs using
Siemens NX, Nastran, and advanced algorithms.
## How We Work
- Protocol-driven: every task follows established procedures
- Manager orchestrates, specialists execute
- Antoine (CEO) approves deliverables and strategic decisions
- The Secretary keeps Antoine informed and filters noise
- The Auditor validates everything before it leaves the company
## Our Values
- Physics first: results must make physical sense
- Reproducibility: every study must be re-runnable
- Transparency: all decisions documented and traceable
- Learning: we get smarter with every project
EOF
# Extract critical LAC lessons
cat > clawdbot/skills/atomizer-company/LAC_CRITICAL.md << 'EOF'
# LAC Critical Lessons — NEVER FORGET
These are hard-won insights from past optimization sessions.
## NX Safety
- NEVER kill ugraf.exe directly → use NXSessionManager.close_nx_if_allowed()
- PowerShell for NX journals → NEVER cmd /c
- Always load *_i.prt before UpdateFemodel() → mesh won't update without it
## Optimization
- CMA-ES doesn't evaluate x0 first → always enqueue_trial(x0)
- Surrogate + L-BFGS = dangerous → gradient descent finds fake optima on surrogate
- Never rewrite run_optimization.py from scratch → copy working template
- Relative WFE: use extract_relative() (node-by-node) → NOT abs(RMS_a - RMS_b)
## File Management
- Required file chain: .sim → .fem → *_i.prt → .prt (ALL must be present)
- Trial folders: trial_NNNN/ (zero-padded, never reused, never overwritten)
- Always copy working studies → don't modify originals
EOF
```
#### Step 3: Bootstrap Agent Workspaces (1 hour)
```bash
# Run the bootstrap script from section 3.1
# For Phase 0, we only need 3 agents, but create all for later
chmod +x bootstrap-workspaces.sh
./bootstrap-workspaces.sh
# Customize the 3 Phase 0 agents (Manager, Secretary, Technical)
# Apply role-specific SOUL.md customizations from section 3.2
```
#### Step 4: Write Clawdbot Config (1 hour)
```bash
cat > clawdbot/clawdbot.json << 'JSONEOF'
{
"agents": {
"defaults": {
"model": {
"primary": "anthropic/claude-opus-4-6"
},
"compaction": {
"mode": "safeguard",
"memoryFlush": { "enabled": true }
},
"maxConcurrent": 4,
"subagents": {
"maxConcurrent": 4,
"model": "anthropic/claude-sonnet-5-20260203"
},
"heartbeat": {
"every": "30m"
}
},
"list": [
{
"id": "manager",
"name": "The Manager",
"default": true,
"workspace": "~/clawd-atomizer-manager",
"model": "anthropic/claude-opus-4-6",
"identity": {
"name": "The Manager",
"emoji": "🎯"
}
},
{
"id": "secretary",
"name": "The Secretary",
"workspace": "~/clawd-atomizer-secretary",
"model": "anthropic/claude-opus-4-6",
"identity": {
"name": "The Secretary",
"emoji": "📋"
}
},
{
"id": "technical",
"name": "The Technical Lead",
"workspace": "~/clawd-atomizer-technical",
"model": "anthropic/claude-opus-4-6",
"identity": {
"name": "The Technical Lead",
"emoji": "🔧"
}
}
]
},
"tools": {
"web": {
"search": { "enabled": false },
"fetch": { "enabled": false }
}
},
"channels": {
"slack": {
"mode": "socket",
"enabled": true,
"requireMention": false,
"groupPolicy": "open"
}
},
"gateway": {
"port": 18790,
"mode": "local",
"bind": "0.0.0.0"
}
}
JSONEOF
```
> **Note:** Slack bot/app tokens will be added after Antoine creates the workspace (Step 5).
#### Step 5: Slack Workspace Setup (Antoine — 1 hour)
Antoine does this:
1. Go to https://slack.com/create
2. Workspace name: **Atomizer Engineering**
3. Create channels:
- `#hq`
- `#secretary`
4. Create Slack app: https://api.slack.com/apps → "Create New App" → "From Scratch"
- App name: **Atomizer Agents**
- Workspace: Atomizer Engineering
5. Enable **Socket Mode** (Settings → Socket Mode → Enable)
- Generate App-Level Token (scope: `connections:write`) → save as `SLACK_APP_TOKEN`
6. Add **Bot Token Scopes** (OAuth & Permissions):
- `chat:write`, `channels:read`, `channels:history`, `channels:join`
- `groups:read`, `groups:history`
- `users:read`, `reactions:write`, `reactions:read`, `files:write`
7. Install app to workspace → save Bot Token (`xoxb-...`)
8. **Event Subscriptions** — enable and subscribe to bot events (with Socket Mode these arrive over the socket; no public Request URL is needed):
   - Subscribe to: `message.channels`, `message.groups`, `message.im`, `app_mention`
9. Share tokens with Mario → add to `clawdbot.json`:
```json
{
"channels": {
"slack": {
"botToken": "xoxb-atomizer-...",
"appToken": "xapp-atomizer-..."
}
}
}
```
#### Step 6: Docker Compose Setup (30 min)
```bash
# Create docker-compose.yml (from section 1.3)
# Then start:
cd /opt/atomizer
docker compose up -d
# Verify it's running
docker compose logs -f atomizer-gateway
# Check health
curl http://127.0.0.1:18790/health
```
#### Step 7: Test Routing (1 hour)
**Test 1: Manager responds in `#hq`**
```
Antoine posts in #hq: "Hello, Manager. Can you hear me?"
Expected: Manager responds with 🎯 prefix
```
**Test 2: Secretary responds in `#secretary`**
```
Antoine posts in #secretary: "What's the status of the company?"
Expected: Secretary responds with 📋 prefix, says no active projects
```
**Test 3: Manager delegates to Technical**
```
Antoine posts in #hq: "New job: optimize a simple bracket for minimum mass.
Constraints: max stress < 200 MPa, min frequency > 50 Hz."
Expected flow:
1. Manager acknowledges, creates thread
2. Manager @mentions Technical for breakdown
3. Technical produces structured breakdown (parameters, objectives, constraints)
4. Secretary summarizes to #secretary for Antoine
```
**Test 4: Memory persistence**
```
Close and reopen the session.
Ask Manager: "What was the last project we discussed?"
Expected: Manager reads from memory/ and recalls the bracket discussion
```
#### Step 8: First Real Scenario (2-3 hours)
Run a realistic engineering problem through the 3-agent system:
1. Antoine provides a real project brief (e.g., mirror optimization spec)
2. Manager orchestrates breakdown
3. Technical produces full technical analysis
4. Secretary keeps Antoine updated
5. Test approval flow: Technical identifies a gap → Secretary asks Antoine
**Success criteria for Phase 0:**
- [ ] 3 agents respond correctly when @-mentioned
- [ ] Manager delegates breakdown to Technical
- [ ] Technical produces structured analysis per OP_01
- [ ] Secretary summarizes and escalates appropriately
- [ ] Memory persists across sessions
- [ ] No routing confusion
- [ ] Thread discipline maintained
### 8.3 Phase 0 → Phase 1 Transition
After Phase 0 succeeds, add Optimizer + Auditor:
```bash
# Update clawdbot.json to add 2 more agents
# (their workspaces already exist from bootstrap)
# Add to agents.list:
{
"id": "optimizer",
"name": "The Optimizer",
"workspace": "~/clawd-atomizer-optimizer",
"model": "anthropic/claude-opus-4-6",
"identity": { "name": "The Optimizer", "emoji": "⚡" }
},
{
"id": "auditor",
"name": "The Auditor",
"workspace": "~/clawd-atomizer-auditor",
"model": "anthropic/claude-opus-4-6",
"identity": { "name": "The Auditor", "emoji": "🔍" }
}
# Create #audit-log channel in Slack
# Add Auditor binding
# Restart gateway
docker compose restart atomizer-gateway
```
---
## 9. Security
### 9.1 Agent Access Boundaries
| Resource | Manager | Secretary | Technical | Optimizer | NX Expert | Post-Proc | Reporter | Auditor | Study Builder | Researcher | Developer | KB | IT |
|----------|---------|-----------|-----------|-----------|-----------|-----------|----------|---------|---------------|------------|-----------|----|----|
| Atomizer repo | R | R | R | R | R | R | R | R | R | R | R/W | R | R |
| Obsidian vault | R | R | R | — | — | — | — | R | — | — | — | R | — |
| Job queue | R | R | — | R | — | R | — | R | R/W | — | — | — | R |
| Study results | R | — | — | R | — | R/W | R | R | — | — | — | R | — |
| Agent memory (own) | R/W | R/W | R/W | R/W | R/W | R/W | R/W | R/W | R/W | R/W | R/W | R/W | R/W |
| Agent memory (others) | — | — | — | — | — | — | — | R* | — | — | — | — | — |
| Slack | All | All | Project | Project | Project | Project | Project | All | Project | #research | #dev | #knowledge-base | #hq |
| Web access | — | — | — | — | — | — | — | — | — | ✅ | — | — | — |
| Email draft | — | ✅ | — | — | — | — | ✅ | — | — | — | — | — | — |
*Auditor has read access to other agents' memory for audit purposes.
### 9.2 Principle of Least Privilege
**Implemented through:**
1. **Docker volume mounts** — agents only see what's mounted
2. **Read-only mounts** — Atomizer repo and Obsidian are `:ro`
3. **Per-agent AGENTS.md** — explicit rules about what each agent can do
4. **Approval gates** — significant actions require Antoine's OK
5. **Auditor oversight** — Auditor reviews all deliverables and can veto
### 9.3 Mario ↔ Atomizer Isolation
**Critical: no leakage between Mario's and Atomizer's Clawdbot instances.**
| Boundary | How It's Enforced |
|----------|------------------|
| Separate gateways | Mario: native systemd, port 18789. Atomizer: Docker, port 18790 |
| Separate Slack workspaces | Mario: personal workspace. Atomizer: atomizer-eng.slack.com |
| Separate configs | Mario: `~/.clawdbot/`. Atomizer: `/opt/atomizer/clawdbot/` |
| No shared memory | Agents can't read `/home/papa/clawd/` (Mario's workspace) |
| Read-only shared repos | Atomizer reads `/home/papa/repos/Atomizer/` but can't write |
| Separate session storage | Mario: `~/.clawdbot/agents/`. Atomizer: `/opt/atomizer/data/sessions/` |
### 9.4 API Key Management
```bash
# Keys stored in /opt/atomizer/.env (chmod 600)
# Docker Compose reads them via env_file
# Never committed to git
# Same keys as Mario (shared Anthropic Max subscription)
# To rotate keys:
# 1. Update .env
# 2. docker compose restart atomizer-gateway
```
**Cost monitoring:**
- Track per-agent usage via Anthropic/OpenAI dashboards
- Manager reports monthly cost in weekly summary
- Budget alert if monthly spend > $150 (Phase 0) / $400 (Phase 3)
### 9.5 Slack Security
- Separate Slack workspace = separate permissions
- Bot token has minimal scopes (no admin, no user impersonation)
- Socket Mode = no public webhook URL
- Antoine is the only human user (single-tenant)
- If workspace is used for demos: create a `#demo` channel, agents know not to share sensitive data there
---
## 10. Future-Proofing
### 10.1 MCP Server Integration Path
The Atomizer repo already has an `mcp-server/` directory. Long-term, agents interact with Atomizer through MCP tools instead of direct file access:
```
Current: Agent reads /mnt/atomizer-repo/docs/protocols/OP_01.md
Future: Agent calls mcp.atomizer.get_protocol("OP_01")
Current: Agent reads study.db directly
Future: Agent calls mcp.atomizer.get_study_status("starspec-m1")
Current: Agent writes job.json to /mnt/job-queue/
Future: Agent calls mcp.atomizer.submit_job({...})
```
**When to build:** After Phase 2 succeeds. MCP provides better abstraction but adds complexity. Don't block on it.
### 10.2 Client Portal
Eventually, clients could interact with a stripped-down interface:
```
Client → Web portal → Reporter agent (read-only)
→ Secretary agent (questions)
→ Manager agent (status updates)
```
**Prerequisites:** Authentication, access control, sanitized output. Phase 4+ at earliest.
### 10.3 Voice Interface
Antoine's walkthroughs are already voice-first (screen recording + transcript). Future:
```
Antoine speaks → Whisper transcription → Secretary agent processes
→ KB agent indexes model knowledge
```
Could integrate with Clawdbot's existing TTS/voice capabilities. Medium priority.
### 10.4 Dashboard Agent Integration
The Atomizer Dashboard runs on Windows. Future integration:
```
Dashboard API (FastAPI backend)
Agent makes API calls to dashboard
Dashboard reflects agent-triggered study state
```
Requires: Dashboard API accessible from Docker (via Tailscale). Medium priority.
### 10.5 Scaling to Multiple Projects
Current architecture handles this naturally:
- Each project gets a channel (`#clienta-thermal`, `#clientb-modal`)
- Manager tracks all projects in its memory
- Agents serve multiple projects (stateless between projects, context via memory/)
- Concurrency limit: `maxConcurrent: 4` prevents overload
**When project volume increases:**
- Add a second Manager (e.g., Manager A for client work, Manager B for internal development)
- Split into sub-teams with dedicated channels
- Consider model downgrades for routine projects to save costs
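A second Manager would be expressed entirely in the `bindings` array, following the pattern from Appendix A. A sketch; the agent ids and channel routing below are hypothetical, not current config:

```json5
// Illustrative additions to the "bindings" array (Appendix A style).
{ "agentId": "manager-a",   // client work
  "match": { "channel": "slack", "peer": { "kind": "group", "name": "#clienta-thermal" } } },
{ "agentId": "manager-b",   // internal development
  "match": { "channel": "slack", "peer": { "kind": "group", "name": "#internal" } } },
// Keep a single low-priority catch-all so unbound channels still land somewhere:
{ "agentId": "manager-a",
  "match": { "channel": "slack", "peer": { "kind": "group" } },
  "priority": -1 }
```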
### 10.6 Multi-Model Evolution
Architecture is model-agnostic. When new models release:
```json5
// Just update clawdbot.json:
{
"id": "optimizer",
"model": "anthropic/claude-opus-5-0" // Upgrade when available
}
```
Current model assignments (ref DEC-A008):
| Role | Current | Upgrade Path |
|------|---------|-------------|
| Manager, Auditor | Opus 4.6 | Opus 5.x when available |
| Technical, Optimizer | Opus 4.6 | Opus 5.x or specialized reasoning model |
| NX Expert, Post-Proc, Reporter | Sonnet 5 | Keep on latest Sonnet |
| Study Builder | GPT-5.3-Codex | Codex 6.x or Claude Code model |
| Researcher | Gemini 3.0 | Gemini 3.x or Gemini Ultra |
| Developer, KB, IT | Sonnet 5 | Keep on latest Sonnet |
---
## Appendix A: Complete clawdbot.json (Phase 3 — Full Company)
```json5
{
"agents": {
"defaults": {
"model": {
"primary": "anthropic/claude-opus-4-6",
"fallbacks": ["anthropic/claude-sonnet-5-20260203"]
},
"compaction": {
"mode": "safeguard",
"memoryFlush": { "enabled": true }
},
"maxConcurrent": 4,
"subagents": {
"maxConcurrent": 6,
"model": "anthropic/claude-sonnet-5-20260203"
},
"heartbeat": {
"every": "30m"
}
},
"list": [
// Core
{ "id": "manager", "name": "The Manager", "default": true,
"workspace": "~/clawd-atomizer-manager",
"model": "anthropic/claude-opus-4-6",
"identity": { "name": "The Manager", "emoji": "🎯" }
},
{ "id": "secretary", "name": "The Secretary",
"workspace": "~/clawd-atomizer-secretary",
"model": "anthropic/claude-opus-4-6",
"identity": { "name": "The Secretary", "emoji": "📋" }
},
{ "id": "technical", "name": "The Technical Lead",
"workspace": "~/clawd-atomizer-technical",
"model": "anthropic/claude-opus-4-6",
"identity": { "name": "The Technical Lead", "emoji": "🔧" }
},
{ "id": "optimizer", "name": "The Optimizer",
"workspace": "~/clawd-atomizer-optimizer",
"model": "anthropic/claude-opus-4-6",
"identity": { "name": "The Optimizer", "emoji": "⚡" }
},
{ "id": "study-builder", "name": "The Study Builder",
"workspace": "~/clawd-atomizer-study-builder",
"model": "openai/gpt-5.3-codex",
"identity": { "name": "The Study Builder", "emoji": "🏗️" }
},
// Specialists
{ "id": "nx-expert", "name": "The NX Expert",
"workspace": "~/clawd-atomizer-nx-expert",
"model": "anthropic/claude-sonnet-5-20260203",
"identity": { "name": "The NX Expert", "emoji": "🖥️" }
},
{ "id": "postprocessor", "name": "The Post-Processor",
"workspace": "~/clawd-atomizer-postprocessor",
"model": "anthropic/claude-sonnet-5-20260203",
"identity": { "name": "The Post-Processor", "emoji": "📊" }
},
{ "id": "reporter", "name": "The Reporter",
"workspace": "~/clawd-atomizer-reporter",
"model": "anthropic/claude-sonnet-5-20260203",
"identity": { "name": "The Reporter", "emoji": "📝" }
},
{ "id": "auditor", "name": "The Auditor",
"workspace": "~/clawd-atomizer-auditor",
"model": "anthropic/claude-opus-4-6",
"identity": { "name": "The Auditor", "emoji": "🔍" }
},
// Support
{ "id": "researcher", "name": "The Researcher",
"workspace": "~/clawd-atomizer-researcher",
"model": "google/gemini-3.0-pro",
"identity": { "name": "The Researcher", "emoji": "🔬" }
},
{ "id": "developer", "name": "The Developer",
"workspace": "~/clawd-atomizer-developer",
"model": "anthropic/claude-sonnet-5-20260203",
"identity": { "name": "The Developer", "emoji": "💻" }
},
{ "id": "knowledge-base", "name": "The Knowledge Base",
"workspace": "~/clawd-atomizer-kb",
"model": "anthropic/claude-sonnet-5-20260203",
"identity": { "name": "The Knowledge Base", "emoji": "🗄️" }
},
{ "id": "it-support", "name": "IT Support",
"workspace": "~/clawd-atomizer-it",
"model": "anthropic/claude-sonnet-5-20260203",
"identity": { "name": "IT Support", "emoji": "🛠️" }
}
]
},
"bindings": [
// Manager: HQ + fallback for all unbound channels
{ "agentId": "manager",
"match": { "channel": "slack", "peer": { "kind": "group", "name": "#hq" } } },
// Secretary: dedicated channel + DMs
{ "agentId": "secretary",
"match": { "channel": "slack", "peer": { "kind": "group", "name": "#secretary" } } },
{ "agentId": "secretary",
"match": { "channel": "slack", "peer": { "kind": "dm" } } },
// Specialized channels
{ "agentId": "auditor",
"match": { "channel": "slack", "peer": { "kind": "group", "name": "#audit-log" } } },
{ "agentId": "researcher",
"match": { "channel": "slack", "peer": { "kind": "group", "name": "#research" } } },
{ "agentId": "developer",
"match": { "channel": "slack", "peer": { "kind": "group", "name": "#dev" } } },
{ "agentId": "knowledge-base",
"match": { "channel": "slack", "peer": { "kind": "group", "name": "#knowledge-base" } } },
// Framework evolution → Manager (steward role)
{ "agentId": "manager",
"match": { "channel": "slack", "peer": { "kind": "group", "name": "#framework-evolution" } } },
// Fallback: all other channels → Manager
{ "agentId": "manager",
"match": { "channel": "slack", "peer": { "kind": "group" } },
"priority": -1 }
],
"tools": {
"web": {
"search": { "enabled": true, "apiKey": "..." },
"fetch": { "enabled": true }
}
},
"channels": {
"slack": {
"mode": "socket",
"enabled": true,
"botToken": "xoxb-atomizer-...",
"appToken": "xapp-atomizer-...",
"requireMention": false,
"groupPolicy": "open",
"dm": {
"enabled": true,
"policy": "open"
}
}
},
"gateway": {
"port": 18790,
"mode": "local",
"bind": "0.0.0.0",
"auth": {
"mode": "token",
"token": "..."
}
}
}
```
---
## Appendix B: Decision Log References
| Decision | Reference | Status | Impact on This Plan |
|----------|-----------|--------|---------------------|
| Clawdbot platform | DEC-A001 | ✅ | Entire architecture built on Clawdbot |
| Phased rollout | DEC-A002 | ✅ | Phase 0 → 3 in Section 8 |
| Manager bottleneck | DEC-A003 | ✅ | Communication hierarchy in Section 6 |
| Single gateway | DEC-A004 | ✅ | Docker setup in Section 1 |
| Model tiering | DEC-A005 + DEC-A008 | ✅ | Agent configs throughout |
| Dedicated Slack | DEC-A006 | ✅ | Slack architecture in Section 4 |
| Study Builder agent | DEC-A007 | ✅ | Study Builder in all sections |
| Latest models | DEC-A008 | ✅ | Model assignments in configs |
| Autonomous + gates | DEC-A009 | ✅ | Approval gates in Section 6 |
| Manager as steward | DEC-A010 | ✅ | Manager SOUL.md in Section 3 |
**Decisions resolved by this plan:**
| Decision | Resolution |
|----------|-----------|
| DEC-A011: Windows execution | Syncthing job queue (primary) + Tailscale SSH (optional). Section 5 |
| DEC-A012: Separate vs extend gateway | **Separate Docker gateway.** Section 1 |
| DEC-A013: One bot vs per-agent bots | **Single bot, agent prefixes in messages.** Section 4 |
| DEC-A014: KB Agent processing | **File watcher + heartbeat poll.** Section 3 |
---
## Appendix C: Cost Model
### Per-Phase Monthly Estimates
| Phase | Active Agents | Model Mix | Est. Monthly Cost |
|-------|--------------|-----------|-------------------|
| Phase 0 | 3 (Opus 4.6) | All Opus | $50–100 |
| Phase 1 | 5 (all Opus) | All Opus | $100–200 |
| Phase 2 | 9 (5 Opus + 4 Sonnet) | Mixed | $200–350 |
| Phase 3 | 13 (5 Opus + 6 Sonnet + 1 Codex + 1 Gemini) | Mixed | $300–500 |
### Per-Project Estimates
| Project Type | Agents Involved | Turns | Est. Cost |
|-------------|----------------|-------|-----------|
| Simple optimization | Manager, Technical, Optimizer, Study Builder | ~40 | $15–25 |
| Full pipeline | All core + specialists | ~70 | $25–40 |
| Complex multi-phase | All agents | ~120 | $40–65 |
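The estimates above imply a fairly stable cost per agent turn, which is the number the controls below are really bounding. A quick back-of-envelope check, using only the figures from the table (this is arithmetic on the estimates, not a pricing model):

```python
def per_turn(turns: int, low: float, high: float) -> tuple[float, float]:
    """Dollars-per-turn range implied by a (low, high) project estimate."""
    return (low / turns, high / turns)


# (turns, low $, high $) taken straight from the per-project table above
for name, (turns, low, high) in {
    "simple optimization": (40, 15, 25),
    "full pipeline": (70, 25, 40),
    "complex multi-phase": (120, 40, 65),
}.items():
    lo, hi = per_turn(turns, low, high)
    print(f"{name}: ${lo:.2f}-${hi:.2f} per turn")
```

All three project types land in roughly the $0.33-$0.63 per-turn band, so turn count is the main cost driver to watch.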
### Cost Controls
1. Wake-on-demand: agents only activate when @-mentioned
2. Heartbeat interval: 30 min (not 5 min)
3. Sub-agent timeouts: `runTimeoutSeconds: 600` (10 min max)
4. Context pruning: `cache-ttl: 1h`
5. Session auto-archive after idle
6. Sonnet for routine work, Opus only for reasoning-heavy tasks
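Controls 2 and 3 map directly onto the `agents.defaults` block from Appendix A. A sketch; the placement of `runTimeoutSeconds` under `subagents` is an assumption about the config schema:

```json5
// Fragment of agents.defaults: cost-control knobs in one place
{
  "heartbeat": { "every": "30m" },           // control 2: infrequent heartbeats
  "subagents": { "runTimeoutSeconds": 600 }  // control 3: cap runaway sub-agents (key placement assumed)
}
```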
---
*This is the implementation blueprint. Everything above is actionable — not theoretical. When Antoine gives the green light, execute Phase 0 step by step. The company starts with 3 agents and grows to 13 over 10 weeks.*
*Written: 2026-02-08 by Mario*
*References: 00-PROJECT-PLAN, 01-AGENT-ROSTER, 02-ARCHITECTURE, 03-ROADMAP, 04-DECISION-LOG*