
Antoine cf82de4f06 docs: add HQ multi-agent framework documentation from PKM

- Project plan, agent roster, architecture, roadmap
- Decision log, full system plan, Discord setup/migration guides
- System implementation status (as-built)
- Cluster pivot history
- Orchestration engine plan (Phases 1-4)
- Webster and Auditor reviews

2026-02-15 21:44:07 +00:00


🏗️ Architecture — Atomizer Engineering Co.

Technical architecture: Clawdbot configuration, Slack setup, memory systems, and infrastructure.

1. Clawdbot Multi-Agent Configuration

Config Structure (clawdbot.json)

This is the core gateway configuration. Each agent is defined with its own workspace, model, identity, and tools.

{
  agents: {
    list: [
      // === CORE AGENTS ===
      {
        id: "manager",
        name: "The Manager",
        default: false,
        workspace: "~/clawd-atomizer-manager",
        model: "anthropic/claude-opus-4-6",
        identity: {
          name: "The Manager",
          emoji: "🎯",
        },
        // Manager sees all project channels
      },
      {
        id: "secretary",
        name: "The Secretary",
        workspace: "~/clawd-atomizer-secretary",
        model: "anthropic/claude-opus-4-6",
        identity: {
          name: "The Secretary",
          emoji: "📋",
        },
      },
      {
        id: "technical",
        name: "The Technical Lead",
        workspace: "~/clawd-atomizer-technical",
        model: "anthropic/claude-opus-4-6",
        identity: {
          name: "The Technical Lead",
          emoji: "🔧",
        },
      },
      {
        id: "optimizer",
        name: "The Optimizer",
        workspace: "~/clawd-atomizer-optimizer",
        model: "anthropic/claude-opus-4-6",
        identity: {
          name: "The Optimizer",
          emoji: "⚡",
        },
      },

      // === SPECIALISTS (Phase 2) ===
      {
        id: "nx-expert",
        name: "The NX Expert",
        workspace: "~/clawd-atomizer-nx-expert",
        model: "anthropic/claude-sonnet-5",
        identity: {
          name: "The NX Expert",
          emoji: "🖥️",
        },
      },
      {
        id: "postprocessor",
        name: "The Post-Processor",
        workspace: "~/clawd-atomizer-postprocessor",
        model: "anthropic/claude-sonnet-5",
        identity: {
          name: "The Post-Processor",
          emoji: "📊",
        },
      },
      {
        id: "reporter",
        name: "The Reporter",
        workspace: "~/clawd-atomizer-reporter",
        model: "anthropic/claude-sonnet-5",
        identity: {
          name: "The Reporter",
          emoji: "📝",
        },
      },
      {
        id: "auditor",
        name: "The Auditor",
        workspace: "~/clawd-atomizer-auditor",
        model: "anthropic/claude-opus-4-6",
        identity: {
          name: "The Auditor",
          emoji: "🔍",
        },
      },

      {
        id: "study-builder",
        name: "The Study Builder",
        workspace: "~/clawd-atomizer-study-builder",
        model: "openai/gpt-5.3-codex",  // or anthropic/claude-opus-4-6
        identity: {
          name: "The Study Builder",
          emoji: "🏗️",
        },
      },

      // === SUPPORT (Phase 3) ===
      {
        id: "researcher",
        name: "The Researcher",
        workspace: "~/clawd-atomizer-researcher",
        model: "google/gemini-3.0",
        identity: {
          name: "The Researcher",
          emoji: "🔬",
        },
      },
      {
        id: "developer",
        name: "The Developer",
        workspace: "~/clawd-atomizer-developer",
        model: "anthropic/claude-sonnet-5",
        identity: {
          name: "The Developer",
          emoji: "💻",
        },
      },
      {
        id: "knowledge-base",
        name: "The Knowledge Base",
        workspace: "~/clawd-atomizer-kb",
        model: "anthropic/claude-sonnet-5",
        identity: {
          name: "The Knowledge Base",
          emoji: "🗄️",
        },
      },
      {
        id: "it-support",
        name: "IT Support",
        workspace: "~/clawd-atomizer-it",
        model: "anthropic/claude-sonnet-5",
        identity: {
          name: "IT Support",
          emoji: "🛠️",
        },
      },
    ],
  },

  // Route Slack channels to agents
  bindings: [
    // Manager gets HQ and all project channels
    { agentId: "manager", match: { channel: "slack", peer: { kind: "group", id: "CHID_atomizer_hq" } } },
    
    // Secretary gets its own channel
    { agentId: "secretary", match: { channel: "slack", peer: { kind: "group", id: "CHID_atomizer_secretary" } } },
    
    // Project channels → Manager (who then @mentions specialists)
    // Or use thread-based routing once available
    
    // Specialized channels
    { agentId: "researcher", match: { channel: "slack", peer: { kind: "group", id: "CHID_atomizer_research" } } },
    { agentId: "developer", match: { channel: "slack", peer: { kind: "group", id: "CHID_atomizer_dev" } } },
    { agentId: "knowledge-base", match: { channel: "slack", peer: { kind: "group", id: "CHID_atomizer_kb" } } },
  ],
}

⚠️ Note: The channel IDs (CHID_*) are placeholders. Replace with actual Slack channel IDs after creating them.
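
Since the bindings ship with placeholder IDs, a quick check before first boot can catch any that were missed. The following is a minimal sketch, not part of Clawdbot: the helper name `find_placeholder_ids` is hypothetical, and it assumes the config is available as a plain-text file.

```shell
# Hypothetical helper (not a Clawdbot command): report CHID_ placeholders
# that are still present in a config file.
# Exit status: 0 = no placeholders, 1 = placeholders found, 2 = unreadable file.
find_placeholder_ids() {
  if [ ! -r "$1" ]; then
    echo "cannot read config: $1" >&2
    return 2
  fi
  # grep exits 0 when it finds a match, so a match means "not ready".
  # -n prints each offending line prefixed with its line number.
  if grep -n 'CHID_' "$1"; then
    return 1
  fi
  return 0
}

# Usage (path is an assumption; point it at wherever clawdbot.json lives):
#   find_placeholder_ids ./clawdbot.json && echo "bindings look ready"
```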

Key Architecture Decision: Single Gateway vs Multiple

Recommendation: Single Gateway, Multiple Agents

One Clawdbot gateway process hosts all agents. Benefits:

- Shared infrastructure (one process to manage)
- sessions_send for inter-agent communication
- sessions_spawn for sub-agent heavy lifting
- Single config file to manage

If resource constraints become an issue later, we can split into multiple gateways on different machines.

2. Workspace Layout

Each agent gets a workspace following Clawdbot conventions:

~/clawd-atomizer-manager/
├── AGENTS.md              ← Operating instructions, protocol rules
├── SOUL.md                ← Personality, tone, boundaries  
├── TOOLS.md               ← Local tool notes
├── MEMORY.md              ← Long-term role-specific memory
├── IDENTITY.md            ← Name, emoji, avatar
├── memory/                ← Per-project memory files
│   ├── starspec-wfe-opt.md
│   └── client-b-thermal.md
└── skills/                ← Agent-specific skills
    └── (agent-specific)
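
A workspace can be checked against this layout with a few lines of shell. This is a sketch under the assumption that the tree above is the complete required set; the function name `missing_workspace_files` is made up for illustration.

```shell
# Hypothetical helper: list the standard workspace entries that are absent.
# $1 = workspace directory. Prints one missing entry per line and prints
# nothing when the layout is complete.
missing_workspace_files() {
  for name in AGENTS.md SOUL.md TOOLS.md MEMORY.md IDENTITY.md; do
    [ -f "$1/$name" ] || echo "$name"
  done
  for sub in memory skills; do
    [ -d "$1/$sub" ] || echo "$sub/"
  done
}

# Usage:
#   missing_workspace_files ~/clawd-atomizer-manager
```

Note that a freshly bootstrapped workspace would be reported as missing TOOLS.md, because the bootstrap script in this section does not create that file.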

Shared Skills (all agents)

~/.clawdbot/skills/
├── atomizer-protocols/    ← Company protocols
│   ├── SKILL.md
│   ├── QUICK_REF.md       ← One-page cheatsheet
│   └── protocols/
│       ├── OP_01_study_lifecycle.md
│       ├── OP_02_...
│       └── SYS_18_...
└── atomizer-company/      ← Company identity + shared knowledge
    ├── SKILL.md
    └── COMPANY.md          ← Who we are, how we work, agent directory

Workspace Bootstrap Script

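#!/bin/bash
# create-agent-workspace.sh <agent-id> <agent-name> <emoji>
# Scaffolds ~/clawd-atomizer-<agent-id> with the standard workspace files.
set -euo pipefail

# All three arguments are required; bail out early with a usage hint.
if [ "$#" -ne 3 ]; then
  echo "Usage: $0 <agent-id> <agent-name> <emoji>" >&2
  exit 1
fi

AGENT_ID=$1
AGENT_NAME=$2
EMOJI=$3
DIR="$HOME/clawd-atomizer-$AGENT_ID"

mkdir -p "$DIR/memory" "$DIR/skills"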

cat > "$DIR/IDENTITY.md" << EOF
# IDENTITY.md
- **Name:** $AGENT_NAME
- **Emoji:** $EMOJI
- **Role:** Atomizer Engineering Co. — $AGENT_NAME
- **Company:** Atomizer Engineering Co.
EOF

cat > "$DIR/SOUL.md" << EOF
# SOUL.md — $AGENT_NAME

You are $AGENT_NAME at Atomizer Engineering Co., a multi-agent FEA optimization firm.

## Core Rules
- Follow all Atomizer protocols (see atomizer-protocols skill)
- Respond when @-mentioned in Slack channels
- Stay in your lane — delegate outside your expertise
- Update your memory after significant work
- Be concise in Slack — expand in documents

## Communication
- In Slack: concise, structured, use threads
- For reports/documents: thorough, professional
- When uncertain: ask, don't guess
EOF

cat > "$DIR/AGENTS.md" << EOF
# AGENTS.md — $AGENT_NAME

## Session Init
1. Read SOUL.md
2. Read MEMORY.md  
3. Check memory/ for active project context
4. Check which channel/thread you're in for context

## Memory
- memory/*.md = per-project notes
- MEMORY.md = role-specific long-term knowledge
- Write down lessons learned after every project

## Protocols
Load the atomizer-protocols skill for protocol reference.
EOF

cat > "$DIR/MEMORY.md" << EOF
# MEMORY.md — $AGENT_NAME

## Role Knowledge

*(To be populated as the agent works)*

## Lessons Learned

*(Accumulated over time)*
EOF

echo "Created workspace: $DIR"
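
To scaffold several agents in one pass, the script can be driven from a small roster. The sketch below assumes a pipe-separated roster format (`id|name|emoji`), which is an illustration and not something the original defines; `bootstrap_roster` is likewise a hypothetical name.

```shell
# Hypothetical driver: run a per-agent command once for each roster line.
# $1    = command to run (e.g. ./create-agent-workspace.sh)
# stdin = roster lines in the assumed form: id|name|emoji
bootstrap_roster() {
  while IFS='|' read -r id name emoji; do
    # Skip blank lines.
    [ -n "$id" ] || continue
    # </dev/null keeps the command from swallowing the rest of the roster.
    "$1" "$id" "$name" "$emoji" </dev/null || return 1
  done
}

# Usage:
#   bootstrap_roster ./create-agent-workspace.sh <<'ROSTER'
#   manager|The Manager|🎯
#   secretary|The Secretary|📋
#   ROSTER
```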

3. Slack Workspace Architecture

Dedicated Slack Workspace: "Atomizer Engineering"

The company gets its own Slack workspace, separate from Antoine's personal workspace. It should look professional, clean, and product-ready for video content and demos.

Workspace name: Atomizer Engineering (or atomizer-eng.slack.com)

Permanent Channels

Channel	Purpose	Bound Agent	Who's There
`#hq`	Company coordination, general discussion	Manager	All agents can be summoned
`#secretary`	Antoine's dashboard, directives	Secretary	Secretary + Antoine
`#research`	Research requests and findings	Researcher	Researcher, anyone can ask
`#dev`	Development and coding work	Developer	Developer, Manager
`#knowledge-base`	Knowledge base maintenance	Knowledge Base	KB Agent, anyone can ask
`#audit-log`	Auditor findings and reviews	Auditor	Auditor, Manager

Project Channels (Created Per Client Job)

Naming convention: #<client>-<short-description>

Examples:

#starspec-m1-wfe
#clientb-thermal-opt
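
The naming convention can be applied mechanically. This is a rough sketch with hypothetical function names; it lowercases the input and collapses anything that is not a letter or digit into a single hyphen, which is stricter than the convention itself requires.

```shell
# Lowercase a string and collapse runs of non-alphanumerics into one hyphen,
# trimming any leading or trailing hyphen.
slugify() {
  printf '%s\n' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9][^a-z0-9]*/-/g' -e 's/^-//' -e 's/-$//'
}

# Build <client>-<short-description> (without the leading '#').
project_channel() {
  printf '%s-%s\n' "$(slugify "$1")" "$(slugify "$2")"
}

project_channel "StarSpec" "M1 WFE"        # prints: starspec-m1-wfe
project_channel "ClientB" "Thermal Opt"    # prints: clientb-thermal-opt
```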

R&D / Development Channels

For developing new Atomizer capabilities — vibration tools, fatigue analysis, non-linear methods, new extractors, etc. Antoine works directly with agents here to explore, prototype, and build.

Naming convention: #rd-<topic>

Channel	Purpose	Key Agents
`#rd-vibration`	Develop vibration/modal analysis tools	Technical Lead, Developer, Researcher
`#rd-fatigue`	Fatigue analysis capabilities	Technical Lead, Developer, NX Expert
`#rd-nonlinear`	Non-linear solver integration	Technical Lead, NX Expert, Researcher
`#rd-surrogates`	GNN/surrogate model improvements	Optimizer, Developer, Researcher
`#rd-extractors`	New data extractors	Developer, Post-Processor, Study Builder

How R&D channels work:

1. Antoine creates #rd-<topic> and posts the idea or problem
2. Manager routes to Technical Lead as the R&D point person
3. Technical Lead breaks down the R&D challenge and consults Researcher on the state of the art
4. Developer prototypes, Auditor validates, Antoine reviews and steers
5. Once mature, the work becomes a standard capability (new protocol, new extractor, new skill)
6. Manager (as Framework Steward) ensures it is properly integrated into the Atomizer framework

Antoine's role in R&D channels:

- Ask questions, poke around, explore ideas
- The agents are his collaborators, not just executors
- Technical Lead acts as the R&D conversation partner: it understands the engineering and translates it into actionable dev work
- Antoine can say "what if we tried X?" and the team runs with it

Lifecycle:

1. Antoine or Manager creates channel
2. Manager is invited (auto-bound)
3. Manager invites relevant agents as needed
4. After project completion: archive channel

Thread Discipline

Within project channels, use threads for:

- Each distinct task or subtask
- Agent-to-agent technical discussion
- Review cycles (auditor feedback → fixes → re-review)

Main channel timeline should read like a project log:

[Manager] 🎯 Project kickoff: StarSpec M1 WFE optimization
[Technical] 🔧 Technical breakdown complete → [thread]
[Optimizer] ⚡ Algorithm recommendation → [thread]  
[Manager] 🎯 Study approved. Launching optimization.
[Post-Processor] 📊 Results ready, 23% WFE improvement → [thread]
[Auditor] 🔍 Audit PASSED with 2 notes → [thread]
[Reporter] 📝 Report draft ready for review → [thread]
[Secretary] 📋 @antoine — Report ready, please review

4. Inter-Agent Communication

Primary: Slack @Mentions

Agents communicate by @-mentioning each other in project channels:

Manager: "@technical, new job. Break down the attached requirements."
Technical: "@manager, breakdown complete. Recommending @optimizer review the parameter space."
Manager: "@optimizer, review Technical's breakdown in this thread."

Secondary: sessions_send (Direct)

For urgent or private communication that shouldn't be in Slack:

sessions_send(agentId: "auditor", message: "Emergency: results look non-physical...")

Tertiary: sessions_spawn (Heavy Tasks)

For compute-heavy work that shouldn't block the agent:

sessions_spawn(agentId: "postprocessor", task: "Generate full Zernike decomposition for trials 47-95...")

Communication Rules

- All project communication in project channels (traceability)
- Technical discussions in threads (keep channels clean)
- Only Manager initiates cross-agent work (except Secretary → Antoine)
- Auditor can interrupt any thread (review authority)
- sessions_send for emergencies only (not routine)
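
The three tiers and the rules above boil down to a simple decision. The sketch below is illustrative only: the message kinds (`emergency`, `heavy`, anything else) and the function name are assumptions made for this example, not Clawdbot concepts.

```shell
# Hypothetical decision helper: which transport fits a message?
# $1 = message kind (assumed labels: emergency | heavy | anything else)
pick_transport() {
  case "$1" in
    emergency) echo "sessions_send" ;;    # urgent or private, never routine
    heavy)     echo "sessions_spawn" ;;   # compute-heavy work off the main agent
    *)         echo "slack-thread" ;;     # default: visible and traceable in Slack
  esac
}

pick_transport emergency   # prints: sessions_send
pick_transport routine     # prints: slack-thread
```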

5. Memory System Implementation

Company Memory (Shared Skill)

~/.clawdbot/skills/atomizer-protocols/
├── SKILL.md
│   description: "Atomizer Engineering Co. protocols and procedures"
│   read_when: "Working on any Atomizer project"
├── QUICK_REF.md           ← Most agents load this
├── COMPANY.md             ← Company identity, values, how we work
├── protocols/
│   ├── OP_01_study_lifecycle.md
│   ├── OP_02_study_creation.md
│   ├── OP_03_optimization.md
│   ├── OP_04_results.md
│   ├── OP_05_reporting.md
│   ├── OP_06_troubleshooting.md
│   ├── OP_07_knowledge.md
│   ├── OP_08_delivery.md
│   ├── SYS_10_file_management.md
│   ├── SYS_11_nx_sessions.md
│   ├── SYS_12_solver_config.md
│   ├── SYS_13_extractors.md
│   ├── SYS_14_hooks.md
│   ├── SYS_15_surrogates.md
│   ├── SYS_16_dashboard.md
│   ├── SYS_17_insights.md
│   └── SYS_18_validation.md
└── lac/
    ├── critical_lessons.md  ← Hard-won insights from LAC
    └── algorithm_guide.md   ← When to use which algorithm

Agent Memory Lifecycle

New Project Starts
  │
  ├─ Agent reads: MEMORY.md (long-term knowledge)
  ├─ Agent checks: memory/<project>.md (if returning to existing project)
  │
  ├─ During project: updates memory/<project>.md with decisions, findings
  │
  └─ Project Ends
      ├─ Agent distills lessons → updates MEMORY.md
      └─ memory/<project>.md archived (kept for reference)
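
As a file-level illustration of this lifecycle, the sketch below appends notes during a project and archives the project file at the end. The function names and the `memory/archive/` location are assumptions; the original only says the file is archived and kept for reference.

```shell
# Append a dated note to memory/<project>.md.
# $1 = workspace dir, $2 = project slug, $3 = note text
note_project() {
  mkdir -p "$1/memory"
  printf '%s\n' "- $(date +%Y-%m-%d): $3" >> "$1/memory/$2.md"
}

# At project end: record one distilled lesson in MEMORY.md, then move the
# project file to memory/archive/ (assumed location).
# $1 = workspace dir, $2 = project slug, $3 = lesson text
close_project() {
  [ -f "$1/memory/$2.md" ] || return 1
  printf '%s\n' "- [$2] $3" >> "$1/MEMORY.md"
  mkdir -p "$1/memory/archive"
  mv "$1/memory/$2.md" "$1/memory/archive/$2.md"
}

# Usage:
#   note_project ~/clawd-atomizer-optimizer starspec-wfe-opt "decision or finding"
#   close_project ~/clawd-atomizer-optimizer starspec-wfe-opt "lesson learned"
```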

Agents share knowledge through:

- Slack channels: conversations are visible to all invited agents
- Shared skill files: updated protocols/lessons accessible to all
- Git repo: Atomizer repo synced via Syncthing
- KB Agent: can be asked "what do we know about X?"

6. Infrastructure Diagram

┌────────────────────────────────────────────────────────────────┐
│                    CLAWDBOT SERVER (Linux)                      │
│                                                                │
│  ┌──────────────────────────────────────────────────────┐      │
│  │              Clawdbot Gateway                         │      │
│  │                                                       │      │
│  │  ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐    │      │
│  │  │Manager  │ │Secretary│ │Technical│ │Optimizer│    │      │
│  │  │Agent    │ │Agent    │ │Agent    │ │Agent    │    │      │
│  │  └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘    │      │
│  │       │           │           │           │          │      │
│  │  ┌────┴────┐ ┌────┴────┐ ┌────┴────┐ ┌────┴────┐    │      │
│  │  │NX Expert│ │PostProc │ │Reporter │ │Auditor  │    │      │
│  │  │Agent    │ │Agent    │ │Agent    │ │Agent    │    │      │
│  │  └─────────┘ └─────────┘ └─────────┘ └─────────┘    │      │
│  │       + Researcher, Developer, KB, IT                 │      │
│  └──────────────────────┬────────────────────────────────┘      │
│                         │                                       │
│  ┌──────────────────────▼────────────────────────────────┐      │
│  │              Shared Resources                          │      │
│  │  /home/papa/repos/Atomizer/     (Git, via Syncthing)  │      │
│  │  /home/papa/obsidian-vault/     (PKM, via Syncthing)  │      │
│  │  /home/papa/ATODrive/           (Work docs)           │      │
│  │  ~/.clawdbot/skills/atomizer-*/ (Shared skills)       │      │
│  └───────────────────────────────────────────────────────┘      │
│                         │                                       │
│                    Syncthing                                    │
│                         │                                       │
└─────────────────────────┼───────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│                    WINDOWS (Antoine's PC)                        │
│                                                                 │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐          │
│  │ NX/Simcenter │  │ Claude Code  │  │ Atomizer     │          │
│  │ (FEA Solver) │  │ (Local)      │  │ Dashboard    │          │
│  └──────────────┘  └──────────────┘  └──────────────┘          │
│                                                                 │
│  Study files synced to Linux via Syncthing                      │
└─────────────────────────────────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│                    SLACK WORKSPACE                               │
│                                                                 │
│  #hq  #secretary  #<client>-<project>  #rd-<topic>              │
│  #research  #dev  #knowledge-base  #audit-log                  │
│                                                                 │
│  All agents have Slack accounts via Clawdbot                    │
└─────────────────────────────────────────────────────────────────┘

7. Security & Isolation

Agent Access Boundaries

Agent	File Access	External Access	Special Permissions
Manager	Read Atomizer repo, PKM projects	Slack only	Can spawn sub-agents
Secretary	Read PKM, ATODrive	Slack + Email (draft only)	Can message Antoine directly
Technical	Read Atomizer repo, PKM projects	Slack only	—
Optimizer	Read/write study configs	Slack only	—
NX Expert	Read Atomizer repo, NX docs	Slack only	—
Post-Processor	Read study results, write plots	Slack only	—
Reporter	Read results, write reports	Slack + Email (with approval)	Atomaste report skill
Auditor	Read everything (audit scope)	Slack only	Veto power on deliverables
Researcher	Read Atomizer repo	Slack + Web search	Internet access
Developer	Read/write Atomizer repo	Slack only	Git operations
KB	Read/write PKM knowledge folders	Slack only	CAD Documenter skill
IT	Read system status	Slack only	System diagnostics

Principle of Least Privilege

- No agent has SSH access to external systems
- Email sending requires Antoine's approval (enforced in Secretary + Reporter AGENTS.md)
- Only Developer can write to the Atomizer repo
- Only Reporter + Secretary can draft client communications
- Auditor has read-all access (necessary for audit role)
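
The access table can be transcribed into a lookup, which is handy for scripts that need to check a boundary before acting. This is a sketch only: the function names and the access labels (`email-draft`, `web-search`, and so on) are invented here, and nothing below is enforced by Clawdbot itself.

```shell
# External access per agent id, transcribed from the table above.
# Prints a space-separated list; returns 1 for ids the table does not cover.
external_access() {
  case "$1" in
    secretary)  echo "slack email-draft" ;;
    reporter)   echo "slack email-with-approval" ;;
    researcher) echo "slack web-search" ;;
    manager|technical|optimizer|nx-expert|postprocessor|auditor|developer|knowledge-base|it-support)
                echo "slack" ;;
    *)          return 1 ;;
  esac
}

# Only the Developer may write to the Atomizer repo.
can_write_repo() {
  [ "$1" = "developer" ]
}

external_access researcher   # prints: slack web-search
```

The Study Builder has no row in the table, so `external_access study-builder` returns 1 here.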

8. Cost Estimation

Per-Project Estimate (Typical Optimization Job)

Phase	Agents Active	Estimated Turns	Estimated Cost
Intake	Manager, Technical, Secretary	~10 turns	~$2-4
Planning	Technical, Optimizer, NX Expert	~15 turns	~$5-8
Execution	Optimizer, Post-Processor	~20 turns	~$6-10
Analysis	Post-Processor, Auditor	~15 turns	~$5-8
Reporting	Reporter, Auditor, Secretary	~10 turns	~$4-6
Total		~70 turns	~$22-36

Based on current Anthropic API pricing for Opus 4.6 / Sonnet 5 with typical context lengths.
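
The totals row can be re-derived from the per-phase rows, which is useful when the estimates are revised. A small sketch, with a made-up function name and the table rows re-typed as `phase turns low_usd high_usd`:

```shell
# Sum turns and the low/high cost bounds over all phases read from stdin.
estimate_totals() {
  awk '{ turns += $2; low += $3; high += $4 }
       END { printf "%d turns, $%d-%d\n", turns, low, high }'
}

estimate_totals <<'EOF'
intake 10 2 4
planning 15 5 8
execution 20 6 10
analysis 15 5 8
reporting 10 4 6
EOF
# prints: 70 turns, $22-36
```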

Cost Optimization Strategies

- Wake-on-demand: Agents only activate when @-mentioned
- Tiered models: Support agents on cheaper models
- Sub-agent timeouts: runTimeoutSeconds prevents runaway sessions
- Session archiving: Auto-archive after 60 minutes of inactivity
- Context management: Keep AGENTS.md lean, load skills on-demand
- Batch operations: Secretary batches questions instead of individual pings

9. Autonomy Model — Bootstrap → Self-Maintain

Principle

Mario (main Clawdbot) bootstraps the Atomizer system. After that, the agents own themselves.

What Mario Does (One-Time Bootstrap)

Task	Description
Gateway config	`clawdbot.json` — agents, models, bindings
Slack setup	Create workspace, channels, bot app
Workspace scaffolding	Initial SOUL.md, AGENTS.md, IDENTITY.md per agent
Shared skills	Protocols, company identity, quick reference
Connection points	Syncthing job queue, repo mounts
First boot	Start the gateway, verify agents respond

What Agents Own (Post-Bootstrap)

Domain	Owner	Examples
Workspace files	Each agent	SOUL.md, AGENTS.md, TOOLS.md, MEMORY.md
Memory	Each agent	memory/*.md, MEMORY.md
Cron jobs & heartbeats	Each agent	Scheduling, periodic checks
Skills	Each agent (+ shared)	Installing new skills, evolving existing ones
Protocols	Manager + relevant agents	Updating, adding, deprecating protocols
Self-improvement	Each agent	Lessons learned, workflow tweaks, error recovery
Workspace organization	Each agent	Folder structure, tooling notes

Mario's Ongoing Role

- Peer/advisor, not infrastructure owner
- System resource oversight: T420 disk, CPU, ports (shared hardware)
- Emergency support: if the gateway breaks, Mario can help diagnose
- Not a gatekeeper: agents don't need Mario's permission to evolve

Why This Matters

If Mario does all infrastructure work, agents are puppets. The Atomizer Clawdbot should be as self-directed as Mario's own instance — reading its own files, updating its own memory, learning from mistakes, improving its processes. That's the whole point of autonomous agents.

Created: 2026-02-07 by Mario | Updated: 2026-02-08 (added autonomy model)
