feat: add Atomizer HQ multi-agent cluster infrastructure
- 8-agent OpenClaw cluster (Manager, Tech-Lead, Secretary, Auditor, Optimizer, Study-Builder, NX-Expert, Webster)
- Orchestration engine: orchestrate.py (sync delegation + handoffs)
- Workflow engine: YAML-defined multi-step pipelines
- Agent workspaces: SOUL.md, AGENTS.md, MEMORY.md per agent
- Shared skills: delegate, orchestrate, atomizer-protocols
- Capability registry (AGENTS_REGISTRY.json)
- Cluster management: cluster.sh, systemd template
- All secrets replaced with env var references
56
hq/shared/skills/README.md
Normal file
@@ -0,0 +1,56 @@
# Atomizer-HQ Shared Skills

## Overview

Shared skills are maintained by Mario and synced here for Atomizer-HQ use.

## Accessing Shared Skills

### knowledge-base (Design/FEA KB)

**Source:** `/home/papa/clawd/skills/knowledge-base/SKILL.md`
**Reference:** `/home/papa/obsidian-vault/2-Projects/Knowledge-Base-System/Development/SKILL-REFERENCE.md`

Before using this skill:

```bash
# Read the skill definition
cat /home/papa/clawd/skills/knowledge-base/SKILL.md

# Or read the quick reference
cat /home/papa/obsidian-vault/2-Projects/Knowledge-Base-System/Development/SKILL-REFERENCE.md
```

**Key commands:**

```bash
cad_kb.py status <project>    # KB status
cad_kb.py context <project>   # AI context
cad_kb.py cdr <project>       # CDR content
```

### atomaste-reports (PDF Reports)

**Source:** `/home/papa/clawd/skills/atomaste-reports/SKILL.md`

### fem-documenter (FEA KB) — PLANNED

**Concept:** `/home/papa/obsidian-vault/2-Projects/Knowledge-Base-System/Concepts/FEM-Documenter.md`

## Skill Updates

Mario maintains the master copies. To get the latest:

1. Check Mario's skill folder for updates
2. Read SKILL.md for the current API
3. Apply any Atomizer-specific extensions locally

## Extension Protocol

To extend a shared skill for Atomizer:

1. Create an extension file in this folder: `<skill>-atomizer-ext.md`
2. Document Atomizer-specific prompts, templates, and workflows
3. Extensions DON'T modify the original skill
4. Mario may incorporate useful extensions back upstream

## Current Extensions

(None yet — add Atomizer-specific extensions here)

---

*Last updated: 2026-02-09*
199
hq/shared/skills/knowledge-base-atomizer-ext.md
Normal file
@@ -0,0 +1,199 @@
# Knowledge Base — Atomizer Extension

> Extension of Mario's shared `knowledge-base` skill for Atomizer HQ's agentic workflow.
>
> **Base skill:** `/home/papa/clawd/skills/knowledge-base/SKILL.md`
> **This file:** Atomizer-specific conventions for how agents use the KB system.

---

## Key Differences from Base Skill

### Location

- **Base:** KB lives in the Obsidian vault (`/obsidian-vault/2-Projects/<Project>/KB/`)
- **Atomizer:** KB lives in the Atomizer repo (`/repos/Atomizer/projects/<project>/kb/`)
- Same structure, different home. Gitea-browseable, git-tracked.
### Input Sources

- **Base:** Primarily video session exports via CAD-Documenter
- **Atomizer:** Mixed sources:
  - CEO input via Slack channels
  - Agent-generated analysis (Tech Lead breakdowns, optimization results)
  - NX model introspection data
  - Automated study results
  - Video sessions (when applicable — uses the base skill pipeline)

### Contributors

- **Base:** A single AI (Mario) processes sessions
- **Atomizer:** Multiple agents contribute:
| Agent | Writes To | When |
|-------|-----------|------|
| Manager 🎯 | `_index.md`, `_history.md`, `dev/gen-XXX.md` | After each project phase |
| Technical Lead 🔧 | `fea/`, `components/` (technical sections) | During analysis + review |
| Optimizer ⚡ (future) | `fea/results/`, `components/` (optimization data) | After study completion |
| Study Builder 🏗️ (future) | Study configs, introspection data | During study setup |
| CEO (Antoine) | Any file via Gitea or Slack input | Anytime |
---

## Project Structure (Atomizer Standard)

```
projects/<project-name>/
├── README.md            # Project overview, status, links
├── CONTEXT.md           # Intake requirements, constraints
├── BREAKDOWN.md         # Technical analysis (Tech Lead)
├── DECISIONS.md         # Numbered decision log
│
├── models/              # Reference NX models (golden copies)
│   ├── *.prt, *.sim, *.fem
│   └── README.md
│
├── kb/                  # Living Knowledge Base
│   ├── _index.md        # Master overview (auto-maintained)
│   ├── _history.md      # Modification log per generation
│   ├── components/      # One file per component
│   ├── materials/       # Material data + cards
│   ├── fea/             # FEA knowledge
│   │   ├── models/      # Model setup docs
│   │   ├── load-cases/  # BCs, loads, conditions
│   │   └── results/     # Analysis outputs + validation
│   └── dev/             # Generation documents (gen-XXX.md)
│
├── images/              # Screenshots, plots, CAD renders
│   ├── components/
│   └── studies/
│
├── studies/             # Optimization campaigns
│   └── XX_<name>/
│       ├── README.md            # Study goals, findings
│       ├── atomizer_spec.json
│       ├── model/               # Study-specific model copy
│       │   └── CHANGES.md       # Delta from reference model
│       ├── introspection/       # Model discovery for this study
│       └── results/             # Outputs, plots, STUDY_REPORT.md
│
└── deliverables/        # Final client-facing outputs
    ├── FINAL_REPORT.md  # Compiled from KB
    └── RECOMMENDATIONS.md
```
---

## Agent Workflows

### 1. Project Intake (Manager)
```
CEO posts request → Manager creates:
- CONTEXT.md (from intake data)
- README.md (project overview)
- DECISIONS.md (empty template)
- kb/ structure (initialized)
- kb/dev/gen-001.md (intake generation)
→ Delegates technical breakdown to Tech Lead
```

### 2. Technical Breakdown (Tech Lead)
```
Manager delegates → Tech Lead produces:
- BREAKDOWN.md (full analysis)
- Updates kb/components/ with structural behavior
- Updates kb/fea/models/ with solver considerations
- Identifies gaps → listed in kb/_index.md
→ Manager creates gen-002 if there is substantial new knowledge
```

### 3. Model Introspection (Tech Lead / Study Builder)
```
Before each study:
- Copy reference models/ → studies/XX/model/
- Run NX introspection → studies/XX/introspection/
- Document changes in model/CHANGES.md
- Update kb/fea/ with any new model knowledge
```

### 4. Study Execution (Optimizer / Study Builder)
```
During/after optimization:
- Results written to studies/XX/results/
- STUDY_REPORT.md summarizes findings
- Key insights feed back into kb/:
  - Component sensitivities → kb/components/
  - FEA validation → kb/fea/results/
  - New generation doc → kb/dev/gen-XXX.md
```

### 5. Deliverable Compilation (Reporter / Manager)
```
When the project is complete:
- Compile kb/ → deliverables/FINAL_REPORT.md
- Use cad_kb.py cdr patterns for structured output
- Cross-reference DECISIONS.md for rationale
- Include key plots from images/ and studies/XX/results/plots/
```
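The intake step above (workflow 1) is mechanical enough to script. A minimal sketch, assuming the directory layout from the previous section; the `init_project` helper name is hypothetical, not part of `cad_kb.py`:

```python
from pathlib import Path

# Hypothetical scaffolding helper mirroring the Manager's intake checklist.
def init_project(root, name, context_text=""):
    """Scaffold projects/<name>/ with the Atomizer standard layout."""
    proj = Path(root) / "projects" / name
    for sub in ["models", "kb/components", "kb/materials",
                "kb/fea/models", "kb/fea/load-cases", "kb/fea/results",
                "kb/dev", "images/components", "images/studies",
                "studies", "deliverables"]:
        (proj / sub).mkdir(parents=True, exist_ok=True)
    (proj / "README.md").write_text(f"# {name}\n\nStatus: intake\n")
    (proj / "CONTEXT.md").write_text(context_text or "# Context\n")
    (proj / "DECISIONS.md").write_text("# Decisions\n")
    (proj / "kb" / "_index.md").write_text(f"# {name} — KB Index\n")
    (proj / "kb" / "_history.md").write_text("# KB History\n")
    (proj / "kb" / "dev" / "gen-001.md").write_text("# Gen 001 — Intake\n")
    return proj
```

After scaffolding, the Manager fills CONTEXT.md from the intake data and delegates the breakdown.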
---

## Generation Conventions

Each major project event creates a new generation document:

| Gen | Trigger | Author |
|-----|---------|--------|
| 001 | Project intake + initial breakdown | Manager |
| 002 | Gap resolution / model introspection | Tech Lead |
| 003 | DoE study complete (landscape insights) | Manager / Optimizer |
| 004 | Optimization complete (best design) | Manager / Optimizer |
| 005 | Validation / final review | Tech Lead |

Generation docs go in `kb/dev/gen-XXX.md` and follow the format:

```markdown
# Gen XXX — <Title>
**Date:** YYYY-MM-DD
**Sources:** <what triggered this>
**Author:** <agent>

## What Happened
## Key Findings
## KB Entries Created/Updated
## Decisions Made
## Open Items
## Next Steps
```
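Agents can stamp out a skeleton in this format and pick the next free generation number automatically. A minimal sketch; the `new_gen_doc` helper is hypothetical, not part of the base skill CLI:

```python
from datetime import date
from pathlib import Path

# Skeleton matching the generation-doc format above.
GEN_TEMPLATE = """# Gen {num:03d} — {title}
**Date:** {today}
**Sources:** {sources}
**Author:** {author}

## What Happened
## Key Findings
## KB Entries Created/Updated
## Decisions Made
## Open Items
## Next Steps
"""

def new_gen_doc(kb_dir, title, sources, author):
    """Create kb/dev/gen-XXX.md with the next free generation number."""
    dev = Path(kb_dir) / "dev"
    dev.mkdir(parents=True, exist_ok=True)
    num = len(list(dev.glob("gen-*.md"))) + 1  # next sequential generation
    path = dev / f"gen-{num:03d}.md"
    path.write_text(GEN_TEMPLATE.format(
        num=num, title=title, today=date.today().isoformat(),
        sources=sources, author=author))
    return path
```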
---

## Decision Log Conventions

All project decisions go in `DECISIONS.md`:

```markdown
## DEC-<PROJECT>-NNN: <Title>
- **Date:** YYYY-MM-DD
- **By:** <agent or person>
- **Decision:** <what was decided>
- **Rationale:** <why>
- **Status:** Proposed | Approved | Superseded by DEC-XXX
```

Agents MUST check DECISIONS.md before proposing changes that could contradict prior decisions.
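Appending entries in this shape can be automated so numbering stays sequential. A minimal sketch; the `log_decision` helper is hypothetical:

```python
from datetime import date
from pathlib import Path

def log_decision(decisions_md, project, title, by, decision, rationale,
                 status="Proposed"):
    """Append a DEC-<PROJECT>-NNN entry to DECISIONS.md and return its ID."""
    path = Path(decisions_md)
    text = path.read_text() if path.exists() else "# Decisions\n"
    # Next number = count of existing entry headers for this project + 1
    num = text.count(f"## DEC-{project}-") + 1
    entry = (
        f"\n## DEC-{project}-{num:03d}: {title}\n"
        f"- **Date:** {date.today().isoformat()}\n"
        f"- **By:** {by}\n"
        f"- **Decision:** {decision}\n"
        f"- **Rationale:** {rationale}\n"
        f"- **Status:** {status}\n"
    )
    path.write_text(text + entry)
    return f"DEC-{project}-{num:03d}"
```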
---

## Relationship to Base Skill

- **Use the base skill CLI** (`cad_kb.py`) when applicable — adapt paths to `projects/<name>/kb/`
- **Use base skill templates** for component files and generation docs
- **Follow the base accumulation logic** — sessions add, never replace
- **Push general improvements upstream** — if we improve KB processing, notify Mario for a potential merge into the shared skill
---

## Handoff Protocol

When delegating KB-related work between agents, use the OP_09 format and specify:

1. Which KB files to read for context
2. Which KB files to update with results
3. What generation number to use
4. Whether a new gen doc is needed
47
hq/shared/windows/README.md
Normal file
@@ -0,0 +1,47 @@
# Windows Setup — Atomizer Job Queue

## Quick Setup

1. Copy this folder to `C:\Atomizer\` on Windows
2. Create the job queue directories:
   ```powershell
   mkdir C:\Atomizer\job-queue\pending
   mkdir C:\Atomizer\job-queue\running
   mkdir C:\Atomizer\job-queue\completed
   mkdir C:\Atomizer\job-queue\failed
   ```
3. Set up Syncthing to sync `C:\Atomizer\job-queue\` ↔ `/home/papa/atomizer/job-queue/`
4. Edit `atomizer_job_watcher.py` — update the `CONDA_PYTHON` path if needed

## Running the Watcher

### Manual (recommended for now)
```powershell
conda activate atomizer
python C:\Atomizer\atomizer_job_watcher.py
```

### Process pending jobs once, then exit
```powershell
python C:\Atomizer\atomizer_job_watcher.py --once
```

### As a Windows Service (optional)
```powershell
# Install NSSM: https://nssm.cc/
nssm install AtomizerJobWatcher "C:\Users\antoi\anaconda3\envs\atomizer\python.exe" "C:\Atomizer\atomizer_job_watcher.py"
nssm set AtomizerJobWatcher AppDirectory "C:\Atomizer"
nssm start AtomizerJobWatcher
```

## How It Works

1. Agents on Linux write job directories to `/job-queue/outbox/`
2. Syncthing syncs them to `C:\Atomizer\job-queue\pending\`
3. The watcher picks up new jobs, runs them, and moves them to `completed/` or `failed/`
4. Results sync back to Linux via Syncthing
5. Agents detect completed jobs and process the results
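The job directories written in step 1 must match what `atomizer_job_watcher.py` expects: a `job.json` carrying `job_id`, an optional `script` (defaults to `run_optimization.py`), `args`, and `timeout_seconds`. A minimal sketch of the Linux-side submitter; the `submit_job` helper is hypothetical:

```python
import json
from pathlib import Path

# Linux-side outbox that Syncthing mirrors to C:\Atomizer\job-queue\pending\
OUTBOX = Path("/home/papa/atomizer/job-queue/outbox")

def submit_job(job_id, script="run_optimization.py", args=None,
               timeout_seconds=86400, outbox=OUTBOX):
    """Write a job directory in the shape the watcher understands."""
    job_dir = Path(outbox) / job_id
    job_dir.mkdir(parents=True, exist_ok=True)
    job = {
        "job_id": job_id,
        "script": script,          # resolved relative to the job directory
        "args": args or [],
        "timeout_seconds": timeout_seconds,
        "status": "pending",
    }
    (job_dir / "job.json").write_text(json.dumps(job, indent=2))
    return job_dir
```

The payload script itself (and any input files) goes into the same directory before Syncthing picks it up.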
## Note

For Phase 0, Antoine runs `python run_optimization.py` manually instead of using the watcher.
The watcher is for Phase 1+, when the workflow is more automated.
170
hq/shared/windows/atomizer_job_watcher.py
Normal file
@@ -0,0 +1,170 @@
#!/usr/bin/env python3
"""
atomizer_job_watcher.py — Windows Job Queue Service

Watches C:\\Atomizer\\job-queue\\pending\\ for new jobs.
Moves them through pending → running → completed/failed.

Usage:
    python atomizer_job_watcher.py          # Watch mode (continuous)
    python atomizer_job_watcher.py --once   # Process pending, then exit

Install as service (optional):
    nssm install AtomizerJobWatcher "C:\\...\\python.exe" "C:\\Atomizer\\atomizer_job_watcher.py"
"""

import json
import logging
import os
import shutil
import subprocess
import sys
import time
from datetime import datetime, timezone
from pathlib import Path

JOB_QUEUE = Path(r"C:\Atomizer\job-queue")
PENDING = JOB_QUEUE / "pending"
RUNNING = JOB_QUEUE / "running"
COMPLETED = JOB_QUEUE / "completed"
FAILED = JOB_QUEUE / "failed"

# Update this to match your Conda/Python path
CONDA_PYTHON = r"C:\Users\antoi\anaconda3\envs\atomizer\python.exe"

# Ensure the queue root exists before attaching the file log handler
JOB_QUEUE.mkdir(parents=True, exist_ok=True)

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(message)s",
    handlers=[
        logging.FileHandler(JOB_QUEUE / "watcher.log"),
        logging.StreamHandler(),
    ],
)
log = logging.getLogger("job-watcher")


def now_iso():
    return datetime.now(timezone.utc).isoformat()


def run_job(job_dir: Path):
    """Execute a single job."""
    job_file = job_dir / "job.json"
    if not job_file.exists():
        log.warning(f"No job.json in {job_dir}, skipping")
        return

    with open(job_file) as f:
        job = json.load(f)

    job_id = job.get("job_id", job_dir.name)
    log.info(f"Starting job: {job_id}")

    # Move to running/
    running_dir = RUNNING / job_dir.name
    if running_dir.exists():
        shutil.rmtree(running_dir)
    shutil.move(str(job_dir), str(running_dir))

    # Update status
    job["status"] = "running"
    job["status_updated_at"] = now_iso()
    with open(running_dir / "job.json", "w") as f:
        json.dump(job, f, indent=2)

    # Execute
    script = running_dir / job.get("script", "run_optimization.py")
    args = [CONDA_PYTHON, str(script)] + job.get("args", [])

    stdout_log = running_dir / "stdout.log"
    stderr_log = running_dir / "stderr.log"

    start_time = time.time()
    try:
        env = {**os.environ, "ATOMIZER_JOB_ID": job_id}

        # Context managers so the log files are closed even on error
        with open(stdout_log, "w") as out, open(stderr_log, "w") as err:
            result = subprocess.run(
                args,
                cwd=str(running_dir),
                stdout=out,
                stderr=err,
                timeout=job.get("timeout_seconds", 86400),  # 24h default
                env=env,
            )
        duration = time.time() - start_time

        if result.returncode == 0:
            job["status"] = "completed"
            dest = COMPLETED / job_dir.name
        else:
            job["status"] = "failed"
            job["error"] = f"Exit code: {result.returncode}"
            dest = FAILED / job_dir.name

        job["duration_seconds"] = round(duration, 1)

    except subprocess.TimeoutExpired:
        job["status"] = "failed"
        job["error"] = "Timeout exceeded"
        job["duration_seconds"] = round(time.time() - start_time, 1)
        dest = FAILED / job_dir.name

    except Exception as e:
        job["status"] = "failed"
        job["error"] = str(e)
        dest = FAILED / job_dir.name

    job["status_updated_at"] = now_iso()
    with open(running_dir / "job.json", "w") as f:
        json.dump(job, f, indent=2)

    if dest.exists():
        shutil.rmtree(dest)
    shutil.move(str(running_dir), str(dest))
    log.info(f"Job {job_id}: {job['status']} ({job.get('duration_seconds', '?')}s)")


def process_pending():
    """Process all pending jobs."""
    for job_dir in sorted(PENDING.iterdir()):
        if job_dir.is_dir() and (job_dir / "job.json").exists():
            run_job(job_dir)


def watch():
    """Watch for new jobs (polling mode — no watchdog dependency)."""
    log.info(f"Job watcher started. Monitoring: {PENDING}")
    seen = set()

    while True:
        try:
            current = set()
            for job_dir in PENDING.iterdir():
                if job_dir.is_dir() and (job_dir / "job.json").exists():
                    current.add(job_dir.name)
                    if job_dir.name not in seen:
                        # Wait for Syncthing to finish syncing
                        time.sleep(5)
                        if (job_dir / "job.json").exists():
                            run_job(job_dir)
            seen = current
        except Exception as e:
            log.error(f"Watch loop error: {e}")

        time.sleep(10)  # Poll every 10 seconds


def main():
    for d in [PENDING, RUNNING, COMPLETED, FAILED]:
        d.mkdir(parents=True, exist_ok=True)

    if "--once" in sys.argv:
        process_pending()
    else:
        # Process existing pending jobs first
        process_pending()
        # Then watch for new ones
        watch()


if __name__ == "__main__":
    main()