

Dashboard Claude Code Integration Plan

Date: January 16, 2026
Status: 🟢 IMPLEMENTED
Priority: CRITICAL
Implemented: January 16, 2026

Problem Statement

The dashboard chat assistant is fundamentally underpowered compared to Claude Code CLI. Users expect the same level of intelligence, proactivity, and capability when interacting with the dashboard as they get from the terminal.

Current Experience (Terminal - Claude Code CLI)

User: "Add 10 new design variables to the M1 mirror study"

Claude Code:
1. Reads optimization_config.json
2. Understands the current structure
3. Adds 10 variables with intelligent defaults
4. ACTUALLY MODIFIES the file
5. Shows the diff
6. Can immediately run/test

Current Experience (Dashboard Chat)

User: "Add 10 new design variables"

Dashboard Chat:
1. Calls MCP tool canvas_add_node
2. Returns JSON instruction
3. Frontend SHOULD apply it but doesn't
4. Nothing visible happens
5. User frustrated

Root Cause Analysis

Issue 1: MCP Tools Don't Actually Modify Anything

The current MCP tools (canvas_add_node, etc.) just return instructions like:

{
  "success": true,
  "modification": {
    "action": "add_node",
    "nodeType": "designVar",
    "data": {...}
  }
}

The frontend is supposed to receive and apply these, but:

- WebSocket message handling may not process tool results
- No automatic application of modifications
- User sees "success" message but nothing changes
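
To make the gap concrete, here is a minimal Python sketch of what applying such an instruction to a study config could look like. The function `apply_modification` and the `NODE_TYPE_TO_KEY` mapping are illustrative assumptions; the plan does not show how (or whether) an `add_node` instruction maps onto the config file.

```python
from copy import deepcopy
from typing import Any, Dict

# Hypothetical mapping from canvas node types to config list keys.
NODE_TYPE_TO_KEY = {"designVar": "design_variables"}


def apply_modification(config: Dict[str, Any], result: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of config with the tool's 'add_node' instruction applied.

    If the result is unsuccessful or is not an 'add_node' action, the
    original config object is returned unchanged.
    """
    mod = result.get("modification", {})
    if not result.get("success") or mod.get("action") != "add_node":
        return config  # nothing to apply

    key = NODE_TYPE_TO_KEY.get(mod.get("nodeType"))
    if key is None:
        raise ValueError(f"unsupported nodeType: {mod.get('nodeType')!r}")

    updated = deepcopy(config)
    updated.setdefault(key, []).append(mod.get("data", {}))
    return updated
```

Without a step like this somewhere in the pipeline, the tool result stays a description of a change rather than a change.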

Issue 2: Claude API vs Claude Code CLI

| Capability | Claude API (Dashboard) | Claude Code CLI (Terminal) |
|------------|------------------------|----------------------------|
| Read files | Via MCP tool | Native |
| Write files | Via MCP tool (limited) | Native |
| Run commands | Via MCP tool (limited) | Native |
| Edit in place | NO | YES |
| Git operations | NO | YES |
| Multi-step reasoning | Limited | Full |
| Tool chaining | Awkward | Natural |
| Context window | 200k | Extended via automatic summarization |

Issue 3: Model Capability Gap

Dashboard uses Claude API (likely Sonnet or Haiku for cost). Terminal uses Opus 4.5 with full Claude Code capabilities.

Proposed Solution: Claude Code CLI Backend

Instead of MCP tools calling Python scripts, spawn actual Claude Code CLI sessions in the backend that have full power.

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        DASHBOARD FRONTEND                        │
├─────────────────────────────────────────────────────────────────┤
│  Canvas Builder  │  Chat Panel  │  Study Views  │  Results      │
└────────┬────────────────┬─────────────────────────────────────┬─┘
         │                │                                      │
         │  WebSocket     │  REST API                           │
         ▼                ▼                                      │
┌─────────────────────────────────────────────────────────────────┐
│                      BACKEND (FastAPI)                          │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │            CLAUDE CODE SESSION MANAGER                   │   │
│  │                                                          │   │
│  │  - Spawns claude CLI processes                          │   │
│  │  - Maintains conversation context                        │   │
│  │  - Streams output back to frontend                      │   │
│  │  - Has FULL Atomizer codebase access                    │   │
│  │  - Uses Opus 4.5 model                                  │   │
│  │  - Can edit files, run commands, modify studies         │   │
│  │                                                          │   │
│  └─────────────────────────────────────────────────────────┘   │
│                              │                                   │
│                              ▼                                   │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │              ATOMIZER CODEBASE                           │   │
│  │                                                          │   │
│  │  studies/                   optimization_engine/         │   │
│  │    M1_Mirror/                 extractors/                │   │
│  │      optimization_config.json runner.py                  │   │
│  │      run_optimization.py      ...                        │   │
│  │                                                          │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Key Changes

1. Backend spawns Claude Code CLI instead of calling Claude API
2. Full file system access - Claude can read/write any file
3. Full command execution - Run Python, git, npm, etc.
4. Opus 4.5 model - Same intelligence as terminal
5. Stream output - Real-time feedback to user
6. Canvas sync - After Claude modifies files, canvas reloads from config

Implementation Plan

Phase 1: Claude Code CLI Session Manager

File: atomizer-dashboard/backend/api/services/claude_code_session.py

"""
Claude Code CLI Session Manager

Spawns actual Claude Code CLI processes with full Atomizer access.
This gives dashboard users the same power as terminal users.
"""

import asyncio
import json
import os
import subprocess
from pathlib import Path
from typing import AsyncGenerator, Dict, List, Optional

ATOMIZER_ROOT = Path(__file__).parent.parent.parent.parent.parent

class ClaudeCodeSession:
    """
    Manages a Claude Code CLI session.

    Unlike MCP tools, this spawns the actual claude CLI which has:
    - Full file system access
    - Full command execution
    - Opus 4.5 model
    - All Claude Code capabilities
    """

    def __init__(self, session_id: str, study_id: Optional[str] = None):
        self.session_id = session_id
        self.study_id = study_id
        self.canvas_state: Optional[Dict] = None  # Current canvas state from frontend
        self.working_dir = ATOMIZER_ROOT
        if study_id:
            study_path = ATOMIZER_ROOT / "studies" / study_id
            if study_path.exists():
                self.working_dir = study_path

    def set_canvas_state(self, canvas_state: Dict):
        """Update canvas state from frontend"""
        self.canvas_state = canvas_state

    async def send_message(self, message: str) -> AsyncGenerator[str, None]:
        """
        Send message to Claude Code CLI and stream response.

        Uses claude CLI with:
        - --print for output
        - --dangerously-skip-permissions for full access (controlled environment)
        - Runs from Atomizer root to get CLAUDE.md context automatically
        - Study-specific context injected into prompt
        """
        # Build study-specific context
        study_context = self._build_study_context() if self.study_id else ""

        # The user's message with study context prepended
        full_message = f"""## Current Study Context
{study_context}

## User Request
{message}

Remember: You have FULL power to edit files. Make the actual changes, don't just describe them."""

        # Send the prompt on stdin. Note that -p is the short form of --print,
        # so it cannot be used to point the CLI at a prompt file.
        # Spawn claude CLI from ATOMIZER_ROOT so it picks up CLAUDE.md
        process = await asyncio.create_subprocess_exec(
            "claude",
            "--print",
            "--dangerously-skip-permissions",  # Full access in controlled env
            stdin=asyncio.subprocess.PIPE,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
            cwd=str(ATOMIZER_ROOT),  # Run from root to get CLAUDE.md
            env={
                **os.environ,
                "ATOMIZER_STUDY": self.study_id or "",
                "ATOMIZER_STUDY_PATH": str(self.working_dir),
            },
        )

        # communicate() waits for the process to exit, so the response is
        # yielded as a single chunk rather than streamed incrementally.
        stdout, stderr = await process.communicate(input=full_message.encode())

        if stdout:
            yield stdout.decode()
        if stderr and process.returncode != 0:
            yield f"\n[Error]: {stderr.decode()}"

    def _build_system_prompt(self) -> str:
        """Build Atomizer-aware system prompt with full context"""

        # Load CLAUDE.md for Atomizer system instructions
        claude_md_path = ATOMIZER_ROOT / "CLAUDE.md"
        claude_md_content = ""
        if claude_md_path.exists():
            claude_md_content = claude_md_path.read_text()

        # Load study-specific context
        study_context = ""
        if self.study_id:
            study_context = self._build_study_context()

        prompt = f"""# Atomizer Dashboard Assistant

You are running as the Atomizer Dashboard Assistant with FULL Claude Code CLI capabilities.
You have the same power as a terminal Claude Code session.

## Atomizer System Instructions
{claude_md_content[:8000]}

## Your Capabilities

You can and MUST:
- Read and EDIT any file in the codebase
- Modify optimization_config.json directly
- Update run_optimization.py
- Run Python scripts
- Execute git commands
- Create new studies
- Modify existing studies

When the user asks to add design variables, objectives, or other config changes:
1. Read the current config file
2. Make the actual modifications using Edit tool
3. Save the file
4. Report what you changed with a diff

DO NOT just return instructions - ACTUALLY MAKE THE CHANGES.

## Current Context

**Atomizer Root**: {ATOMIZER_ROOT}
**Working Directory**: {self.working_dir}

{study_context}

## Important Paths

- Studies: {ATOMIZER_ROOT / 'studies'}
- Extractors: {ATOMIZER_ROOT / 'optimization_engine' / 'extractors'}
- Protocols: {ATOMIZER_ROOT / 'docs' / 'protocols'}

## After Making Changes

After modifying any study files:
1. Confirm the changes were saved
2. Show the relevant diff
3. The dashboard canvas will auto-refresh to reflect your changes
"""
        return prompt

    def _build_study_context(self) -> str:
        """Build detailed context for the active study"""
        context = f"## Active Study: {self.study_id}\n\n"

        # Find and read optimization_config.json
        config_path = self.working_dir / "1_setup" / "optimization_config.json"
        if not config_path.exists():
            config_path = self.working_dir / "optimization_config.json"

        if config_path.exists():
            try:
                config = json.loads(config_path.read_text())
                context += f"**Config File**: `{config_path}`\n\n"

                # Design variables summary
                dvs = config.get("design_variables", [])
                if dvs:
                    context += "### Design Variables\n\n"
                    context += "| Name | Min | Max | Baseline | Unit |\n"
                    context += "|------|-----|-----|----------|------|\n"
                    for dv in dvs[:15]:
                        name = dv.get("name", dv.get("expression_name", "?"))
                        min_v = dv.get("min", dv.get("lower", "?"))
                        max_v = dv.get("max", dv.get("upper", "?"))
                        baseline = dv.get("baseline", "-")
                        unit = dv.get("units", dv.get("unit", "-"))
                        context += f"| {name} | {min_v} | {max_v} | {baseline} | {unit} |\n"
                    if len(dvs) > 15:
                        context += f"\n*... and {len(dvs) - 15} more*\n"
                    context += "\n"

                # Objectives
                objs = config.get("objectives", [])
                if objs:
                    context += "### Objectives\n\n"
                    for obj in objs:
                        name = obj.get("name", "?")
                        direction = obj.get("direction", "minimize")
                        weight = obj.get("weight", 1)
                        context += f"- **{name}**: {direction} (weight: {weight})\n"
                    context += "\n"

                # Extraction method (for Zernike)
                ext_method = config.get("extraction_method", {})
                if ext_method:
                    context += "### Extraction Method\n\n"
                    context += f"- Type: {ext_method.get('type', '?')}\n"
                    context += f"- Class: {ext_method.get('class', '?')}\n"
                    if ext_method.get("inner_radius"):
                        context += f"- Inner Radius: {ext_method.get('inner_radius')}\n"
                    context += "\n"

                # Zernike settings
                zernike = config.get("zernike_settings", {})
                if zernike:
                    context += "### Zernike Settings\n\n"
                    context += f"- Modes: {zernike.get('n_modes', '?')}\n"
                    context += f"- Filter Low Orders: {zernike.get('filter_low_orders', '?')}\n"
                    context += f"- Subcases: {zernike.get('subcases', [])}\n"
                    context += "\n"

                # Algorithm
                method = config.get("method", config.get("optimization", {}).get("sampler", "TPE"))
                max_trials = config.get("max_trials", config.get("optimization", {}).get("n_trials", 100))
                context += f"### Algorithm\n\n"
                context += f"- Method: {method}\n"
                context += f"- Max Trials: {max_trials}\n\n"

            except Exception as e:
                context += f"*Error reading config: {e}*\n\n"
        else:
            context += "*No optimization_config.json found*\n\n"

        # Check for run_optimization.py
        run_opt_path = self.working_dir / "run_optimization.py"
        if run_opt_path.exists():
            context += f"**Run Script**: `{run_opt_path}` (exists)\n\n"

        # Check results
        db_path = self.working_dir / "3_results" / "study.db"
        if db_path.exists():
            context += "**Results Database**: exists\n"
            # Could query trial count here
        else:
            context += "**Results Database**: not found (no optimization run yet)\n"

        return context
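
The session manager is described as streaming output to the frontend. As a minimal sketch of incremental reading with asyncio, the example below yields a child process's stdout line by line. It uses the Python interpreter as a stand-in for the `claude` binary, and `stream_process_output` is a hypothetical helper name; whether the CLI itself emits output incrementally in `--print` mode is not established by this plan.

```python
import asyncio
import sys
from typing import AsyncGenerator, List


async def stream_process_output(cmd: List[str]) -> AsyncGenerator[str, None]:
    """Yield stdout lines from a child process as they are produced."""
    process = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,  # merge stderr so only one pipe needs draining
    )
    assert process.stdout is not None
    # StreamReader supports async iteration, yielding one line at a time.
    async for raw_line in process.stdout:
        yield raw_line.decode(errors="replace")
    await process.wait()


async def collect(cmd: List[str]) -> List[str]:
    """Gather all streamed lines into a list (convenience for the demo)."""
    return [line async for line in stream_process_output(cmd)]


if __name__ == "__main__":
    # Stand-in command: a short Python child process instead of the claude CLI.
    demo_cmd = [sys.executable, "-c", "print('a'); print('b')"]
    print(asyncio.run(collect(demo_cmd)))
```

Merging stderr into stdout keeps the sketch short; a real handler would probably want to keep the two apart so errors can be reported separately.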

Phase 2: WebSocket Handler for Claude Code

File: atomizer-dashboard/backend/api/routes/claude_code.py

"""
Claude Code WebSocket Routes

Provides WebSocket endpoint that connects to actual Claude Code CLI.
"""

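import uuid
from typing import Dict, Optional

from fastapi import APIRouter, WebSocket, WebSocketDisconnect

from api.services.claude_code_session import ClaudeCodeSession

router = APIRouter()

# Active sessions
sessions: Dict[str, ClaudeCodeSession] = {}

# Path parameters are always required, so a second route without
# {study_id} covers connections that are not tied to a study.
@router.websocket("/ws")
@router.websocket("/ws/{study_id}")
async def claude_code_websocket(websocket: WebSocket, study_id: Optional[str] = None):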
    """
    WebSocket for full Claude Code CLI access.

    This gives dashboard users the SAME power as terminal users.
    """
    await websocket.accept()

    session_id = str(uuid.uuid4())[:8]
    session = ClaudeCodeSession(session_id, study_id)
    sessions[session_id] = session

    try:
        while True:
            data = await websocket.receive_json()

            if data.get("type") == "message":
                content = data.get("content", "")

                # Stream response from Claude Code CLI
                async for chunk in session.send_message(content):
                    await websocket.send_json({
                        "type": "text",
                        "content": chunk,
                    })

                await websocket.send_json({"type": "done"})

                # After response, trigger canvas refresh
                await websocket.send_json({
                    "type": "refresh_canvas",
                    "study_id": study_id,
                })

    except WebSocketDisconnect:
        pass
    finally:
        # Remove the session on any exit path, not only a clean disconnect
        sessions.pop(session_id, None)
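
The handler above sends `refresh_canvas` after every response, whether or not any file changed. One possible refinement, sketched here under the assumption that only the study's config file matters for the canvas, is to fingerprint the file before and after the CLI call and refresh only when the two differ. `file_fingerprint` and `config_changed` are hypothetical helpers, not part of the implemented code.

```python
import hashlib
from pathlib import Path
from typing import Optional


def file_fingerprint(path: Path) -> Optional[str]:
    """Return a SHA-256 hex digest of the file, or None if it does not exist."""
    if not path.is_file():
        return None
    return hashlib.sha256(path.read_bytes()).hexdigest()


def config_changed(before: Optional[str], path: Path) -> bool:
    """True if the file's current fingerprint differs from the earlier one."""
    return file_fingerprint(path) != before
```

In the WebSocket loop this would mean taking the fingerprint before `session.send_message(...)` and checking `config_changed(...)` once the response has finished.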

Phase 3: Frontend - Use Claude Code Endpoint

File: atomizer-dashboard/frontend/src/hooks/useClaudeCode.ts

/**
 * Hook for Claude Code CLI integration
 *
 * Connects to backend that spawns actual Claude Code CLI processes.
 * This gives full power: file editing, command execution, etc.
 */

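import { useEffect, useRef, useState } from 'react';
// useCanvasStore, reloadCanvasFromStudy and the Message type come from the
// dashboard's own modules; their import paths are not shown in this plan.

export function useClaudeCode(studyId?: string) {
  const [messages, setMessages] = useState<Message[]>([]);
  const [isThinking, setIsThinking] = useState(false);
  const wsRef = useRef<WebSocket | null>(null);

  // Reload canvas after Claude makes changes
  const { loadFromConfig } = useCanvasStore();

  // Append a new message to the conversation
  const addMessage = (message: Message) =>
    setMessages((prev) => [...prev, message]);

  // Append streamed text to the most recent (assistant) message
  const appendToLastMessage = (text: string) =>
    setMessages((prev) =>
      prev.map((m, i) =>
        i === prev.length - 1 ? { ...m, content: m.content + text } : m
      )
    );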

  useEffect(() => {
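    // Connect to Claude Code WebSocket. Use wss when the page is served over
    // https, and omit the trailing path segment when there is no study.
    const scheme = location.protocol === 'https:' ? 'wss' : 'ws';
    const suffix = studyId ? `/${studyId}` : '';
    const ws = new WebSocket(`${scheme}://${location.host}/api/claude-code/ws${suffix}`);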

    ws.onmessage = (event) => {
      const data = JSON.parse(event.data);

      if (data.type === 'text') {
        // Stream Claude's response
        appendToLastMessage(data.content);
      }
      else if (data.type === 'done') {
        setIsThinking(false);
      }
      else if (data.type === 'refresh_canvas') {
        // Claude made file changes - reload canvas from config
        reloadCanvasFromStudy(data.study_id);
      }
    };

    wsRef.current = ws;
    return () => ws.close();
  }, [studyId]);

  const sendMessage = async (content: string) => {
    setIsThinking(true);
    addMessage({ role: 'user', content });
    addMessage({ role: 'assistant', content: '', isStreaming: true });

    wsRef.current?.send(JSON.stringify({
      type: 'message',
      content,
    }));
  };

  return { messages, isThinking, sendMessage };
}

Phase 4: Canvas Auto-Refresh

When Claude modifies optimization_config.json, the canvas should automatically reload:

// In AtomizerCanvas.tsx or useCanvasChat.ts

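const reloadCanvasFromStudy = async (studyId: string) => {
  // Fetch fresh config from backend
  const response = await fetch(`/api/studies/${studyId}/config`);
  if (!response.ok) {
    // Keep the current canvas if the config cannot be fetched
    showNotification(`Could not reload canvas (HTTP ${response.status})`);
    return;
  }
  const config = await response.json();

  // Reload canvas
  loadFromConfig(config);

  // Notify user
  showNotification("Canvas updated with Claude's changes");
};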
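
The frontend call above relies on a backend route for `/api/studies/{studyId}/config`, which this plan does not show. A minimal sketch of the lookup such a route might perform, mirroring the two config locations checked in `_build_study_context`, could look like this. `load_study_config` is a hypothetical helper name.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional


def load_study_config(study_dir: Path) -> Optional[Dict[str, Any]]:
    """Load optimization_config.json from a study folder.

    Checks 1_setup/ first, then the study root, and returns None
    if neither location contains the file.
    """
    candidates = [
        study_dir / "1_setup" / "optimization_config.json",
        study_dir / "optimization_config.json",
    ]
    for candidate in candidates:
        if candidate.is_file():
            return json.loads(candidate.read_text(encoding="utf-8"))
    return None
```

Returning `None` rather than raising lets the route decide how to report a missing config, for example with a 404 response.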

Phase 5: Smart Prompting for Canvas Context

When user sends a message from canvas view, include canvas state:

const sendCanvasMessage = (userMessage: string) => {
  const canvasContext = generateCanvasMarkdown();

  const enrichedMessage = `
## Current Canvas State
${canvasContext}

## User Request
${userMessage}

When making changes, modify the actual optimization_config.json file.
After changes, the canvas will auto-refresh.
`;

  sendMessage(enrichedMessage);
};
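
`generateCanvasMarkdown()` is not shown in this plan. As a rough illustration of the idea, here is a Python sketch that turns a canvas state into a Markdown summary, which could also serve the backend side where `set_canvas_state()` stores the state. The state shape (a `nodes` list whose entries carry `type` and `data.label`) is an assumption made for the example, as is the name `canvas_state_to_markdown`.

```python
from collections import defaultdict
from typing import Any, Dict, List


def canvas_state_to_markdown(canvas_state: Dict[str, Any]) -> str:
    """Summarize canvas nodes as Markdown, grouped by node type.

    Assumed (hypothetical) shape:
    {"nodes": [{"type": "designVar", "data": {"label": "thickness"}}, ...]}
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for node in canvas_state.get("nodes", []):
        label = node.get("data", {}).get("label", "?")
        groups[node.get("type", "unknown")].append(label)

    lines: List[str] = []
    for node_type in sorted(groups):
        lines.append(f"### {node_type} ({len(groups[node_type])})")
        lines.extend(f"- {label}" for label in groups[node_type])
        lines.append("")  # blank line between groups
    return "\n".join(lines).strip()
```

Grouping by type keeps the prompt compact even when the canvas holds many nodes of the same kind.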

Expected Behavior After Implementation

Example 1: Add Design Variables

User: "Add 10 new design variables for hole diameters, range 5-25mm"

Claude Code (in dashboard):
1. Reads studies/M1_Mirror/.../optimization_config.json
2. Adds 10 entries to design_variables array:
   - hole_diameter_1: [5, 25] mm
   - hole_diameter_2: [5, 25] mm
   - ... (10 total)
3. WRITES the file
4. Reports: "Added 10 design variables to optimization_config.json"
5. Frontend receives "refresh_canvas" signal
6. Canvas reloads and shows 10 new nodes
7. User sees actual changes
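
For reference, the config edit in steps 2 and 3 amounts to something like the following sketch. The field names `name`, `min`, `max`, and `units` are taken from the keys that `_build_study_context` reads; the helper names and the duplicate-name check are assumptions added for illustration.

```python
from typing import Any, Dict, List


def make_design_variables(
    prefix: str, count: int, lower: float, upper: float, units: str
) -> List[Dict[str, Any]]:
    """Build `count` design variable entries named prefix_1 .. prefix_count."""
    return [
        {"name": f"{prefix}_{i}", "min": lower, "max": upper, "units": units}
        for i in range(1, count + 1)
    ]


def add_design_variables(config: Dict[str, Any], new_vars: List[Dict[str, Any]]) -> None:
    """Append new variables to config in place, rejecting duplicate names."""
    design_vars = config.setdefault("design_variables", [])
    existing = {dv.get("name") for dv in design_vars}
    for dv in new_vars:
        if dv["name"] in existing:
            raise ValueError(f"design variable already exists: {dv['name']}")
        design_vars.append(dv)
        existing.add(dv["name"])
```

Calling `add_design_variables(config, make_design_variables("hole_diameter", 10, 5, 25, "mm"))` would produce the ten entries listed in the example.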

Example 2: Modify Optimization

User: "Change the algorithm to CMA-ES with 500 trials and add a stress constraint < 200 MPa"

Claude Code (in dashboard):
1. Reads config
2. Changes method: "TPE" -> "CMA-ES"
3. Changes max_trials: 100 -> 500
4. Adds constraint: {name: "stress_limit", operator: "<=", value: 200, unit: "MPa"}
5. WRITES the file
6. Reports changes
7. Canvas refreshes with updated algorithm node and new constraint node
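
A sketch of the same change expressed as code, with a diff for the "Reports changes" step. It assumes `method` and `max_trials` are top-level keys (the context builder also accepts them nested under `optimization`), and that constraints live in a top-level `constraints` list, which this plan does not confirm.

```python
import difflib
import json
from copy import deepcopy
from typing import Any, Dict


def apply_example_changes(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of config with the Example 2 changes applied."""
    updated = deepcopy(config)
    updated["method"] = "CMA-ES"
    updated["max_trials"] = 500
    # Assumed location for constraints; the key name is hypothetical.
    updated.setdefault("constraints", []).append(
        {"name": "stress_limit", "operator": "<=", "value": 200, "unit": "MPa"}
    )
    return updated


def config_diff(before: Dict[str, Any], after: Dict[str, Any]) -> str:
    """Render a unified diff of two configs serialized as sorted JSON."""
    a = json.dumps(before, indent=2, sort_keys=True).splitlines()
    b = json.dumps(after, indent=2, sort_keys=True).splitlines()
    return "\n".join(difflib.unified_diff(a, b, "before", "after", lineterm=""))
```

Serializing with `sort_keys=True` keeps the diff stable regardless of key order in the original file.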

Example 3: Complex Multi-File Changes

User: "Add a new Zernike extractor for the secondary mirror and connect it to a new objective"

Claude Code (in dashboard):
1. Reads config
2. Adds extractor to extractors array
3. Adds objective connected to extractor
4. If needed, modifies run_optimization.py to import new extractor
5. WRITES all modified files
6. Canvas refreshes with new extractor and objective nodes, properly connected

Implementation Checklist

Phase 1: Backend Claude Code Session

- Create claude_code_session.py with session manager
- Implement send_message() with CLI spawning
- Build Atomizer-aware system prompt
- Handle study context (working directory)
- Stream output properly

Phase 2: WebSocket Route

- Create /api/claude-code/ws/{study_id} endpoint
- Handle message routing
- Implement refresh_canvas signal
- Session cleanup on disconnect

Phase 3: Frontend Hook

- Create useClaudeCode.ts hook
- Connect to Claude Code WebSocket
- Handle streaming responses
- Handle canvas refresh signals

Phase 4: Canvas Auto-Refresh

- Add reloadCanvasFromStudy() function
- Wire refresh signal to canvas store
- Add loading state during refresh
- Show notification on refresh (system message)

Phase 5: Chat Panel Integration

- Update ChatPanel to use useClaudeCode
- Include canvas context in messages
- Add "Claude Code" indicator in UI (mode toggle)
- Show when Claude is editing files

Phase 6: Testing

- Test adding design variables
- Test modifying objectives
- Test complex multi-file changes
- Test canvas refresh after changes
- Test error handling

Implementation Notes

Files Created/Modified

Backend:

- atomizer-dashboard/backend/api/services/claude_code_session.py - New session manager
- atomizer-dashboard/backend/api/routes/claude_code.py - New WebSocket routes
- atomizer-dashboard/backend/api/main.py - Added claude_code router

Frontend:

- atomizer-dashboard/frontend/src/hooks/useClaudeCode.ts - New hook for Claude Code CLI
- atomizer-dashboard/frontend/src/components/canvas/AtomizerCanvas.tsx - Added mode toggle
- atomizer-dashboard/frontend/src/components/chat/ChatMessage.tsx - Added system message support

Security Considerations

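Claude Code CLI with --dangerously-skip-permissions runs without permission prompts and can read, write, and execute anything the backend process can. Mitigations:

- Local deployment: The dashboard runs on the user's machine, not on a public server. This limits exposure but is not a sandbox.
- Study context: Claude is launched from the Atomizer root so that CLAUDE.md is picked up, and receives the study path through ATOMIZER_STUDY_PATH. This points it at the study folder but does not restrict it to that folder.
- Audit logging: Log all file modifications
- User confirmation: Option to require approval for destructive operations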
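
As one building block for audit logging or an approval step, the backend could classify a modified path as inside or outside the active study. The sketch below shows such a check; `is_within` is a hypothetical helper, and it only inspects paths the backend already knows about. It does not constrain what the CLI process itself can touch.

```python
from pathlib import Path


def is_within(root: Path, candidate: Path) -> bool:
    """Return True if candidate resolves to a location inside root.

    Relative candidates are interpreted relative to root. Resolving both
    paths first means '..' segments and symlinks cannot escape the check.
    """
    root_resolved = root.resolve()
    candidate_resolved = (root / candidate).resolve()
    try:
        candidate_resolved.relative_to(root_resolved)
        return True
    except ValueError:
        return False
```

A path that fails the check could be logged with higher severity or routed through the user-confirmation option.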

Cost Considerations

Using Opus 4.5 via Claude Code CLI is more expensive than Sonnet API. Options:

- Default to Sonnet, upgrade on request: "Use full power mode" button
- Per-session token budget: Warn user when approaching limit
- Cache common operations: Pre-generate responses for common tasks
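
The per-session budget could be tracked with something as small as the following sketch. `TokenBudget`, its 80% warning threshold, and the three status strings are assumptions for illustration; how token counts would be obtained from the CLI is not covered by this plan.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    """Tracks token usage for one session against a fixed limit."""

    limit: int
    warn_fraction: float = 0.8  # assumed threshold for warning the user
    used: int = 0

    def record(self, tokens: int) -> str:
        """Add usage and return 'ok', 'warn', or 'exceeded'."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        self.used += tokens
        if self.used >= self.limit:
            return "exceeded"
        if self.used >= self.limit * self.warn_fraction:
            return "warn"
        return "ok"
```

The returned status could be forwarded over the WebSocket as its own message type so the chat panel can show a warning before the limit is reached.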

Success Criteria

1. Parity with terminal: Dashboard chat can do everything Claude Code CLI can
2. Real modifications: Files actually change, not just instructions
3. Canvas sync: Canvas reflects file changes immediately
4. Intelligent defaults: Claude makes smart choices without asking clarifying questions
5. Proactive behavior: Claude anticipates needs and handles edge cases

This document is to be implemented by Claude Code CLI.
