Atomizer/docs/plans/DASHBOARD_CLAUDE_CODE_INTEGRATION.md

# Dashboard Claude Code Integration Plan

**Date**: January 16, 2026
**Status**: 🟢 IMPLEMENTED
**Priority**: CRITICAL
**Implemented**: January 16, 2026

---

## Problem Statement

The dashboard chat assistant is **fundamentally underpowered** compared to Claude Code CLI. Users expect the same level of intelligence, proactivity, and capability when interacting with the dashboard as they get from the terminal.

### Current Experience (Terminal - Claude Code CLI)
```
User: "Add 10 new design variables to the M1 mirror study"

Claude Code:
1. Reads optimization_config.json
2. Understands the current structure
3. Adds 10 variables with intelligent defaults
4. ACTUALLY MODIFIES the file
5. Shows the diff
6. Can immediately run/test
```

### Current Experience (Dashboard Chat)
```
User: "Add 10 new design variables"

Dashboard Chat:
1. Calls MCP tool canvas_add_node
2. Returns JSON instruction
3. Frontend SHOULD apply it but doesn't
4. Nothing visible happens
5. User frustrated
```

---

## Root Cause Analysis

### Issue 1: MCP Tools Don't Actually Modify Anything

The current MCP tools (`canvas_add_node`, etc.) just return instructions like:
```json
{
  "success": true,
  "modification": {
    "action": "add_node",
    "nodeType": "designVar",
    "data": {...}
  }
}
```

The **frontend is supposed to receive and apply these**, but:
- WebSocket message handling may not process tool results
- No automatic application of modifications
- User sees "success" message but nothing changes

### Issue 2: Claude API vs Claude Code CLI

| Capability | Claude API (Dashboard) | Claude Code CLI (Terminal) |
|------------|------------------------|---------------------------|
| Read files | Via MCP tool | Native |
| Write files | Via MCP tool (limited) | Native |
| Run commands | Via MCP tool (limited) | Native |
| Edit in place | NO | YES |
| Git operations | NO | YES |
| Multi-step reasoning | Limited | Full |
| Tool chaining | Awkward | Natural |
| Context window | 200k | Unlimited (summarization) |

### Issue 3: Model Capability Gap

Dashboard uses Claude API (likely Sonnet or Haiku for cost). Terminal uses **Opus 4.5** with full Claude Code capabilities.

---

## Proposed Solution: Claude Code CLI Backend

Instead of MCP tools calling Python scripts, **spawn actual Claude Code CLI sessions** in the backend that have full power.

### Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                        DASHBOARD FRONTEND                        │
├─────────────────────────────────────────────────────────────────┤
│  Canvas Builder  │  Chat Panel  │  Study Views  │  Results      │
└────────┬────────────────┬─────────────────────────────────────┬─┘
         │                │                                      │
         │  WebSocket     │  REST API                           │
         ▼                ▼                                      │
┌─────────────────────────────────────────────────────────────────┐
│                      BACKEND (FastAPI)                          │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │            CLAUDE CODE SESSION MANAGER                   │   │
│  │                                                          │   │
│  │  - Spawns claude CLI processes                          │   │
│  │  - Maintains conversation context                        │   │
│  │  - Streams output back to frontend                      │   │
│  │  - Has FULL Atomizer codebase access                    │   │
│  │  - Uses Opus 4.5 model                                  │   │
│  │  - Can edit files, run commands, modify studies         │   │
│  │                                                          │   │
│  └─────────────────────────────────────────────────────────┘   │
│                              │                                   │
│                              ▼                                   │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │              ATOMIZER CODEBASE                           │   │
│  │                                                          │   │
│  │  studies/                   optimization_engine/         │   │
│  │    M1_Mirror/                 extractors/                │   │
│  │      optimization_config.json runner.py                  │   │
│  │      run_optimization.py      ...                        │   │
│  │                                                          │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
```

### Key Changes

1. **Backend spawns Claude Code CLI** instead of calling Claude API
2. **Full file system access** - Claude can read/write any file
3. **Full command execution** - Run Python, git, npm, etc.
4. **Opus 4.5 model** - Same intelligence as terminal
5. **Stream output** - Real-time feedback to user
6. **Canvas sync** - After Claude modifies files, canvas reloads from config

---

## Implementation Plan

### Phase 1: Claude Code CLI Session Manager

**File**: `atomizer-dashboard/backend/api/services/claude_code_session.py`

```python
"""
Claude Code CLI Session Manager

Spawns actual Claude Code CLI processes with full Atomizer access.
This gives dashboard users the same power as terminal users.
"""

import asyncio
import json
import os
import subprocess
from pathlib import Path
from typing import AsyncGenerator, Dict, List, Optional

ATOMIZER_ROOT = Path(__file__).parent.parent.parent.parent.parent

class ClaudeCodeSession:
    """
    Manages a Claude Code CLI session.

    Unlike MCP tools, this spawns the actual claude CLI which has:
    - Full file system access
    - Full command execution
    - Opus 4.5 model
    - All Claude Code capabilities
    """

    def __init__(self, session_id: str, study_id: Optional[str] = None):
        self.session_id = session_id
        self.study_id = study_id
        self.canvas_state: Optional[Dict] = None  # Current canvas state from frontend
        self.working_dir = ATOMIZER_ROOT
        if study_id:
            study_path = ATOMIZER_ROOT / "studies" / study_id
            if study_path.exists():
                self.working_dir = study_path

    def set_canvas_state(self, canvas_state: Dict):
        """Update canvas state from frontend"""
        self.canvas_state = canvas_state

    async def send_message(self, message: str) -> AsyncGenerator[str, None]:
        """
        Send message to Claude Code CLI and stream response.

        Uses claude CLI with:
        - --print for output
        - --dangerously-skip-permissions for full access (controlled environment)
        - Runs from Atomizer root to get CLAUDE.md context automatically
        - Study-specific context injected into prompt
        """
        # Build study-specific context
        study_context = self._build_study_context() if self.study_id else ""

        # The user's message with study context prepended
        full_message = f"""## Current Study Context
{study_context}

## User Request
{message}

Remember: You have FULL power to edit files. Make the actual changes, don't just describe them."""

        # Write prompt to a temp file (better than stdin for complex prompts)
        prompt_file = ATOMIZER_ROOT / f".claude-prompt-{self.session_id}.md"
        prompt_file.write_text(full_message)

        try:
            # Spawn claude CLI from ATOMIZER_ROOT so it picks up CLAUDE.md
            # This gives it full Atomizer context automatically
            process = await asyncio.create_subprocess_exec(
                "claude",
                "--print",
                "--dangerously-skip-permissions",  # Full access in controlled env
                "-p", str(prompt_file),  # Read prompt from file
                stdout=asyncio.subprocess.PIPE,
                stderr=asyncio.subprocess.PIPE,
                cwd=str(ATOMIZER_ROOT),  # Run from root to get CLAUDE.md
                env={
                    **os.environ,
                    "ATOMIZER_STUDY": self.study_id or "",
                    "ATOMIZER_STUDY_PATH": str(self.working_dir),
                }
            )

            # Stream output
            stdout, stderr = await process.communicate()

            if stdout:
                yield stdout.decode()
            if stderr and process.returncode != 0:
                yield f"\n[Error]: {stderr.decode()}"

        finally:
            # Clean up prompt file
            if prompt_file.exists():
                prompt_file.unlink()

    def _build_system_prompt(self) -> str:
        """Build Atomizer-aware system prompt with full context"""

        # Load CLAUDE.md for Atomizer system instructions
        claude_md_path = ATOMIZER_ROOT / "CLAUDE.md"
        claude_md_content = ""
        if claude_md_path.exists():
            claude_md_content = claude_md_path.read_text()

        # Load study-specific context
        study_context = ""
        if self.study_id:
            study_context = self._build_study_context()

        prompt = f"""# Atomizer Dashboard Assistant

You are running as the Atomizer Dashboard Assistant with FULL Claude Code CLI capabilities.
You have the same power as a terminal Claude Code session.

## Atomizer System Instructions
{claude_md_content[:8000]}  # Truncate if too long

## Your Capabilities

You can and MUST:
- Read and EDIT any file in the codebase
- Modify optimization_config.json directly
- Update run_optimization.py
- Run Python scripts
- Execute git commands
- Create new studies
- Modify existing studies

When the user asks to add design variables, objectives, or other config changes:
1. Read the current config file
2. Make the actual modifications using Edit tool
3. Save the file
4. Report what you changed with a diff

DO NOT just return instructions - ACTUALLY MAKE THE CHANGES.

## Current Context

**Atomizer Root**: {ATOMIZER_ROOT}
**Working Directory**: {self.working_dir}

{study_context}

## Important Paths

- Studies: {ATOMIZER_ROOT / 'studies'}
- Extractors: {ATOMIZER_ROOT / 'optimization_engine' / 'extractors'}
- Protocols: {ATOMIZER_ROOT / 'docs' / 'protocols'}

## After Making Changes

After modifying any study files:
1. Confirm the changes were saved
2. Show the relevant diff
3. The dashboard canvas will auto-refresh to reflect your changes
"""
        return prompt

    def _build_study_context(self) -> str:
        """Build detailed context for the active study"""
        context = f"## Active Study: {self.study_id}\n\n"

        # Find and read optimization_config.json
        config_path = self.working_dir / "1_setup" / "optimization_config.json"
        if not config_path.exists():
            config_path = self.working_dir / "optimization_config.json"

        if config_path.exists():
            import json
            try:
                config = json.loads(config_path.read_text())
                context += f"**Config File**: `{config_path}`\n\n"

                # Design variables summary
                dvs = config.get("design_variables", [])
                if dvs:
                    context += "### Design Variables\n\n"
                    context += "| Name | Min | Max | Baseline | Unit |\n"
                    context += "|------|-----|-----|----------|------|\n"
                    for dv in dvs[:15]:
                        name = dv.get("name", dv.get("expression_name", "?"))
                        min_v = dv.get("min", dv.get("lower", "?"))
                        max_v = dv.get("max", dv.get("upper", "?"))
                        baseline = dv.get("baseline", "-")
                        unit = dv.get("units", dv.get("unit", "-"))
                        context += f"| {name} | {min_v} | {max_v} | {baseline} | {unit} |\n"
                    if len(dvs) > 15:
                        context += f"\n*... and {len(dvs) - 15} more*\n"
                    context += "\n"

                # Objectives
                objs = config.get("objectives", [])
                if objs:
                    context += "### Objectives\n\n"
                    for obj in objs:
                        name = obj.get("name", "?")
                        direction = obj.get("direction", "minimize")
                        weight = obj.get("weight", 1)
                        context += f"- **{name}**: {direction} (weight: {weight})\n"
                    context += "\n"

                # Extraction method (for Zernike)
                ext_method = config.get("extraction_method", {})
                if ext_method:
                    context += "### Extraction Method\n\n"
                    context += f"- Type: {ext_method.get('type', '?')}\n"
                    context += f"- Class: {ext_method.get('class', '?')}\n"
                    if ext_method.get("inner_radius"):
                        context += f"- Inner Radius: {ext_method.get('inner_radius')}\n"
                    context += "\n"

                # Zernike settings
                zernike = config.get("zernike_settings", {})
                if zernike:
                    context += "### Zernike Settings\n\n"
                    context += f"- Modes: {zernike.get('n_modes', '?')}\n"
                    context += f"- Filter Low Orders: {zernike.get('filter_low_orders', '?')}\n"
                    context += f"- Subcases: {zernike.get('subcases', [])}\n"
                    context += "\n"

                # Algorithm
                method = config.get("method", config.get("optimization", {}).get("sampler", "TPE"))
                max_trials = config.get("max_trials", config.get("optimization", {}).get("n_trials", 100))
                context += f"### Algorithm\n\n"
                context += f"- Method: {method}\n"
                context += f"- Max Trials: {max_trials}\n\n"

            except Exception as e:
                context += f"*Error reading config: {e}*\n\n"
        else:
            context += "*No optimization_config.json found*\n\n"

        # Check for run_optimization.py
        run_opt_path = self.working_dir / "run_optimization.py"
        if run_opt_path.exists():
            context += f"**Run Script**: `{run_opt_path}` (exists)\n\n"

        # Check results
        db_path = self.working_dir / "3_results" / "study.db"
        if db_path.exists():
            context += "**Results Database**: exists\n"
            # Could query trial count here
        else:
            context += "**Results Database**: not found (no optimization run yet)\n"

        return context
```

### Phase 2: WebSocket Handler for Claude Code

**File**: `atomizer-dashboard/backend/api/routes/claude_code.py`

```python
"""
Claude Code WebSocket Routes

Provides WebSocket endpoint that connects to actual Claude Code CLI.
"""

from fastapi import APIRouter, WebSocket, WebSocketDisconnect
from api.services.claude_code_session import ClaudeCodeSession
import uuid

router = APIRouter()

# Active sessions
sessions: Dict[str, ClaudeCodeSession] = {}

@router.websocket("/ws/{study_id}")
async def claude_code_websocket(websocket: WebSocket, study_id: str = None):
    """
    WebSocket for full Claude Code CLI access.

    This gives dashboard users the SAME power as terminal users.
    """
    await websocket.accept()

    session_id = str(uuid.uuid4())[:8]
    session = ClaudeCodeSession(session_id, study_id)
    sessions[session_id] = session

    try:
        while True:
            data = await websocket.receive_json()

            if data.get("type") == "message":
                content = data.get("content", "")

                # Stream response from Claude Code CLI
                async for chunk in session.send_message(content):
                    await websocket.send_json({
                        "type": "text",
                        "content": chunk,
                    })

                await websocket.send_json({"type": "done"})

                # After response, trigger canvas refresh
                await websocket.send_json({
                    "type": "refresh_canvas",
                    "study_id": study_id,
                })

    except WebSocketDisconnect:
        sessions.pop(session_id, None)
```

### Phase 3: Frontend - Use Claude Code Endpoint

**File**: `atomizer-dashboard/frontend/src/hooks/useClaudeCode.ts`

```typescript
/**
 * Hook for Claude Code CLI integration
 *
 * Connects to backend that spawns actual Claude Code CLI processes.
 * This gives full power: file editing, command execution, etc.
 */

export function useClaudeCode(studyId?: string) {
  const [messages, setMessages] = useState<Message[]>([]);
  const [isThinking, setIsThinking] = useState(false);
  const wsRef = useRef<WebSocket | null>(null);

  // Reload canvas after Claude makes changes
  const { loadFromConfig } = useCanvasStore();

  useEffect(() => {
    // Connect to Claude Code WebSocket
    const ws = new WebSocket(`ws://${location.host}/api/claude-code/ws/${studyId || ''}`);

    ws.onmessage = (event) => {
      const data = JSON.parse(event.data);

      if (data.type === 'text') {
        // Stream Claude's response
        appendToLastMessage(data.content);
      }
      else if (data.type === 'done') {
        setIsThinking(false);
      }
      else if (data.type === 'refresh_canvas') {
        // Claude made file changes - reload canvas from config
        reloadCanvasFromStudy(data.study_id);
      }
    };

    wsRef.current = ws;
    return () => ws.close();
  }, [studyId]);

  const sendMessage = async (content: string) => {
    setIsThinking(true);
    addMessage({ role: 'user', content });
    addMessage({ role: 'assistant', content: '', isStreaming: true });

    wsRef.current?.send(JSON.stringify({
      type: 'message',
      content,
    }));
  };

  return { messages, isThinking, sendMessage };
}
```

### Phase 4: Canvas Auto-Refresh

When Claude modifies `optimization_config.json`, the canvas should automatically reload:

```typescript
// In AtomizerCanvas.tsx or useCanvasChat.ts

const reloadCanvasFromStudy = async (studyId: string) => {
  // Fetch fresh config from backend
  const response = await fetch(`/api/studies/${studyId}/config`);
  const config = await response.json();

  // Reload canvas
  loadFromConfig(config);

  // Notify user
  showNotification('Canvas updated with Claude\'s changes');
};
```

### Phase 5: Smart Prompting for Canvas Context

When user sends a message from canvas view, include canvas state:

```typescript
const sendCanvasMessage = (userMessage: string) => {
  const canvasContext = generateCanvasMarkdown();

  const enrichedMessage = `
## Current Canvas State
${canvasContext}

## User Request
${userMessage}

When making changes, modify the actual optimization_config.json file.
After changes, the canvas will auto-refresh.
`;

  sendMessage(enrichedMessage);
};
```

---

## Expected Behavior After Implementation

### Example 1: Add Design Variables
```
User: "Add 10 new design variables for hole diameters, range 5-25mm"

Claude Code (in dashboard):
1. Reads studies/M1_Mirror/.../optimization_config.json
2. Adds 10 entries to design_variables array:
   - hole_diameter_1: [5, 25] mm
   - hole_diameter_2: [5, 25] mm
   - ... (10 total)
3. WRITES the file
4. Reports: "Added 10 design variables to optimization_config.json"
5. Frontend receives "refresh_canvas" signal
6. Canvas reloads and shows 10 new nodes
7. User sees actual changes
```

### Example 2: Modify Optimization
```
User: "Change the algorithm to CMA-ES with 500 trials and add a stress constraint < 200 MPa"

Claude Code (in dashboard):
1. Reads config
2. Changes method: "TPE" -> "CMA-ES"
3. Changes max_trials: 100 -> 500
4. Adds constraint: {name: "stress_limit", operator: "<=", value: 200, unit: "MPa"}
5. WRITES the file
6. Reports changes
7. Canvas refreshes with updated algorithm node and new constraint node
```

### Example 3: Complex Multi-File Changes
```
User: "Add a new Zernike extractor for the secondary mirror and connect it to a new objective"

Claude Code (in dashboard):
1. Reads config
2. Adds extractor to extractors array
3. Adds objective connected to extractor
4. If needed, modifies run_optimization.py to import new extractor
5. WRITES all modified files
6. Canvas refreshes with new extractor and objective nodes, properly connected
```

---

## Implementation Checklist

### Phase 1: Backend Claude Code Session
- [x] Create `claude_code_session.py` with session manager
- [x] Implement `send_message()` with CLI spawning
- [x] Build Atomizer-aware system prompt
- [x] Handle study context (working directory)
- [x] Stream output properly

### Phase 2: WebSocket Route
- [x] Create `/api/claude-code/ws/{study_id}` endpoint
- [x] Handle message routing
- [x] Implement `refresh_canvas` signal
- [x] Session cleanup on disconnect

### Phase 3: Frontend Hook
- [x] Create `useClaudeCode.ts` hook
- [x] Connect to Claude Code WebSocket
- [x] Handle streaming responses
- [x] Handle canvas refresh signals

### Phase 4: Canvas Auto-Refresh
- [x] Add `reloadCanvasFromStudy()` function
- [x] Wire refresh signal to canvas store
- [x] Add loading state during refresh
- [x] Show notification on refresh (system message)

### Phase 5: Chat Panel Integration
- [x] Update ChatPanel to use `useClaudeCode`
- [x] Include canvas context in messages
- [x] Add "Claude Code" indicator in UI (mode toggle)
- [x] Show when Claude is editing files

### Phase 6: Testing
- [ ] Test adding design variables
- [ ] Test modifying objectives
- [ ] Test complex multi-file changes
- [ ] Test canvas refresh after changes
- [ ] Test error handling

## Implementation Notes

### Files Created/Modified

**Backend:**
- `atomizer-dashboard/backend/api/services/claude_code_session.py` - New session manager
- `atomizer-dashboard/backend/api/routes/claude_code.py` - New WebSocket routes
- `atomizer-dashboard/backend/api/main.py` - Added claude_code router

**Frontend:**
- `atomizer-dashboard/frontend/src/hooks/useClaudeCode.ts` - New hook for Claude Code CLI
- `atomizer-dashboard/frontend/src/components/canvas/AtomizerCanvas.tsx` - Added mode toggle
- `atomizer-dashboard/frontend/src/components/chat/ChatMessage.tsx` - Added system message support

---

## Security Considerations

Claude Code CLI with `--dangerously-skip-permissions` has full system access. Mitigations:

1. **Sandboxed environment**: Dashboard runs on user's machine, not public server
2. **Study-scoped working directory**: Claude starts in study folder
3. **Audit logging**: Log all file modifications
4. **User confirmation**: Option to require approval for destructive operations

---

## Cost Considerations

Using Opus 4.5 via Claude Code CLI is more expensive than Sonnet API. Options:

1. **Default to Sonnet, upgrade on request**: "Use full power mode" button
2. **Per-session token budget**: Warn user when approaching limit
3. **Cache common operations**: Pre-generate responses for common tasks

---

## Success Criteria

1. **Parity with terminal**: Dashboard chat can do everything Claude Code CLI can
2. **Real modifications**: Files actually change, not just instructions
3. **Canvas sync**: Canvas reflects file changes immediately
4. **Intelligent defaults**: Claude makes smart choices without asking clarifying questions
5. **Proactive behavior**: Claude anticipates needs and handles edge cases

---

*This document to be implemented by Claude Code CLI*