Antoine 8d9d55356c docs: Archive stale docs and create Atomizer-HQ agent documentation

Archive Management:
- Moved RALPH_LOOP, CANVAS, and dashboard implementation plans to archive/review/ for CEO review
- Moved completed restructuring plan and protocol v1 to archive/historical/
- Moved old session summaries to archive/review/

New HQ Documentation (docs/hq/):
- README.md: Overview of Atomizer-HQ multi-agent optimization team
- PROJECT_STRUCTURE.md: Standard KB-integrated project layout with Hydrotech reference
- KB_CONVENTIONS.md: Knowledge Base accumulation principles with generation tracking
- AGENT_WORKFLOWS.md: Project lifecycle phases and agent handoffs (OP_09 integration)
- STUDY_CONVENTIONS.md: Technical study execution standards and atomizer_spec.json format

Index Update:
- Reorganized docs/00_INDEX.md with HQ docs prominent
- Updated structure to reflect new agent-focused organization
- Maintained core documentation access for engineers

No files deleted, only moved to appropriate archive locations.

2026-02-09 02:48:35 +00:00

Canvas Builder Robustness & Enhancement Plan

Created: January 21, 2026
Branch: feature/studio-enhancement
Status: Planning

Executive Summary

This plan covers the fixes and enhancements needed to make the Canvas Builder robust and production-ready:

1. Panel Management - Panels (Introspection, Config, Chat) disappear unexpectedly
2. Pre-run Validation - No validation before starting optimization
3. Error Handling - Poor feedback when things go wrong
4. Live Updates - Polling is inefficient; WebSocket updates are needed
5. Visualization - No convergence charts or progress indicators
6. Testing - No automated tests for critical flows

Phase 1: Panel Management System (HIGH PRIORITY)

Problem

- IntrospectionPanel disappears when the user clicks elsewhere on the canvas
- Panel state is lost (e.g., introspection results, expanded sections)
- No way to have multiple panels open simultaneously
- Chat panel and Config panel are mutually exclusive

Root Cause

// Current: Local state in ModelNodeConfig (NodeConfigPanelV2.tsx:275)
const [showIntrospection, setShowIntrospection] = useState(false);

// When selectedNodeId changes, ModelNodeConfig unmounts, losing state

Solution: Centralized Panel Store

Create usePanelStore.ts - a Zustand store for panel management:

// atomizer-dashboard/frontend/src/hooks/usePanelStore.ts

interface PanelState {
  // Panel visibility
  panels: {
    introspection: { open: boolean; filePath?: string; data?: IntrospectionResult };
    config: { open: boolean; nodeId?: string };
    chat: { open: boolean; powerMode: boolean };
    validation: { open: boolean; errors?: ValidationError[] };
    results: { open: boolean; trialId?: number };
  };
  
  // Actions
  openPanel: (panel: PanelName, data?: any) => void;
  closePanel: (panel: PanelName) => void;
  togglePanel: (panel: PanelName) => void;
  
  // Panel data persistence
  setIntrospectionData: (data: IntrospectionResult) => void;
  clearIntrospectionData: () => void;
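}

// PanelName is the union of the keys of `panels` above.
type PanelName = keyof PanelState['panels'];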
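The store's actions can be sketched as pure state transitions, independent of Zustand. This is a minimal illustration under stated assumptions, not the planned store itself: `PanelsState`, `initialPanels`, `selectNode` and the simplified `data` payload are hypothetical. It shows the property Phase 1 is after: changing the selected node only touches the config panel, so an open introspection panel keeps its state.

```typescript
// Hypothetical, framework-free sketch of the panel transitions.
// A Zustand store would wrap these functions in its `set` callback.
type PanelName = 'introspection' | 'config' | 'chat' | 'validation' | 'results';

interface PanelEntry {
  open: boolean;
  data?: unknown; // simplified stand-in for filePath / nodeId / errors / trialId
}

type PanelsState = Record<PanelName, PanelEntry>;

const initialPanels: PanelsState = {
  introspection: { open: false },
  config: { open: false },
  chat: { open: false },
  validation: { open: false },
  results: { open: false },
};

// Open a panel; keep previously stored data unless new data is supplied.
function openPanel(state: PanelsState, panel: PanelName, data?: unknown): PanelsState {
  const previous = state[panel];
  return { ...state, [panel]: { open: true, data: data ?? previous.data } };
}

// Close a panel but keep its data so it can be restored on reopen.
function closePanel(state: PanelsState, panel: PanelName): PanelsState {
  return { ...state, [panel]: { ...state[panel], open: false } };
}

function togglePanel(state: PanelsState, panel: PanelName): PanelsState {
  return state[panel].open ? closePanel(state, panel) : openPanel(state, panel);
}

// Selecting a node only affects the config panel; other panels are untouched.
function selectNode(state: PanelsState, nodeId: string | null): PanelsState {
  return nodeId === null ? closePanel(state, 'config') : openPanel(state, 'config', nodeId);
}
```

Because every transition returns a new object and leaves unrelated panels alone, the same functions can back both the sidebar panels and the floating ones.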

Implementation Tasks

Task	File	Description
1.1	`usePanelStore.ts`	Create Zustand store for panel state
1.2	`PanelContainer.tsx`	Create container that renders open panels
1.3	`IntrospectionPanel.tsx`	Refactor to use store instead of local state
1.4	`NodeConfigPanelV2.tsx`	Remove local panel state, use store
1.5	`CanvasView.tsx`	Integrate PanelContainer, remove chat panel logic
1.6	`SpecRenderer.tsx`	Add panel trigger buttons (introspect, validate)

UI Changes

Before:

[Canvas] [Config Panel OR Chat Panel]
         ↑ mutually exclusive

After:

[Canvas] [Right Panel Area]
         ├── Config Panel (pinnable)
         ├── Chat Panel (collapsible)
         └── Floating Panels:
             ├── Introspection (draggable, persistent)
             ├── Validation Results
             └── Trial Details

Panel Behaviors

Panel	Trigger	Persistence	Position
Config	Node click	While node selected	Right sidebar
Chat	Toggle button	Always available	Right sidebar (below config)
Introspection	"Introspect" button	Until explicitly closed	Floating, draggable
Validation	"Validate" or pre-run	Until fixed or dismissed	Floating
Results	Click on result badge	Until dismissed	Floating

Phase 2: Pre-run Validation (HIGH PRIORITY)

Problem

- User can click "Run" with an incomplete spec
- No feedback about missing extractors, objectives, or connections
- Optimization fails silently or with cryptic errors

Solution: Validation Pipeline

// Types of validation
interface ValidationResult {
  valid: boolean;
  errors: ValidationError[];   // Must fix before running
  warnings: ValidationWarning[]; // Can proceed but risky
}

interface ValidationError {
  code: string;
  severity: 'error' | 'warning';
  path: string;       // e.g., "objectives[0]"
  message: string;
  suggestion?: string;
  autoFix?: () => void;
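}

// Assumption: warnings share the ValidationError shape and differ only in `severity`.
type ValidationWarning = ValidationError;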

Validation Rules

Rule	Severity	Message
No design variables	Error	"Add at least one design variable"
No objectives	Error	"Add at least one objective"
Objective not connected to extractor	Error	"Objective '{name}' has no source extractor"
Extractor type not set	Error	"Extractor '{name}' needs a type selected"
Design var bounds invalid	Error	"Min must be less than max for '{name}'"
No model file	Error	"No simulation file configured"
Custom extractor no code	Warning	"Custom extractor '{name}' has no code"
High trial count (>500)	Warning	"Large budget may take hours to complete"
Single trial	Warning	"Only 1 trial - results won't be meaningful"
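The rules in the table above could be implemented roughly as follows. This is a sketch only: the `Spec` shape (`modelFile`, `designVariables`, `extractors`, `objectives`, `maxTrials`) and the error codes are assumptions made for illustration and may not match the real `atomizer_spec.json`. The custom-extractor rule is left out because it needs fields this sketch does not model.

```typescript
// Hypothetical spec shape; the real atomizer_spec.json fields may differ.
interface DesignVariable { name: string; min: number; max: number; }
interface Extractor { name: string; type?: string; }
interface Objective { name: string; extractor?: string; }
interface Spec {
  modelFile?: string;
  designVariables: DesignVariable[];
  extractors: Extractor[];
  objectives: Objective[];
  maxTrials: number;
}

interface Issue {
  code: string; // hypothetical codes, not defined by the plan
  severity: 'error' | 'warning';
  path: string;
  message: string;
}

function validateSpec(spec: Spec): { valid: boolean; errors: Issue[]; warnings: Issue[] } {
  const issues: Issue[] = [];
  const add = (severity: Issue['severity'], code: string, path: string, message: string): void => {
    issues.push({ severity, code, path, message });
  };

  if (!spec.modelFile) add('error', 'NO_MODEL', 'modelFile', 'No simulation file configured');
  if (spec.designVariables.length === 0) {
    add('error', 'NO_DESIGN_VARS', 'designVariables', 'Add at least one design variable');
  }
  if (spec.objectives.length === 0) {
    add('error', 'NO_OBJECTIVES', 'objectives', 'Add at least one objective');
  }

  spec.designVariables.forEach((dv, i) => {
    // `!(min < max)` also rejects NaN bounds.
    if (!(dv.min < dv.max)) {
      add('error', 'INVALID_BOUNDS', `designVariables[${i}]`, `Min must be less than max for '${dv.name}'`);
    }
  });
  spec.extractors.forEach((ex, i) => {
    if (!ex.type) {
      add('error', 'NO_EXTRACTOR_TYPE', `extractors[${i}]`, `Extractor '${ex.name}' needs a type selected`);
    }
  });
  const extractorNames = new Set(spec.extractors.map((ex) => ex.name));
  spec.objectives.forEach((obj, i) => {
    if (!obj.extractor || !extractorNames.has(obj.extractor)) {
      add('error', 'OBJECTIVE_UNCONNECTED', `objectives[${i}]`, `Objective '${obj.name}' has no source extractor`);
    }
  });

  if (spec.maxTrials > 500) {
    add('warning', 'HIGH_TRIAL_COUNT', 'maxTrials', 'Large budget may take hours to complete');
  }
  if (spec.maxTrials === 1) {
    add('warning', 'SINGLE_TRIAL', 'maxTrials', "Only 1 trial - results won't be meaningful");
  }

  const errors = issues.filter((issue) => issue.severity === 'error');
  const warnings = issues.filter((issue) => issue.severity === 'warning');
  return { valid: errors.length === 0, errors, warnings };
}
```

Keeping the validator a pure function of the spec means the same rules can run in the client and be mirrored by the server-side endpoint.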

Implementation Tasks

Task	File	Description
2.1	`validation/specValidator.ts`	Client-side validation rules
2.2	`ValidationPanel.tsx`	Display validation results
2.3	`SpecRenderer.tsx`	Add "Validate" button, pre-run check
2.4	`api/routes/spec.py`	Server-side validation endpoint
2.5	`useSpecStore.ts`	Add `validate()` action

UI Flow

User clicks "Run Optimization"
    ↓
[Validate Spec] ──failed──→ [Show ValidationPanel]
    ↓ passed                      │
[Confirm Dialog]                  │
    ↓ confirmed                   │
[Start Optimization] ←── fix ─────┘
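The flow above amounts to a small gate in front of the run action. The sketch below assumes hypothetical injected dependencies (`RunDeps`) so the sequence can be exercised without a UI; it is kept synchronous for brevity, whereas a real implementation would likely make `confirm` and `startOptimization` async.

```typescript
// Hypothetical dependencies, injected so the flow can be tested without a UI.
interface RunDeps {
  validate: () => { valid: boolean };
  showValidationPanel: () => void;
  confirm: () => boolean;
  startOptimization: () => void;
}

type RunOutcome = 'blocked' | 'cancelled' | 'started';

function runOptimizationFlow(deps: RunDeps): RunOutcome {
  if (!deps.validate().valid) {
    deps.showValidationPanel(); // failed: show the errors instead of running
    return 'blocked';
  }
  if (!deps.confirm()) return 'cancelled'; // user declined the confirm dialog
  deps.startOptimization();
  return 'started';
}
```

After the user fixes the reported errors, clicking "Run Optimization" again simply re-enters the same function.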

Phase 3: Error Handling & Recovery (HIGH PRIORITY)

Problem

- NX crashes don't show useful feedback
- Solver failures leave the user confused
- No way to resume after errors

Solution: Error Classification & Display

interface OptimizationError {
  type: 'nx_crash' | 'solver_fail' | 'extractor_error' | 'config_error' | 'system_error';
  trial?: number;
  message: string;
  details?: string;
  recoverable: boolean;
  suggestions: string[];
}

Error Handling Strategy

Error Type	Display	Recovery
NX Crash	Toast + Error Panel	Retry trial, skip trial
Solver Failure	Badge on trial	Mark infeasible, continue
Extractor Error	Log + badge	Use NaN, continue
Config Error	Block run	Show validation panel
System Error	Full modal	Restart optimization
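One way to keep this strategy in a single place is a lookup table keyed by error type. The sketch below copies the display and recovery entries from the table; `ErrorPolicy`, `ERROR_POLICIES`, `policyFor` and the fallback to `system_error` for unknown types are assumptions, not part of the plan.

```typescript
type ErrorType = 'nx_crash' | 'solver_fail' | 'extractor_error' | 'config_error' | 'system_error';

interface ErrorPolicy {
  display: string;    // how the error is surfaced in the UI
  recovery: string[]; // recovery options offered or applied
}

// `Record<ErrorType, ...>` makes the compiler flag any error type without a policy.
const ERROR_POLICIES: Record<ErrorType, ErrorPolicy> = {
  nx_crash: { display: 'Toast + Error Panel', recovery: ['Retry trial', 'Skip trial'] },
  solver_fail: { display: 'Badge on trial', recovery: ['Mark infeasible', 'Continue'] },
  extractor_error: { display: 'Log + badge', recovery: ['Use NaN', 'Continue'] },
  config_error: { display: 'Block run', recovery: ['Show validation panel'] },
  system_error: { display: 'Full modal', recovery: ['Restart optimization'] },
};

function isErrorType(value: string): value is ErrorType {
  return Object.prototype.hasOwnProperty.call(ERROR_POLICIES, value);
}

// Assumption: an unrecognized type from the backend is treated as a system error.
function policyFor(type: string): ErrorPolicy {
  return ERROR_POLICIES[isErrorType(type) ? type : 'system_error'];
}
```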

Implementation Tasks

Task	File	Description
3.1	`ErrorBoundary.tsx`	Wrap canvas in error boundary
3.2	`ErrorPanel.tsx`	Detailed error display with suggestions
3.3	`optimization.py`	Enhanced error responses with type/recovery
3.4	`SpecRenderer.tsx`	Error state handling, retry buttons
3.5	`useOptimizationStatus.ts`	Hook for status polling with error handling

Phase 4: Live Updates via WebSocket (MEDIUM PRIORITY)

Problem

- The current 3 s polling is inefficient and adds latency
- Updates between polls can be missed
- No real-time progress indication

Solution: WebSocket for Trial Updates

// WebSocket events
interface TrialStartEvent {
  type: 'trial_start';
  trial_number: number;
  params: Record<string, number>;
}

interface TrialCompleteEvent {
  type: 'trial_complete';
  trial_number: number;
  objectives: Record<string, number>;
  is_best: boolean;
  is_feasible: boolean;
}

interface OptimizationCompleteEvent {
  type: 'optimization_complete';
  best_trial: number;
  total_trials: number;
}
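On the frontend, these events form a discriminated union that can be parsed and folded into run state. The following is a sketch under assumptions: `RunState`, `parseEvent` and `applyEvent` are hypothetical names, the events are assumed to arrive as JSON text, and field-level validation is omitted.

```typescript
interface TrialStartEvent {
  type: 'trial_start';
  trial_number: number;
  params: Record<string, number>;
}
interface TrialCompleteEvent {
  type: 'trial_complete';
  trial_number: number;
  objectives: Record<string, number>;
  is_best: boolean;
  is_feasible: boolean;
}
interface OptimizationCompleteEvent {
  type: 'optimization_complete';
  best_trial: number;
  total_trials: number;
}
type OptimizationEvent = TrialStartEvent | TrialCompleteEvent | OptimizationCompleteEvent;

// Hypothetical UI-facing state derived from the event stream.
interface RunState {
  running: boolean;
  currentTrial: number | null;
  completed: number;
  bestTrial: number | null;
}

// Parse a raw message; returns null for malformed or unknown messages.
function parseEvent(raw: string): OptimizationEvent | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== 'object' || value === null) return null;
  const type = (value as { type?: unknown }).type;
  if (type === 'trial_start' || type === 'trial_complete' || type === 'optimization_complete') {
    return value as OptimizationEvent; // only the `type` tag is checked in this sketch
  }
  return null;
}

// Fold one event into the run state; the switch is exhaustive over the union.
function applyEvent(state: RunState, event: OptimizationEvent): RunState {
  switch (event.type) {
    case 'trial_start':
      return { ...state, running: true, currentTrial: event.trial_number };
    case 'trial_complete':
      return {
        ...state,
        currentTrial: null,
        completed: state.completed + 1,
        bestTrial: event.is_best ? event.trial_number : state.bestTrial,
      };
    case 'optimization_complete':
      return { ...state, running: false, currentTrial: null, bestTrial: event.best_trial };
  }
}
```

A WebSocket `onmessage` handler would call `parseEvent` on each message and pass non-null results to `applyEvent`.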

Implementation Tasks

Task	File	Description
4.1	`websocket.py`	Add optimization events to WS
4.2	`run_optimization.py`	Emit events during optimization
4.3	`useOptimizationWebSocket.ts`	Hook for WS subscription
4.4	`SpecRenderer.tsx`	Use WS instead of polling
4.5	`ResultBadge.tsx`	Animate on new results

Phase 5: Convergence Visualization (MEDIUM PRIORITY)

Problem

- No visual feedback on optimization progress
- Can't tell whether the run is converging or stuck
- No Pareto front visualization for multi-objective runs

Solution: Embedded Charts

Components

Component	Description
`ConvergenceSparkline`	Tiny chart in ObjectiveNode showing trend
`ProgressRing`	Circular progress in header (trials/total)
`ConvergenceChart`	Full chart in Results panel
`ParetoPlot`	2D Pareto front for multi-objective
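For `ParetoPlot`, the data to draw is the set of non-dominated trials. A minimal sketch of that computation for two objectives follows; it assumes both objectives are minimized, and `Point2D`, `f1`, `f2` and `paretoFront` are hypothetical names.

```typescript
interface Point2D {
  trial: number;
  f1: number; // first objective (assumed minimized)
  f2: number; // second objective (assumed minimized)
}

// Returns the non-dominated points, ordered by increasing f1.
function paretoFront(points: Point2D[]): Point2D[] {
  // Sort by f1, breaking ties by f2, so any point that could dominate p comes before p.
  const sorted = [...points].sort((a, b) => a.f1 - b.f1 || a.f2 - b.f2);
  const front: Point2D[] = [];
  let bestF2 = Infinity;
  for (const p of sorted) {
    // p is kept only if it improves f2 over every point with smaller or equal f1.
    if (p.f2 < bestF2) {
      front.push(p);
      bestF2 = p.f2;
    }
  }
  return front;
}
```

Infeasible trials and trials with NaN objectives would need to be filtered out before calling this; the sketch does not handle them.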

Implementation Tasks

Task	File	Description
5.1	`ConvergenceSparkline.tsx`	SVG sparkline component
5.2	`ObjectiveNode.tsx`	Integrate sparkline
5.3	`ProgressRing.tsx`	Circular progress indicator
5.4	`ConvergenceChart.tsx`	Full chart with Recharts
5.5	`ResultsPanel.tsx`	Panel showing detailed results
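The sparkline in task 5.1 needs two small pieces of logic: a best-so-far series and a mapping from values to SVG coordinates. The sketch below is one possible approach, assuming a minimized objective; `runningBest` and `sparklinePoints` are hypothetical helpers, and the returned string is meant for the `points` attribute of an SVG `<polyline>`.

```typescript
// Best value seen so far at each trial (minimization assumed).
// Non-finite values such as NaN from failed extractors are skipped.
function runningBest(values: number[]): number[] {
  const out: number[] = [];
  let best = Infinity;
  for (const v of values) {
    if (Number.isFinite(v) && v < best) best = v;
    out.push(best);
  }
  return out;
}

// Map a series onto a width x height box as "x,y x,y ..." polyline points.
function sparklinePoints(values: number[], width: number, height: number): string {
  const finite = values.filter((v) => Number.isFinite(v));
  if (finite.length === 0) return '';
  const min = Math.min(...finite);
  const max = Math.max(...finite);
  const span = max - min || 1; // avoid division by zero for a flat series
  const step = finite.length > 1 ? width / (finite.length - 1) : 0;
  return finite
    .map((v, i) => {
      const x = i * step;
      const y = height - ((v - min) / span) * height; // SVG y grows downward
      return `${x.toFixed(1)},${y.toFixed(1)}`;
    })
    .join(' ');
}
```

A flat line at the bottom of the sparkline then signals a stalled run, while downward steps show improvement.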

Phase 6: End-to-End Testing (MEDIUM PRIORITY)

Problem

- No automated tests for canvas operations
- Manual testing is time-consuming and error-prone
- Regressions go unnoticed

Solution: Playwright E2E Tests

Test Scenarios

Test	Steps	Assertions
Load study	Navigate to /canvas/{id}	Spec loads, nodes render
Add design var	Drag from palette	Node appears, spec updates
Connect nodes	Drag edge	Edge renders, spec has edge
Edit node	Click node, change value	Value persists, API called
Run validation	Click validate	Errors shown for incomplete
Start optimization	Complete spec, click run	Status shows running
View results	Wait for trial	Badge shows value
Stop optimization	Click stop	Status shows stopped

Implementation Tasks

Task	File	Description
6.1	`e2e/canvas.spec.ts`	Basic canvas operations
6.2	`e2e/optimization.spec.ts`	Run/stop/status flow
6.3	`e2e/panels.spec.ts`	Panel open/close/persist
6.4	`playwright.config.ts`	Configure Playwright
6.5	CI workflow	Run tests in GitHub Actions

Implementation Order

Week 1:
├── Phase 1: Panel Management (critical UX fix)
│   ├── Day 1-2: usePanelStore + PanelContainer
│   └── Day 3-4: Refactor existing panels
│
└── Phase 2: Validation (prevent user errors)
    └── Day 5: Validation rules + UI

Week 2:
├── Phase 3: Error Handling
│   ├── Day 1-2: Error types + ErrorPanel
│   └── Day 3: Integration with optimization flow
│
└── Phase 4: WebSocket Updates
    └── Day 4-5: WS events + frontend hook

Week 3:
├── Phase 5: Visualization
│   ├── Day 1-2: Sparklines
│   └── Day 3: Progress indicators
│
└── Phase 6: Testing
    └── Day 4-5: Playwright setup + core tests

Quick Wins (Can Do Now)

These can be implemented immediately with minimal changes:

1. Persist introspection data in localStorage
   - Cache introspection results
   - Restore on panel reopen
2. Add loading states to all buttons
   - Disable during operations
   - Show spinners
3. Add confirmation dialogs
   - Before stopping optimization
   - Before clearing canvas
4. Improve error messages
   - Parse NX error logs
   - Show actionable suggestions
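The first quick win, caching introspection results, might look like the sketch below. It is written against a small subset of the Web Storage API so that `window.localStorage` can be passed in the browser; the key prefix, function names and the in-memory stand-in are assumptions for illustration.

```typescript
// Subset of the Web Storage API; `window.localStorage` satisfies it in the browser.
interface KeyValueStorage {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const CACHE_PREFIX = 'atomizer:introspection:'; // hypothetical key namespace

// Cache a result for a model file; caching is best-effort.
function saveIntrospection(storage: KeyValueStorage, filePath: string, result: unknown): void {
  try {
    storage.setItem(CACHE_PREFIX + filePath, JSON.stringify(result));
  } catch {
    // Quota exceeded, storage disabled, or unserializable data: skip caching.
  }
}

// Restore a cached result, or null if absent or unreadable.
function loadIntrospection<T>(storage: KeyValueStorage, filePath: string): T | null {
  const raw = storage.getItem(CACHE_PREFIX + filePath);
  if (raw === null) return null;
  try {
    return JSON.parse(raw) as T;
  } catch {
    storage.removeItem(CACHE_PREFIX + filePath); // drop corrupted entries
    return null;
  }
}

// In-memory stand-in for tests and non-browser environments.
function createMemoryStorage(): KeyValueStorage {
  const map = new Map<string, string>();
  return {
    getItem: (key) => map.get(key) ?? null,
    setItem: (key, value) => { map.set(key, value); },
    removeItem: (key) => { map.delete(key); },
  };
}
```

On panel reopen, the panel would call `loadIntrospection` first and only trigger a fresh introspection when it returns null.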

Files to Create/Modify

New Files

atomizer-dashboard/frontend/src/
├── hooks/
│   ├── usePanelStore.ts
│   └── useOptimizationWebSocket.ts
├── components/canvas/
│   ├── PanelContainer.tsx
│   ├── panels/
│   │   ├── ValidationPanel.tsx
│   │   ├── ErrorPanel.tsx
│   │   └── ResultsPanel.tsx
│   └── visualization/
│       ├── ConvergenceSparkline.tsx
│       ├── ProgressRing.tsx
│       └── ConvergenceChart.tsx
└── lib/
    └── validation/
        └── specValidator.ts

e2e/
├── canvas.spec.ts
├── optimization.spec.ts
└── panels.spec.ts

Modified Files

atomizer-dashboard/frontend/src/
├── pages/CanvasView.tsx
├── components/canvas/SpecRenderer.tsx
├── components/canvas/panels/IntrospectionPanel.tsx
├── components/canvas/panels/NodeConfigPanelV2.tsx
├── components/canvas/nodes/ObjectiveNode.tsx
└── hooks/useSpecStore.ts

atomizer-dashboard/backend/api/
├── routes/optimization.py
├── routes/spec.py
└── websocket.py

Success Criteria

Phase	Success Metric
1	Introspection panel persists across node selections
2	Invalid spec shows clear error before run
3	NX errors display with recovery options
4	Results update within 500ms of trial completion
5	Convergence trend visible on objective nodes
6	All E2E tests pass in CI

Next Steps

1. Review this plan
2. Start with Phase 1 (Panel Management) - fixes your immediate issue
3. Implement incrementally, commit after each phase
