# Canvas Builder Robustness & Enhancement Plan

**Created**: January 21, 2026
**Branch**: `feature/studio-enhancement`
**Status**: Planning

---

## Executive Summary

This plan addresses critical issues and enhancements to make the Canvas Builder robust and production-ready:

1. **Panel Management** - Panels (Introspection, Config, Chat) disappear unexpectedly
2. **Pre-run Validation** - No validation before starting optimization
3. **Error Handling** - Poor feedback when things go wrong
4. **Live Updates** - Polling is inefficient; need WebSocket
5. **Visualization** - No convergence charts or progress indicators
6. **Testing** - No automated tests for critical flows

---

## Phase 1: Panel Management System (HIGH PRIORITY)

### Problem

- IntrospectionPanel disappears when user clicks elsewhere on canvas
- Panel state is lost (e.g., introspection results, expanded sections)
- No way to have multiple panels open simultaneously
- Chat panel and Config panel are mutually exclusive

### Root Cause

```typescript
// Current: Local state in ModelNodeConfig (NodeConfigPanelV2.tsx:275)
const [showIntrospection, setShowIntrospection] = useState(false);

// When selectedNodeId changes, ModelNodeConfig unmounts, losing state
```

### Solution: Centralized Panel Store

Create `usePanelStore.ts` - a Zustand store for panel management:

```typescript
// atomizer-dashboard/frontend/src/hooks/usePanelStore.ts

// Panel names are derived from the keys of `panels` below
type PanelName = keyof PanelState['panels'];

interface PanelState {
  // Panel visibility
  panels: {
    introspection: { open: boolean; filePath?: string; data?: IntrospectionResult };
    config: { open: boolean; nodeId?: string };
    chat: { open: boolean; powerMode: boolean };
    validation: { open: boolean; errors?: ValidationError[] };
    results: { open: boolean; trialId?: number };
  };

  // Actions
  openPanel: (panel: PanelName, data?: any) => void;
  closePanel: (panel: PanelName) => void;
  togglePanel: (panel: PanelName) => void;

  // Panel data persistence
  setIntrospectionData: (data: IntrospectionResult) => void;
  clearIntrospectionData: () => void;
}
```

### Implementation Tasks

| Task | File | Description |
|------|------|-------------|
| 1.1 | `usePanelStore.ts` | Create Zustand store for panel state |
| 1.2 | `PanelContainer.tsx` | Create container that renders open panels |
| 1.3 | `IntrospectionPanel.tsx` | Refactor to use store instead of local state |
| 1.4 | `NodeConfigPanelV2.tsx` | Remove local panel state, use store |
| 1.5 | `CanvasView.tsx` | Integrate PanelContainer, remove chat panel logic |
| 1.6 | `SpecRenderer.tsx` | Add panel trigger buttons (introspect, validate) |

### UI Changes

**Before:**
```
[Canvas]                    [Config Panel OR Chat Panel]
                            ↑ mutually exclusive
```

**After:**
```
[Canvas]                    [Right Panel Area]
                            ├── Config Panel (pinnable)
                            ├── Chat Panel (collapsible)
                            └── Floating Panels:
                                ├── Introspection (draggable, persistent)
                                ├── Validation Results
                                └── Trial Details
```

### Panel Behaviors

| Panel | Trigger | Persistence | Position |
|-------|---------|-------------|----------|
| **Config** | Node click | While node selected | Right sidebar |
| **Chat** | Toggle button | Always available | Right sidebar (below config) |
| **Introspection** | "Introspect" button | Until explicitly closed | Floating, draggable |
| **Validation** | "Validate" or pre-run | Until fixed or dismissed | Floating |
| **Results** | Click on result badge | Until dismissed | Floating |

---

## Phase 2: Pre-run Validation (HIGH PRIORITY)

### Problem

- User can click "Run" with incomplete spec
- No feedback about missing extractors, objectives, or connections
- Optimization fails silently or with cryptic errors

### Solution: Validation Pipeline

```typescript
// Types of validation
interface ValidationResult {
  valid: boolean;
  errors: ValidationError[];      // Must fix before running
  warnings: ValidationWarning[];  // Can proceed but risky
}

interface ValidationError {
  code: string;
  severity: 'error' | 'warning';
  path: string;  // e.g., "objectives[0]"
  message: string;
  suggestion?: string;
  autoFix?: () => void;
}

// Warnings share the error shape; only `severity` differs
type ValidationWarning = ValidationError;
```

### Validation Rules
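As a starting point for `specValidator.ts`, the rules in this section could be expressed as one pure function. This is a minimal sketch under assumptions: the `Spec` shape, its field names, the `'custom'` type identifier, and the issue `code` strings are hypothetical and not taken from the real spec schema. Only the messages come from the rules listed in this section.

```typescript
// Hypothetical spec shape for illustration; the real schema may differ.
interface DesignVariable { name: string; min: number; max: number }
interface Extractor { name: string; type?: string; code?: string }
interface Objective { name: string; extractor?: string } // name of the source extractor
interface Spec {
  modelFile?: string;
  designVariables: DesignVariable[];
  extractors: Extractor[];
  objectives: Objective[];
  trials: number;
}

interface Issue {
  code: string; // hypothetical identifiers
  severity: 'error' | 'warning';
  path: string;
  message: string;
}

interface SpecValidation { valid: boolean; errors: Issue[]; warnings: Issue[] }

function validateSpec(spec: Spec): SpecValidation {
  const issues: Issue[] = [];
  const add = (severity: Issue['severity'], code: string, path: string, message: string): void => {
    issues.push({ code, severity, path, message });
  };

  // Spec-level errors
  if (!spec.modelFile) add('error', 'no_model_file', 'modelFile', 'No simulation file configured');
  if (spec.designVariables.length === 0) {
    add('error', 'no_design_variables', 'designVariables', 'Add at least one design variable');
  }
  if (spec.objectives.length === 0) {
    add('error', 'no_objectives', 'objectives', 'Add at least one objective');
  }

  // Per-item checks
  spec.designVariables.forEach((dv, i) => {
    if (!(dv.min < dv.max)) {
      add('error', 'invalid_bounds', `designVariables[${i}]`, `Min must be less than max for '${dv.name}'`);
    }
  });
  spec.extractors.forEach((ex, i) => {
    if (!ex.type) {
      add('error', 'extractor_type_missing', `extractors[${i}]`, `Extractor '${ex.name}' needs a type selected`);
    } else if (ex.type === 'custom' && !ex.code) {
      add('warning', 'custom_extractor_empty', `extractors[${i}]`, `Custom extractor '${ex.name}' has no code`);
    }
  });
  const extractorNames = new Set(spec.extractors.map((ex) => ex.name));
  spec.objectives.forEach((obj, i) => {
    if (!obj.extractor || !extractorNames.has(obj.extractor)) {
      add('error', 'objective_unconnected', `objectives[${i}]`, `Objective '${obj.name}' has no source extractor`);
    }
  });

  // Budget warnings
  if (spec.trials > 500) add('warning', 'high_trial_count', 'trials', 'Large budget may take hours to complete');
  if (spec.trials === 1) add('warning', 'single_trial', 'trials', "Only 1 trial - results won't be meaningful");

  const errors = issues.filter((issue) => issue.severity === 'error');
  const warnings = issues.filter((issue) => issue.severity === 'warning');
  return { valid: errors.length === 0, errors, warnings };
}
```

Keeping the validator free of React and store imports would let the same rules run in the pre-run check and in unit tests, and make it easier to keep the server-side endpoint in step with them.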

| Rule | Severity | Message |
|------|----------|---------|
| No design variables | Error | "Add at least one design variable" |
| No objectives | Error | "Add at least one objective" |
| Objective not connected to extractor | Error | "Objective '{name}' has no source extractor" |
| Extractor type not set | Error | "Extractor '{name}' needs a type selected" |
| Design var bounds invalid | Error | "Min must be less than max for '{name}'" |
| No model file | Error | "No simulation file configured" |
| Custom extractor no code | Warning | "Custom extractor '{name}' has no code" |
| High trial count (>500) | Warning | "Large budget may take hours to complete" |
| Single trial | Warning | "Only 1 trial - results won't be meaningful" |

### Implementation Tasks

| Task | File | Description |
|------|------|-------------|
| 2.1 | `validation/specValidator.ts` | Client-side validation rules |
| 2.2 | `ValidationPanel.tsx` | Display validation results |
| 2.3 | `SpecRenderer.tsx` | Add "Validate" button, pre-run check |
| 2.4 | `api/routes/spec.py` | Server-side validation endpoint |
| 2.5 | `useSpecStore.ts` | Add `validate()` action |

### UI Flow

```
User clicks "Run Optimization"
          ↓
[Validate Spec] ──failed──→ [Show ValidationPanel]
          ↓ passed                    │
[Confirm Dialog]                      │
          ↓ confirmed                 │
[Start Optimization] ←──── fix ───────┘
```

---

## Phase 3: Error Handling & Recovery (HIGH PRIORITY)

### Problem

- NX crashes don't show useful feedback
- Solver failures leave user confused
- No way to resume after errors

### Solution: Error Classification & Display

```typescript
interface OptimizationError {
  type: 'nx_crash' | 'solver_fail' | 'extractor_error' | 'config_error' | 'system_error';
  trial?: number;
  message: string;
  details?: string;
  recoverable: boolean;
  suggestions: string[];
}
```

### Error Handling Strategy

| Error Type | Display | Recovery |
|------------|---------|----------|
| NX Crash | Toast + Error Panel | Retry trial, skip trial |
| Solver Failure | Badge on trial | Mark infeasible, continue |
| Extractor Error | Log + badge | Use NaN, continue |
| Config Error | Block run | Show validation panel |
| System Error | Full modal | Restart optimization |

### Implementation Tasks

| Task | File | Description |
|------|------|-------------|
| 3.1 | `ErrorBoundary.tsx` | Wrap canvas in error boundary |
| 3.2 | `ErrorPanel.tsx` | Detailed error display with suggestions |
| 3.3 | `optimization.py` | Enhanced error responses with type/recovery |
| 3.4 | `SpecRenderer.tsx` | Error state handling, retry buttons |
| 3.5 | `useOptimizationStatus.ts` | Hook for status polling with error handling |

---

## Phase 4: Live Updates via WebSocket (MEDIUM PRIORITY)

### Problem

- Current polling (3s) is inefficient and has latency
- Missed updates between polls
- No real-time progress indication

### Solution: WebSocket for Trial Updates

```typescript
// WebSocket events
interface TrialStartEvent {
  type: 'trial_start';
  trial_number: number;
  params: Record<string, number>;
}

interface TrialCompleteEvent {
  type: 'trial_complete';
  trial_number: number;
  objectives: Record<string, number>;
  is_best: boolean;
  is_feasible: boolean;
}

interface OptimizationCompleteEvent {
  type: 'optimization_complete';
  best_trial: number;
  total_trials: number;
}
```

### Implementation Tasks

| Task | File | Description |
|------|------|-------------|
| 4.1 | `websocket.py` | Add optimization events to WS |
| 4.2 | `run_optimization.py` | Emit events during optimization |
| 4.3 | `useOptimizationWebSocket.ts` | Hook for WS subscription |
| 4.4 | `SpecRenderer.tsx` | Use WS instead of polling |
| 4.5 | `ResultBadge.tsx` | Animate on new results |

---

## Phase 5: Convergence Visualization (MEDIUM PRIORITY)

### Problem

- No visual feedback on optimization progress
- Can't tell if converging or stuck
- No Pareto front visualization for multi-objective

### Solution: Embedded Charts

### Components

| Component | Description |
|-----------|-------------|
| `ConvergenceSparkline` | Tiny chart in ObjectiveNode showing trend |
| `ProgressRing` | Circular progress in header (trials/total) |
| `ConvergenceChart` | Full chart in Results panel |
| `ParetoPlot` | 2D Pareto front for multi-objective |

### Implementation Tasks

| Task | File | Description |
|------|------|-------------|
| 5.1 | `ConvergenceSparkline.tsx` | SVG sparkline component |
| 5.2 | `ObjectiveNode.tsx` | Integrate sparkline |
| 5.3 | `ProgressRing.tsx` | Circular progress indicator |
| 5.4 | `ConvergenceChart.tsx` | Full chart with Recharts |
| 5.5 | `ResultsPanel.tsx` | Panel showing detailed results |

---

## Phase 6: End-to-End Testing (MEDIUM PRIORITY)

### Problem

- No automated tests for canvas operations
- Manual testing is time-consuming and error-prone
- Regressions go unnoticed

### Solution: Playwright E2E Tests

### Test Scenarios

| Test | Steps | Assertions |
|------|-------|------------|
| Load study | Navigate to /canvas/{id} | Spec loads, nodes render |
| Add design var | Drag from palette | Node appears, spec updates |
| Connect nodes | Drag edge | Edge renders, spec has edge |
| Edit node | Click node, change value | Value persists, API called |
| Run validation | Click validate | Errors shown for incomplete |
| Start optimization | Complete spec, click run | Status shows running |
| View results | Wait for trial | Badge shows value |
| Stop optimization | Click stop | Status shows stopped |

### Implementation Tasks

| Task | File | Description |
|------|------|-------------|
| 6.1 | `e2e/canvas.spec.ts` | Basic canvas operations |
| 6.2 | `e2e/optimization.spec.ts` | Run/stop/status flow |
| 6.3 | `e2e/panels.spec.ts` | Panel open/close/persist |
| 6.4 | `playwright.config.ts` | Configure Playwright |
| 6.5 | `CI workflow` | Run tests in GitHub Actions |

---

## Implementation Order

```
Week 1:
├── Phase 1: Panel Management (critical UX fix)
│   ├── Day 1-2: usePanelStore + PanelContainer
│   └── Day 3-4: Refactor existing panels
│
├── Phase 2: Validation (prevent user errors)
│   └── Day 5: Validation rules + UI

Week 2:
├── Phase 3: Error Handling
│   ├── Day 1-2: Error types + ErrorPanel
│   └── Day 3: Integration with optimization flow
│
├── Phase 4: WebSocket Updates
│   └── Day 4-5: WS events + frontend hook

Week 3:
├── Phase 5: Visualization
│   ├── Day 1-2: Sparklines
│   └── Day 3: Progress indicators
│
├── Phase 6: Testing
│   └── Day 4-5: Playwright setup + core tests
```

---

## Quick Wins (Can Do Now)

These can be implemented immediately with minimal changes:

1. **Persist introspection data in localStorage**
   - Cache introspection results
   - Restore on panel reopen

2. **Add loading states to all buttons**
   - Disable during operations
   - Show spinners

3. **Add confirmation dialogs**
   - Before stopping optimization
   - Before clearing canvas

4. **Improve error messages**
   - Parse NX error logs
   - Show actionable suggestions

---

## Files to Create/Modify

### New Files

```
atomizer-dashboard/frontend/src/
├── hooks/
│   ├── usePanelStore.ts
│   └── useOptimizationWebSocket.ts
├── components/canvas/
│   ├── PanelContainer.tsx
│   ├── panels/
│   │   ├── ValidationPanel.tsx
│   │   ├── ErrorPanel.tsx
│   │   └── ResultsPanel.tsx
│   └── visualization/
│       ├── ConvergenceSparkline.tsx
│       ├── ProgressRing.tsx
│       └── ConvergenceChart.tsx
└── lib/
    └── validation/
        └── specValidator.ts

e2e/
├── canvas.spec.ts
├── optimization.spec.ts
└── panels.spec.ts
```

### Modified Files

```
atomizer-dashboard/frontend/src/
├── pages/CanvasView.tsx
├── components/canvas/SpecRenderer.tsx
├── components/canvas/panels/IntrospectionPanel.tsx
├── components/canvas/panels/NodeConfigPanelV2.tsx
├── components/canvas/nodes/ObjectiveNode.tsx
└── hooks/useSpecStore.ts

atomizer-dashboard/backend/api/
├── routes/optimization.py
├── routes/spec.py
└── websocket.py
```

---

## Success Criteria

| Phase | Success Metric |
|-------|----------------|
| 1 | Introspection panel persists across node selections |
| 2 | Invalid spec shows clear error before run |
| 3 | NX errors display with recovery options |
| 4 | Results update within 500ms of trial completion |
| 5 | Convergence trend visible on objective nodes |
| 6 | All E2E tests pass in CI |

---

## Next Steps

1. Review this plan
2. Start with Phase 1 (Panel Management) - fixes your immediate issue
3. Implement incrementally, commit after each phase
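
Since Phase 1 is the suggested starting point, the panel transitions can be prototyped without any dependencies before they are wired into Zustand. The sketch below is an illustration under assumptions, not the planned `usePanelStore.ts`: it models the store's three actions as pure functions, flattens each panel to `{ open, data }`, and the names `PanelEntry`, `Panels`, and `initialPanels` are hypothetical. It also assumes that closing a panel keeps its data so that reopening restores it.

```typescript
// Dependency-free model of the Phase 1 panel transitions (hypothetical shape).
type PanelName = 'introspection' | 'config' | 'chat' | 'validation' | 'results';

interface PanelEntry {
  open: boolean;
  data?: unknown; // e.g. introspection results, selected node id, trial id
}

type Panels = Record<PanelName, PanelEntry>;

const initialPanels: Panels = {
  introspection: { open: false },
  config: { open: false },
  chat: { open: false },
  validation: { open: false },
  results: { open: false },
};

// Open one panel without touching the others; keep old data if none is given.
function openPanel(state: Panels, panel: PanelName, data?: unknown): Panels {
  const previous = state[panel];
  return { ...state, [panel]: { open: true, data: data ?? previous.data } };
}

// Hide a panel but keep its data (assumption: data is cleared separately).
function closePanel(state: Panels, panel: PanelName): Panels {
  return { ...state, [panel]: { ...state[panel], open: false } };
}

function togglePanel(state: Panels, panel: PanelName): Panels {
  return state[panel].open ? closePanel(state, panel) : openPanel(state, panel);
}

// Usage: introspection stays open while the selected node changes.
let demoPanels = openPanel(initialPanels, 'introspection', { filePath: 'model.prt' });
demoPanels = openPanel(demoPanels, 'config', { nodeId: 'dv_1' });
demoPanels = openPanel(demoPanels, 'config', { nodeId: 'obj_1' });
```

Because each panel has its own entry, opening the config panel for a different node leaves the introspection entry untouched, which is the behavior the Phase 1 success criterion asks for.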