Phase 1 - Panel Management System:
- Create usePanelStore.ts for centralized panel state management
- Add PanelContainer.tsx for draggable floating panels
- Create FloatingIntrospectionPanel.tsx (persistent, doesn't disappear on node click)
- Create ResultsPanel.tsx for trial result details
- Refactor NodeConfigPanelV2 to use panel store for introspection
- Integrate PanelContainer into CanvasView

Phase 2 - Pre-run Validation:
- Create specValidator.ts with comprehensive validation rules
- Add ValidationPanel (enhanced version with error navigation)
- Add Validate button to SpecRenderer with status indicator
- Block run if validation fails
- Check for: design vars, objectives, extractors, bounds, connections

Phase 3 - Error Handling & Recovery:
- Create ErrorPanel.tsx for displaying optimization errors
- Add error classification (nx_crash, solver_fail, extractor_error, etc.)
- Add recovery suggestions based on error type
- Update status endpoint to return error info
- Add _get_study_error_info helper to check error_status.json and DB
- Integrate error detection into status polling

Documentation:
- Add CANVAS_ROBUSTNESS_PLAN.md with full implementation plan

# Canvas Builder Robustness & Enhancement Plan

**Created**: January 21, 2026
**Branch**: `feature/studio-enhancement`
**Status**: Planning

---

## Executive Summary

This plan addresses critical issues and enhancements to make the Canvas Builder robust and production-ready:

1. **Panel Management** - Panels (Introspection, Config, Chat) disappear unexpectedly
2. **Pre-run Validation** - No validation before starting optimization
3. **Error Handling** - Poor feedback when things go wrong
4. **Live Updates** - Polling is inefficient; need WebSocket
5. **Visualization** - No convergence charts or progress indicators
6. **Testing** - No automated tests for critical flows

---

## Phase 1: Panel Management System (HIGH PRIORITY)

### Problem
- IntrospectionPanel disappears when user clicks elsewhere on canvas
- Panel state is lost (e.g., introspection results, expanded sections)
- No way to have multiple panels open simultaneously
- Chat panel and Config panel are mutually exclusive

### Root Cause
```typescript
// Current: Local state in ModelNodeConfig (NodeConfigPanelV2.tsx:275)
const [showIntrospection, setShowIntrospection] = useState(false);

// When selectedNodeId changes, ModelNodeConfig unmounts, losing state
```

### Solution: Centralized Panel Store

Create `usePanelStore.ts` - a Zustand store for panel management:

```typescript
// atomizer-dashboard/frontend/src/hooks/usePanelStore.ts

// Panel names are derived from the keys of `panels` below
type PanelName = keyof PanelState['panels'];

interface PanelState {
  // Panel visibility
  panels: {
    introspection: { open: boolean; filePath?: string; data?: IntrospectionResult };
    config: { open: boolean; nodeId?: string };
    chat: { open: boolean; powerMode: boolean };
    validation: { open: boolean; errors?: ValidationError[] };
    results: { open: boolean; trialId?: number };
  };

  // Actions
  openPanel: (panel: PanelName, data?: any) => void;
  closePanel: (panel: PanelName) => void;
  togglePanel: (panel: PanelName) => void;

  // Panel data persistence
  setIntrospectionData: (data: IntrospectionResult) => void;
  clearIntrospectionData: () => void;
}
```
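
The state transitions behind `openPanel`, `closePanel`, and `togglePanel` can be sketched without any framework. This is a minimal illustration, not the store itself: panel payloads are collapsed into a single `data?: unknown` field, and `withPanel` and `initialPanels` are hypothetical names. In the real store these transitions would presumably live inside the Zustand store's updater.

```typescript
// Framework-free sketch of the panel store's state transitions.
// Assumption: panel payloads are simplified to a single `data?: unknown` field.
type PanelName = 'introspection' | 'config' | 'chat' | 'validation' | 'results';

interface PanelEntry {
  open: boolean;
  data?: unknown; // e.g. introspection result, validation errors, trial id
}

type Panels = Record<PanelName, PanelEntry>;

const initialPanels: Panels = {
  introspection: { open: false },
  config: { open: false },
  chat: { open: false },
  validation: { open: false },
  results: { open: false },
};

// Return a copy of `panels` with one entry replaced (never mutates the input).
function withPanel(panels: Panels, panel: PanelName, entry: PanelEntry): Panels {
  const next: Panels = { ...panels };
  next[panel] = entry;
  return next;
}

// Opening keeps previously stored data unless new data is supplied.
function openPanel(panels: Panels, panel: PanelName, data?: unknown): Panels {
  const entry: PanelEntry = { ...panels[panel], open: true };
  if (data !== undefined) entry.data = data;
  return withPanel(panels, panel, entry);
}

// Closing only hides the panel; its data survives for the next open.
function closePanel(panels: Panels, panel: PanelName): Panels {
  return withPanel(panels, panel, { ...panels[panel], open: false });
}

function togglePanel(panels: Panels, panel: PanelName): Panels {
  return panels[panel].open ? closePanel(panels, panel) : openPanel(panels, panel);
}
```

Because each panel has its own entry, opening the Config panel does not touch the Introspection entry, which is the property the "Problem" list above asks for.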

### Implementation Tasks

| Task | File | Description |
|------|------|-------------|
| 1.1 | `usePanelStore.ts` | Create Zustand store for panel state |
| 1.2 | `PanelContainer.tsx` | Create container that renders open panels |
| 1.3 | `IntrospectionPanel.tsx` | Refactor to use store instead of local state |
| 1.4 | `NodeConfigPanelV2.tsx` | Remove local panel state, use store |
| 1.5 | `CanvasView.tsx` | Integrate PanelContainer, remove chat panel logic |
| 1.6 | `SpecRenderer.tsx` | Add panel trigger buttons (introspect, validate) |

### UI Changes

**Before:**
```
[Canvas]                    [Config Panel OR Chat Panel]
                             ↑ mutually exclusive
```

**After:**
```
[Canvas]                    [Right Panel Area]
                             ├── Config Panel (pinnable)
                             ├── Chat Panel (collapsible)
                             └── Floating Panels:
                                 ├── Introspection (draggable, persistent)
                                 ├── Validation Results
                                 └── Trial Details
```

### Panel Behaviors

| Panel | Trigger | Persistence | Position |
|-------|---------|-------------|----------|
| **Config** | Node click | While node selected | Right sidebar |
| **Chat** | Toggle button | Always available | Right sidebar (below config) |
| **Introspection** | "Introspect" button | Until explicitly closed | Floating, draggable |
| **Validation** | "Validate" or pre-run | Until fixed or dismissed | Floating |
| **Results** | Click on result badge | Until dismissed | Floating |

---

## Phase 2: Pre-run Validation (HIGH PRIORITY)

### Problem
- User can click "Run" with incomplete spec
- No feedback about missing extractors, objectives, or connections
- Optimization fails silently or with cryptic errors

### Solution: Validation Pipeline

```typescript
// Types of validation
interface ValidationResult {
  valid: boolean;
  errors: ValidationError[];      // Must fix before running
  warnings: ValidationWarning[];  // Can proceed but risky
}

interface ValidationError {
  code: string;
  severity: 'error' | 'warning';
  path: string;  // e.g., "objectives[0]"
  message: string;
  suggestion?: string;
  autoFix?: () => void;
}

// Assumed: warnings share the ValidationError shape, with severity 'warning'
type ValidationWarning = ValidationError;
```

### Validation Rules

| Rule | Severity | Message |
|------|----------|---------|
| No design variables | Error | "Add at least one design variable" |
| No objectives | Error | "Add at least one objective" |
| Objective not connected to extractor | Error | "Objective '{name}' has no source extractor" |
| Extractor type not set | Error | "Extractor '{name}' needs a type selected" |
| Design var bounds invalid | Error | "Min must be less than max for '{name}'" |
| No model file | Error | "No simulation file configured" |
| Custom extractor no code | Warning | "Custom extractor '{name}' has no code" |
| High trial count (>500) | Warning | "Large budget may take hours to complete" |
| Single trial | Warning | "Only 1 trial - results won't be meaningful" |
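
A subset of the rules above could be expressed as a single pure function. The sketch below is illustrative only: the `Spec`, `DesignVar`, `Extractor`, and `Objective` shapes are hypothetical simplifications (the real spec schema is not shown in this plan), the `code` strings are made up, and the custom-extractor rule is omitted.

```typescript
// Hypothetical, simplified spec shape; the real spec schema is not part of this plan.
interface DesignVar { name: string; min: number; max: number; }
interface Extractor { name: string; type?: string; }
interface Objective { name: string; extractor?: string; }
interface Spec {
  modelFile?: string;
  designVars: DesignVar[];
  extractors: Extractor[];
  objectives: Objective[];
  trials: number;
}

interface Issue {
  code: string; // illustrative codes, not an agreed list
  severity: 'error' | 'warning';
  path: string;
  message: string;
}

function validateSpec(spec: Spec): { valid: boolean; errors: Issue[]; warnings: Issue[] } {
  const issues: Issue[] = [];
  const add = (code: string, severity: Issue['severity'], path: string, message: string): void => {
    issues.push({ code, severity, path, message });
  };

  if (!spec.modelFile) add('no_model', 'error', 'modelFile', 'No simulation file configured');
  if (spec.designVars.length === 0) {
    add('no_design_vars', 'error', 'designVars', 'Add at least one design variable');
  }
  if (spec.objectives.length === 0) {
    add('no_objectives', 'error', 'objectives', 'Add at least one objective');
  }

  spec.designVars.forEach((v, i) => {
    // `!(min < max)` also rejects NaN bounds
    if (!(v.min < v.max)) {
      add('invalid_bounds', 'error', `designVars[${i}]`, `Min must be less than max for '${v.name}'`);
    }
  });
  spec.extractors.forEach((e, i) => {
    if (!e.type) {
      add('extractor_no_type', 'error', `extractors[${i}]`, `Extractor '${e.name}' needs a type selected`);
    }
  });
  spec.objectives.forEach((o, i) => {
    const connected = spec.extractors.some((e) => e.name === o.extractor);
    if (!connected) {
      add('objective_unconnected', 'error', `objectives[${i}]`, `Objective '${o.name}' has no source extractor`);
    }
  });

  if (spec.trials > 500) add('high_trials', 'warning', 'trials', 'Large budget may take hours to complete');
  if (spec.trials === 1) add('single_trial', 'warning', 'trials', "Only 1 trial - results won't be meaningful");

  const errors = issues.filter((issue) => issue.severity === 'error');
  const warnings = issues.filter((issue) => issue.severity === 'warning');
  return { valid: errors.length === 0, errors, warnings };
}
```

Keeping the validator pure (spec in, issues out) would let the same rules back both the "Validate" button and the pre-run check.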

### Implementation Tasks

| Task | File | Description |
|------|------|-------------|
| 2.1 | `validation/specValidator.ts` | Client-side validation rules |
| 2.2 | `ValidationPanel.tsx` | Display validation results |
| 2.3 | `SpecRenderer.tsx` | Add "Validate" button, pre-run check |
| 2.4 | `api/routes/spec.py` | Server-side validation endpoint |
| 2.5 | `useSpecStore.ts` | Add `validate()` action |

### UI Flow

```
User clicks "Run Optimization"
         ↓
[Validate Spec] ──failed──→ [Show ValidationPanel]
         ↓ passed                    │
[Confirm Dialog]                     │
         ↓ confirmed                 │
[Start Optimization] ←── fix ────────┘
```
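
One way to keep this flow testable is to model it as a small state machine. The sketch below follows the diagram above, with two assumptions that are not in the diagram: a `cancelled` event that returns to idle, and a re-validation result (`specValid`) carried by the `fixed` event. All state and event names are hypothetical.

```typescript
type RunState = 'idle' | 'showing_validation' | 'confirming' | 'running';

type RunEvent =
  | { type: 'run_clicked'; specValid: boolean }
  | { type: 'fixed'; specValid: boolean } // assumption: a fix triggers re-validation
  | { type: 'confirmed' }
  | { type: 'cancelled' }; // assumption: not shown in the diagram

// Pure transition function: given the current state and an event, return the next state.
function nextRunState(state: RunState, ev: RunEvent): RunState {
  switch (state) {
    case 'idle':
      if (ev.type === 'run_clicked') {
        return ev.specValid ? 'confirming' : 'showing_validation';
      }
      return state;
    case 'showing_validation':
      // Per the diagram, a successful fix leads straight to starting the run.
      if (ev.type === 'fixed') return ev.specValid ? 'running' : 'showing_validation';
      if (ev.type === 'cancelled') return 'idle';
      return state;
    case 'confirming':
      if (ev.type === 'confirmed') return 'running';
      if (ev.type === 'cancelled') return 'idle';
      return state;
    default:
      // 'running': no transitions are modeled here
      return state;
  }
}
```

The key property is that no sequence of events reaches `running` from `idle` without passing a validity check.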

---

## Phase 3: Error Handling & Recovery (HIGH PRIORITY)

### Problem
- NX crashes don't show useful feedback
- Solver failures leave user confused
- No way to resume after errors

### Solution: Error Classification & Display

```typescript
interface OptimizationError {
  type: 'nx_crash' | 'solver_fail' | 'extractor_error' | 'config_error' | 'system_error';
  trial?: number;
  message: string;
  details?: string;
  recoverable: boolean;
  suggestions: string[];
}
```

### Error Handling Strategy

| Error Type | Display | Recovery |
|------------|---------|----------|
| NX Crash | Toast + Error Panel | Retry trial, skip trial |
| Solver Failure | Badge on trial | Mark infeasible, continue |
| Extractor Error | Log + badge | Use NaN, continue |
| Config Error | Block run | Show validation panel |
| System Error | Full modal | Restart optimization |
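
The strategy table maps naturally onto a lookup keyed by error type. The following is a sketch under assumptions: the `continues` flag and the rule that only `system_error` is non-recoverable are guesses for illustration, and `ERROR_POLICY` and `buildError` are hypothetical names.

```typescript
type ErrorType = 'nx_crash' | 'solver_fail' | 'extractor_error' | 'config_error' | 'system_error';

interface ErrorPolicy {
  display: string;     // how the UI surfaces the error (from the table above)
  recovery: string[];  // recovery options offered to the user
  continues: boolean;  // assumption: whether the run proceeds without user action
}

const ERROR_POLICY: Record<ErrorType, ErrorPolicy> = {
  nx_crash: { display: 'Toast + Error Panel', recovery: ['Retry trial', 'Skip trial'], continues: false },
  solver_fail: { display: 'Badge on trial', recovery: ['Mark infeasible', 'Continue'], continues: true },
  extractor_error: { display: 'Log + badge', recovery: ['Use NaN', 'Continue'], continues: true },
  config_error: { display: 'Block run', recovery: ['Show validation panel'], continues: false },
  system_error: { display: 'Full modal', recovery: ['Restart optimization'], continues: false },
};

interface OptimizationError {
  type: ErrorType;
  trial?: number;
  message: string;
  details?: string;
  recoverable: boolean;
  suggestions: string[];
}

// Build the error payload for the UI from a classified type.
function buildError(type: ErrorType, message: string, trial?: number): OptimizationError {
  const err: OptimizationError = {
    type,
    message,
    recoverable: type !== 'system_error', // assumption: only system errors need a full restart
    suggestions: ERROR_POLICY[type].recovery.slice(),
  };
  if (trial !== undefined) err.trial = trial;
  return err;
}
```

Typing the table as `Record<ErrorType, ErrorPolicy>` means adding a new error type without a policy entry fails at compile time.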

### Implementation Tasks

| Task | File | Description |
|------|------|-------------|
| 3.1 | `ErrorBoundary.tsx` | Wrap canvas in error boundary |
| 3.2 | `ErrorPanel.tsx` | Detailed error display with suggestions |
| 3.3 | `optimization.py` | Enhanced error responses with type/recovery |
| 3.4 | `SpecRenderer.tsx` | Error state handling, retry buttons |
| 3.5 | `useOptimizationStatus.ts` | Hook for status polling with error handling |

---

## Phase 4: Live Updates via WebSocket (MEDIUM PRIORITY)

### Problem
- Current polling (3s) is inefficient and has latency
- Missed updates between polls
- No real-time progress indication

### Solution: WebSocket for Trial Updates

```typescript
// WebSocket events
interface TrialStartEvent {
  type: 'trial_start';
  trial_number: number;
  params: Record<string, number>;
}

interface TrialCompleteEvent {
  type: 'trial_complete';
  trial_number: number;
  objectives: Record<string, number>;
  is_best: boolean;
  is_feasible: boolean;
}

interface OptimizationCompleteEvent {
  type: 'optimization_complete';
  best_trial: number;
  total_trials: number;
}
```
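
On the client, these events form a discriminated union on `type`, so a reducer can fold them into UI state. The sketch below repeats the three event interfaces to stay self-contained; `LiveState`, `applyEvent`, and `parseEvent` are hypothetical names, and the assumption that messages arrive as JSON text is not stated in the plan.

```typescript
interface TrialStartEvent { type: 'trial_start'; trial_number: number; params: Record<string, number>; }
interface TrialCompleteEvent {
  type: 'trial_complete'; trial_number: number; objectives: Record<string, number>;
  is_best: boolean; is_feasible: boolean;
}
interface OptimizationCompleteEvent { type: 'optimization_complete'; best_trial: number; total_trials: number; }
type OptimizationEvent = TrialStartEvent | TrialCompleteEvent | OptimizationCompleteEvent;

// Hypothetical UI-side state derived from the event stream.
interface LiveState {
  running: boolean;
  currentTrial: number | null;
  completedTrials: number;
  bestTrial: number | null;
  objectivesByTrial: Record<number, Record<string, number>>;
}

const initialLiveState: LiveState = {
  running: false, currentTrial: null, completedTrials: 0, bestTrial: null, objectivesByTrial: {},
};

// Fold one event into the state without mutating the previous state.
function applyEvent(state: LiveState, ev: OptimizationEvent): LiveState {
  switch (ev.type) {
    case 'trial_start':
      return { ...state, running: true, currentTrial: ev.trial_number };
    case 'trial_complete': {
      const objectivesByTrial = { ...state.objectivesByTrial };
      objectivesByTrial[ev.trial_number] = ev.objectives;
      return {
        ...state,
        completedTrials: state.completedTrials + 1,
        bestTrial: ev.is_best ? ev.trial_number : state.bestTrial,
        objectivesByTrial,
      };
    }
    default:
      // 'optimization_complete'
      return {
        ...state, running: false, currentTrial: null,
        bestTrial: ev.best_trial, completedTrials: ev.total_trials,
      };
  }
}

// Assumption: messages are JSON text. Unknown or malformed messages yield null.
function parseEvent(raw: string): OptimizationEvent | null {
  try {
    const msg = JSON.parse(raw);
    const known = ['trial_start', 'trial_complete', 'optimization_complete'];
    if (msg && typeof msg.type === 'string' && known.indexOf(msg.type) !== -1) {
      return msg as OptimizationEvent;
    }
    return null;
  } catch {
    return null;
  }
}
```

A WebSocket hook would then only need to call `parseEvent` on each incoming message and pass non-null results to `applyEvent`.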

### Implementation Tasks

| Task | File | Description |
|------|------|-------------|
| 4.1 | `websocket.py` | Add optimization events to WS |
| 4.2 | `run_optimization.py` | Emit events during optimization |
| 4.3 | `useOptimizationWebSocket.ts` | Hook for WS subscription |
| 4.4 | `SpecRenderer.tsx` | Use WS instead of polling |
| 4.5 | `ResultBadge.tsx` | Animate on new results |

---

## Phase 5: Convergence Visualization (MEDIUM PRIORITY)

### Problem
- No visual feedback on optimization progress
- Can't tell if converging or stuck
- No Pareto front visualization for multi-objective

### Solution: Embedded Charts

### Components

| Component | Description |
|-----------|-------------|
| `ConvergenceSparkline` | Tiny chart in ObjectiveNode showing trend |
| `ProgressRing` | Circular progress in header (trials/total) |
| `ConvergenceChart` | Full chart in Results panel |
| `ParetoPlot` | 2D Pareto front for multi-objective |
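
The core of a sparkline such as `ConvergenceSparkline` is a mapping from objective values to an SVG path. The sketch below shows one possible approach, assuming a minimization objective and plotting the best value seen so far; `runningBest` and `sparklinePath` are hypothetical helper names.

```typescript
// Best-so-far series for a minimization objective (assumption: lower is better).
function runningBest(values: number[]): number[] {
  const out: number[] = [];
  let best = Infinity;
  for (let i = 0; i < values.length; i++) {
    if (values[i] < best) best = values[i];
    out.push(best);
  }
  return out;
}

// Map values onto a width x height box and return an SVG path `d` string.
// SVG's y axis points down, so larger values get smaller y coordinates.
function sparklinePath(values: number[], width: number, height: number): string {
  if (values.length === 0) return '';
  let min = values[0];
  let max = values[0];
  for (let i = 1; i < values.length; i++) {
    if (values[i] < min) min = values[i];
    if (values[i] > max) max = values[i];
  }
  const span = max - min;
  const parts: string[] = [];
  for (let i = 0; i < values.length; i++) {
    const x = values.length === 1 ? 0 : (i / (values.length - 1)) * width;
    // A flat series is drawn as a horizontal line through the middle.
    const y = span === 0 ? height / 2 : height - ((values[i] - min) / span) * height;
    parts.push((i === 0 ? 'M' : 'L') + x.toFixed(1) + ',' + y.toFixed(1));
  }
  return parts.join(' ');
}
```

The result can be passed to a `<path d={...} />` element inside an `<svg>` of the same width and height, for example `sparklinePath(runningBest(history), 60, 16)`.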

### Implementation Tasks

| Task | File | Description |
|------|------|-------------|
| 5.1 | `ConvergenceSparkline.tsx` | SVG sparkline component |
| 5.2 | `ObjectiveNode.tsx` | Integrate sparkline |
| 5.3 | `ProgressRing.tsx` | Circular progress indicator |
| 5.4 | `ConvergenceChart.tsx` | Full chart with Recharts |
| 5.5 | `ResultsPanel.tsx` | Panel showing detailed results |
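
For the `ParetoPlot` component listed under Components, the data preparation step is extracting the non-dominated trials. The sketch below assumes two objectives that are both minimized; `TrialPoint`, `dominates`, and `paretoFront` are hypothetical names, and the quadratic scan is chosen for clarity over speed.

```typescript
interface TrialPoint {
  trial: number;
  f1: number; // first objective (assumed: minimize)
  f2: number; // second objective (assumed: minimize)
}

// a dominates b if it is no worse in both objectives and strictly better in at least one.
function dominates(a: TrialPoint, b: TrialPoint): boolean {
  return a.f1 <= b.f1 && a.f2 <= b.f2 && (a.f1 < b.f1 || a.f2 < b.f2);
}

// Keep every point that no other point dominates, sorted by f1 for plotting as a line.
function paretoFront(points: TrialPoint[]): TrialPoint[] {
  return points
    .filter((p) => !points.some((q) => dominates(q, p)))
    .sort((a, b) => a.f1 - b.f1);
}
```

For the few hundred trials typical of a study, the O(n²) scan should be adequate; a sort-based sweep would be the usual replacement if trial counts grow much larger.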

---

## Phase 6: End-to-End Testing (MEDIUM PRIORITY)

### Problem
- No automated tests for canvas operations
- Manual testing is time-consuming and error-prone
- Regressions go unnoticed

### Solution: Playwright E2E Tests

### Test Scenarios

| Test | Steps | Assertions |
|------|-------|------------|
| Load study | Navigate to /canvas/{id} | Spec loads, nodes render |
| Add design var | Drag from palette | Node appears, spec updates |
| Connect nodes | Drag edge | Edge renders, spec has edge |
| Edit node | Click node, change value | Value persists, API called |
| Run validation | Click validate | Errors shown for incomplete |
| Start optimization | Complete spec, click run | Status shows running |
| View results | Wait for trial | Badge shows value |
| Stop optimization | Click stop | Status shows stopped |

### Implementation Tasks

| Task | File | Description |
|------|------|-------------|
| 6.1 | `e2e/canvas.spec.ts` | Basic canvas operations |
| 6.2 | `e2e/optimization.spec.ts` | Run/stop/status flow |
| 6.3 | `e2e/panels.spec.ts` | Panel open/close/persist |
| 6.4 | `playwright.config.ts` | Configure Playwright |
| 6.5 | CI workflow | Run tests in GitHub Actions |

---

## Implementation Order

```
Week 1:
├── Phase 1: Panel Management (critical UX fix)
│   ├── Day 1-2: usePanelStore + PanelContainer
│   └── Day 3-4: Refactor existing panels
│
├── Phase 2: Validation (prevent user errors)
│   └── Day 5: Validation rules + UI

Week 2:
├── Phase 3: Error Handling
│   ├── Day 1-2: Error types + ErrorPanel
│   └── Day 3: Integration with optimization flow
│
├── Phase 4: WebSocket Updates
│   └── Day 4-5: WS events + frontend hook

Week 3:
├── Phase 5: Visualization
│   ├── Day 1-2: Sparklines
│   └── Day 3: Progress indicators
│
├── Phase 6: Testing
│   └── Day 4-5: Playwright setup + core tests
```

---

## Quick Wins (Can Do Now)

These can be implemented immediately with minimal changes:

1. **Persist introspection data in localStorage**
   - Cache introspection results
   - Restore on panel reopen

2. **Add loading states to all buttons**
   - Disable during operations
   - Show spinners

3. **Add confirmation dialogs**
   - Before stopping optimization
   - Before clearing canvas

4. **Improve error messages**
   - Parse NX error logs
   - Show actionable suggestions
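
Quick win 1 (persisting introspection data) could look roughly like the sketch below. It is written against a minimal `KeyValueStore` interface that the browser's `localStorage` satisfies, so the logic can also run outside a browser. The key prefix and all function names are hypothetical.

```typescript
// Minimal subset of the Web Storage API; `window.localStorage` satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const CACHE_PREFIX = 'introspection:'; // hypothetical key prefix

// Cache one introspection result per model file path.
function saveIntrospection(store: KeyValueStore, filePath: string, data: unknown): void {
  store.setItem(CACHE_PREFIX + filePath, JSON.stringify({ savedAt: Date.now(), data }));
}

// Returns the cached data, or null if nothing usable is stored.
function loadIntrospection(store: KeyValueStore, filePath: string): unknown {
  const key = CACHE_PREFIX + filePath;
  const raw = store.getItem(key);
  if (raw === null) return null;
  try {
    const parsed = JSON.parse(raw);
    return parsed && typeof parsed === 'object' && 'data' in parsed ? parsed.data : null;
  } catch {
    // Corrupt entry: drop it so it is not retried on every open.
    store.removeItem(key);
    return null;
  }
}

// In-memory stand-in for localStorage, useful in tests or server-side rendering.
function createMemoryStore(): KeyValueStore {
  const items: Record<string, string> = {};
  return {
    getItem: (key) => {
      const value = items[key];
      return value === undefined ? null : value;
    },
    setItem: (key, value) => { items[key] = value; },
    removeItem: (key) => { delete items[key]; },
  };
}
```

In the browser, the panel would call `loadIntrospection(window.localStorage, filePath)` when it opens and `saveIntrospection(...)` after each successful introspection run.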

---

## Files to Create/Modify

### New Files
```
atomizer-dashboard/frontend/src/
├── hooks/
│   ├── usePanelStore.ts
│   └── useOptimizationWebSocket.ts
├── components/canvas/
│   ├── PanelContainer.tsx
│   ├── panels/
│   │   ├── ValidationPanel.tsx
│   │   ├── ErrorPanel.tsx
│   │   └── ResultsPanel.tsx
│   └── visualization/
│       ├── ConvergenceSparkline.tsx
│       ├── ProgressRing.tsx
│       └── ConvergenceChart.tsx
└── lib/
    └── validation/
        └── specValidator.ts

e2e/
├── canvas.spec.ts
├── optimization.spec.ts
└── panels.spec.ts
```

### Modified Files
```
atomizer-dashboard/frontend/src/
├── pages/CanvasView.tsx
├── components/canvas/SpecRenderer.tsx
├── components/canvas/panels/IntrospectionPanel.tsx
├── components/canvas/panels/NodeConfigPanelV2.tsx
├── components/canvas/nodes/ObjectiveNode.tsx
└── hooks/useSpecStore.ts

atomizer-dashboard/backend/api/
├── routes/optimization.py
├── routes/spec.py
└── websocket.py
```

---

## Success Criteria

| Phase | Success Metric |
|-------|----------------|
| 1 | Introspection panel persists across node selections |
| 2 | Invalid spec shows clear error before run |
| 3 | NX errors display with recovery options |
| 4 | Results update within 500ms of trial completion |
| 5 | Convergence trend visible on objective nodes |
| 6 | All E2E tests pass in CI |

---

## Next Steps

1. Review this plan
2. Start with Phase 1 (Panel Management) - fixes your immediate issue
3. Implement incrementally, commit after each phase