feat: Add panel management, validation, and error handling to canvas
Phase 1 - Panel Management System:
- Create usePanelStore.ts for centralized panel state management
- Add PanelContainer.tsx for draggable floating panels
- Create FloatingIntrospectionPanel.tsx (persistent, doesn't disappear on node click)
- Create ResultsPanel.tsx for trial result details
- Refactor NodeConfigPanelV2 to use panel store for introspection
- Integrate PanelContainer into CanvasView

Phase 2 - Pre-run Validation:
- Create specValidator.ts with comprehensive validation rules
- Add ValidationPanel (enhanced version with error navigation)
- Add Validate button to SpecRenderer with status indicator
- Block run if validation fails
- Check for: design vars, objectives, extractors, bounds, connections

Phase 3 - Error Handling & Recovery:
- Create ErrorPanel.tsx for displaying optimization errors
- Add error classification (nx_crash, solver_fail, extractor_error, etc.)
- Add recovery suggestions based on error type
- Update status endpoint to return error info
- Add _get_study_error_info helper to check error_status.json and DB
- Integrate error detection into status polling

Documentation:
- Add CANVAS_ROBUSTNESS_PLAN.md with full implementation plan
438
docs/plans/CANVAS_ROBUSTNESS_PLAN.md
Normal file
@@ -0,0 +1,438 @@
# Canvas Builder Robustness & Enhancement Plan

**Created**: January 21, 2026
**Branch**: `feature/studio-enhancement`
**Status**: Planning

---

## Executive Summary

This plan addresses critical issues and enhancements to make the Canvas Builder robust and production-ready:

1. **Panel Management** - Panels (Introspection, Config, Chat) disappear unexpectedly
2. **Pre-run Validation** - No validation before starting optimization
3. **Error Handling** - Poor feedback when things go wrong
4. **Live Updates** - Polling is inefficient; need WebSocket
5. **Visualization** - No convergence charts or progress indicators
6. **Testing** - No automated tests for critical flows

---

## Phase 1: Panel Management System (HIGH PRIORITY)

### Problem
- IntrospectionPanel disappears when user clicks elsewhere on canvas
- Panel state is lost (e.g., introspection results, expanded sections)
- No way to have multiple panels open simultaneously
- Chat panel and Config panel are mutually exclusive

### Root Cause
```typescript
// Current: Local state in ModelNodeConfig (NodeConfigPanelV2.tsx:275)
const [showIntrospection, setShowIntrospection] = useState(false);

// When selectedNodeId changes, ModelNodeConfig unmounts, losing state
```

### Solution: Centralized Panel Store

Create `usePanelStore.ts` - a Zustand store for panel management:

```typescript
// atomizer-dashboard/frontend/src/hooks/usePanelStore.ts

interface PanelState {
  // Panel visibility
  panels: {
    introspection: { open: boolean; filePath?: string; data?: IntrospectionResult };
    config: { open: boolean; nodeId?: string };
    chat: { open: boolean; powerMode: boolean };
    validation: { open: boolean; errors?: ValidationError[] };
    results: { open: boolean; trialId?: number };
  };

  // Actions
  openPanel: (panel: PanelName, data?: any) => void;
  closePanel: (panel: PanelName) => void;
  togglePanel: (panel: PanelName) => void;

  // Panel data persistence
  setIntrospectionData: (data: IntrospectionResult) => void;
  clearIntrospectionData: () => void;
}
```
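
The state transitions behind these actions can be sketched as pure functions, which a Zustand `set` callback could then wrap. This is a minimal sketch under stated assumptions, not the project's implementation: `IntrospectionResult` is reduced to a placeholder type, the `validation` entry omits its `errors` field, and `selectNode` is a hypothetical helper that shows the intended behavior (the config panel follows the selection while the introspection panel is left alone).

```typescript
// Hypothetical sketch: pure state transitions for the panel store.
// IntrospectionResult is a simplified placeholder, not the real type.
type IntrospectionResult = Record<string, unknown>;

interface Panels {
  introspection: { open: boolean; filePath?: string; data?: IntrospectionResult };
  config: { open: boolean; nodeId?: string };
  chat: { open: boolean; powerMode: boolean };
  validation: { open: boolean };
  results: { open: boolean; trialId?: number };
}

type PanelName = keyof Panels;

const initialPanels: Panels = {
  introspection: { open: false },
  config: { open: false },
  chat: { open: false, powerMode: false },
  validation: { open: false },
  results: { open: false },
};

// Open a panel and merge the payload into its entry, so data stored
// earlier (e.g. introspection results) is kept.
function openPanel(
  panels: Panels,
  panel: PanelName,
  data: Record<string, unknown> = {},
): Panels {
  return { ...panels, [panel]: { ...panels[panel], ...data, open: true } };
}

// Close a panel but keep its data, so reopening restores the content.
function closePanel(panels: Panels, panel: PanelName): Panels {
  return { ...panels, [panel]: { ...panels[panel], open: false } };
}

// Hypothetical helper: only the config panel reacts to node selection.
function selectNode(panels: Panels, nodeId: string | undefined): Panels {
  return { ...panels, config: { open: nodeId !== undefined, nodeId } };
}
```

Because the introspection entry lives outside any node-specific component, changing or clearing the selection no longer discards it.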

### Implementation Tasks

| Task | File | Description |
|------|------|-------------|
| 1.1 | `usePanelStore.ts` | Create Zustand store for panel state |
| 1.2 | `PanelContainer.tsx` | Create container that renders open panels |
| 1.3 | `IntrospectionPanel.tsx` | Refactor to use store instead of local state |
| 1.4 | `NodeConfigPanelV2.tsx` | Remove local panel state, use store |
| 1.5 | `CanvasView.tsx` | Integrate PanelContainer, remove chat panel logic |
| 1.6 | `SpecRenderer.tsx` | Add panel trigger buttons (introspect, validate) |

### UI Changes

**Before:**
```
[Canvas]  [Config Panel OR Chat Panel]
                    ↑ mutually exclusive
```

**After:**
```
[Canvas]  [Right Panel Area]
          ├── Config Panel (pinnable)
          ├── Chat Panel (collapsible)
          └── Floating Panels:
              ├── Introspection (draggable, persistent)
              ├── Validation Results
              └── Trial Details
```

### Panel Behaviors

| Panel | Trigger | Persistence | Position |
|-------|---------|-------------|----------|
| **Config** | Node click | While node selected | Right sidebar |
| **Chat** | Toggle button | Always available | Right sidebar (below config) |
| **Introspection** | "Introspect" button | Until explicitly closed | Floating, draggable |
| **Validation** | "Validate" or pre-run | Until fixed or dismissed | Floating |
| **Results** | Click on result badge | Until dismissed | Floating |

---

## Phase 2: Pre-run Validation (HIGH PRIORITY)

### Problem
- User can click "Run" with incomplete spec
- No feedback about missing extractors, objectives, or connections
- Optimization fails silently or with cryptic errors

### Solution: Validation Pipeline

```typescript
// Types of validation
interface ValidationResult {
  valid: boolean;
  errors: ValidationError[];     // Must fix before running
  warnings: ValidationWarning[]; // Can proceed but risky
}

interface ValidationError {
  code: string;
  severity: 'error' | 'warning';
  path: string;        // e.g., "objectives[0]"
  message: string;
  suggestion?: string;
  autoFix?: () => void;
}
```

### Validation Rules

| Rule | Severity | Message |
|------|----------|---------|
| No design variables | Error | "Add at least one design variable" |
| No objectives | Error | "Add at least one objective" |
| Objective not connected to extractor | Error | "Objective '{name}' has no source extractor" |
| Extractor type not set | Error | "Extractor '{name}' needs a type selected" |
| Design var bounds invalid | Error | "Min must be less than max for '{name}'" |
| No model file | Error | "No simulation file configured" |
| Custom extractor no code | Warning | "Custom extractor '{name}' has no code" |
| High trial count (>500) | Warning | "Large budget may take hours to complete" |
| Single trial | Warning | "Only 1 trial - results won't be meaningful" |
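
The rules in this table could be applied by a single pass over the spec, roughly as sketched below. The `Spec` shape, the issue codes, and the `"custom"` extractor type string are assumptions made for illustration; the real spec structure is not shown in this plan.

```typescript
// Hypothetical spec shape; field names are assumptions for this sketch.
interface DesignVar { name: string; min: number; max: number }
interface Extractor { name: string; type?: string; code?: string }
interface Objective { name: string; extractor?: string }
interface Spec {
  modelFile?: string;
  designVariables: DesignVar[];
  extractors: Extractor[];
  objectives: Objective[];
  trials: number;
}

type Severity = "error" | "warning";
interface SpecIssue { code: string; severity: Severity; path: string; message: string }
interface SpecValidation { valid: boolean; errors: SpecIssue[]; warnings: SpecIssue[] }

function validateSpec(spec: Spec): SpecValidation {
  const issues: SpecIssue[] = [];
  const add = (severity: Severity, code: string, path: string, message: string): void => {
    issues.push({ severity, code, path, message });
  };

  if (!spec.modelFile) add("error", "NO_MODEL_FILE", "model", "No simulation file configured");
  if (spec.designVariables.length === 0) {
    add("error", "NO_DESIGN_VARS", "designVariables", "Add at least one design variable");
  }
  spec.designVariables.forEach((dv, i) => {
    // Written as !(min < max) so NaN bounds are rejected as well.
    if (!(dv.min < dv.max)) {
      add("error", "INVALID_BOUNDS", `designVariables[${i}]`, `Min must be less than max for '${dv.name}'`);
    }
  });
  if (spec.objectives.length === 0) {
    add("error", "NO_OBJECTIVES", "objectives", "Add at least one objective");
  }
  spec.objectives.forEach((obj, i) => {
    const connected = spec.extractors.some((ex) => ex.name === obj.extractor);
    if (!connected) {
      add("error", "OBJECTIVE_NO_SOURCE", `objectives[${i}]`, `Objective '${obj.name}' has no source extractor`);
    }
  });
  spec.extractors.forEach((ex, i) => {
    if (!ex.type) {
      add("error", "EXTRACTOR_NO_TYPE", `extractors[${i}]`, `Extractor '${ex.name}' needs a type selected`);
    } else if (ex.type === "custom" && !ex.code) {
      add("warning", "CUSTOM_NO_CODE", `extractors[${i}]`, `Custom extractor '${ex.name}' has no code`);
    }
  });
  if (spec.trials > 500) add("warning", "HIGH_TRIAL_COUNT", "trials", "Large budget may take hours to complete");
  if (spec.trials === 1) add("warning", "SINGLE_TRIAL", "trials", "Only 1 trial - results won't be meaningful");

  const errors = issues.filter((i) => i.severity === "error");
  const warnings = issues.filter((i) => i.severity === "warning");
  return { valid: errors.length === 0, errors, warnings };
}
```

Only errors affect `valid`; warnings are reported alongside them so the UI can still let the user proceed.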

### Implementation Tasks

| Task | File | Description |
|------|------|-------------|
| 2.1 | `validation/specValidator.ts` | Client-side validation rules |
| 2.2 | `ValidationPanel.tsx` | Display validation results |
| 2.3 | `SpecRenderer.tsx` | Add "Validate" button, pre-run check |
| 2.4 | `api/routes/spec.py` | Server-side validation endpoint |
| 2.5 | `useSpecStore.ts` | Add `validate()` action |

### UI Flow

```
User clicks "Run Optimization"
         ↓
[Validate Spec] ──failed──→ [Show ValidationPanel]
         ↓ passed                    │
[Confirm Dialog]                     │
         ↓ confirmed                 │
[Start Optimization] ←── fix ────────┘
```
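
The flow above amounts to a small gate in front of the run action. The sketch below expresses it with injected callbacks; all names are hypothetical, and the confirmation step is assumed to be synchronous (as `window.confirm` is) to keep the example short.

```typescript
// Hypothetical run gate mirroring the UI flow; all names are illustrative.
type RunOutcome = "blocked" | "cancelled" | "started";

interface RunGateDeps {
  validate: () => { valid: boolean };
  showValidationPanel: () => void;
  confirmRun: () => boolean; // assumed synchronous, e.g. window.confirm
  startOptimization: () => void;
}

function handleRunClick(deps: RunGateDeps): RunOutcome {
  // Step 1: validation failures block the run and surface the panel.
  if (!deps.validate().valid) {
    deps.showValidationPanel();
    return "blocked";
  }
  // Step 2: the user can still back out in the confirm dialog.
  if (!deps.confirmRun()) {
    return "cancelled";
  }
  // Step 3: only a validated and confirmed spec starts the optimization.
  deps.startOptimization();
  return "started";
}
```

Returning the outcome rather than `void` lets the caller update button state or log why a run did not start.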

---

## Phase 3: Error Handling & Recovery (HIGH PRIORITY)

### Problem
- NX crashes don't show useful feedback
- Solver failures leave user confused
- No way to resume after errors

### Solution: Error Classification & Display

```typescript
interface OptimizationError {
  type: 'nx_crash' | 'solver_fail' | 'extractor_error' | 'config_error' | 'system_error';
  trial?: number;
  message: string;
  details?: string;
  recoverable: boolean;
  suggestions: string[];
}
```

### Error Handling Strategy

| Error Type | Display | Recovery |
|------------|---------|----------|
| NX Crash | Toast + Error Panel | Retry trial, skip trial |
| Solver Failure | Badge on trial | Mark infeasible, continue |
| Extractor Error | Log + badge | Use NaN, continue |
| Config Error | Block run | Show validation panel |
| System Error | Full modal | Restart optimization |
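
One way to produce these error types is to match the raw failure message against a small rule table, falling back to `system_error` when nothing matches. The regular expressions below are illustrative guesses rather than real NX or solver log formats, and treating only `system_error` as non-recoverable is an assumption of this sketch.

```typescript
// Hypothetical classifier; the patterns are illustrative assumptions.
type ErrorType = "nx_crash" | "solver_fail" | "extractor_error" | "config_error" | "system_error";

interface ClassifiedError {
  type: ErrorType;
  message: string;
  recoverable: boolean;
  suggestions: string[];
}

interface ErrorRule {
  type: ErrorType;
  pattern: RegExp;
  recoverable: boolean;
  suggestions: string[];
}

// Suggestions follow the Recovery column of the strategy table.
const ERROR_RULES: ErrorRule[] = [
  { type: "nx_crash", pattern: /\bnx\b.*\b(crash|crashed|terminated)\b/i, recoverable: true,
    suggestions: ["Retry trial", "Skip trial"] },
  { type: "solver_fail", pattern: /\bsolver\b.*\b(fail|failed|diverged)\b/i, recoverable: true,
    suggestions: ["Mark infeasible and continue"] },
  { type: "extractor_error", pattern: /\bextractor\b/i, recoverable: true,
    suggestions: ["Use NaN and continue"] },
  { type: "config_error", pattern: /\b(config|configuration|spec)\b/i, recoverable: true,
    suggestions: ["Show validation panel"] },
];

function classifyError(message: string): ClassifiedError {
  // The first matching rule wins, so more specific rules are listed first.
  for (const rule of ERROR_RULES) {
    if (rule.pattern.test(message)) {
      return { type: rule.type, message, recoverable: rule.recoverable, suggestions: rule.suggestions };
    }
  }
  // Anything unrecognized is treated as a system error.
  return { type: "system_error", message, recoverable: false, suggestions: ["Restart optimization"] };
}
```

Keeping the rules in data rather than in branching code makes it straightforward to add patterns as real log samples are collected.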

### Implementation Tasks

| Task | File | Description |
|------|------|-------------|
| 3.1 | `ErrorBoundary.tsx` | Wrap canvas in error boundary |
| 3.2 | `ErrorPanel.tsx` | Detailed error display with suggestions |
| 3.3 | `optimization.py` | Enhanced error responses with type/recovery |
| 3.4 | `SpecRenderer.tsx` | Error state handling, retry buttons |
| 3.5 | `useOptimizationStatus.ts` | Hook for status polling with error handling |

---

## Phase 4: Live Updates via WebSocket (MEDIUM PRIORITY)

### Problem
- Current polling (3s) is inefficient and has latency
- Missed updates between polls
- No real-time progress indication

### Solution: WebSocket for Trial Updates

```typescript
// WebSocket events
interface TrialStartEvent {
  type: 'trial_start';
  trial_number: number;
  params: Record<string, number>;
}

interface TrialCompleteEvent {
  type: 'trial_complete';
  trial_number: number;
  objectives: Record<string, number>;
  is_best: boolean;
  is_feasible: boolean;
}

interface OptimizationCompleteEvent {
  type: 'optimization_complete';
  best_trial: number;
  total_trials: number;
}
```
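
On the frontend, these events form a discriminated union on `type`, so a message handler can parse each frame and fold it into view state with an exhaustive `switch`. The sketch below assumes the events arrive as JSON text; `RunView`, `parseEvent`, and `applyEvent` are hypothetical names, and `parseEvent` checks only the `type` tag rather than validating every field.

```typescript
// Hypothetical event handling sketch; event shapes follow the plan above.
type TrialEvent =
  | { type: "trial_start"; trial_number: number; params: Record<string, number> }
  | { type: "trial_complete"; trial_number: number; objectives: Record<string, number>;
      is_best: boolean; is_feasible: boolean }
  | { type: "optimization_complete"; best_trial: number; total_trials: number };

interface RunView {
  running: boolean;
  currentTrial?: number;
  completed: number;
  bestTrial?: number;
}

// Parse one WebSocket frame; malformed or unknown messages yield undefined.
function parseEvent(raw: string): TrialEvent | undefined {
  let msg: unknown;
  try {
    msg = JSON.parse(raw);
  } catch {
    return undefined;
  }
  if (typeof msg !== "object" || msg === null) return undefined;
  const tag = (msg as { type?: unknown }).type;
  const known = tag === "trial_start" || tag === "trial_complete" || tag === "optimization_complete";
  return known ? (msg as TrialEvent) : undefined;
}

// Fold an event into the view state without mutating the previous state.
function applyEvent(view: RunView, ev: TrialEvent): RunView {
  switch (ev.type) {
    case "trial_start":
      return { ...view, running: true, currentTrial: ev.trial_number };
    case "trial_complete":
      return {
        ...view,
        completed: view.completed + 1,
        bestTrial: ev.is_best ? ev.trial_number : view.bestTrial,
      };
    case "optimization_complete":
      return { ...view, running: false, currentTrial: undefined, bestTrial: ev.best_trial };
  }
}
```

A hook such as the planned `useOptimizationWebSocket.ts` could call `parseEvent` inside the socket's `onmessage` handler and pass the result to `applyEvent`.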

### Implementation Tasks

| Task | File | Description |
|------|------|-------------|
| 4.1 | `websocket.py` | Add optimization events to WS |
| 4.2 | `run_optimization.py` | Emit events during optimization |
| 4.3 | `useOptimizationWebSocket.ts` | Hook for WS subscription |
| 4.4 | `SpecRenderer.tsx` | Use WS instead of polling |
| 4.5 | `ResultBadge.tsx` | Animate on new results |

---

## Phase 5: Convergence Visualization (MEDIUM PRIORITY)

### Problem
- No visual feedback on optimization progress
- Can't tell if converging or stuck
- No Pareto front visualization for multi-objective

### Solution: Embedded Charts

### Components

| Component | Description |
|-----------|-------------|
| `ConvergenceSparkline` | Tiny chart in ObjectiveNode showing trend |
| `ProgressRing` | Circular progress in header (trials/total) |
| `ConvergenceChart` | Full chart in Results panel |
| `ParetoPlot` | 2D Pareto front for multi-objective |
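
For the sparkline, the data preparation is independent of React: compute the running best objective value, then scale it into an SVG path string. The sketch below assumes a minimization objective and skips non-finite values (such as `NaN` from failed extractors); both function names are hypothetical.

```typescript
// Running best (minimum) objective value after each successful trial.
// Assumes minimization; non-finite values (e.g. NaN) are skipped.
function bestSoFar(values: number[]): number[] {
  const out: number[] = [];
  let best = Infinity;
  for (const v of values) {
    if (!isFinite(v)) continue;
    best = Math.min(best, v);
    out.push(best);
  }
  return out;
}

// Scale a series into the "d" attribute of an SVG <path>.
function sparklinePath(values: number[], width: number, height: number): string {
  if (values.length === 0) return "";
  const min = Math.min(...values);
  const max = Math.max(...values);
  const span = max - min || 1; // avoid division by zero for a flat series
  const stepX = values.length > 1 ? width / (values.length - 1) : 0;
  return values
    .map((v, i) => {
      const x = i * stepX;
      const y = height - ((v - min) / span) * height; // SVG y grows downward
      return `${i === 0 ? "M" : "L"}${x.toFixed(1)},${y.toFixed(1)}`;
    })
    .join(" ");
}
```

A component could render the result as `<path d={sparklinePath(bestSoFar(history), 60, 16)} />` inside a small `<svg>`, where `history` is the list of objective values received so far.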

### Implementation Tasks

| Task | File | Description |
|------|------|-------------|
| 5.1 | `ConvergenceSparkline.tsx` | SVG sparkline component |
| 5.2 | `ObjectiveNode.tsx` | Integrate sparkline |
| 5.3 | `ProgressRing.tsx` | Circular progress indicator |
| 5.4 | `ConvergenceChart.tsx` | Full chart with Recharts |
| 5.5 | `ResultsPanel.tsx` | Panel showing detailed results |
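
For the `ParetoPlot` component, the points to highlight are the non-dominated trials. A minimal sketch, assuming two objectives that are both minimized and a hypothetical `TrialPoint` shape:

```typescript
// Hypothetical point shape: two objective values per trial, both minimized.
interface TrialPoint { trial: number; f1: number; f2: number }

// a dominates b if it is no worse in both objectives and strictly better in one.
function dominates(a: TrialPoint, b: TrialPoint): boolean {
  return a.f1 <= b.f1 && a.f2 <= b.f2 && (a.f1 < b.f1 || a.f2 < b.f2);
}

// Non-dominated points, sorted by the first objective for plotting as a line.
// Quadratic in the number of trials, which is fine for a few hundred points.
function paretoFront(points: TrialPoint[]): TrialPoint[] {
  return points
    .filter((p) => !points.some((q) => dominates(q, p)))
    .sort((a, b) => a.f1 - b.f1);
}
```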

---

## Phase 6: End-to-End Testing (MEDIUM PRIORITY)

### Problem
- No automated tests for canvas operations
- Manual testing is time-consuming and error-prone
- Regressions go unnoticed

### Solution: Playwright E2E Tests

### Test Scenarios

| Test | Steps | Assertions |
|------|-------|------------|
| Load study | Navigate to /canvas/{id} | Spec loads, nodes render |
| Add design var | Drag from palette | Node appears, spec updates |
| Connect nodes | Drag edge | Edge renders, spec has edge |
| Edit node | Click node, change value | Value persists, API called |
| Run validation | Click validate | Errors shown for incomplete |
| Start optimization | Complete spec, click run | Status shows running |
| View results | Wait for trial | Badge shows value |
| Stop optimization | Click stop | Status shows stopped |
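
A starting point for `playwright.config.ts` might look like the fragment below. It uses Playwright's `defineConfig` and `devices` exports; the `baseURL` and the retry policy are assumptions to adjust for the actual dev server and CI setup.

```typescript
// playwright.config.ts (sketch; baseURL and retry count are assumptions)
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './e2e',
  // Retry only in CI, where flaky network or startup timing is more likely.
  retries: process.env.CI ? 2 : 0,
  use: {
    baseURL: 'http://localhost:5173', // hypothetical dev server address
    trace: 'on-first-retry',
  },
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
  ],
});
```

With `baseURL` set, a scenario such as "Load study" can navigate with a relative path like `/canvas/{id}`.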

### Implementation Tasks

| Task | File | Description |
|------|------|-------------|
| 6.1 | `e2e/canvas.spec.ts` | Basic canvas operations |
| 6.2 | `e2e/optimization.spec.ts` | Run/stop/status flow |
| 6.3 | `e2e/panels.spec.ts` | Panel open/close/persist |
| 6.4 | `playwright.config.ts` | Configure Playwright |
| 6.5 | `CI workflow` | Run tests in GitHub Actions |

---

## Implementation Order

```
Week 1:
├── Phase 1: Panel Management (critical UX fix)
│   ├── Day 1-2: usePanelStore + PanelContainer
│   └── Day 3-4: Refactor existing panels
│
├── Phase 2: Validation (prevent user errors)
│   └── Day 5: Validation rules + UI

Week 2:
├── Phase 3: Error Handling
│   ├── Day 1-2: Error types + ErrorPanel
│   └── Day 3: Integration with optimization flow
│
├── Phase 4: WebSocket Updates
│   └── Day 4-5: WS events + frontend hook

Week 3:
├── Phase 5: Visualization
│   ├── Day 1-2: Sparklines
│   └── Day 3: Progress indicators
│
├── Phase 6: Testing
│   └── Day 4-5: Playwright setup + core tests
```

---

## Quick Wins (Can Do Now)

These can be implemented immediately with minimal changes:

1. **Persist introspection data in localStorage**
   - Cache introspection results
   - Restore on panel reopen

2. **Add loading states to all buttons**
   - Disable during operations
   - Show spinners

3. **Add confirmation dialogs**
   - Before stopping optimization
   - Before clearing canvas

4. **Improve error messages**
   - Parse NX error logs
   - Show actionable suggestions
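
Quick win 1 could be sketched as a pair of helpers around a storage object. The store is passed in rather than referenced globally so the helpers can be exercised without a browser; `window.localStorage` satisfies the `KeyValueStore` interface. The key prefix and function names are hypothetical.

```typescript
// Minimal subset of the Web Storage API; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const CACHE_PREFIX = "introspection:"; // hypothetical key prefix

// Cache a result under the model file path. Caching is best-effort:
// a full or unavailable store must not break the panel.
function saveIntrospection(store: KeyValueStore, filePath: string, data: unknown): void {
  try {
    store.setItem(CACHE_PREFIX + filePath, JSON.stringify(data));
  } catch {
    // Quota exceeded or storage disabled: ignore and continue uncached.
  }
}

// Restore a cached result, or undefined if missing or corrupted.
function loadIntrospection<T>(store: KeyValueStore, filePath: string): T | undefined {
  const raw = store.getItem(CACHE_PREFIX + filePath);
  if (raw === null) return undefined;
  try {
    return JSON.parse(raw) as T;
  } catch {
    return undefined;
  }
}
```

On panel reopen, the panel could call `loadIntrospection(window.localStorage, filePath)` and fall back to a fresh introspection request when it returns `undefined`.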

---

## Files to Create/Modify

### New Files
```
atomizer-dashboard/frontend/src/
├── hooks/
│   ├── usePanelStore.ts
│   └── useOptimizationWebSocket.ts
├── components/canvas/
│   ├── PanelContainer.tsx
│   ├── panels/
│   │   ├── ValidationPanel.tsx
│   │   ├── ErrorPanel.tsx
│   │   └── ResultsPanel.tsx
│   └── visualization/
│       ├── ConvergenceSparkline.tsx
│       ├── ProgressRing.tsx
│       └── ConvergenceChart.tsx
└── lib/
    └── validation/
        └── specValidator.ts

e2e/
├── canvas.spec.ts
├── optimization.spec.ts
└── panels.spec.ts
```

### Modified Files
```
atomizer-dashboard/frontend/src/
├── pages/CanvasView.tsx
├── components/canvas/SpecRenderer.tsx
├── components/canvas/panels/IntrospectionPanel.tsx
├── components/canvas/panels/NodeConfigPanelV2.tsx
├── components/canvas/nodes/ObjectiveNode.tsx
└── hooks/useSpecStore.ts

atomizer-dashboard/backend/api/
├── routes/optimization.py
├── routes/spec.py
└── websocket.py
```

---

## Success Criteria

| Phase | Success Metric |
|-------|----------------|
| 1 | Introspection panel persists across node selections |
| 2 | Invalid spec shows clear error before run |
| 3 | NX errors display with recovery options |
| 4 | Results update within 500ms of trial completion |
| 5 | Convergence trend visible on objective nodes |
| 6 | All E2E tests pass in CI |

---

## Next Steps

1. Review this plan
2. Start with Phase 1 (Panel Management) - fixes your immediate issue
3. Implement incrementally, commit after each phase