feat: Add MLP surrogate with Turbo Mode for 100x faster optimization
Neural Acceleration (MLP Surrogate): - Add run_nn_optimization.py with hybrid FEA/NN workflow - MLP architecture: 4-layer (64->128->128->64) with BatchNorm/Dropout - Three workflow modes: - --all: Sequential export->train->optimize->validate - --hybrid-loop: Iterative Train->NN->Validate->Retrain cycle - --turbo: Aggressive single-best validation (RECOMMENDED) - Turbo mode: 5000 NN trials + 50 FEA validations in ~12 minutes - Separate nn_study.db to avoid overloading dashboard Performance Results (bracket_pareto_3obj study): - NN prediction errors: mass 1-5%, stress 1-4%, stiffness 5-15% - Found minimum mass designs at boundary (angle~30deg, thick~30mm) - 100x speedup vs pure FEA exploration Protocol Operating System: - Add .claude/skills/ with Bootstrap, Cheatsheet, Context Loader - Add docs/protocols/ with operations (OP_01-06) and system (SYS_10-14) - Update SYS_14_NEURAL_ACCELERATION.md with MLP Turbo Mode docs NX Automation: - Add optimization_engine/hooks/ for NX CAD/CAE automation - Add study_wizard.py for guided study creation - Fix FEM mesh update: load idealized part before UpdateFemodel() New Study: - bracket_pareto_3obj: 3-objective Pareto (mass, stress, stiffness) - 167 FEA trials + 5000 NN trials completed - Demonstrates full hybrid workflow 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
246
docs/protocols/operations/OP_03_MONITOR_PROGRESS.md
Normal file
246
docs/protocols/operations/OP_03_MONITOR_PROGRESS.md
Normal file
@@ -0,0 +1,246 @@
|
||||
# OP_03: Monitor Progress
|
||||
|
||||
<!--
|
||||
PROTOCOL: Monitor Optimization Progress
|
||||
LAYER: Operations
|
||||
VERSION: 1.0
|
||||
STATUS: Active
|
||||
LAST_UPDATED: 2025-12-05
|
||||
PRIVILEGE: user
|
||||
LOAD_WITH: [SYS_13_DASHBOARD_TRACKING]
|
||||
-->
|
||||
|
||||
## Overview
|
||||
|
||||
This protocol covers monitoring optimization progress through console output, dashboard, database queries, and Optuna's built-in tools.
|
||||
|
||||
---
|
||||
|
||||
## When to Use
|
||||
|
||||
| Trigger | Action |
|
||||
|---------|--------|
|
||||
| "status", "progress" | Follow this protocol |
|
||||
| "how many trials" | Query database |
|
||||
| "what's happening" | Check console or dashboard |
|
||||
| "is it running" | Check process status |
|
||||
|
||||
---
|
||||
|
||||
## Quick Reference
|
||||
|
||||
| Method | Command/URL | Best For |
|
||||
|--------|-------------|----------|
|
||||
| Console | Watch terminal output | Quick check |
|
||||
| Dashboard | `http://localhost:3000` | Visual monitoring |
|
||||
| Database query | Python one-liner | Scripted checks |
|
||||
| Optuna Dashboard | `http://localhost:8080` | Detailed analysis |
|
||||
|
||||
---
|
||||
|
||||
## Monitoring Methods
|
||||
|
||||
### 1. Console Output
|
||||
|
||||
If running in foreground, watch terminal:
|
||||
```
|
||||
[10:15:30] Trial 15/50 started
|
||||
[10:17:45] Trial 15/50 complete: mass=0.234 kg (best: 0.212 kg)
|
||||
[10:17:46] Trial 16/50 started
|
||||
```
|
||||
|
||||
### 2. Atomizer Dashboard
|
||||
|
||||
**Start Dashboard** (if not running):
|
||||
```bash
|
||||
# Terminal 1: Backend
|
||||
cd atomizer-dashboard/backend
|
||||
python -m uvicorn api.main:app --reload --port 8000
|
||||
|
||||
# Terminal 2: Frontend
|
||||
cd atomizer-dashboard/frontend
|
||||
npm run dev
|
||||
```
|
||||
|
||||
**View at**: `http://localhost:3000`
|
||||
|
||||
**Features**:
|
||||
- Real-time trial progress bar
|
||||
- Current optimizer phase (if Protocol 10)
|
||||
- Pareto front visualization (if multi-objective)
|
||||
- Parallel coordinates plot
|
||||
- Convergence chart
|
||||
|
||||
### 3. Database Query
|
||||
|
||||
**Quick status**:
|
||||
```bash
|
||||
python -c "
|
||||
import optuna
|
||||
study = optuna.load_study(
|
||||
study_name='my_study',
|
||||
storage='sqlite:///studies/my_study/2_results/study.db'
|
||||
)
|
||||
print(f'Trials completed: {len(study.trials)}')
|
||||
print(f'Best value: {study.best_value}')
|
||||
print(f'Best params: {study.best_params}')
|
||||
"
|
||||
```
|
||||
|
||||
**Detailed status**:
|
||||
```python
|
||||
import optuna
|
||||
|
||||
study = optuna.load_study(
|
||||
study_name='my_study',
|
||||
storage='sqlite:///studies/my_study/2_results/study.db'
|
||||
)
|
||||
|
||||
# Trial counts by state
|
||||
from collections import Counter
|
||||
states = Counter(t.state.name for t in study.trials)
|
||||
print(f"Complete: {states.get('COMPLETE', 0)}")
|
||||
print(f"Pruned: {states.get('PRUNED', 0)}")
|
||||
print(f"Failed: {states.get('FAIL', 0)}")
|
||||
print(f"Running: {states.get('RUNNING', 0)}")
|
||||
|
||||
# Best trials
|
||||
if len(study.directions) > 1:
|
||||
print(f"Pareto front size: {len(study.best_trials)}")
|
||||
else:
|
||||
print(f"Best value: {study.best_value}")
|
||||
```
|
||||
|
||||
### 4. Optuna Dashboard
|
||||
|
||||
```bash
|
||||
optuna-dashboard sqlite:///studies/my_study/2_results/study.db
|
||||
# Open http://localhost:8080
|
||||
```
|
||||
|
||||
**Features**:
|
||||
- Trial history table
|
||||
- Parameter importance
|
||||
- Optimization history plot
|
||||
- Slice plot (parameter vs objective)
|
||||
|
||||
### 5. Check Running Processes
|
||||
|
||||
```bash
|
||||
# Linux/Mac
|
||||
ps aux | grep run_optimization
|
||||
|
||||
# Windows
|
||||
tasklist | findstr python
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Key Metrics to Monitor
|
||||
|
||||
### Trial Progress
|
||||
- Completed trials vs target
|
||||
- Completion rate (trials/hour)
|
||||
- Estimated time remaining
|
||||
|
||||
### Objective Improvement
|
||||
- Current best value
|
||||
- Improvement trend
|
||||
- Plateau detection
|
||||
|
||||
### Constraint Satisfaction
|
||||
- Feasibility rate (% passing constraints)
|
||||
- Most violated constraint
|
||||
|
||||
### For Protocol 10 (IMSO)
|
||||
- Current phase (Characterization vs Optimization)
|
||||
- Current strategy (TPE, GP, CMA-ES)
|
||||
- Characterization confidence
|
||||
|
||||
### For Protocol 11 (Multi-Objective)
|
||||
- Pareto front size
|
||||
- Hypervolume indicator
|
||||
- Spread of solutions
|
||||
|
||||
---
|
||||
|
||||
## Interpreting Results
|
||||
|
||||
### Healthy Optimization
|
||||
```
|
||||
Trial 45/50: mass=0.198 kg (best: 0.195 kg)
|
||||
Feasibility rate: 78%
|
||||
```
|
||||
- Progress toward target
|
||||
- Reasonable feasibility rate (60-90%)
|
||||
- Gradual improvement
|
||||
|
||||
### Potential Issues
|
||||
|
||||
**All Trials Pruned**:
|
||||
```
|
||||
Trial 20 pruned: constraint violated
|
||||
Trial 21 pruned: constraint violated
|
||||
...
|
||||
```
|
||||
→ Constraints too tight. Consider relaxing.
|
||||
|
||||
**No Improvement**:
|
||||
```
|
||||
Trial 30: best=0.234 (unchanged since trial 8)
|
||||
Trial 31: best=0.234 (unchanged since trial 8)
|
||||
```
|
||||
→ May have converged, or stuck in local minimum.
|
||||
|
||||
**High Failure Rate**:
|
||||
```
|
||||
Failed: 15/50 (30%)
|
||||
```
|
||||
→ Model issues. Check NX logs.
|
||||
|
||||
---
|
||||
|
||||
## Real-Time State File
|
||||
|
||||
If using Protocol 10, check:
|
||||
```bash
|
||||
cat studies/my_study/2_results/intelligent_optimizer/optimizer_state.json
|
||||
```
|
||||
|
||||
```json
|
||||
{
|
||||
"timestamp": "2025-12-05T10:15:30",
|
||||
"trial_number": 29,
|
||||
"total_trials": 50,
|
||||
"current_phase": "adaptive_optimization",
|
||||
"current_strategy": "GP_UCB",
|
||||
"is_multi_objective": false
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
| Symptom | Cause | Solution |
|
||||
|---------|-------|----------|
|
||||
| Dashboard shows old data | Backend not running | Start backend |
|
||||
| "No study found" | Wrong path | Check study name and path |
|
||||
| Trial count not increasing | Process stopped | Check if still running |
|
||||
| Dashboard not updating | Polling issue | Refresh browser |
|
||||
|
||||
---
|
||||
|
||||
## Cross-References
|
||||
|
||||
- **Preceded By**: [OP_02_RUN_OPTIMIZATION](./OP_02_RUN_OPTIMIZATION.md)
|
||||
- **Followed By**: [OP_04_ANALYZE_RESULTS](./OP_04_ANALYZE_RESULTS.md)
|
||||
- **Integrates With**: [SYS_13_DASHBOARD_TRACKING](../system/SYS_13_DASHBOARD_TRACKING.md)
|
||||
|
||||
---
|
||||
|
||||
## Version History
|
||||
|
||||
| Version | Date | Changes |
|
||||
|---------|------|---------|
|
||||
| 1.0 | 2025-12-05 | Initial release |
|
||||
Reference in New Issue
Block a user