# OP_03: Monitor Progress

<!--
PROTOCOL: Monitor Optimization Progress
LAYER: Operations
VERSION: 1.0
STATUS: Active
LAST_UPDATED: 2025-12-05
PRIVILEGE: user
LOAD_WITH: [SYS_13_DASHBOARD_TRACKING]
-->

## Overview

This protocol covers monitoring optimization progress through console output, the dashboard, database queries, and Optuna's built-in tools.

---

## When to Use

| Trigger | Action |
|---------|--------|
| "status", "progress" | Follow this protocol |
| "how many trials" | Query database |
| "what's happening" | Check console or dashboard |
| "is it running" | Check process status |

---

## Quick Reference

| Method | Command/URL | Best For |
|--------|-------------|----------|
| Console | Watch terminal output | Quick check |
| Dashboard | `http://localhost:3000` | Visual monitoring |
| Database query | Python one-liner | Scripted checks |
| Optuna Dashboard | `http://localhost:8080` | Detailed analysis |

---

## Monitoring Methods

### 1. Console Output

If running in the foreground, watch the terminal:

```
[10:15:30] Trial 15/50 started
[10:17:45] Trial 15/50 complete: mass=0.234 kg (best: 0.212 kg)
[10:17:46] Trial 16/50 started
```

### 2. Atomizer Dashboard

**Start Dashboard** (if not running):

```bash
# Terminal 1: Backend
cd atomizer-dashboard/backend
python -m uvicorn api.main:app --reload --port 8000

# Terminal 2: Frontend
cd atomizer-dashboard/frontend
npm run dev
```

**View at**: `http://localhost:3000`

**Features**:
- Real-time trial progress bar
- Current optimizer phase (if Protocol 10)
- Pareto front visualization (if multi-objective)
- Parallel coordinates plot
- Convergence chart

### 3. Database Query

**Quick status**:

```bash
python -c "
import optuna
study = optuna.load_study(
    study_name='my_study',
    storage='sqlite:///studies/my_study/2_results/study.db'
)
print(f'Trials completed: {len(study.trials)}')
print(f'Best value: {study.best_value}')    # single-objective studies only
print(f'Best params: {study.best_params}')  # single-objective studies only
"
```

**Detailed status**:

```python
from collections import Counter

import optuna

study = optuna.load_study(
    study_name='my_study',
    storage='sqlite:///studies/my_study/2_results/study.db'
)

# Trial counts by state
states = Counter(t.state.name for t in study.trials)
print(f"Complete: {states.get('COMPLETE', 0)}")
print(f"Pruned: {states.get('PRUNED', 0)}")
print(f"Failed: {states.get('FAIL', 0)}")
print(f"Running: {states.get('RUNNING', 0)}")

# Best trials
if len(study.directions) > 1:
    print(f"Pareto front size: {len(study.best_trials)}")
else:
    print(f"Best value: {study.best_value}")
```

### 4. Optuna Dashboard

```bash
optuna-dashboard sqlite:///studies/my_study/2_results/study.db
# Open http://localhost:8080
```

**Features**:
- Trial history table
- Parameter importance
- Optimization history plot
- Slice plot (parameter vs objective)

### 5. Check Running Processes

```bash
# Linux/Mac
ps aux | grep run_optimization

# Windows
tasklist | findstr python
```

---

## Key Metrics to Monitor

### Trial Progress

- Completed trials vs target
- Completion rate (trials/hour)
- Estimated time remaining

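The completion rate and time-remaining estimate follow directly from the trial count and elapsed wall time. A minimal sketch (the start time could come from your own log or the first trial's `datetime_start`):

```python
from datetime import datetime, timedelta

def estimate_remaining(completed: int, target: int,
                       started_at: datetime, now: datetime) -> timedelta:
    """Extrapolate time remaining from the observed completion rate."""
    elapsed = (now - started_at).total_seconds()
    rate = completed / max(elapsed, 1e-9)  # trials per second
    return timedelta(seconds=(target - completed) / rate)

# 20 of 50 trials done after 2 hours -> 10 trials/hour -> 3 hours remain
print(estimate_remaining(20, 50,
                         datetime(2025, 12, 5, 10, 0),
                         datetime(2025, 12, 5, 12, 0)))  # 3:00:00
```

This assumes a roughly constant per-trial cost; FEA trial times can drift with mesh size, so treat the estimate as a rough guide.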
### Objective Improvement

- Current best value
- Improvement trend
- Plateau detection

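A plateau can be flagged by comparing the running best value against its value a fixed window of trials earlier; a sketch (the 10-trial window and tolerance are arbitrary choices):

```python
def plateaued(running_best: list[float], window: int = 10,
              tol: float = 1e-6) -> bool:
    """True if the running best value has not improved in the last `window` trials."""
    if len(running_best) <= window:
        return False  # too little history to judge
    return abs(running_best[-1] - running_best[-1 - window]) < tol

# Running best per trial: improves early, then flat for 10 trials
history = [0.30, 0.27, 0.25, 0.234] + [0.234] * 10
print(plateaued(history))  # True
```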
### Constraint Satisfaction

- Feasibility rate (% passing constraints)
- Most violated constraint

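The feasibility rate can be computed from trial states, assuming (as in this workflow, see Interpreting Results) that constraint-violating trials are recorded as pruned; a sketch:

```python
from collections import Counter

def feasibility_rate(states: list[str]) -> float:
    """Share of finished trials that completed, counting pruned trials as infeasible."""
    counts = Counter(states)
    finished = counts["COMPLETE"] + counts["PRUNED"]
    return counts["COMPLETE"] / finished if finished else 0.0

# In practice: states = [t.state.name for t in study.trials]
states = ["COMPLETE"] * 39 + ["PRUNED"] * 11
print(f"Feasibility rate: {feasibility_rate(states):.0%}")  # Feasibility rate: 78%
```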
### For Protocol 10 (IMSO)

- Current phase (Characterization vs Optimization)
- Current strategy (TPE, GP, CMA-ES)
- Characterization confidence

### For Protocol 11 (Multi-Objective)

- Pareto front size
- Hypervolume indicator
- Spread of solutions

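The hypervolume indicator measures how much objective space, bounded by a reference point, the Pareto front dominates; stalling growth suggests convergence. For intuition, a 2-objective minimization sketch (a 3-objective study such as bracket_pareto_3obj would need a library routine instead):

```python
def hypervolume_2d(front: list[tuple[float, float]],
                   ref: tuple[float, float]) -> float:
    """Area dominated by a 2-objective (minimization) Pareto front, bounded by ref."""
    hv, prev_y = 0.0, ref[1]
    for x, y in sorted(front):  # ascending f1 => descending f2 on a clean front
        hv += (ref[0] - x) * (prev_y - y)
        prev_y = y
    return hv

front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
print(hypervolume_2d(front, ref=(4.0, 4.0)))  # 6.0
```

This assumes the points are mutually non-dominated and lie within the reference point.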
---
## Interpreting Results

### Healthy Optimization

```
Trial 45/50: mass=0.198 kg (best: 0.195 kg)
Feasibility rate: 78%
```

- Progress toward target
- Reasonable feasibility rate (60-90%)
- Gradual improvement

### Potential Issues

**All Trials Pruned**:

```
Trial 20 pruned: constraint violated
Trial 21 pruned: constraint violated
...
```

→ Constraints too tight. Consider relaxing.

**No Improvement**:

```
Trial 30: best=0.234 (unchanged since trial 8)
Trial 31: best=0.234 (unchanged since trial 8)
```

→ May have converged, or be stuck in a local minimum.

**High Failure Rate**:

```
Failed: 15/50 (30%)
```

→ Model issues. Check NX logs.

---
## Real-Time State File

If using Protocol 10, check:

```bash
cat studies/my_study/2_results/intelligent_optimizer/optimizer_state.json
```

```json
{
  "timestamp": "2025-12-05T10:15:30",
  "trial_number": 29,
  "total_trials": 50,
  "current_phase": "adaptive_optimization",
  "current_strategy": "GP_UCB",
  "is_multi_objective": false
}
```
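
For scripted checks, the state file can be parsed with the standard library; a sketch using the field names shown above:

```python
import json
from pathlib import Path

def summarize_state(path: str) -> str:
    """One-line summary of the Protocol 10 optimizer state file."""
    state = json.loads(Path(path).read_text())
    return (f"Trial {state['trial_number']}/{state['total_trials']} | "
            f"phase={state['current_phase']} | strategy={state['current_strategy']}")

# summarize_state("studies/my_study/2_results/intelligent_optimizer/optimizer_state.json")
```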

---

## Troubleshooting

| Symptom | Cause | Solution |
|---------|-------|----------|
| Dashboard shows old data | Backend not running | Start backend |
| "No study found" | Wrong path | Check study name and path |
| Trial count not increasing | Process stopped | Check if still running |
| Dashboard not updating | Polling issue | Refresh browser |

---
## Cross-References

- **Preceded By**: [OP_02_RUN_OPTIMIZATION](./OP_02_RUN_OPTIMIZATION.md)
- **Followed By**: [OP_04_ANALYZE_RESULTS](./OP_04_ANALYZE_RESULTS.md)
- **Integrates With**: [SYS_13_DASHBOARD_TRACKING](../system/SYS_13_DASHBOARD_TRACKING.md)

---
## Version History

| Version | Date | Changes |
|---------|------|---------|
| 1.0 | 2025-12-05 | Initial release |