feat: Add MLP surrogate with Turbo Mode for 100x faster optimization

Neural Acceleration (MLP Surrogate):
- Add run_nn_optimization.py with hybrid FEA/NN workflow
- MLP architecture: 4-layer (64->128->128->64) with BatchNorm/Dropout
- Three workflow modes:
  - --all: Sequential export->train->optimize->validate
  - --hybrid-loop: Iterative Train->NN->Validate->Retrain cycle
  - --turbo: Aggressive single-best validation (RECOMMENDED)
- Turbo mode: 5000 NN trials + 50 FEA validations in ~12 minutes
- Separate nn_study.db to avoid overloading dashboard

Performance Results (bracket_pareto_3obj study):
- NN prediction errors: mass 1-5%, stress 1-4%, stiffness 5-15%
- Found minimum mass designs at boundary (angle~30deg, thick~30mm)
- 100x speedup vs pure FEA exploration

Protocol Operating System:
- Add .claude/skills/ with Bootstrap, Cheatsheet, Context Loader
- Add docs/protocols/ with operations (OP_01-06) and system (SYS_10-14)
- Update SYS_14_NEURAL_ACCELERATION.md with MLP Turbo Mode docs

NX Automation:
- Add optimization_engine/hooks/ for NX CAD/CAE automation
- Add study_wizard.py for guided study creation
- Fix FEM mesh update: load idealized part before UpdateFemodel()

New Study:
- bracket_pareto_3obj: 3-objective Pareto (mass, stress, stiffness)
- 167 FEA trials + 5000 NN trials completed
- Demonstrates full hybrid workflow

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

297
docs/protocols/operations/OP_02_RUN_OPTIMIZATION.md
Normal file
@@ -0,0 +1,297 @@
# OP_02: Run Optimization

<!--
PROTOCOL: Run Optimization
LAYER: Operations
VERSION: 1.0
STATUS: Active
LAST_UPDATED: 2025-12-05
PRIVILEGE: user
LOAD_WITH: []
-->

## Overview

This protocol covers executing optimization runs, including pre-flight validation, execution modes, monitoring, and handling common issues.

---

## When to Use

| Trigger | Action |
|---------|--------|
| "start", "run", "execute" | Follow this protocol |
| "begin optimization" | Follow this protocol |
| Study setup complete | Execute this protocol |

---

## Quick Reference

**Start Command**:
```bash
conda activate atomizer
cd studies/{study_name}
python run_optimization.py
```

**Common Options**:
| Flag | Purpose |
|------|---------|
| `--n-trials 100` | Override trial count |
| `--resume` | Continue interrupted run |
| `--test` | Run single trial for validation |
| `--export-training` | Export data for neural training |

---

## Pre-Flight Checklist

Before running, verify:

- [ ] **Environment**: `conda activate atomizer`
- [ ] **Config exists**: `1_setup/optimization_config.json`
- [ ] **Script exists**: `run_optimization.py`
- [ ] **Model files**: NX files in `1_setup/model/`
- [ ] **No conflicts**: No other optimization running on the same study
- [ ] **Disk space**: Sufficient for results

**Quick Validation**:
```bash
python run_optimization.py --test
```
This runs a single trial to verify setup.
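
The file-based items in the checklist can also be checked by script before launching a run. The following is a minimal sketch, not part of the repository: `preflight_check` is a hypothetical helper, the paths are the ones listed in the checklist, and any file inside `1_setup/model/` is treated as a model file because the NX file extensions are not stated here.

```python
from pathlib import Path

# Paths taken from the checklist, relative to the study directory.
REQUIRED_FILES = ("1_setup/optimization_config.json", "run_optimization.py")
MODEL_DIR = "1_setup/model"


def preflight_check(study_dir):
    """Return a list of human-readable problems; an empty list means OK."""
    root = Path(study_dir)
    problems = []
    for rel in REQUIRED_FILES:
        if not (root / rel).is_file():
            problems.append(f"missing file: {rel}")
    model_dir = root / MODEL_DIR
    # Assumption: any file counts as a model file; NX extensions are not checked.
    if not model_dir.is_dir() or not any(p.is_file() for p in model_dir.iterdir()):
        problems.append(f"no model files in: {MODEL_DIR}")
    return problems


if __name__ == "__main__":
    issues = preflight_check(".")
    print("\n".join(issues) if issues else "pre-flight OK")
```

This covers only the filesystem items. The environment, conflict, and disk-space checks still need to be done by hand or by running `--test`.
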

---

## Execution Modes

### 1. Standard Run

```bash
python run_optimization.py
```
Uses settings from `optimization_config.json`.

### 2. Override Trials

```bash
python run_optimization.py --n-trials 100
```
Overrides the trial count from the config.

### 3. Resume Interrupted

```bash
python run_optimization.py --resume
```
Continues from the last completed trial.

### 4. Neural Acceleration

```bash
python run_optimization.py --neural
```
Requires a trained surrogate model.

### 5. Export Training Data

```bash
python run_optimization.py --export-training
```
Saves BDF/OP2 files for neural network training.
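
To illustrate how these flags could fit together, here is a minimal `argparse` sketch. It is an assumption about structure, not the actual contents of `run_optimization.py`: the flag names come from this protocol, while `build_parser`, `resolve_trial_count`, and the rule that `--test` takes precedence over `--n-trials` are hypothetical.

```python
import argparse


def build_parser():
    """Hypothetical CLI mirroring the flags documented in this protocol."""
    p = argparse.ArgumentParser(prog="run_optimization.py")
    p.add_argument("--n-trials", type=int, default=None,
                   help="override the trial count from the config")
    p.add_argument("--resume", action="store_true",
                   help="continue an interrupted run")
    p.add_argument("--test", action="store_true",
                   help="run a single trial for validation")
    p.add_argument("--neural", action="store_true",
                   help="evaluate with a trained surrogate")
    p.add_argument("--export-training", action="store_true",
                   help="export data for neural training")
    return p


def resolve_trial_count(args, config_trials):
    """Assumed precedence: --test forces one trial, then --n-trials, then the config."""
    if args.test:
        return 1
    return args.n_trials if args.n_trials is not None else config_trials
```

With this layout, `python run_optimization.py --n-trials 100 --resume` would parse to `n_trials=100`, `resume=True`, and every other flag left at `False`.
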

---

## Monitoring Progress

### Option 1: Console Output
The script prints progress:
```
Trial 15/50 complete. Best: 0.234 kg
Trial 16/50 complete. Best: 0.234 kg
```
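
If the console output is captured to a log, progress lines of this shape can be parsed for scripted monitoring. This is a small sketch that assumes the exact line format shown in the sample output; `parse_progress` is a hypothetical helper.

```python
import re

# Matches lines such as "Trial 15/50 complete. Best: 0.234 kg".
PROGRESS_RE = re.compile(
    r"Trial (?P<done>\d+)/(?P<total>\d+) complete\. "
    r"Best: (?P<best>[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?) (?P<unit>\S+)"
)


def parse_progress(line):
    """Return (done, total, best, unit), or None if the line is not a progress line."""
    m = PROGRESS_RE.search(line)
    if m is None:
        return None
    return int(m["done"]), int(m["total"]), float(m["best"]), m["unit"]
```

Lines that do not match, such as warnings or errors, return `None` and can be skipped.
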

### Option 2: Dashboard
See [SYS_13_DASHBOARD_TRACKING](../system/SYS_13_DASHBOARD_TRACKING.md).

```bash
# Start dashboard (run each command in its own terminal)
cd atomizer-dashboard/backend && python -m uvicorn api.main:app --port 8000
cd atomizer-dashboard/frontend && npm run dev

# Then open http://localhost:3000 in a browser
```

### Option 3: Query Database

```bash
python -c "
import optuna
# load_study takes keyword arguments
study = optuna.load_study(study_name='study_name', storage='sqlite:///2_results/study.db')
print(f'Trials: {len(study.trials)}')
# best_value is for single-objective studies; use study.best_trials for a Pareto front
print(f'Best value: {study.best_value}')
"
```

### Option 4: Optuna Dashboard

```bash
optuna-dashboard sqlite:///2_results/study.db
# Open http://localhost:8080
```

---

## During Execution

### What Happens Per Trial

1. **Sample parameters**: Optuna suggests design variable values
2. **Update model**: NX expressions updated via journal
3. **Solve**: NX Nastran runs FEA simulation
4. **Extract results**: Extractors read OP2 file
5. **Evaluate**: Check constraints, compute objectives
6. **Record**: Trial stored in Optuna database
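
The six steps above can be sketched as a single function. This is an illustration only: the solver and extractor are replaced by made-up formulas so the code runs without NX, a plain list stands in for the Optuna database, and all names (`run_trial`, `solve_and_extract`, the `thickness` and `angle` variables, the 250 MPa limit) are hypothetical.

```python
import random


class TrialPruned(Exception):
    """Stand-in for the optimizer's prune signal when a constraint is violated."""


def sample_parameters(rng):
    # Step 1: stand-in for the optimizer suggesting design variable values.
    return {"thickness": rng.uniform(5.0, 30.0), "angle": rng.uniform(30.0, 60.0)}


def solve_and_extract(params):
    # Steps 2-4: stand-in for updating NX, solving, and reading the OP2 file.
    # The formulas are invented so the sketch runs without a solver.
    mass = 0.01 * params["thickness"] + 0.002 * params["angle"]
    stress = 3000.0 / params["thickness"]
    return {"mass": mass, "stress": stress}


def run_trial(number, rng, max_stress=250.0, log=None):
    params = sample_parameters(rng)
    results = solve_and_extract(params)
    # Step 5: check constraints before accepting the objective values.
    if results["stress"] > max_stress:
        raise TrialPruned(
            f"Trial {number} pruned: stress {results['stress']:.1f} MPa > {max_stress} MPa"
        )
    # Step 6: record the trial (a list stands in for the Optuna database).
    record = {"number": number, "params": params, **results}
    if log is not None:
        log.append(record)
    return record
```

A caller would loop over trial numbers, catch `TrialPruned`, and continue with the next trial.
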

### Normal Output

```
[2025-12-05 10:15:30] Trial 1 started
[2025-12-05 10:17:45] NX solve complete (135.2s)
[2025-12-05 10:17:46] Extraction complete
[2025-12-05 10:17:46] Trial 1 complete: mass=0.342 kg, stress=198.5 MPa

[2025-12-05 10:17:47] Trial 2 started
...
```

### Expected Timing

| Operation | Typical Time |
|-----------|--------------|
| NX solve | 30 s - 30 min |
| Extraction | <1 s |
| Per trial total | 1-30 min |
| 50 trials | ~1-25 hours |
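
The total-run estimate is the per-trial range multiplied by the trial count. A quick sketch of that arithmetic, assuming trials run sequentially and using the 1-30 min per-trial range from the table (`estimate_run_hours` is a hypothetical helper):

```python
def estimate_run_hours(n_trials, min_per_trial=1.0, max_per_trial=30.0):
    """Return (low, high) wall-clock estimates in hours for sequential trials."""
    low = n_trials * min_per_trial / 60.0
    high = n_trials * max_per_trial / 60.0
    return low, high


low, high = estimate_run_hours(50)
print(f"50 trials: {low:.1f} to {high:.1f} hours")  # 50 trials: 0.8 to 25.0 hours
```

For a specific study, replacing the defaults with the solve time observed during a `--test` run gives a much tighter estimate.
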

---

## Handling Issues

### Trial Failed / Pruned

```
[WARNING] Trial 12 pruned: Stress constraint violated (312.5 MPa > 250 MPa)
```
**Normal behavior** - the optimizer learns from failures.

### NX Session Timeout

```
[ERROR] NX session timeout after 600s
```
**Solution**: Increase the timeout in the config or simplify the model.

### Expression Not Found

```
[ERROR] Expression 'thicknes' not found in model
```
**Solution**: Check the spelling and verify that the expression exists in NX.

### OP2 File Missing

```
[ERROR] OP2 file not found: model.op2
```
**Solution**: Check that the NX solve completed. Review the NX log file.

### Database Locked

```
[ERROR] Database is locked
```
**Solution**: Another process is using the database. Wait for it to finish, or kill the stale process.
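
The "wait" option can be scripted with Python's standard `sqlite3` module, whose `connect()` accepts a `timeout` that makes SQLite wait for a lock instead of failing at once. The sketch below is an assumption about how one might probe the study database, not something the optimizer does itself; `wait_for_database` is a hypothetical helper. Note that `sqlite3.connect` creates the file if it does not exist, so pass the path of an existing database.

```python
import sqlite3
import time


def wait_for_database(path, attempts=5, delay_s=2.0, busy_timeout_s=30.0):
    """Return True once a read succeeds, False if the database stays locked."""
    for attempt in range(attempts):
        try:
            # timeout= makes SQLite wait for a lock instead of failing immediately.
            conn = sqlite3.connect(path, timeout=busy_timeout_s)
            try:
                conn.execute("SELECT count(*) FROM sqlite_master").fetchone()
            finally:
                conn.close()
            return True
        except sqlite3.OperationalError as exc:
            if "locked" not in str(exc).lower():
                raise  # a different problem, e.g. an unreadable file
            time.sleep(delay_s * (attempt + 1))  # wait a little longer each time
    return False
```

If this keeps returning `False`, the lock is probably held by a stale process rather than a busy one.
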

---

## Stopping and Resuming

### Graceful Stop
Press `Ctrl+C` once. The current trial completes, then the script exits.

### Force Stop
Press `Ctrl+C` twice. The script exits immediately (the current trial may be lost).
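
One common way to get this one-press/two-press behavior is a `SIGINT` handler that sets a flag on the first signal and raises on the second. The sketch below shows that pattern under the assumption that trials run sequentially in the main thread. Whether `run_optimization.py` implements it this way is not stated here, and `StopController` and `run_trials` are hypothetical names.

```python
import signal


class StopController:
    """First SIGINT requests a graceful stop; a second one aborts immediately."""

    def __init__(self):
        self.stop_requested = False

    def handle_sigint(self, signum, frame):
        if self.stop_requested:
            # Second Ctrl+C: give up on the current trial.
            raise KeyboardInterrupt
        self.stop_requested = True
        print("Stop requested: finishing current trial. Press Ctrl+C again to force exit.")

    def install(self):
        signal.signal(signal.SIGINT, self.handle_sigint)


def run_trials(n_trials, run_one, controller):
    """Run trials sequentially, checking the stop flag between trials."""
    completed = 0
    for number in range(n_trials):
        if controller.stop_requested:
            break
        run_one(number)
        completed += 1
    return completed
```

A script would call `controller.install()` once at startup and then pass the controller to `run_trials`.
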

### Resume
```bash
python run_optimization.py --resume
```
Continues from the last completed trial, using the same study database.

---

## Post-Run Actions

After optimization completes:

1. **Check results**:
   ```bash
   python -c "import optuna; s=optuna.load_study(...); print(s.best_params)"
   ```

2. **View in dashboard**: `http://localhost:3000`

3. **Generate report**: See [OP_04_ANALYZE_RESULTS](./OP_04_ANALYZE_RESULTS.md)

4. **Update STUDY_REPORT.md**: Fill in results template

---

## Protocol Integration

### With Protocol 10 (IMSO)
If enabled, optimization runs in two phases:
1. Characterization (10-30 trials)
2. Optimization (remaining trials)

Dashboard shows phase transitions.

### With Protocol 11 (Multi-Objective)
If the study has 2+ objectives, it uses NSGA-II and returns a Pareto front rather than a single best design.
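
A Pareto front is the set of designs that no other design beats in every objective at once. The sketch below is a plain non-dominated filter for minimization, shown only to make that idea concrete. It is not NSGA-II and not how Optuna computes its front, and the design values are invented for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the non-dominated points, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (mass [kg], stress [MPa]) results, both minimized.
designs = [(0.30, 240.0), (0.34, 200.0), (0.40, 210.0), (0.45, 150.0)]
print(pareto_front(designs))
```

Here the third design is dropped because the second is both lighter and less stressed. The remaining three each trade mass against stress, so none of them is a single "best".
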

### With Protocol 13 (Dashboard)
Writes `optimizer_state.json` every trial for real-time updates.
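
Because a dashboard may read the state file while it is being rewritten, a write-then-rename pattern is a common way to avoid exposing a half-written file. The sketch below shows that pattern with the standard library. It is an assumption about a reasonable approach, not the optimizer's actual code, and `write_state` and the fields passed to it are hypothetical.

```python
import json
import os
import tempfile


def write_state(path, state):
    """Write state as JSON via a temp file in the same directory, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(state, fh, indent=2)
        os.replace(tmp_path, path)  # atomic when both paths are on the same filesystem
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Calling `write_state("optimizer_state.json", {...})` after each trial means a reader sees either the previous state or the new one, never a partial file.
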

### With Protocol 14 (Neural)
If the `--neural` flag is set, uses the trained surrogate for fast evaluation.

---

## Troubleshooting

| Symptom | Cause | Solution |
|---------|-------|----------|
| "ModuleNotFoundError" | Wrong environment | `conda activate atomizer` |
| All trials pruned | Constraints too tight | Relax constraints |
| Very slow | Model too complex | Simplify mesh, increase timeout |
| No improvement | Wrong sampler | Try different algorithm |
| "NX license error" | License unavailable | Check NX license server |

---

## Cross-References

- **Preceded By**: [OP_01_CREATE_STUDY](./OP_01_CREATE_STUDY.md)
- **Followed By**: [OP_03_MONITOR_PROGRESS](./OP_03_MONITOR_PROGRESS.md), [OP_04_ANALYZE_RESULTS](./OP_04_ANALYZE_RESULTS.md)
- **Integrates With**: [SYS_10_IMSO](../system/SYS_10_IMSO.md), [SYS_13_DASHBOARD_TRACKING](../system/SYS_13_DASHBOARD_TRACKING.md)

---

## Version History

| Version | Date | Changes |
|---------|------|---------|
| 1.0 | 2025-12-05 | Initial release |