feat: Add MLP surrogate with Turbo Mode for 100x faster optimization

Neural Acceleration (MLP Surrogate):
- Add run_nn_optimization.py with hybrid FEA/NN workflow
- MLP architecture: 4-layer (64->128->128->64) with BatchNorm/Dropout
- Three workflow modes:
  - --all: Sequential export->train->optimize->validate
  - --hybrid-loop: Iterative Train->NN->Validate->Retrain cycle
  - --turbo: Aggressive single-best validation (RECOMMENDED)
- Turbo mode: 5000 NN trials + 50 FEA validations in ~12 minutes
- Separate nn_study.db to avoid overloading dashboard

Performance Results (bracket_pareto_3obj study):
- NN prediction errors: mass 1-5%, stress 1-4%, stiffness 5-15%
- Found minimum mass designs at boundary (angle~30deg, thick~30mm)
- 100x speedup vs pure FEA exploration

Protocol Operating System:
- Add .claude/skills/ with Bootstrap, Cheatsheet, Context Loader
- Add docs/protocols/ with operations (OP_01-06) and system (SYS_10-14)
- Update SYS_14_NEURAL_ACCELERATION.md with MLP Turbo Mode docs

NX Automation:
- Add optimization_engine/hooks/ for NX CAD/CAE automation
- Add study_wizard.py for guided study creation
- Fix FEM mesh update: load idealized part before UpdateFemodel()

New Study:
- bracket_pareto_3obj: 3-objective Pareto (mass, stress, stiffness)
- 167 FEA trials + 5000 NN trials completed
- Demonstrates full hybrid workflow

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 20:01:59 -05:00
parent 0cb2808c44
commit 602560c46a
70 changed files with 31018 additions and 289 deletions
--- a/docs/protocols/operations/OP_02_RUN_OPTIMIZATION.md
+++ b/docs/protocols/operations/OP_02_RUN_OPTIMIZATION.md
@@ -0,0 +1,297 @@
+# OP_02: Run Optimization
+
+<!--
+PROTOCOL: Run Optimization
+LAYER: Operations
+VERSION: 1.0
+STATUS: Active
+LAST_UPDATED: 2025-12-05
+PRIVILEGE: user
+LOAD_WITH: []
+-->
+
+## Overview
+
+This protocol covers executing optimization runs, including pre-flight validation, execution modes, monitoring, and handling common issues.
+
+---
+
+## When to Use
+
+| Trigger | Action |
+|---------|--------|
+| "start", "run", "execute" | Follow this protocol |
+| "begin optimization" | Follow this protocol |
+| Study setup complete | Execute this protocol |
+
+---
+
+## Quick Reference
+
+**Start Command**:
+```bash
+conda activate atomizer
+cd studies/{study_name}
+python run_optimization.py
+```
+
+**Common Options**:
+| Flag | Purpose |
+|------|---------|
+| `--n-trials 100` | Override trial count |
+| `--resume` | Continue interrupted run |
+| `--test` | Run single trial for validation |
+| `--export-training` | Export data for neural training |
+
+---
+
+## Pre-Flight Checklist
+
+Before running, verify:
+
+- [ ] **Environment**: `conda activate atomizer`
+- [ ] **Config exists**: `1_setup/optimization_config.json`
+- [ ] **Script exists**: `run_optimization.py`
+- [ ] **Model files**: NX files in `1_setup/model/`
+- [ ] **No conflicts**: No other optimization running on same study
+- [ ] **Disk space**: Sufficient for results
+
+**Quick Validation**:
+```bash
+python run_optimization.py --test
+```
+This runs a single trial to verify setup.
+
+---
+
+## Execution Modes
+
+### 1. Standard Run
+
+```bash
+python run_optimization.py
+```
+Uses settings from `optimization_config.json`.
+
+### 2. Override Trials
+
+```bash
+python run_optimization.py --n-trials 100
+```
+Overrides the trial count set in `optimization_config.json`.
+
+### 3. Resume Interrupted
+
+```bash
+python run_optimization.py --resume
+```
+Continues from last completed trial.
+
+### 4. Neural Acceleration
+
+```bash
+python run_optimization.py --neural
+```
+Requires a trained surrogate model.
+
+### 5. Export Training Data
+
+```bash
+python run_optimization.py --export-training
+```
+Saves BDF/OP2 for neural network training.
+
+---
+
+## Monitoring Progress
+
+### Option 1: Console Output
+The script prints progress:
+```
+Trial 15/50 complete. Best: 0.234 kg
+Trial 16/50 complete. Best: 0.234 kg
+```
+
+### Option 2: Dashboard
+See [SYS_13_DASHBOARD_TRACKING](../system/SYS_13_DASHBOARD_TRACKING.md).
+
+```bash
+# Terminal 1: dashboard backend (run from the repository root)
+cd atomizer-dashboard/backend && python -m uvicorn api.main:app --port 8000
+# Terminal 2: dashboard frontend (run from the repository root)
+cd atomizer-dashboard/frontend && npm run dev
+
+# Then open http://localhost:3000 in a browser
+```
+
+### Option 3: Query Database
+
+```bash
+python -c "
+import optuna
+study = optuna.load_study(study_name='study_name', storage='sqlite:///2_results/study.db')
+print(f'Trials: {len(study.trials)}')
+print(f'Best value: {study.best_value}')  # single-objective only; use study.best_trials for multi-objective studies
+"
+```
+
+### Option 4: Optuna Dashboard
+
+```bash
+optuna-dashboard sqlite:///2_results/study.db
+# Open http://localhost:8080
+```
+
+---
+
+## During Execution
+
+### What Happens Per Trial
+
+1. **Sample parameters**: Optuna suggests design variable values
+2. **Update model**: NX expressions updated via journal
+3. **Solve**: NX Nastran runs FEA simulation
+4. **Extract results**: Extractors read OP2 file
+5. **Evaluate**: Check constraints, compute objectives
+6. **Record**: Trial stored in Optuna database
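
A minimal sketch of the six per-trial steps above, in plain Python. Every function name and the toy mass/stress formulas are hypothetical stand-ins: the real workflow uses Optuna for sampling, an NX journal and NX Nastran for the solve, and OP2 extractors for the results. The 250 MPa limit is borrowed from the sample pruning warning in this protocol and is an assumption here.

```python
import random

STRESS_LIMIT_MPA = 250.0  # assumed constraint, taken from the sample warning in this protocol


def sample_parameters(rng):
    """Step 1: stand-in for the optimizer's suggestion (Optuna in the real workflow)."""
    return {"thickness_mm": rng.uniform(5.0, 30.0)}


def solve(params):
    """Steps 2-4: stand-in for the NX update, Nastran solve, and OP2 extraction.

    The formulas are toy relations, not real FEA results.
    """
    t = params["thickness_mm"]
    return {"mass_kg": 0.02 * t, "stress_mpa": 3000.0 / t}


def evaluate(results):
    """Step 5: check the constraint and return (feasible, objective)."""
    feasible = results["stress_mpa"] <= STRESS_LIMIT_MPA
    return feasible, results["mass_kg"]


def run_trials(n_trials, seed=0):
    """Run the sample -> solve -> evaluate -> record loop n_trials times."""
    rng = random.Random(seed)
    history = []  # Step 6: stand-in for the Optuna database
    for number in range(1, n_trials + 1):
        params = sample_parameters(rng)
        results = solve(params)
        feasible, objective = evaluate(results)
        history.append(
            {"number": number, "params": params, "objective": objective, "feasible": feasible}
        )
    return history


if __name__ == "__main__":
    trials = run_trials(20)
    feasible_trials = [t for t in trials if t["feasible"]]
    if feasible_trials:
        best = min(feasible_trials, key=lambda t: t["objective"])
        print(f"Best feasible mass: {best['objective']:.3f} kg (trial {best['number']})")
```

Infeasible trials stay in the history rather than being dropped, which mirrors the idea that constraint violations are recorded instead of silently discarded.
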
+
+### Normal Output
+
+```
+[2025-12-05 10:15:30] Trial 1 started
+[2025-12-05 10:17:45] NX solve complete (135.2s)
+[2025-12-05 10:17:46] Extraction complete
+[2025-12-05 10:17:46] Trial 1 complete: mass=0.342 kg, stress=198.5 MPa
+
+[2025-12-05 10:17:47] Trial 2 started
+...
+```
+
+### Expected Timing
+
+| Operation | Typical Time |
+|-----------|--------------|
+| NX solve | 30s - 30min |
+| Extraction | <1s |
+| Per trial total | 1-30 min |
+| 50 trials | ~1-25 hours |
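
The 50-trial row follows from the per-trial row by simple multiplication, assuming trials run one after another with no parallelism. A small sketch (the function name is illustrative, not part of the project):

```python
def total_runtime_hours(n_trials, minutes_per_trial):
    """Wall-clock hours for n_trials run sequentially."""
    return n_trials * minutes_per_trial / 60.0


if __name__ == "__main__":
    fastest = total_runtime_hours(50, 1)   # 1 min per trial
    slowest = total_runtime_hours(50, 30)  # 30 min per trial
    print(f"50 trials: {fastest:.1f} to {slowest:.1f} hours")
```

With 1 to 30 minutes per trial, 50 trials take roughly 0.8 to 25 hours.
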
+
+---
+
+## Handling Issues
+
+### Trial Failed / Pruned
+
+```
+[WARNING] Trial 12 pruned: Stress constraint violated (312.5 MPa > 250 MPa)
+```
+**Normal behavior** - optimizer learns from failures.
+
+### NX Session Timeout
+
+```
+[ERROR] NX session timeout after 600s
+```
+**Solution**: Increase timeout in config or simplify model.
+
+### Expression Not Found
+
+```
+[ERROR] Expression 'thicknes' not found in model
+```
+**Solution**: Check spelling, verify expression exists in NX.
+
+### OP2 File Missing
+
+```
+[ERROR] OP2 file not found: model.op2
+```
+**Solution**: Check NX solve completed. Review NX log file.
+
+### Database Locked
+
+```
+[ERROR] Database is locked
+```
+**Solution**: Another process is using the database. Wait for it to finish, or kill the stale process.
+
+---
+
+## Stopping and Resuming
+
+### Graceful Stop
+Press `Ctrl+C` once. The current trial completes, then the script exits.
+
+### Force Stop
+Press `Ctrl+C` twice. Immediate exit (may lose current trial).
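
One common way to get this "once is graceful, twice is immediate" behavior is a SIGINT handler that sets a flag on the first press and raises on the second. The sketch below shows that pattern under stated assumptions: `StopController` and `run` are hypothetical names, and the source does not confirm that `run_optimization.py` is implemented this way.

```python
import signal
import time


class StopController:
    """Counts Ctrl+C presses: the first requests a graceful stop, the second forces exit."""

    def __init__(self):
        self.stop_requested = False

    def handle_sigint(self, signum, frame):
        if self.stop_requested:
            # Second Ctrl+C: abandon the in-flight trial immediately.
            raise KeyboardInterrupt
        # First Ctrl+C: let the current trial finish, then stop.
        self.stop_requested = True


def run(n_trials, run_trial, controller):
    """Run trials until all are done or a graceful stop has been requested."""
    completed = 0
    for number in range(1, n_trials + 1):
        run_trial(number)
        completed += 1
        if controller.stop_requested:
            break  # checked only between trials, so the current trial always finishes
    return completed


if __name__ == "__main__":
    controller = StopController()
    signal.signal(signal.SIGINT, controller.handle_sigint)
    # time.sleep stands in for one FEA trial.
    done = run(5, lambda number: time.sleep(0.2), controller)
    print(f"Completed {done} trial(s)")
```

Because the flag is checked only after a trial returns, the trial that was running at the first `Ctrl+C` is still recorded, which is what makes a later `--resume` start from a clean boundary.
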
+
+### Resume
+```bash
+python run_optimization.py --resume
+```
+Continues from last completed trial. Same study database used.
+
+---
+
+## Post-Run Actions
+
+After optimization completes:
+
+1. **Check results**:
+   ```bash
+   python -c "import optuna; s=optuna.load_study(...); print(s.best_params)"
+   ```
+
+2. **View in dashboard**: `http://localhost:3000`
+
+3. **Generate report**: See [OP_04_ANALYZE_RESULTS](./OP_04_ANALYZE_RESULTS.md)
+
+4. **Update STUDY_REPORT.md**: Fill in results template
+
+---
+
+## Protocol Integration
+
+### With Protocol 10 (IMSO)
+If enabled, optimization runs in two phases:
+1. Characterization (10-30 trials)
+2. Optimization (remaining trials)
+
+Dashboard shows phase transitions.
+
+### With Protocol 11 (Multi-Objective)
+If 2+ objectives, uses NSGA-II. Returns Pareto front, not single best.
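
To make "Pareto front, not single best" concrete, here is a brute-force non-dominated filter in plain Python. It is not NSGA-II itself, only an illustration of what the returned set means; the design values are made up, and both objectives are assumed to be minimized. In Optuna, the equivalent set is exposed as `study.best_trials`.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the non-dominated points, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical (mass in kg, stress in MPa) pairs, both minimized.
    designs = [(0.30, 240.0), (0.34, 200.0), (0.40, 210.0), (0.45, 180.0)]
    for mass, stress in pareto_front(designs):
        print(f"mass={mass:.2f} kg, stress={stress:.1f} MPa")
```

In this example `(0.40, 210.0)` is dropped because `(0.34, 200.0)` is both lighter and less stressed; the other three designs each win on at least one objective, so none of them is a single "best".
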
+
+### With Protocol 13 (Dashboard)
+Writes `optimizer_state.json` every trial for real-time updates.
+
+### With Protocol 14 (Neural)
+If `--neural` flag, uses trained surrogate for fast evaluation.
+
+---
+
+## Troubleshooting
+
+| Symptom | Cause | Solution |
+|---------|-------|----------|
+| "ModuleNotFoundError" | Wrong environment | `conda activate atomizer` |
+| All trials pruned | Constraints too tight | Relax constraints |
+| Very slow | Model too complex | Simplify mesh, increase timeout |
+| No improvement | Wrong sampler | Try different algorithm |
+| "NX license error" | License unavailable | Check NX license server |
+
+---
+
+## Cross-References
+
+- **Preceded By**: [OP_01_CREATE_STUDY](./OP_01_CREATE_STUDY.md)
+- **Followed By**: [OP_03_MONITOR_PROGRESS](./OP_03_MONITOR_PROGRESS.md), [OP_04_ANALYZE_RESULTS](./OP_04_ANALYZE_RESULTS.md)
+- **Integrates With**: [SYS_10_IMSO](../system/SYS_10_IMSO.md), [SYS_13_DASHBOARD_TRACKING](../system/SYS_13_DASHBOARD_TRACKING.md)
+
+---
+
+## Version History
+
+| Version | Date | Changes |
+|---------|------|---------|
+| 1.0 | 2025-12-05 | Initial release |