feat: Add MLP surrogate with Turbo Mode for 100x faster optimization

Neural Acceleration (MLP Surrogate):
- Add run_nn_optimization.py with hybrid FEA/NN workflow
- MLP architecture: 4-layer (64->128->128->64) with BatchNorm/Dropout
- Three workflow modes:
  - --all: Sequential export->train->optimize->validate
  - --hybrid-loop: Iterative Train->NN->Validate->Retrain cycle
  - --turbo: Aggressive single-best validation (RECOMMENDED)
- Turbo mode: 5000 NN trials + 50 FEA validations in ~12 minutes
- Separate nn_study.db to avoid overloading dashboard

Performance Results (bracket_pareto_3obj study):
- NN prediction errors: mass 1-5%, stress 1-4%, stiffness 5-15%
- Found minimum mass designs at boundary (angle~30deg, thick~30mm)
- 100x speedup vs pure FEA exploration

Protocol Operating System:
- Add .claude/skills/ with Bootstrap, Cheatsheet, Context Loader
- Add docs/protocols/ with operations (OP_01-06) and system (SYS_10-14)
- Update SYS_14_NEURAL_ACCELERATION.md with MLP Turbo Mode docs

NX Automation:
- Add optimization_engine/hooks/ for NX CAD/CAE automation
- Add study_wizard.py for guided study creation
- Fix FEM mesh update: load idealized part before UpdateFemodel()

New Study:
- bracket_pareto_3obj: 3-objective Pareto (mass, stress, stiffness)
- 167 FEA trials + 5000 NN trials completed
- Demonstrates full hybrid workflow

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 20:01:59 -05:00
parent 0cb2808c44
commit 602560c46a
70 changed files with 31018 additions and 289 deletions
--- a/docs/protocols/system/SYS_14_NEURAL_ACCELERATION.md
+++ b/docs/protocols/system/SYS_14_NEURAL_ACCELERATION.md
@@ -0,0 +1,564 @@
+# SYS_14: Neural Network Acceleration
+
+<!--
+PROTOCOL: Neural Network Surrogate Acceleration
+LAYER: System
+VERSION: 2.0
+STATUS: Active
+LAST_UPDATED: 2025-12-06
+PRIVILEGE: user
+LOAD_WITH: [SYS_10_IMSO, SYS_11_MULTI_OBJECTIVE]
+-->
+
+## Overview
+
+Atomizer provides **neural network surrogate acceleration** enabling 100-1000x faster optimization by replacing expensive FEA evaluations with instant neural predictions.
+
+**Two approaches available**:
+1. **MLP Surrogate** (Simple, integrated) - 4-layer MLP trained on FEA data, runs within study
+2. **GNN Field Predictor** (Advanced) - Graph neural network for full field predictions
+
+**Key Innovation**: Train once on FEA data, then explore 5,000-50,000+ designs in the time it takes to run 50 FEA trials.
+
+---
+
+## When to Use
+
+| Trigger | Action |
+|---------|--------|
+| >50 trials needed | Consider neural acceleration |
+| "neural", "surrogate", "NN" mentioned | Load this protocol |
+| "fast", "acceleration", "speed" needed | Suggest neural acceleration |
+| Training data available | Enable surrogate |
+
+---
+
+## Quick Reference
+
+**Performance Comparison**:
+
+| Metric | Traditional FEA | Neural Network | Improvement |
+|--------|-----------------|----------------|-------------|
+| Time per evaluation | 10-30 minutes | 4.5 milliseconds | **~130,000-400,000x** |
+| Trials per hour | 2-6 | 800,000+ | **~130,000-400,000x** |
+| Design exploration | ~50 designs | ~50,000 designs | **1000x** |
+
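As a sanity check, the raw per-evaluation ratio can be worked out from the timings quoted in the table (a 10-30 minute FEA solve against 4.5 ms of inference). The sketch below is plain arithmetic; the helper names are illustrative and not part of Atomizer. The end-to-end speedup of a full optimization run will be lower than this raw ratio, since FEA validation and retraining still take time.

```python
# Raw per-evaluation speedup implied by the timings quoted in the table.
# Helper names are illustrative; they are not part of Atomizer.

NN_INFERENCE_MS = 4.5  # neural inference time quoted above


def speedup(fea_minutes: float, nn_ms: float = NN_INFERENCE_MS) -> float:
    """Ratio of one FEA solve to one neural prediction."""
    fea_ms = fea_minutes * 60 * 1000
    return fea_ms / nn_ms


def evaluations_per_hour(ms_per_evaluation: float) -> float:
    """How many evaluations fit in one hour at the given cost each."""
    return 3_600_000 / ms_per_evaluation


if __name__ == "__main__":
    print(f"10 min FEA: {speedup(10):,.0f}x")
    print(f"30 min FEA: {speedup(30):,.0f}x")
    print(f"NN evaluations per hour: {evaluations_per_hour(NN_INFERENCE_MS):,.0f}")
```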
+**Model Types**:
+
+| Model | Purpose | Use When |
+|-------|---------|----------|
+| **MLP Surrogate** | Direct objective prediction | Simple studies, quick setup |
+| Field Predictor GNN | Full displacement/stress fields | Need field visualization |
+| Parametric Predictor GNN | Direct objective prediction | Complex geometry, need accuracy |
+| Ensemble | Uncertainty quantification | Need confidence bounds |
+
+---
+
+## MLP Surrogate (Recommended for Quick Start)
+
+### Overview
+
+The MLP (Multi-Layer Perceptron) surrogate is a simple but effective neural network that predicts objectives directly from design parameters. It's integrated into the study workflow via `run_nn_optimization.py`.
+
+### Architecture
+
+```
+Input Layer (N design variables)
+    ↓
+Linear(N, 64) + ReLU + BatchNorm + Dropout(0.1)
+    ↓
+Linear(64, 128) + ReLU + BatchNorm + Dropout(0.1)
+    ↓
+Linear(128, 128) + ReLU + BatchNorm + Dropout(0.1)
+    ↓
+Linear(128, 64) + ReLU + BatchNorm + Dropout(0.1)
+    ↓
+Linear(64, M objectives)
+```
+
+**Parameters**: ~34,000 trainable
+
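The parameter figure can be checked by hand from the layer layout above. The sketch below counts weights and biases per layer, assuming each BatchNorm layer has a learnable scale and shift per feature (the PyTorch `BatchNorm1d` default) and that ReLU and Dropout add nothing. The function name and the example sizes (2 design variables, 3 objectives) are illustrative assumptions.

```python
# Trainable-parameter count for the MLP layout shown above.
# Assumes each BatchNorm layer carries a learnable scale and shift;
# ReLU and Dropout contribute no parameters.

HIDDEN_SIZES = (64, 128, 128, 64)


def count_mlp_params(n_inputs: int, n_outputs: int) -> int:
    total = 0
    width_in = n_inputs
    for width_out in HIDDEN_SIZES:
        total += width_in * width_out + width_out  # Linear weights + biases
        total += 2 * width_out                     # BatchNorm scale + shift
        width_in = width_out
    total += width_in * n_outputs + n_outputs      # output Linear layer
    return total


if __name__ == "__main__":
    # Hypothetical study: 2 design variables, 3 objectives.
    print(count_mlp_params(n_inputs=2, n_outputs=3))
```

Under these assumptions the total is `64*N + 65*M + 33,920`, so the hidden layers dominate and the count stays close to 34,000 for small N and M.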
+### Workflow Modes
+
+#### 1. Standard Hybrid Mode (`--all`)
+
+Run all phases sequentially:
+```bash
+python run_nn_optimization.py --all
+```
+
+Phases:
+1. **Export**: Extract training data from existing FEA trials
+2. **Train**: Train MLP surrogate (300 epochs default)
+3. **NN-Optimize**: Run 1000 NN trials with NSGA-II
+4. **Validate**: Validate top 10 candidates with FEA
+
+#### 2. Hybrid Loop Mode (`--hybrid-loop`)
+
+Iterative refinement:
+```bash
+python run_nn_optimization.py --hybrid-loop --iterations 5 --nn-trials 500
+```
+
+Each iteration:
+1. Train/retrain surrogate from current FEA data
+2. Run NN optimization
+3. Validate top candidates with FEA
+4. Add validated results to training set
+5. Repeat until convergence (maximum NN-vs-FEA prediction error < 5%)
+
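The convergence test in the last step can be sketched as a comparison between NN predictions and FEA results for the same validated designs. This is a minimal illustration, not the check in `run_nn_optimization.py`: the function names and the data layout (one dict of objective name to value per design) are assumptions.

```python
# Convergence test for the hybrid loop: worst relative error between the
# surrogate's predictions and FEA results for the same validated designs.
# Names and data layout are assumptions for illustration.

def max_relative_error(nn_predictions, fea_results):
    worst = 0.0
    for predicted, actual in zip(nn_predictions, fea_results):
        for name, fea_value in actual.items():
            if fea_value == 0:
                continue  # relative error is undefined for a zero reference
            error = abs(predicted[name] - fea_value) / abs(fea_value)
            worst = max(worst, error)
    return worst


def has_converged(nn_predictions, fea_results, tolerance=0.05):
    """True when every validated objective is within the tolerance."""
    return max_relative_error(nn_predictions, fea_results) < tolerance
```

Using the maximum rather than the mean error is the stricter choice: a single badly predicted objective keeps the loop running.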
+#### 3. Turbo Mode (`--turbo`) ⚡ RECOMMENDED
+
+Aggressive single-best validation:
+```bash
+python run_nn_optimization.py --turbo --nn-trials 5000 --batch-size 100 --retrain-every 10
+```
+
+Strategy:
+- Run NN in small batches (100 trials)
+- Validate ONLY the single best candidate with FEA
+- Add to training data immediately
+- Retrain surrogate every N FEA validations
+- Repeat until total NN budget exhausted
+
+**Example**: 5,000 NN trials with batch=100 → 50 FEA validations in ~12 minutes
+
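The budget arithmetic behind that example can be sketched as a simple loop. The function below only counts events and runs no NN or FEA; it assumes that a final partial batch still gets one FEA validation and that a retrain is triggered after every `retrain_every`-th validation. The function name is hypothetical.

```python
# Counts FEA validations and retrains for a Turbo run.
# Assumptions: one FEA validation per NN batch (including a final partial
# batch), and one retrain after every `retrain_every` validations.

def turbo_schedule(nn_trials: int, batch_size: int, retrain_every: int):
    """Return (fea_validations, retrains) for the given NN budget."""
    fea_validations = 0
    retrains = 0
    trials_done = 0
    while trials_done < nn_trials:
        trials_done += min(batch_size, nn_trials - trials_done)  # one NN batch
        fea_validations += 1  # validate only the single best candidate
        if fea_validations % retrain_every == 0:
            retrains += 1     # retrain on the enlarged training set
    return fea_validations, retrains


if __name__ == "__main__":
    print(turbo_schedule(nn_trials=5000, batch_size=100, retrain_every=10))
```

With the flags from the command above this gives 50 FEA validations and, under the stated assumption, 5 retrains.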
+### Configuration
+
+```json
+{
+  "neural_acceleration": {
+    "enabled": true,
+    "min_training_points": 50,
+    "auto_train": true,
+    "epochs": 300,
+    "validation_split": 0.2,
+    "nn_trials": 1000,
+    "validate_top_n": 10,
+    "model_file": "surrogate_best.pt",
+    "separate_nn_database": true
+  }
+}
+```
+
+**Important**: `separate_nn_database: true` stores NN trials in `nn_study.db` instead of `study.db` to avoid overloading the dashboard with thousands of NN-only results.
+
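As an illustration of how these settings might be consumed, the sketch below parses the block and derives two decisions: whether enough FEA trials exist to use the surrogate, and which database receives NN trials. The function name and the exact interpretation of `min_training_points` are assumptions; the actual logic in `run_nn_optimization.py` may differ.

```python
import json

# Subset of the configuration shown above.
EXAMPLE_CONFIG = """
{
  "neural_acceleration": {
    "enabled": true,
    "min_training_points": 50,
    "separate_nn_database": true
  }
}
"""


def plan_run(config_text: str, n_fea_trials: int) -> dict:
    """Hypothetical helper: decide surrogate use and NN database file."""
    settings = json.loads(config_text)["neural_acceleration"]
    # Assumed meaning: the surrogate is used only once enough FEA data exists.
    enough_data = n_fea_trials >= settings["min_training_points"]
    return {
        "use_surrogate": settings["enabled"] and enough_data,
        "nn_database": "nn_study.db" if settings["separate_nn_database"] else "study.db",
    }


if __name__ == "__main__":
    print(plan_run(EXAMPLE_CONFIG, n_fea_trials=167))
```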
+### Typical Accuracy
+
+| Objective | Expected Error |
+|-----------|----------------|
+| Mass | 1-5% |
+| Stress | 1-4% |
+| Stiffness | 5-15% |
+
+### Output Files
+
+```
+2_results/
+├── study.db                    # Main FEA + validated results (dashboard)
+├── nn_study.db                 # NN-only results (not in dashboard)
+├── surrogate_best.pt           # Trained model weights
+├── training_data.json          # Normalized training data
+├── nn_optimization_state.json  # NN optimization state
+├── nn_pareto_front.json        # NN-predicted Pareto front
+├── validation_report.json      # FEA validation results
+└── turbo_report.json           # Turbo mode results (if used)
+```
+
+---
+
+## GNN Field Predictor (Advanced)
+
+### Core Components
+
+| Component | File | Purpose |
+|-----------|------|---------|
+| BDF/OP2 Parser | `neural_field_parser.py` | Convert NX files to neural format |
+| Data Validator | `validate_parsed_data.py` | Physics and quality checks |
+| Field Predictor | `field_predictor.py` | GNN for full field prediction |
+| Parametric Predictor | `parametric_predictor.py` | GNN for direct objectives |
+| Physics Loss | `physics_losses.py` | Physics-informed training |
+| Neural Surrogate | `neural_surrogate.py` | Integration with Atomizer |
+| Neural Runner | `runner_with_neural.py` | Optimization with NN acceleration |
+
+### Workflow Diagram
+
+```
+Traditional:
+Design → NX Model → Mesh → Solve (30 min) → Results → Objective
+
+Neural (after training):
+Design → Neural Network (4.5 ms) → Results → Objective
+```
+
+---
+
+## Neural Model Types
+
+### 1. Field Predictor GNN
+
+**Use Case**: When you need full field predictions (stress distribution, deformation shape).
+
+```
+Input Features (12D per node):
+├── Node coordinates (x, y, z)
+├── Material properties (E, nu, rho)
+├── Boundary conditions (fixed/free per DOF)
+└── Load information (force magnitude, direction)
+
+GNN Layers (6 message passing):
+├── MeshGraphConv (custom for FEA topology)
+├── Layer normalization
+├── ReLU activation
+└── Dropout (0.1)
+
+Output (per node):
+├── Displacement (6 DOF: Tx, Ty, Tz, Rx, Ry, Rz)
+└── Von Mises stress (1 value)
+```
+
+**Parameters**: ~718,221 trainable
+
+### 2. Parametric Predictor GNN (Recommended)
+
+**Use Case**: Direct optimization objective prediction (fastest option).
+
+```
+Design Parameters (ND) → Design Encoder (MLP) → GNN Backbone → Scalar Heads
+
+Output (objectives):
+├── mass (grams)
+├── frequency (Hz)
+├── max_displacement (mm)
+└── max_stress (MPa)
+```
+
+**Parameters**: ~500,000 trainable
+
+### 3. Ensemble Models
+
+**Use Case**: Uncertainty quantification.
+
+1. Train 3-5 models with different random seeds
+2. At inference, run all models
+3. Use mean for prediction, std for uncertainty
+4. High uncertainty → trigger FEA validation
+
+---
+
+## Training Pipeline
+
+### Step 1: Collect Training Data
+
+Enable export in workflow config:
+
+```json
+{
+  "training_data_export": {
+    "enabled": true,
+    "export_dir": "atomizer_field_training_data/my_study"
+  }
+}
+```
+
+Output structure:
+```
+atomizer_field_training_data/my_study/
+├── trial_0001/
+│   ├── input/model.bdf       # Nastran input
+│   ├── output/model.op2      # Binary results
+│   └── metadata.json         # Design params + objectives
+├── trial_0002/
+│   └── ...
+└── study_summary.json
+```
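A quick way to inspect an export is to walk the trial directories and load each `metadata.json`. The sketch below relies only on the directory layout shown above; the function name is hypothetical and the keys inside `metadata.json` are not assumed.

```python
import json
from pathlib import Path


def load_trial_metadata(study_dir):
    """Return {trial directory name: parsed metadata.json} for each trial."""
    records = {}
    for trial_dir in sorted(Path(study_dir).glob("trial_*")):
        metadata_file = trial_dir / "metadata.json"
        if metadata_file.is_file():  # skip trials without metadata
            records[trial_dir.name] = json.loads(metadata_file.read_text())
    return records


if __name__ == "__main__":
    trials = load_trial_metadata("atomizer_field_training_data/my_study")
    print(f"{len(trials)} trials with metadata")
```

Counting the loaded records is a simple check that the export has reached the sample count needed before training.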
+
+**Recommended**: 100-500 FEA samples for good generalization.
+
+### Step 2: Parse to Neural Format
+
+```bash
+cd atomizer-field
+python batch_parser.py ../atomizer_field_training_data/my_study
+```
+
+Creates HDF5 + JSON files per trial.
+
+### Step 3: Train Model
+
+**Parametric Predictor** (recommended):
+```bash
+python train_parametric.py \
+  --train_dir ../training_data/parsed \
+  --val_dir ../validation_data/parsed \
+  --epochs 200 \
+  --hidden_channels 128 \
+  --num_layers 4
+```
+
+**Field Predictor**:
+```bash
+python train.py \
+  --train_dir ../training_data/parsed \
+  --epochs 200 \
+  --model FieldPredictorGNN \
+  --hidden_channels 128 \
+  --num_layers 6 \
+  --physics_loss_weight 0.3
+```
+
+### Step 4: Validate
+
+```bash
+python validate.py --checkpoint runs/my_model/checkpoint_best.pt
+```
+
+Expected output:
+```
+Validation Results:
+├── Mean Absolute Error: 2.3% (mass), 1.8% (frequency)
+├── R² Score: 0.987
+├── Inference Time: 4.5ms ± 0.8ms
+└── Physics Violations: 0.2%
+```
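For reference, the two accuracy metrics in that report can be computed as follows. This is a sketch under assumptions: the source does not show how `validate.py` defines them, so the percentage error is taken here to be a mean absolute percentage error, and R² is the usual coefficient of determination. Function names are illustrative.

```python
# Accuracy metrics comparable to those in the report above.
# Assumption: "Mean Absolute Error" in percent means the mean absolute
# percentage error relative to the FEA value.

def mean_absolute_percentage_error(actual, predicted):
    errors = [abs(p - a) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    fea_mass = [100.0, 200.0, 300.0]  # made-up values for illustration
    nn_mass = [110.0, 200.0, 300.0]
    print(mean_absolute_percentage_error(fea_mass, nn_mass))
    print(r2_score(fea_mass, nn_mass))
```

Both helpers expect non-zero, non-constant FEA reference values; a zero reference or a constant target makes the corresponding metric undefined.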
+
+### Step 5: Deploy
+
+```json
+{
+  "neural_surrogate": {
+    "enabled": true,
+    "model_checkpoint": "atomizer-field/runs/my_model/checkpoint_best.pt",
+    "confidence_threshold": 0.85
+  }
+}
+```
+
+---
+
+## Configuration
+
+### Full Neural Configuration Example
+
+```json
+{
+  "study_name": "bracket_neural_optimization",
+
+  "surrogate_settings": {
+    "enabled": true,
+    "model_type": "parametric_gnn",
+    "model_path": "models/bracket_surrogate.pt",
+    "confidence_threshold": 0.85,
+    "validation_frequency": 10,
+    "fallback_to_fea": true
+  },
+
+  "training_data_export": {
+    "enabled": true,
+    "export_dir": "atomizer_field_training_data/bracket_study",
+    "export_bdf": true,
+    "export_op2": true,
+    "export_fields": ["displacement", "stress"]
+  },
+
+  "neural_optimization": {
+    "initial_fea_trials": 50,
+    "neural_trials": 5000,
+    "retraining_interval": 500,
+    "uncertainty_threshold": 0.15
+  }
+}
+```
+
+### Configuration Parameters
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `enabled` | bool | false | Enable neural surrogate |
+| `model_type` | string | "parametric_gnn" | Model architecture |
+| `model_path` | string | - | Path to trained model |
+| `confidence_threshold` | float | 0.85 | Min confidence for predictions |
+| `validation_frequency` | int | 10 | FEA validation every N trials |
+| `fallback_to_fea` | bool | true | Use FEA when uncertain |
+
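The defaults in the table can be mirrored in a small settings object that rejects unknown keys and out-of-range values. This is a sketch only: the class name, the validation rules (confidence between 0 and 1, frequency of at least 1), and the use of `None` for the missing `model_path` default are assumptions, not Atomizer's actual loader.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class SurrogateSettings:
    """Hypothetical container mirroring the parameter table above."""
    enabled: bool = False
    model_type: str = "parametric_gnn"
    model_path: Optional[str] = None  # no default listed in the table
    confidence_threshold: float = 0.85
    validation_frequency: int = 10
    fallback_to_fea: bool = True

    def __post_init__(self):
        # Assumed valid ranges; the source does not state them.
        if not 0.0 <= self.confidence_threshold <= 1.0:
            raise ValueError("confidence_threshold must be between 0 and 1")
        if self.validation_frequency < 1:
            raise ValueError("validation_frequency must be at least 1")

    @classmethod
    def from_dict(cls, raw: dict) -> "SurrogateSettings":
        known = {f.name for f in fields(cls)}
        unknown = set(raw) - known
        if unknown:
            raise KeyError(f"unknown surrogate settings: {sorted(unknown)}")
        return cls(**raw)
```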
+---
+
+## Hybrid FEA/Neural Workflow
+
+### Phase 1: FEA Exploration (50-100 trials)
+- Run standard FEA optimization
+- Export training data automatically
+- Build landscape understanding
+
+### Phase 2: Neural Training
+- Parse collected data
+- Train parametric predictor
+- Validate accuracy
+
+### Phase 3: Neural Acceleration (1000s of trials)
+- Use neural network for rapid exploration
+- Periodic FEA validation
+- Retrain if distribution shifts
+
+### Phase 4: FEA Refinement (10-20 trials)
+- Validate top candidates with FEA
+- Ensure results are physically accurate
+- Generate final Pareto front
+
+---
+
+## Adaptive Iteration Loop
+
+For complex optimizations, use iterative refinement:
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│  Iteration 1:                                                    │
+│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐       │
+│  │ Initial FEA  │ -> │ Train NN     │ -> │ NN Search    │       │
+│  │ (50-100)     │    │ Surrogate    │    │ (1000 trials)│       │
+│  └──────────────┘    └──────────────┘    └──────────────┘       │
+│                                                 │                │
+│  Iteration 2+:                                  ▼                │
+│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐       │
+│  │ Validate Top │ -> │ Retrain NN   │ -> │ NN Search    │       │
+│  │ NN with FEA  │    │ with new data│    │ (1000 trials)│       │
+│  └──────────────┘    └──────────────┘    └──────────────┘       │
+└─────────────────────────────────────────────────────────────────┘
+```
+
+### Adaptive Configuration
+
+```json
+{
+  "adaptive_settings": {
+    "enabled": true,
+    "initial_fea_trials": 50,
+    "nn_trials_per_iteration": 1000,
+    "fea_validation_per_iteration": 5,
+    "max_iterations": 10,
+    "convergence_threshold": 0.01,
+    "retrain_epochs": 100
+  }
+}
+```
+
+### Convergence Criteria
+
+Stop when any of the following holds:
+- No improvement for 2-3 consecutive iterations
+- The FEA budget limit is reached
+- Objective improvement falls below the 1% threshold (`convergence_threshold: 0.01`)
+
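These criteria can be combined into a single stopping check. The sketch below assumes a single minimized objective tracked as "best value after each iteration" and folds the first and third criteria into one rule: stop when the relative improvement stays below the threshold for `patience` consecutive iterations. The function name and the `patience` parameter are assumptions.

```python
# Stopping rule for the adaptive loop (single minimized objective).
# Assumption: "no improvement" means relative improvement below `threshold`
# for `patience` consecutive iterations.

def should_stop(best_per_iteration, fea_used, fea_budget,
                threshold=0.01, patience=2):
    if fea_used >= fea_budget:
        return True  # FEA budget exhausted
    if len(best_per_iteration) <= patience:
        return False  # not enough history to judge stagnation
    recent = best_per_iteration[-(patience + 1):]
    for previous, current in zip(recent, recent[1:]):
        scale = abs(previous) or 1.0  # avoid dividing by zero
        if (previous - current) / scale >= threshold:
            return False  # still improving by at least the threshold
    return True
```

A multi-objective study would need a different progress measure, such as the change in Pareto-front hypervolume, in place of the single best value.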
+### Output Files
+
+```
+studies/my_study/3_results/
+├── adaptive_state.json      # Current iteration state
+├── surrogate_model.pt       # Trained neural network
+└── training_history.json    # NN training metrics
+```
+
+---
+
+## Loss Functions
+
+### Data Loss (MSE)
+Standard prediction error:
+```python
+data_loss = MSE(predicted, target)
+```
+
+### Physics Loss
+Enforce physical constraints:
+```python
+physics_loss = (
+    equilibrium_loss +      # Force balance
+    boundary_loss +         # BC satisfaction
+    compatibility_loss      # Strain compatibility
+)
+```
+
+### Combined Training
+```python
+total_loss = data_loss + 0.3 * physics_loss
+```
+
+Physics loss weight typically 0.1-0.5.
+
+---
+
+## Uncertainty Quantification
+
+### Ensemble Method
+```python
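+import numpy as np
+
+# Run every ensemble member on the same design x (each is assumed to
+# return a NumPy array); stacking gives shape (n_models, n_objectives)
+predictions = np.stack([model_i(x) for model_i in ensemble])
+
+# Per-objective statistics across models (axis 0)
+mean_prediction = predictions.mean(axis=0)
+uncertainty = predictions.std(axis=0)
+
+# Decision: fall back to FEA if any objective is too uncertain
+if np.any(uncertainty > threshold):
+    result = run_fea(x)
+else:
+    result = mean_prediction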
+```
+
+### Confidence Thresholds
+
+| Uncertainty | Action |
+|-------------|--------|
+| < 5% | Use neural prediction |
+| 5-15% | Use neural, flag for validation |
+| > 15% | Fall back to FEA |
+
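The table maps directly onto a small decision function. In the sketch below, uncertainty is assumed to be the ensemble standard deviation divided by the absolute mean prediction for one objective. The function names and the returned action labels are illustrative.

```python
from statistics import mean, pstdev


def relative_uncertainty(predictions):
    """Ensemble spread as a fraction of the mean prediction (one objective)."""
    center = mean(predictions)
    if center == 0:
        raise ValueError("relative uncertainty is undefined for a zero mean")
    return pstdev(predictions) / abs(center)


def action_for_uncertainty(uncertainty: float) -> str:
    """Map a relative uncertainty to the action listed in the table."""
    if uncertainty < 0.05:
        return "use_neural"
    if uncertainty <= 0.15:
        return "use_neural_and_flag"
    return "fallback_to_fea"


if __name__ == "__main__":
    ensemble_mass = [98.0, 101.0, 103.0]  # made-up predictions from 3 models
    print(action_for_uncertainty(relative_uncertainty(ensemble_mass)))
```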
+---
+
+## Troubleshooting
+
+| Symptom | Cause | Solution |
+|---------|-------|----------|
+| High prediction error | Insufficient training data | Collect more FEA samples |
+| Out-of-distribution warnings | Design outside training range | Retrain with expanded range |
+| Slow inference | Large mesh | Use parametric predictor instead |
+| Physics violations | Low physics loss weight | Increase `physics_loss_weight` |
+
+---
+
+## Cross-References
+
+- **Depends On**: [SYS_10_IMSO](./SYS_10_IMSO.md) for optimization framework
+- **Used By**: [OP_02_RUN_OPTIMIZATION](../operations/OP_02_RUN_OPTIMIZATION.md), [OP_05_EXPORT_TRAINING_DATA](../operations/OP_05_EXPORT_TRAINING_DATA.md)
+- **See Also**: [modules/neural-acceleration.md](../../.claude/skills/modules/neural-acceleration.md)
+
+---
+
+## Implementation Files
+
+```
+atomizer-field/
+├── neural_field_parser.py       # BDF/OP2 parsing
+├── field_predictor.py           # Field GNN
+├── parametric_predictor.py      # Parametric GNN
+├── train.py                     # Field training
+├── train_parametric.py          # Parametric training
+├── validate.py                  # Model validation
+├── physics_losses.py            # Physics-informed loss
+└── batch_parser.py              # Batch data conversion
+
+optimization_engine/
+├── neural_surrogate.py          # Atomizer integration
+└── runner_with_neural.py        # Neural runner
+```
+
+---
+
+## Version History
+
+| Version | Date | Changes |
+|---------|------|---------|
+| 2.0 | 2025-12-06 | Added MLP Surrogate with Turbo Mode |
+| 1.0 | 2025-12-05 | Initial consolidation from neural docs |