# SYS_14: Neural Network Acceleration

<!--
PROTOCOL: Neural Network Surrogate Acceleration
LAYER: System
VERSION: 2.0
STATUS: Active
LAST_UPDATED: 2025-12-06
PRIVILEGE: user
LOAD_WITH: [SYS_10_IMSO, SYS_11_MULTI_OBJECTIVE]
-->

## Overview

Atomizer provides **neural network surrogate acceleration**, which speeds up optimization by 100-1000x by replacing expensive FEA evaluations with neural predictions that take milliseconds.

**Two approaches available**:
1. **MLP Surrogate** (Simple, integrated) - MLP with four hidden layers, trained on FEA data, runs within the study
2. **GNN Field Predictor** (Advanced) - Graph neural network for full field predictions

**Key Innovation**: Train once on FEA data, then explore 5,000-50,000+ designs in the time it takes to run 50 FEA trials.

---

## When to Use

| Trigger | Action |
|---------|--------|
| >50 trials needed | Consider neural acceleration |
| "neural", "surrogate", "NN" mentioned | Load this protocol |
| "fast", "acceleration", "speed" needed | Suggest neural acceleration |
| Training data available | Enable surrogate |

---

## Quick Reference

**Performance Comparison**:

| Metric | Traditional FEA | Neural Network | Improvement |
|--------|-----------------|----------------|-------------|
| Time per evaluation | 10-30 minutes | 4.5 milliseconds | **~130,000-400,000x** |
| Trials per hour | 2-6 | 800,000+ | **~130,000-400,000x** |
| Design exploration | ~50 designs | ~50,000 designs | **1000x** |
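
The per-evaluation ratio follows directly from the two timings in the table. A quick arithmetic check, assuming the 4.5 ms inference time and the 10-30 minute FEA solve quoted above (the function names are illustrative and not part of Atomizer):

```python
def speedup(fea_minutes: float, nn_ms: float = 4.5) -> float:
    """Ratio of one FEA solve time to one neural inference time."""
    return fea_minutes * 60_000 / nn_ms  # convert minutes to milliseconds


def nn_trials_per_hour(nn_ms: float = 4.5) -> float:
    """Neural evaluations that fit in one hour, ignoring optimizer overhead."""
    return 3_600_000 / nn_ms


print(f"{speedup(10):,.0f}x to {speedup(30):,.0f}x per evaluation")
print(f"{nn_trials_per_hour():,.0f} NN trials per hour")
```

The end-to-end gain of a full study is likely smaller than the per-evaluation ratio, since FEA validation runs and retraining still take time.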

**Model Types**:

| Model | Purpose | Use When |
|-------|---------|----------|
| **MLP Surrogate** | Direct objective prediction | Simple studies, quick setup |
| Field Predictor GNN | Full displacement/stress fields | Need field visualization |
| Parametric Predictor GNN | Direct objective prediction | Complex geometry, need accuracy |
| Ensemble | Uncertainty quantification | Need confidence bounds |

---

## MLP Surrogate (Recommended for Quick Start)

### Overview

The MLP (Multi-Layer Perceptron) surrogate is a simple but effective neural network that predicts objectives directly from design parameters. It's integrated into the study workflow via `run_nn_optimization.py`.

### Architecture

```
Input Layer (N design variables)
    ↓
Linear(N, 64) + ReLU + BatchNorm + Dropout(0.1)
    ↓
Linear(64, 128) + ReLU + BatchNorm + Dropout(0.1)
    ↓
Linear(128, 128) + ReLU + BatchNorm + Dropout(0.1)
    ↓
Linear(128, 64) + ReLU + BatchNorm + Dropout(0.1)
    ↓
Linear(64, M objectives)
```

**Parameters**: ~34,000 trainable
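
The parameter count can be reproduced from the layer widths above. This is a minimal sketch in plain Python, assuming each BatchNorm layer carries a learnable scale and shift per feature; the values `N = 4` and `M = 2` are hypothetical study sizes chosen only for illustration:

```python
def mlp_param_count(n_inputs, n_outputs, hidden=(64, 128, 128, 64)):
    """Trainable parameters of the Linear + BatchNorm stack shown above."""
    total = 0
    prev = n_inputs
    for width in hidden:
        total += prev * width + width  # Linear: weights + biases
        total += 2 * width             # BatchNorm: scale + shift (assumed learnable)
        prev = width
    total += prev * n_outputs + n_outputs  # final Linear(64, M)
    return total


# Hypothetical study: 4 design variables, 2 objectives
print(mlp_param_count(n_inputs=4, n_outputs=2))  # 34306
```

The hidden layers alone contribute 33,920 parameters, so the total stays close to 34,000 for any small number of design variables and objectives.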

### Workflow Modes

#### 1. Standard Hybrid Mode (`--all`)

Run all phases sequentially:
```bash
python run_nn_optimization.py --all
```

Phases:
1. **Export**: Extract training data from existing FEA trials
2. **Train**: Train MLP surrogate (300 epochs default)
3. **NN-Optimize**: Run 1000 NN trials with NSGA-II
4. **Validate**: Validate top 10 candidates with FEA

#### 2. Hybrid Loop Mode (`--hybrid-loop`)

Iterative refinement:
```bash
python run_nn_optimization.py --hybrid-loop --iterations 5 --nn-trials 500
```

Each iteration:
1. Train/retrain surrogate from current FEA data
2. Run NN optimization
3. Validate top candidates with FEA
4. Add validated results to training set
5. Repeat until convergence (max error < 5%)
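
The convergence test in step 5 can be sketched as follows. This assumes "max error" means the largest relative difference between NN predictions and FEA results over the validated candidates; the function names are illustrative and not taken from `run_nn_optimization.py`:

```python
def max_relative_error(nn_predictions, fea_results):
    """Largest |NN - FEA| / |FEA| over the validated candidates."""
    return max(abs(p - a) / abs(a) for p, a in zip(nn_predictions, fea_results))


def has_converged(nn_predictions, fea_results, tolerance=0.05):
    """True when every validated candidate is within the tolerance (5% by default)."""
    return max_relative_error(nn_predictions, fea_results) < tolerance


# Two validated candidates: NN predicted 102 and 48, FEA returned 100 and 50
print(has_converged([102.0, 48.0], [100.0, 50.0]))  # True (worst error is 4%)
```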

#### 3. Turbo Mode (`--turbo`) ⚡ RECOMMENDED

Aggressive single-best validation:
```bash
python run_nn_optimization.py --turbo --nn-trials 5000 --batch-size 100 --retrain-every 10
```

Strategy:
- Run NN in small batches (100 trials)
- Validate ONLY the single best candidate with FEA
- Add to training data immediately
- Retrain surrogate every N FEA validations
- Repeat until total NN budget exhausted

**Example**: 5,000 NN trials with batch=100 → 50 FEA validations in ~12 minutes
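
The FEA budget implied by a Turbo run can be worked out from its flags. The sketch below only counts validations and retrains; it assumes one FEA validation per completed NN batch and a retrain after every `retrain_every` validations, which may differ in detail from the actual script:

```python
def turbo_schedule(nn_trials, batch_size, retrain_every):
    """Count FEA validations and surrogate retrains for an NN trial budget."""
    fea_validations = 0
    retrains = 0
    for _ in range(nn_trials // batch_size):  # one pass per NN batch
        fea_validations += 1                   # validate only the single best candidate
        if fea_validations % retrain_every == 0:
            retrains += 1                      # refresh the surrogate with new FEA points
    return fea_validations, retrains


print(turbo_schedule(nn_trials=5000, batch_size=100, retrain_every=10))  # (50, 5)
```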

### Configuration

```json
{
  "neural_acceleration": {
    "enabled": true,
    "min_training_points": 50,
    "auto_train": true,
    "epochs": 300,
    "validation_split": 0.2,
    "nn_trials": 1000,
    "validate_top_n": 10,
    "model_file": "surrogate_best.pt",
    "separate_nn_database": true
  }
}
```

**Important**: `separate_nn_database: true` stores NN trials in `nn_study.db` instead of `study.db` to avoid overloading the dashboard with thousands of NN-only results.
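
As an illustration of how these settings might be consumed, the sketch below parses a trimmed copy of the block above and derives two decisions from it. The helper names and the exact gating logic are assumptions, not Atomizer's implementation:

```python
import json

CONFIG_TEXT = """
{
  "neural_acceleration": {
    "enabled": true,
    "min_training_points": 50,
    "auto_train": true,
    "separate_nn_database": true
  }
}
"""


def ready_to_train(config, n_fea_trials):
    """Assumed rule: train only when enabled, auto_train is set, and enough FEA data exists."""
    na = config["neural_acceleration"]
    return bool(na["enabled"] and na["auto_train"]
                and n_fea_trials >= na["min_training_points"])


def nn_database_name(config):
    """NN-only trials go to nn_study.db when separate_nn_database is true."""
    na = config["neural_acceleration"]
    return "nn_study.db" if na.get("separate_nn_database", False) else "study.db"


config = json.loads(CONFIG_TEXT)
print(ready_to_train(config, n_fea_trials=42), nn_database_name(config))
```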

### Typical Accuracy

| Objective | Expected Error |
|-----------|----------------|
| Mass | 1-5% |
| Stress | 1-4% |
| Stiffness | 5-15% |

### Output Files

```
2_results/
├── study.db                    # Main FEA + validated results (dashboard)
├── nn_study.db                 # NN-only results (not in dashboard)
├── surrogate_best.pt           # Trained model weights
├── training_data.json          # Normalized training data
├── nn_optimization_state.json  # NN optimization state
├── nn_pareto_front.json        # NN-predicted Pareto front
├── validation_report.json      # FEA validation results
└── turbo_report.json           # Turbo mode results (if used)
```

---

## GNN Field Predictor (Advanced)

### Core Components

| Component | File | Purpose |
|-----------|------|---------|
| BDF/OP2 Parser | `neural_field_parser.py` | Convert NX files to neural format |
| Data Validator | `validate_parsed_data.py` | Physics and quality checks |
| Field Predictor | `field_predictor.py` | GNN for full field prediction |
| Parametric Predictor | `parametric_predictor.py` | GNN for direct objectives |
| Physics Loss | `physics_losses.py` | Physics-informed training |
| Neural Surrogate | `neural_surrogate.py` | Integration with Atomizer |
| Neural Runner | `runner_with_neural.py` | Optimization with NN acceleration |

### Workflow Diagram

```
Traditional:
Design → NX Model → Mesh → Solve (30 min) → Results → Objective

Neural (after training):
Design → Neural Network (4.5 ms) → Results → Objective
```

---

## Neural Model Types

### 1. Field Predictor GNN

**Use Case**: When you need full field predictions (stress distribution, deformation shape).

```
Input Features (12D per node):
├── Node coordinates (x, y, z)
├── Material properties (E, nu, rho)
├── Boundary conditions (fixed/free per DOF)
└── Load information (force magnitude, direction)

GNN Layers (6 message passing):
├── MeshGraphConv (custom for FEA topology)
├── Layer normalization
├── ReLU activation
└── Dropout (0.1)

Output (per node):
├── Displacement (6 DOF: Tx, Ty, Tz, Rx, Ry, Rz)
└── Von Mises stress (1 value)
```

**Parameters**: 718,221 trainable

### 2. Parametric Predictor GNN (Recommended)

**Use Case**: Direct optimization objective prediction (fastest option).

```
Design Parameters (ND) → Design Encoder (MLP) → GNN Backbone → Scalar Heads

Output (objectives):
├── mass (grams)
├── frequency (Hz)
├── max_displacement (mm)
└── max_stress (MPa)
```

**Parameters**: ~500,000 trainable

### 3. Ensemble Models

**Use Case**: Uncertainty quantification.

1. Train 3-5 models with different random seeds
2. At inference, run all models
3. Use mean for prediction, std for uncertainty
4. High uncertainty → trigger FEA validation

---

## Training Pipeline

### Step 1: Collect Training Data

Enable export in workflow config:

```json
{
  "training_data_export": {
    "enabled": true,
    "export_dir": "atomizer_field_training_data/my_study"
  }
}
```

Output structure:
```
atomizer_field_training_data/my_study/
├── trial_0001/
│   ├── input/model.bdf       # Nastran input
│   ├── output/model.op2      # Binary results
│   └── metadata.json         # Design params + objectives
├── trial_0002/
│   └── ...
└── study_summary.json
```

**Recommended**: 100-500 FEA samples for good generalization.
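
Before parsing, it can help to confirm how many exported trials are complete. A minimal sketch that walks the directory layout shown above and loads each `metadata.json`; the function name is illustrative, and the keys inside the metadata file are not assumed:

```python
import json
from pathlib import Path


def load_trial_metadata(export_dir):
    """Return (trial_name, metadata_dict) pairs for every exported trial."""
    trials = []
    for trial_dir in sorted(Path(export_dir).glob("trial_*")):
        meta_file = trial_dir / "metadata.json"
        if not meta_file.is_file():
            continue  # skip trials whose export did not finish
        with meta_file.open(encoding="utf-8") as fh:
            trials.append((trial_dir.name, json.load(fh)))
    return trials


if __name__ == "__main__":
    trials = load_trial_metadata("atomizer_field_training_data/my_study")
    print(f"{len(trials)} complete trials found (100-500 recommended)")
```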

### Step 2: Parse to Neural Format

```bash
cd atomizer-field
python batch_parser.py ../atomizer_field_training_data/my_study
```

This creates HDF5 and JSON files for each trial.

### Step 3: Train Model

**Parametric Predictor** (recommended):
```bash
python train_parametric.py \
  --train_dir ../training_data/parsed \
  --val_dir ../validation_data/parsed \
  --epochs 200 \
  --hidden_channels 128 \
  --num_layers 4
```

**Field Predictor**:
```bash
python train.py \
  --train_dir ../training_data/parsed \
  --epochs 200 \
  --model FieldPredictorGNN \
  --hidden_channels 128 \
  --num_layers 6 \
  --physics_loss_weight 0.3
```

### Step 4: Validate

```bash
python validate.py --checkpoint runs/my_model/checkpoint_best.pt
```

Expected output:
```
Validation Results:
├── Mean Absolute Error: 2.3% (mass), 1.8% (frequency)
├── R² Score: 0.987
├── Inference Time: 4.5ms ± 0.8ms
└── Physics Violations: 0.2%
```
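
For reference, the two headline metrics can be computed as below. This is a sketch under the assumption that the reported error is a mean absolute percentage error relative to the FEA value and that R² is the usual coefficient of determination; it is not the code in `validate.py`:

```python
def mean_abs_pct_error(predicted, actual):
    """Mean of |predicted - actual| / |actual|, in percent."""
    errors = [abs(p - a) / abs(a) for p, a in zip(predicted, actual)]
    return 100.0 * sum(errors) / len(errors)


def r2_score(predicted, actual):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for p, a in zip(predicted, actual))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


fea = [1.0, 2.0, 3.0, 4.0]   # reference values from FEA
nn = [1.1, 1.9, 3.2, 3.8]    # surrogate predictions
print(f"MAPE = {mean_abs_pct_error(nn, fea):.2f}%, R2 = {r2_score(nn, fea):.3f}")
```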

### Step 5: Deploy

```json
{
  "neural_surrogate": {
    "enabled": true,
    "model_checkpoint": "atomizer-field/runs/my_model/checkpoint_best.pt",
    "confidence_threshold": 0.85
  }
}
```

---

## Configuration

### Full Neural Configuration Example

```json
{
  "study_name": "bracket_neural_optimization",

  "surrogate_settings": {
    "enabled": true,
    "model_type": "parametric_gnn",
    "model_path": "models/bracket_surrogate.pt",
    "confidence_threshold": 0.85,
    "validation_frequency": 10,
    "fallback_to_fea": true
  },

  "training_data_export": {
    "enabled": true,
    "export_dir": "atomizer_field_training_data/bracket_study",
    "export_bdf": true,
    "export_op2": true,
    "export_fields": ["displacement", "stress"]
  },

  "neural_optimization": {
    "initial_fea_trials": 50,
    "neural_trials": 5000,
    "retraining_interval": 500,
    "uncertainty_threshold": 0.15
  }
}
```

### Configuration Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `enabled` | bool | false | Enable neural surrogate |
| `model_type` | string | "parametric_gnn" | Model architecture |
| `model_path` | string | - | Path to trained model |
| `confidence_threshold` | float | 0.85 | Min confidence for predictions |
| `validation_frequency` | int | 10 | FEA validation every N trials |
| `fallback_to_fea` | bool | true | Use FEA when uncertain |
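
One plausible reading of how these parameters interact is sketched below. The decision order, the 1-based trial index, and the function name are assumptions made for illustration; the actual logic lives in `neural_surrogate.py` and may differ:

```python
def choose_evaluator(trial_index, confidence, settings):
    """Return "fea" or "neural" for one trial (assumed semantics)."""
    if not settings.get("enabled", False):
        return "fea"  # surrogate disabled: always run FEA
    frequency = settings.get("validation_frequency", 10)
    if frequency > 0 and trial_index > 0 and trial_index % frequency == 0:
        return "fea"  # periodic FEA validation every N trials
    threshold = settings.get("confidence_threshold", 0.85)
    if confidence < threshold and settings.get("fallback_to_fea", True):
        return "fea"  # prediction too uncertain
    return "neural"


settings = {"enabled": True, "confidence_threshold": 0.85,
            "validation_frequency": 10, "fallback_to_fea": True}
print(choose_evaluator(trial_index=3, confidence=0.92, settings=settings))  # neural
```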

---

## Hybrid FEA/Neural Workflow

### Phase 1: FEA Exploration (50-100 trials)
- Run standard FEA optimization
- Export training data automatically
- Build landscape understanding

### Phase 2: Neural Training
- Parse collected data
- Train parametric predictor
- Validate accuracy

### Phase 3: Neural Acceleration (1000s of trials)
- Use neural network for rapid exploration
- Periodic FEA validation
- Retrain if distribution shifts

### Phase 4: FEA Refinement (10-20 trials)
- Validate top candidates with FEA
- Ensure results are physically accurate
- Generate final Pareto front

---

## Adaptive Iteration Loop

For complex optimizations, use iterative refinement:

```
┌─────────────────────────────────────────────────────────────────┐
│  Iteration 1:                                                    │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐       │
│  │ Initial FEA  │ -> │ Train NN     │ -> │ NN Search    │       │
│  │ (50-100)     │    │ Surrogate    │    │ (1000 trials)│       │
│  └──────────────┘    └──────────────┘    └──────────────┘       │
│                                                 │                │
│  Iteration 2+:                                  ▼                │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐       │
│  │ Validate Top │ -> │ Retrain NN   │ -> │ NN Search    │       │
│  │ NN with FEA  │    │ with new data│    │ (1000 trials)│       │
│  └──────────────┘    └──────────────┘    └──────────────┘       │
└─────────────────────────────────────────────────────────────────┘
```

### Adaptive Configuration

```json
{
  "adaptive_settings": {
    "enabled": true,
    "initial_fea_trials": 50,
    "nn_trials_per_iteration": 1000,
    "fea_validation_per_iteration": 5,
    "max_iterations": 10,
    "convergence_threshold": 0.01,
    "retrain_epochs": 100
  }
}
```
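
The upper bound on evaluations implied by this configuration can be computed directly. The sketch assumes every iteration runs its full NN search and FEA validation, and that the loop reaches `max_iterations` without converging early:

```python
def adaptive_budget(settings):
    """Return (total FEA runs, total NN trials) if all iterations are used."""
    iterations = settings["max_iterations"]
    fea_runs = (settings["initial_fea_trials"]
                + iterations * settings["fea_validation_per_iteration"])
    nn_trials = iterations * settings["nn_trials_per_iteration"]
    return fea_runs, nn_trials


adaptive_settings = {
    "initial_fea_trials": 50,
    "nn_trials_per_iteration": 1000,
    "fea_validation_per_iteration": 5,
    "max_iterations": 10,
}
print(adaptive_budget(adaptive_settings))  # (100, 10000)
```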

### Convergence Criteria

Stop when any of the following holds:
- No improvement for 2-3 consecutive iterations
- The FEA budget limit is reached
- Objective improvement falls below the 1% threshold (`convergence_threshold: 0.01`)
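
These criteria can be combined into a single check. The sketch below assumes a minimization objective with nonzero values, tracks the best FEA-validated value after each iteration, and treats "no improvement" as a special case of improvement below the threshold; the names are illustrative:

```python
def stop_reason(best_per_iteration, fea_used, fea_budget, threshold=0.01, patience=2):
    """Return why the loop should stop, or None to keep iterating."""
    if fea_used >= fea_budget:
        return "fea_budget_reached"
    if len(best_per_iteration) <= patience:
        return None  # not enough history yet
    recent = best_per_iteration[-(patience + 1):]
    # Relative improvement between consecutive iterations (positive = better)
    gains = [(prev - cur) / abs(prev) for prev, cur in zip(recent, recent[1:])]
    if all(gain < threshold for gain in gains):
        return "improvement_below_threshold"
    return None


print(stop_reason([10.0, 9.0, 8.99, 8.985], fea_used=65, fea_budget=100))
```

With `patience=2`, the loop stops only after two consecutive iterations each improve the best objective by less than 1%.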

### Output Files

```
studies/my_study/3_results/
├── adaptive_state.json      # Current iteration state
├── surrogate_model.pt       # Trained neural network
└── training_history.json    # NN training metrics
```

---

## Loss Functions

### Data Loss (MSE)
Standard prediction error:
```python
import torch.nn.functional as F

data_loss = F.mse_loss(predicted, target)  # mean squared error over all outputs
```

### Physics Loss
Enforce physical constraints:
```python
physics_loss = (
    equilibrium_loss +      # Force balance
    boundary_loss +         # BC satisfaction
    compatibility_loss      # Strain compatibility
)
```

### Combined Training
```python
total_loss = data_loss + 0.3 * physics_loss
```

The physics loss weight is typically 0.1-0.5 and is set with `--physics_loss_weight` in `train.py`.
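
To make the weighting concrete, here is a small pure-Python sketch of the combined loss with a single physics term. The boundary term shown (mean squared predicted displacement at fixed DOFs) is an assumed, simplified stand-in for what `physics_losses.py` computes:

```python
def mse(predicted, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def boundary_loss(displacements, fixed_mask):
    """Penalize nonzero predicted displacement at fixed DOFs (assumed form)."""
    fixed = [d for d, is_fixed in zip(displacements, fixed_mask) if is_fixed]
    return sum(d * d for d in fixed) / len(fixed) if fixed else 0.0


def total_loss(data_loss, physics_loss, physics_loss_weight=0.3):
    """Weighted sum used during training."""
    return data_loss + physics_loss_weight * physics_loss


data = mse([1.0, 2.0], [1.0, 4.0])
physics = boundary_loss([0.1, 0.5, -0.1], [True, False, True])
print(total_loss(data, physics))
```

Raising the weight pushes the network toward physically consistent fields at some cost in raw data fit.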

---

## Uncertainty Quantification

### Ensemble Method
```python
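import numpy as np

# Run all N ensemble members on the same design vector x
predictions = np.asarray([model_i(x) for model_i in ensemble])

# Statistics across ensemble members (axis 0): one value per objective
mean_prediction = predictions.mean(axis=0)
uncertainty = predictions.std(axis=0)

# Decision: fall back to FEA if any objective is too uncertain
if np.any(uncertainty > threshold):
    result = run_fea(x)
else:
    result = mean_prediction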
```

### Confidence Thresholds

| Uncertainty | Action |
|-------------|--------|
| < 5% | Use neural prediction |
| 5-15% | Use neural, flag for validation |
| > 15% | Fall back to FEA |
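
The table can be applied to an ensemble's output as follows. This sketch assumes the percentages refer to relative uncertainty, that is, the ensemble standard deviation divided by the absolute mean for a single objective; the action labels are illustrative:

```python
from statistics import mean, pstdev


def decide(ensemble_predictions):
    """Map the relative spread of ensemble predictions to an action."""
    mu = mean(ensemble_predictions)
    relative_uncertainty = pstdev(ensemble_predictions) / abs(mu)
    if relative_uncertainty < 0.05:
        return "use_neural"
    if relative_uncertainty <= 0.15:
        return "use_neural_flag_for_validation"
    return "fallback_to_fea"


# Three ensemble members predicting the same objective
print(decide([100.0, 110.0, 90.0]))  # spread of about 8%, so flagged for validation
```

`pstdev` is the population standard deviation, which matches the default behaviour of `np.std`.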

---

## Troubleshooting

| Symptom | Cause | Solution |
|---------|-------|----------|
| High prediction error | Insufficient training data | Collect more FEA samples |
| Out-of-distribution warnings | Design outside training range | Retrain with expanded range |
| Slow inference | Large mesh | Use parametric predictor instead |
| Physics violations | Low physics loss weight | Increase `physics_loss_weight` |

---

## Cross-References

- **Depends On**: [SYS_10_IMSO](./SYS_10_IMSO.md) for optimization framework
- **Used By**: [OP_02_RUN_OPTIMIZATION](../operations/OP_02_RUN_OPTIMIZATION.md), [OP_05_EXPORT_TRAINING_DATA](../operations/OP_05_EXPORT_TRAINING_DATA.md)
- **See Also**: [modules/neural-acceleration.md](../../.claude/skills/modules/neural-acceleration.md)

---

## Implementation Files

```
atomizer-field/
├── neural_field_parser.py       # BDF/OP2 parsing
├── field_predictor.py           # Field GNN
├── parametric_predictor.py      # Parametric GNN
├── train.py                     # Field training
├── train_parametric.py          # Parametric training
├── validate.py                  # Model validation
├── physics_losses.py            # Physics-informed loss
└── batch_parser.py              # Batch data conversion

optimization_engine/
├── neural_surrogate.py          # Atomizer integration
└── runner_with_neural.py        # Neural runner
```

---

## Version History

| Version | Date | Changes |
|---------|------|---------|
| 2.0 | 2025-12-06 | Added MLP Surrogate with Turbo Mode |
| 1.0 | 2025-12-05 | Initial consolidation from neural docs |