Atomizer/docs/protocols/operations/OP_05_EXPORT_TRAINING_DATA.md
Antoine 602560c46a feat: Add MLP surrogate with Turbo Mode for 100x faster optimization
2025-12-06 20:01:59 -05:00


OP_05: Export Training Data

Overview

This protocol covers exporting FEA simulation data for training neural network surrogates. Proper data export enables Protocol 14 (Neural Acceleration).


When to Use

| Trigger | Action |
|---------|--------|
| "export training data" | Follow this protocol |
| "neural network data" | Follow this protocol |
| Planning >50 trials | Consider export for acceleration |
| Want to train surrogate | Follow this protocol |

Quick Reference

Export Command:

python run_optimization.py --export-training

Output Structure:

atomizer_field_training_data/{study_name}/
├── trial_0001/
│   ├── input/model.bdf
│   ├── output/model.op2
│   └── metadata.json
├── trial_0002/
│   └── ...
└── study_summary.json
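
The layout above can be expressed as a small path helper. This is only a sketch: the function name `trial_paths` is hypothetical, and the four-digit zero padding is inferred from the `trial_0001` example rather than confirmed by the exporter.

```python
from pathlib import Path


def trial_paths(export_root: str, study_name: str, trial_number: int) -> dict:
    """Return the expected file locations for one exported trial.

    Assumes four-digit zero padding, as suggested by trial_0001 above.
    """
    trial_dir = Path(export_root) / study_name / f"trial_{trial_number:04d}"
    return {
        "bdf": trial_dir / "input" / "model.bdf",
        "op2": trial_dir / "output" / "model.op2",
        "metadata": trial_dir / "metadata.json",
    }


paths = trial_paths("atomizer_field_training_data", "my_study", 1)
print(paths["bdf"].as_posix())
```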

Recommended Data Volume:

| Complexity | Training Samples | Validation Samples |
|------------|------------------|--------------------|
| Simple (2-3 params) | 50-100 | 20-30 |
| Medium (4-6 params) | 100-200 | 30-50 |
| Complex (7+ params) | 200-500 | 50-100 |
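
When planning a study, the table can be turned into a lookup. The sketch below is illustrative only: `recommended_samples` is a hypothetical helper, and treating a single-parameter study as "Simple" is an assumption, since the table starts at two parameters.

```python
def recommended_samples(n_params: int) -> dict:
    """Map a design-parameter count to the sample ranges in the table above."""
    if n_params < 1:
        raise ValueError("n_params must be at least 1")
    if n_params <= 3:
        # Assumption: 1 parameter is grouped with the "Simple" row
        return {"complexity": "Simple", "training": (50, 100), "validation": (20, 30)}
    if n_params <= 6:
        return {"complexity": "Medium", "training": (100, 200), "validation": (30, 50)}
    return {"complexity": "Complex", "training": (200, 500), "validation": (50, 100)}


plan = recommended_samples(4)
print(plan["complexity"], plan["training"], plan["validation"])
```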

Configuration

Enable Export in Config

Add to optimization_config.json:

{
  "training_data_export": {
    "enabled": true,
    "export_dir": "atomizer_field_training_data/my_study",
    "export_bdf": true,
    "export_op2": true,
    "export_fields": ["displacement", "stress"],
    "include_failed": false
  }
}

Configuration Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| enabled | bool | false | Enable export |
| export_dir | string | - | Output directory |
| export_bdf | bool | true | Save Nastran input |
| export_op2 | bool | true | Save binary results |
| export_fields | list | all | Which result fields |
| include_failed | bool | false | Include failed trials |
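
To see how the defaults combine with a partial config, the following sketch merges a `training_data_export` section with the defaults from the options table. It is not the optimizer's actual loader: `resolve_export_config` is a hypothetical name, representing the `all` default as the string `"all"` is an assumption, and so is requiring `export_dir` whenever export is enabled.

```python
import json

# Defaults copied from the options table; export_dir has no default there.
DEFAULTS = {
    "enabled": False,
    "export_bdf": True,
    "export_op2": True,
    "export_fields": "all",  # assumption: "all" stands for every result field
    "include_failed": False,
}


def resolve_export_config(config: dict) -> dict:
    """Merge the training_data_export section with the documented defaults."""
    section = config.get("training_data_export", {})
    unknown = set(section) - set(DEFAULTS) - {"export_dir"}
    if unknown:
        raise KeyError(f"unknown export options: {sorted(unknown)}")
    resolved = {**DEFAULTS, **section}
    # Assumption: an enabled export needs an explicit output directory
    if resolved["enabled"] and not resolved.get("export_dir"):
        raise ValueError("export_dir must be set when export is enabled")
    return resolved


example = json.loads(
    '{"training_data_export": {"enabled": true, '
    '"export_dir": "atomizer_field_training_data/my_study"}}'
)
print(resolve_export_config(example))
```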

Export Workflow

Step 1: Run with Export Enabled

conda activate atomizer
cd studies/my_study
python run_optimization.py --export-training

Alternatively, run a standard optimization with training_data_export.enabled set to true in optimization_config.json.

Step 2: Verify Export

ls atomizer_field_training_data/my_study/
# Should see trial_0001/, trial_0002/, etc.

# Check a trial
ls atomizer_field_training_data/my_study/trial_0001/
# input/model.bdf
# output/model.op2
# metadata.json

Step 3: Check Metadata

cat atomizer_field_training_data/my_study/trial_0001/metadata.json
{
  "trial_number": 1,
  "design_parameters": {
    "thickness": 5.2,
    "width": 30.0
  },
  "objectives": {
    "mass": 0.234,
    "max_stress": 198.5
  },
  "constraints_satisfied": true,
  "simulation_time": 145.2
}
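
A surrogate needs each trial reduced to an input vector and a target vector. The sketch below shows one way to flatten the metadata shown above, assuming the key names from that example; `to_training_row` is a hypothetical helper, not part of the exporter.

```python
import json

# Same content as the example metadata.json above
METADATA_JSON = """
{
  "trial_number": 1,
  "design_parameters": {"thickness": 5.2, "width": 30.0},
  "objectives": {"mass": 0.234, "max_stress": 198.5},
  "constraints_satisfied": true,
  "simulation_time": 145.2
}
"""


def to_training_row(meta: dict, param_names: list, objective_names: list):
    """Return (inputs, targets) as lists ordered by the given names."""
    inputs = [float(meta["design_parameters"][name]) for name in param_names]
    targets = [float(meta["objectives"][name]) for name in objective_names]
    return inputs, targets


meta = json.loads(METADATA_JSON)
x, y = to_training_row(meta, ["thickness", "width"], ["mass", "max_stress"])
print(x, y)
```

Passing the name lists explicitly keeps the column order fixed across trials, which matters once rows are stacked into arrays.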

Step 4: Check Study Summary

cat atomizer_field_training_data/my_study/study_summary.json
{
  "study_name": "my_study",
  "total_trials": 50,
  "successful_exports": 47,
  "failed_exports": 3,
  "design_parameters": ["thickness", "width"],
  "objectives": ["mass", "max_stress"],
  "export_timestamp": "2025-12-05T15:30:00"
}
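
The counts in the summary can be sanity-checked before training. This sketch assumes that `successful_exports` and `failed_exports` always add up to `total_trials`, as they do in the example above; `export_success_rate` is a hypothetical helper.

```python
def export_success_rate(summary: dict) -> float:
    """Return the fraction of trials exported successfully.

    Raises ValueError if the counts do not add up (an assumed invariant).
    """
    total = summary["total_trials"]
    ok = summary["successful_exports"]
    failed = summary["failed_exports"]
    if ok + failed != total:
        raise ValueError(f"{ok} successful + {failed} failed != {total} total")
    return ok / total if total else 0.0


summary = {"total_trials": 50, "successful_exports": 47, "failed_exports": 3}
print(f"Export success rate: {export_success_rate(summary):.0%}")
```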

Data Quality Checks

Verify Sample Count

from pathlib import Path
import json

export_dir = Path("atomizer_field_training_data/my_study")
trials = list(export_dir.glob("trial_*"))
print(f"Exported trials: {len(trials)}")

# Check for missing files
for trial_dir in trials:
    bdf = trial_dir / "input" / "model.bdf"
    op2 = trial_dir / "output" / "model.op2"
    meta = trial_dir / "metadata.json"

    if not all([bdf.exists(), op2.exists(), meta.exists()]):
        print(f"Missing files in {trial_dir}")

Check Parameter Coverage

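import json
from pathlib import Path

import pandas as pd

export_dir = Path("atomizer_field_training_data/my_study")

# Load the design parameters from every trial's metadata
params = []
for trial_dir in sorted(export_dir.glob("trial_*")):
    with open(trial_dir / "metadata.json") as f:
        meta = json.load(f)
    params.append(meta["design_parameters"])

# Summarize coverage (count, mean, quartiles) per parameter
df = pd.DataFrame(params)
print(df.describe())

# Print the sampled range of each parameter to compare against its bounds
for col in df.columns:
    print(f"{col}: min={df[col].min():.2f}, max={df[col].max():.2f}")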
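
Minimum and maximum values alone do not reveal holes inside the sampled range. One possible gap metric, sketched below with the standard library only, is the widest spacing between neighbouring sampled values expressed as a fraction of the sampled range. The function name and the idea of comparing it to a threshold are assumptions, not part of Atomizer.

```python
def largest_gap_fraction(values) -> float:
    """Widest spacing between neighbouring sorted values, relative to the sampled range."""
    ordered = sorted(values)
    if len(ordered) < 2:
        return 0.0
    span = ordered[-1] - ordered[0]
    if span == 0:
        return 0.0
    widest = max(b - a for a, b in zip(ordered, ordered[1:]))
    return widest / span


# Hypothetical thickness samples with a hole between 3.0 and 8.0
thickness = [2.0, 2.5, 3.0, 8.0, 10.0]
print(f"largest gap: {largest_gap_fraction(thickness):.3f} of the sampled range")
```

For evenly spread samples the value approaches 1 / (n - 1), so a result far above that suggests a region the surrogate will have to interpolate across.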

Space-Filling Sampling

For best neural network training, use space-filling designs:

Latin Hypercube Sampling

from scipy.stats import qmc

# Generate space-filling samples
n_samples = 100
n_params = 4

sampler = qmc.LatinHypercube(d=n_params)
samples = sampler.random(n=n_samples)

# Scale to parameter bounds
lower = [2.0, 20.0, 5.0, 1.0]
upper = [10.0, 50.0, 15.0, 5.0]
scaled = qmc.scale(samples, lower, upper)
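
What makes a Latin hypercube space-filling is that, along every dimension, each of the n equal-width bins holds exactly one sample. The check below is a standard-library sketch of that property. It applies to unit-cube samples before scaling (the `samples` array above, to my understanding of `qmc.LatinHypercube`), and `is_latin_hypercube` is a hypothetical helper name.

```python
def is_latin_hypercube(unit_samples) -> bool:
    """Check that every dimension has exactly one sample in each of n equal bins."""
    rows = [list(row) for row in unit_samples]
    n = len(rows)
    if n == 0:
        return False
    for dim in range(len(rows[0])):
        # Bin index of each sample along this dimension; clamp x == 1.0 into the last bin
        bins = {min(int(row[dim] * n), n - 1) for row in rows}
        if len(bins) != n:
            return False
    return True


print(is_latin_hypercube([[0.1, 0.6], [0.7, 0.2]]))  # one sample per bin in both dimensions
print(is_latin_hypercube([[0.1, 0.1], [0.2, 0.9]]))  # both samples share a bin in dimension 0
```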

Sobol Sequence

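from scipy.stats import qmc

# Sobol points keep their balance properties for power-of-two counts: 2**7 = 128
sampler = qmc.Sobol(d=n_params)
samples = sampler.random_base2(m=7)
scaled = qmc.scale(samples, lower, upper)  # n_params and bounds as in the Latin Hypercube example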
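
Either sampler yields rows of scaled values, which still have to be tied to named design parameters before they can drive trials. The sketch below does that mapping with plain Python. The helper name is hypothetical, and so are the parameter names in the usage example, because the bounds above are not labelled.

```python
def to_design_points(scaled_rows, names) -> list:
    """Pair each row of scaled sample values with the parameter names, in order."""
    points = []
    for row in scaled_rows:
        values = list(row)
        if len(values) != len(names):
            raise ValueError(f"expected {len(names)} values, got {len(values)}")
        points.append({name: float(v) for name, v in zip(names, values)})
    return points


# Hypothetical names for the first two bounds used above
rows = [[2.0, 20.0], [10.0, 50.0]]
for point in to_design_points(rows, ["thickness", "width"]):
    print(point)
```

Because it only iterates over rows, the same function accepts the NumPy array returned by `qmc.scale`.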

Next Steps After Export

1. Parse to Neural Format

cd atomizer-field
python batch_parser.py ../atomizer_field_training_data/my_study

2. Split Train/Validation

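from pathlib import Path

from sklearn.model_selection import train_test_split

# Assumption: the split is made over the exported trial directories
export_dir = Path("atomizer_field_training_data/my_study")
all_trials = sorted(export_dir.glob("trial_*"))

# 80/20 split with a fixed seed for reproducibility
train_trials, val_trials = train_test_split(
    all_trials,
    test_size=0.2,
    random_state=42
)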

3. Train Model

python train_parametric.py \
  --train_dir ../training_data/parsed \
  --val_dir ../validation_data/parsed \
  --epochs 200

See SYS_14_NEURAL_ACCELERATION for full training workflow.


Troubleshooting

| Symptom | Cause | Solution |
|---------|-------|----------|
| No export directory | Export not enabled | Add training_data_export to config |
| Missing OP2 files | Solve failed | Check include_failed: false |
| Incomplete metadata | Extraction error | Check extractor logs |
| Low sample count | Too many failures | Relax constraints |
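
The symptoms in this table can be detected automatically. The sketch below scans an export directory and returns the matching symptom names. It is an illustration only: `diagnose_export` and its `min_trials` threshold are hypothetical, and "incomplete metadata" is taken to mean a missing `design_parameters` or `objectives` key.

```python
import json
from pathlib import Path


def diagnose_export(export_dir, min_trials: int = 50) -> list:
    """Return the troubleshooting-table symptoms found in an export directory."""
    root = Path(export_dir)
    if not root.is_dir():
        return ["No export directory"]
    symptoms = set()
    trial_dirs = [p for p in root.glob("trial_*") if p.is_dir()]
    if len(trial_dirs) < min_trials:
        symptoms.add("Low sample count")
    for trial in trial_dirs:
        if not (trial / "output" / "model.op2").exists():
            symptoms.add("Missing OP2 files")
        try:
            meta = json.loads((trial / "metadata.json").read_text())
            # Assumption: these two keys are the minimum a complete record needs
            complete = isinstance(meta, dict) and {"design_parameters", "objectives"} <= set(meta)
        except (OSError, json.JSONDecodeError):
            complete = False
        if not complete:
            symptoms.add("Incomplete metadata")
    return sorted(symptoms)


print(diagnose_export("atomizer_field_training_data/my_study"))
```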

Cross-References


Version History

| Version | Date | Changes |
|---------|------|---------|
| 1.0 | 2025-12-05 | Initial release |