# OP_05: Export Training Data

<!--
PROTOCOL: Export Neural Network Training Data
LAYER: Operations
VERSION: 1.0
STATUS: Active
LAST_UPDATED: 2025-12-05
PRIVILEGE: user
LOAD_WITH: [SYS_14_NEURAL_ACCELERATION]
-->

## Overview

This protocol covers exporting FEA simulation data for training neural network surrogates. Proper data export enables Protocol 14 (Neural Acceleration).

---

## When to Use

| Trigger | Action |
|---------|--------|
| "export training data" | Follow this protocol |
| "neural network data" | Follow this protocol |
| Planning >50 trials | Consider export for acceleration |
| Want to train surrogate | Follow this protocol |

---

## Quick Reference

**Export Command**:
```bash
python run_optimization.py --export-training
```

**Output Structure**:
```
atomizer_field_training_data/{study_name}/
├── trial_0001/
│   ├── input/model.bdf
│   ├── output/model.op2
│   └── metadata.json
├── trial_0002/
│   └── ...
└── study_summary.json
```

**Recommended Data Volume**:

| Complexity | Training Samples | Validation Samples |
|------------|------------------|--------------------|
| Simple (2-3 params) | 50-100 | 20-30 |
| Medium (4-6 params) | 100-200 | 30-50 |
| Complex (7+ params) | 200-500 | 50-100 |

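
When planning a study, the table above can be turned into a small lookup. The sketch below is illustrative only: `recommended_samples` is a hypothetical helper, not part of Atomizer, and treating a single-parameter study as "simple" is an assumption, since the table starts at 2 parameters.

```python
def recommended_samples(n_params: int) -> dict:
    """Return the sample ranges from the Recommended Data Volume table."""
    if n_params < 1:
        raise ValueError("n_params must be at least 1")
    if n_params <= 3:
        # Assumption: 1 parameter is treated like the "simple" tier.
        tier, train, val = "simple", (50, 100), (20, 30)
    elif n_params <= 6:
        tier, train, val = "medium", (100, 200), (30, 50)
    else:
        tier, train, val = "complex", (200, 500), (50, 100)
    return {"tier": tier, "train": train, "validation": val}


rec = recommended_samples(4)
# Low-end trial budget: minimum training plus minimum validation samples.
min_trials = rec["train"][0] + rec["validation"][0]
print(rec["tier"], min_trials)
```
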
---

## Configuration

### Enable Export in Config

Add to `optimization_config.json`:

```json
{
  "training_data_export": {
    "enabled": true,
    "export_dir": "atomizer_field_training_data/my_study",
    "export_bdf": true,
    "export_op2": true,
    "export_fields": ["displacement", "stress"],
    "include_failed": false
  }
}
```

### Configuration Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `enabled` | bool | false | Enable export |
| `export_dir` | string | - | Output directory |
| `export_bdf` | bool | true | Save Nastran input |
| `export_op2` | bool | true | Save binary results |
| `export_fields` | list | all | Which result fields |
| `include_failed` | bool | false | Include failed trials |

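
To see how the defaults in the options table combine with a partial config block, here is a minimal sketch. `resolve_export_config` and `EXPORT_DEFAULTS` are hypothetical names, and requiring `export_dir` when export is enabled is an assumption based on the table listing no default for it; how Atomizer itself resolves these options is not shown in this protocol.

```python
import json

# Defaults copied from the options table; None for export_fields stands for "all".
EXPORT_DEFAULTS = {
    "enabled": False,
    "export_dir": None,
    "export_bdf": True,
    "export_op2": True,
    "export_fields": None,
    "include_failed": False,
}


def resolve_export_config(config: dict) -> dict:
    """Merge the training_data_export block with the documented defaults."""
    block = config.get("training_data_export", {})
    unknown = set(block) - set(EXPORT_DEFAULTS)
    if unknown:
        raise KeyError(f"Unknown export options: {sorted(unknown)}")
    resolved = {**EXPORT_DEFAULTS, **block}
    # Assumption: an enabled export needs an explicit output directory.
    if resolved["enabled"] and not resolved["export_dir"]:
        raise ValueError("export_dir is required when export is enabled")
    return resolved


config = json.loads("""
{
  "training_data_export": {
    "enabled": true,
    "export_dir": "atomizer_field_training_data/my_study",
    "export_fields": ["displacement", "stress"]
  }
}
""")
resolved = resolve_export_config(config)
print(resolved["export_bdf"], resolved["export_fields"])
```
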
---

## Export Workflow

### Step 1: Run with Export Enabled

```bash
conda activate atomizer
cd studies/my_study
python run_optimization.py --export-training
```

Alternatively, run a standard optimization with `"enabled": true` set under `training_data_export` in the config.

### Step 2: Verify Export

```bash
ls atomizer_field_training_data/my_study/
# Should see trial_0001/, trial_0002/, etc.

# Check a trial
ls atomizer_field_training_data/my_study/trial_0001/
# input/model.bdf
# output/model.op2
# metadata.json
```

### Step 3: Check Metadata

```bash
cat atomizer_field_training_data/my_study/trial_0001/metadata.json
```

```json
{
  "trial_number": 1,
  "design_parameters": {
    "thickness": 5.2,
    "width": 30.0
  },
  "objectives": {
    "mass": 0.234,
    "max_stress": 198.5
  },
  "constraints_satisfied": true,
  "simulation_time": 145.2
}
```

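
For scripted checks, the metadata shown above can be loaded into a typed record that fails loudly when a key is missing. This is a sketch under the assumption that every `metadata.json` carries exactly the five keys in the example; `TrialMetadata` and `parse_metadata` are hypothetical names, not Atomizer APIs.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class TrialMetadata:
    trial_number: int
    design_parameters: dict
    objectives: dict
    constraints_satisfied: bool
    simulation_time: float  # units are not stated in this protocol


def parse_metadata(text: str) -> TrialMetadata:
    """Parse metadata.json content and reject records with missing keys."""
    raw = json.loads(text)
    names = [f.name for f in fields(TrialMetadata)]
    missing = [name for name in names if name not in raw]
    if missing:
        raise ValueError(f"metadata.json is missing keys: {missing}")
    return TrialMetadata(**{name: raw[name] for name in names})


example = """
{
  "trial_number": 1,
  "design_parameters": {"thickness": 5.2, "width": 30.0},
  "objectives": {"mass": 0.234, "max_stress": 198.5},
  "constraints_satisfied": true,
  "simulation_time": 145.2
}
"""
meta = parse_metadata(example)
print(meta.trial_number, meta.objectives["max_stress"])
```

In practice the text would come from `Path(...).read_text()` on each trial's `metadata.json`.
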
### Step 4: Check Study Summary

```bash
cat atomizer_field_training_data/my_study/study_summary.json
```

```json
{
  "study_name": "my_study",
  "total_trials": 50,
  "successful_exports": 47,
  "failed_exports": 3,
  "design_parameters": ["thickness", "width"],
  "objectives": ["mass", "max_stress"],
  "export_timestamp": "2025-12-05T15:30:00"
}
```

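
A quick sanity check on the summary is that the successful and failed counts add up to the total. The sketch below assumes the three count fields shown in the example are always present; `check_summary` is a hypothetical helper.

```python
import json


def check_summary(summary: dict) -> float:
    """Verify the export counts are consistent and return the success rate."""
    total = summary["total_trials"]
    ok = summary["successful_exports"]
    failed = summary["failed_exports"]
    if ok + failed != total:
        raise ValueError(f"{ok} successful + {failed} failed != {total} total")
    return ok / total if total else 0.0


summary = json.loads("""
{
  "study_name": "my_study",
  "total_trials": 50,
  "successful_exports": 47,
  "failed_exports": 3
}
""")
rate = check_summary(summary)
print(f"Success rate: {rate:.0%}")
```
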
---

## Data Quality Checks

### Verify Sample Count

```python
from pathlib import Path

export_dir = Path("atomizer_field_training_data/my_study")
trials = sorted(export_dir.glob("trial_*"))
print(f"Exported trials: {len(trials)}")

# Check for missing files
for trial_dir in trials:
    bdf = trial_dir / "input" / "model.bdf"
    op2 = trial_dir / "output" / "model.op2"
    meta = trial_dir / "metadata.json"

    if not all([bdf.exists(), op2.exists(), meta.exists()]):
        print(f"Missing files in {trial_dir}")
```

### Check Parameter Coverage

```python
import json
from pathlib import Path

import pandas as pd

export_dir = Path("atomizer_field_training_data/my_study")

# Load the design parameters from every trial's metadata
params = []
for trial_dir in sorted(export_dir.glob("trial_*")):
    with open(trial_dir / "metadata.json") as f:
        meta = json.load(f)
    params.append(meta["design_parameters"])

# Check coverage
df = pd.DataFrame(params)
print(df.describe())

# Compare the sampled range of each parameter against its design bounds
for col in df.columns:
    print(f"{col}: min={df[col].min():.2f}, max={df[col].max():.2f}")
```

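
Minimum and maximum values alone do not reveal holes in the middle of a parameter's range. One possible follow-up, sketched here with made-up thickness values and hypothetical bounds of 2.0 to 10.0, is to report the widest unsampled interval as a fraction of the range. `widest_gap` is an illustrative helper, not part of Atomizer.

```python
def widest_gap(values, lower, upper):
    """Return the widest unsampled interval as a fraction of [lower, upper]."""
    # Include the bounds so gaps at either end of the range are counted too.
    edges = [lower] + sorted(values) + [upper]
    widest = max(b - a for a, b in zip(edges, edges[1:]))
    return widest / (upper - lower)


# Made-up sampled values for illustration only.
thickness = [2.1, 3.0, 5.2, 9.8]
gap = widest_gap(thickness, lower=2.0, upper=10.0)
print(f"Widest gap covers {gap:.1%} of the thickness range")
```
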
---

## Space-Filling Sampling

For best neural network training, use space-filling designs:

### Latin Hypercube Sampling

```python
from scipy.stats import qmc

# Generate space-filling samples
n_samples = 100
n_params = 4

sampler = qmc.LatinHypercube(d=n_params)
samples = sampler.random(n=n_samples)

# Scale to parameter bounds
lower = [2.0, 20.0, 5.0, 1.0]
upper = [10.0, 50.0, 15.0, 5.0]
scaled = qmc.scale(samples, lower, upper)
```

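
To make the idea behind Latin hypercube sampling concrete, here is a dependency-free sketch of the basic technique: each parameter's unit range is cut into `n_samples` equal strata, one point is drawn per stratum, and the strata are shuffled independently per parameter. It is a teaching aid with hypothetical function names, not a replacement for `scipy.stats.qmc`, and it omits the optimizations SciPy offers.

```python
import random


def latin_hypercube(n_samples, n_params, seed=None):
    """Basic Latin hypercube sample in the unit cube, as a list of rows."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_params):
        # One point per stratum [k/n, (k+1)/n), jittered uniformly inside it.
        column = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(column)  # decouple the strata across parameters
        columns.append(column)
    # Transpose the per-parameter columns into one row per sample.
    return [list(row) for row in zip(*columns)]


def scale(samples, lower, upper):
    """Map unit-cube rows onto the given parameter bounds."""
    return [
        [lo + u * (hi - lo) for u, lo, hi in zip(row, lower, upper)]
        for row in samples
    ]


unit = latin_hypercube(n_samples=100, n_params=4, seed=0)
scaled = scale(unit, [2.0, 20.0, 5.0, 1.0], [10.0, 50.0, 15.0, 5.0])
print(len(scaled), len(scaled[0]))
```

Because every stratum of every parameter holds exactly one sample, no region of any single parameter's range is left empty, which is the property that makes the design space-filling.
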
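### Sobol Sequence

```python
from scipy.stats import qmc

# Reuses n_params, lower, and upper from the Latin hypercube example above.
sampler = qmc.Sobol(d=n_params)
# Sobol points keep their balance properties only for power-of-two counts.
samples = sampler.random_base2(m=7)  # 2**7 = 128 samples
scaled = qmc.scale(samples, lower, upper)
```
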
---

## Next Steps After Export

### 1. Parse to Neural Format

```bash
cd atomizer-field
python batch_parser.py ../atomizer_field_training_data/my_study
```

### 2. Split Train/Validation

```python
from pathlib import Path

from sklearn.model_selection import train_test_split

# Collect the exported trial folders to split
export_dir = Path("atomizer_field_training_data/my_study")
all_trials = sorted(export_dir.glob("trial_*"))

# 80/20 split
train_trials, val_trials = train_test_split(
    all_trials,
    test_size=0.2,
    random_state=42
)
```

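
If the split needs to exist on disk as separate folders, one way to do it with only the standard library is sketched below. `split_export` and the `train`/`val` folder names are hypothetical, and the layout the training script actually expects is not specified in this protocol, so adapt the paths as needed. The demo builds empty placeholder trial folders in a temporary directory.

```python
import random
import shutil
import tempfile
from pathlib import Path


def split_export(export_dir, out_dir, val_fraction=0.2, seed=42):
    """Copy trial folders into out_dir/train and out_dir/val."""
    trials = sorted(p for p in Path(export_dir).glob("trial_*") if p.is_dir())
    random.Random(seed).shuffle(trials)  # seeded for a reproducible split
    n_val = round(len(trials) * val_fraction)
    groups = {"val": trials[:n_val], "train": trials[n_val:]}
    for name, members in groups.items():
        for trial in members:
            # copytree creates the intermediate train/ and val/ folders.
            shutil.copytree(trial, Path(out_dir) / name / trial.name)
    return {name: len(members) for name, members in groups.items()}


# Demo on ten empty placeholder trials.
with tempfile.TemporaryDirectory() as tmp:
    export = Path(tmp) / "my_study"
    for i in range(1, 11):
        (export / f"trial_{i:04d}").mkdir(parents=True)
    counts = split_export(export, Path(tmp) / "split")
    print(counts)
```
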
### 3. Train Model

```bash
python train_parametric.py \
    --train_dir ../training_data/parsed \
    --val_dir ../validation_data/parsed \
    --epochs 200
```

See [SYS_14_NEURAL_ACCELERATION](../system/SYS_14_NEURAL_ACCELERATION.md) for full training workflow.

---

## Troubleshooting

| Symptom | Cause | Solution |
|---------|-------|----------|
| No export directory | Export not enabled | Add `training_data_export` to config and set `enabled: true` |
| Missing OP2 files | Solve failed | Confirm `include_failed` is `false` so failed trials are not exported |
| Incomplete metadata | Extraction error | Check extractor logs |
| Low sample count | Too many failures | Relax constraints |

---

## Cross-References

- **Related**: [SYS_14_NEURAL_ACCELERATION](../system/SYS_14_NEURAL_ACCELERATION.md)
- **Preceded By**: [OP_02_RUN_OPTIMIZATION](./OP_02_RUN_OPTIMIZATION.md)
- **Skill**: `.claude/skills/modules/neural-acceleration.md`

---

## Version History

| Version | Date | Changes |
|---------|------|---------|
| 1.0 | 2025-12-05 | Initial release |