Anto01 e3bdb08a22 feat: Major update with validators, skills, dashboard, and docs reorganization

- Add validation framework (config, model, results, study validators)
- Add Claude Code skills (create-study, run-optimization, generate-report,
  troubleshoot, analyze-model)
- Add Atomizer Dashboard (React frontend + FastAPI backend)
- Reorganize docs into structured directories (00-09)
- Add neural surrogate modules and training infrastructure
- Add multi-objective optimization support

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-11-25 19:23:58 -05:00

Training Data Export for AtomizerField

Overview

The Training Data Export feature automatically captures NX Nastran input/output files and metadata during Atomizer optimization runs. This data is used to train AtomizerField neural network surrogate models that can replace slow FEA evaluations (30 min) with fast predictions (50 ms).

Quick Start

Add this configuration to your workflow_config.json:

{
  "study_name": "my_optimization",
  "design_variables": [...],
  "objectives": [...],

  "training_data_export": {
    "enabled": true,
    "export_dir": "atomizer_field_training_data/my_study_001"
  }
}

Run your optimization as normal:

cd studies/my_optimization
python run_optimization.py

The training data will be automatically exported to the specified directory.
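
If a study already has a workflow_config.json, the export block can also be added programmatically. The following is a minimal sketch, not part of Atomizer: the helper names `enable_training_export` and `update_config_file` are hypothetical, and only the `training_data_export` keys shown above are assumed.

```python
import json
from pathlib import Path


def enable_training_export(config: dict, export_dir: str) -> dict:
    """Return a copy of a workflow config with training data export enabled."""
    updated = dict(config)  # shallow copy; all other keys are left untouched
    updated["training_data_export"] = {
        "enabled": True,
        "export_dir": export_dir,
    }
    return updated


def update_config_file(config_path: Path, export_dir: str) -> None:
    """Rewrite an existing workflow_config.json with the export block added."""
    config = json.loads(config_path.read_text(encoding="utf-8"))
    updated = enable_training_export(config, export_dir)
    config_path.write_text(json.dumps(updated, indent=2), encoding="utf-8")
```

Keeping the dict transformation separate from the file I/O makes it easy to check the result before overwriting a config.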

How It Works

During Optimization

After each trial:

1. FEA Solve Completes: NX Nastran generates .dat (input deck) and .op2 (binary results) files
2. Results Extraction: Atomizer extracts objectives, constraints, and other metrics
3. Data Export: The exporter copies the NX files and creates metadata
4. Trial Directory Created: Structured directory with input, output, and metadata
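
The per-trial export step can be pictured roughly as follows. This is an illustrative sketch under stated assumptions, not the actual exporter code: the function name `export_trial_sketch` is hypothetical, the input deck is assumed to be stored as `model.bdf`, and the trial number passed in is assumed to be used directly in the directory name.

```python
import json
import shutil
from datetime import datetime
from pathlib import Path


def export_trial_sketch(export_dir, study_name, trial_number,
                        design_parameters, results, dat_file, op2_file):
    """Copy one trial's solver files and write metadata.json (illustrative only)."""
    trial_dir = Path(export_dir) / f"trial_{trial_number:04d}"
    (trial_dir / "input").mkdir(parents=True, exist_ok=True)
    (trial_dir / "output").mkdir(parents=True, exist_ok=True)

    # Assumption: the .dat input deck is stored under the name model.bdf.
    shutil.copy2(dat_file, trial_dir / "input" / "model.bdf")
    shutil.copy2(op2_file, trial_dir / "output" / "model.op2")

    metadata = {
        "trial_number": trial_number,
        "timestamp": datetime.now().isoformat(),
        "atomizer_study": study_name,
        "design_parameters": design_parameters,
        "results": results,
    }
    (trial_dir / "metadata.json").write_text(
        json.dumps(metadata, indent=2), encoding="utf-8"
    )
    return trial_dir
```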

After Optimization

When optimization completes:

1. Finalize Called: Creates study_summary.json with overall study metadata
2. README Generated: Instructions for using the data with AtomizerField
3. Ready for Training: Data is structured for AtomizerField batch parser

Directory Structure

After running an optimization with training data export enabled:

atomizer_field_training_data/my_study_001/
├── trial_0001/
│   ├── input/
│   │   └── model.bdf          # NX Nastran input deck (BDF format)
│   ├── output/
│   │   └── model.op2          # NX Nastran binary results (OP2 format)
│   └── metadata.json          # Design parameters, objectives, constraints
├── trial_0002/
│   └── ...
├── trial_0003/
│   └── ...
├── study_summary.json         # Overall study metadata
└── README.md                  # Usage instructions
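
Before handing an export to AtomizerField, it can be useful to confirm that every trial directory matches this layout. The check below is a small sketch that assumes only the file names shown in the tree above; `find_incomplete_trials` is a hypothetical helper, not an Atomizer function.

```python
from pathlib import Path

# Files every trial directory is expected to contain, per the layout above.
EXPECTED_FILES = ("input/model.bdf", "output/model.op2", "metadata.json")


def find_incomplete_trials(export_dir):
    """Return {trial_dir_name: [missing relative paths]} for incomplete trials."""
    problems = {}
    for trial_dir in sorted(Path(export_dir).glob("trial_*")):
        if not trial_dir.is_dir():
            continue
        missing = [rel for rel in EXPECTED_FILES if not (trial_dir / rel).is_file()]
        if missing:
            problems[trial_dir.name] = missing
    return problems
```

An empty result means every trial directory found has all three files.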

metadata.json Format

Each trial's metadata.json contains:

{
  "trial_number": 42,
  "timestamp": "2025-01-15T10:30:45.123456",
  "atomizer_study": "my_optimization",
  "design_parameters": {
    "thickness": 3.5,
    "width": 50.0,
    "length": 200.0
  },
  "results": {
    "objectives": {
      "max_stress": 245.3,
      "mass": 1.25
    },
    "constraints": {
      "stress_limit": -54.7
    },
    "max_displacement": 1.23
  }
}
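
For quick inspection outside AtomizerField, the per-trial metadata can be flattened into one row per trial. This is a sketch that assumes the field names in the example above; `load_trial_rows` and the `constraint_` prefix are illustrative choices, not part of Atomizer.

```python
import json
from pathlib import Path


def load_trial_rows(export_dir):
    """Flatten each trial's metadata.json into one dict per trial."""
    rows = []
    for meta_path in sorted(Path(export_dir).glob("trial_*/metadata.json")):
        meta = json.loads(meta_path.read_text(encoding="utf-8"))
        results = meta.get("results", {})
        row = {"trial_number": meta["trial_number"]}
        row.update(meta.get("design_parameters", {}))
        row.update(results.get("objectives", {}))
        # Constraint values are prefixed to avoid clashing with objective names.
        for name, value in results.get("constraints", {}).items():
            row[f"constraint_{name}"] = value
        # Any other entries under "results" (e.g. max_displacement) are kept as-is.
        for key, value in results.items():
            if key not in ("objectives", "constraints"):
                row[key] = value
        rows.append(row)
    return rows
```

The resulting list of dicts can be passed to a CSV writer or a data frame library for plotting.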

study_summary.json Format

The study_summary.json file contains:

{
  "study_name": "my_optimization",
  "total_trials": 100,
  "design_variables": ["thickness", "width", "length"],
  "objectives": ["max_stress", "mass"],
  "constraints": ["stress_limit"],
  "export_timestamp": "2025-01-15T12:00:00.000000",
  "metadata": {
    "atomizer_version": "1.0",
    "optimization_algorithm": "NSGA-II",
    "n_trials": 100
  }
}
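
One simple sanity check is to compare `total_trials` in the summary with the number of trial directories on disk. The sketch below assumes the summary format shown above; `check_summary` is a hypothetical helper.

```python
import json
from pathlib import Path


def check_summary(export_dir):
    """Return (total_trials from study_summary.json, trial directories on disk)."""
    export_dir = Path(export_dir)
    summary = json.loads(
        (export_dir / "study_summary.json").read_text(encoding="utf-8")
    )
    on_disk = sum(1 for p in export_dir.glob("trial_*") if p.is_dir())
    return summary["total_trials"], on_disk
```

If the two numbers differ, some trials were probably not exported; the export logs are the first place to look.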

Configuration Options

Basic Configuration

"training_data_export": {
  "enabled": true,
  "export_dir": "path/to/export/directory"
}

Parameters:

enabled (required): true to enable export, false to disable
export_dir (required if enabled): Path to export directory (relative or absolute)
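
The two rules above can be checked before a long run starts. This is a minimal sketch of such a check, assuming only the `enabled` and `export_dir` parameters described here; `validate_export_config` is a hypothetical name and is not Atomizer's own config validator.

```python
def validate_export_config(config: dict) -> list:
    """Return a list of problems found in the training_data_export block."""
    block = config.get("training_data_export")
    if block is None:
        return []  # export is simply not configured
    errors = []
    enabled = block.get("enabled")
    if not isinstance(enabled, bool):
        errors.append("'enabled' is required and must be true or false")
    if enabled is True and not block.get("export_dir"):
        errors.append("'export_dir' is required when export is enabled")
    return errors
```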

Recommended Directory Structure

For organizing multiple studies:

atomizer_field_training_data/
├── beam_study_001/          # First beam optimization
│   └── trial_0001/ ...
├── beam_study_002/          # Second beam optimization (different parameters)
│   └── trial_0001/ ...
├── bracket_study_001/       # Bracket optimization
│   └── trial_0001/ ...
└── plate_study_001/         # Plate optimization
    └── trial_0001/ ...

Using Exported Data with AtomizerField

Step 1: Parse Training Data

Convert BDF/OP2 files to PyTorch Geometric format:

cd Atomizer-Field
python batch_parser.py --data-dir "../Atomizer/atomizer_field_training_data/my_study_001"

This creates graph representations of the FEA data suitable for GNN training.

Step 2: Validate Parsed Data

Ensure data was parsed correctly:

python validate_parsed_data.py

Step 3: Train Neural Network

Train the GNN surrogate model:

python train.py --data-dir "training_data/parsed/" --epochs 200

Step 4: Use Trained Model in Atomizer

Enable neural network surrogate in your optimization:

cd ../Atomizer
python run_optimization.py --config studies/my_study/workflow_config.json --use-neural

Integration Points

The training data exporter hooks into Atomizer's optimization flow at three points: runner initialization, the end of each trial's objective evaluation, and the end of the run.

In `optimization_engine/runner.py`:

from optimization_engine.training_data_exporter import create_exporter_from_config

class OptimizationRunner:
    def __init__(self, config_path):
        # ... existing initialization ...

        # Initialize training data exporter (if enabled)
        self.training_data_exporter = create_exporter_from_config(self.config)
        if self.training_data_exporter:
            print(f"Training data export enabled: {self.training_data_exporter.export_dir}")

    def objective(self, trial):
        # ... simulation and results extraction ...

        # Export training data (if enabled)
        if self.training_data_exporter:
            simulation_files = {
                'dat_file': path_to_dat,
                'op2_file': path_to_op2
            }
            self.training_data_exporter.export_trial(
                trial_number=trial.number,
                design_variables=design_vars,
                results=extracted_results,
                simulation_files=simulation_files
            )

    def run(self):
        # ... optimization loop ...

        # Finalize training data export (if enabled)
        if self.training_data_exporter:
            self.training_data_exporter.finalize()

File Formats

BDF (.bdf) - Nastran Bulk Data File

Format: ASCII text
Contains:
- Mesh geometry (nodes, elements)
- Material properties
- Loads and boundary conditions
- Analysis parameters

OP2 (.op2) - Nastran Output2

Format: Binary
Contains:
- Displacements
- Stresses (von Mises, principal, etc.)
- Strains
- Reaction forces
- Modal results (if applicable)

JSON (.json) - Metadata

Format: UTF-8 JSON
Contains:
- Design parameter values
- Objective function values
- Constraint values
- Trial metadata (number, timestamp, study name)

Example: Complete Workflow

1. Create Optimization Study

import json
from pathlib import Path

config = {
    "study_name": "beam_optimization",
    "sim_file": "examples/Models/Beam/Beam.sim",
    "fem_file": "examples/Models/Beam/Beam_fem1.fem",

    "design_variables": [
        {"name": "thickness", "expression_name": "thickness", "min": 2.0, "max": 8.0},
        {"name": "width", "expression_name": "width", "min": 20.0, "max": 60.0}
    ],

    "objectives": [
        {
            "name": "max_stress",
            "type": "minimize",
            "extractor": {"type": "result_parameter", "parameter_name": "Max Von Mises Stress"}
        },
        {
            "name": "mass",
            "type": "minimize",
            "extractor": {"type": "expression", "expression_name": "mass"}
        }
    ],

    "optimization": {
        "algorithm": "NSGA-II",
        "n_trials": 100
    },

    # Enable training data export
    "training_data_export": {
        "enabled": True,
        "export_dir": "atomizer_field_training_data/beam_study_001"
    }
}

# Save config
config_path = Path("studies/beam_optimization/1_setup/workflow_config.json")
config_path.parent.mkdir(parents=True, exist_ok=True)
with open(config_path, 'w') as f:
    json.dump(config, f, indent=2)

2. Run Optimization

cd studies/beam_optimization
python run_optimization.py

Console output will show:

Training data export enabled: atomizer_field_training_data/beam_study_001
...
Training data export finalized: 100 trials exported

3. Verify Export

dir atomizer_field_training_data\beam_study_001

You should see:

trial_0001/
trial_0002/
...
trial_0100/
study_summary.json
README.md

4. Train AtomizerField

cd Atomizer-Field
python batch_parser.py --data-dir "../Atomizer/atomizer_field_training_data/beam_study_001"
python train.py --data-dir "training_data/parsed/" --epochs 200

Troubleshooting

No .dat or .op2 Files Found

Problem: Export logs show "dat file not found" or "op2 file not found"

Solution:

Ensure the NX Nastran solver is writing these files
Check NX simulation settings
Verify file paths in result_path

Export Directory Permission Error

Problem: PermissionError when creating export directory

Solution:

Use absolute path or path relative to Atomizer root
Ensure write permissions for the target directory
Check disk space

Missing Metadata Fields

Problem: metadata.json doesn't contain expected fields

Solution:

Verify extractors are configured correctly in workflow_config.json
Check that results are being extracted before export
Review extracted_results dict in runner

Large File Sizes

Problem: Export directory grows very large

Solution:

Solver files dominate the export size: roughly 6-60 MB per trial for the .dat and .op2 files combined (see Storage Requirements)
For 1000 trials, expect roughly 6-60 GB of training data
Use compression or cloud storage for large datasets
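
As a manual stopgap, archived exports can be compressed with Python's standard `gzip` module. The sketch below is an assumption-laden example rather than an Atomizer feature: `gzip_op2_files` is a hypothetical helper, and since it is not stated whether the AtomizerField batch parser reads gzipped files, the files would presumably need to be decompressed again before parsing.

```python
import gzip
import shutil
from pathlib import Path


def gzip_op2_files(export_dir, remove_original=False):
    """Write a .op2.gz next to every trial's OP2 file; return the new paths."""
    compressed = []
    for op2 in sorted(Path(export_dir).glob("trial_*/output/*.op2")):
        target = op2.with_name(op2.name + ".gz")
        with open(op2, "rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)  # streams in chunks, so large files are fine
        if remove_original:
            op2.unlink()
        compressed.append(target)
    return compressed
```

Originals are kept by default so that a failed or interrupted compression never loses solver output.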

Performance Considerations

Disk I/O

Each trial export involves two file copies (.dat and .op2)
The copies add roughly 100-500 ms of overhead per trial
This is negligible compared to the FEA solve time (30 minutes)

Storage Requirements

Typical file sizes per trial:

.dat file: 1-10 MB (depends on mesh density)
.op2 file: 5-50 MB (depends on results requested)
metadata.json: 1-5 KB

For 100 trials: ~600 MB - 6 GB
For 1000 trials: ~6 GB - 60 GB
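
These totals follow from adding the per-trial .dat and .op2 ranges and multiplying by the trial count. The arithmetic can be reproduced for any study size with a short sketch (the function name `estimate_storage_mb` is illustrative, and 1 GB is treated as 1000 MB):

```python
# Per-trial size ranges in MB, taken from the figures listed above.
DAT_MB = (1, 10)
OP2_MB = (5, 50)


def estimate_storage_mb(n_trials):
    """Return (low, high) storage estimates in MB.

    metadata.json (1-5 KB per trial) is ignored as negligible.
    """
    low = n_trials * (DAT_MB[0] + OP2_MB[0])
    high = n_trials * (DAT_MB[1] + OP2_MB[1])
    return low, high
```

For example, `estimate_storage_mb(100)` gives `(600, 6000)`, which matches the 600 MB to 6 GB range quoted for 100 trials.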

API Reference

TrainingDataExporter Class

from optimization_engine.training_data_exporter import TrainingDataExporter

exporter = TrainingDataExporter(
    export_dir=Path("training_data/study_001"),
    study_name="my_study",
    design_variable_names=["thickness", "width"],
    objective_names=["stress", "mass"],
    constraint_names=["stress_limit"],  # Optional
    metadata={"version": "1.0"}         # Optional
)

Methods

export_trial(trial_number, design_variables, results, simulation_files)

Export training data for a single trial.

trial_number (int): Optuna trial number
design_variables (dict): Design parameter names and values
results (dict): Objectives, constraints, and other results
simulation_files (dict): Paths to 'dat_file' and 'op2_file'

Returns True if successful, False otherwise.

finalize()

Finalize export by creating study_summary.json.

Factory Function

create_exporter_from_config(config)

Create exporter from workflow configuration dict.

config (dict): Workflow configuration

Returns TrainingDataExporter if enabled, None otherwise.
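
Because `export_trial` reports failure through its return value and the factory may return None, calling code may want to track which trials were not exported. The wrapper below is a sketch of one way to do that; `export_with_tracking` and `failed_trials` are hypothetical names, and the exporter is only assumed to expose the `export_trial` signature documented above.

```python
def export_with_tracking(exporter, trial_number, design_variables, results,
                         simulation_files, failed_trials):
    """Call exporter.export_trial and record the trial number if it reports failure."""
    if exporter is None:
        # Export disabled: create_exporter_from_config returned None.
        return False
    ok = exporter.export_trial(
        trial_number=trial_number,
        design_variables=design_variables,
        results=results,
        simulation_files=simulation_files,
    )
    if not ok:
        failed_trials.append(trial_number)
    return ok
```

After the run, a non-empty `failed_trials` list tells you which trial directories to inspect or re-export.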

Best Practices

1. Organize by Study Type

Group related studies together:

atomizer_field_training_data/
├── beams/
│   ├── cantilever_001/
│   ├── cantilever_002/
│   └── simply_supported_001/
└── brackets/
    ├── L_bracket_001/
    └── T_bracket_001/

2. Use Descriptive Names

Include important parameters in study names:

beam_study_thickness_2-8_width_20-60_100trials
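
A name in this style can be generated from the design variable bounds and the trial count, which keeps names consistent across studies. The helper below is a sketch; `build_study_name` is a hypothetical function and the naming pattern is simply the one shown above.

```python
def build_study_name(prefix, bounds, n_trials):
    """Build a name like 'beam_study_thickness_2-8_width_20-60_100trials'.

    bounds maps each design variable name to a (min, max) pair.
    """
    parts = [prefix]
    for name, (low, high) in bounds.items():
        # The 'g' format drops trailing zeros, so 2.0 is written as '2'.
        parts.append(f"{name}_{low:g}-{high:g}")
    parts.append(f"{n_trials}trials")
    return "_".join(parts)
```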

3. Version Your Studies

Track changes to design space or objectives:

bracket_study_001  # Initial study
bracket_study_002  # Expanded design space
bracket_study_003  # Added constraint
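
Picking the next version number by hand is easy to get wrong once several studies exist. The following sketch derives it from the directories already present; `next_study_dir` is a hypothetical helper that assumes the three-digit suffix convention shown above.

```python
import re
from pathlib import Path


def next_study_dir(root, prefix):
    """Return the next unused '<prefix>_NNN' name under root ('<prefix>_001' if none exist)."""
    pattern = re.compile(rf"^{re.escape(prefix)}_(\d{{3}})$")
    highest = 0
    for path in Path(root).iterdir():
        match = pattern.match(path.name)
        if path.is_dir() and match:
            highest = max(highest, int(match.group(1)))
    return f"{prefix}_{highest + 1:03d}"
```

The function only returns a name; creating the directory is left to the caller so nothing is written by accident.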

4. Document Metadata

Add custom metadata to track study details:

"metadata": {
  "description": "Initial beam study with basic design variables",
  "date": "2025-01-15",
  "engineer": "Your Name",
  "validation_status": "pending"
}

5. Backup Training Data

Training data is expensive to generate (hours or days of computation), so protect it:

Back up to cloud storage
Consider version control for study configurations

Future Enhancements

Planned improvements:

Incremental export (resume after crash)
Compression options (gzip .dat and .op2 files)
Cloud upload integration (S3, Azure Blob)
Export filtering (only export Pareto-optimal trials)
Multi-fidelity support (tag high/low fidelity trials)

Support

For issues or questions:

Check the troubleshooting section above
Review the AtomizerField integration test plan
Open an issue on GitHub with:
- Your workflow_config.json
- Export logs
- Error messages
