Atomizer/.claude/skills/run-optimization.md

# Run Optimization Skill

**Last Updated**: December 3, 2025
**Version**: 2.0 - Added Adaptive Multi-Objective (Protocol 13)

You are helping the user run and monitor Atomizer optimization studies.

## Purpose

Execute optimization studies with proper:
1. Pre-flight validation
2. Resource management
3. Progress monitoring
4. Error recovery
5. Dashboard integration

## Triggers

- "run optimization"
- "start the study"
- "run {study_name}"
- "execute optimization"
- "begin the optimization"

## Prerequisites

- Study must exist in `studies/{study_name}/`
- `optimization_config.json` must be present and valid
- `run_optimization.py` must exist
- NX model files must be in place

## Pre-Flight Checklist

Before running, verify:

### 1. Study Structure
```
studies/{study_name}/
├── 1_setup/
│   ├── model/
│   │   ├── {Model}.prt          ✓ Required
│   │   ├── {Model}_sim1.sim     ✓ Required
│   │   └── {Model}_fem1.fem     ? Optional (created by NX)
│   ├── optimization_config.json ✓ Required
│   └── workflow_config.json     ? Optional
├── 2_results/                   ? Created automatically
└── run_optimization.py          ✓ Required
```
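
The required entries in the tree above can be checked programmatically before a run. The following is a minimal sketch, not Atomizer's own validator: `check_study_structure` is a hypothetical helper, and it assumes the model files can be recognized by the `*.prt` and `*_sim1.sim` patterns.

```python
from pathlib import Path


def check_study_structure(study_dir: Path) -> list[str]:
    """Return the required items that are missing (empty list means OK)."""
    missing = []
    setup_dir = study_dir / "1_setup"
    model_dir = setup_dir / "model"

    # Required files with fixed names.
    for required in (setup_dir / "optimization_config.json",
                     study_dir / "run_optimization.py"):
        if not required.is_file():
            missing.append(required.relative_to(study_dir).as_posix())

    # Model files are named after the model, so match them by pattern.
    for pattern in ("*.prt", "*_sim1.sim"):
        if not model_dir.is_dir() or not any(model_dir.glob(pattern)):
            missing.append(f"1_setup/model/{pattern}")

    return missing
```

An empty return value means every item marked "Required" is present; optional items such as `workflow_config.json` are deliberately not checked.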

### 2. Configuration Validation
```python
import sys
from pathlib import Path

from optimization_engine.validators.config_validator import validate_config

# Substitute the actual study name for the placeholder
study_dir = Path("studies/{study_name}")

result = validate_config(study_dir / "1_setup" / "optimization_config.json")
if result.warnings:
    # WARN but can continue
    for warning in result.warnings:
        print(f"WARNING: {warning}")
if result.errors:
    # STOP - fix errors first
    for error in result.errors:
        print(f"ERROR: {error}")
    sys.exit(1)
```

### 3. NX Environment
- Verify NX is installed (check `config.py` for `NX_VERSION`)
- Verify Nastran solver is available
- Check for any running NX processes that might conflict

## Execution Modes

### Mode 1: Quick Test (3-5 trials)
```bash
cd studies/{study_name}
python run_optimization.py --trials 3
```
**Use when**: First time running, testing configuration

### Mode 2: Standard Run
```bash
cd studies/{study_name}
python run_optimization.py --trials 30
```
**Use when**: Production optimization with FEA only

### Mode 3: Extended with NN Surrogate
```bash
cd studies/{study_name}
python run_optimization.py --trials 200 --enable-nn
```
**Use when**: Large-scale optimization with trained surrogate

### Mode 4: Resume Interrupted
```bash
cd studies/{study_name}
python run_optimization.py --trials 30 --resume
```
**Use when**: Optimization was interrupted

### Mode 5: Adaptive Multi-Objective (Protocol 13)
```bash
cd studies/{study_name}
python run_optimization.py --start
```
**Use when**:
- FEA takes > 5 minutes per run
- Multi-objective optimization (2-4 objectives)
- Need to explore > 100 designs efficiently

**Workflow:**
1. Initial FEA trials (50-100) for NN training data
2. Train neural network surrogate
3. NN-accelerated search (1000+ trials in seconds)
4. Validate top NN predictions with FEA
5. Retrain NN with new data, repeat

**Configuration** (in optimization_config.json):
```json
{
  "protocol": 13,
  "adaptive_settings": {
    "enabled": true,
    "initial_fea_trials": 50,
    "nn_trials_per_iteration": 1000,
    "fea_validation_per_iteration": 5,
    "max_iterations": 10
  }
}
```
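
To see what these settings imply for solver time, the trial counts can be totaled. This is a rough sketch that assumes every iteration runs its full NN and FEA validation counts and that the loop runs for all `max_iterations`. Whether Atomizer stops earlier is not stated here, so treat the result as an upper bound. `adaptive_budget` is a hypothetical helper.

```python
def adaptive_budget(config: dict) -> dict:
    """Upper-bound trial counts implied by the adaptive settings."""
    settings = config["adaptive_settings"]
    iterations = settings["max_iterations"]
    # Initial training data plus the FEA validations of every iteration.
    fea_trials = (settings["initial_fea_trials"]
                  + iterations * settings["fea_validation_per_iteration"])
    # Surrogate evaluations are cheap but counted separately.
    nn_trials = iterations * settings["nn_trials_per_iteration"]
    return {"fea_trials": fea_trials, "nn_trials": nn_trials}


config = {
    "protocol": 13,
    "adaptive_settings": {
        "enabled": True,
        "initial_fea_trials": 50,
        "nn_trials_per_iteration": 1000,
        "fea_validation_per_iteration": 5,
        "max_iterations": 10,
    },
}
print(adaptive_budget(config))  # {'fea_trials': 100, 'nn_trials': 10000}
```

With the values shown, at most 100 of the 10,100 evaluated designs require a full FEA solve.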

**Monitoring:**
- `adaptive_state.json`: Current iteration, best values, history
- Dashboard shows FEA (blue) vs NN (orange) trials
- Pareto front updates after each FEA validation

## Execution Steps

### Step 1: Validate Study

Run pre-flight checks and present status:

```
PRE-FLIGHT CHECK: {study_name}
===============================

Configuration:
  ✓ optimization_config.json valid
  ✓ 4 design variables defined
  ✓ 2 objectives (multi-objective)
  ✓ 2 constraints configured

Model Files:
  ✓ Beam.prt exists (3.2 MB)
  ✓ Beam_sim1.sim exists (1.1 MB)
  ✓ Beam_fem1.fem exists (0.8 MB)

Environment:
  ✓ NX 2412 detected
  ✓ Nastran solver available
  ✓ No NX processes running (clean slate)

Estimated Runtime:
  - 30 trials × ~30s/trial = ~15 minutes
  - With NN: 200 trials × ~2s/trial = ~7 minutes (after training)

Ready to run. Proceed? (Y/n)
```
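
The runtime estimates in the sample output are a simple product of trial count and time per trial. A small sketch of that arithmetic follows; the per-trial times are the illustrative figures from the example, not measurements, and `estimate_runtime_minutes` is a hypothetical helper.

```python
def estimate_runtime_minutes(trials: int, seconds_per_trial: float) -> float:
    """Estimated wall-clock time in minutes for a sequential run."""
    return trials * seconds_per_trial / 60


print(round(estimate_runtime_minutes(30, 30)))  # 15
print(round(estimate_runtime_minutes(200, 2)))  # 7
```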

### Step 2: Start Optimization

```python
import subprocess
import sys
from pathlib import Path

def run_optimization(study_name: str, trials: int = 30,
                     enable_nn: bool = False, resume: bool = False):
    """Start optimization as background process."""
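    # Resolve to an absolute path: the script path below must stay valid
    # after the child process changes into study_dir.
    study_dir = Path(f"studies/{study_name}").resolve()

    cmd = [
        sys.executable,
        str(study_dir / "run_optimization.py"),
        "--trials", str(trials),
    ]

    if enable_nn:
        cmd.append("--enable-nn")
    if resume:
        cmd.append("--resume")

    # Send output to the log file rather than a pipe: a pipe that nobody
    # reads fills up and blocks the child. This assumes the script does not
    # open the same log file itself.
    log_path = study_dir / "2_results" / "optimization.log"
    log_path.parent.mkdir(parents=True, exist_ok=True)

    # Run in background; the child keeps its own copy of the log handle
    with open(log_path, "a") as log_file:
        process = subprocess.Popen(
            cmd,
            cwd=study_dir,
            stdout=log_file,
            stderr=subprocess.STDOUT,
        )

    return process.pid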
```

### Step 3: Monitor Progress

Provide real-time updates:

```
OPTIMIZATION RUNNING: {study_name}
==================================

Progress: [████████░░░░░░░░░░░░] 12/30 trials (40%)
Elapsed:  5m 32s
ETA:      ~8 minutes

Current Trial #12:
  Parameters: thickness=2.3, diameter=15.2, ...
  Status: Running FEA...

Best So Far (Pareto Front):
  #1: mass=245g, freq=125Hz (feasible)
  #2: mass=280g, freq=142Hz (feasible)
  #3: mass=310g, freq=158Hz (feasible)

Constraint Violations: 3/12 trials (25%)
  Common: max_stress exceeded

Dashboard: http://localhost:3003
Optuna Dashboard: http://localhost:8081
```
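
The progress line and ETA in the status block can be derived from three numbers: finished trials, total trials, and elapsed time. The sketch below assumes the ETA is extrapolated from the mean duration of finished trials, which is one reasonable choice rather than Atomizer's documented method. Both function names are hypothetical.

```python
def render_progress(done: int, total: int, width: int = 20) -> str:
    """Render a text progress bar like the one in the status block."""
    if total <= 0:
        raise ValueError("total must be positive")
    filled = int(width * done / total)
    bar = "█" * filled + "░" * (width - filled)
    return f"[{bar}] {done}/{total} trials ({done / total:.0%})"


def eta_seconds(done: int, total: int, elapsed_s: float) -> float:
    """Estimate remaining time from the mean duration of finished trials."""
    if done <= 0:
        raise ValueError("need at least one finished trial")
    return elapsed_s / done * (total - done)


print(render_progress(12, 30))
print(f"ETA: ~{eta_seconds(12, 30, 5 * 60 + 32) / 60:.0f} minutes")
```

For 12 of 30 trials after 5m 32s, this gives a 40% bar and an ETA of about 8 minutes, matching the sample output.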

### Step 4: Handle Completion/Errors

**On Success**:
```
OPTIMIZATION COMPLETE
=====================

Study: {study_name}
Duration: 14m 23s
Trials: 30/30

Results:
  Feasible designs: 24/30 (80%)
  Pareto-optimal: 8 designs

Best Designs:
  #1: mass=231g, freq=118Hz ← Lightest
  #2: mass=298g, freq=156Hz ← Stiffest
  #3: mass=265g, freq=138Hz ← Balanced

Next Steps:
  1. View results: /generate-report {study_name}
  2. Continue optimization: python run_optimization.py --trials 50 --resume
  3. Export designs: python export_pareto.py
```

**On Error**:
```
OPTIMIZATION ERROR
==================

Trial #15 failed with:
  Error: NX simulation timeout after 600s
  Design: thickness=1.2, diameter=45, hole_count=12

Possible causes:
  1. Mesh quality issues with extreme parameters
  2. Convergence problems in solver
  3. NX process locked/crashed

Recovery options:
  1. [Recommended] Resume with --resume flag (skips failed)
  2. Narrow design variable bounds
  3. Check NX manually with these parameters

Resume command:
  python run_optimization.py --trials 30 --resume
```

## Dashboard Integration

Always inform user about monitoring options:

```
MONITORING OPTIONS
==================

1. Atomizer Dashboard (recommended):
   cd atomizer-dashboard/backend && python -m uvicorn api.main:app --port 8000
   cd atomizer-dashboard/frontend && npm run dev
   → http://localhost:3003

   Features:
   - ALL extracted metrics displayed per trial (not just mass/frequency)
   - Quick preview shows first 6 metrics with abbreviations
   - Expanded view shows full metric names and values
   - Multi-objective studies: Zernike RMS, coefficients, workload metrics

2. Optuna Dashboard:
   python -c "import optuna; from optuna_dashboard import run_server; ..."
   → http://localhost:8081

3. Command Line:
   Watch study.db for updates
   Tail the log file
```

**IMPORTANT**: For multi-objective studies with custom metrics (Zernike, thermal, etc.),
the Atomizer Dashboard automatically displays ALL numeric metrics from user_attrs.
No configuration needed - metrics are discovered dynamically.
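
For the command-line option of watching `study.db`, trial counts can be polled directly with the standard library. This sketch assumes `study.db` is a SQLite file written by Optuna using its usual RDB schema, with a `trials` table that has a `state` column. Verify that against the actual database before relying on it. `count_trials_by_state` is a hypothetical helper.

```python
import sqlite3
from collections import Counter
from pathlib import Path


def count_trials_by_state(db_path: str) -> Counter:
    """Count rows of the `trials` table grouped by their state."""
    # Open read-only so polling never takes a write lock on the database.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT state, COUNT(*) FROM trials GROUP BY state"
        ).fetchall()
    finally:
        conn.close()
    return Counter(dict(rows))
```

Calling this every few seconds against `studies/{study_name}/2_results/study.db` gives a lightweight progress view without starting either dashboard.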

## Resource Management

### Before Running
- Check available RAM (recommend 8GB+ free)
- Check disk space (OP2 files can be large)
- Close unnecessary NX sessions
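
The disk space check can be done with the standard library. A minimal sketch follows; the 10 GB default is an arbitrary placeholder, not an Atomizer requirement, and `has_free_disk` is a hypothetical helper.

```python
import shutil


def has_free_disk(path: str, min_free_gb: float = 10.0) -> bool:
    """Return True if the filesystem holding `path` has enough free space."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    return free_gb >= min_free_gb
```

Pick the threshold from the size of a typical OP2 file multiplied by the planned number of trials.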

### During Run
- Monitor CPU usage (Nastran is CPU-intensive)
- Watch for disk space issues
- Handle NX license timeouts

### Cleanup After
- Option to remove intermediate OP2 files
- Compress results for archival
- Clean worker directories
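
Removing intermediate OP2 files could look like the sketch below. It assumes the files carry a lowercase `.op2` extension somewhere under the directory passed in; where Atomizer actually stores them is not specified here. `remove_op2_files` is a hypothetical helper that defaults to a dry run so nothing is deleted by accident.

```python
from pathlib import Path


def remove_op2_files(root: Path, dry_run: bool = True) -> tuple[int, int]:
    """Return (file_count, total_bytes) of *.op2 files; delete unless dry_run."""
    count = 0
    total_bytes = 0
    # Materialize the list first so deletion does not disturb the traversal.
    for op2 in sorted(root.rglob("*.op2")):
        count += 1
        total_bytes += op2.stat().st_size
        if not dry_run:
            op2.unlink()
    return count, total_bytes
```

Run it once with `dry_run=True` to report how much space would be reclaimed, then again with `dry_run=False` after the user confirms.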

## Error Handling

| Error | Cause | Recovery |
|-------|-------|----------|
| NX not found | NX not installed/configured | Check config.py settings |
| Simulation timeout | Complex mesh/convergence | Increase timeout or simplify |
| License error | NX license unavailable | Wait for license or use queue |
| Database locked | Multiple processes | Stop conflicting processes |
| Out of memory | Large mesh | Reduce mesh density |
| OP2 parse error | Corrupted output | Re-run trial |
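
The table above can be turned into a simple lookup for suggesting a recovery step from an error message. In this sketch the keywords are hypothetical substrings chosen for illustration, not actual Atomizer or NX error strings, and `suggest_recovery` is a hypothetical helper.

```python
from typing import Optional

# (keyword, recovery) pairs; recovery text is taken from the table above.
RECOVERY_HINTS = [
    ("nx not found", "Check config.py settings"),
    ("timeout", "Increase timeout or simplify"),
    ("license", "Wait for license or use queue"),
    ("locked", "Stop conflicting processes"),
    ("memory", "Reduce mesh density"),
    ("op2", "Re-run trial"),
]


def suggest_recovery(message: str) -> Optional[str]:
    """Return the first matching recovery hint, or None if nothing matches."""
    lowered = message.lower()
    for keyword, recovery in RECOVERY_HINTS:
        if keyword in lowered:
            return recovery
    return None


print(suggest_recovery("NX simulation timeout after 600s"))
```

A `None` result means the message is not covered by the table and should be passed on to the user unchanged.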

## Parallel Execution (Advanced)

For large studies, you can run parallel workers:

```bash
# Warning: Requires careful NX session management
# Each worker needs isolated model directory

python run_training_fea.py --study {study_name} --workers 2
```
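
The requirement that each worker has an isolated model directory can be met by copying the model files once per worker. This is a sketch under assumptions: the `worker_{i}/model` layout is hypothetical, and `run_training_fea.py` may already handle isolation itself.

```python
import shutil
from pathlib import Path


def prepare_worker_dirs(model_dir: Path, workers_root: Path,
                        workers: int) -> list[Path]:
    """Copy the model directory once per worker and return the copies."""
    worker_dirs = []
    for i in range(workers):
        target = workers_root / f"worker_{i}" / "model"
        # Start from a clean copy so stale solver output is not reused.
        if target.exists():
            shutil.rmtree(target)
        shutil.copytree(model_dir, target)
        worker_dirs.append(target)
    return worker_dirs
```

Each worker then opens only the files in its own copy, so two NX sessions never write to the same `.prt` or `.sim` file.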

## Output Format

After starting optimization:

```
OPTIMIZATION STARTED
====================

Study: {study_name}
Trials: 30
Mode: Multi-objective (NSGA-II)
Started: 2025-11-25 15:30:00

Process ID: 12345
Log File: studies/{study_name}/2_results/optimization.log

Monitoring:
  Dashboard: http://localhost:3003
  Database: studies/{study_name}/2_results/study.db

Commands:
  Check progress: "how's the optimization going?"
  Stop: Ctrl+C in terminal or kill process 12345
  Resume: python run_optimization.py --resume
```

## Integration with Other Skills

- After completion → `/generate-report`
- On error → `/troubleshoot`
- For more trials → `/configure-surrogate` (enable NN)

## Example Conversation

```
USER: Run the uav_arm optimization

CLAUDE: I'll start the UAV arm optimization. Let me run pre-flight checks...

PRE-FLIGHT CHECK: uav_arm_optimization
======================================
✓ Configuration valid (Protocol 11, 30 trials)
✓ Model files present
✓ NX 2412 available

Ready to start. This will take approximately 15 minutes.
Starting now...

OPTIMIZATION STARTED
====================
Process running in background.
Monitor at: http://localhost:3003

I'll let you know when it completes. You can ask "how's it going?" anytime.

[... 15 minutes later ...]

OPTIMIZATION COMPLETE
=====================
30/30 trials completed successfully.
Found 8 Pareto-optimal designs.

Best lightweight: 231g @ 118Hz
Best stiff: 298g @ 156Hz

Would you like me to generate a detailed report?
```