.claude/skills/run-optimization.md

# Run Optimization Skill

**Last Updated**: November 25, 2025
**Version**: 1.0 - Optimization Execution and Monitoring

You are helping the user run and monitor Atomizer optimization studies.

## Purpose

Execute optimization studies with proper:
1. Pre-flight validation
2. Resource management
3. Progress monitoring
4. Error recovery
5. Dashboard integration

## Triggers

- "run optimization"
- "start the study"
- "run {study_name}"
- "execute optimization"
- "begin the optimization"

## Prerequisites

- Study must exist in `studies/{study_name}/`
- `optimization_config.json` must be present and valid
- `run_optimization.py` must exist
- NX model files must be in place

## Pre-Flight Checklist

Before running, verify:

### 1. Study Structure
```
studies/{study_name}/
├── 1_setup/
│   ├── model/
│   │   ├── {Model}.prt          ✓ Required
│   │   ├── {Model}_sim1.sim     ✓ Required
│   │   └── {Model}_fem1.fem     ? Optional (created by NX)
│   ├── optimization_config.json ✓ Required
│   └── workflow_config.json     ? Optional
├── 2_results/                   ? Created automatically
└── run_optimization.py          ✓ Required
```
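The structure check above can be automated. The following is a minimal sketch, not part of Atomizer itself: the function name `check_study_structure` is hypothetical, and because the `{Model}` name varies per study, the model files are matched by pattern rather than by exact name.

```python
from pathlib import Path


def check_study_structure(study_dir: Path) -> list[str]:
    """Return the required paths that are missing from a study directory."""
    missing = []

    # Required files with fixed names, taken from the tree above.
    for rel in ("1_setup/optimization_config.json", "run_optimization.py"):
        if not (study_dir / rel).is_file():
            missing.append(rel)

    # {Model}.prt and {Model}_sim1.sim: the model name is study-specific,
    # so accept any file that matches the pattern.
    model_dir = study_dir / "1_setup" / "model"
    for pattern in ("*.prt", "*_sim1.sim"):
        if not any(model_dir.glob(pattern)):
            missing.append(f"1_setup/model/{pattern}")

    return missing
```

An empty list means every required file was found; optional files (`*_fem1.fem`, `workflow_config.json`, `2_results/`) are deliberately not checked.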

### 2. Configuration Validation
```python
from pathlib import Path

from optimization_engine.validators.config_validator import validate_config

study_name = "uav_arm_optimization"  # example; use the study being run
study_dir = Path("studies") / study_name

result = validate_config(study_dir / "1_setup" / "optimization_config.json")
if result.warnings:
    # WARN but can continue
    for warning in result.warnings:
        print(f"WARNING: {warning}")
if result.errors:
    # STOP - fix errors first
    for error in result.errors:
        print(f"ERROR: {error}")
    raise SystemExit(1)
```

### 3. NX Environment
- Verify NX is installed (check `config.py` for `NX_VERSION`)
- Verify Nastran solver is available
- Check for any running NX processes that might conflict

## Execution Modes

### Mode 1: Quick Test (3-5 trials)
```bash
cd studies/{study_name}
python run_optimization.py --trials 3
```
**Use when**: First time running, testing configuration

### Mode 2: Standard Run
```bash
cd studies/{study_name}
python run_optimization.py --trials 30
```
**Use when**: Production optimization with FEA only

### Mode 3: Extended with NN Surrogate
```bash
cd studies/{study_name}
python run_optimization.py --trials 200 --enable-nn
```
**Use when**: Large-scale optimization with trained surrogate

### Mode 4: Resume Interrupted
```bash
cd studies/{study_name}
python run_optimization.py --trials 30 --resume
```
**Use when**: Optimization was interrupted

## Execution Steps

### Step 1: Validate Study

Run pre-flight checks and present status:

```
PRE-FLIGHT CHECK: {study_name}
===============================

Configuration:
  ✓ optimization_config.json valid
  ✓ 4 design variables defined
  ✓ 2 objectives (multi-objective)
  ✓ 2 constraints configured

Model Files:
  ✓ Beam.prt exists (3.2 MB)
  ✓ Beam_sim1.sim exists (1.1 MB)
  ✓ Beam_fem1.fem exists (0.8 MB)

Environment:
  ✓ NX 2412 detected
  ✓ Nastran solver available
  ✓ No NX processes running (clean slate)

Estimated Runtime:
  - 30 trials × ~30s/trial = ~15 minutes
  - With NN: 200 trials × ~2s/trial = ~7 minutes (after training)

Ready to run. Proceed? (Y/n)
```
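The runtime estimate in the pre-flight report is simple arithmetic: trial count times average seconds per trial. A small sketch, assuming the per-trial times shown above (~30s for FEA, ~2s with the NN surrogate); the function name is hypothetical and real per-trial times depend on the model.

```python
def estimate_runtime_minutes(trials: int, seconds_per_trial: float) -> float:
    """Estimate total runtime in minutes from an average per-trial time."""
    return trials * seconds_per_trial / 60


# Figures from the pre-flight report above.
fea_minutes = estimate_runtime_minutes(30, 30)   # FEA only
nn_minutes = estimate_runtime_minutes(200, 2)    # NN surrogate, after training

print(f"FEA only: ~{fea_minutes:.0f} minutes")
print(f"With NN:  ~{nn_minutes:.0f} minutes")
```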

### Step 2: Start Optimization

```python
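import subprocess
import sys
from pathlib import Path

def run_optimization(study_name: str, trials: int = 30,
                     enable_nn: bool = False, resume: bool = False) -> int:
    """Start optimization as a background process and return its PID."""
    study_dir = Path("studies") / study_name

    # The script path is resolved relative to cwd=study_dir below,
    # so it must not be prefixed with study_dir again.
    cmd = [sys.executable, "run_optimization.py", "--trials", str(trials)]

    if enable_nn:
        cmd.append("--enable-nn")
    if resume:
        cmd.append("--resume")

    # Write output to the log file (path from the Output Format section)
    # instead of an unread pipe: a pipe nobody drains eventually fills
    # and blocks the child process.
    log_path = study_dir / "2_results" / "optimization.log"
    log_path.parent.mkdir(parents=True, exist_ok=True)

    # Run in background
    with log_path.open("a") as log_file:
        process = subprocess.Popen(
            cmd,
            cwd=study_dir,
            stdout=log_file,
            stderr=subprocess.STDOUT,
        )

    return process.pid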
```

### Step 3: Monitor Progress

Provide real-time updates:

```
OPTIMIZATION RUNNING: {study_name}
==================================

Progress: [████████░░░░░░░░░░░░] 12/30 trials (40%)
Elapsed:  5m 32s
ETA:      ~8 minutes

Current Trial #12:
  Parameters: thickness=2.3, diameter=15.2, ...
  Status: Running FEA...

Best So Far (Pareto Front):
  #1: mass=245g, freq=125Hz (feasible)
  #2: mass=280g, freq=142Hz (feasible)
  #3: mass=310g, freq=158Hz (feasible)

Constraint Violations: 3/12 trials (25%)
  Common: max_stress exceeded

Dashboard: http://localhost:3003
Optuna Dashboard: http://localhost:8081
```
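The progress line and ETA in the status block can be derived from the completed-trial count and elapsed time. This is an illustrative sketch with hypothetical function names; the ETA assumes every remaining trial takes the average time of the trials so far.

```python
from typing import Optional


def render_progress(done: int, total: int, width: int = 20) -> str:
    """Render a progress line in the format shown in the status block."""
    if total <= 0:
        raise ValueError("total must be positive")
    done = max(0, min(done, total))          # clamp to the valid range
    filled = width * done // total           # number of filled bar cells
    bar = "█" * filled + "░" * (width - filled)
    percent = 100 * done // total
    return f"Progress: [{bar}] {done}/{total} trials ({percent}%)"


def estimate_eta_seconds(elapsed_s: float, done: int, total: int) -> Optional[float]:
    """Estimate remaining seconds; None until at least one trial is counted."""
    if done <= 0:
        return None
    return elapsed_s / done * (total - done)


print(render_progress(12, 30))
```

With 12 of 30 trials counted after 5m 32s (332s), `estimate_eta_seconds(332, 12, 30)` gives roughly 498s, which matches the "~8 minutes" shown above.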

### Step 4: Handle Completion/Errors

**On Success**:
```
OPTIMIZATION COMPLETE
=====================

Study: {study_name}
Duration: 14m 23s
Trials: 30/30

Results:
  Feasible designs: 24/30 (80%)
  Pareto-optimal: 8 designs

Best Designs:
  #1: mass=231g, freq=118Hz ← Lightest
  #2: mass=298g, freq=156Hz ← Stiffest
  #3: mass=265g, freq=138Hz ← Balanced

Next Steps:
  1. View results: /generate-report {study_name}
  2. Continue optimization: python run_optimization.py --trials 50 --resume
  3. Export designs: python export_pareto.py
```

**On Error**:
```
OPTIMIZATION ERROR
==================

Trial #15 failed with:
  Error: NX simulation timeout after 600s
  Design: thickness=1.2, diameter=45, hole_count=12

Possible causes:
  1. Mesh quality issues with extreme parameters
  2. Convergence problems in solver
  3. NX process locked/crashed

Recovery options:
  1. [Recommended] Resume with the --resume flag (skips the failed trial)
  2. Narrow design variable bounds
  3. Check NX manually with these parameters

Resume command:
  python run_optimization.py --trials 30 --resume
```

## Dashboard Integration

Always inform user about monitoring options:

```
MONITORING OPTIONS
==================

1. Atomizer Dashboard (recommended):
   cd atomizer-dashboard/backend && python -m uvicorn api.main:app --port 8000
   cd atomizer-dashboard/frontend && npm run dev
   → http://localhost:3003

2. Optuna Dashboard:
   python -c "import optuna; from optuna_dashboard import run_server; ..."
   → http://localhost:8081

3. Command Line:
   Watch studies/{study_name}/2_results/study.db for updates
   Tail studies/{study_name}/2_results/optimization.log
```
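For the command-line option, tailing the log can be done without external tools. A minimal sketch, assuming the log is a plain-text file at the path listed in the Output Format section (`studies/{study_name}/2_results/optimization.log`); `tail_log` is a hypothetical helper name.

```python
from collections import deque
from pathlib import Path


def tail_log(log_path: Path, lines: int = 10) -> list[str]:
    """Return the last `lines` lines of a log file, or [] if it does not exist yet."""
    if not log_path.is_file():
        return []
    with log_path.open("r", encoding="utf-8", errors="replace") as fh:
        # deque with maxlen keeps only the most recent lines while streaming.
        return [line.rstrip("\n") for line in deque(fh, maxlen=lines)]
```

Returning an empty list for a missing file lets the caller poll before the optimization process has written anything.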

## Resource Management

### Before Running
- Check available RAM (recommend 8GB+ free)
- Check disk space (OP2 files can be large)
- Close unnecessary NX sessions
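
The disk-space check can be scripted with the standard library. This sketch covers disk only; checking free RAM needs a third-party package or OS-specific calls and is left out. The function names and the threshold passed by the caller are assumptions, since no minimum disk figure is specified here.

```python
import shutil
from pathlib import Path


def free_disk_gb(path: Path) -> float:
    """Free space, in GiB, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 1024**3


def has_enough_disk(path: Path, min_free_gb: float) -> bool:
    """True if at least `min_free_gb` GiB are free (threshold chosen by the caller)."""
    return free_disk_gb(path) >= min_free_gb
```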

### During Run
- Monitor CPU usage (Nastran is CPU-intensive)
- Watch for disk space issues
- Handle NX license timeouts

### Cleanup After
- Option to remove intermediate OP2 files
- Compress results for archival
- Clean worker directories
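
Before removing intermediate OP2 files, it helps to list them with their sizes. The sketch below only reports and never deletes; it assumes OP2 files use a `.op2` extension and sit somewhere under the directory passed in, which may differ from the actual study layout.

```python
from pathlib import Path


def find_op2_files(root: Path) -> list[tuple[Path, int]]:
    """List (path, size_in_bytes) for OP2 files under `root`, largest first."""
    files = [(p, p.stat().st_size) for p in root.rglob("*.op2") if p.is_file()]
    return sorted(files, key=lambda item: item[1], reverse=True)


def total_size_mb(files: list[tuple[Path, int]]) -> float:
    """Total size of the listed files in MiB."""
    return sum(size for _, size in files) / 1024**2
```

Reviewing this list first makes it easier to confirm that no OP2 file still needed for reporting is removed.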

## Error Handling

| Error | Cause | Recovery |
|-------|-------|----------|
| NX not found | NX not installed/configured | Check config.py settings |
| Simulation timeout | Complex mesh/convergence | Increase timeout or simplify |
| License error | NX license unavailable | Wait for license or use queue |
| Database locked | Multiple processes | Stop conflicting processes |
| Out of memory | Large mesh | Reduce mesh density |
| OP2 parse error | Corrupted output | Re-run trial |
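
The table can be turned into a lookup that suggests a recovery action from an error message. This is a sketch only: the recovery text is copied from the table, but the keywords used to match messages are assumptions, because the actual wording of Atomizer and NX error messages is not specified here.

```python
from typing import Optional

# (keywords that must all appear in the message, recovery hint from the table).
# Order matters: the first matching entry wins.
RECOVERY_HINTS = [
    (("nx", "not found"), "Check config.py settings"),
    (("license",), "Wait for license or use queue"),
    (("timeout",), "Increase timeout or simplify"),
    (("database", "locked"), "Stop conflicting processes"),
    (("out of memory",), "Reduce mesh density"),
    (("op2",), "Re-run trial"),
]


def suggest_recovery(message: str) -> Optional[str]:
    """Return the table's recovery hint for an error message, or None if unmatched."""
    text = message.lower()
    for keywords, hint in RECOVERY_HINTS:
        if all(keyword in text for keyword in keywords):
            return hint
    return None
```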

## Parallel Execution (Advanced)

For large studies, you can run parallel workers:

```bash
# Warning: Requires careful NX session management
# Each worker needs isolated model directory

python run_training_fea.py --study {study_name} --workers 2
```

## Output Format

After starting optimization:

```
OPTIMIZATION STARTED
====================

Study: {study_name}
Trials: 30
Mode: Multi-objective (NSGA-II)
Started: 2025-11-25 15:30:00

Process ID: 12345
Log File: studies/{study_name}/2_results/optimization.log

Monitoring:
  Dashboard: http://localhost:3003
  Database: studies/{study_name}/2_results/study.db

Commands:
  Check progress: "how's the optimization going?"
  Stop: Ctrl+C in terminal or kill process 12345
  Resume: python run_optimization.py --resume
```

## Integration with Other Skills

- After completion → `/generate-report`
- On error → `/troubleshoot`
- For more trials → `/configure-surrogate` (enable NN)

## Example Conversation

```
USER: Run the uav_arm optimization

CLAUDE: I'll start the UAV arm optimization. Let me run pre-flight checks...

PRE-FLIGHT CHECK: uav_arm_optimization
======================================
✓ Configuration valid (Protocol 11, 30 trials)
✓ Model files present
✓ NX 2412 available

Ready to start. This will take approximately 15 minutes.
Starting now...

OPTIMIZATION STARTED
====================
Process running in background.
Monitor at: http://localhost:3003

I'll let you know when it completes. You can ask "how's it going?" anytime.

[... 15 minutes later ...]

OPTIMIZATION COMPLETE
=====================
30/30 trials completed successfully.
Found 8 Pareto-optimal designs.

Best lightweight: 231g @ 118Hz
Best stiff: 298g @ 156Hz

Would you like me to generate a detailed report?
```