
# OP_02: Run Optimization

<!--
PROTOCOL: Run Optimization
LAYER: Operations
VERSION: 1.1
STATUS: Active
LAST_UPDATED: 2025-12-12
PRIVILEGE: user
LOAD_WITH: []
-->

## Overview

This protocol covers executing optimization runs, including pre-flight validation, execution modes, monitoring, and handling common issues.

---

## When to Use

| Trigger | Action |
|---------|--------|
| "start", "run", "execute" | Follow this protocol |
| "begin optimization" | Follow this protocol |
| Study setup complete | Execute this protocol |

---

## Quick Reference

**Start Command**:
```bash
conda activate atomizer
cd studies/{study_name}
python run_optimization.py
```

**Common Options**:
| Flag | Purpose |
|------|---------|
| `--n-trials 100` | Override trial count |
| `--resume` | Continue interrupted run |
| `--test` | Run single trial for validation |
| `--neural` | Use trained surrogate model |
| `--export-training` | Export data for neural training |

---

## Pre-Flight Checklist

Before running, verify:

- [ ] **Environment**: `conda activate atomizer`
- [ ] **Config exists**: `1_setup/optimization_config.json`
- [ ] **Script exists**: `run_optimization.py`
- [ ] **Model files**: NX files in `1_setup/model/`
- [ ] **No conflicts**: No other optimization running on same study
- [ ] **Disk space**: Sufficient for results

**Quick Validation**:
```bash
python run_optimization.py --test
```
This runs a single trial to verify setup.
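
The file-based items in the checklist can be checked before launching anything. The following is a minimal sketch, not part of Atomizer: the function name `preflight_check` and the 1 GiB free-space threshold are assumptions, while the paths are the ones listed above. It does not check the conda environment or whether another run is using the same study.

```python
import shutil
from pathlib import Path


def preflight_check(study_dir, min_free_bytes=1 * 1024**3):
    """Return a list of problems found; an empty list means the checks passed."""
    study = Path(study_dir)
    problems = []

    config = study / "1_setup" / "optimization_config.json"
    if not config.is_file():
        problems.append(f"missing config: {config}")

    script = study / "run_optimization.py"
    if not script.is_file():
        problems.append(f"missing script: {script}")

    model_dir = study / "1_setup" / "model"
    if not model_dir.is_dir() or not any(model_dir.iterdir()):
        problems.append(f"no model files in: {model_dir}")

    # Free space on the volume holding the study (threshold is an assumption).
    if study.exists():
        free = shutil.disk_usage(study).free
        if free < min_free_bytes:
            problems.append(f"low disk space: {free} bytes free")

    return problems
```

Calling `preflight_check("studies/{study_name}")` and printing the returned list gives a quick pass/fail summary before running `--test`.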

---

## Execution Modes

### 1. Standard Run

```bash
python run_optimization.py
```
Uses settings from `optimization_config.json`.

### 2. Override Trials

```bash
python run_optimization.py --n-trials 100
```
Override trial count from config.

### 3. Resume Interrupted

```bash
python run_optimization.py --resume
```
Continues from last completed trial.

### 4. Neural Acceleration

```bash
python run_optimization.py --neural
```
Requires trained surrogate model.

### 5. Export Training Data

```bash
python run_optimization.py --export-training
```
Saves BDF/OP2 for neural network training.
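
When implementing `run_optimization.py` for a new study, the flags used in the modes above could be parsed along these lines. This is a sketch under assumptions: the flag names come from this document, but the use of `argparse`, the helper names, and the precedence in `resolve_trial_count` are illustrative and may differ from the actual script.

```python
import argparse


def build_parser():
    """Build a parser for the flags described in this protocol."""
    parser = argparse.ArgumentParser(description="Run an optimization study (sketch).")
    parser.add_argument("--n-trials", type=int, default=None,
                        help="Override the trial count from optimization_config.json")
    parser.add_argument("--resume", action="store_true",
                        help="Continue an interrupted run")
    parser.add_argument("--test", action="store_true",
                        help="Run a single trial for validation")
    parser.add_argument("--neural", action="store_true",
                        help="Use a trained surrogate model")
    parser.add_argument("--export-training", action="store_true",
                        help="Save BDF/OP2 files for neural network training")
    return parser


def resolve_trial_count(args, config_trials):
    """Assumed precedence: --test, then --n-trials, then the config value."""
    if args.test:
        return 1
    if args.n_trials is not None:
        return args.n_trials
    return config_trials


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(resolve_trial_count(args, config_trials=50))
```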

---

## Monitoring Progress

### Option 1: Console Output
The script prints progress:
```
Trial 15/50 complete. Best: 0.234 kg
Trial 16/50 complete. Best: 0.234 kg
```
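
If the console output is captured to a file, progress lines in the format shown above can be parsed for scripting. A small sketch, assuming the line format is exactly as in the example; `parse_progress` is a hypothetical helper, not an Atomizer function.

```python
import re

# Matches lines such as: "Trial 15/50 complete. Best: 0.234 kg"
PROGRESS_RE = re.compile(
    r"Trial (?P<done>\d+)/(?P<total>\d+) complete\. "
    r"Best: (?P<best>[-+]?\d+(?:\.\d+)?) (?P<unit>\S+)"
)


def parse_progress(line):
    """Return a dict with done/total/best/unit, or None if the line does not match."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    return {
        "done": int(match["done"]),
        "total": int(match["total"]),
        "best": float(match["best"]),
        "unit": match["unit"],
    }
```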

### Option 2: Dashboard
See [SYS_13_DASHBOARD_TRACKING](../system/SYS_13_DASHBOARD_TRACKING.md).

```bash
# Terminal 1: start the dashboard backend
cd atomizer-dashboard/backend && python -m uvicorn api.main:app --port 8000

# Terminal 2: start the dashboard frontend
cd atomizer-dashboard/frontend && npm run dev

# Then open http://localhost:3000 in a browser
```

### Option 3: Query Database

```bash
python -c "
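import optuna
# load_study takes keyword-only arguments; replace study_name with the actual study name
study = optuna.load_study(study_name='study_name', storage='sqlite:///2_results/study.db')
print(f'Trials: {len(study.trials)}')
# best_value applies to single-objective studies; use study.best_trials for multi-objective
print(f'Best value: {study.best_value}')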
"
```

### Option 4: Optuna Dashboard

```bash
optuna-dashboard sqlite:///2_results/study.db
# Open http://localhost:8080
```

---

## During Execution

### What Happens Per Trial

1. **Sample parameters**: Optuna suggests design variable values
2. **Update model**: NX expressions updated via journal
3. **Solve**: NX Nastran runs FEA simulation
4. **Extract results**: Extractors read OP2 file
5. **Evaluate**: Check constraints, compute objectives
6. **Record**: Trial stored in Optuna database
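
The six steps above can be sketched as a single function. This is an illustration only: uniform random sampling stands in for Optuna, and the `update_model`, `solve`, and `extract` callables are hypothetical stand-ins for the NX journal update, the NX Nastran solve, and the OP2 extractors. The toy formulas in `demo` are invented so the sketch runs without NX.

```python
import random


def run_trial(number, bounds, update_model, solve, extract, max_stress_mpa, rng):
    # 1. Sample parameters (Optuna does this in a real run; uniform random here).
    params = {name: rng.uniform(low, high) for name, (low, high) in bounds.items()}
    # 2. Update model.
    update_model(params)
    # 3. Solve.
    result_file = solve()
    # 4. Extract results.
    results = extract(result_file)
    # 5. Evaluate the constraint.
    feasible = results["stress"] <= max_stress_mpa
    # 6. Record (returned here; a real run stores it in the study database).
    return {"number": number, "params": params, "results": results, "feasible": feasible}


def demo(n_trials=5, seed=0):
    rng = random.Random(seed)
    state = {}

    def update_model(params):      # stand-in for the NX journal update
        state.update(params)

    def solve():                   # stand-in for the NX Nastran solve
        return "model.op2"

    def extract(result_file):      # stand-in for the OP2 extractors (toy formulas)
        thickness = state["thickness"]
        return {"mass": 0.1 * thickness, "stress": 500.0 / thickness}

    bounds = {"thickness": (1.0, 5.0)}
    return [run_trial(i, bounds, update_model, solve, extract, 250.0, rng)
            for i in range(1, n_trials + 1)]
```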

### Normal Output

```
[2025-12-05 10:15:30] Trial 1 started
[2025-12-05 10:17:45] NX solve complete (135.2s)
[2025-12-05 10:17:46] Extraction complete
[2025-12-05 10:17:46] Trial 1 complete: mass=0.342 kg, stress=198.5 MPa

[2025-12-05 10:17:47] Trial 2 started
...
```

### Expected Timing

| Operation | Typical Time |
|-----------|--------------|
| NX solve | 30s - 30min |
| Extraction | <1s |
| Per trial total | 1-30 min |
| 50 trials | ~1-25 hours |
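
Since per-trial time varies widely between models, the time remaining in a given run is usually better estimated from that run's own history. A minimal sketch, assuming trials take roughly equal time; `estimate_remaining` is a hypothetical helper.

```python
from datetime import timedelta


def estimate_remaining(elapsed_seconds, completed, total):
    """Estimate remaining wall time from the mean duration of completed trials."""
    if completed <= 0:
        raise ValueError("need at least one completed trial")
    if completed > total:
        raise ValueError("completed cannot exceed total")
    mean_seconds = elapsed_seconds / completed
    return timedelta(seconds=mean_seconds * (total - completed))
```

For example, 15 of 50 trials completed in 2025 s (135 s each) leaves 35 trials, or about 1 h 19 min.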

---

## Handling Issues

### Trial Failed / Pruned

```
[WARNING] Trial 12 pruned: Stress constraint violated (312.5 MPa > 250 MPa)
```
**Normal behavior**: no action is needed. The optimizer learns from failed and pruned trials.

### NX Session Timeout

```
[ERROR] NX session timeout after 600s
```
**Solution**: Increase timeout in config or simplify model.

### Expression Not Found

```
[ERROR] Expression 'thicknes' not found in model
```
**Solution**: Check spelling, verify expression exists in NX.

### OP2 File Missing

```
[ERROR] OP2 file not found: model.op2
```
**Solution**: Check NX solve completed. Review NX log file.

### Database Locked

```
[ERROR] Database is locked
```
**Solution**: Another process is using the database. Wait for it to finish, or kill the stale process.
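
Before killing anything, it can help to confirm that the lock is still held. The sketch below uses only the standard-library `sqlite3` module and applies to SQLite storage such as `2_results/study.db`; `is_database_locked` is a hypothetical helper, not an Atomizer or Optuna function.

```python
import sqlite3
from pathlib import Path


def is_database_locked(db_path, wait_seconds=0.5):
    """Return True if another connection currently holds the write lock."""
    path = Path(db_path)
    if not path.is_file():
        # sqlite3.connect would silently create a new file, so fail early instead.
        raise FileNotFoundError(path)
    # isolation_level=None: transactions are controlled explicitly below.
    conn = sqlite3.connect(str(path), timeout=wait_seconds, isolation_level=None)
    try:
        conn.execute("BEGIN IMMEDIATE")  # request the write lock
        conn.execute("ROLLBACK")         # release it again without changes
        return False
    except sqlite3.OperationalError as exc:
        if "locked" in str(exc).lower():
            return True
        raise
    finally:
        conn.close()
```

A `True` result only shows that some process holds the lock at that moment; it does not identify which one.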

---

## Stopping and Resuming

### Graceful Stop
Press `Ctrl+C` once. The current trial completes, then the script exits.

### Force Stop
Press `Ctrl+C` twice. The script exits immediately (the current trial may be lost).

### Resume
```bash
python run_optimization.py --resume
```
Continues from last completed trial. Same study database used.
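
The one-press/two-press behavior can be implemented with a `SIGINT` handler that sets a flag on the first press and raises on the second. This is a sketch of that general pattern, not the actual `run_optimization.py` code; `StopController` and `run_trials` are hypothetical names.

```python
import signal


class StopController:
    """First Ctrl+C requests a graceful stop; a second one aborts immediately."""

    def __init__(self):
        self.stop_requested = False

    def handle_sigint(self, signum, frame):
        if self.stop_requested:
            raise KeyboardInterrupt  # second press: exit now
        self.stop_requested = True   # first press: finish the current trial

    def install(self):
        signal.signal(signal.SIGINT, self.handle_sigint)


def run_trials(n_trials, run_one_trial, controller):
    """Run trials until done or until a graceful stop is requested."""
    completed = 0
    for number in range(1, n_trials + 1):
        if controller.stop_requested:
            break
        run_one_trial(number)
        completed += 1
    return completed
```

Call `controller.install()` once before the loop. The flag is checked only between trials, which is what lets the current trial finish.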

---

## Post-Run Actions

After optimization completes:

1. **Archive best design** (REQUIRED):
   ```bash
   python tools/archive_best_design.py {study_name}
   ```
   This copies the best iteration folder to `3_results/best_design_archive/<timestamp>/`
   with metadata. **Always do this** to preserve the winning design.

2. **Analyze results**:
   ```bash
   python tools/analyze_study.py {study_name}
   ```
   Generates a report with statistics and parameter-bounds analysis.

3. **Find best iteration folder**:
   ```bash
   python tools/find_best_iteration.py {study_name}
   ```
   Shows which `iter{N}` folder contains the best design.

4. **View in dashboard**: `http://localhost:3000`

5. **Generate detailed report**: See [OP_04_ANALYZE_RESULTS](./OP_04_ANALYZE_RESULTS.md)

### Automated Archiving

The `run_optimization.py` script should call `archive_best_design()` automatically
at the end of each run. If implementing a new study, add this at the end:

```python
# At end of optimization
from tools.archive_best_design import archive_best_design
archive_best_design(study_name)
```
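
For orientation, the archive step described above (copy the best iteration folder into a timestamped directory and add metadata) could look roughly like the following. This is not the code in `tools/archive_best_design.py`: the function name `archive_iteration`, the timestamp format, and the `metadata.json` filename are all assumptions.

```python
import json
import shutil
from datetime import datetime
from pathlib import Path


def archive_iteration(study_dir, best_iter_dir, metadata, now=None):
    """Copy best_iter_dir into 3_results/best_design_archive/<timestamp>/."""
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d_%H%M%S")  # timestamp format is an assumption
    target = Path(study_dir) / "3_results" / "best_design_archive" / stamp
    # copytree creates missing parent directories and fails if target exists,
    # so an existing archive is never overwritten.
    shutil.copytree(best_iter_dir, target)
    (target / "metadata.json").write_text(json.dumps(metadata, indent=2))
    return target
```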

---

## Protocol Integration

### With Protocol 10 (IMSO)
If enabled, optimization runs in two phases:
1. Characterization (10-30 trials)
2. Optimization (remaining trials)

Dashboard shows phase transitions.

### With Protocol 11 (Multi-Objective)
With two or more objectives, the run uses NSGA-II and returns a Pareto front rather than a single best design.
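
A Pareto front is the set of designs that no other design beats in every objective at once. The sketch below shows the dominance test for the case where all objectives are minimized (an assumption; real studies may mix directions). In Optuna, the equivalent result for a multi-objective study is available as `study.best_trials`.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # (mass in kg, stress in MPa); values are made up for illustration.
    designs = [(0.30, 200.0), (0.25, 240.0), (0.35, 210.0), (0.40, 180.0)]
    print(pareto_front(designs))
```

Here `(0.35, 210.0)` is dropped because `(0.30, 200.0)` is both lighter and less stressed; the other three each win on at least one objective.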

### With Protocol 13 (Dashboard)
Writes `optimizer_state.json` every trial for real-time updates.
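
Because the dashboard may read the file while a trial is being written, a new study's script may want to write it atomically. The sketch below writes to a temporary file and then renames it; the target directory and the fields in `state` are assumptions, as the actual schema is defined in SYS_13.

```python
import json
import os
import tempfile
from pathlib import Path


def write_state(results_dir, state):
    """Write optimizer_state.json so readers never see a half-written file."""
    results = Path(results_dir)
    results.mkdir(parents=True, exist_ok=True)
    target = results / "optimizer_state.json"
    # Temp file in the same directory, so the final rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=str(results), suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle)
        os.replace(tmp_name, target)  # replaces any previous state file
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target
```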

### With Protocol 14 (Neural)
With the `--neural` flag, the run uses a trained surrogate for fast evaluation.

---

## Troubleshooting

| Symptom | Cause | Solution |
|---------|-------|----------|
| "ModuleNotFoundError" | Wrong environment | `conda activate atomizer` |
| All trials pruned | Constraints too tight | Relax constraints |
| Very slow | Model too complex | Simplify mesh, increase timeout |
| No improvement | Wrong sampler | Try different algorithm |
| "NX license error" | License unavailable | Check NX license server |

---

## Cross-References

- **Preceded By**: [OP_01_CREATE_STUDY](./OP_01_CREATE_STUDY.md)
- **Followed By**: [OP_03_MONITOR_PROGRESS](./OP_03_MONITOR_PROGRESS.md), [OP_04_ANALYZE_RESULTS](./OP_04_ANALYZE_RESULTS.md)
- **Integrates With**: [SYS_10_IMSO](../system/SYS_10_IMSO.md), [SYS_13_DASHBOARD_TRACKING](../system/SYS_13_DASHBOARD_TRACKING.md)

---

## Version History

| Version | Date | Changes |
|---------|------|---------|
| 1.1 | 2025-12-12 | Added mandatory archive_best_design step, analyze_study and find_best_iteration tools |
| 1.0 | 2025-12-05 | Initial release |