# OP_03: Monitor Progress

<!--
PROTOCOL: Monitor Optimization Progress
LAYER: Operations
VERSION: 1.0
STATUS: Active
LAST_UPDATED: 2025-12-05
PRIVILEGE: user
LOAD_WITH: [SYS_13_DASHBOARD_TRACKING]
-->

## Overview

This protocol covers monitoring optimization progress through console output, the Atomizer dashboard, database queries, and the Optuna dashboard.

---

## When to Use

| Trigger | Action |
|---------|--------|
| "status", "progress" | Follow this protocol |
| "how many trials" | Query database |
| "what's happening" | Check console or dashboard |
| "is it running" | Check process status |

---

## Quick Reference

| Method | Command/URL | Best For |
|--------|-------------|----------|
| Console | Watch terminal output | Quick check |
| Dashboard | `http://localhost:3000` | Visual monitoring |
| Database query | Python one-liner | Scripted checks |
| Optuna Dashboard | `http://localhost:8080` | Detailed analysis |

---

## Monitoring Methods

### 1. Console Output

If the optimization is running in the foreground, watch the terminal output:
```
[10:15:30] Trial 15/50 started
[10:17:45] Trial 15/50 complete: mass=0.234 kg (best: 0.212 kg)
[10:17:46] Trial 16/50 started
```

### 2. Atomizer Dashboard

**Start Dashboard** (if not running):
```bash
# Terminal 1: Backend
cd atomizer-dashboard/backend
python -m uvicorn api.main:app --reload --port 8000

# Terminal 2: Frontend
cd atomizer-dashboard/frontend
npm run dev
```

**View at**: `http://localhost:3000`

**Features**:
- Real-time trial progress bar
- Current optimizer phase (if Protocol 10)
- Pareto front visualization (if multi-objective)
- Parallel coordinates plot
- Convergence chart

### 3. Database Query

**Quick status**:
```bash
python -c "
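import optuna
from optuna.trial import TrialState
study = optuna.load_study(
    study_name='my_study',
    storage='sqlite:///studies/my_study/2_results/study.db'
)
# study.trials also holds running, pruned, and failed trials, so filter by state
done = study.get_trials(deepcopy=False, states=(TrialState.COMPLETE,))
print(f'Trials completed: {len(done)} of {len(study.trials)} recorded')
# best_value and best_params need a single-objective study with a completed trial
print(f'Best value: {study.best_value}')
print(f'Best params: {study.best_params}')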
"
```

**Detailed status**:
```python
from collections import Counter

import optuna

study = optuna.load_study(
    study_name='my_study',
    storage='sqlite:///studies/my_study/2_results/study.db'
)

# Trial counts by state
states = Counter(t.state.name for t in study.trials)
print(f"Complete: {states.get('COMPLETE', 0)}")
print(f"Pruned: {states.get('PRUNED', 0)}")
print(f"Failed: {states.get('FAIL', 0)}")
print(f"Running: {states.get('RUNNING', 0)}")

# Best trials (best_value raises an error when no trial has completed, so check first)
if states.get('COMPLETE', 0) == 0:
    print("No completed trials yet")
elif len(study.directions) > 1:
    print(f"Pareto front size: {len(study.best_trials)}")
else:
    print(f"Best value: {study.best_value}")
```

### 4. Optuna Dashboard

```bash
optuna-dashboard sqlite:///studies/my_study/2_results/study.db
# Open http://localhost:8080
```

**Features**:
- Trial history table
- Parameter importance
- Optimization history plot
- Slice plot (parameter vs objective)

### 5. Check Running Processes

```bash
# Linux/Mac (the [r] bracket keeps grep from matching its own process)
ps aux | grep '[r]un_optimization'

# Windows
tasklist | findstr python
```

---

## Key Metrics to Monitor

### Trial Progress
- Completed trials vs target
- Completion rate (trials/hour)
- Estimated time remaining
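
The completion rate and remaining time can be estimated from trial completion timestamps. The following is a minimal sketch, not part of Atomizer: `estimate_progress` is a hypothetical helper, and the timestamps are synthetic. With a real study, the list could be built from Optuna's `datetime_complete` attribute, for example `[t.datetime_complete for t in study.trials if t.state.name == 'COMPLETE']`.

```python
from datetime import datetime, timedelta


def estimate_progress(completion_times, target_trials):
    """Return (trials_per_hour, estimated_time_remaining) from completion timestamps.

    Hypothetical helper: assumes trials complete one after another at a
    roughly steady pace.
    """
    done = len(completion_times)
    if done < 2:
        return None, None  # not enough data to estimate a rate
    ordered = sorted(completion_times)
    elapsed = (ordered[-1] - ordered[0]).total_seconds()
    if elapsed <= 0:
        return None, None
    # `done` completions are separated by `done - 1` intervals
    seconds_per_trial = elapsed / (done - 1)
    rate_per_hour = 3600.0 / seconds_per_trial
    remaining = max(target_trials - done, 0)
    return rate_per_hour, timedelta(seconds=remaining * seconds_per_trial)


# Synthetic example: 15 trials, one completing every 135 seconds
start = datetime(2025, 12, 5, 10, 0, 0)
times = [start + timedelta(seconds=135 * i) for i in range(15)]
rate, eta = estimate_progress(times, target_trials=50)
print(f"{rate:.1f} trials/hour, about {eta} remaining")
```

The estimate is only as good as the steady-pace assumption; trial durations that vary with the design parameters will make it drift.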

### Objective Improvement
- Current best value
- Improvement trend
- Plateau detection
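
One simple way to detect a plateau is to count how many trials have passed since the best value last improved. The sketch below assumes a single-objective study; `trials_since_improvement` and the `tol` parameter are hypothetical names, and the values are made up. With a real study, the input could be `[t.value for t in study.trials if t.state.name == 'COMPLETE']`.

```python
def trials_since_improvement(values, minimize=True, tol=0.0):
    """Return how many trials have passed since the best value last improved.

    `tol` is the minimum change that counts as an improvement.
    """
    best = None
    last_improved = -1
    for i, v in enumerate(values):
        if best is None:
            better = True
        elif minimize:
            better = v < best - tol
        else:
            better = v > best + tol
        if better:
            best = v
            last_improved = i
    return len(values) - 1 - last_improved if values else 0


# Made-up objective values in trial order (minimization)
values = [0.30, 0.26, 0.234, 0.25, 0.24, 0.236]
stalled = trials_since_improvement(values)
print(f"Best value unchanged for {stalled} trials")
```

A large count alone does not distinguish convergence from a local minimum; it only signals that a closer look is warranted.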

### Constraint Satisfaction
- Feasibility rate (% passing constraints)
- Most violated constraint

### For Protocol 10 (IMSO)
- Current phase (Characterization vs Optimization)
- Current strategy (TPE, GP, CMA-ES)
- Characterization confidence

### For Protocol 11 (Multi-Objective)
- Pareto front size
- Hypervolume indicator
- Spread of solutions
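
For a two-objective study, the hypervolume indicator is the area dominated by the Pareto front and bounded by a reference point. The sketch below is a plain-Python illustration, assuming both objectives are minimized; `hypervolume_2d` is a hypothetical helper and the reference point must be chosen by the user. With a real study, the points could come from `[tuple(t.values) for t in study.best_trials]`.

```python
def hypervolume_2d(points, reference):
    """Area dominated by `points` and bounded by `reference`.

    Assumes exactly two objectives, both minimized.
    """
    rx, ry = reference
    # Keep only points that are strictly better than the reference in both objectives
    pts = sorted((x, y) for x, y in points if x < rx and y < ry)
    area = 0.0
    best_y = ry
    for x, y in pts:
        if y < best_y:  # non-dominated among the points seen so far
            # Add the horizontal strip this point contributes
            area += (rx - x) * (best_y - y)
            best_y = y
    return area


# Made-up Pareto front with reference point (4, 4)
front = [(1, 3), (2, 2), (3, 1)]
print(hypervolume_2d(front, reference=(4, 4)))  # 6.0
```

A hypervolume that keeps growing between checks indicates the front is still advancing; a flat value suggests the multi-objective search has stalled.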

---

## Interpreting Results

### Healthy Optimization
```
Trial 45/50: mass=0.198 kg (best: 0.195 kg)
Feasibility rate: 78%
```
- Progress toward target
- Reasonable feasibility rate (60-90%)
- Gradual improvement

### Potential Issues

**All Trials Pruned**:
```
Trial 20 pruned: constraint violated
Trial 21 pruned: constraint violated
...
```
→ Constraints are likely too tight. Consider relaxing them.

**No Improvement**:
```
Trial 30: best=0.234 (unchanged since trial 8)
Trial 31: best=0.234 (unchanged since trial 8)
```
→ The optimizer may have converged, or it may be stuck in a local minimum.

**High Failure Rate**:
```
Failed: 15/50 (30%)
```
→ Likely model issues. Check the NX logs.
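
The checks above can be sketched as a small function over trial-state counts, such as the `Counter` built in the detailed status query. This is an illustration only: `diagnose` is a hypothetical helper, and the 20% failure threshold is an assumption, not an Atomizer default.

```python
from collections import Counter


def diagnose(states, fail_threshold=0.2):
    """Map trial-state counts to the potential issues described above.

    `fail_threshold` is an assumed cutoff; tune it for your study.
    """
    total = sum(states.values())
    if total == 0:
        return ["no trials recorded yet"]
    # Trials that are still running or queued have no outcome yet
    finished = total - states.get("RUNNING", 0) - states.get("WAITING", 0)
    issues = []
    if finished and states.get("PRUNED", 0) == finished:
        issues.append("all finished trials pruned: constraints may be too tight")
    if states.get("FAIL", 0) / total >= fail_threshold:
        issues.append("high failure rate: check NX logs")
    return issues


# Matches the "Failed: 15/50 (30%)" example above
print(diagnose(Counter(COMPLETE=35, FAIL=15)))
```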

---

## Real-Time State File

If using Protocol 10, check the optimizer state file:
```bash
cat studies/my_study/2_results/intelligent_optimizer/optimizer_state.json
```

```json
{
  "timestamp": "2025-12-05T10:15:30",
  "trial_number": 29,
  "total_trials": 50,
  "current_phase": "adaptive_optimization",
  "current_strategy": "GP_UCB",
  "is_multi_objective": false
}
```
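
For scripted checks, the state file can be parsed with the standard `json` module. The sketch below uses the sample content shown above as an inline string so it runs on its own; `summarize_state` is a hypothetical helper, and only the field names from the sample are assumed to exist. In practice the text would be read from `optimizer_state.json` instead.

```python
import json

# Sample content copied from the state file example above
SAMPLE = """
{
  "timestamp": "2025-12-05T10:15:30",
  "trial_number": 29,
  "total_trials": 50,
  "current_phase": "adaptive_optimization",
  "current_strategy": "GP_UCB",
  "is_multi_objective": false
}
"""


def summarize_state(state):
    """Format a one-line progress summary from the parsed state dictionary."""
    pct = 100.0 * state["trial_number"] / state["total_trials"]
    return (
        f"Trial {state['trial_number']}/{state['total_trials']} ({pct:.0f}%), "
        f"phase={state['current_phase']}, strategy={state['current_strategy']}"
    )


summary = summarize_state(json.loads(SAMPLE))
print(summary)
```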

---

## Troubleshooting

| Symptom | Cause | Solution |
|---------|-------|----------|
| Dashboard shows old data | Backend not running | Start backend |
| "No study found" | Wrong path | Check study name and path |
| Trial count not increasing | Process stopped | Check if still running |
| Dashboard not updating | Polling issue | Refresh browser |

---

## Cross-References

- **Preceded By**: [OP_02_RUN_OPTIMIZATION](./OP_02_RUN_OPTIMIZATION.md)
- **Followed By**: [OP_04_ANALYZE_RESULTS](./OP_04_ANALYZE_RESULTS.md)
- **Integrates With**: [SYS_13_DASHBOARD_TRACKING](../system/SYS_13_DASHBOARD_TRACKING.md)

---

## Version History

| Version | Date | Changes |
|---------|------|---------|
| 1.0 | 2025-12-05 | Initial release |