# CMA-ES Explained for Engineers

**CMA-ES** = **Covariance Matrix Adaptation Evolution Strategy**

A derivative-free optimization algorithm ideal for:

- Local refinement around known good solutions
- 4-10 dimensional problems
- Smooth, continuous objective functions
- Problems where gradient information is unavailable (like FEA)

---

## The Core Idea

Imagine searching for the lowest point in a hilly landscape while blindfolded:

1. **Throw darts** around your current best guess
2. **Observe which darts land lower** (better objective)
3. **Learn the shape of the valley** from those results
4. **Adjust future throws** to follow the valley's direction

---

## Key Components

```
┌─────────────────────────────────────────────────────────────┐
│                     CMA-ES Components                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. MEAN (μ)        - Current best guess location           │
│     • Moves toward better solutions each generation         │
│                                                             │
│  2. STEP SIZE (σ)   - How far to throw darts                │
│     • Adapts: shrinks when close, grows when exploring      │
│     • sigma0=0.3 means 30% of parameter range initially     │
│                                                             │
│  3. COVARIANCE MATRIX (C) - Shape of the search cloud       │
│     • Learns parameter correlations                         │
│     • Stretches search along promising directions           │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

---

## Visual: How the Search Evolves

```
Generation 1 (Round search):        Generation 10 (Learned shape):

      x x x                                         x
    x  x  x                                     x x
      x ● x            ──────►              x ● x
    x  x  x                             x x
      x x x                         x

  ● = mean (center)                 Ellipse aligned with
  x = samples                       the valley direction
```

CMA-ES learns that certain parameter combinations work well together and stretches its search cloud in that direction.

---

## The Algorithm (Simplified)

```python
def cma_es_generation():
    # 1. SAMPLE: Generate λ candidates around the mean
    candidates = [
        mean + sigma * sample_from_gaussian(covariance=C)
        for _ in range(population_size)
    ]

    # 2. EVALUATE: Run FEA for each candidate
    fitness = [run_simulation(c) for c in candidates]

    # 3. SELECT: Keep the best μ candidates
    selected = top_k(candidates, by=fitness, k=mu)

    # 4. UPDATE MEAN: Move toward the best solutions
    new_mean = weighted_average(selected)

    # 5. UPDATE COVARIANCE: Learn parameter correlations
    C = update_covariance(C, selected, mean, new_mean)

    # 6. UPDATE STEP SIZE: Adapt exploration range
    sigma = adapt_step_size(sigma, evolution_path)
```
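
The pseudocode above can be sketched as runnable NumPy. This is a toy illustration, not production CMA-ES: the evolution-path machinery is omitted, step size is handled with a crude fixed decay instead of CSA, and the covariance update is a simplified rank-μ blend. The `sphere` objective stands in for an FEA run.

```python
import numpy as np

rng = np.random.default_rng(42)

def toy_cma_generation(mean, sigma, C, objective, pop=8, mu=4):
    """One simplified CMA-ES generation: sample, evaluate, select,
    then update the mean and covariance. Step-size adaptation omitted."""
    # 1. SAMPLE: draw `pop` candidates from N(mean, sigma^2 * C)
    candidates = rng.multivariate_normal(mean, sigma**2 * C, size=pop)
    # 2. EVALUATE: stand-in for run_simulation()
    fitness = np.array([objective(x) for x in candidates])
    # 3. SELECT: keep the best `mu` candidates (minimization)
    selected = candidates[np.argsort(fitness)[:mu]]
    # 4. UPDATE MEAN: log-rank weights favor the best candidates
    w = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
    w /= w.sum()
    new_mean = w @ selected
    # 5. UPDATE COVARIANCE: blend old C with a rank-mu estimate
    steps = (selected - mean) / sigma
    new_C = 0.7 * C + 0.3 * steps.T @ (w[:, None] * steps)
    return new_mean, new_C

sphere = lambda x: float(np.sum(x**2))   # toy objective in place of FEA
mean, sigma, C = np.array([2.0, 2.0]), 0.4, np.eye(2)
for _ in range(30):
    mean, C = toy_cma_generation(mean, sigma, C, sphere)
    sigma *= 0.95                        # crude decay instead of real CSA
print(sphere(mean))  # far smaller than the starting value of 8.0
```

After a few dozen generations the mean settles near the optimum and the learned covariance reflects the local shape of the objective.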

---

## The Covariance Matrix Magic

Consider 4 design variables:

```
Covariance Matrix C (4x4):

        var1   var2   var3   var4
var1 [  1.0    0.3   -0.5    0.1 ]
var2 [  0.3    1.0    0.2   -0.2 ]
var3 [ -0.5    0.2    1.0    0.4 ]
var4 [  0.1   -0.2    0.4    1.0 ]
```

**Reading the matrix:**

- **Diagonal (1.0)**: Variance in each parameter
- **Off-diagonal**: Correlations between parameters
  - **Positive (0.3)**: When var1 increases, var2 should increase
  - **Negative (-0.5)**: When var1 increases, var3 should decrease

CMA-ES **learns these correlations automatically** from simulation results.

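The effect of those off-diagonal entries can be checked directly by sampling from the matrix. A small sketch using the illustrative values from the table above:

```python
import numpy as np

# The illustrative 4x4 covariance matrix from the text
C = np.array([
    [ 1.0,  0.3, -0.5,  0.1],
    [ 0.3,  1.0,  0.2, -0.2],
    [-0.5,  0.2,  1.0,  0.4],
    [ 0.1, -0.2,  0.4,  1.0],
])

rng = np.random.default_rng(0)
samples = rng.multivariate_normal(np.zeros(4), C, size=20000)

# The empirical var1-var3 correlation recovers the -0.5 entry:
# high var1 values co-occur with low var3 values.
corr_13 = np.corrcoef(samples[:, 0], samples[:, 2])[0, 1]
print(round(corr_13, 2))  # close to -0.5
```

In CMA-ES the search cloud is drawn exactly this way, so a learned negative entry means the sampler automatically trades one parameter off against the other.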
---

## CMA-ES vs TPE

| Property | TPE | CMA-ES |
|----------|-----|--------|
| **Best for** | Global exploration | Local refinement |
| **Starting point** | Random | Known baseline |
| **Correlation learning** | None (independent) | Automatic |
| **Step size** | Fixed ranges | Adaptive |
| **Dimensionality** | Good for high-D | Best for 4-10D |
| **Sample efficiency** | Good | Excellent (locally) |

---

## Optuna Configuration

```python
import optuna
from optuna.samplers import CmaEsSampler

# Baseline values (starting point)
x0 = {
    'whiffle_min': 62.75,
    'whiffle_outer_to_vertical': 75.89,
    'whiffle_triangle_closeness': 65.65,
    'blank_backface_angle': 4.43,
}

sampler = CmaEsSampler(
    x0=x0,                    # Center of the initial distribution
    sigma0=0.3,               # Initial step size (30% of range)
    seed=42,                  # Reproducibility
    restart_strategy='ipop',  # Increase population on restart
)

study = optuna.create_study(sampler=sampler, direction="minimize")

# CRITICAL: Enqueue the baseline as trial 0!
# x0 only sets the CENTER of the search; it does not evaluate the baseline.
study.enqueue_trial(x0)

# objective(trial) is defined elsewhere and runs the FEA simulation
study.optimize(objective, n_trials=200)
```

---

## Common Pitfalls

### 1. Not Evaluating the Baseline

**Problem**: CMA-ES samples AROUND x0, but doesn't evaluate x0 itself.

**Solution**: Always enqueue the baseline:

```python
if len(study.trials) == 0:
    study.enqueue_trial(x0)
```

### 2. sigma0 Too Large or Too Small

| sigma0 | Effect |
|--------|--------|
| **Too large (>0.5)** | Explores too far, misses local optimum |
| **Too small (<0.1)** | Gets stuck, slow convergence |
| **Recommended (0.2-0.3)** | Good balance for refinement |

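The "too large" failure mode is easy to visualize. A sketch assuming a parameter normalized to [0, 1] with the baseline at mid-range (Optuna's CmaEsSampler actually searches in a transformed space, so this is only a rough intuition):

```python
import numpy as np

rng = np.random.default_rng(7)
center = 0.5                       # baseline at mid-range (normalized)
waste = {}
for sigma0 in (0.1, 0.3, 0.5):
    draws = rng.normal(center, sigma0, size=100_000)
    # Fraction of proposals falling outside the feasible [0, 1] range
    waste[sigma0] = float(np.mean((draws < 0.0) | (draws > 1.0)))

for s, frac in waste.items():
    print(f"sigma0={s}: {frac:.1%} of samples outside the range")
```

With sigma0=0.5 roughly a third of first-generation samples land outside the feasible range and are wasted on clipping; sigma0=0.2-0.3 keeps nearly all samples inside while still covering the neighborhood.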
### 3. Wrong Problem Type

CMA-ES struggles with:

- Discrete/categorical variables
- Very high dimensions (>20)
- Multi-modal landscapes (use TPE first)
- Noisy objectives (add regularization)

---

## When to Use CMA-ES in Atomizer

| Scenario | Use CMA-ES? |
|----------|-------------|
| First exploration of design space | No, use TPE |
| Refining around known good design | **Yes** |
| 4-10 continuous variables | **Yes** |
| >15 variables | No, use TPE or NSGA-II |
| Need to learn variable correlations | **Yes** |
| Multi-objective optimization | No, use NSGA-II |

---

## References

- Hansen, N. (2016). *The CMA Evolution Strategy: A Tutorial*. arXiv:1604.00772
- Optuna CmaEsSampler: https://optuna.readthedocs.io/en/stable/reference/samplers/generated/optuna.samplers.CmaEsSampler.html
- cmaes Python package: https://github.com/CyberAgentAILab/cmaes

---

*Created: 2025-12-19*
*Atomizer Framework*