# Optimization Strategy — Hydrotech Beam DoE & Landscape Mapping

**Study:** `01_doe_landscape`  
**Project:** Hydrotech Beam Structural Optimization  
**Author:** ⚡ Optimizer Agent  
**Date:** 2025-02-09  
**Status:** DRAFT — Awaiting review  
**References:** [BREAKDOWN.md](../../BREAKDOWN.md), [DECISIONS.md](../../DECISIONS.md), [CONTEXT.md](../../CONTEXT.md)

---

## 1. Problem Formulation

### 1.1 Objective

$$\min_{x} \quad f(x) = \text{mass}(x) \quad [\text{kg}]$$

Single-objective minimization of total beam mass. This aligns with DEC-HB-001 (approved by Tech Lead, pending CEO confirmation).

### 1.2 Constraints

| ID | Constraint | Operator | Limit | Units | Source |
|----|-----------|----------|-------|-------|--------|
| g₁ | Tip displacement | ≤ | 10.0 | mm | NX Nastran SOL 101 — displacement sensor at beam tip |
| g₂ | Max von Mises stress | ≤ | 130.0 | MPa | NX Nastran SOL 101 — max elemental nodal VM stress |

Both are **hard constraints** — no trade-off or relaxation without CEO approval.

### 1.3 Design Variables

| ID | NX Expression | Type | Lower | Upper | Baseline | Units | Notes |
|----|---------------|------|-------|-------|----------|-------|-------|
| DV1 | `beam_half_core_thickness` | Continuous | 10 | 40 | 20 | mm | Core half-thickness; stiffness scales ~quadratically via sandwich effect |
| DV2 | `beam_face_thickness` | Continuous | 10 | 40 | 20 | mm | Face sheet thickness; primary bending stiffness contributor |
| DV3 | `holes_diameter` | Continuous | 150 | 450 | 300 | mm | Lightening hole diameter; mass ∝ d² reduction |
| DV4 | `hole_count` | **Integer** | 5 | 15 | 10 | — | Number of lightening holes; 11 discrete levels |

**Total design space:** 3 continuous × 1 integer (11 levels) = effectively 3D continuous × 11 slices.

### 1.4 Integer Handling

Per DEC-HB-003, `hole_count` is treated as a **true integer** throughout:

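- **Phase 1 (LHS):** Generate continuous LHS and round DV4 to the nearest integer, using stratified integer sampling to ensure coverage across all 11 levels. Naive rounding of a continuous [5, 15] column would give the end levels (5 and 15) only half the sampling weight of the interior levels.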
- **Phase 2 (TPE):** Optuna `IntDistribution(5, 15)` — native integer support, no rounding hacks.
- **NX rebuild:** The model requires integer hole count. Non-integer values will cause geometry failures.
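
The Phase 1 sampling idea can be sketched in plain Python. This is a minimal illustration, not the study's sampler: it uses a basic (non-maximin) LHS, the function name `stratified_lhs` is hypothetical, and the balanced integer column is one assumed way to implement "stratified integer sampling".

```python
import random

# Bounds taken from the design-variable table in section 1.3
CONTINUOUS_BOUNDS = {
    "beam_half_core_thickness": (10.0, 40.0),
    "beam_face_thickness": (10.0, 40.0),
    "holes_diameter": (150.0, 450.0),
}
HOLE_COUNT_LEVELS = list(range(5, 16))  # 11 integer levels: 5..15


def stratified_lhs(n_trials, seed=0):
    """Basic LHS for the continuous DVs plus a balanced integer column (hypothetical helper)."""
    rng = random.Random(seed)
    columns = {}
    for name, (lo, hi) in CONTINUOUS_BOUNDS.items():
        # One random point inside each of n_trials equal-width strata, then shuffle
        strata = [(i + rng.random()) / n_trials for i in range(n_trials)]
        rng.shuffle(strata)
        columns[name] = [lo + u * (hi - lo) for u in strata]
    # Cycle through the levels so every level appears, then shuffle the order
    n_levels = len(HOLE_COUNT_LEVELS)
    counts = [HOLE_COUNT_LEVELS[i % n_levels] for i in range(n_trials)]
    rng.shuffle(counts)
    columns["hole_count"] = counts
    return [{name: col[i] for name, col in columns.items()} for i in range(n_trials)]


design = stratified_lhs(50)
levels_used = sorted({point["hole_count"] for point in design})
```

With 50 trials and 11 levels, the cycling assigns 5 trials to six of the levels and 4 trials to the other five, so no level is left empty.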

### 1.5 Baseline Assessment

| Metric | Baseline Value | Constraint | Status |
|--------|---------------|------------|--------|
| Mass | ~974 kg | (minimize) | Overbuilt — room to reduce |
| Tip displacement | ~22 mm | ≤ 10 mm | ❌ **FAILS** by 120% |
| VM stress | (unknown) | ≤ 130 MPa | ⚠️ Assumed OK but unconfirmed |

> ⚠️ **Critical:** The baseline design **violates** the displacement constraint (22 mm vs 10 mm limit). The optimizer must first find the feasible region before it can meaningfully minimize mass. This shapes the entire strategy.

---

## 2. Algorithm Selection

### 2.1 Tech Lead's Recommendation

DEC-HB-002 proposes a two-phase strategy:
- **Phase 1:** Latin Hypercube Sampling (LHS) — 40–50 trials
- **Phase 2:** TPE via Optuna — 60–100 trials

### 2.2 My Assessment: **CONFIRMED with refinements**

The two-phase approach is the right call. Here's why, and what I'd adjust:

#### Why LHS → TPE is correct for this problem

| Factor | Implication | Algorithm Fit |
|--------|------------|---------------|
| 4 design variables (low-dim) | All methods work; sample efficiency less critical | Any |
| 1 integer variable | Need native mixed-type support | TPE ✓, CMA-ES ≈ (rounding) |
| Infeasible baseline | Must map feasibility BEFORE optimizing | LHS first ✓ |
| Expected significant interactions (DV1×DV2, DV3×DV4) | Need space-filling to detect interactions | LHS ✓ |
| Potentially narrow feasible region | Risk of missing it with random search | LHS gives systematic coverage ✓ |
| NX-in-the-loop (medium cost) | ~100-200 trials is budget-appropriate | TPE efficient enough ✓ |

#### What I'd modify

1. **Phase 1 budget: 50 trials (not 40).** The 10× dimensionality rule of thumb gives 40 trials as a floor for 4 variables; 50 adds margin above that floor and gives every hole_count level 4 or 5 trials under stratified integer sampling (50 / 11 ≈ 4.5).

2. **Enqueue baseline as Trial 0.** LAC critical lesson: CMA-ES doesn't evaluate x0 first. While we're using LHS (not CMA-ES), the same principle applies — **always evaluate the baseline explicitly** so we have a verified anchor point. This also validates the extractor pipeline before burning 50 trials.

3. **Phase 2 budget: 80 trials (flexible 60-100).** Start with 60, apply convergence criteria (Section 6), extend to 100 if still improving.

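4. **Seed Phase 2 from Phase 1 data.** Use Optuna's `enqueue_trial()` to warm-start TPE with the best feasible point(s) from the DoE. This avoids the TPE startup penalty (first `n_startup_trials` are random). Note that `enqueue_trial()` re-evaluates the queued parameters, so each seeded point costs one NX solve; loading the stored Phase 1 results without re-solving would need `study.add_trial()` instead.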

#### Algorithms NOT selected (and why)

| Algorithm | Why Not |
|-----------|---------|
| **CMA-ES** | Good option, but integer rounding is a hack; doesn't evaluate x0 first (LAC lesson); TPE is equally good at 4D |
| **NSGA-II** | Overkill for single-objective; population size wastes budget |
| **Surrogate + L-BFGS** | **LAC CRITICAL: Gradient descent on surrogates finds fake optima.** V5 mirror study: L-BFGS was 22% WORSE than pure TPE (WS=325 vs WS=290). V6 confirmed simple TPE beats complex surrogate methods. Do not use. |
| **SOL 200 (Nastran native)** | No integer support for hole_count; gradient-based so may miss global optimum; more NX setup effort. Keep as backup (Tech Lead's suggestion). |
| **Nelder-Mead** | No integer support; poor exploration; would miss the feasible region |

### 2.3 Final Algorithm Configuration

```
Phase 1: LHS DoE
  - Trials: 50 (+ 1 baseline = 51 total)
  - Sampling: Maximin LHS, DV4 rounded to nearest integer
  - Purpose: Landscape mapping, feasibility identification, sensitivity analysis

Phase 2: TPE Optimization
  - Trials: 60-100 (adaptive, see convergence criteria)
  - Sampler: Optuna TPESampler
  - n_startup_trials: 0 (warm-started from Phase 1 best)
  - Constraint handling: Optuna constraint interface with Deb's rules
  - Purpose: Converge to minimum-mass feasible design

Total budget: 111-151 evaluations (excluding the 3-5 validation runs in §5.1)
```

---

## 3. Constraint Handling

### 3.1 The Challenge

The baseline FAILS the displacement constraint by 120% (22 mm vs 10 mm). This means:
- A significant portion of the design space may be infeasible
- Random sampling may return few or zero feasible points
- The optimizer must navigate toward feasibility AND optimality simultaneously

### 3.2 Approach: Deb's Feasibility Rules (Constraint Domination)

For ranking solutions during optimization, use **Deb's feasibility rules** (Deb 2000):

1. **Feasible vs feasible** → compare by objective (lower mass wins)
2. **Feasible vs infeasible** → feasible always wins
3. **Infeasible vs infeasible** → lower total constraint violation wins

This is implemented via Optuna's constraint interface:

```python
def constraints(trial):
    """Return constraint values; Optuna treats a value ≤ 0 as satisfied and > 0 as violated."""
    disp = trial.user_attrs["tip_displacement"]
    stress = trial.user_attrs["max_von_mises"]
    return [
        disp - 10.0,      # ≤ 0 means displacement ≤ 10 mm
        stress - 130.0,    # ≤ 0 means stress ≤ 130 MPa
    ]
```
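
For post-processing or manual ranking outside Optuna, the three rules can be written as a small comparator. This is an illustrative sketch: the dict layout and the function names are hypothetical, and normalising each violation by its limit (so that mm and MPa can be summed) is an assumption, not something the decision record specifies.

```python
DISP_LIMIT_MM = 10.0
STRESS_LIMIT_MPA = 130.0


def total_violation(disp_mm, stress_mpa):
    """Sum of normalised constraint excesses; 0.0 means the design is feasible."""
    disp_excess = max(0.0, (disp_mm - DISP_LIMIT_MM) / DISP_LIMIT_MM)
    stress_excess = max(0.0, (stress_mpa - STRESS_LIMIT_MPA) / STRESS_LIMIT_MPA)
    return disp_excess + stress_excess


def deb_better(a, b):
    """Return True if design a beats design b under Deb's feasibility rules."""
    va = total_violation(a["disp"], a["stress"])
    vb = total_violation(b["disp"], b["stress"])
    if va == 0.0 and vb == 0.0:
        return a["mass"] < b["mass"]  # rule 1: both feasible, lower mass wins
    if va == 0.0 or vb == 0.0:
        return va == 0.0              # rule 2: feasible beats infeasible
    return va < vb                    # rule 3: smaller total violation wins


heavy_ok = {"mass": 900.0, "disp": 9.0, "stress": 100.0}
light_bad = {"mass": 500.0, "disp": 12.0, "stress": 100.0}
print(deb_better(heavy_ok, light_bad))  # the feasible design wins despite higher mass
```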

### 3.3 Why NOT Penalty Functions

| Method | Pros | Cons | Verdict |
|--------|------|------|---------|
| **Deb's rules** (selected) | No tuning params; feasible always beats infeasible; explores infeasible region for learning | Needs a `constraints_func` wired into the Optuna sampler | ✅ Best for this case |
| **Quadratic penalty** | Simple to implement | Penalty weight requires tuning; wrong weight → optimizer ignores constraint OR over-penalizes | ❌ Fragile |
| **Adaptive penalty** | Self-tuning | Complex implementation; may oscillate | ❌ Over-engineered for 4 DVs |
| **Death penalty** (reject infeasible) | Simplest | With infeasible baseline, may reject 80%+ of trials → wasted budget | ❌ Dangerous |

### 3.4 Phase 1 (DoE) Constraint Handling

During the DoE phase, **record all results without filtering.** Every trial runs, every result is stored. Infeasible points are valuable for:
- Mapping the feasibility boundary
- Training the TPE model in Phase 2
- Understanding which variables drive constraint violation

### 3.5 Constraint Margin Buffer

Consider a 5% inner margin during optimization to account for numerical noise:
- Displacement target for optimizer: ≤ 9.5 mm (vs hard limit 10.0 mm)
- Stress target for optimizer: ≤ 123.5 MPa (vs hard limit 130.0 MPa)

The hard limits remain 10 mm / 130 MPa for final validation. The buffer prevents the optimizer from converging to designs that are right on the boundary and may flip infeasible under mesh variation.
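
The buffer arithmetic and the two-tier check it implies can be sketched as below. The helper names and the three status labels are hypothetical; only the 5% margin and the hard limits come from this section.

```python
DISP_HARD_MM = 10.0
STRESS_HARD_MPA = 130.0


def buffered_limit(hard_limit, margin=0.05):
    """Tightened limit used by the optimizer; the hard limit stays for validation."""
    return hard_limit * (1.0 - margin)


DISP_TARGET_MM = buffered_limit(DISP_HARD_MM)        # about 9.5 mm
STRESS_TARGET_MPA = buffered_limit(STRESS_HARD_MPA)  # about 123.5 MPa


def classify(disp_mm, stress_mpa):
    """Label a design relative to the buffered and hard limits."""
    if disp_mm <= DISP_TARGET_MM and stress_mpa <= STRESS_TARGET_MPA:
        return "inside buffer"
    if disp_mm <= DISP_HARD_MM and stress_mpa <= STRESS_HARD_MPA:
        return "within hard limits only"
    return "infeasible"
```

A design labelled "within hard limits only" passes final validation but sits in the band the optimizer is asked to avoid.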

---

## 4. Search Space Analysis

### 4.1 Bound Reasonableness

| Variable | Range | Span | Concern |
|----------|-------|------|---------|
| DV1: half_core_thickness | 10–40 mm | 4× range | Reasonable. Lower bound = thin core, upper = thick. Stiffness-mass trade-off |
| DV2: face_thickness | 10–40 mm | 4× range | Reasonable. 10 mm face is already substantial for steel |
| DV3: holes_diameter | 150–450 mm | 3× range | ⚠️ **Needs geometric check** — see §4.2 |
| DV4: hole_count | 5–15 | 3× range | ⚠️ **Needs geometric check** — see §4.2 |

### 4.2 Geometric Feasibility: Hole Overlap Analysis

**Critical concern:** At extreme DV3 × DV4 combinations, holes may overlap or leave insufficient ligament (material between holes).

#### Overlap condition

If the beam web has usable length `L_web` and `n` holes of diameter `d` are equally spaced:

```
Spacing between hole centers = L_web / (n + 1)
Ligament between holes = spacing - d = L_web/(n+1) - d
```

For **no overlap**, we need: `L_web/(n+1) - d > 0`, i.e., `d < L_web/(n+1)`

#### Worst case: n=15 holes, d=450 mm

```
Required: L_web > (n+1) × d = 16 × 450 = 7200 mm = 7.2 m
```

If the beam is shorter than ~7.2 m, this combination is **geometrically infeasible**.

#### Minimum ligament width

For structural integrity and mesh quality, a minimum ligament of ~20-30 mm is advisable:

```
Minimum ligament constraint: L_web/(n+1) - d ≥ 30 mm
```

> ⚠️ **ACTION REQUIRED:** We need to know the usable beam web length to validate bounds. If it is below 7.2 m (or below 16 × 480 = 7680 mm once the 30 mm minimum ligament is included), either reduce max hole_count, reduce max hole_diameter, or add a geometric feasibility pre-check that skips NX evaluation for impossible geometries.
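
A geometric pre-check along these lines would only need the ligament formula from above. The sketch below assumes equally spaced holes as in the overlap condition; the function names are hypothetical, and the 5000 mm web length in the usage line is a placeholder, since the real value is still an open item.

```python
def ligament_mm(l_web_mm, n_holes, d_mm):
    """Material left between adjacent holes: L_web / (n + 1) - d."""
    return l_web_mm / (n_holes + 1) - d_mm


def is_buildable(l_web_mm, n_holes, d_mm, min_ligament_mm=30.0):
    """True if the hole pattern keeps at least the minimum ligament."""
    return ligament_mm(l_web_mm, n_holes, d_mm) >= min_ligament_mm


def min_web_length_mm(n_holes, d_mm, min_ligament_mm=0.0):
    """Shortest web length for which the pattern fits: (n + 1) * (d + ligament)."""
    return (n_holes + 1) * (d_mm + min_ligament_mm)


# Worst case from the text: 15 holes of 450 mm
print(min_web_length_mm(15, 450.0))        # 7200.0 (bare overlap limit)
print(min_web_length_mm(15, 450.0, 30.0))  # 7680.0 (with 30 mm ligament)

# Placeholder web length, for illustration only
print(is_buildable(5000.0, 10, 300.0))
```

Running such a check before each trial would let the run script skip NX entirely for patterns that cannot be built.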

### 4.3 Hole-to-Web-Height Ratio

The hole diameter must also fit within the web height. If web height = 2 × half_core_thickness + 2 × face_thickness (approximate):

```
At minimum DV1=10, DV2=10: web_height ≈ 2×10 + 2×10 = 40 mm → max hole = 40 mm
```

But holes_diameter goes up to 450 mm — this suggests the web height is substantially larger than what the parametric cross-section variables alone define, OR the holes are in a different part of the geometry (e.g., a wider flange region or a tall web independent of core/face dimensions).

> ⚠️ **ACTION REQUIRED:** Clarify the geometric relationship between DV1/DV2 and the web where holes are placed. The holes may be in a different structural member than the sandwich faces.

### 4.4 Expected Feasible Region

Based on the physics (Tech Lead's analysis §1.2 and §1.3):

| To reduce displacement (currently 22→10 mm) | Effect on mass |
|----------------------------------------------|---------------|
| ↑ DV1 (thicker core) | ↑ mass (but stiffness scales ~quadratically with core thickness, mass only ~linearly) → **efficient** |
| ↑ DV2 (thicker face) | ↑ mass (direct) |
| ↓ DV3 (smaller holes) | ↑ mass (more web material) |
| ↓ DV4 (fewer holes) | ↑ mass (more web material) |

**Prediction:** The feasible region (displacement ≤ 10 mm) likely requires:
- DV1 in upper range (25-40 mm) — the sandwich effect is the most mass-efficient stiffness lever
- DV2 moderate (15-30 mm) — thicker faces help stiffness but cost mass directly
- DV3 and DV4 constrained by stress — large/many holes save mass but increase stress

The optimizer should find a "sweet spot" where core thickness provides stiffness, and holes are sized to save mass without violating stress limits.

### 4.5 Estimated Design Space Volume

- DV1: 30 mm span (continuous)
- DV2: 30 mm span (continuous)
- DV3: 300 mm span (continuous)
- DV4: 11 integer levels

Total configurations: effectively infinite (3 continuous), but the integer dimension creates 11 "slices" of the space. With 50 DoE trials, we get ~4-5 trials per slice — sufficient for trend identification.

---

## 5. Trial Budget & Compute Estimate

### 5.1 Budget Breakdown

| Phase | Trials | Purpose |
|-------|--------|---------|
| **Trial 0** | 1 | Baseline validation (enqueued) |
| **Phase 1: LHS DoE** | 50 | Landscape mapping, feasibility, sensitivity |
| **Phase 2: TPE** | 60–100 | Directed optimization |
| **Validation** | 3–5 | Confirm optimum, check mesh sensitivity |
| **Total** | **114–156** | |

### 5.2 Compute Time Estimate

| Parameter | Estimate | Notes |
|-----------|----------|-------|
| DOF count | 10K–100K | Steel beam, SOL 101 |
| Single solve time | 30s–3min | Depends on mesh density |
| Model rebuild time | 10–30s | NX parametric update + remesh |
| Total per trial | 1–4 min | Rebuild + solve + extraction |
| Phase 1 (51 trials) | 1–3.5 hrs | |
| Phase 2 (60–100 trials) | 1–7 hrs | |
| **Total compute** | **2–10 hrs** | Likely ~4–5 hrs |
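
The trial and time totals can be reproduced with a few lines of arithmetic. This sketch uses only the ranges from the tables in §5.1 and §5.2; the variable names are hypothetical. It includes the validation runs, so its upper bound lands slightly above the rounded 10 hrs in the table.

```python
# (min, max) trial counts per phase, from the budget table
PHASES = {
    "baseline": (1, 1),
    "doe": (50, 50),
    "tpe": (60, 100),
    "validation": (3, 5),
}
MINUTES_PER_TRIAL = (1.0, 4.0)  # rebuild + solve + extraction


def total_trials(phases):
    """Sum the lower and upper trial counts over all phases."""
    low = sum(lo for lo, _ in phases.values())
    high = sum(hi for _, hi in phases.values())
    return low, high


def compute_hours(trials_low, trials_high, minutes=MINUTES_PER_TRIAL):
    """Best case: fewest trials at fastest rate. Worst case: most trials at slowest rate."""
    return trials_low * minutes[0] / 60.0, trials_high * minutes[1] / 60.0


trials = total_trials(PHASES)
hours = compute_hours(*trials)
print(trials)
print(tuple(round(h, 1) for h in hours))
```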

### 5.3 Budget Justification

For 4 design variables, rule-of-thumb budgets:
- **Minimum viable:** 10 × n_vars = 40 trials (DoE only)
- **Standard:** 25 × n_vars = 100 trials (DoE + optimization)
- **Thorough:** 50 × n_vars = 200 trials (with validation)

Our budget of 114–156 falls in the **standard-to-thorough** range. Appropriate for a first study where we're mapping an unknown landscape with an infeasible baseline.

---

## 6. Convergence Criteria

### 6.1 Phase 1 (DoE) — No Convergence Criteria

The DoE runs all 50 planned trials. It's not iterative — it's a one-shot space-filling design. Stop conditions:
- All 50 trials complete (or fail with documented errors)
- **Early abort:** If >80% of trials fail to solve (NX crashes), stop and investigate

### 6.2 Phase 2 (TPE) — Convergence Criteria

| Criterion | Threshold | Action |
|-----------|-----------|--------|
| **Improvement stall** | Best feasible objective unchanged for 20 consecutive trials | Consider stopping |
| **Relative improvement** | < 1% improvement over last 20 trials | Consider stopping |
| **Budget exhausted** | 100 trials completed in Phase 2 | Hard stop |
| **Perfect convergence** | Multiple feasible trials from different regions reach masses within 0.5% of each other | Confident optimum found |
| **Minimum budget** | Always run at least 60 trials in Phase 2 | Ensures adequate exploration |

### 6.3 Decision Logic

```
After 60 Phase 2 trials:
  IF no feasible solution found → STOP, escalate (see §7.1)
  ELSE IF best_feasible improved by >2% in last 20 trials → continue to 80
  ELSE → assess convergence, decide 80 or 100

After 80 Phase 2 trials:
  IF still improving >1% per 20 trials → continue to 100
  ELSE → STOP, declare converged

After 100 Phase 2 trials:
  HARD STOP regardless
```
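
The checkpoint logic above could be coded as a small function evaluated at 60, 80, and 100 trials. This is a sketch with assumed conventions: `best_history[i]` holds the best feasible mass after Phase 2 trial `i + 1` (or `None` if nothing feasible has been found yet), and the returned strings are hypothetical labels.

```python
def relative_gain(best_history, window=20):
    """Fractional improvement of the best feasible mass over the last `window` trials."""
    new = best_history[-1]
    old = best_history[-window - 1]
    if new is None:
        return 0.0
    if old is None:
        return 1.0  # first feasible design appeared inside the window
    return (old - new) / old


def decide(best_history):
    """Apply the checkpoint rules; call after 60, 80, and 100 Phase 2 trials."""
    n = len(best_history)
    if n < 60:
        return "continue"  # minimum budget not reached
    if n >= 100:
        return "stop: budget exhausted"
    if best_history[-1] is None:
        return "stop: escalate, no feasible design"
    gain = relative_gain(best_history)
    if n < 80:
        return "continue to 80" if gain > 0.02 else "assess"
    return "continue to 100" if gain > 0.01 else "stop: converged"


# Example: mass dropped from 1000 to 900 kg within the last 20 of 60 trials
print(decide([1000.0] * 40 + [900.0] * 20))
```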

### 6.4 Phase 1 → Phase 2 Gate

Before starting Phase 2, review DoE results:

| Check | Action if FAIL |
|-------|---------------|
| At least 5 feasible points found | If 0 feasible: expand bounds or relax constraints (escalate to CEO) |
| NX solve success rate > 80% | If <80%: investigate failures, fix model, re-run failed trials |
| No systematic NX crashes at bounds | If crashes: tighten bounds away from failure region |
| Sensitivity trends visible | If flat: check extractors, may be reading wrong output |
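
The two quantitative gate checks (feasible count and solve success rate) are easy to automate; the other two need engineering judgment. A minimal sketch follows, assuming each DoE record is a dict with a `solved` flag plus `disp` and `stress` values when solved. That record layout is hypothetical.

```python
DISP_LIMIT_MM = 10.0
STRESS_LIMIT_MPA = 130.0


def phase1_gate(trials, min_feasible=5, min_success_rate=0.80):
    """Return a list of gate failures; an empty list means Phase 2 may start."""
    solved = [t for t in trials if t["solved"]]
    feasible = [
        t for t in solved
        if t["disp"] <= DISP_LIMIT_MM and t["stress"] <= STRESS_LIMIT_MPA
    ]
    success_rate = len(solved) / len(trials)
    issues = []
    if len(feasible) < min_feasible:
        issues.append(f"only {len(feasible)} feasible points (need {min_feasible})")
    if success_rate < min_success_rate:
        issues.append(f"solve success rate {success_rate:.0%} is below {min_success_rate:.0%}")
    return issues


# Illustrative records only: 6 feasible, 40 infeasible, 5 failed solves
example = (
    [{"solved": True, "disp": 9.0, "stress": 100.0}] * 6
    + [{"solved": True, "disp": 14.0, "stress": 100.0}] * 40
    + [{"solved": False}] * 5
)
print(phase1_gate(example))
```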

---

## 7. Risk Mitigation

### 7.1 Risk: Feasible Region is Empty

**Likelihood: Medium** (baseline fails displacement by 120%)

**Detection:** After Phase 1, zero feasible points found.

**Mitigation ladder:**
1. **Check the data** — Are extractors reading correctly? Validate against manual NX check.
2. **Examine near-feasible** — Find the trial closest to feasibility. How far off? If displacement = 10.5 mm, we're close. If displacement = 18 mm, we have a problem.
3. **Targeted exploration** — Run additional trials at extreme stiffness (max DV1, max DV2, min DV3, min DV4). If even the stiffest/heaviest design fails, the constraint is physically impossible with this geometry.
4. **Constraint relaxation** — Propose to CEO: relax displacement to 12 or 15 mm. Document the mass-displacement Pareto front from DoE data to support the discussion.
5. **Geometric redesign** — If the problem is fundamentally infeasible, the beam geometry needs redesign (out of optimization scope).

### 7.2 Risk: NX Crashes at Parameter Extremes

**Likelihood: Medium** (LAC: `rib_thickness` had an undocumented CAD constraint at 9 mm, causing a 34% failure rate in V13)

**Detection:** Solver returns no results for certain parameter combinations.

**Mitigation:**
1. **Pre-flight corner tests** — Before Phase 1, manually test the 16 corners of the design space (2⁴ combinations of min/max for each variable). This catches geometric rebuild failures early.
2. **Error-handling in run script** — Every trial must catch exceptions and log:
   - NX rebuild failure (geometry Boolean crash)
   - Meshing failure (degenerate elements)
   - Solver failure (singularity, divergence)
   - Extraction failure (missing result)
3. **Infeasible-by-default** — If a trial fails for any reason, record it as infeasible with maximum constraint violation (displacement=9999, stress=9999). This lets Deb's rules naturally steer away from crashing regions.
4. **NEVER kill NX processes directly** (LAC CRITICAL RULE). Use `NXSessionManager.close_nx_if_allowed()` only. If NX hangs, implement a timeout (e.g., 10 min per trial) and let NX time out gracefully.
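
Items 1 to 3 can be sketched together: enumerate the 16 corners, then wrap every evaluation so that any failure is recorded as infeasible rather than aborting the study. The `evaluate` callable and `always_crash` are hypothetical placeholders for the real rebuild, mesh, solve, and extraction chain, and the timeout from item 4 is not covered here.

```python
import itertools

BOUNDS = {
    "beam_half_core_thickness": (10.0, 40.0),
    "beam_face_thickness": (10.0, 40.0),
    "holes_diameter": (150.0, 450.0),
    "hole_count": (5, 15),
}


def corner_points(bounds):
    """All min/max combinations of the design variables (2^4 = 16 here)."""
    names = list(bounds)
    combos = itertools.product(*(bounds[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def run_trial_safe(evaluate, params):
    """Run one evaluation; on any exception, record the trial as infeasible."""
    try:
        result = dict(evaluate(params))
        result["status"] = "ok"
    except Exception as exc:  # rebuild, meshing, solver, or extraction failure
        result = {
            "mass": None,  # no mass is available for a failed trial
            "tip_displacement": 9999.0,
            "max_von_mises": 9999.0,
            "status": "failed",
            "error": f"{type(exc).__name__}: {exc}",
        }
    return result


def always_crash(params):
    """Hypothetical evaluator that simulates a geometry rebuild failure."""
    raise RuntimeError("geometry Boolean crash")


corners = corner_points(BOUNDS)
demo = run_trial_safe(always_crash, corners[0])
```

Logging `demo["error"]` alongside the parameters gives the failure map needed to tighten bounds after the corner tests.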

### 7.3 Risk: Mesh-Dependent Stress Results

**Likelihood: Medium** (stress at hole edges is mesh-sensitive)

**Mitigation:**
1. **Mesh convergence pre-study** — Run baseline at 3 mesh densities. If stress varies >10%, refine mesh or use stress averaging region.
2. **Consistent mesh controls** — Ensure NX applies the same mesh size/refinement strategy regardless of parameter values. The parametric model should have mesh controls tied to hole geometry.
3. **Stress extraction method** — Use elemental nodal stress (conservative) per LAC success pattern. Note: pyNastran returns stress in kPa for NX kg-mm-s unit system — **divide by 1000 for MPa**.
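
The unit conversion in item 3 follows from the kg-mm-s system: force comes out in kg·mm/s² = 10⁻³ N, and dividing by mm² = 10⁻⁶ m² gives 10³ Pa, i.e. kPa. A minimal sketch of the conversion step is shown below; the function names are hypothetical and the input list stands in for whatever array the extractor reads from the results file.

```python
KPA_PER_MPA = 1000.0


def kpa_to_mpa(stress_kpa):
    """Convert a stress from kPa (kg-mm-s model units) to MPa."""
    return stress_kpa / KPA_PER_MPA


def max_von_mises_mpa(stresses_kpa):
    """Largest von Mises value in the input, reported in MPa."""
    return kpa_to_mpa(max(stresses_kpa))


# Illustrative values only: the second entry sits exactly on the 130 MPa limit
print(max_von_mises_mpa([95000.0, 130000.0, 42000.0]))
```

Doing the conversion once, inside the extractor, keeps the constraint check and the reports in the same unit as the 130 MPa limit.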

### 7.4 Risk: Surrogate Temptation

**Mitigation: DON'T DO IT (yet).**

LAC lessons from the M1 Mirror project are unequivocal:
- V5 surrogate + L-BFGS was 22% **worse** than V6 pure TPE
- MLP surrogates have smooth gradients everywhere → L-BFGS descends to fake optima outside training distribution
- No uncertainty quantification = no way to detect out-of-distribution predictions

With only 4 variables and affordable FEA (~2 min/trial), direct FEA evaluation via TPE is both simpler and more reliable. Surrogate methods should only be considered if:
- FEA solve time exceeds 30 minutes per trial, AND
- We have 100+ validated training points, AND
- We use ensemble surrogates with uncertainty quantification (SYS_16 protocol)

### 7.5 Risk: Study Corruption

**Mitigation:** LAC CRITICAL — **Always copy working studies, never rewrite from scratch.**

- Phase 2 study will be created by **copying** the Phase 1 study directory and adding optimization logic
- Never modify `run_optimization.py` in-place for a new phase — copy to a new version
- Git-commit the study directory after each phase completion

---

## 8. AtomizerSpec Draft

See [`atomizer_spec_draft.json`](./atomizer_spec_draft.json) for the full JSON config.

### 8.1 Key Configuration Decisions

| Setting | Value | Rationale |
|---------|-------|-----------|
| `algorithm.phase1.type` | `LHS` | Space-filling DoE for landscape mapping |
| `algorithm.phase2.type` | `TPE` | Native mixed-integer, sample-efficient, LAC-proven |
| `hole_count.type` | `integer` | DEC-HB-003: true integer, no rounding |
| `constraint_handling` | `deb_feasibility_rules` | Best for infeasible baseline |
| `baseline_trial` | `enqueued` | LAC lesson: always validate baseline first |
| `penalty_config.method` | `deb_rules` | No penalty weight tuning needed |

### 8.2 Extractor Requirements

| ID | Type | Output | Source | Notes |
|----|------|--------|--------|-------|
| `ext_001` | `expression` | `mass` | NX expression `p173` | Direct read from NX |
| `ext_002` | `displacement` | `tip_displacement` | SOL 101 result sensor or .op2 parse | ⚠️ Need sensor setup or node ID |
| `ext_003` | `stress` | `max_von_mises` | SOL 101 elemental nodal | kPa → MPa conversion needed |

### 8.3 Open Items for Spec Finalization

Before this spec can be promoted from `_draft` to production:

1. **Beam web length** — Required to validate DV3 × DV4 geometric feasibility
2. **Displacement extraction method** — Sensor in .sim, or node ID for .op2 parsing?
3. **Stress extraction scope** — Whole model max, or specific element group?
4. **NX expression names confirmed** — Verify `p173` is mass, confirm displacement/stress expression names
5. **Solver runtime benchmark** — Time one SOL 101 run to refine compute estimates
6. **Corner test results** — Validate model rebuilds at all 16 bound corners

---

## 9. Execution Plan Summary

```
┌─────────────────────────────────────────────────────────────────┐
│                    HYDROTECH BEAM OPTIMIZATION                  │
│                    Study: 01_doe_landscape                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  PRE-FLIGHT (before any trials)                                 │
│  ├── Validate baseline: run Trial 0, verify mass/disp/stress   │
│  ├── Corner tests: 16 extreme combinations, check NX rebuilds  │
│  ├── Mesh convergence: 3 density levels at baseline             │
│  └── Confirm extractors: mass, displacement, stress pipelines   │
│                                                                 │
│  PHASE 1: DoE LANDSCAPE (51 trials)                             │
│  ├── Trial 0: Baseline (enqueued)                               │
│  ├── Trials 1-50: LHS with integer rounding for hole_count     │
│  ├── Analysis: sensitivity, interaction, feasibility mapping    │
│  └── GATE: ≥5 feasible? NX success >80%? Proceed/escalate      │
│                                                                 │
│  PHASE 2: TPE OPTIMIZATION (60-100 trials)                      │
│  ├── Warm-start from best Phase 1 feasible point(s)            │
│  ├── Deb's feasibility rules for constraint handling            │
│  ├── Convergence check every 20 trials                          │
│  └── Hard stop at 100 trials                                    │
│                                                                 │
│  VALIDATION (3-5 trials)                                        │
│  ├── Re-run best design to confirm repeatability                │
│  ├── Perturb ±5% on each variable to check sensitivity          │
│  └── Document final design with full NX results                 │
│                                                                 │
│  TOTAL: 114-156 NX evaluations | ~4-5 hours compute            │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

---

## 10. LAC Lessons Incorporated

| LAC Lesson | Source | How Applied |
|------------|--------|-------------|
| CMA-ES doesn't evaluate x0 first | Mirror V7 failure | Baseline enqueued as Trial 0 for both phases |
| Surrogate + L-BFGS = fake optima | Mirror V5 failure | No surrogates in this study; direct FEA only |
| Never kill NX processes directly | Dec 2025 incident | Timeout-based error handling; NXSessionManager only |
| Copy working studies, never rewrite | Mirror V5 failure | Phase 2 created by copying Phase 1 |
| pyNastran stress in kPa | Support arm success | Extractor divides by 1000 for MPa |
| CAD constraints can limit bounds | Mirror V13 (rib_thickness) | Pre-flight corner tests before DoE |
| Always include README.md | Repeated failures (Dec 2025, Jan 2026) | README.md created with study |
| Simple beats complex (TPE > surrogate) | Mirror V6 vs V5 | TPE selected over surrogate-based methods |

---

*⚡ Optimizer — The algorithm is the strategy.*