# Optimization Strategy: Hydrotech Beam DoE & Landscape Mapping

**Study:** `01_doe_landscape`  
**Project:** Hydrotech Beam Structural Optimization  
**Author:** ⚡ Optimizer Agent  
**Date:** 2025-02-09  
**Status:** DRAFT (awaiting review)  
**References:** [BREAKDOWN.md](../../BREAKDOWN.md), [DECISIONS.md](../../DECISIONS.md), [CONTEXT.md](../../CONTEXT.md)

---

## 1. Problem Formulation

### 1.1 Objective

$$\min_{x} \quad f(x) = \text{mass}(x) \quad [\text{kg}]$$

Single-objective minimization of total beam mass. This aligns with DEC-HB-001 (approved by Tech Lead, pending CEO confirmation).

### 1.2 Constraints

| ID | Constraint | Operator | Limit | Units | Source |
|----|-----------|----------|-------|-------|--------|
| g₁ | Tip displacement | ≤ | 10.0 | mm | NX Nastran SOL 101, displacement sensor at beam tip |
| g₂ | Max von Mises stress | ≤ | 130.0 | MPa | NX Nastran SOL 101, max elemental nodal VM stress |

Both are **hard constraints**: no trade-off or relaxation without CEO approval.

### 1.3 Design Variables

| ID | NX Expression | Type | Lower | Upper | Baseline | Units | Notes |
|----|---------------|------|-------|-------|----------|-------|-------|
| DV1 | `beam_half_core_thickness` | Continuous | 10 | 40 | 20 | mm | Core half-thickness; stiffness scales ~quadratically via sandwich effect |
| DV2 | `beam_face_thickness` | Continuous | 10 | 40 | 20 | mm | Face sheet thickness; primary bending stiffness contributor |
| DV3 | `holes_diameter` | Continuous | 150 | 450 | 300 | mm | Lightening hole diameter; mass ∝ d² reduction |
| DV4 | `hole_count` | **Integer** | 5 | 15 | 10 | - | Number of lightening holes; 11 discrete levels |

**Total design space:** 3 continuous × 1 integer (11 levels) = effectively 3D continuous × 11 slices.

### 1.4 Integer Handling

Per DEC-HB-003, `hole_count` is treated as a **true integer** throughout:

- **Phase 1 (LHS):** Generate continuous LHS, round DV4 to nearest integer.
  Use stratified integer sampling to ensure coverage across all 11 levels.
- **Phase 2 (TPE):** Optuna `IntDistribution(5, 15)` gives native integer support, no rounding hacks.
- **NX rebuild:** The model requires an integer hole count. Non-integer values will cause geometry failures.

### 1.5 Baseline Assessment

| Metric | Baseline Value | Constraint | Status |
|--------|---------------|------------|--------|
| Mass | ~974 kg | (minimize) | Overbuilt, room to reduce |
| Tip displacement | ~22 mm | ≤ 10 mm | ❌ **FAILS** by 120% |
| VM stress | (unknown) | ≤ 130 MPa | ⚠️ Assumed OK but unconfirmed |

> ⚠️ **Critical:** The baseline design **violates** the displacement constraint (22 mm vs 10 mm limit). The optimizer must first find the feasible region before it can meaningfully minimize mass. This shapes the entire strategy.

---

## 2. Algorithm Selection

### 2.1 Tech Lead's Recommendation

DEC-HB-002 proposes a two-phase strategy:

- **Phase 1:** Latin Hypercube Sampling (LHS), 40–50 trials
- **Phase 2:** TPE via Optuna, 60–100 trials

### 2.2 My Assessment: **CONFIRMED with refinements**

The two-phase approach is the right call. Here's why, and what I'd adjust:

#### Why LHS → TPE is correct for this problem

| Factor | Implication | Algorithm Fit |
|--------|------------|---------------|
| 4 design variables (low-dim) | All methods work; sample efficiency less critical | Any |
| 1 integer variable | Need native mixed-type support | TPE ✓, CMA-ES ≈ (rounding) |
| Infeasible baseline | Must map feasibility BEFORE optimizing | LHS first ✓ |
| Expected significant interactions (DV1×DV2, DV3×DV4) | Need space-filling to detect interactions | LHS ✓ |
| Potentially narrow feasible region | Risk of missing it with random search | LHS gives systematic coverage ✓ |
| NX-in-the-loop (medium cost) | ~100-200 trials is budget-appropriate | TPE efficient enough ✓ |

#### What I'd modify

1. **Phase 1 budget: 50 trials (not 40).** With 4 variables, we want at least 10× the dimensionality (40 trials) for a reliable DoE. 50 trials adds margin and gives ≈4-5 trials per `hole_count` level (50 / 11 ≈ 4.5) under stratified integer sampling.
2. **Enqueue baseline as Trial 0.** LAC critical lesson: CMA-ES doesn't evaluate x0 first. While we're using LHS (not CMA-ES), the same principle applies: **always evaluate the baseline explicitly** so we have a verified anchor point. This also validates the extractor pipeline before burning 50 trials.
3. **Phase 2 budget: 80 trials (flexible 60-100).** Start with 60, apply convergence criteria (Section 6), extend to 100 if still improving.
4. **Seed Phase 2 from Phase 1 data.** Use Optuna's `enqueue_trial()` to warm-start TPE with the best feasible point(s) from the DoE. This avoids the TPE startup penalty (first `n_startup_trials` are random). Note that `enqueue_trial()` re-evaluates the enqueued points; if the full Phase 1 history (including infeasible points, see §3.4) should inform TPE without new NX solves, the completed trials can be loaded with `study.add_trials()` instead.

#### Algorithms NOT selected (and why)

| Algorithm | Why Not |
|-----------|---------|
| **CMA-ES** | Good option, but integer rounding is a hack; doesn't evaluate x0 first (LAC lesson); TPE is equally good at 4D |
| **NSGA-II** | Overkill for single-objective; population size wastes budget |
| **Surrogate + L-BFGS** | **LAC CRITICAL: Gradient descent on surrogates finds fake optima.** V5 mirror study: L-BFGS was 22% WORSE than pure TPE (WS=325 vs WS=290). V6 confirmed simple TPE beats complex surrogate methods. Do not use. |
| **SOL 200 (Nastran native)** | No integer support for hole_count; gradient-based so may miss global optimum; more NX setup effort. Keep as backup (Tech Lead's suggestion). |
| **Nelder-Mead** | No integer support; poor exploration; would miss the feasible region |

### 2.3 Final Algorithm Configuration

```
Phase 1: LHS DoE
  - Trials: 50 (+ 1 baseline = 51 total)
  - Sampling: Maximin LHS, DV4 rounded to nearest integer
  - Purpose: Landscape mapping, feasibility identification, sensitivity analysis

Phase 2: TPE Optimization
  - Trials: 60-100 (adaptive, see convergence criteria)
  - Sampler: Optuna TPESampler
  - n_startup_trials: 0 (warm-started from Phase 1 best)
  - Constraint handling: Optuna constraint interface with Deb's rules
  - Purpose: Converge to minimum-mass feasible design

Total budget: 111-151 evaluations (excluding validation runs, see §5.1)
```

---

## 3. Constraint Handling

### 3.1 The Challenge

The baseline FAILS the displacement constraint by 120% (22 mm vs 10 mm). This means:

- A significant portion of the design space may be infeasible
- Random sampling may return few or zero feasible points
- The optimizer must navigate toward feasibility AND optimality simultaneously

### 3.2 Approach: Deb's Feasibility Rules (Constraint Domination)

For ranking solutions during optimization, use **Deb's feasibility rules** (Deb 2000):

1. **Feasible vs feasible** → compare by objective (lower mass wins)
2. **Feasible vs infeasible** → feasible always wins
3. **Infeasible vs infeasible** → lower total constraint violation wins

This is implemented via Optuna's constraint interface:

```python
DISP_LIMIT_MM = 10.0      # g1: tip displacement limit
STRESS_LIMIT_MPA = 130.0  # g2: max von Mises stress limit


def constraints(trial):
    """Return constraint values for Optuna; a value <= 0 means that constraint is satisfied.

    Expects the objective to have stored both results via trial.set_user_attr().
    """
    disp = trial.user_attrs["tip_displacement"]  # mm
    stress = trial.user_attrs["max_von_mises"]   # MPa
    return [
        disp - DISP_LIMIT_MM,       # <= 0 means displacement <= 10 mm
        stress - STRESS_LIMIT_MPA,  # <= 0 means stress <= 130 MPa
    ]
```

### 3.3 Why NOT Penalty Functions

| Method | Pros | Cons | Verdict |
|--------|------|------|---------|
| **Deb's rules** (selected) | No tuning params; feasible always beats infeasible; explores infeasible region for learning | Requires custom Optuna integration | ✅ Best for this case |
| **Quadratic penalty** | Simple to implement | Penalty weight requires tuning; wrong weight → optimizer ignores constraint OR over-penalizes | ❌ Fragile |
| **Adaptive penalty** | Self-tuning | Complex implementation; may oscillate | ❌ Over-engineered for 4 DVs |
| **Death penalty** (reject infeasible) | Simplest | With infeasible baseline, may reject 80%+ of trials → wasted budget | ❌ Dangerous |

### 3.4 Phase 1 (DoE) Constraint Handling

During the DoE phase, **record all results without filtering.** Every trial runs, every result is stored. Infeasible points are valuable for:

- Mapping the feasibility boundary
- Training the TPE model in Phase 2
- Understanding which variables drive constraint violation

### 3.5 Constraint Margin Buffer

Consider a 5% inner margin during optimization to account for numerical noise:

- Displacement target for optimizer: ≤ 9.5 mm (vs hard limit 10.0 mm)
- Stress target for optimizer: ≤ 123.5 MPa (vs hard limit 130.0 MPa)

The hard limits remain 10 mm / 130 MPa for final validation. The buffer prevents the optimizer from converging to designs that sit right on the boundary and may flip infeasible under mesh variation.

---

## 4. Search Space Analysis

### 4.1 Bound Reasonableness

| Variable | Range | Span | Concern |
|----------|-------|------|---------|
| DV1: half_core_thickness | 10–40 mm | 4× range | Reasonable. Lower bound = thin core, upper = thick. Stiffness-mass trade-off |
| DV2: face_thickness | 10–40 mm | 4× range | Reasonable. 10 mm face is already substantial for steel |
| DV3: holes_diameter | 150–450 mm | 3× range | ⚠️ **Needs geometric check**, see §4.2 |
| DV4: hole_count | 5–15 | 3× range | ⚠️ **Needs geometric check**, see §4.2 |

### 4.2 Geometric Feasibility: Hole Overlap Analysis

**Critical concern:** At extreme DV3 × DV4 combinations, holes may overlap or leave insufficient ligament (material between holes).

#### Overlap condition

If the beam web has usable length `L_web` and `n` holes of diameter `d` are equally spaced:

```
Spacing between hole centers = L_web / (n + 1)
Ligament between holes       = spacing - d = L_web/(n+1) - d
```

For **no overlap**, we need: `L_web/(n+1) - d > 0`, i.e., `d < L_web/(n+1)`

#### Worst case: n=15 holes, d=450 mm

```
Required: L_web > (n+1) × d = 16 × 450 = 7200 mm = 7.2 m
```

If the beam is shorter than ~7.2 m, this combination is **geometrically infeasible**.

#### Minimum ligament width

For structural integrity and mesh quality, a minimum ligament of ~20-30 mm is advisable:

```
Minimum ligament constraint: L_web/(n+1) - d ≥ 30 mm
```

> ⚠️ **ACTION REQUIRED:** We need to know the beam web length to validate bounds. If beam length < 7.2 m, either reduce max hole_count, reduce max hole_diameter, or add a geometric feasibility pre-check that skips NX evaluation for impossible geometries.

### 4.3 Hole-to-Web-Height Ratio

The hole diameter must also fit within the web height.
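
The ligament condition from §4.2 and the web-height fit just stated could be combined into a cheap pre-check that runs before any NX rebuild. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the strict `d < h_web` test ignores any edge margin, and the web length and height values are placeholders because both dimensions are still open items.

```python
def ligament_mm(l_web_mm: float, n_holes: int, d_mm: float) -> float:
    """Material left between adjacent holes: L_web/(n+1) - d (spacing model from §4.2)."""
    return l_web_mm / (n_holes + 1) - d_mm


def geometry_ok(l_web_mm: float, h_web_mm: float, n_holes: int, d_mm: float,
                min_ligament_mm: float = 30.0) -> bool:
    """True if the holes keep the minimum ligament and fit inside the web height."""
    fits_length = ligament_mm(l_web_mm, n_holes, d_mm) >= min_ligament_mm
    fits_height = d_mm < h_web_mm  # assumption: no extra edge margin required
    return fits_length and fits_height


# Placeholder dimensions only: the real web length and height are not yet known.
L_WEB_MM = 7200.0  # hypothetical; equals the worst-case threshold derived in §4.2
H_WEB_MM = 500.0   # hypothetical

print(ligament_mm(L_WEB_MM, 15, 450.0))            # ligament at n=15, d=450 mm
print(geometry_ok(L_WEB_MM, H_WEB_MM, 15, 450.0))  # worst-case corner
print(geometry_ok(L_WEB_MM, H_WEB_MM, 10, 300.0))  # baseline hole pattern
```

A trial that fails such a check could be logged as infeasible without launching NX, which would keep impossible geometries from consuming solver budget.
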
If web height = 2 × half_core_thickness + 2 × face_thickness (approximate):

```
At minimum DV1=10, DV2=10: web_height ≈ 2×10 + 2×10 = 40 mm → max hole = 40 mm
```

But holes_diameter goes up to 450 mm. This suggests the web height is substantially larger than what the parametric cross-section variables alone define, OR the holes are in a different part of the geometry (e.g., a wider flange region or a tall web independent of core/face dimensions).

> ⚠️ **ACTION REQUIRED:** Clarify the geometric relationship between DV1/DV2 and the web where holes are placed. The holes may be in a different structural member than the sandwich faces.

### 4.4 Expected Feasible Region

Based on the physics (Tech Lead's analysis §1.2 and §1.3):

| To reduce displacement (currently 22 mm, target ≤ 10 mm) | Effect on mass |
|----------------------------------------------|---------------|
| ↑ DV1 (thicker core) | ↑ mass (but stiffness scales ~quadratically with core thickness, mass only ~linearly) → **efficient** |
| ↑ DV2 (thicker face) | ↑ mass (direct) |
| ↓ DV3 (smaller holes) | ↑ mass (more web material) |
| ↓ DV4 (fewer holes) | ↑ mass (more web material) |

**Prediction:** The feasible region (displacement ≤ 10 mm) likely requires:

- DV1 in upper range (25-40 mm): the sandwich effect is the most mass-efficient stiffness lever
- DV2 moderate (15-30 mm): thicker faces help stiffness but cost mass directly
- DV3 and DV4 constrained by stress: large/many holes save mass but increase stress

The optimizer should find a "sweet spot" where core thickness provides stiffness, and holes are sized to save mass without violating stress limits.

### 4.5 Estimated Design Space Volume

- DV1: 30 mm span (continuous)
- DV2: 30 mm span (continuous)
- DV3: 300 mm span (continuous)
- DV4: 11 integer levels

Total configurations: effectively infinite (3 continuous), but the integer dimension creates 11 "slices" of the space. With 50 DoE trials, we get ~4-5 trials per slice, which is sufficient for trend identification.

---

## 5. Trial Budget & Compute Estimate

### 5.1 Budget Breakdown

| Phase | Trials | Purpose |
|-------|--------|---------|
| **Trial 0** | 1 | Baseline validation (enqueued) |
| **Phase 1: LHS DoE** | 50 | Landscape mapping, feasibility, sensitivity |
| **Phase 2: TPE** | 60–100 | Directed optimization |
| **Validation** | 3–5 | Confirm optimum, check mesh sensitivity |
| **Total** | **114–156** | |

### 5.2 Compute Time Estimate

| Parameter | Estimate | Notes |
|-----------|----------|-------|
| DOF count | 10K–100K | Steel beam, SOL 101 |
| Single solve time | 30s–3min | Depends on mesh density |
| Model rebuild time | 10–30s | NX parametric update + remesh |
| Total per trial | 1–4 min | Rebuild + solve + extraction |
| Phase 1 (51 trials) | 1–3.5 hrs | |
| Phase 2 (60–100 trials) | 1–7 hrs | |
| **Total compute** | **2–10 hrs** | Likely ~4–5 hrs |

### 5.3 Budget Justification

For 4 design variables, rule-of-thumb budgets:

- **Minimum viable:** 10 × n_vars = 40 trials (DoE only)
- **Standard:** 25 × n_vars = 100 trials (DoE + optimization)
- **Thorough:** 50 × n_vars = 200 trials (with validation)

Our budget of 114–156 falls in the **standard-to-thorough** range. This is appropriate for a first study where we're mapping an unknown landscape with an infeasible baseline.

---

## 6. Convergence Criteria

### 6.1 Phase 1 (DoE): No Convergence Criteria

The DoE runs all 50 planned trials. It is not iterative; it is a one-shot space-filling design.
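
As a rough illustration of what generating such a one-shot design might look like, the sketch below builds a plain Latin hypercube for the three continuous variables and a balanced, shuffled assignment of `hole_count` levels. It is an assumption-laden sketch, not the study's sampler: `lhs_design` is a hypothetical helper, it uses only the standard library, and it does not apply the maximin optimization named in §2.3.

```python
import random

# Continuous design variables and their bounds (from §1.3), in mm.
BOUNDS = {
    "beam_half_core_thickness": (10.0, 40.0),
    "beam_face_thickness": (10.0, 40.0),
    "holes_diameter": (150.0, 450.0),
}
HOLE_COUNT_LEVELS = list(range(5, 16))  # 11 integer levels, 5..15


def lhs_design(n_trials: int, seed: int = 0) -> list[dict]:
    """Plain LHS: each continuous variable gets exactly one sample per stratum."""
    rng = random.Random(seed)
    columns = {}
    for name, (lo, hi) in BOUNDS.items():
        strata = list(range(n_trials))
        rng.shuffle(strata)  # decouple the strata order between variables
        columns[name] = [
            lo + (s + rng.random()) / n_trials * (hi - lo) for s in strata
        ]
    # Balanced integer levels: cycle through all levels, then shuffle the order.
    counts = [HOLE_COUNT_LEVELS[i % len(HOLE_COUNT_LEVELS)] for i in range(n_trials)]
    rng.shuffle(counts)
    return [
        {**{name: columns[name][i] for name in BOUNDS}, "hole_count": counts[i]}
        for i in range(n_trials)
    ]


design = lhs_design(50, seed=42)
print(len(design), design[0])
```

With 50 trials and 11 levels, this assignment yields either 4 or 5 trials per `hole_count` level, matching the coverage assumed in §4.5.
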
Stop conditions:

- All 50 trials complete (or fail with documented errors)
- **Early abort:** If >80% of trials fail to solve (NX crashes), stop and investigate

### 6.2 Phase 2 (TPE): Convergence Criteria

| Criterion | Threshold | Action |
|-----------|-----------|--------|
| **Improvement stall** | Best feasible objective unchanged for 20 consecutive trials | Consider stopping |
| **Relative improvement** | < 1% improvement over last 20 trials | Consider stopping |
| **Budget exhausted** | 100 trials completed in Phase 2 | Hard stop |
| **Perfect convergence** | Multiple trials within 0.5% of each other from different regions | Confident optimum found |
| **Minimum budget** | Always run at least 60 trials in Phase 2 | Ensures adequate exploration |

### 6.3 Decision Logic

```
After 60 Phase 2 trials:
  IF no feasible solution found → STOP, escalate (see §7.1)
  ELIF best_feasible improved by >2% in last 20 trials → continue to 80
  ELSE → assess convergence, decide 80 or 100

After 80 Phase 2 trials:
  IF still improving >1% per 20 trials → continue to 100
  ELSE → STOP, declare converged

After 100 Phase 2 trials:
  HARD STOP regardless
```

### 6.4 Phase 1 → Phase 2 Gate

Before starting Phase 2, review DoE results:

| Check | Action if FAIL |
|-------|---------------|
| At least 5 feasible points found | If 0 feasible: expand bounds or relax constraints (escalate to CEO) |
| NX solve success rate > 80% | If <80%: investigate failures, fix model, re-run failed trials |
| No systematic NX crashes at bounds | If crashes: tighten bounds away from failure region |
| Sensitivity trends visible | If flat: check extractors, may be reading wrong output |

---

## 7. Risk Mitigation

### 7.1 Risk: Feasible Region is Empty

**Likelihood: Medium** (baseline fails displacement by 120%)

**Detection:** After Phase 1, zero feasible points found.

**Mitigation ladder:**

1. **Check the data:** Are extractors reading correctly? Validate against manual NX check.
2. **Examine near-feasible:** Find the trial closest to feasibility. How far off? If displacement = 10.5 mm, we're close. If displacement = 18 mm, we have a problem.
3. **Targeted exploration:** Run additional trials at extreme stiffness (max DV1, max DV2, min DV3, min DV4). If even the stiffest/heaviest design fails, the constraint is physically impossible with this geometry.
4. **Constraint relaxation:** Propose to CEO: relax displacement to 12 or 15 mm. Document the mass-displacement Pareto front from DoE data to support the discussion.
5. **Geometric redesign:** If the problem is fundamentally infeasible, the beam geometry needs redesign (out of optimization scope).

### 7.2 Risk: NX Crashes at Parameter Extremes

**Likelihood: Medium** (LAC: rib_thickness had undocumented CAD constraint at 9mm, causing 34% failure rate in V13)

**Detection:** Solver returns no results for certain parameter combinations.

**Mitigation:**

1. **Pre-flight corner tests:** Before Phase 1, manually test the 16 corners of the design space (2⁴ combinations of min/max for each variable). This catches geometric rebuild failures early.
2. **Error-handling in run script:** Every trial must catch exceptions and log:
   - NX rebuild failure (geometry Boolean crash)
   - Meshing failure (degenerate elements)
   - Solver failure (singularity, divergence)
   - Extraction failure (missing result)
3. **Infeasible-by-default:** If a trial fails for any reason, record it as infeasible with maximum constraint violation (displacement=9999, stress=9999). This lets Deb's rules naturally steer away from crashing regions.
4. **NEVER kill NX processes directly:** LAC CRITICAL RULE. Use `NXSessionManager.close_nx_if_allowed()` only. If NX hangs, implement a timeout (e.g., 10 min per trial) and let NX time out gracefully.

### 7.3 Risk: Mesh-Dependent Stress Results

**Likelihood: Medium** (stress at hole edges is mesh-sensitive)

**Mitigation:**

1. **Mesh convergence pre-study:** Run baseline at 3 mesh densities.
   If stress varies >10%, refine mesh or use a stress-averaging region.
2. **Consistent mesh controls:** Ensure NX applies the same mesh size/refinement strategy regardless of parameter values. The parametric model should have mesh controls tied to hole geometry.
3. **Stress extraction method:** Use elemental nodal stress (conservative) per LAC success pattern. Note: pyNastran returns stress in kPa for the NX kg-mm-s unit system, so **divide by 1000 for MPa**.

### 7.4 Risk: Surrogate Temptation

**Mitigation: DON'T DO IT (yet).**

LAC lessons from the M1 Mirror project are unequivocal:

- V5 surrogate + L-BFGS was 22% **worse** than V6 pure TPE
- MLP surrogates have smooth gradients everywhere → L-BFGS descends to fake optima outside training distribution
- No uncertainty quantification = no way to detect out-of-distribution predictions

With only 4 variables and affordable FEA (~2 min/trial), direct FEA evaluation via TPE is both simpler and more reliable. Surrogate methods should only be considered if:

- FEA solve time exceeds 30 minutes per trial, AND
- We have 100+ validated training points, AND
- We use ensemble surrogates with uncertainty quantification (SYS_16 protocol)

### 7.5 Risk: Study Corruption

**Mitigation:** LAC CRITICAL: **Always copy working studies, never rewrite from scratch.**

- Phase 2 study will be created by **copying** the Phase 1 study directory and adding optimization logic
- Never modify `run_optimization.py` in-place for a new phase; copy to a new version
- Git-commit the study directory after each phase completion

---

## 8. AtomizerSpec Draft

See [`atomizer_spec_draft.json`](./atomizer_spec_draft.json) for the full JSON config.

### 8.1 Key Configuration Decisions

| Setting | Value | Rationale |
|---------|-------|-----------|
| `algorithm.phase1.type` | `LHS` | Space-filling DoE for landscape mapping |
| `algorithm.phase2.type` | `TPE` | Native mixed-integer, sample-efficient, LAC-proven |
| `hole_count.type` | `integer` | DEC-HB-003: true integer, no rounding |
| `constraint_handling` | `deb_feasibility_rules` | Best for infeasible baseline |
| `baseline_trial` | `enqueued` | LAC lesson: always validate baseline first |
| `penalty_config.method` | `deb_rules` | No penalty weight tuning needed |

### 8.2 Extractor Requirements

| ID | Type | Output | Source | Notes |
|----|------|--------|--------|-------|
| `ext_001` | `expression` | `mass` | NX expression `p173` | Direct read from NX |
| `ext_002` | `displacement` | `tip_displacement` | SOL 101 result sensor or .op2 parse | ⚠️ Need sensor setup or node ID |
| `ext_003` | `stress` | `max_von_mises` | SOL 101 elemental nodal | kPa → MPa conversion needed |

### 8.3 Open Items for Spec Finalization

Before this spec can be promoted from `_draft` to production:

1. **Beam web length:** Required to validate DV3 × DV4 geometric feasibility
2. **Displacement extraction method:** Sensor in .sim, or node ID for .op2 parsing?
3. **Stress extraction scope:** Whole model max, or specific element group?
4. **NX expression names confirmed:** Verify `p173` is mass, confirm displacement/stress expression names
5. **Solver runtime benchmark:** Time one SOL 101 run to refine compute estimates
6. **Corner test results:** Validate model rebuilds at all 16 bound corners

---

## 9. Execution Plan Summary

```
┌─────────────────────────────────────────────────────────────────┐
│                   HYDROTECH BEAM OPTIMIZATION                   │
│                     Study: 01_doe_landscape                     │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  PRE-FLIGHT (before any trials)                                 │
│  ├── Validate baseline: run Trial 0, verify mass/disp/stress    │
│  ├── Corner tests: 16 extreme combinations, check NX rebuilds   │
│  ├── Mesh convergence: 3 density levels at baseline             │
│  └── Confirm extractors: mass, displacement, stress pipelines   │
│                                                                 │
│  PHASE 1: DoE LANDSCAPE (51 trials)                             │
│  ├── Trial 0: Baseline (enqueued)                               │
│  ├── Trials 1-50: LHS with integer rounding for hole_count      │
│  ├── Analysis: sensitivity, interaction, feasibility mapping    │
│  └── GATE: ≥5 feasible? NX success >80%? Proceed/escalate       │
│                                                                 │
│  PHASE 2: TPE OPTIMIZATION (60-100 trials)                      │
│  ├── Warm-start from best Phase 1 feasible point(s)             │
│  ├── Deb's feasibility rules for constraint handling            │
│  ├── Convergence check every 20 trials                          │
│  └── Hard stop at 100 trials                                    │
│                                                                 │
│  VALIDATION (3-5 trials)                                        │
│  ├── Re-run best design to confirm repeatability                │
│  ├── Perturb ±5% on each variable to check sensitivity          │
│  └── Document final design with full NX results                 │
│                                                                 │
│  TOTAL: 114-156 NX evaluations | ~4-5 hours compute             │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

---

## 10. LAC Lessons Incorporated

| LAC Lesson | Source | How Applied |
|------------|--------|-------------|
| CMA-ES doesn't evaluate x0 first | Mirror V7 failure | Baseline enqueued as Trial 0 for both phases |
| Surrogate + L-BFGS = fake optima | Mirror V5 failure | No surrogates in this study; direct FEA only |
| Never kill NX processes directly | Dec 2025 incident | Timeout-based error handling; NXSessionManager only |
| Copy working studies, never rewrite | Mirror V5 failure | Phase 2 created by copying Phase 1 |
| pyNastran stress in kPa | Support arm success | Extractor divides by 1000 for MPa |
| CAD constraints can limit bounds | Mirror V13 (rib_thickness) | Pre-flight corner tests before DoE |
| Always include README.md | Repeated failures (Dec 2025, Jan 2026) | README.md created with study |
| Simple beats complex (TPE > surrogate) | Mirror V6 vs V5 | TPE selected over surrogate-based methods |

---

*⚡ Optimizer: The algorithm is the strategy.*