Optimization Strategy — Hydrotech Beam DoE & Landscape Mapping
Study: 01_doe_landscape
Project: Hydrotech Beam Structural Optimization
Author: ⚡ Optimizer Agent
Date: 2025-02-09 (updated 2026-02-10 — introspection corrections)
Status: APPROVED WITH CONDITIONS — Auditor review 2026-02-10, blockers resolved inline
References: BREAKDOWN.md, DECISIONS.md, CONTEXT.md
1. Problem Formulation
1.1 Objective
$$\min_{x} \quad f(x) = \text{mass}(x) \quad [\text{kg}]$$
Single-objective minimization of total beam mass. This aligns with DEC-HB-001 (approved by Tech Lead, pending CEO confirmation).
1.2 Constraints
| ID | Constraint | Operator | Limit | Units | Source |
|---|---|---|---|---|---|
| g₁ | Tip displacement | ≤ | 10.0 | mm | NX Nastran SOL 101 — displacement sensor at beam tip |
| g₂ | Max von Mises stress | ≤ | 130.0 | MPa | NX Nastran SOL 101 — max elemental nodal VM stress |
Both are hard constraints — no trade-off or relaxation without CEO approval.
1.3 Design Variables
| ID | NX Expression | Type | Lower | Upper | Baseline | Units | Notes |
|---|---|---|---|---|---|---|---|
| DV1 | `beam_half_core_thickness` | Continuous | 10 | 40 | 25.162 | mm | Core half-thickness; stiffness scales ~quadratically via sandwich effect |
| DV2 | `beam_face_thickness` | Continuous | 10 | 40 | 21.504 | mm | Face sheet thickness; primary bending stiffness contributor |
| DV3 | `holes_diameter` | Continuous | 150 | 450 | 300 | mm | Lightening hole diameter; mass ∝ d² reduction |
| DV4 | `hole_count` (→ `Pattern_p7`) | Integer | 5 | 15 | 10 | — | Number of lightening holes; 11 discrete levels |
Total design space: 3 continuous × 1 integer (11 levels) = effectively 3D continuous × 11 slices.
1.4 Integer Handling
Per DEC-HB-003, hole_count is treated as a true integer throughout:
- Phase 1 (LHS): Generate continuous LHS, round DV4 to nearest integer. Use stratified integer sampling to ensure coverage across all 11 levels.
- Phase 2 (TPE): Optuna `IntDistribution(5, 15)` — native integer support, no rounding hacks.
- NX rebuild: The model requires integer hole count. Non-integer values will cause geometry failures.
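The Phase 1 sampling described above could be sketched as follows. This is a minimal illustration, assuming plain Latin hypercube strata (not the maximin variant) for the three continuous variables and a shuffled, balanced column for hole_count. The function name `sample_doe` and the use of Python's `random` module are illustrative choices, not part of the study code.

```python
import random

# Bounds copied from the design-variable table in section 1.3.
CONTINUOUS_BOUNDS = {
    "beam_half_core_thickness": (10.0, 40.0),
    "beam_face_thickness": (10.0, 40.0),
    "holes_diameter": (150.0, 450.0),
}
HOLE_COUNT_LEVELS = list(range(5, 16))  # 5..15 inclusive, 11 levels


def sample_doe(n_trials, seed=0):
    """Plain LHS on the continuous DVs plus a balanced hole_count column."""
    rng = random.Random(seed)
    columns = {}
    for name, (lower, upper) in CONTINUOUS_BOUNDS.items():
        # One point per equal-width stratum, then shuffle the stratum order.
        unit = [(i + rng.random()) / n_trials for i in range(n_trials)]
        rng.shuffle(unit)
        columns[name] = [lower + u * (upper - lower) for u in unit]
    # Cycle through the integer levels so level counts differ by at most one.
    levels = [HOLE_COUNT_LEVELS[i % len(HOLE_COUNT_LEVELS)] for i in range(n_trials)]
    rng.shuffle(levels)
    columns["hole_count"] = levels
    return [{name: col[i] for name, col in columns.items()} for i in range(n_trials)]


designs = sample_doe(50)
```

With 50 trials and 11 levels, each hole_count value appears 4 or 5 times, and hole_count is an integer by construction rather than by rounding a continuous sample.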
1.5 Baseline Assessment
| Metric | Baseline Value | Constraint | Status |
|---|---|---|---|
| Mass | 1,133.01 kg | (minimize) | Overbuilt — room to reduce |
| Tip displacement | ~22 mm (unverified — awaiting baseline re-run) | ≤ 10 mm | ❌ Likely FAILS |
| VM stress | (unknown — awaiting baseline re-run) | ≤ 130 MPa | ⚠️ Unconfirmed |
⚠️ Critical: The baseline design likely violates the displacement constraint (~22 mm vs 10 mm limit). Baseline re-run pending — CEO running SOL 101 in parallel. The optimizer must first find the feasible region before it can meaningfully minimize mass. This shapes the entire strategy.
Introspection note (2026-02-10): Mass expression is `p173` (`body_property147.mass`, kg). DV baselines are NOT round numbers (face=21.504mm, core=25.162mm). NX expression `beam_lenght` has a typo (`lenght`, not `length`). `hole_count` links to `Pattern_p7` in the NX pattern feature.
2. Algorithm Selection
2.1 Tech Lead's Recommendation
DEC-HB-002 proposes a two-phase strategy:
- Phase 1: Latin Hypercube Sampling (LHS) — 40–50 trials
- Phase 2: TPE via Optuna — 60–100 trials
2.2 My Assessment: CONFIRMED with refinements
The two-phase approach is the right call. Here's why, and what I'd adjust:
Why LHS → TPE is correct for this problem
| Factor | Implication | Algorithm Fit |
|---|---|---|
| 4 design variables (low-dim) | All methods work; sample efficiency less critical | Any |
| 1 integer variable | Need native mixed-type support | TPE ✓, CMA-ES ≈ (rounding) |
| Infeasible baseline | Must map feasibility BEFORE optimizing | LHS first ✓ |
| Expected significant interactions (DV1×DV2, DV3×DV4) | Need space-filling to detect interactions | LHS ✓ |
| Potentially narrow feasible region | Risk of missing it with random search | LHS gives systematic coverage ✓ |
| NX-in-the-loop (medium cost) | ~100-200 trials is budget-appropriate | TPE efficient enough ✓ |
What I'd modify
- Phase 1 budget: 50 trials (not 40). With 4 variables, the 10× dimensionality rule of thumb gives 40 trials as the minimum for a reliable DoE; 50 adds margin and yields 4 or 5 trials per hole_count level under stratified integer sampling.
- Enqueue baseline as Trial 0. LAC critical lesson: CMA-ES doesn't evaluate x0 first. While we're using LHS (not CMA-ES), the same principle applies: always evaluate the baseline explicitly so we have a verified anchor point. This also validates the extractor pipeline before burning 50 trials.
- Phase 2 budget: 80 trials (flexible 60-100). Start with 60, apply convergence criteria (Section 6), extend to 100 if still improving.
- Seed Phase 2 from Phase 1 data. Use Optuna's `enqueue_trial()` to warm-start TPE with the best feasible point(s) from the DoE. This avoids the TPE startup penalty (the first `n_startup_trials` are random).
Algorithms NOT selected (and why)
| Algorithm | Why Not |
|---|---|
| CMA-ES | Good option, but integer rounding is a hack; doesn't evaluate x0 first (LAC lesson); TPE is equally good at 4D |
| NSGA-II | Overkill for single-objective; population size wastes budget |
| Surrogate + L-BFGS | LAC CRITICAL: Gradient descent on surrogates finds fake optima. V5 mirror study: L-BFGS was 22% WORSE than pure TPE (WS=325 vs WS=290). V6 confirmed simple TPE beats complex surrogate methods. Do not use. |
| SOL 200 (Nastran native) | No integer support for hole_count; gradient-based so may miss global optimum; more NX setup effort. Keep as backup (Tech Lead's suggestion). |
| Nelder-Mead | No integer support; poor exploration; would miss the feasible region |
2.3 Final Algorithm Configuration
Phase 1: LHS DoE
- Trials: 50 (+ 1 baseline = 51 total)
- Sampling: Maximin LHS, DV4 rounded to nearest integer
- Purpose: Landscape mapping, feasibility identification, sensitivity analysis
Phase 2: TPE Optimization
- Trials: 60-100 (adaptive, see convergence criteria)
- Sampler: Optuna TPESampler
- n_startup_trials: 0 (warm-started from Phase 1 best)
- Constraint handling: Optuna constraint interface with Deb's rules
- Purpose: Converge to minimum-mass feasible design
Total budget: 111-151 evaluations
3. Constraint Handling
3.1 The Challenge
The baseline appears to fail the displacement constraint by about 120% (~22 mm vs 10 mm, pending the baseline re-run). This means:
- A significant portion of the design space may be infeasible
- Random sampling may return few or zero feasible points
- The optimizer must navigate toward feasibility AND optimality simultaneously
3.2 Approach: Deb's Feasibility Rules (Constraint Domination)
For ranking solutions during optimization, use Deb's feasibility rules (Deb 2000):
- Feasible vs feasible → compare by objective (lower mass wins)
- Feasible vs infeasible → feasible always wins
- Infeasible vs infeasible → lower total constraint violation wins
This is implemented via Optuna's constraint interface:

    def constraints(trial):
        """Return constraint values (<= 0 means satisfied, > 0 means violated)."""
        disp = trial.user_attrs["tip_displacement"]
        stress = trial.user_attrs["max_von_mises"]
        return [
            disp - 10.0,     # ≤ 0 means displacement ≤ 10 mm
            stress - 130.0,  # ≤ 0 means stress ≤ 130 MPa
        ]
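Independently of Optuna, the three ranking rules can be written as a plain sort key, which is handy for post-processing DoE results. This is a sketch: the `Result` class and function names are illustrative, and dividing each overshoot by its limit (so that mm and MPa can be summed) is an assumption, since the rules only require some total violation measure.

```python
from dataclasses import dataclass


@dataclass
class Result:
    mass: float          # kg
    displacement: float  # mm
    stress: float        # MPa


def total_violation(r, disp_limit=10.0, stress_limit=130.0):
    """Sum of normalised constraint overshoots; 0.0 means feasible."""
    return (max(0.0, r.displacement - disp_limit) / disp_limit
            + max(0.0, r.stress - stress_limit) / stress_limit)


def deb_key(r):
    """Sort key: feasible designs by mass first, then infeasible by violation."""
    violation = total_violation(r)
    return (0, r.mass) if violation == 0.0 else (1, violation)
```

With this key, `min(results, key=deb_key)` returns the lightest feasible design if one exists, and otherwise the design closest to feasibility.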
3.3 Why NOT Penalty Functions
| Method | Pros | Cons | Verdict |
|---|---|---|---|
| Deb's rules (selected) | No tuning params; feasible always beats infeasible; explores infeasible region for learning | Requires custom Optuna integration | ✅ Best for this case |
| Quadratic penalty | Simple to implement | Penalty weight requires tuning; wrong weight → optimizer ignores constraint OR over-penalizes | ❌ Fragile |
| Adaptive penalty | Self-tuning | Complex implementation; may oscillate | ❌ Over-engineered for 4 DVs |
| Death penalty (reject infeasible) | Simplest | With infeasible baseline, may reject 80%+ of trials → wasted budget | ❌ Dangerous |
3.4 Phase 1 (DoE) Constraint Handling
During the DoE phase, record all results without filtering. Every trial runs, every result is stored. Infeasible points are valuable for:
- Mapping the feasibility boundary
- Training the TPE model in Phase 2
- Understanding which variables drive constraint violation
3.5 Constraint Margin Buffer
Consider a 5% inner margin during optimization to account for numerical noise:
- Displacement target for optimizer: ≤ 9.5 mm (vs hard limit 10.0 mm)
- Stress target for optimizer: ≤ 123.5 MPa (vs hard limit 130.0 MPa)
The hard limits remain 10 mm / 130 MPa for final validation. The buffer prevents the optimizer from converging to designs that are right on the boundary and may flip infeasible under mesh variation.
4. Search Space Analysis
4.1 Bound Reasonableness
| Variable | Range | Span | Concern |
|---|---|---|---|
| DV1: half_core_thickness | 10–40 mm | 4× range | Reasonable. Lower bound = thin core, upper = thick. Stiffness-mass trade-off |
| DV2: face_thickness | 10–40 mm | 4× range | Reasonable. 10 mm face is already substantial for steel |
| DV3: holes_diameter | 150–450 mm | 3× range | ⚠️ Needs geometric check — see §4.2 |
| DV4: hole_count | 5–15 | 3× range | ⚠️ Needs geometric check — see §4.2 |
4.2 Geometric Feasibility: Hole Overlap Analysis
Critical concern: At extreme DV3 × DV4 combinations, holes may overlap or leave insufficient ligament (material between holes).
Overlap condition (CORRECTED — Auditor review 2026-02-10)
The NX pattern places n holes across a span of p6 mm using n-1 intervals (holes at both endpoints of the span). Confirmed by introspection: Pattern_p8 = 4000/9 = 444.44 mm.
Spacing between hole centers = hole_span / (hole_count - 1)
Ligament between holes = spacing - d = hole_span/(hole_count - 1) - d
For no overlap, we need: hole_span/(n-1) - d > 0, i.e., d < hole_span/(n-1)
With hole_span = 4,000 mm (fixed, p6):
Worst case: n=15 holes, d=450 mm
Spacing = 4000 / (15-1) = 285.7 mm
Ligament = 285.7 - 450 = -164.3 mm → INFEASIBLE (overlap)
Minimum ligament width
For structural integrity and mesh quality, a minimum ligament of ~30 mm is advisable:
Minimum ligament constraint: hole_span / (hole_count - 1) - holes_diameter ≥ 30 mm
Pre-flight geometric filter
Before sending any trial to NX, compute:
- `ligament = 4000 / (hole_count - 1) - holes_diameter` → must be ≥ 30 mm
- `web_clear = 2 × beam_half_height - 2 × beam_face_thickness - holes_diameter` → must be > 0
If either fails, skip NX evaluation and record as infeasible with max constraint violation. This saves compute and avoids NX geometry crashes.
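The two pre-flight checks can be sketched as a small pure function that runs before any NX call. The name `preflight` is hypothetical; the 4,000 mm span, the 250 mm half-height, and the 30 mm minimum ligament are the figures quoted in this section and in §4.3.

```python
HOLE_SPAN_MM = 4000.0        # p6, fixed
BEAM_HALF_HEIGHT_MM = 250.0  # fixed, from model introspection
MIN_LIGAMENT_MM = 30.0


def preflight(face_thickness, holes_diameter, hole_count):
    """Return (ok, ligament_mm, web_margin_mm) without touching NX."""
    # n holes span the pattern with n - 1 intervals.
    ligament = HOLE_SPAN_MM / (hole_count - 1) - holes_diameter
    # Clear web height left over after the faces and the hole itself.
    web_margin = 2 * BEAM_HALF_HEIGHT_MM - 2 * face_thickness - holes_diameter
    ok = ligament >= MIN_LIGAMENT_MM and web_margin > 0
    return ok, ligament, web_margin
```

Applied to the 16 bound corners, these two checks reject 6 of them: every corner with holes_diameter = 450 mm except those with beam_face_thickness = 10 mm and hole_count = 5. Those corners can therefore be expected to fail on geometry alone.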
4.3 Hole-to-Web-Height Ratio (CORRECTED — Auditor review 2026-02-10)
The hole diameter must fit within the web clear height. From introspection:
- Total beam height = `2 × beam_half_height = 2 × 250 = 500 mm` (fixed)
- Web clear height = `total_height - 2 × face_thickness = 500 - 2 × beam_face_thickness`
At baseline (face=21.504mm): web_clear = 500 - 2×21.504 = 456.99 mm → holes of 450mm barely fit (7mm clearance)
At face=40mm: web_clear = 500 - 2×40 = 420 mm → holes of 450mm DO NOT FIT
At face=10mm: web_clear = 500 - 2×10 = 480 mm → holes of 450mm fit (30mm clearance)
This means beam_face_thickness and holes_diameter interact geometrically — thicker faces reduce the web clear height available for holes. This constraint is captured in the pre-flight filter (§4.2):
web_clear = 500 - 2 × beam_face_thickness - holes_diameter > 0
4.4 Expected Feasible Region
Based on the physics (Tech Lead's analysis §1.2 and §1.3):
| To reduce displacement (currently 22→10 mm) | Effect on mass |
|---|---|
| ↑ DV1 (thicker core) | ↑ mass (but stiffness scales ~d², mass scales ~d) → efficient |
| ↑ DV2 (thicker face) | ↑ mass (direct) |
| ↓ DV3 (smaller holes) | ↑ mass (more web material) |
| ↓ DV4 (fewer holes) | ↑ mass (more web material) |
Prediction: The feasible region (displacement ≤ 10 mm) likely requires:
- DV1 in upper range (25-40 mm) — the sandwich effect is the most mass-efficient stiffness lever
- DV2 moderate (15-30 mm) — thicker faces help stiffness but cost mass directly
- DV3 and DV4 constrained by stress — large/many holes save mass but increase stress
The optimizer should find a "sweet spot" where core thickness provides stiffness, and holes are sized to save mass without violating stress limits.
4.5 Estimated Design Space Volume
- DV1: 30 mm span (continuous)
- DV2: 30 mm span (continuous)
- DV3: 300 mm span (continuous)
- DV4: 11 integer levels
Total configurations: effectively infinite (3 continuous), but the integer dimension creates 11 "slices" of the space. With 50 DoE trials, we get ~4-5 trials per slice — sufficient for trend identification.
5. Trial Budget & Compute Estimate
5.1 Budget Breakdown
| Phase | Trials | Purpose |
|---|---|---|
| Trial 0 | 1 | Baseline validation (enqueued) |
| Phase 1: LHS DoE | 50 | Landscape mapping, feasibility, sensitivity |
| Phase 2: TPE | 60–100 | Directed optimization |
| Validation | 3–5 | Confirm optimum, check mesh sensitivity |
| Total | 114–156 | |
5.2 Compute Time Estimate
| Parameter | Estimate | Notes |
|---|---|---|
| DOF count | 10K–100K | Steel beam, SOL 101 |
| Single solve time | 30s–3min | Depends on mesh density |
| Model rebuild time | 10–30s | NX parametric update + remesh |
| Total per trial | 1–4 min | Rebuild + solve + extraction |
| Phase 1 (51 trials) | 1–3.5 hrs | |
| Phase 2 (60–100 trials) | 1–7 hrs | |
| Total compute | 2–10 hrs | Likely ~4–5 hrs |
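The ranges in the table follow from multiplying trial counts by the 1–4 minute per-trial estimate. A small sketch of that arithmetic, with `hours_range` as an illustrative helper name:

```python
PER_TRIAL_MINUTES = (1.0, 4.0)  # rebuild + solve + extraction, from the table above


def hours_range(min_trials, max_trials):
    """Best-case and worst-case wall-clock hours for a block of trials."""
    fastest, slowest = PER_TRIAL_MINUTES
    return min_trials * fastest / 60.0, max_trials * slowest / 60.0


phase1 = hours_range(51, 51)     # about (0.85, 3.4)
phase2 = hours_range(60, 100)    # about (1.0, 6.67)
overall = hours_range(114, 156)  # about (1.9, 10.4)
```

The worst case slightly exceeds the 10 hr upper figure because it assumes all 156 evaluations take the full 4 minutes.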
5.3 Budget Justification
For 4 design variables, rule-of-thumb budgets:
- Minimum viable: 10 × n_vars = 40 trials (DoE only)
- Standard: 25 × n_vars = 100 trials (DoE + optimization)
- Thorough: 50 × n_vars = 200 trials (with validation)
Our budget of 114–156 falls in the standard-to-thorough range. Appropriate for a first study where we're mapping an unknown landscape with an infeasible baseline.
6. Convergence Criteria
6.1 Phase 1 (DoE) — No Convergence Criteria
The DoE runs all 50 planned trials. It's not iterative — it's a one-shot space-filling design. Stop conditions:
- All 50 trials complete (or fail with documented errors)
- Early abort: If >80% of trials fail to solve (NX crashes), stop and investigate
6.2 Phase 2 (TPE) — Convergence Criteria
| Criterion | Threshold | Action |
|---|---|---|
| Improvement stall | Best feasible objective unchanged for 20 consecutive trials | Consider stopping |
| Relative improvement | < 1% improvement over last 20 trials | Consider stopping |
| Budget exhausted | 100 trials completed in Phase 2 | Hard stop |
| Perfect convergence | Multiple trials within 0.5% of each other from different regions | Confident optimum found |
| Minimum budget | Always run at least 60 trials in Phase 2 | Ensures adequate exploration |
6.3 Decision Logic
After 60 Phase 2 trials:
IF best_feasible improved by >2% in last 20 trials → continue to 80
IF no feasible solution found → STOP, escalate (see §7.1)
ELSE → assess convergence, decide 80 or 100
After 80 Phase 2 trials:
IF still improving >1% per 20 trials → continue to 100
ELSE → STOP, declare converged
After 100 Phase 2 trials:
HARD STOP regardless
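The checkpoint logic above could be encoded as a function over the running best feasible mass. This is a sketch meant to be called at the 60, 80, and 100 trial checkpoints. The function names and return strings are illustrative, and treating "first feasible design found within the last 20 trials" as an improvement is an assumption.

```python
def relative_improvement(history, window=20):
    """Fractional drop in best feasible mass over the last `window` trials."""
    current, previous = history[-1], history[-1 - window]
    if current is None:
        return None          # still no feasible design
    if previous is None:
        return float("inf")  # first feasible design appeared inside the window
    return (previous - current) / previous


def phase2_decision(history):
    """history[i] is the best feasible mass after Phase 2 trial i + 1, or None."""
    n_done = len(history)
    if n_done >= 100:
        return "stop: hard limit"
    if n_done < 60:
        return "continue: below minimum budget"
    gain = relative_improvement(history)
    if gain is None:
        return "stop: escalate, no feasible design"
    if n_done < 80:
        return "continue to 80" if gain > 0.02 else "assess: decide 80 or 100"
    return "continue to 100" if gain > 0.01 else "stop: converged"
```

For example, a history that sits at 1000 kg for 40 trials and then drops to 950 kg for the next 20 shows a 5% gain at the 60-trial checkpoint, so the function returns "continue to 80".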
6.4 Phase 1 → Phase 2 Gate
Before starting Phase 2, review DoE results:
| Check | Action if FAIL |
|---|---|
| At least 5 feasible points found | If 0 feasible: expand bounds or relax constraints (escalate to CEO) |
| NX solve success rate > 80% | If <80%: investigate failures, fix model, re-run failed trials |
| No systematic NX crashes at bounds | If crashes: tighten bounds away from failure region |
| Sensitivity trends visible | If flat: check extractors, may be reading wrong output |
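The two quantitative gate checks can be automated; the crash-pattern and sensitivity checks still need a human look at the data. A minimal sketch, with `phase1_gate` as a hypothetical helper name:

```python
def phase1_gate(n_trials, n_solved, n_feasible, min_feasible=5, min_success_rate=0.80):
    """Return a list of failed gate checks; an empty list means proceed to Phase 2."""
    failures = []
    if n_feasible < min_feasible:
        failures.append(f"only {n_feasible} feasible points (need {min_feasible})")
    success_rate = n_solved / n_trials
    if success_rate <= min_success_rate:
        failures.append(f"NX success rate {success_rate:.0%} (need > {min_success_rate:.0%})")
    return failures
```

A call such as `phase1_gate(51, 48, 7)` returns an empty list, while `phase1_gate(51, 40, 0)` reports both a feasibility and a solver-reliability failure.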
7. Risk Mitigation
7.1 Risk: Feasible Region is Empty
Likelihood: Medium (baseline appears to miss the displacement limit by ~120%, pending the baseline re-run)
Detection: After Phase 1, zero feasible points found.
Mitigation ladder:
- Check the data — Are extractors reading correctly? Validate against manual NX check.
- Examine near-feasible — Find the trial closest to feasibility. How far off? If displacement = 10.5 mm, we're close. If displacement = 18 mm, we have a problem.
- Targeted exploration — Run additional trials at extreme stiffness (max DV1, max DV2, min DV3, min DV4). If even the stiffest/heaviest design fails, the constraint is physically impossible with this geometry.
- Constraint relaxation — Propose to CEO: relax displacement to 12 or 15 mm. Document the mass-displacement Pareto front from DoE data to support the discussion.
- Geometric redesign — If the problem is fundamentally infeasible, the beam geometry needs redesign (out of optimization scope).
7.2 Risk: NX Crashes at Parameter Extremes
Likelihood: Medium (LAC: rib_thickness had undocumented CAD constraint at 9mm, causing 34% failure rate in V13)
Detection: Solver returns no results for certain parameter combinations.
Mitigation:
- Pre-flight corner tests — Before Phase 1, manually test the 16 corners of the design space (2⁴ combinations of min/max for each variable). This catches geometric rebuild failures early.
- Error-handling in run script — Every trial must catch exceptions and log:
- NX rebuild failure (geometry Boolean crash)
- Meshing failure (degenerate elements)
- Solver failure (singularity, divergence)
- Extraction failure (missing result)
- Infeasible-by-default — If a trial fails for any reason, record it as infeasible with maximum constraint violation (displacement=9999, stress=9999). This lets Deb's rules naturally steer away from crashing regions.
- NEVER kill NX processes directly — LAC CRITICAL RULE. Use NXSessionManager.close_nx_if_allowed() only. If NX hangs, implement a timeout (e.g., 10 min per trial) and let NX time out gracefully.
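The error-handling and infeasible-by-default rules above might look like this in the run script. This is a sketch: `evaluate_safely` and the `run_trial` callable are hypothetical, the per-trial timeout and NX session handling are deliberately left out, and the objective value to report for a failed trial is left as `None` because the plan does not specify one.

```python
import logging

logger = logging.getLogger("hydrotech_beam")

FAILED_DISPLACEMENT_MM = 9999.0
FAILED_STRESS_MPA = 9999.0


def evaluate_safely(run_trial, design):
    """Run one trial; on any failure return sentinel values that read as infeasible.

    `run_trial` is a hypothetical callable: design dict -> dict with keys
    "mass", "tip_displacement", "max_von_mises".
    """
    try:
        result = dict(run_trial(design))
        result["failed"] = False
        return result
    except Exception as exc:  # rebuild, meshing, solver, or extraction failure
        logger.warning("Trial failed for %s: %s: %s", design, type(exc).__name__, exc)
        return {
            "mass": None,  # caller decides what objective value to record
            "tip_displacement": FAILED_DISPLACEMENT_MM,
            "max_von_mises": FAILED_STRESS_MPA,
            "failed": True,
        }
```

Logging the exception type alongside the design makes it possible to tell rebuild, meshing, solver, and extraction failures apart when reviewing the DoE afterwards.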
7.3 Risk: Mesh-Dependent Stress Results
Likelihood: Medium (stress at hole edges is mesh-sensitive)
Mitigation:
- Mesh convergence pre-study — Run baseline at 3 mesh densities. If stress varies >10%, refine mesh or use stress averaging region.
- Consistent mesh controls — Ensure NX applies the same mesh size/refinement strategy regardless of parameter values. The parametric model should have mesh controls tied to hole geometry.
- Stress extraction method — Use elemental nodal stress (conservative) per LAC success pattern. Note: pyNastran returns stress in kPa for NX kg-mm-s unit system — divide by 1000 for MPa.
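Because the unit conversion is easy to forget, it can be isolated in one place in the extractor. A sketch with illustrative names, assuming the raw value is the kPa figure described above:

```python
KPA_PER_MPA = 1000.0


def stress_kpa_to_mpa(stress_kpa):
    """Convert a stress read in kPa (kg-mm-s unit system) to MPa."""
    return stress_kpa / KPA_PER_MPA


def within_stress_limit(stress_kpa, limit_mpa=130.0):
    """Compare a raw kPa stress against a limit given in MPa."""
    return stress_kpa_to_mpa(stress_kpa) <= limit_mpa
```

A raw reading of 130000 kPa corresponds to exactly the 130 MPa limit.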
7.4 Risk: Surrogate Temptation
Mitigation: DON'T DO IT (yet).
LAC lessons from the M1 Mirror project are unequivocal:
- V5 surrogate + L-BFGS was 22% worse than V6 pure TPE
- MLP surrogates have smooth gradients everywhere → L-BFGS descends to fake optima outside training distribution
- No uncertainty quantification = no way to detect out-of-distribution predictions
With only 4 variables and affordable FEA (~2 min/trial), direct FEA evaluation via TPE is both simpler and more reliable. Surrogate methods should only be considered if:
- FEA solve time exceeds 30 minutes per trial, AND
- We have 100+ validated training points, AND
- We use ensemble surrogates with uncertainty quantification (SYS_16 protocol)
7.5 Risk: Study Corruption
Mitigation: LAC CRITICAL — Always copy working studies, never rewrite from scratch.
- Phase 2 study will be created by copying the Phase 1 study directory and adding optimization logic
- Never modify `run_optimization.py` in-place for a new phase — copy to a new version
- Git-commit the study directory after each phase completion
8. AtomizerSpec Draft
See atomizer_spec_draft.json for the full JSON config.
8.1 Key Configuration Decisions
| Setting | Value | Rationale |
|---|---|---|
| `algorithm.phase1.type` | `LHS` | Space-filling DoE for landscape mapping |
| `algorithm.phase2.type` | `TPE` | Native mixed-integer, sample-efficient, LAC-proven |
| `hole_count.type` | `integer` | DEC-HB-003: true integer, no rounding |
| `constraint_handling` | `deb_feasibility_rules` | Best for infeasible baseline |
| `baseline_trial` | `enqueued` | LAC lesson: always validate baseline first |
| `penalty_config.method` | `deb_rules` | No penalty weight tuning needed |
8.2 Extractor Requirements
| ID | Type | Output | Source | Notes |
|---|---|---|---|---|
| `ext_001` | `expression` | `mass` | NX expression `p173` | Direct read from NX |
| `ext_002` | `displacement` | `tip_displacement` | SOL 101 result sensor or .op2 parse | ⚠️ Need sensor setup or node ID |
| `ext_003` | `stress` | `max_von_mises` | SOL 101 elemental nodal | kPa → MPa conversion needed |
8.3 Open Items for Spec Finalization
Before this spec can be promoted from _draft to production:
- Beam web length — Required to validate DV3 × DV4 geometric feasibility
- Displacement extraction method — Sensor in .sim, or node ID for .op2 parsing?
- Stress extraction scope — Whole model max, or specific element group?
- NX expression names confirmed — Verify `p173` is mass, confirm displacement/stress expression names
- Solver runtime benchmark — Time one SOL 101 run to refine compute estimates
- Corner test results — Validate model rebuilds at all 16 bound corners
9. Execution Plan Summary
┌─────────────────────────────────────────────────────────────────┐
│ HYDROTECH BEAM OPTIMIZATION │
│ Study: 01_doe_landscape │
├─────────────────────────────────────────────────────────────────┤
│ │
│ PRE-FLIGHT (before any trials) │
│ ├── Validate baseline: run Trial 0, verify mass/disp/stress │
│ ├── Corner tests: 16 extreme combinations, check NX rebuilds │
│ ├── Mesh convergence: 3 density levels at baseline │
│ └── Confirm extractors: mass, displacement, stress pipelines │
│ │
│ PHASE 1: DoE LANDSCAPE (51 trials) │
│ ├── Trial 0: Baseline (enqueued) │
│ ├── Trials 1-50: LHS with integer rounding for hole_count │
│ ├── Analysis: sensitivity, interaction, feasibility mapping │
│ └── GATE: ≥5 feasible? NX success >80%? Proceed/escalate │
│ │
│ PHASE 2: TPE OPTIMIZATION (60-100 trials) │
│ ├── Warm-start from best Phase 1 feasible point(s) │
│ ├── Deb's feasibility rules for constraint handling │
│ ├── Convergence check every 20 trials │
│ └── Hard stop at 100 trials │
│ │
│ VALIDATION (3-5 trials) │
│ ├── Re-run best design to confirm repeatability │
│ ├── Perturb ±5% on each variable to check sensitivity │
│ └── Document final design with full NX results │
│ │
│ TOTAL: 114-156 NX evaluations | ~4-5 hours compute │
│ │
└─────────────────────────────────────────────────────────────────┘
10. LAC Lessons Incorporated
| LAC Lesson | Source | How Applied |
|---|---|---|
| CMA-ES doesn't evaluate x0 first | Mirror V7 failure | Baseline enqueued as Trial 0 for both phases |
| Surrogate + L-BFGS = fake optima | Mirror V5 failure | No surrogates in this study; direct FEA only |
| Never kill NX processes directly | Dec 2025 incident | Timeout-based error handling; NXSessionManager only |
| Copy working studies, never rewrite | Mirror V5 failure | Phase 2 created by copying Phase 1 |
| pyNastran stress in kPa | Support arm success | Extractor divides by 1000 for MPa |
| CAD constraints can limit bounds | Mirror V13 (rib_thickness) | Pre-flight corner tests before DoE |
| Always include README.md | Repeated failures (Dec 2025, Jan 2026) | README.md created with study |
| Simple beats complex (TPE > surrogate) | Mirror V6 vs V5 | TPE selected over surrogate-based methods |
⚡ Optimizer — The algorithm is the strategy.