Atomizer/hq/workspaces/auditor/Arsenal-Risk-Analysis-Quality-Gates.md


Arsenal Development Plan — Risk Analysis + Quality Gates

TASK: Arsenal Development Plan — Risk Analysis + Quality Gates
STATUS: complete
RESULT: Comprehensive risk analysis with quality gates for each planned sprint
CONFIDENCE: HIGH
NOTES: This document provides the reality check framework to prevent scope creep and ensure quality delivery


1. Executive Summary — THE REALITY CHECK

The Arsenal plan is ambitious and valuable, but contains significant execution risks that could derail the project. The core concept is sound: expand from NX/Nastran-only to a multi-solver platform. However, the 35+ tool integration plan needs aggressive de-risking and phased validation.

CRITICAL FINDING: The plan attempts to solve 5 different problems simultaneously:

  1. Format conversion (meshio, pyNastran)
  2. Open-source FEA (CalculiX, OpenFOAM)
  3. Multi-objective optimization (pymoo)
  4. LLM-driven CAD generation (Build123d, MCP servers)
  5. Advanced topology optimization (FEniCS)

RECOMMENDATION: Execute in strict sequence with hard quality gates. Each phase must FULLY work before advancing.


2. Risk Registry — By Sprint

Phase 1: Universal Glue Layer (Week 1-2)

Technical Risks

| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Format conversion accuracy loss | HIGH | CRITICAL | Round-trip validation on 10 reference models |
| meshio coordinate system errors | MEDIUM | MAJOR | Validate stress tensor rotation on cantilever beam |
| pyNastran OP2 parsing failures | MEDIUM | MAJOR | Test on Antoine's actual client models, not just tutorials |
| Mesh topology corruption | LOW | CRITICAL | Automated mesh quality checks (aspect ratio, Jacobian) |

Integration Risks

| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Existing Optuna code breaks | MEDIUM | MAJOR | Branch protection + parallel development |
| AtomizerSpec compatibility | HIGH | MAJOR | Maintain backward compatibility, versioned specs |
| Python dependency hell | HIGH | MINOR | Docker containerization from day 1 |

Validation Risks

| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Silent accuracy degradation | HIGH | CRITICAL | Automated benchmark regression suite |
| Units confusion (N vs lbf, mm vs in) | MEDIUM | CRITICAL | Explicit unit validation in every converter |
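The unit-validation mitigation can be sketched concretely: every incoming quantity is normalized through a declared conversion table, and unknown units fail loudly instead of passing through silently. The function names below are illustrative, not part of meshio or pyNastran; the conversion factors are exact.

```python
# Sketch of explicit unit validation inside a converter. Helper names
# are hypothetical; factors are exact (1 in = 25.4 mm, 1 lbf = 4.4482216152605 N).

LENGTH_TO_MM = {"mm": 1.0, "m": 1000.0, "in": 25.4}
FORCE_TO_N = {"N": 1.0, "lbf": 4.4482216152605}

def to_mm(value: float, unit: str) -> float:
    """Normalize a length to millimetres; reject unknown units."""
    if unit not in LENGTH_TO_MM:
        raise ValueError(f"unknown length unit: {unit!r}")
    return value * LENGTH_TO_MM[unit]

def to_newton(value: float, unit: str) -> float:
    """Normalize a force to newtons; reject unknown units."""
    if unit not in FORCE_TO_N:
        raise ValueError(f"unknown force unit: {unit!r}")
    return value * FORCE_TO_N[unit]
```

Raising on an unknown unit is the point: a converter that silently assumes N/mm/MPa is exactly how the lbf-vs-N class of error reaches a client deliverable.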

Phase 2: CalculiX Integration (Week 2-4)

Technical Risks

| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| CalculiX solver divergence | HIGH | MAJOR | Start with linear static only, incremental complexity |
| Element type compatibility | MEDIUM | MAJOR | Limit to C3D10 (tet10) initially |
| Contact analysis failures | HIGH | CRITICAL | Phase 8+ only, not a core requirement |
| Material model differences | MEDIUM | MAJOR | Side-by-side validation vs Nastran on the same mesh |

Validation Risks

| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Accuracy drift from Nastran | HIGH | CRITICAL | <2% error on 5 benchmark problems |
| Mesh sensitivity differences | MEDIUM | MAJOR | Convergence studies required |

Phase 3: Multi-Objective Optimization (Week 3-5)

Technical Risks

| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| pymoo algorithm selection confusion | MEDIUM | MAJOR | Start with NSGA-II only, expand later |
| Pareto front interpretation errors | HIGH | MAJOR | Client education + decision support tools |
| Constraint handling differences | MEDIUM | MAJOR | Validate constraint satisfaction on known problems |

Over-Engineering Risks

| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Analysis paralysis from too many options | HIGH | MAJOR | Limit to 2-objective problems initially |
| Perfect being the enemy of good | HIGH | MINOR | Time-box Pareto visualization to 1 week |

Phase 4: LLM-Driven CAD (Week 4-8)

Technical Risks — HIGHEST RISK PHASE

| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Build123d geometry generation hallucinations | HIGH | CRITICAL | Human validation + geometric sanity checks |
| MCP server reliability | HIGH | MAJOR | Fall back to direct API calls |
| CAD code generation produces invalid geometry | HIGH | CRITICAL | Automated STEP validation pipeline |
| Complex assembly constraints prove impossible | VERY HIGH | MAJOR | Limit to single parts initially |

Scope Creep Risks

| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Trying to replace NX completely | HIGH | CRITICAL | Keep NX for production work, Build123d for optimization only |
| AI-generated geometry perfectionism | HIGH | MAJOR | Accept "good enough" for optimization; refine in NX |

Phase 5: CFD + Thermal (Month 2-3)

Technical Risks — COMPLEXITY EXPLOSION

| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| OpenFOAM case setup expertise gap | VERY HIGH | CRITICAL | Hire a CFD consultant or defer to Phase 8+ |
| Mesh quality conflicts between CFD and FEA | HIGH | MAJOR | Separate mesh generation pipelines |
| Thermal coupling convergence issues | HIGH | MAJOR | Start with decoupled analysis |
| CFD solution validation difficulty | HIGH | CRITICAL | Needs experimental data or commercial CFD comparison |

Dependency Risks

| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Docker/container complexity | MEDIUM | MAJOR | Cloud deployment or dedicated CFD workstation |

Phase 6: Multi-Physics Coupling (Month 3-4)

Technical Risks — RESEARCH-LEVEL DIFFICULTY

| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| preCICE configuration expertise | VERY HIGH | CRITICAL | This is PhD-level work; expert help needed |
| Coupling stability/convergence | HIGH | CRITICAL | Extensive parameter studies required |
| Debug complexity | VERY HIGH | MAJOR | Each physics must work perfectly before coupling |

Phase 7: System-Level MDO (Month 4-6)

Technical Risks — ACADEMIC RESEARCH TERRITORY

| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| OpenMDAO complexity overwhelming | VERY HIGH | CRITICAL | Consider this Phase 9+, not Phase 7 |
| Gradient computation reliability | HIGH | CRITICAL | Validate gradients against finite differences |
| System convergence failures | HIGH | CRITICAL | An MDO expert consultant is needed |

3. Quality Gates Per Sprint

Phase 1 Quality Gates — MANDATORY PASS/FAIL

  • Round-trip accuracy test: NX BDF → meshio → CalculiX INP → meshio → NX BDF, <0.1% geometry change
  • Stress tensor validation: Same mesh, same loads in Nastran vs CalculiX via conversion, <2% stress difference
  • Mass properties preservation: Convert 5 test parts, mass/CG/MOI within 0.1%
  • Unit consistency check: All conversions maintain proper N/mm/MPa units
  • Automation test: Full conversion pipeline runs without human intervention

FAILURE CRITERIA: Any >5% error in stress or >1% error in mass properties = STOP, fix before Phase 2
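A minimal sketch of how the mass-properties gate and its hard stop could be automated. The helper names and example numbers are hypothetical; the 0.1% gate and 1% stop threshold are the criteria stated above.

```python
# Phase 1 pass/fail gate for mass properties (0.1% gate, 1% hard stop).
# Helper names and input values are illustrative placeholders.

def rel_error(reference: float, candidate: float) -> float:
    """Relative error, guarding against a zero reference value."""
    if reference == 0.0:
        return abs(candidate)
    return abs(candidate - reference) / abs(reference)

def round_trip_gate(original: dict, round_tripped: dict) -> dict:
    """Check each mass property against the 0.1% gate and 1% hard stop."""
    report = {}
    for prop in original:
        err = rel_error(original[prop], round_tripped[prop])
        report[prop] = {"error": err, "pass": err < 0.001, "hard_stop": err > 0.01}
    return report

report = round_trip_gate(
    {"mass": 1.250, "cg_x": 42.0, "moi_xx": 910.0},    # before conversion
    {"mass": 1.2504, "cg_x": 42.01, "moi_xx": 910.5},  # after round trip
)
all_pass = all(r["pass"] for r in report.values())
```

Any property tripping `hard_stop` is the STOP condition: the pipeline halts and Phase 2 does not start.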

Phase 2 Quality Gates — CalculiX Validation

  • Cantilever beam: CalculiX vs analytical solution <1% error in tip deflection
  • Plate with hole: CalculiX vs Nastran stress concentration factor within 2%
  • Modal analysis: First 5 natural frequencies within 1% of Nastran
  • Thermal analysis: Steady-state temperature distribution within 2% of analytical
  • Performance benchmark: CalculiX solve time <2x Nastran for same model

BENCHMARK PROBLEMS (Mandatory):

  1. Cantilever beam (analytical comparison)
  2. Plate with circular hole (Peterson stress concentration)
  3. Simply supported beam modal (analytical frequencies)
  4. 1D heat conduction (analytical temperature distribution)
  5. Contact patch (Hertz contact pressure)

FAILURE CRITERIA: >5% error on any benchmark = STOP, investigate solver setup
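The benchmark sweep and its stop rule can be expressed as a small harness. The case names match the mandatory list above; the numeric "solver" values are placeholders standing in for real CalculiX output, not measured results.

```python
# Sketch of the mandatory benchmark sweep with this phase's 2% gate
# and 5% stop rule. Solver values below are illustrative placeholders.

GATE_TOL = 0.02   # quality-gate tolerance
STOP_TOL = 0.05   # failure criterion: stop and investigate solver setup

def check_benchmarks(cases):
    """cases: iterable of (name, analytical_value, solver_value)."""
    report, stop = [], False
    for name, exact, computed in cases:
        err = abs(computed - exact) / abs(exact)
        report.append((name, err, err < GATE_TOL))
        stop = stop or err > STOP_TOL
    return report, stop

report, stop = check_benchmarks([
    ("cantilever tip deflection [mm]", 0.1270, 0.1261),
    ("plate-with-hole Kt [-]", 3.00, 3.04),
    ("first natural frequency [Hz]", 125.0, 124.2),
])
```

Running this suite on every build is what turns "accuracy drift from Nastran" from a silent failure into a loud one.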

Phase 3 Quality Gates — Multi-Objective Optimization

  • Pareto front validation: Known bi-objective problem produces expected trade-off curve
  • Constraint satisfaction: All solutions on Pareto front satisfy constraints within tolerance
  • Repeatability: Same problem run 3 times produces consistent results
  • Decision support: TOPSIS ranking produces sensible design recommendations
  • Performance: Multi-objective optimization completes in reasonable time (<2x single-objective)

TEST PROBLEM: Cantilever beam optimization (minimize weight vs minimize tip deflection, stress constraint)
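The Pareto-front validation gate can be automated with a dominance check: a valid front for the cantilever test problem must be mutually non-dominated, meaning no design is at least as good in both objectives and strictly better in one. The numeric designs below are hypothetical, not pymoo output.

```python
# Pareto-front sanity check for the bi-objective cantilever problem
# (minimize weight, minimize tip deflection). Designs are hypothetical.

def dominates(a, b):
    """True if design a dominates design b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def is_valid_front(front):
    """True if no member of the front dominates another member."""
    return not any(dominates(a, b) for a in front for b in front if a is not b)

# (weight [kg], tip deflection [mm]) pairs:
good_front = [(1.2, 0.80), (1.5, 0.55), (2.1, 0.40)]
bad_front = good_front + [(2.2, 0.45)]  # dominated by (2.1, 0.40)
```

The repeatability gate then reduces to running NSGA-II three times and requiring each result to pass `is_valid_front` and land on a similar curve.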

Phase 4 Quality Gates — LLM CAD Generation

  • Geometric validity: All generated STEP files pass STEP checker
  • Parametric control: Generated geometry responds correctly to dimension changes
  • Manufacturing feasibility: No features <2mm thickness, no impossible geometries
  • Human review: 3 independent engineers can understand and approve generated CAD intent
  • FEA compatibility: Generated geometry meshes successfully in Gmsh

GEOMETRIC SANITY CHECKS:

  • Watertight solid (no gaps, overlaps, or open surfaces)
  • Positive volume
  • Reasonable aspect ratios
  • Manufacturable features only
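The sanity checks above can run as an automated gate, assuming the CAD kernel (e.g. Build123d) has already reported these properties for a generated part. The summary dict and its field names are hypothetical; the 2 mm thickness floor comes from the feasibility gate above.

```python
# Sketch of the geometric sanity gate. Field names are hypothetical;
# the 2 mm minimum wall thickness matches the manufacturing gate.

MIN_WALL_MM = 2.0

def sanity_gate(part: dict) -> list:
    """Return a list of human-readable failures; an empty list passes."""
    failures = []
    if not part["watertight"]:
        failures.append("solid is not watertight")
    if part["volume_mm3"] <= 0.0:
        failures.append("non-positive volume")
    if part["min_wall_mm"] < MIN_WALL_MM:
        failures.append(
            f"thinnest wall {part['min_wall_mm']:.2f} mm < {MIN_WALL_MM} mm"
        )
    return failures

ok_part = {"watertight": True, "volume_mm3": 5.4e4, "min_wall_mm": 3.1}
thin_part = {"watertight": True, "volume_mm3": 5.4e4, "min_wall_mm": 1.4}
```

Returning the failure list (rather than a bare boolean) matters for the human-review gate: the three reviewing engineers see why a part was rejected.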

Phase 5+ Quality Gates — CFD/Advanced Features

⚠️ RECOMMENDATION: DEFER TO PHASE 8+

These phases have research-level complexity. Focus on perfecting Phases 1-4 first.


4. Validation Strategy — Three-Tier Framework

Tier 1: Analytical Comparison (Required for ALL new solvers)

Problems with closed-form solutions for ABSOLUTE validation:

  • Cantilever beam tip deflection: δ = PL³/(3EI)
  • Plate with hole stress concentration: Kt = 3.0 for an infinite plate
  • Simply supported beam modal frequencies: fn = (nπ/L)²·√(EI/(ρA))/(2π)
  • 1D heat conduction: T(x) = T₀ + (Q·x)/(k·A)
  • Pressurized thin-wall cylinder: σ_hoop = pr/t, σ_axial = pr/(2t)

PASS CRITERIA: <2% error vs analytical solution for ALL solvers
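Three of these closed-form references, written as executable checks. Consistent N/mm/MPa units are assumed throughout, matching the Phase 1 unit convention; the example inputs are illustrative, not a real client case.

```python
import math

# Tier 1 analytical references as code (N/mm/MPa units assumed).

def cantilever_tip_deflection(P, L, E, I):
    """Tip-loaded cantilever: delta = P*L^3 / (3*E*I)."""
    return P * L**3 / (3 * E * I)

def ss_beam_frequency(n, L, E, I, rho, A):
    """Simply supported beam: f_n = (n*pi/L)^2 * sqrt(E*I/(rho*A)) / (2*pi)."""
    return (n * math.pi / L) ** 2 * math.sqrt(E * I / (rho * A)) / (2 * math.pi)

def hoop_stress(p, r, t):
    """Thin-walled pressurized cylinder: sigma_hoop = p*r/t."""
    return p * r / t

# Example: 100 N at the tip of a 200 mm steel cantilever.
E = 210_000.0   # MPa (N/mm^2)
I = 10_000.0    # mm^4
delta = cantilever_tip_deflection(100.0, 200.0, E, I)  # ~0.127 mm
```

A solver result for the same case then passes Tier 1 when `abs(fea - delta) / delta < 0.02`.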

Tier 2: Cross-Solver Comparison (CalculiX vs Nastran validation)

Same mesh, same loads, same materials, compare results:

  • Linear static: Stress and displacement fields
  • Modal analysis: Natural frequencies and mode shapes
  • Thermal: Temperature distributions
  • Nonlinear (Phase 6+): Load-displacement curves

PASS CRITERIA: <5% difference between CalculiX and Nastran on representative models

Tier 3: Real-World Validation (Antoine's actual client models)

Run optimization studies on actual client geometries, compare:

  • Optimized design performance vs original
  • Manufacturing feasibility of optimized result
  • Client acceptance of design changes

PASS CRITERIA: Client signs off on optimized design for manufacturing


5. What Can Go Wrong — Top 5 Project Derailers

1. 🔴 CRITICAL: Accuracy Drift / Silent Failures

Risk: Format conversions introduce small errors that compound over optimization iterations.
Impact: Wrong results delivered to clients → liability issues.
Mitigation: Automated regression testing; benchmark validation on every build.
Early Warning Signs:

  • Stress results "close but not exact" vs Nastran
  • Optimization converges to different answers between runs
  • Mass properties drift during geometry updates

2. 🔴 CRITICAL: Solver Expertise Gap

Risk: Team lacks deep CFD/FEA knowledge to debug when solvers fail.
Impact: Months lost debugging OpenFOAM convergence issues.
Mitigation:

  • Start with CalculiX only (simpler, better docs)
  • Hire a CFD consultant for the OpenFOAM phase
  • Build internal expertise gradually

Early Warning Signs:

  • Solver failures blamed on "bad mesh" without investigation
  • Parameter tuning by trial and error
  • No understanding of the physics behind solver options

3. 🟡 MAJOR: Scope Creep / Perfect Being Enemy of Good

Risk: Trying to implement every tool instead of delivering value incrementally.
Impact: An 18-month project with no delivered value.
Mitigation: Strict phase gates; client delivery after each phase.
Early Warning Signs:

  • Adding new tools before current ones are validated
  • "Just one more feature" before client delivery
  • No working optimization studies after 6 months

4. 🟡 MAJOR: MCP Server Reliability

Risk: Custom MCP servers are buggy and break with tool updates.
Impact: Automation fails and manual intervention is required.
Mitigation: Fall back to direct Python APIs; modular architecture.
Early Warning Signs:

  • MCP servers crash frequently
  • Time spent debugging servers > time spent on optimization
  • Abandoning MCP for manual scripts

5. 🟡 MAJOR: Client Expectation Mismatch

Risk: Clients expect NX-level polish from open-source tools.
Impact: Client rejection of deliverables.
Mitigation: Clear communication about tool capabilities; hybrid approach (open-source analysis + NX deliverables).
Early Warning Signs:

  • Clients asking for features only NX provides
  • Complaints about geometry quality
  • Requests for "professional" visualization

6. Antoine's Minimal Validation Path

What Antoine MUST personally validate:

  1. Final stress results accuracy — CalculiX vs Nastran comparison on client-type models
  2. Optimized geometry manufacturability — Can the result actually be made?
  3. Client presentation quality — Are the deliverables professional enough?
  4. Business case validation — Does this save time vs current NX workflow?

What HQ can self-validate autonomously:

  • Format conversion accuracy (automated tests)
  • Benchmark problem solutions (known analytical answers)
  • Code quality and testing (unit tests, integration tests)
  • Performance benchmarks (solve times, memory usage)

The 80/20 Rule for Antoine's Time:

  • 80% confidence: HQ automated validation catches errors
  • 20% verification: Antoine spot-checks on real models before client delivery

Spot-check schedule:

  • Week 2: Validate meshio conversion on 3 client models
  • Week 4: Run CalculiX vs Nastran comparison on a representative bracket
  • Week 8: Review the first LLM-generated CAD for sanity
  • Month 3: Final sign-off before first client delivery

7. Over-Engineering Warning — MVS (Minimum Viable Sprint)

Phase 1 MVS: Format Conversion ONLY

  • DO: meshio + pyNastran for BDF ↔ INP conversion
  • DON'T: Support 30 file formats, just focus on Nastran ↔ CalculiX

Phase 2 MVS: CalculiX Linear Static ONLY

  • DO: Basic linear static analysis matching Nastran
  • DON'T: Nonlinear, contact, dynamics, thermal all at once

Phase 3 MVS: 2-Objective NSGA-II ONLY

  • DO: Weight vs compliance trade-offs
  • DON'T: Many-objective optimization, exotic algorithms

Phase 4 MVS: Simple Parametric Geometry ONLY

  • DO: Boxes, cylinders, simple extrusions with Build123d
  • DON'T: Complex assemblies, surface modeling, AI-generated everything

SCOPE CREEP WARNING FLAGS:

  • "While we're at it, let's also add..."
  • "The client mentioned they might want..."
  • "This would be really cool if..."
  • "I saw this paper about..."

The Discipline Required:

Each MVS must be FULLY working and client-deliverable before adding complexity. A working 2-objective CalculiX optimization is worth more than a half-working 10-objective multi-physics system.


8. Risk Mitigation Strategy

Development Principles:

  1. Build horizontal before vertical — Get basic optimization working with ALL tools before adding advanced features to ANY tool
  2. Validate early and often — Never go >1 week without comparing to known results
  3. Client delivery drives priority — Features that directly improve client deliverables first
  4. Open-source complements NX, doesn't replace — Hybrid approach reduces risk

Quality Assurance Framework:

  1. Automated regression testing — Benchmark suite runs on every code change
  2. Staged deployment — Internal validation → Antoine review → client pilot → general release
  3. Error budgets — 2% error tolerance for solver comparisons, 1% for mass properties
  4. Documentation discipline — Every decision documented, every failure analyzed

Technical Risk Controls:

  1. Docker containerization — Eliminates "it works on my machine"
  2. Version pinning — Lock solver versions to prevent compatibility drift
  3. Fallback strategies — If Build123d fails, fallback to NX; if CalculiX fails, fallback to Nastran
  4. Modular architecture — Each tool can be swapped without rewriting everything

9. Success Metrics & Exit Criteria

Phase 1 Success Metrics:

  • 100% of test models convert without manual intervention
  • <2% accuracy loss in stress calculations
  • Conversion pipeline completes in <5 minutes per model

Phase 2 Success Metrics:

  • CalculiX matches Nastran within 2% on 5 benchmark problems
  • First optimization study completed end-to-end with CalculiX
  • Client accepts CalculiX results for non-critical analysis

Overall Project Success Metrics:

  • Technical: 3 client projects completed using open-source solver pipeline
  • Business: 50% reduction in software licensing costs
  • Capability: Multi-objective optimization standard offering
  • Quality: Zero client rejections due to solver accuracy issues

Exit Criteria (Stop Development):

  • Technical: Cannot achieve <5% accuracy vs Nastran after 3 months effort
  • Business: Open-source pipeline takes >2x longer than NX workflow
  • Resource: Antoine spending >50% time debugging vs delivering client value
  • Market: Clients consistently reject open-source analysis results

10. Final Recommendations

DO IMMEDIATELY (This Week):

  1. Set up automated benchmark testing — 5 problems, run daily
  2. Create Docker development environment — Reproducible builds
  3. Establish error tolerance budgets — 2% stress, 1% mass properties
  4. Document rollback strategy — How to revert if Phase N fails

DO IN PHASE 1 ONLY:

  • meshio + pyNastran integration
  • CalculiX basic linear static
  • Round-trip validation on client models
  • STOP when this works perfectly

DEFER TO PHASE 8+:

  • CFD/thermal analysis
  • Multi-physics coupling
  • Advanced topology optimization
  • System-level MDO
  • Any tool requiring research-level expertise

THE GOLDEN RULE:

Every phase must deliver working client value before advancing. A simple, reliable CalculiX integration that Antoine trusts is worth infinitely more than an ambitious multi-physics system that sometimes works.

This is the reality check. Build incrementally, validate obsessively, deliver constantly.


Prepared by Auditor 🔍
Confidence: HIGH
Recommendation: Proceed with phased approach and mandatory quality gates