Arsenal Development Plan — Risk Analysis + Quality Gates
TASK: Arsenal Development Plan — Risk Analysis + Quality Gates
STATUS: complete
RESULT: Comprehensive risk analysis with quality gates for each planned sprint
CONFIDENCE: HIGH
NOTES: This document provides the reality check framework to prevent scope creep and ensure quality delivery
1. Executive Summary — THE REALITY CHECK
The Arsenal plan is ambitious and valuable, but contains significant execution risks that could derail the project. The core concept is sound: expand from NX/Nastran-only to a multi-solver platform. However, the 35+ tool integration plan needs aggressive de-risking and phased validation.
CRITICAL FINDING: The plan attempts to solve 5 different problems simultaneously:
- Format conversion (meshio, pyNastran)
- Open-source FEA (CalculiX, OpenFOAM)
- Multi-objective optimization (pymoo)
- LLM-driven CAD generation (Build123d, MCP servers)
- Advanced topology optimization (FEniCS)
RECOMMENDATION: Execute in strict sequence with hard quality gates. Each phase must FULLY work before advancing.
2. Risk Registry — By Sprint
Phase 1: Universal Glue Layer (Week 1-2)
Technical Risks
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Format conversion accuracy loss | HIGH | CRITICAL | Round-trip validation on 10 reference models |
| meshio coordinate system errors | MEDIUM | MAJOR | Validate stress tensor rotation on cantilever beam |
| pyNastran OP2 parsing failures | MEDIUM | MAJOR | Test on Antoine's actual client models, not just tutorials |
| Mesh topology corruption | LOW | CRITICAL | Automated mesh quality checks (aspect ratio, Jacobian) |
Integration Risks
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Existing Optuna code breaks | MEDIUM | MAJOR | Branch protection + parallel development |
| AtomizerSpec compatibility | HIGH | MAJOR | Maintain backward compatibility, versioned specs |
| Python dependency hell | HIGH | MINOR | Docker containerization from day 1 |
Validation Risks
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Silent accuracy degradation | HIGH | CRITICAL | Automated benchmark regression suite |
| Units confusion (N vs lbf, mm vs in) | MEDIUM | CRITICAL | Explicit unit validation in every converter |
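The unit-confusion mitigation above can be made concrete with an explicit guard in every converter. A minimal sketch, assuming a hypothetical convention where each model carries a declared-units dict (the names `EXPECTED_UNITS` and `validate_units` are illustrative, not part of any existing API):

```python
# Hypothetical unit-consistency guard: every converter declares the unit
# system it expects, and a mismatch raises instead of silently mixing
# N/mm/MPa with lbf/in/psi.

EXPECTED_UNITS = {"force": "N", "length": "mm", "stress": "MPa"}

def validate_units(model_units):
    """Raise ValueError if the model's declared units differ from the pipeline's."""
    mismatches = {
        qty: (unit, EXPECTED_UNITS[qty])
        for qty, unit in model_units.items()
        if qty in EXPECTED_UNITS and unit != EXPECTED_UNITS[qty]
    }
    if mismatches:
        raise ValueError(f"Unit mismatch (got, expected): {mismatches}")

# A model tagged in imperial units is rejected before any conversion runs.
validate_units({"force": "N", "length": "mm", "stress": "MPa"})  # passes
try:
    validate_units({"force": "lbf", "length": "in"})
    rejected = False
except ValueError:
    rejected = True
```

The point of the design is fail-fast: a wrong unit system should be impossible to carry past the converter boundary.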
Phase 2: CalculiX Integration (Week 2-4)
Technical Risks
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| CalculiX solver divergence | HIGH | MAJOR | Start with linear static only, incremental complexity |
| Element type compatibility | MEDIUM | MAJOR | Limit to C3D10 (tet10) initially |
| Contact analysis failures | HIGH | CRITICAL | Phase 8+ only, not core requirement |
| Material model differences | MEDIUM | MAJOR | Side-by-side validation vs Nastran on same mesh |
Validation Risks
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Accuracy drift from Nastran | HIGH | CRITICAL | <2% error on 5 benchmark problems |
| Mesh sensitivity differences | MEDIUM | MAJOR | Convergence studies required |
Phase 3: Multi-Objective Optimization (Week 3-5)
Technical Risks
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| pymoo algorithm selection confusion | MEDIUM | MAJOR | Start with NSGA-II only, expand later |
| Pareto front interpretation errors | HIGH | MAJOR | Client education + decision support tools |
| Constraint handling differences | MEDIUM | MAJOR | Validate constraint satisfaction on known problems |
Over-Engineering Risks
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Analysis paralysis from too many options | HIGH | MAJOR | Limit to 2-objective problems initially |
| Perfect being enemy of good | HIGH | MINOR | Time-box Pareto visualization to 1 week |
Phase 4: LLM-Driven CAD (Week 4-8)
Technical Risks — HIGHEST RISK PHASE
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Build123d geometry generation hallucinations | HIGH | CRITICAL | Human validation + geometric sanity checks |
| MCP server reliability | HIGH | MAJOR | Fallback to direct API calls |
| CAD code generation produces invalid geometry | HIGH | CRITICAL | Automated STEP validation pipeline |
| Complex assembly constraints impossible | VERY HIGH | MAJOR | Limit to single parts initially |
Scope Creep Risks
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Trying to replace NX completely | HIGH | CRITICAL | Keep NX for production work, Build123d for optimization only |
| AI-generated geometry perfectionism | HIGH | MAJOR | Accept "good enough" for optimization, refine in NX |
Phase 5: CFD + Thermal (Month 2-3)
Technical Risks — COMPLEXITY EXPLOSION
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| OpenFOAM case setup expertise gap | VERY HIGH | CRITICAL | Hire CFD consultant or defer to Phase 8+ |
| Mesh quality for CFD vs FEA conflicts | HIGH | MAJOR | Separate mesh generation pipelines |
| Thermal coupling convergence issues | HIGH | MAJOR | Start with decoupled analysis |
| CFD solution validation difficulty | HIGH | CRITICAL | Need experimental data or commercial CFD comparison |
Dependencies Risks
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Docker/container complexity | MEDIUM | MAJOR | Cloud deployment or dedicated CFD workstation |
Phase 6: Multi-Physics Coupling (Month 3-4)
Technical Risks — RESEARCH-LEVEL DIFFICULTY
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| preCICE configuration expertise | VERY HIGH | CRITICAL | This is PhD-level work, need expert help |
| Coupling stability/convergence | HIGH | CRITICAL | Extensive parameter studies required |
| Debug complexity | VERY HIGH | MAJOR | Each physics must work perfectly before coupling |
Phase 7: System-Level MDO (Month 4-6)
Technical Risks — ACADEMIC RESEARCH TERRITORY
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| OpenMDAO complexity overwhelming | VERY HIGH | CRITICAL | Consider this Phase 9+, not Phase 7 |
| Gradient computation reliability | HIGH | CRITICAL | Validate gradients against finite differences |
| System convergence failures | HIGH | CRITICAL | Need MDO expert consultant |
3. Quality Gates Per Sprint
Phase 1 Quality Gates — MANDATORY PASS/FAIL
- Round-trip accuracy test: NX BDF → meshio → CalculiX INP → meshio → NX BDF, <0.1% geometry change
- Stress tensor validation: Same mesh, same loads in Nastran vs CalculiX via conversion, <2% stress difference
- Mass properties preservation: Convert 5 test parts, mass/CG/MOI within 0.1%
- Unit consistency check: All conversions maintain proper N/mm/MPa units
- Automation test: Full conversion pipeline runs without human intervention
FAILURE CRITERIA: Any >5% error in stress or >1% error in mass properties = STOP, fix before Phase 2
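The round-trip geometry gate reduces to comparing node coordinates before and after conversion. A minimal sketch, independent of file format (in the real pipeline the two point sets would come from reading the original BDF and the BDF produced by the BDF → INP → BDF round trip; the function name is illustrative):

```python
# Sketch of the <0.1% geometry-change gate: largest per-node drift,
# normalized by the model's bounding-box diagonal.
import math

def max_relative_drift(points_before, points_after):
    """Largest per-node displacement, normalized by the bounding-box diagonal."""
    lo = [min(p[i] for p in points_before) for i in range(3)]
    hi = [max(p[i] for p in points_before) for i in range(3)]
    diag = math.dist(lo, hi)
    worst = max(math.dist(a, b) for a, b in zip(points_before, points_after))
    return worst / diag

# Toy 3-node example: one node drifts 0.01 mm during the round trip
before = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (100.0, 50.0, 0.0)]
after  = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.01), (100.0, 50.0, 0.0)]
drift = max_relative_drift(before, after)
gate_passed = drift < 1e-3  # the <0.1% geometry-change gate
```

Normalizing by the bounding-box diagonal keeps the gate meaningful across models of very different sizes.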
Phase 2 Quality Gates — CalculiX Validation
- Cantilever beam: CalculiX vs analytical solution <1% error in tip deflection
- Plate with hole: CalculiX vs Nastran stress concentration factor within 2%
- Modal analysis: First 5 natural frequencies within 1% of Nastran
- Thermal analysis: Steady-state temperature distribution within 2% of analytical
- Performance benchmark: CalculiX solve time <2x Nastran for same model
BENCHMARK PROBLEMS (Mandatory):
- Cantilever beam (analytical comparison)
- Plate with circular hole (Peterson stress concentration)
- Simply supported beam modal (analytical frequencies)
- 1D heat conduction (analytical temperature distribution)
- Contact patch (Hertz contact pressure)
FAILURE CRITERIA: >5% error on any benchmark = STOP, investigate solver setup
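The pass/fail logic over the five mandatory benchmarks is simple enough to automate directly. A sketch with placeholder numbers (the result values below are illustrative, not real solver output):

```python
# Sketch of the benchmark gate: any result off by more than the tolerance
# stops the phase. (computed, reference) values are placeholders only.

def relative_error(computed, reference):
    return abs(computed - reference) / abs(reference)

def gate(results, tol=0.05):
    """Return the benchmarks exceeding the tolerance (empty list = pass)."""
    return [name for name, (calc, ref) in results.items()
            if relative_error(calc, ref) > tol]

benchmarks = {
    "cantilever_tip_deflection_mm": (2.51, 2.50),
    "plate_hole_Kt":                (3.06, 3.00),
    "first_mode_Hz":                (142.0, 141.0),
    "steady_state_temp_C":          (88.0, 87.5),
    "hertz_contact_MPa":            (1190.0, 1200.0),
}
failures = gate(benchmarks)  # empty list -> proceed; otherwise STOP and investigate
```

Running this suite on every build is the cheapest insurance against silent accuracy drift.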
Phase 3 Quality Gates — Multi-Objective Optimization
- Pareto front validation: Known bi-objective problem produces expected trade-off curve
- Constraint satisfaction: All solutions on Pareto front satisfy constraints within tolerance
- Repeatability: Same problem run 3 times produces consistent results
- Decision support: TOPSIS ranking produces sensible design recommendations
- Performance: Multi-objective optimization completes in reasonable time (<2x single-objective)
TEST PROBLEM: Cantilever beam optimization (minimize weight vs minimize tip deflection, stress constraint)
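The TOPSIS decision-support gate can be prototyped without any optimization library. A minimal sketch for ranking Pareto-front designs where both objectives are minimized (weight in kg, tip deflection in mm); the equal criterion weights and the sample front are assumptions for illustration:

```python
# Minimal TOPSIS ranking sketch: vector-normalize, weight, then score each
# design by closeness to the ideal point vs the nadir point.
import math

def topsis_rank(designs, weights):
    """Return design indices ranked best-first by TOPSIS closeness."""
    n_crit = len(weights)
    norms = [math.sqrt(sum(d[j] ** 2 for d in designs)) for j in range(n_crit)]
    v = [[w * d[j] / norms[j] for j, w in enumerate(weights)] for d in designs]
    # both criteria are costs, so the ideal is the column-wise minimum
    ideal = [min(row[j] for row in v) for j in range(n_crit)]
    nadir = [max(row[j] for row in v) for j in range(n_crit)]
    def closeness(row):
        d_best = math.dist(row, ideal)
        d_worst = math.dist(row, nadir)
        return d_worst / (d_best + d_worst)
    scores = [closeness(row) for row in v]
    return sorted(range(len(designs)), key=lambda i: scores[i], reverse=True)

# Three Pareto designs: (weight kg, tip deflection mm); equal weighting assumed
front = [(1.2, 4.0), (1.8, 2.5), (2.6, 2.1)]
ranking = topsis_rank(front, weights=(0.5, 0.5))
```

A balanced middle design typically ranks first under equal weights, which is exactly the "sensible recommendation" behavior the gate above demands.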
Phase 4 Quality Gates — LLM CAD Generation
- Geometric validity: All generated STEP files pass STEP checker
- Parametric control: Generated geometry responds correctly to dimension changes
- Manufacturing feasibility: No features <2mm thickness, no impossible geometries
- Human review: 3 independent engineers can understand and approve generated CAD intent
- FEA compatibility: Generated geometry meshes successfully in Gmsh
GEOMETRIC SANITY CHECKS:
- Watertight solid (no gaps, overlaps, or open surfaces)
- Positive volume
- Reasonable aspect ratios
- Manufacturable features only
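The positive-volume check in the list above can be automated with the divergence theorem over an oriented triangle surface mesh. A minimal sketch, assuming the generated STEP has already been tessellated into triangles (the tessellation step itself is outside this snippet):

```python
# Sketch of the positive-volume sanity check: signed volume of a closed,
# consistently outward-oriented triangle mesh via scalar triple products.

def signed_volume(vertices, triangles):
    """Signed volume enclosed by a closed triangle surface mesh."""
    total = 0.0
    for i, j, k in triangles:
        ax, ay, az = vertices[i]
        bx, by, bz = vertices[j]
        cx, cy, cz = vertices[k]
        # a . (b x c) contributes 6x the signed volume of tet (origin, a, b, c)
        total += (ax * (by * cz - bz * cy)
                  - ay * (bx * cz - bz * cx)
                  + az * (bx * cy - by * cx))
    return total / 6.0

# Unit tetrahedron with outward-wound faces: enclosed volume should be 1/6
verts = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
faces = [(1, 2, 3), (0, 3, 2), (0, 1, 3), (0, 2, 1)]
vol = signed_volume(verts, faces)
volume_ok = vol > 0.0
```

A negative or near-zero result flags inverted winding or a non-watertight surface before the geometry reaches meshing.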
Phase 5+ Quality Gates — CFD/Advanced Features
⚠️ RECOMMENDATION: DEFER TO PHASE 8+
These phases have research-level complexity. Focus on perfecting Phases 1-4 first.
4. Validation Strategy — Three-Tier Framework
Tier 1: Analytical Comparison (Required for ALL new solvers)
Problems with closed-form solutions for ABSOLUTE validation:
- Cantilever beam deflection: δ = PL³/(3EI)
- Plate with hole stress concentration: Kt = 3.0 for infinite plate
- Simply supported beam modal: fn = (nπ/L)²·√(EI/(ρA))/(2π)
- 1D heat conduction: T(x) = T₀ + (Q·x)/(k·A)
- Pressurized cylinder: σ_hoop = pr/t, σ_axial = pr/(2t)
PASS CRITERIA: <2% error vs analytical solution for ALL solvers
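The Tier-1 closed-form solutions above can be coded directly as oracle functions for the benchmark suite. A sketch in consistent N/mm/MPa units (the example beam dimensions are illustrative):

```python
# Tier-1 analytical references as oracle functions (N/mm/MPa units assumed).
import math

def cantilever_tip_deflection(P, L, E, I):
    """End-loaded cantilever: delta = P L^3 / (3 E I)."""
    return P * L**3 / (3 * E * I)

def hoop_stress(p, r, t):
    """Thin-walled pressurized cylinder: sigma_hoop = p r / t."""
    return p * r / t

def axial_stress(p, r, t):
    """Thin-walled pressurized cylinder: sigma_axial = p r / (2 t)."""
    return p * r / (2 * t)

def ss_beam_frequency(n, L, E, I, rho, A):
    """Simply supported beam: f_n = (n pi / L)^2 sqrt(EI/(rho A)) / (2 pi)."""
    return (n * math.pi / L) ** 2 * math.sqrt(E * I / (rho * A)) / (2 * math.pi)

# Example: steel cantilever, 100 N tip load, 500 mm span, 10x10 mm section
I_sec = 10 * 10**3 / 12                                      # mm^4
delta = cantilever_tip_deflection(100.0, 500.0, 210_000.0, I_sec)  # mm
```

Every new solver's output is then compared against these oracles, never against another numerical result alone.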
Tier 2: Cross-Solver Comparison (CalculiX vs Nastran validation)
Same mesh, same loads, same materials, compare results:
- Linear static: Stress and displacement fields
- Modal analysis: Natural frequencies and mode shapes
- Thermal: Temperature distributions
- Nonlinear (Phase 6+): Load-displacement curves
PASS CRITERIA: <5% difference between CalculiX and Nastran on representative models
Tier 3: Real-World Validation (Antoine's actual client models)
Run optimization studies on actual client geometries, compare:
- Optimized design performance vs original
- Manufacturing feasibility of optimized result
- Client acceptance of design changes
PASS CRITERIA: Client signs off on optimized design for manufacturing
5. What Can Go Wrong — Top 5 Project Derailers
1. 🔴 CRITICAL: Accuracy Drift / Silent Failures
Risk: Format conversions introduce small errors that compound over optimization iterations
Impact: Wrong results delivered to clients → liability issues
Mitigation: Automated regression testing, benchmark validation on every build
Early Warning Signs:
- Stress results "close but not exact" vs Nastran
- Optimization converges to different answers between runs
- Mass properties drift during geometry updates
2. 🔴 CRITICAL: Solver Expertise Gap
Risk: Team lacks deep CFD/FEA knowledge to debug when solvers fail
Impact: Months lost debugging OpenFOAM convergence issues
Mitigation:
- Start with CalculiX only (simpler, better docs)
- Hire CFD consultant for OpenFOAM phase
- Build internal expertise gradually
Early Warning Signs:
- Solver failures blamed on "bad mesh" without investigation
- Parameter tuning by trial-and-error
- No understanding of physics behind solver options
3. 🟡 MAJOR: Scope Creep / Perfect Being Enemy of Good
Risk: Trying to implement every tool instead of delivering value incrementally
Impact: 18-month project with no delivered value
Mitigation: Strict phase gates, client delivery after each phase
Early Warning Signs:
- Adding new tools before current ones are validated
- "Just one more feature" before client delivery
- No working optimization studies after 6 months
4. 🟡 MAJOR: MCP Server Reliability
Risk: Custom MCP servers are buggy, break with tool updates
Impact: Automation fails, manual intervention required
Mitigation: Fallback to direct Python APIs, modular architecture
Early Warning Signs:
- MCP servers crash frequently
- Time spent debugging servers > time spent on optimization
- Abandoning MCP for manual scripts
5. 🟡 MAJOR: Client Expectation Mismatch
Risk: Clients expect NX-level polish from open-source tools
Impact: Client rejection of deliverables
Mitigation: Clear communication about tool capabilities, hybrid approach (open-source analysis + NX deliverables)
Early Warning Signs:
- Clients asking for features only NX provides
- Complaints about geometry quality
- Requests for "professional" visualization
6. Antoine's Minimal Validation Path
What Antoine MUST personally validate:
- Final stress results accuracy — CalculiX vs Nastran comparison on client-type models
- Optimized geometry manufacturability — Can the result actually be made?
- Client presentation quality — Are the deliverables professional enough?
- Business case validation — Does this save time vs current NX workflow?
What HQ can self-validate autonomously:
- Format conversion accuracy (automated tests)
- Benchmark problem solutions (known analytical answers)
- Code quality and testing (unit tests, integration tests)
- Performance benchmarks (solve times, memory usage)
The 80/20 Rule for Antoine's Time:
- 80% confidence: HQ automated validation catches errors
- 20% verification: Antoine spot-checks on real models before client delivery
Recommended Antoine Validation Schedule:
- Week 2: Validate meshio conversion on 3 client models
- Week 4: Run CalculiX vs Nastran comparison on representative bracket
- Week 8: Review first LLM-generated CAD for sanity
- Month 3: Final sign-off before first client delivery
7. Over-Engineering Warning — MVS (Minimum Viable Sprint)
Phase 1 MVS: Format Conversion ONLY
- DO: meshio + pyNastran for BDF ↔ INP conversion
- DON'T: Support 30 file formats, just focus on Nastran ↔ CalculiX
Phase 2 MVS: CalculiX Linear Static ONLY
- DO: Basic linear static analysis matching Nastran
- DON'T: Nonlinear, contact, dynamics, thermal all at once
Phase 3 MVS: 2-Objective NSGA-II ONLY
- DO: Weight vs compliance trade-offs
- DON'T: Many-objective optimization, exotic algorithms
Phase 4 MVS: Simple Parametric Geometry ONLY
- DO: Boxes, cylinders, simple extrusions with Build123d
- DON'T: Complex assemblies, surface modeling, AI-generated everything
SCOPE CREEP WARNING FLAGS:
- "While we're at it, let's also add..."
- "The client mentioned they might want..."
- "This would be really cool if..."
- "I saw this paper about..."
The Discipline Required:
Each MVS must be FULLY working and client-deliverable before adding complexity. A working 2-objective CalculiX optimization is worth more than a half-working 10-objective multi-physics system.
8. Risk Mitigation Strategy
Development Principles:
- Build horizontal before vertical — Get basic optimization working with ALL tools before adding advanced features to ANY tool
- Validate early and often — Never go >1 week without comparing to known results
- Client delivery drives priority — Features that directly improve client deliverables first
- Open-source complements NX, doesn't replace — Hybrid approach reduces risk
Quality Assurance Framework:
- Automated regression testing — Benchmark suite runs on every code change
- Staged deployment — Internal validation → Antoine review → client pilot → general release
- Error budgets — 2% error tolerance for solver comparisons, 1% for mass properties
- Documentation discipline — Every decision documented, every failure analyzed
Technical Risk Controls:
- Docker containerization — Eliminates "it works on my machine"
- Version pinning — Lock solver versions to prevent compatibility drift
- Fallback strategies — If Build123d fails, fall back to NX; if CalculiX fails, fall back to Nastran
- Modular architecture — Each tool can be swapped without rewriting everything
9. Success Metrics & Exit Criteria
Phase 1 Success Metrics:
- 100% of test models convert without manual intervention
- <2% accuracy loss in stress calculations
- Conversion pipeline completes in <5 minutes per model
Phase 2 Success Metrics:
- CalculiX matches Nastran within 2% on 5 benchmark problems
- First optimization study completed end-to-end with CalculiX
- Client accepts CalculiX results for non-critical analysis
Overall Project Success Metrics:
- Technical: 3 client projects completed using open-source solver pipeline
- Business: 50% reduction in software licensing costs
- Capability: Multi-objective optimization standard offering
- Quality: Zero client rejections due to solver accuracy issues
Exit Criteria (Stop Development):
- Technical: Cannot achieve <5% accuracy vs Nastran after 3 months effort
- Business: Open-source pipeline takes >2x longer than NX workflow
- Resource: Antoine spending >50% time debugging vs delivering client value
- Market: Clients consistently reject open-source analysis results
10. Final Recommendations
DO IMMEDIATELY (This Week):
- Set up automated benchmark testing — 5 problems, run daily
- Create Docker development environment — Reproducible builds
- Establish error tolerance budgets — 2% stress, 1% mass properties
- Document rollback strategy — How to revert if Phase N fails
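The error-budget action item above is easiest to enforce when the tolerances live in one place that every test imports. A minimal config sketch (the dict and function names are assumptions, not existing code):

```python
# Hypothetical single source of truth for the error budgets named in this plan.
ERROR_BUDGETS = {
    "stress_vs_nastran":     0.02,  # 2% cross-solver stress difference
    "mass_properties":       0.01,  # 1% mass/CG/MOI drift on conversion
    "analytical_benchmarks": 0.02,  # 2% vs closed-form Tier-1 solutions
    "hard_stop_stress":      0.05,  # >5% anywhere = STOP and investigate
}

def within_budget(metric, observed_error):
    """True if an observed relative error is inside its declared budget."""
    return observed_error <= ERROR_BUDGETS[metric]
```

Centralizing the numbers prevents the quiet tolerance inflation that happens when each test hard-codes its own threshold.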
DO IN PHASE 1 ONLY:
- meshio + pyNastran integration
- CalculiX basic linear static
- Round-trip validation on client models
- STOP when this works perfectly
DEFER TO PHASE 8+:
- CFD/thermal analysis
- Multi-physics coupling
- Advanced topology optimization
- System-level MDO
- Any tool requiring research-level expertise
THE GOLDEN RULE:
Every phase must deliver working client value before advancing. A simple, reliable CalculiX integration that Antoine trusts is worth infinitely more than an ambitious multi-physics system that sometimes works.
This is the reality check. Build incrementally, validate obsessively, deliver constantly.
Prepared by Auditor 🔍
Confidence: HIGH
Recommendation: Proceed with phased approach and mandatory quality gates