feat: Implement Study Interview Mode as default study creation method

Study Interview Mode is now the DEFAULT for all study creation requests.
This intelligent Q&A system guides users through optimization setup with:

- 7-phase interview flow: introspection → objectives → constraints → design_variables → validation → review → complete
- Material-aware validation with 12 materials and fuzzy name matching
- Anti-pattern detection for 12 common mistakes (mass-no-constraint, stress-over-yield, etc.)
- Auto extractor mapping E1-E24 based on goal keywords
- State persistence with JSON serialization and backup rotation
- StudyBlueprint generation with full validation

Triggers: "create a study", "new study", "optimize this", any study creation intent
Skip with: "skip interview", "quick setup", "manual config"

Components:
- StudyInterviewEngine: Main orchestrator
- QuestionEngine: Conditional logic evaluation
- EngineeringValidator: MaterialsDatabase + AntiPatternDetector
- InterviewPresenter: Markdown formatting for Claude
- StudyBlueprint: Validated configuration output
- InterviewState: Persistent state management

All 129 tests passing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-03 11:06:07 -05:00
parent b1ffc64407
commit 32caa5d05c
27 changed files with 9737 additions and 11 deletions
--- a/knowledge_base/lac/session_insights/failure.jsonl
+++ b/knowledge_base/lac/session_insights/failure.jsonl
@@ -6,3 +6,4 @@
 {"timestamp":"2025-12-28T17:30:00","category":"failure","context":"V5 turbo optimization created from scratch instead of copying V4. Multiple critical components were missing or wrong: no license server, wrong extraction keys (filtered_rms_nm vs relative_filtered_rms_nm), wrong mfg_90 key, missing figure_path parameter, incomplete version regex.","insight":"STUDY DERIVATION FAILURE: When creating a new study version (V5 from V4), NEVER rewrite the run_optimization.py from scratch. ALWAYS copy the working version first, then add/modify only the new feature (e.g., L-BFGS polish). Rewriting caused 5 independent bugs: (1) missing LICENSE_SERVER setup, (2) wrong extraction key filtered_rms_nm instead of relative_filtered_rms_nm, (3) wrong mfg_90 key, (4) missing figure_path=None in extractor call, (5) incomplete version regex missing DesigncenterNX pattern. The FEA/extraction pipeline is PROVEN CODE - never rewrite it. Only add new optimization strategies as modules on top.","confidence":1.0,"tags":["study-creation","copy-dont-rewrite","extraction","license-server","v5","critical"],"severity":"critical","rule":"When deriving a new study version, COPY the entire working run_optimization.py first. Add new features as ADDITIONS, not rewrites. The FEA pipeline (license, NXSolver setup, extraction) is proven - never rewrite it."}
 {"timestamp":"2025-12-28T21:30:00","category":"failure","context":"V5 flat back turbo optimization with MLP surrogate + L-BFGS polish. Surrogate predicted WS~280 but actual FEA gave WS~365-377. Error of 85-96 (30%+ relative error). All L-BFGS solutions converged to same fake optimum that didn't exist in reality.","insight":"SURROGATE + L-BFGS FAILURE MODE: Gradient-based optimization on MLP surrogates finds 'fake optima' that don't exist in real FEA. The surrogate has smooth gradients everywhere, but L-BFGS descends to regions OUTSIDE the training distribution where predictions are wildly wrong. V5 results: (1) Best TPE trial: WS=290.18, (2) Best L-BFGS trial: WS=325.27, (3) Worst L-BFGS trials: WS=376.52. The fancy L-BFGS polish made results WORSE than random TPE. Key issues: (a) No uncertainty quantification - can't detect out-of-distribution, (b) No mass constraint in surrogate - L-BFGS finds infeasible designs (122-124kg vs 120kg limit), (c) L-BFGS converges to same bad point from multiple starting locations (trials 31-44 all gave WS=376.52).","confidence":1.0,"tags":["surrogate","mlp","lbfgs","gradient-descent","fake-optima","out-of-distribution","v5","turbo"],"severity":"critical","rule":"NEVER trust gradient descent on surrogates without: (1) Uncertainty quantification to reject OOD predictions, (2) Mass/constraint prediction to enforce feasibility, (3) Trust-region to stay within training distribution. Pure TPE with real FEA often beats surrogate+gradient methods."}
 {"timestamp": "2025-12-29T15:29:55.869508", "category": "failure", "context": "Trial 5 solver error", "insight": "convergence_failure: Convergence failure at iteration 100", "confidence": 0.7, "tags": ["solver", "convergence_failure", "automatic"]}
+{"timestamp": "2026-01-01T21:06:37.877252", "category": "failure", "context": "V13 optimization had 45 FEA failures (34% failure rate)", "insight": "rib_thickness parameter has CAD geometry constraint at ~9mm. All trials with rib_thickness > 9.0 failed. Set max to 9.0 (was 12.0). This is a critical CAD constraint not documented anywhere - the NX model geometry breaks with thicker radial ribs.", "confidence": 0.95, "tags": ["m1_mirror", "cad_constraint", "rib_thickness", "V13", "parameter_bounds"]}
--- a/knowledge_base/lac/session_insights/success_pattern.jsonl
+++ b/knowledge_base/lac/session_insights/success_pattern.jsonl
@@ -7,3 +7,5 @@
 {"timestamp": "2025-12-28T12:28:04.706624", "category": "success_pattern", "context": "Implemented L-BFGS gradient optimizer for surrogate polish phase", "insight": "L-BFGS on trained MLP surrogates provides 100-1000x faster convergence than derivative-free methods (TPE, CMA-ES) for local refinement. Key: use multi-start from top FEA candidates, not random initialization. Integration: GradientOptimizer class in optimization_engine/gradient_optimizer.py.", "confidence": 0.9, "tags": ["optimization", "lbfgs", "surrogate", "gradient", "polish"]}
 {"timestamp": "2025-12-29T09:30:00", "category": "success_pattern", "context": "V6 pure TPE outperformed V5 surrogate+L-BFGS by 22%", "insight": "SIMPLE BEATS COMPLEX: V6 Pure TPE achieved WS=225.41 vs V5's WS=290.18 (22.3% better). Key insight: surrogates fail when gradient methods descend to OOD regions. Fix: EnsembleSurrogate with (1) N=5 MLPs for disagreement-based uncertainty, (2) OODDetector with KNN+z-score, (3) acquisition_score balancing exploitation+exploration, (4) trust-region L-BFGS that stays in training distribution. Never trust point predictions - always require uncertainty bounds. Protocol: SYS_16_SELF_AWARE_TURBO.md. Code: optimization_engine/surrogates/ensemble_surrogate.py", "confidence": 1.0, "tags": ["ensemble", "uncertainty", "ood", "surrogate", "v6", "tpe", "self-aware"]}
 {"timestamp": "2025-12-29T09:47:47.612485", "category": "success_pattern", "context": "Disk space optimization for FEA studies", "insight": "Per-trial FEA files are ~150MB but only OP2+JSON (~70MB) are essential. PRT/FEM/SIM/DAT are copies of master files and can be deleted after study completion. Archive to dalidou server for long-term storage.", "confidence": 0.95, "tags": ["disk_optimization", "archival", "study_management", "dalidou"], "related_files": ["optimization_engine/utils/study_archiver.py", "docs/protocols/operations/OP_07_DISK_OPTIMIZATION.md"]}
+{"timestamp": "2026-01-02T14:30:00", "category": "success_pattern", "context": "Study Interview Mode implementation and routing update", "insight": "STUDY CREATION DEFAULT: Interview Mode is now the DEFAULT for all study creation requests. Triggers: create a study, new study, set up study, optimize this, minimize mass - any study creation intent. Benefits: (1) Material-aware validation checks stress vs yield, (2) Anti-pattern detection warns about mass-no-constraint, (3) Auto extractor mapping E1-E10, (4) State persistence for interrupted sessions, (5) Blueprint generation with full validation. Skip with: skip interview, quick setup, manual config. Implementation: optimization_engine/interview/ with StudyInterviewEngine, QuestionEngine, EngineeringValidator, StudyBlueprint. All 129 tests passing.", "confidence": 1.0, "tags": ["interview_mode", "study_creation", "default", "validation", "anti_pattern", "materials"], "related_files": [".claude/skills/modules/study-interview-mode.md", "docs/protocols/operations/OP_01_CREATE_STUDY.md", "optimization_engine/interview/study_interview.py"]}
+{"timestamp": "2026-01-02T14:45:00", "category": "success_pattern", "context": "Study Interview Mode implementation complete", "insight": "INTERVIEW MODE DEFAULT: Study creation now uses Interview Mode by default for all study creation requests. This is a major usability improvement. Triggers: create a study, new study, set up, optimize this - any study creation intent. Key features: (1) Material-aware validation with 12 materials and fuzzy name matching, (2) Anti-pattern detection for 12 common mistakes, (3) Auto extractor mapping E1-E24, (4) 7-phase interview flow, (5) State persistence for interrupted sessions, (6) Blueprint validation before generation. Skip with: skip interview, quick setup, manual. Implementation in optimization_engine/interview/ with 129 tests passing. Full documentation in: .claude/skills/modules/study-interview-mode.md, docs/protocols/operations/OP_01_CREATE_STUDY.md", "confidence": 1.0, "tags": ["interview_mode", "study_creation", "default", "usability", "materials", "anti_pattern", "validation"], "related_files": [".claude/skills/modules/study-interview-mode.md", "docs/protocols/operations/OP_01_CREATE_STUDY.md", "optimization_engine/interview/"]}
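
The commit message mentions material-aware validation with fuzzy name matching and a stress-over-yield anti-pattern, but the diff shown here does not include that code. The following is a minimal sketch of how such a check might look, not the actual `EngineeringValidator` implementation. The `MATERIALS` table, its illustrative yield values, and the function names `match_material` and `stress_over_yield` are all assumptions made for this example; only `difflib.get_close_matches` is a real standard-library call.

```python
from difflib import get_close_matches

# Hypothetical materials table: name -> yield strength in MPa.
# The values are illustrative placeholders, not the project's database.
MATERIALS = {
    "aluminum 6061-t6": 276.0,
    "steel 304": 215.0,
    "titanium ti-6al-4v": 880.0,
}


def match_material(name, cutoff=0.6):
    """Return the closest known material key, or None if nothing is similar enough."""
    key = name.strip().lower()
    if key in MATERIALS:
        return key
    # get_close_matches ranks candidates by difflib's similarity ratio.
    hits = get_close_matches(key, list(MATERIALS), n=1, cutoff=cutoff)
    return hits[0] if hits else None


def stress_over_yield(material_key, max_stress_mpa):
    """Flag the anti-pattern where the allowed stress exceeds the yield strength."""
    return max_stress_mpa > MATERIALS[material_key]


if __name__ == "__main__":
    material = match_material("Aluminium 6061-T6")  # British spelling still resolves
    print(material, stress_over_yield(material, 300.0))
```

A similarity cutoff like this trades false matches against missed matches; a real validator would likely ask the user to confirm a fuzzy match rather than accept it silently.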