feat: Pre-migration checkpoint - updated docs and utilities

Updates before optimization_engine migration: - Updated migration plan to v2.1 with complete file inventory - Added OP_07 disk optimization protocol - Added SYS_16 self-aware turbo protocol - Added study archiver and cleanup utilities - Added ensemble surrogate module - Updated NX solver and session manager - Updated zernike HTML generator - Added context engineering plan - LAC session insights updates 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 10:22:45 -05:00
parent faa7779a43
commit 82f36689b7
21 changed files with 6304 additions and 890 deletions
--- a/knowledge_base/lac/session_insights/failure.jsonl
+++ b/knowledge_base/lac/session_insights/failure.jsonl
@@ -3,3 +3,5 @@
 {"timestamp":"2025-12-19T10:00:00","category":"workaround","context":"NX journal execution via cmd /c with environment variables fails silently or produces garbled output. Multiple attempts with cmd /c SET and && chaining failed to capture run_journal.exe output.","insight":"CRITICAL WORKAROUND: When executing NX journals from Claude Code on Windows, use PowerShell with [Environment]::SetEnvironmentVariable() method instead of cmd /c or $env: syntax. The correct pattern is: powershell -Command \"[Environment]::SetEnvironmentVariable('SPLM_LICENSE_SERVER', '28000@dalidou;28000@100.80.199.40', 'Process'); & 'C:\\Program Files\\Siemens\\DesigncenterNX2512\\NXBIN\\run_journal.exe' 'journal.py' -args 'arg1' 'arg2' 2>&1\". The $env: syntax gets corrupted when passed through bash (colon gets interpreted). The cmd /c SET syntax often fails to capture output. This PowerShell pattern reliably sets license server and captures all output.","confidence":1.0,"tags":["nx","powershell","run_journal","license-server","windows","cmd-workaround"],"severity":"high","rule":"ALWAYS use PowerShell with [Environment]::SetEnvironmentVariable() for NX journal execution. NEVER use cmd /c SET or $env: syntax for setting SPLM_LICENSE_SERVER."}
 {"timestamp":"2025-12-19T15:30:00","category":"failure","context":"CMA-ES optimization V7 started with random sample instead of baseline. First trial had whiffle_min=45.73 instead of baseline 62.75, resulting in WS=329 instead of expected ~281.","insight":"CMA-ES with Optuna CmaEsSampler does NOT evaluate x0 (baseline) first - it samples AROUND x0 with sigma0 step size. The x0 parameter only sets the CENTER of the initial sampling distribution, not the first trial. To ensure baseline is evaluated first, use study.enqueue_trial(x0) after creating the study. This is critical for refinement studies where you need to compare against a known-good baseline. Pattern: if len(study.trials) == 0: study.enqueue_trial(x0)","confidence":1.0,"tags":["cma-es","optuna","baseline","x0","enqueue","optimization"],"severity":"high","rule":"When using CmaEsSampler with a known baseline, ALWAYS enqueue the baseline as trial 0 using study.enqueue_trial(x0). The x0 parameter alone does NOT guarantee baseline evaluation."}
 {"timestamp":"2025-12-22T14:00:00","category":"failure","context":"V10 mirror optimization reported impossibly good relative WFE values (40-20=1.99nm instead of ~6nm, 60-20=6.82nm instead of ~13nm). User noticed results were 'too good to be true'.","insight":"CRITICAL BUG IN RELATIVE WFE CALCULATION: The V10 run_optimization.py computed relative WFE as abs(RMS_target - RMS_ref) instead of RMS(WFE_target - WFE_ref). This is mathematically WRONG because |RMS(A) - RMS(B)| ≠ RMS(A - B). The correct approach is to compute the node-by-node WFE difference FIRST, then fit Zernike to the difference field, then compute RMS. The bug gave values 3-4x lower than correct values because the 20° reference had HIGHER absolute WFE than 40°/60°, so the subtraction gave negative values, and abs() hid the problem. The fix is to use extractor.extract_relative() which correctly computes node-by-node differences. Both ZernikeExtractor and ZernikeOPDExtractor now have extract_relative() methods.","confidence":1.0,"tags":["zernike","wfe","relative-wfe","extract_relative","critical-bug","v10"],"severity":"critical","rule":"NEVER compute relative WFE as abs(RMS_target - RMS_ref). ALWAYS use extract_relative() which computes RMS(WFE_target - WFE_ref) by doing node-by-node subtraction first, then Zernike fitting, then RMS."}
+{"timestamp":"2025-12-28T17:30:00","category":"failure","context":"V5 turbo optimization created from scratch instead of copying V4. Multiple critical components were missing or wrong: no license server, wrong extraction keys (filtered_rms_nm vs relative_filtered_rms_nm), wrong mfg_90 key, missing figure_path parameter, incomplete version regex.","insight":"STUDY DERIVATION FAILURE: When creating a new study version (V5 from V4), NEVER rewrite the run_optimization.py from scratch. ALWAYS copy the working version first, then add/modify only the new feature (e.g., L-BFGS polish). Rewriting caused 5 independent bugs: (1) missing LICENSE_SERVER setup, (2) wrong extraction key filtered_rms_nm instead of relative_filtered_rms_nm, (3) wrong mfg_90 key, (4) missing figure_path=None in extractor call, (5) incomplete version regex missing DesigncenterNX pattern. The FEA/extraction pipeline is PROVEN CODE - never rewrite it. Only add new optimization strategies as modules on top.","confidence":1.0,"tags":["study-creation","copy-dont-rewrite","extraction","license-server","v5","critical"],"severity":"critical","rule":"When deriving a new study version, COPY the entire working run_optimization.py first. Add new features as ADDITIONS, not rewrites. The FEA pipeline (license, NXSolver setup, extraction) is proven - never rewrite it."}
+{"timestamp":"2025-12-28T21:30:00","category":"failure","context":"V5 flat back turbo optimization with MLP surrogate + L-BFGS polish. Surrogate predicted WS~280 but actual FEA gave WS~365-377. Error of 85-96 (30%+ relative error). All L-BFGS solutions converged to same fake optimum that didn't exist in reality.","insight":"SURROGATE + L-BFGS FAILURE MODE: Gradient-based optimization on MLP surrogates finds 'fake optima' that don't exist in real FEA. The surrogate has smooth gradients everywhere, but L-BFGS descends to regions OUTSIDE the training distribution where predictions are wildly wrong. V5 results: (1) Best TPE trial: WS=290.18, (2) Best L-BFGS trial: WS=325.27, (3) Worst L-BFGS trials: WS=376.52. The fancy L-BFGS polish made results WORSE than random TPE. Key issues: (a) No uncertainty quantification - can't detect out-of-distribution, (b) No mass constraint in surrogate - L-BFGS finds infeasible designs (122-124kg vs 120kg limit), (c) L-BFGS converges to same bad point from multiple starting locations (trials 31-44 all gave WS=376.52).","confidence":1.0,"tags":["surrogate","mlp","lbfgs","gradient-descent","fake-optima","out-of-distribution","v5","turbo"],"severity":"critical","rule":"NEVER trust gradient descent on surrogates without: (1) Uncertainty quantification to reject OOD predictions, (2) Mass/constraint prediction to enforce feasibility, (3) Trust-region to stay within training distribution. Pure TPE with real FEA often beats surrogate+gradient methods."}
--- a/knowledge_base/lac/session_insights/success_pattern.jsonl
+++ b/knowledge_base/lac/session_insights/success_pattern.jsonl
@@ -5,3 +5,5 @@
 {"timestamp": "2025-12-28T10:15:00", "category": "success_pattern", "context": "Unified trial management with TrialManager and DashboardDB", "insight": "TRIAL MANAGEMENT PATTERN: Use TrialManager for consistent trial_NNNN naming across all optimization methods (Optuna, Turbo, GNN, manual). Key principles: (1) Trial numbers NEVER reset (monotonic), (2) Folders NEVER get overwritten, (3) Database always synced with filesystem, (4) Surrogate predictions are NOT trials - only FEA results. DashboardDB provides Optuna-compatible schema for dashboard integration. Path: optimization_engine/utils/trial_manager.py", "confidence": 0.95, "tags": ["trial_manager", "dashboard_db", "optuna", "trial_naming", "turbo"]}
 {"timestamp": "2025-12-28T10:15:00", "category": "success_pattern", "context": "GNN Turbo training data loading from multiple studies", "insight": "MULTI-STUDY TRAINING: When loading training data from multiple prior studies for GNN surrogate training, param names may have unit prefixes like '[mm]rib_thickness' or '[Degrees]angle'. Strip prefixes: if ']' in name: name = name.split(']', 1)[1]. Also, objective attribute names vary between studies (rel_filtered_rms_40_vs_20 vs obj_rel_filtered_rms_40_vs_20) - use fallback chain with 'or'. V5 successfully trained on 316 samples (V3: 297, V4: 19) with R²=[0.94, 0.94, 0.89, 0.95].", "confidence": 0.9, "tags": ["gnn", "turbo", "training_data", "multi_study", "param_naming"]}
 {"timestamp": "2025-12-28T12:28:04.706624", "category": "success_pattern", "context": "Implemented L-BFGS gradient optimizer for surrogate polish phase", "insight": "L-BFGS on trained MLP surrogates provides 100-1000x faster convergence than derivative-free methods (TPE, CMA-ES) for local refinement. Key: use multi-start from top FEA candidates, not random initialization. Integration: GradientOptimizer class in optimization_engine/gradient_optimizer.py.", "confidence": 0.9, "tags": ["optimization", "lbfgs", "surrogate", "gradient", "polish"]}
+{"timestamp": "2025-12-29T09:30:00", "category": "success_pattern", "context": "V6 pure TPE outperformed V5 surrogate+L-BFGS by 22%", "insight": "SIMPLE BEATS COMPLEX: V6 Pure TPE achieved WS=225.41 vs V5's WS=290.18 (22.3% better). Key insight: surrogates fail when gradient methods descend to OOD regions. Fix: EnsembleSurrogate with (1) N=5 MLPs for disagreement-based uncertainty, (2) OODDetector with KNN+z-score, (3) acquisition_score balancing exploitation+exploration, (4) trust-region L-BFGS that stays in training distribution. Never trust point predictions - always require uncertainty bounds. Protocol: SYS_16_SELF_AWARE_TURBO.md. Code: optimization_engine/surrogates/ensemble_surrogate.py", "confidence": 1.0, "tags": ["ensemble", "uncertainty", "ood", "surrogate", "v6", "tpe", "self-aware"]}
+{"timestamp": "2025-12-29T09:47:47.612485", "category": "success_pattern", "context": "Disk space optimization for FEA studies", "insight": "Per-trial FEA files are ~150MB but only OP2+JSON (~70MB) are essential. PRT/FEM/SIM/DAT are copies of master files and can be deleted after study completion. Archive to dalidou server for long-term storage.", "confidence": 0.95, "tags": ["disk_optimization", "archival", "study_management", "dalidou"], "related_files": ["optimization_engine/utils/study_archiver.py", "docs/protocols/operations/OP_07_DISK_OPTIMIZATION.md"]}