feat: Add DevLoop automation and HTML Reports

## DevLoop - Closed-Loop Development System
- Orchestrator for plan → build → test → analyze cycle
- Gemini planning via OpenCode CLI
- Claude implementation via CLI bridge
- Playwright browser testing integration
- Test runner with API, filesystem, and browser tests
- Persistent state in .devloop/ directory
- CLI tool: tools/devloop_cli.py

Usage:
  python tools/devloop_cli.py start 'Create new feature'
  python tools/devloop_cli.py plan 'Fix bug in X'
  python tools/devloop_cli.py test --study support_arm
  python tools/devloop_cli.py browser --level full

## HTML Reports (optimization_engine/reporting/)
- Interactive Plotly-based reports
- Convergence plot, Pareto front, parallel coordinates
- Parameter importance analysis
- Self-contained HTML (offline-capable)
- Tailwind CSS styling

## Playwright E2E Tests
- Home page tests
- Test results in test-results/

## LAC Knowledge Base Updates
- Session insights (failures, workarounds, patterns)
- Optimization memory for arm support study
2026-01-24 21:18:18 -05:00
parent a3f18dc377
commit 3193831340
24 changed files with 6437 additions and 0 deletions
--- a/knowledge_base/lac/optimization_memory/arm_support.jsonl
+++ b/knowledge_base/lac/optimization_memory/arm_support.jsonl
@@ -0,0 +1 @@
+{"timestamp": "2026-01-22T21:10:37.955211", "study_name": "stage_3_arm", "geometry_type": "arm_support", "method": "TPE", "objectives": ["displacement", "mass"], "n_objectives": 2, "design_vars": 3, "trials": 21, "converged": false, "convergence_trial": null, "convergence_ratio": null, "best_value": null, "best_params": null, "notes": ""}
--- a/knowledge_base/lac/session_insights/failure.jsonl
+++ b/knowledge_base/lac/session_insights/failure.jsonl
@@ -9,3 +9,11 @@
 {"timestamp": "2026-01-01T21:06:37.877252", "category": "failure", "context": "V13 optimization had 45 FEA failures (34% failure rate)", "insight": "rib_thickness parameter has CAD geometry constraint at ~9mm. All trials with rib_thickness > 9.0 failed. Set max to 9.0 (was 12.0). This is a critical CAD constraint not documented anywhere - the NX model geometry breaks with thicker radial ribs.", "confidence": 0.95, "tags": ["m1_mirror", "cad_constraint", "rib_thickness", "V13", "parameter_bounds"]}
 {"timestamp": "2026-01-06T11:00:00.000000", "category": "failure", "context": "flat_back_final study failed at journal line 1042. params.exp contained '[mm]description=Best design from V10...' which is not a valid NX expression.", "insight": "CONFIG DATA LEAKAGE INTO EXPRESSIONS: When config contains a 'starting_design' section with documentation fields like 'description', these string values get passed to NX as expressions if not filtered. The fix is to check isinstance(value, (int, float)) before adding to expressions dict. NEVER blindly iterate config dictionaries and pass to NX - always filter by type. The journal failed because NX cannot create an expression named 'description' with a string value.", "confidence": 1.0, "tags": ["nx", "expressions", "config", "starting_design", "type-filtering", "journal-failure"]}
 {"timestamp": "2026-01-13T11:00:00.000000", "category": "failure", "context": "Created m1_mirror_flatback_lateral study without README.md despite: (1) OP_01 protocol requiring it, (2) PRIOR LAC FAILURE entry from 2025-12-17 documenting same mistake", "insight": "REPEATED FAILURE - DID NOT LEARN FROM LAC: This exact failure was documented on 2025-12-17 with clear remediation (use TodoWrite to track ALL required outputs). Yet I repeated the same mistake. ROOT CAUSE: Did not read failure.jsonl at session start as required by CLAUDE.md initialization steps. The CLAUDE.md explicitly says MANDATORY: Read knowledge_base/lac/session_insights/failure.jsonl. I skipped this step. FIX: Actually follow the initialization protocol. When creating studies, the checklist MUST include README.md and I must verify its creation before declaring the study complete.", "confidence": 1.0, "tags": ["study-creation", "readme", "repeated-failure", "lac-not-read", "session-initialization", "process-discipline"], "severity": "critical", "rule": "At session start, ACTUALLY READ failure.jsonl as mandated. When creating studies, use TodoWrite with explicit README.md item and verify completion."}
+{"timestamp": "2026-01-22T13:27:00", "category": "failure", "context": "DevLoop end-to-end test of support_arm study - NX solver failed to load geometry parts", "insight": "NX SOLVER PART LOADING: When running FEA on a new study, the NX journal may fail with NoneType error when trying to load geometry/idealized parts. The issue is that Parts.Open() returns a tuple (part, status) but the code expects just the part. Also need to ensure the part paths are absolute. Fix: Check return tuple and use absolute paths for part loading.", "confidence": 0.9, "tags": ["nx", "solver", "part-loading", "devloop", "support_arm"], "severity": "high"}
+{"timestamp": "2026-01-22T13:37:05.354753", "category": "failure", "context": "Importing extractors from optimization_engine.extractors", "insight": "extract_displacement and extract_mass_from_bdf were not exported in __init__.py __all__ list. Always verify new extractors are added to both imports AND __all__ exports.", "confidence": 0.95, "tags": ["extractors", "imports", "python"]}
+{"timestamp": "2026-01-22T13:37:05.357090", "category": "failure", "context": "NX solver failing to load geometry parts in solve_simulation.py", "insight": "Parts.Open() can return (None, status) instead of (part, status). Must check if loaded_part is not None before accessing .Name attribute. Fixed around line 852 in solve_simulation.py.", "confidence": 0.95, "tags": ["nx", "solver", "parts", "null-check"]}
+{"timestamp": "2026-01-22T13:37:05.357090", "category": "failure", "context": "Nastran solve failing with memory allocation error", "insight": "Nastran may request large memory (28GB+) and fail if not available. Check support_arm_sim1-solution_1.log for memory error code 12. May need to configure memory limits in Nastran or close other applications.", "confidence": 0.8, "tags": ["nastran", "memory", "solver", "error"]}
+{"timestamp": "2026-01-22T15:12:01.584128", "category": "failure", "context": "DevLoop closed-loop development system", "insight": "DevLoop was built but NOT used in this session. Claude defaulted to manual debugging instead of using devloop_cli.py. Need to make DevLoop the default workflow for any multi-step task. Add reminder in CLAUDE.md to use DevLoop for any task with 3+ steps.", "confidence": 0.95, "tags": ["devloop", "process", "automation", "workflow"]}
+{"timestamp": "2026-01-22T15:23:37.040324", "category": "failure", "context": "NXSolver initialization with license_server parameter", "insight": "NXSolver does NOT have license_server in __init__. It reads from SPLM_LICENSE_SERVER env var. Set os.environ before creating solver.", "confidence": 1.0, "tags": ["nxsolver", "license", "config", "gotcha"]}
+{"timestamp": "2026-01-22T21:00:03.480993", "category": "failure", "context": "Stage 3 arm baseline test: stress=641.8 MPa vs limit=82.5 MPa", "insight": "Stage 3 arm baseline design has stress 641.8 MPa, far exceeding 30%% Al yield (82.5 MPa). Either the constraint is too restrictive for this geometry, or design needs significant thickening. Consider relaxing constraint to 200 MPa (73%% yield) like support_arm study, or find stiff/light designs.", "confidence": 0.9, "tags": ["stage3_arm", "stress_constraint", "infeasible_baseline"]}
+{"timestamp": "2026-01-22T21:10:37.955211", "category": "failure", "context": "Stage 3 arm optimization: 21 trials, 0 feasible (stress 600-680 MPa vs 200 MPa limit)", "insight": "Stage 3 arm geometry has INHERENT HIGH STRESS CONCENTRATIONS. Even 200 MPa (73%% yield) constraint is impossible to satisfy with current design variables (arm_thk, center_space, end_thk). All 21 trials showed stress 600-680 MPa regardless of parameters. This geometry needs: (1) stress-reducing features (fillets), (2) higher yield material, or (3) redesigned load paths. DO NOT use stress constraint <600 MPa for this geometry without redesign.", "confidence": 1.0, "tags": ["stage3_arm", "stress_constraint", "geometry_limitation", "infeasible"]}
--- a/knowledge_base/lac/session_insights/protocol_clarification.jsonl
+++ b/knowledge_base/lac/session_insights/protocol_clarification.jsonl
@@ -1,2 +1,3 @@
 {"timestamp": "2025-12-24T08:13:38.642843", "category": "protocol_clarification", "context": "SYS_14 Neural Acceleration with dashboard integration", "insight": "When running neural surrogate turbo optimization, FEA validation trials MUST be logged to Optuna for dashboard visibility. Use optuna.create_study() with load_if_exists=True, then for each FEA result: trial=study.ask(), set params via suggest_float(), set objectives as user_attrs, then study.tell(trial, weighted_sum).", "confidence": 0.95, "tags": ["SYS_14", "neural", "optuna", "dashboard", "turbo"]}
 {"timestamp": "2025-12-28T10:15:00", "category": "protocol_clarification", "context": "SYS_14 v2.3 update with TrialManager integration", "insight": "SYS_14 Neural Acceleration protocol updated to v2.3. Now uses TrialManager for consistent trial_NNNN naming instead of iter{N}. Key components: (1) TrialManager for folder+DB management, (2) DashboardDB for Optuna-compatible schema, (3) Trial numbers are monotonically increasing and NEVER reset. Reference implementation: studies/M1_Mirror/m1_mirror_cost_reduction_flat_back_V5/run_turbo_optimization.py", "confidence": 0.95, "tags": ["SYS_14", "trial_manager", "dashboard_db", "v2.3"]}
+{"timestamp": "2026-01-22T21:10:37.956764", "category": "protocol_clarification", "context": "Stage 3 arm study uses 1_model instead of 1_setup/model", "insight": "Dashboard intake creates studies with 1_model/ folder for CAD files, not the standard 1_setup/model/ structure. The run_optimization.py template uses MODEL_DIR = STUDY_DIR / 1_model for these intake-created studies. When fixing/completing intake studies, do NOT move files to 1_setup/model - just use the existing 1_model path.", "confidence": 0.9, "tags": ["study_structure", "dashboard_intake", "1_model", "paths"]}
--- a/knowledge_base/lac/session_insights/success_pattern.jsonl
+++ b/knowledge_base/lac/session_insights/success_pattern.jsonl
@@ -9,3 +9,12 @@
 {"timestamp": "2025-12-29T09:47:47.612485", "category": "success_pattern", "context": "Disk space optimization for FEA studies", "insight": "Per-trial FEA files are ~150MB but only OP2+JSON (~70MB) are essential. PRT/FEM/SIM/DAT are copies of master files and can be deleted after study completion. Archive to dalidou server for long-term storage.", "confidence": 0.95, "tags": ["disk_optimization", "archival", "study_management", "dalidou"], "related_files": ["optimization_engine/utils/study_archiver.py", "docs/protocols/operations/OP_07_DISK_OPTIMIZATION.md"]}
 {"timestamp": "2026-01-02T14:30:00", "category": "success_pattern", "context": "Study Interview Mode implementation and routing update", "insight": "STUDY CREATION DEFAULT: Interview Mode is now the DEFAULT for all study creation requests. Triggers: create a study, new study, set up study, optimize this, minimize mass - any study creation intent. Benefits: (1) Material-aware validation checks stress vs yield, (2) Anti-pattern detection warns about mass-no-constraint, (3) Auto extractor mapping E1-E10, (4) State persistence for interrupted sessions, (5) Blueprint generation with full validation. Skip with: skip interview, quick setup, manual config. Implementation: optimization_engine/interview/ with StudyInterviewEngine, QuestionEngine, EngineeringValidator, StudyBlueprint. All 129 tests passing.", "confidence": 1.0, "tags": ["interview_mode", "study_creation", "default", "validation", "anti_pattern", "materials"], "related_files": [".claude/skills/modules/study-interview-mode.md", "docs/protocols/operations/OP_01_CREATE_STUDY.md", "optimization_engine/interview/study_interview.py"]}
 {"timestamp": "2026-01-02T14:45:00", "category": "success_pattern", "context": "Study Interview Mode implementation complete", "insight": "INTERVIEW MODE DEFAULT: Study creation now uses Interview Mode by default for all study creation requests. This is a major usability improvement. Triggers: create a study, new study, set up, optimize this - any study creation intent. Key features: (1) Material-aware validation with 12 materials and fuzzy name matching, (2) Anti-pattern detection for 12 common mistakes, (3) Auto extractor mapping E1-E24, (4) 7-phase interview flow, (5) State persistence for interrupted sessions, (6) Blueprint validation before generation. Skip with: skip interview, quick setup, manual. Implementation in optimization_engine/interview/ with 129 tests passing. Full documentation in: .claude/skills/modules/study-interview-mode.md, docs/protocols/operations/OP_01_CREATE_STUDY.md", "confidence": 1.0, "tags": ["interview_mode", "study_creation", "default", "usability", "materials", "anti_pattern", "validation"], "related_files": [".claude/skills/modules/study-interview-mode.md", "docs/protocols/operations/OP_01_CREATE_STUDY.md", "optimization_engine/interview/"]}
+{"timestamp": "2026-01-22T13:00:00", "category": "success_pattern", "context": "DevLoop closed-loop development system implementation", "insight": "DEVLOOP PATTERN: Implemented autonomous development cycle that coordinates Gemini (planning) + Claude Code (implementation) + Dashboard (testing) + LAC (learning). 7-stage loop: PLAN -> BUILD -> TEST -> ANALYZE -> FIX -> VERIFY -> LOOP. Key components: (1) DevLoopOrchestrator in optimization_engine/devloop/, (2) DashboardTestRunner for automated testing, (3) GeminiPlanner for strategic planning with mock fallback, (4) ClaudeCodeBridge for implementation, (5) ProblemAnalyzer for failure analysis. API at /api/devloop/* with WebSocket for real-time updates. CLI tool at tools/devloop_cli.py. Frontend panel DevLoopPanel.tsx. Test with: python tools/devloop_cli.py test --study support_arm", "confidence": 0.95, "tags": ["devloop", "automation", "testing", "gemini", "claude", "dashboard", "closed-loop"], "related_files": ["optimization_engine/devloop/orchestrator.py", "tools/devloop_cli.py", "docs/guides/DEVLOOP.md"]}
+{"timestamp": "2026-01-22T13:37:05.355957", "category": "success_pattern", "context": "Extracting mass from Nastran BDF files", "insight": "Use BDFMassExtractor from bdf_mass_extractor.py for reliable mass extraction. It uses elem.Mass() which handles unit conversions properly. The simpler extract_mass_from_bdf.py now wraps this.", "confidence": 0.9, "tags": ["mass", "bdf", "extraction", "pyNastran"]}
+{"timestamp": "2026-01-22T13:47:38.696196", "category": "success_pattern", "context": "Stress extraction from NX Nastran OP2 files", "insight": "pyNastran returns stress in kPa for NX kg-mm-s unit system. Divide by 1000 to get MPa. Must check ALL solid element types (CTETRA, CHEXA, CPENTA, CPYRAM) to find true max. Elemental Nodal gives peak stress (143.5 MPa), Elemental Centroid gives averaged (100.3 MPa).", "confidence": 0.95, "tags": ["stress", "extraction", "units", "pyNastran", "nastran"]}
+{"timestamp": "2026-01-22T15:12:01.584128", "category": "success_pattern", "context": "Dashboard study discovery", "insight": "Dashboard now supports atomizer_spec.json as primary config. Updated _load_study_info() in optimization.py to check atomizer_spec.json first, then fall back to optimization_config.json. Studies with atomizer_spec.json are now discoverable.", "confidence": 0.9, "tags": ["dashboard", "atomizer_spec", "config", "v2.0"]}
+{"timestamp": "2026-01-22T15:12:01.584128", "category": "success_pattern", "context": "Extracting stress from NX Nastran results", "insight": "CONFIRMED: pyNastran returns stress in kPa for NX kg-mm-s unit system. Divide by 1000 for MPa. Must check ALL solid types (CTETRA, CHEXA, CPENTA, CPYRAM) - CHEXA often has highest stress. Elemental Nodal (143.5 MPa) vs Elemental Centroid (100.3 MPa) - use Nodal for conservative peak stress.", "confidence": 1.0, "tags": ["stress", "extraction", "units", "nastran", "verified"]}
+{"timestamp": "2026-01-22T15:23:37.040324", "category": "success_pattern", "context": "Creating new study with DevLoop workflow", "insight": "DevLoop workflow: plan -> create dirs -> copy models -> atomizer_spec.json -> validate canvas -> run_optimization.py -> devloop test -> FEA validation. 8 steps completed for support_arm_lightweight.", "confidence": 0.95, "tags": ["devloop", "workflow", "study_creation", "success"]}
+{"timestamp": "2026-01-22T15:23:37.040324", "category": "success_pattern", "context": "Single-objective optimization with constraints", "insight": "Single-objective with constraints: one objective in array, constraints use threshold+operator, penalty in objective function, canvas edges ext->obj for objective, ext->con for constraints.", "confidence": 0.9, "tags": ["optimization", "single_objective", "constraints", "canvas"]}
+{"timestamp": "2026-01-22T16:15:11.449264", "category": "success_pattern", "context": "Atomizer UX System implementation - January 2026", "insight": "New study workflow: (1) Put files in studies/_inbox/project_name/models/, (2) Optionally add intake.yaml and context/goals.md, (3) Run atomizer intake project_name, (4) Run atomizer gate study_name to validate with test trials, (5) If passed, approve with --approve flag, (6) Run optimization, (7) Run atomizer finalize study_name to generate interactive HTML report. The CLI commands are: intake, gate, list, finalize.", "confidence": 1.0, "tags": ["workflow", "ux", "cli", "intake", "validation", "report"]}
+{"timestamp": "2026-01-22T21:10:37.956764", "category": "success_pattern", "context": "Stage 3 arm study setup and execution with DevLoop", "insight": "DevLoop test command (devloop_cli.py test --study) successfully validated study setup before optimization. The 5 standard tests (directory, spec JSON, README, run_optimization.py, model dir) caught structure issues early. Full workflow: (1) Copy model files, (2) Create atomizer_spec.json with extractors/objectives/constraints, (3) Create run_optimization.py from template, (4) Create README.md, (5) Run DevLoop tests, (6) Execute optimization.", "confidence": 0.95, "tags": ["devloop", "study_creation", "workflow", "testing"]}
--- a/knowledge_base/lac/session_insights/user_preference.jsonl
+++ b/knowledge_base/lac/session_insights/user_preference.jsonl
@@ -1 +1,2 @@
 {"timestamp": "2025-12-29T12:00:00", "category": "user_preference", "context": "Git remote configuration", "insight": "GitHub repository URL is https://github.com/Anto01/Atomizer.git (private repo). Always push to both origin (Gitea at 192.168.86.50:3000) and github remote.", "confidence": 1.0, "tags": ["git", "github", "remote", "configuration"]}
+{"timestamp": "2026-01-22T16:13:41.159557", "category": "user_preference", "context": "Atomizer UX architecture decision - January 2026", "insight": "NO DASHBOARD API - Use Claude Code CLI as the primary interface. The user (engineer) interacts with Atomizer through: (1) Claude Code chat in terminal - natural language, (2) CLI commands like atomizer intake/gate/finalize, (3) Dashboard is for VIEWING only (monitoring, reports), not for configuration. All study creation, validation, and management goes through Claude Code or CLI.", "confidence": 1.0, "tags": ["architecture", "ux", "cli", "dashboard", "claude-code"]}
--- a/knowledge_base/lac/session_insights/workaround.jsonl
+++ b/knowledge_base/lac/session_insights/workaround.jsonl
@@ -1,2 +1,6 @@
 {"timestamp": "2025-12-24T08:13:38.641823", "category": "workaround", "context": "Turbo optimization study structure", "insight": "Turbo studies use 3_results/ not 2_results/. Dashboard already supports both. Use study.db for Optuna-format (dashboard compatible), study_custom.db for internal custom tracking. Backfill script (scripts/backfill_optuna.py) can convert existing trials.", "confidence": 0.9, "tags": ["turbo", "study_structure", "optuna", "dashboard"]}
 {"timestamp": "2025-12-28T10:15:00", "category": "workaround", "context": "Custom database schema not showing in dashboard", "insight": "DASHBOARD COMPATIBILITY: If a study uses custom database schema instead of Optuna's (missing trial_values, trial_params, trial_user_attributes tables), the dashboard won't show trials. Use convert_custom_to_optuna() from dashboard_db.py to convert. This function drops all tables and recreates with Optuna-compatible schema, migrating all trial data.", "confidence": 0.95, "tags": ["dashboard", "optuna", "database", "schema", "migration"]}
+{"timestamp": "2026-01-22T13:37:05.353675", "category": "workaround", "context": "NX installation paths on this machine", "insight": "The working NX installation is DesigncenterNX2512, NOT NX2506 or NX2412. NX2506 only has ThermalFlow components. Always use C:\\Program Files\\Siemens\\DesigncenterNX2512 for NX_INSTALL_DIR.", "confidence": 1.0, "tags": ["nx", "installation", "path", "config"]}
+{"timestamp": "2026-01-22T15:12:01.584128", "category": "workaround", "context": "Nastran failing with 28GB memory allocation error", "insight": "Bun processes can consume 10-15GB of memory in background. When Nastran fails with memory allocation error, check Task Manager for Bun processes and kill them. Command: Get-Process -Name bun | Stop-Process -Force", "confidence": 1.0, "tags": ["nastran", "memory", "bun", "workaround"]}
+{"timestamp": "2026-01-22T15:12:01.584128", "category": "workaround", "context": "NX installation paths", "insight": "CONFIRMED: Working NX installation is DesigncenterNX2512 at C:\\Program Files\\Siemens\\DesigncenterNX2512. NX2506 only has ThermalFlow. NX2412 exists but DesigncenterNX2512 is the primary working install.", "confidence": 1.0, "tags": ["nx", "installation", "path", "verified"]}
+{"timestamp": "2026-01-22T15:23:37.040324", "category": "workaround", "context": "DevLoop test runner looking in wrong study path", "insight": "DevLoop test_runner.py was hardcoded to look in studies/_Other. Fixed devloop_cli.py to search flat structure first, then nested. Study path resolution now dynamic.", "confidence": 1.0, "tags": ["devloop", "bug", "fixed", "study_path"]}
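
Reviewer note: the stress-extraction entries added to success_pattern.jsonl (values in kPa under the NX kg-mm-s unit system, divide by 1000 for MPa, scan every solid element type for the true maximum) could be sketched roughly as below. This is a minimal illustration, not the repository's extractor: `max_solid_stress_mpa` and its plain-dict input are hypothetical stand-ins for values already read from an OP2 file, and no pyNastran calls are shown.

```python
from typing import Dict, Iterable, Tuple

# Solid element types named in the insight; each one present is scanned.
SOLID_TYPES = ("CTETRA", "CHEXA", "CPENTA", "CPYRAM")
KPA_PER_MPA = 1000.0


def max_solid_stress_mpa(
    stress_kpa_by_type: Dict[str, Iterable[float]],
) -> Tuple[str, float]:
    """Return (element_type, peak stress in MPa) across all solid types.

    `stress_kpa_by_type` is a hypothetical mapping from element type to
    stress values in kPa, assumed to have been extracted beforehand.
    """
    best_type, best_kpa = None, float("-inf")
    for elem_type in SOLID_TYPES:
        # Types missing from the results are skipped.
        for value in stress_kpa_by_type.get(elem_type, ()):
            if value > best_kpa:
                best_type, best_kpa = elem_type, value
    if best_type is None:
        raise ValueError("no solid-element stress values found")
    # Convert kPa to MPa only once, on the final peak value.
    return best_type, best_kpa / KPA_PER_MPA


if __name__ == "__main__":
    # Illustrative numbers only, chosen to echo the magnitudes in the entry.
    sample = {"CTETRA": [100300.0], "CHEXA": [143500.0, 90000.0]}
    print(max_solid_stress_mpa(sample))
```

Returning the element type alongside the value keeps it visible which element family drives the peak, which matches the entry's remark that CHEXA often carries the highest stress.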