# Testing Plan - November 18, 2025

**Goal**: Validate Hybrid Mode with real optimizations and verify the centralized library system

## Overview

Today we're testing the newly refactored architecture with real-world optimizations. Focus is on:

1. ✅ Hybrid Mode workflow (90% automation, no API key)
2. ✅ Centralized extractor library (deduplication)
3. ✅ Clean study folder structure
4. ✅ Production readiness

**Estimated Time**: 2-3 hours total

---

## Test 1: Verify Beam Optimization (30 minutes)

### Goal

Confirm the existing beam optimization works with the new architecture.

### What We're Testing

- ✅ Parameter bounds parsing (20-30 mm, not 0.2-1.0 mm!)
- ✅ Workflow config auto-saved
- ✅ Extractors added to core library
- ✅ Study manifest created (not code pollution)
- ✅ Clean study folder structure

### Steps

#### 1. Review Existing Workflow JSON

```bash
# Open in VSCode
code studies/simple_beam_optimization/1_setup/workflow_config.json
```

**Check**:

- Design variable bounds are in `[20, 30]` format (not `min`/`max`)
- Extraction actions are clear (extract_mass, extract_displacement)
- Objectives and constraints are specified
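That bounds check can also be scripted. Below is a minimal sketch — the file name is illustrative, and it assumes the JSON layout shown later in this plan (a top-level `design_variables` list whose entries carry a two-element `bounds` array):

```python
# sanity_check_bounds.py -- quick format check for a workflow JSON.
# Assumes the layout shown later in this plan: a top-level
# "design_variables" list whose entries have a [low, high] "bounds" array.
import json
import sys
from pathlib import Path


def check_bounds(workflow_path):
    """Return a list of problem descriptions (empty list = all good)."""
    config = json.loads(Path(workflow_path).read_text())
    problems = []
    for var in config.get("design_variables", []):
        name = var.get("parameter", "<unnamed>")
        bounds = var.get("bounds")
        if not (isinstance(bounds, list) and len(bounds) == 2):
            problems.append(f"{name}: bounds must be a [low, high] pair, got {bounds!r}")
            continue
        low, high = bounds
        if "min" in var or "max" in var:
            problems.append(f"{name}: uses min/max keys instead of a bounds array")
        if low >= high:
            problems.append(f"{name}: low bound {low} is not below high bound {high}")
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    issues = check_bounds(sys.argv[1])
    for issue in issues:
        print(f"[FAIL] {issue}")
    if not issues:
        print("[OK] All design variable bounds look sane")
```

This catches the exact failure mode Test 1 is watching for: bounds stored as `min`/`max` keys (or as a non-pair) instead of `[20, 30]`-style arrays.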
#### 2. Run Short Optimization (5 trials)

```python
# Create: studies/simple_beam_optimization/test_today.py
from pathlib import Path
from optimization_engine.llm_optimization_runner import LLMOptimizationRunner

study_dir = Path("studies/simple_beam_optimization")
workflow_json = study_dir / "1_setup/workflow_config.json"
prt_file = study_dir / "1_setup/model/Beam.prt"
sim_file = study_dir / "1_setup/model/Beam_sim1.sim"
output_dir = study_dir / "2_substudies/test_nov18_verification"

print("=" * 80)
print("TEST 1: BEAM OPTIMIZATION VERIFICATION")
print("=" * 80)
print()
print(f"Workflow: {workflow_json}")
print(f"Model: {prt_file}")
print(f"Output: {output_dir}")
print()
print("Running 5 trials to verify system...")
print()

runner = LLMOptimizationRunner(
    llm_workflow_file=workflow_json,
    prt_file=prt_file,
    sim_file=sim_file,
    output_dir=output_dir,
    n_trials=5,  # Just 5 for verification
)
study = runner.run()

print()
print("=" * 80)
print("TEST 1 RESULTS")
print("=" * 80)
print()
print("Best design found:")
print(f"  beam_half_core_thickness: {study.best_params['beam_half_core_thickness']:.2f} mm")
print(f"  beam_face_thickness: {study.best_params['beam_face_thickness']:.2f} mm")
print(f"  holes_diameter: {study.best_params['holes_diameter']:.2f} mm")
print(f"  hole_count: {study.best_params['hole_count']}")
print(f"  Objective value: {study.best_value:.6f}")
print()
print("[SUCCESS] Optimization completed!")
```

Run it:

```bash
python studies/simple_beam_optimization/test_today.py
```

#### 3. Verify Results

**Check the output directory structure**:

```bash
# Should contain ONLY these files (no generated_extractors/!)
dir studies\simple_beam_optimization\2_substudies\test_nov18_verification
```

**Expected**:

```
test_nov18_verification/
├── extractors_manifest.json     ✓ References to core library
├── llm_workflow_config.json     ✓ What the LLM understood
├── optimization_results.json    ✓ Best design
├── optimization_history.json    ✓ All trials
└── study.db                     ✓ Optuna database
```

**Check that parameter values are realistic**:

```python
# Create: verify_results.py
import json
from pathlib import Path

results_file = Path(
    "studies/simple_beam_optimization/2_substudies/test_nov18_verification/optimization_results.json"
)
with open(results_file) as f:
    results = json.load(f)

print("Parameter values:")
for param, value in results['best_params'].items():
    print(f"  {param}: {value}")

# VERIFY: thickness should be in the 20-30 range (not 0.2-1.0!)
thickness = results['best_params']['beam_half_core_thickness']
assert 20 <= thickness <= 30, f"FAIL: thickness {thickness} not in 20-30 range!"
print()
print("[OK] Parameter ranges are correct!")
```

**Check the core library**:

```python
# Create: check_library.py
from optimization_engine.extractor_library import ExtractorLibrary

library = ExtractorLibrary()
print(library.get_library_summary())
```

Expected output:

```
================================================================================
ATOMIZER EXTRACTOR LIBRARY
================================================================================
Location: optimization_engine/extractors/
Total extractors: 3

Available Extractors:
--------------------------------------------------------------------------------
extract_mass
  Domain: result_extraction
  Description: Extract mass from FEA results
  File: extract_mass.py
  Signature: 2f58f241a96afb1f

extract_displacement
  Domain: result_extraction
  Description: Extract displacement from FEA results
  File: extract_displacement.py
  Signature: 381739e9cada3a48

extract_von_mises_stress
  Domain: result_extraction
  Description: Extract von Mises stress from FEA results
  File: extract_von_mises_stress.py
  Signature: 63d54f297f2403e4
```

### Success Criteria

- ✅ Optimization completes without errors
- ✅ Parameter values in the correct range (20-30 mm, not 0.2-1.0 mm)
- ✅ Study folder clean (only 5 files, no generated_extractors/)
- ✅ extractors_manifest.json exists
- ✅ Core library contains 3 extractors
- ✅ llm_workflow_config.json saved automatically

### If It Fails

- Check parameter bounds parsing in llm_optimization_runner.py:205-211
- Verify NX expression names match the workflow JSON
- Check that the OP2 file contains the expected results

---

## Test 2: Create New Optimization with Claude (1 hour)

### Goal

Use Claude Code to create a brand new optimization from scratch, demonstrating the full Hybrid Mode workflow.

### Scenario

You have a cantilever plate that needs optimization:

- **Design variables**: plate_thickness (3-8 mm), support_width (20-50 mm)
- **Objective**: Minimize mass
- **Constraints**: max_displacement < 1.5 mm, max_stress < 150 MPa

### Steps

#### 1. Prepare Model (if you have one)

```
studies/
  cantilever_plate_optimization/
    1_setup/
      model/
        Plate.prt        # Your NX model
        Plate_sim1.sim   # Your FEM setup
```

**If you don't have a real model**, we'll simulate the workflow and use the beam model as a placeholder.

#### 2. Describe the Optimization to Claude

Start a conversation with Claude Code (this tool!):

```
YOU: I want to optimize a cantilever plate design.

Design variables:
- plate_thickness: 3 to 8 mm
- support_width: 20 to 50 mm

Objective:
- Minimize mass

Constraints:
- Maximum displacement < 1.5 mm
- Maximum von Mises stress < 150 MPa

Can you help me create the workflow JSON for Hybrid Mode?
```
#### 3. Claude Creates Workflow JSON

Claude (me!) will generate something like:

```json
{
  "study_name": "cantilever_plate_optimization",
  "optimization_request": "Minimize mass while keeping displacement < 1.5mm and stress < 150 MPa",
  "design_variables": [
    {
      "parameter": "plate_thickness",
      "bounds": [3, 8],
      "description": "Plate thickness in mm"
    },
    {
      "parameter": "support_width",
      "bounds": [20, 50],
      "description": "Support width in mm"
    }
  ],
  "objectives": [
    {
      "name": "mass",
      "goal": "minimize",
      "weight": 1.0,
      "extraction": {
        "action": "extract_mass",
        "domain": "result_extraction",
        "params": {"result_type": "mass", "metric": "total"}
      }
    }
  ],
  "constraints": [
    {
      "name": "max_displacement_limit",
      "type": "less_than",
      "threshold": 1.5,
      "extraction": {
        "action": "extract_displacement",
        "domain": "result_extraction",
        "params": {"result_type": "displacement", "metric": "max"}
      }
    },
    {
      "name": "max_stress_limit",
      "type": "less_than",
      "threshold": 150,
      "extraction": {
        "action": "extract_von_mises_stress",
        "domain": "result_extraction",
        "params": {"result_type": "stress", "metric": "max"}
      }
    }
  ]
}
```

#### 4. Save and Review

```bash
# Save to:
# studies/cantilever_plate_optimization/1_setup/workflow_config.json

# Review in VSCode
code studies/cantilever_plate_optimization/1_setup/workflow_config.json
```

**Check**:

- Parameter names match your NX expressions EXACTLY
- Bounds are in the correct units (mm)
- Extraction actions make sense for your model
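The "parameter names match EXACTLY" check can be semi-automated. A sketch, assuming you copy your expression names out of NX (Tools → Expression) into a plain text file with one name per line — both file paths below are illustrative, not part of the system:

```python
# check_expression_names.py -- cross-check workflow parameters against NX.
# Assumes expression names were copied from NX (Tools -> Expression) into a
# plain text file, one name per line; both paths here are illustrative.
import json
from pathlib import Path


def missing_parameters(workflow_path, expressions_path):
    """Parameters in the workflow JSON with no matching NX expression name."""
    config = json.loads(Path(workflow_path).read_text())
    wanted = {v["parameter"] for v in config.get("design_variables", [])}
    known = {
        line.strip()
        for line in Path(expressions_path).read_text().splitlines()
        if line.strip()
    }
    return sorted(wanted - known)


# Example usage (hypothetical nx_expressions.txt next to the workflow JSON):
# print(missing_parameters(
#     "studies/cantilever_plate_optimization/1_setup/workflow_config.json",
#     "studies/cantilever_plate_optimization/1_setup/nx_expressions.txt",
# ))
```

An empty result means every design variable has a same-named expression; anything listed would fail with "Expression not found" at run time.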
#### 5. Run Optimization

```python
# Create: studies/cantilever_plate_optimization/run_optimization.py
from pathlib import Path
from optimization_engine.llm_optimization_runner import LLMOptimizationRunner

study_dir = Path("studies/cantilever_plate_optimization")
workflow_json = study_dir / "1_setup/workflow_config.json"
prt_file = study_dir / "1_setup/model/Plate.prt"
sim_file = study_dir / "1_setup/model/Plate_sim1.sim"
output_dir = study_dir / "2_substudies/optimization_run_001"

print("=" * 80)
print("TEST 2: NEW CANTILEVER PLATE OPTIMIZATION")
print("=" * 80)
print()
print("This demonstrates the Hybrid Mode workflow:")
print("  1. You described the optimization in natural language")
print("  2. Claude created the workflow JSON")
print("  3. LLMOptimizationRunner does 90% of the automation")
print()
print("Running 10 trials...")
print()

runner = LLMOptimizationRunner(
    llm_workflow_file=workflow_json,
    prt_file=prt_file,
    sim_file=sim_file,
    output_dir=output_dir,
    n_trials=10,
)
study = runner.run()

print()
print("=" * 80)
print("TEST 2 RESULTS")
print("=" * 80)
print()
print("Best design found:")
for param, value in study.best_params.items():
    print(f"  {param}: {value:.2f}")
print(f"  Objective: {study.best_value:.6f}")
print()
print("[SUCCESS] New optimization from scratch!")
```

Run it:

```bash
python studies/cantilever_plate_optimization/run_optimization.py
```

#### 6. Verify Library Reuse

**Key test**: Did it reuse the extractors from Test 1?
```python
# Create: check_reuse.py
import json
from pathlib import Path

from optimization_engine.extractor_library import ExtractorLibrary

library = ExtractorLibrary()

# Check the manifest from Test 2
manifest_file = Path(
    "studies/cantilever_plate_optimization/2_substudies/optimization_run_001/extractors_manifest.json"
)
with open(manifest_file) as f:
    manifest = json.load(f)

print("Extractors used in Test 2:")
for sig in manifest['extractors_used']:
    info = library.get_extractor_metadata(sig)
    print(f"  {info['name']} (signature: {sig})")
print()

print("Core library status:")
print(f"  Total extractors: {len(library.catalog)}")
print()

# VERIFY: Should still be 3 extractors (reused from Test 1!)
assert len(library.catalog) == 3, "FAIL: Should reuse extractors, not duplicate!"
print("[OK] Extractors were reused from the core library!")
print("[OK] No duplicate code generated!")
```

### Success Criteria

- ✅ Claude successfully creates the workflow JSON from natural language
- ✅ Optimization runs without errors
- ✅ Core library STILL only has 3 extractors (reused!)
- ✅ Study folder clean (no generated_extractors/)
- ✅ Results make engineering sense

### If It Fails

- NX expression mismatch: Check Tools → Expression in NX
- OP2 results missing: Verify the FEM setup outputs the required results
- Library issues: Check `optimization_engine/extractors/catalog.json`

---

## Test 3: Validate Extractor Deduplication (15 minutes)

### Goal

Explicitly test that signature-based deduplication works correctly.
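As a mental model for what this test exercises — the real logic lives in `extractor_library.py` (`get_or_create()`), so the class and method names below are illustrative, not the actual API — signature-based deduplication boils down to hashing normalized source code and registering a function only when its hash is unseen:

```python
# Illustrative sketch of signature-based deduplication -- NOT the real
# ExtractorLibrary API, just the core idea it is built on.
import hashlib


class TinyLibrary:
    def __init__(self):
        self.catalog = {}  # signature -> source code

    @staticmethod
    def signature(source: str) -> str:
        # Normalize whitespace so cosmetic differences don't defeat dedup,
        # then take a short stable hash (like the 16-hex signatures above).
        normalized = "\n".join(line.rstrip() for line in source.strip().splitlines())
        return hashlib.sha256(normalized.encode()).hexdigest()[:16]

    def get_or_create(self, source: str) -> tuple[str, bool]:
        """Return (signature, created); created=False means it was reused."""
        sig = self.signature(source)
        if sig in self.catalog:
            return sig, False
        self.catalog[sig] = source
        return sig, True


lib = TinyLibrary()
src = "def extract_mass(op2):\n    return op2.grid_point_weight.mass"
sig1, created1 = lib.get_or_create(src)
sig2, created2 = lib.get_or_create(src + "   ")  # trailing whitespace: same signature
print(created1, created2, len(lib.catalog))  # prints: True False 1
```

Because the signature depends only on normalized source, a second run that regenerates byte-identical (or whitespace-identical) extractor code maps to an existing catalog entry instead of a new file — which is exactly what the two runs below should demonstrate.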
### Steps

#### 1. Run Same Workflow Twice

```python
# Create: test_deduplication.py
from pathlib import Path

from optimization_engine.extractor_library import ExtractorLibrary
from optimization_engine.llm_optimization_runner import LLMOptimizationRunner

print("=" * 80)
print("TEST 3: EXTRACTOR DEDUPLICATION")
print("=" * 80)
print()

library = ExtractorLibrary()
print(f"Core library before test: {len(library.catalog)} extractors")
print()

# Run 1: First optimization
print("RUN 1: First optimization with displacement extractor...")
study_dir = Path("studies/simple_beam_optimization")
runner1 = LLMOptimizationRunner(
    llm_workflow_file=study_dir / "1_setup/workflow_config.json",
    prt_file=study_dir / "1_setup/model/Beam.prt",
    sim_file=study_dir / "1_setup/model/Beam_sim1.sim",
    output_dir=study_dir / "2_substudies/dedup_test_run1",
    n_trials=2,  # Just 2 trials
)
study1 = runner1.run()
print("[OK] Run 1 complete")
print()

# Check the library
library = ExtractorLibrary()  # Reload
count_after_run1 = len(library.catalog)
print(f"Core library after Run 1: {count_after_run1} extractors")
print()

# Run 2: Same workflow, different output directory
print("RUN 2: Same optimization, different study...")
runner2 = LLMOptimizationRunner(
    llm_workflow_file=study_dir / "1_setup/workflow_config.json",
    prt_file=study_dir / "1_setup/model/Beam.prt",
    sim_file=study_dir / "1_setup/model/Beam_sim1.sim",
    output_dir=study_dir / "2_substudies/dedup_test_run2",
    n_trials=2,  # Just 2 trials
)
study2 = runner2.run()
print("[OK] Run 2 complete")
print()

# Check the library again
library = ExtractorLibrary()  # Reload
count_after_run2 = len(library.catalog)
print(f"Core library after Run 2: {count_after_run2} extractors")
print()

# VERIFY: Should be the same count (deduplication worked!)
print("=" * 80)
print("DEDUPLICATION TEST RESULTS")
print("=" * 80)
print()
if count_after_run1 == count_after_run2:
    print(f"[SUCCESS] Extractor count unchanged ({count_after_run1} → {count_after_run2})")
    print("[SUCCESS] Deduplication working correctly!")
    print()
    print("This means:")
    print("  ✓ Run 2 reused extractors from Run 1")
    print("  ✓ No duplicate code generated")
    print("  ✓ Core library stays clean")
else:
    print(f"[FAIL] Extractor count changed ({count_after_run1} → {count_after_run2})")
    print("[FAIL] Deduplication not working!")
print()
print("=" * 80)
```

Run it:

```bash
python test_deduplication.py
```

#### 2. Inspect Manifests

```python
# Create: compare_manifests.py
import json
from pathlib import Path

manifest1 = Path("studies/simple_beam_optimization/2_substudies/dedup_test_run1/extractors_manifest.json")
manifest2 = Path("studies/simple_beam_optimization/2_substudies/dedup_test_run2/extractors_manifest.json")

with open(manifest1) as f:
    data1 = json.load(f)
with open(manifest2) as f:
    data2 = json.load(f)

print("Run 1 used extractors:")
for sig in data1['extractors_used']:
    print(f"  {sig}")
print()

print("Run 2 used extractors:")
for sig in data2['extractors_used']:
    print(f"  {sig}")
print()

if data1['extractors_used'] == data2['extractors_used']:
    print("[OK] Same extractors referenced")
    print("[OK] Signatures match correctly")
else:
    print("[WARN] Different extractors used")
```

### Success Criteria

- ✅ Core library size unchanged after Run 2
- ✅ Both manifests reference the same extractor signatures
- ✅ No duplicate extractor files created
- ✅ Both study folders clean (only manifests, no code)

### If It Fails

- Check signature computation in `extractor_library.py:73-92`
- Verify catalog.json persistence
- Check the `get_or_create()` logic in `extractor_library.py:93-137`

---

## Test 4: Dashboard Visualization (30 minutes) - OPTIONAL

### Goal

Verify the dashboard can visualize the optimization results.
### Steps

#### 1. Start Dashboard

```bash
cd dashboard/api
python app.py
```

#### 2. Open Browser

```
http://localhost:5000
```

#### 3. Load Study

- Navigate to the beam optimization study
- View the optimization history plot
- Check the Pareto front (if multi-objective)
- Inspect trial details

### Success Criteria

- ✅ Dashboard loads without errors
- ✅ Can select a study from the dropdown
- ✅ History plot shows all trials
- ✅ Best design highlighted
- ✅ Can inspect individual trials

---

## Summary Checklist

At the end of the testing session, verify:

### Architecture

- [ ] Core library system working (deduplication verified)
- [ ] Study folders clean (only 5 files, no code pollution)
- [ ] Extractors manifest created correctly
- [ ] Workflow config auto-saved

### Functionality

- [ ] Parameter bounds parsed correctly (actual mm values)
- [ ] Extractors auto-generated successfully
- [ ] Optimization completes without errors
- [ ] Results make engineering sense

### Hybrid Mode Workflow

- [ ] Claude successfully creates workflow JSON from natural language
- [ ] LLMOptimizationRunner handles the workflow correctly
- [ ] 90% automation achieved (only JSON creation is manual)
- [ ] Full audit trail saved (workflow config + manifest)

### Production Readiness

- [ ] No code duplication across studies
- [ ] Clean folder structure maintained
- [ ] Library grows intelligently (deduplication)
- [ ] Reproducible (workflow config captures everything)

---

## If Everything Passes

**Congratulations!** 🎉

You now have a production-ready optimization system with:

- ✅ 90% automation (Hybrid Mode)
- ✅ Clean architecture (centralized library)
- ✅ Full transparency (audit trails)
- ✅ Code reuse (deduplication)
- ✅ Professional structure (studies = data, core = code)

### Next Steps

1. Run longer optimizations (50-100 trials)
2. Try real engineering problems
3. Build up the core library with domain-specific extractors
4. Consider upgrading to Full LLM Mode (API) when ready
### Share Your Success

- Update DEVELOPMENT.md with test results
- Document any issues encountered
- Add your own optimization examples to `studies/`

---

## If Something Fails

### Debugging Strategy

1. **Check logs**: Look for error messages in the terminal output
2. **Verify files**: Ensure the NX model and sim files exist and are valid
3. **Inspect manifests**: Check that `extractors_manifest.json` is created
4. **Review library**: Run `python -m optimization_engine.extractor_library` to see library status
5. **Test components**: Run the E2E test: `python tests/test_phase_3_2_e2e.py`

### Common Issues

**"Expression not found"**:

- Open the NX model
- Tools → Expression
- Verify exact parameter names
- Update the workflow JSON

**"No mass results"**:

- Check that the OP2 file contains mass data
- Try a different result type (displacement, stress)
- Verify the FEM setup outputs the required results

**"Extractor generation failed"**:

- Check pyNastran can read the OP2: `python -c "from pyNastran.op2.op2 import OP2; OP2().read_op2('path')"`
- Review the knowledge base patterns
- Manually create the extractor if needed

**"Deduplication not working"**:

- Check `optimization_engine/extractors/catalog.json`
- Verify signature computation
- Review the `get_or_create()` logic

### Get Help

- Review `docs/HYBRID_MODE_GUIDE.md`
- Check `docs/ARCHITECTURE_REFACTOR_NOV17.md`
- Inspect the code in `optimization_engine/llm_optimization_runner.py`

---

**Ready to revolutionize your optimization workflow!** 🚀

**Start Time**: ___________
**End Time**: ___________
**Tests Passed**: ___ / 4
**Issues Found**: ___________
**Notes**: ___________