MAJOR ARCHITECTURE REFACTOR - Clean Study Folders
Problem Identified by User:
"My study folder is a mess, why? I want some order and real structure to develop
an insanely good engineering software that evolve with time."
- Every substudy was generating duplicate extractor code
- Study folders polluted with reusable library code (generated_extractors/, generated_hooks/)
- No code reuse across studies
- Not production-grade architecture
Solution - Centralized Library System:
Implemented a centralized extractor library with signature-based deduplication:
- Core extractors in optimization_engine/extractors/
- Studies only store metadata (extractors_manifest.json)
- Clean separation: studies = data, core = code
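
A minimal sketch of what signature-based deduplication could look like. The spec fields (`action`, `params`), the `Catalog` class, and the 16-character SHA-256 prefix are assumptions for illustration, not the actual ExtractorLibrary API:

```python
import hashlib
import json


def extractor_signature(spec: dict) -> str:
    """Hash a canonical JSON form of the spec so key order does not matter."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


class Catalog:
    """Hypothetical in-memory stand-in for catalog.json."""

    def __init__(self):
        self.by_signature = {}

    def get_or_register(self, name: str, spec: dict):
        """Return (signature, filename); register only if the signature is new."""
        sig = extractor_signature(spec)
        if sig not in self.by_signature:
            self.by_signature[sig] = f"{name}.py"
        return sig, self.by_signature[sig]


catalog = Catalog()
first = catalog.get_or_register(
    "extract_displacement",
    {"action": "extract_displacement", "params": {"result_type": "displacement"}},
)
# Same spec requested by another study, with keys in a different order.
second = catalog.get_or_register(
    "extract_displacement_copy",
    {"params": {"result_type": "displacement"}, "action": "extract_displacement"},
)
```

Both requests resolve to the same catalog entry, so the second study reuses `extract_displacement.py` instead of generating a duplicate.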
Architecture:
BEFORE (BAD):
  studies/my_study/
    generated_extractors/            ❌ Code pollution!
      extract_displacement.py
      extract_von_mises_stress.py
    generated_hooks/                 ❌ Code pollution!
    llm_workflow_config.json
    results.json
AFTER (GOOD):
  optimization_engine/extractors/    ✓ Core library
    extract_displacement.py
    extract_stress.py
    catalog.json
  studies/my_study/
    extractors_manifest.json         ✓ Just references!
    llm_workflow_config.json         ✓ Config
    optimization_results.json        ✓ Results
New Components:
1. ExtractorLibrary (extractor_library.py)
- Signature-based deduplication
- Centralized catalog (catalog.json)
- Study manifest generation
- Reusability across all studies
2. Updated ExtractorOrchestrator
- Uses core library instead of per-study generation
- Creates manifest instead of copying code
- Backward compatible (legacy mode available)
3. Updated LLMOptimizationRunner
- Removed generated_extractors/ directory creation
- Removed generated_hooks/ directory creation
- Uses core library exclusively
4. Updated Tests
- Verifies extractors_manifest.json exists
- Checks for clean study folder structure
- All 18 checks pass
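
The clean-folder checks could be expressed roughly as follows. This is a sketch only: the helper name `study_folder_problems` is hypothetical and the real test may check more than this.

```python
import tempfile
from pathlib import Path

# Directories that should no longer appear inside a study folder.
FORBIDDEN_DIRS = ("generated_extractors", "generated_hooks")


def study_folder_problems(study_dir: Path) -> list:
    """Return a list of human-readable problems; an empty list means clean."""
    problems = []
    for name in FORBIDDEN_DIRS:
        if (study_dir / name).exists():
            problems.append(f"unexpected directory: {name}/")
    if not (study_dir / "extractors_manifest.json").is_file():
        problems.append("missing extractors_manifest.json")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    study = Path(tmp)
    (study / "extractors_manifest.json").write_text("{}")
    clean = study_folder_problems(study)

    # Simulate the old behavior of generating code inside the study.
    (study / "generated_extractors").mkdir()
    polluted = study_folder_problems(study)
```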
Results:
Study folders NOW ONLY contain:
✓ extractors_manifest.json - references to core library
✓ llm_workflow_config.json - study configuration
✓ optimization_results.json - optimization results
✓ optimization_history.json - trial history
✓ .db file - Optuna database
Core library contains:
✓ extract_displacement.py - reusable across ALL studies
✓ extract_von_mises_stress.py - reusable across ALL studies
✓ extract_mass.py - reusable across ALL studies
✓ catalog.json - tracks all extractors with signatures
Benefits:
- Clean, professional study folder structure
- Code reuse eliminates duplication
- Library grows over time, studies stay clean
- Production-grade architecture
- "Insanely good engineering software that evolves with time"
Testing:
E2E test passes with clean folder structure
- No generated_extractors/ pollution
- Manifest correctly references library
- Core library populated with reusable extractors
- Study folder professional and minimal
Documentation:
- Added comprehensive architecture doc (docs/ARCHITECTURE_REFACTOR_NOV17.md)
- Includes migration guide
- Documents future work (hooks library, versioning, CLI tools)
Next Steps:
- Apply same architecture to hooks library
- Add auto-generated documentation for library
- Implement versioning for reproducibility
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
**CRITICAL FIX**: FEM results were identical across trials
**Root Cause**:
The LLM runner was passing design_vars to simulation_runner(), which then passed
them to NX Solver's expression_updates parameter. The solve journal tried to
update hardcoded expression names (tip_thickness, support_angle) that don't exist
in the beam model, causing the solver to ignore updates and use cached geometry.
**Solution**:
Match the working 50-trial optimization workflow:
1. model_updater() updates PRT file via NX import journal
2. Part file is closed/flushed to disk
3. simulation_runner() runs WITHOUT passing design_vars
4. NX solver loads SIM file, which references the updated PRT from disk
5. FEM regenerates with updated geometry automatically
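
The sequence above, sketched with stand-in callables. The dict `disk`, the variable name `beam_thickness`, and the toy objective are assumptions for illustration; the real model_updater() and simulation_runner() drive NX journals.

```python
from typing import Callable, Dict

disk = {}  # stands in for the PRT file on disk


def model_updater(design_vars: Dict[str, float]) -> None:
    """Steps 1-2: write the new expressions and flush them to 'disk'."""
    disk["prt"] = dict(design_vars)


def simulation_runner() -> Dict[str, float]:
    """Steps 3-5: takes no design_vars; geometry comes only from 'disk'."""
    return {"thickness_used": disk["prt"]["beam_thickness"]}


def extractor(result: Dict[str, float]) -> float:
    """Toy objective derived from the solved result."""
    return result["thickness_used"] * 2.0


def run_trial(
    design_vars: Dict[str, float],
    update: Callable[[Dict[str, float]], None],
    simulate: Callable[[], Dict[str, float]],
    extract: Callable[[Dict[str, float]], float],
) -> float:
    update(design_vars)   # model is updated and saved first
    result = simulate()   # solver is called without design_vars
    return extract(result)


objectives = [
    run_trial({"beam_thickness": t}, model_updater, simulation_runner, extractor)
    for t in (20.0, 25.0, 30.0)
]
```

Because the runner reads geometry only from the saved model, each trial's objective reflects that trial's design variables.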
**Changes**:
- llm_optimization_runner.py: Call simulation_runner() without arguments
- run_optimization.py: Remove design_vars parameter from simulation_runner closure
- import_expressions.py: Add theSession.Parts.CloseAll() to flush changes to disk
- test_phase_3_2_e2e.py: Fix remaining variable name bugs
**Test Results**:
✅ Trial 0: objective 7,315,679
✅ Trial 1: objective 9,158.67
✅ Trial 2: objective 7,655.28
FEM results are now DIFFERENT for each trial - optimization working correctly!
**Remaining Issue**: The LLM parses "20 to 30 mm" as a 0-1 range (separate fix needed)
Critical bug fix for LLM mode optimization:
**Problem**:
- NXParameterUpdater.update_expressions() uses an NX journal to import expressions (default use_nx_import=True)
- The NX journal updates the PRT file on disk directly and saves it
- run_optimization.py then called updater.save() afterwards
- save() writes self.content (loaded at initialization) back to the file
- This overwrote the NX journal's changes with stale binary content!
**Result**: All optimization trials produced identical FEM results because the model was never actually updated.
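
The overwrite can be reproduced in isolation. In this sketch `StaleUpdater` is a hypothetical stand-in for an updater that caches file content at initialization, and a plain `write_bytes` call stands in for the NX journal:

```python
import tempfile
from pathlib import Path


class StaleUpdater:
    """Caches the file content at init, like the behavior described above."""

    def __init__(self, path: Path):
        self.path = Path(path)
        self.content = self.path.read_bytes()  # snapshot taken here

    def save(self) -> None:
        # Writes the snapshot back, discarding any later on-disk changes.
        self.path.write_bytes(self.content)


with tempfile.TemporaryDirectory() as tmp:
    prt = Path(tmp) / "model.prt"
    prt.write_bytes(b"thickness=20")

    updater = StaleUpdater(prt)

    # External process (the NX journal in the real workflow) updates the file.
    prt.write_bytes(b"thickness=25")
    after_journal = prt.read_bytes()

    # The extra save() call silently restores the old content.
    updater.save()
    after_save = prt.read_bytes()
```

Dropping the save() call, as in fix 1 below, leaves the externally written content in place.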
**Fixes**:
1. Removed updater.save() call from model_updater closure in run_optimization.py
2. Added theSession.Parts.CloseAll() in import_expressions.py to ensure changes are flushed and file is released
3. Fixed test_phase_3_2_e2e.py variable name (best_trial_file → results_file)
**Testing**: Verified expressions persist to disk correctly with standalone test.
Next step: Address the remaining issue where FEM results are still identical (likely the solve journal is not reloading the updated PRT).