docs: Major documentation overhaul - restructure folders, update tagline, add Getting Started guide

- Restructure docs/ folder (remove numeric prefixes): - 04_USER_GUIDES -> guides/ - 05_API_REFERENCE -> api/ - 06_PHYSICS -> physics/ - 07_DEVELOPMENT -> development/ - 08_ARCHIVE -> archive/ - 09_DIAGRAMS -> diagrams/ - Replace tagline 'Talk, don't click' with 'LLM-driven optimization framework' in 9 files - Create comprehensive docs/GETTING_STARTED.md: - Prerequisites and quick setup - Project structure overview - First study tutorial (Claude or manual) - Dashboard usage guide - Neural acceleration introduction - Rewrite docs/00_INDEX.md with correct paths and modern structure - Archive obsolete files: - 01_PROTOCOLS.md -> archive/historical/01_PROTOCOLS_legacy.md - 03_GETTING_STARTED.md -> archive/historical/ - ATOMIZER_PODCAST_BRIEFING.md -> archive/marketing/ - Update timestamps to 2026-01-20 across all key files - Update .gitignore to exclude docs/generated/ - Version bump: ATOMIZER_CONTEXT v1.8 -> v2.0
2026-01-20 10:03:45 -05:00
parent 37f73cc2be
commit ea437d360e
103 changed files with 8980 additions and 327 deletions
--- a/docs/archive/phase_documents/PHASE_2_5_INTELLIGENT_GAP_DETECTION.md
+++ b/docs/archive/phase_documents/PHASE_2_5_INTELLIGENT_GAP_DETECTION.md
@@ -0,0 +1,253 @@
+# Phase 2.5: Intelligent Codebase-Aware Gap Detection
+
+## Problem Statement
+
+The current Research Agent uses dumb keyword matching and doesn't understand what already exists in the Atomizer codebase. When a user asks:
+
+> "I want to evaluate strain on a part with sol101 and optimize this (minimize) using iterations and optuna to lower it varying all my geometry parameters that contains v_ in its expression"
+
+**Current (Wrong) Behavior:**
+- Detects keyword "geometry"
+- Asks user for geometry examples
+- Completely misses the actual request
+
+**Expected (Correct) Behavior:**
+```
+Analyzing your optimization request...
+
+Workflow Components Identified:
+---------------------------------
+1. Run SOL101 analysis                    [KNOWN - nx_solver.py]
+2. Extract geometry parameters (v_ prefix) [KNOWN - expression system]
+3. Update parameter values                 [KNOWN - parameter updater]
+4. Optuna optimization loop               [KNOWN - optimization engine]
+5. Extract strain from OP2                [MISSING - not implemented]
+6. Minimize strain objective              [SIMPLE - max(strain values)]
+
+Knowledge Gap Analysis:
+-----------------------
+HAVE:  - OP2 displacement extraction (op2_extractor_example.py)
+HAVE:  - OP2 stress extraction (op2_extractor_example.py)
+MISSING: - OP2 strain extraction
+
+Research Needed:
+----------------
+Only need to learn: How to extract strain data from Nastran OP2 files using pyNastran
+
+Would you like me to:
+1. Search pyNastran documentation for strain extraction
+2. Look for strain extraction examples in op2_extractor_example.py pattern
+3. Ask you for an example of strain extraction code
+```
+
+## Solution Architecture
+
+### 1. Codebase Capability Analyzer
+
+Scan Atomizer to build capability index:
+
+```python
+class CodebaseCapabilityAnalyzer:
+    """Analyzes what Atomizer can already do."""
+
+    def analyze_codebase(self) -> Dict[str, Any]:
+        """
+        Returns:
+        {
+            'optimization': {
+                'optuna_integration': True,
+                'parameter_updating': True,
+                'expression_parsing': True
+            },
+            'simulation': {
+                'nx_solver': True,
+                'sol101': True,
+                'sol103': False
+            },
+            'result_extraction': {
+                'displacement': True,
+                'stress': True,
+                'strain': False,  # <-- THE GAP!
+                'modal': False
+            }
+        }
+        """
+```
+
+### 2. Workflow Decomposer
+
+Break user request into atomic steps:
+
+```python
+class WorkflowDecomposer:
+    """Breaks complex requests into atomic workflow steps."""
+
+    def decompose(self, user_request: str) -> List[WorkflowStep]:
+        """
+        Input: "minimize strain using SOL101 and optuna varying v_ params"
+
+        Output:
+        [
+            WorkflowStep("identify_parameters", domain="geometry", params={"filter": "v_"}),
+            WorkflowStep("update_parameters", domain="geometry", params={"values": "from_optuna"}),
+            WorkflowStep("run_analysis", domain="simulation", params={"solver": "SOL101"}),
+            WorkflowStep("extract_strain", domain="results", params={"metric": "max_strain"}),
+            WorkflowStep("optimize", domain="optimization", params={"objective": "minimize", "algorithm": "optuna"})
+        ]
+        """
+```
+
+### 3. Capability Matcher
+
+Match workflow steps to existing capabilities:
+
+```python
+class CapabilityMatcher:
+    """Matches required workflow steps to existing capabilities."""
+
+    def match(self, workflow_steps, capabilities) -> CapabilityMatch:
+        """
+        Returns:
+        {
+            'known_steps': [
+                {'step': 'identify_parameters', 'implementation': 'expression_parser.py'},
+                {'step': 'update_parameters', 'implementation': 'parameter_updater.py'},
+                {'step': 'run_analysis', 'implementation': 'nx_solver.py'},
+                {'step': 'optimize', 'implementation': 'optuna_optimizer.py'}
+            ],
+            'unknown_steps': [
+                {'step': 'extract_strain', 'similar_to': 'extract_stress', 'gap': 'strain_from_op2'}
+            ],
+            'confidence': 0.80  # 4/5 steps known
+        }
+        """
+```
+
+### 4. Targeted Research Planner
+
+Create research plan ONLY for missing pieces:
+
+```python
+class TargetedResearchPlanner:
+    """Creates research plan focused on actual gaps."""
+
+    def plan(self, unknown_steps) -> ResearchPlan:
+        """
+        For gap='strain_from_op2', similar_to='stress_from_op2':
+
+        Research Plan:
+        1. Read existing op2_extractor_example.py to understand pattern
+        2. Search pyNastran docs for strain extraction API
+        3. If not found, ask user for strain extraction example
+        4. Generate extract_strain() function following same pattern as extract_stress()
+        """
+```
+
+## Implementation Plan
+
+### Week 1: Capability Analysis
+- [X] Map existing Atomizer capabilities
+- [X] Build capability index from code
+- [X] Create capability query system
+
+### Week 2: Workflow Decomposition
+- [X] Build workflow step extractor
+- [X] Create domain classifier
+- [X] Implement step-to-capability matcher
+
+### Week 3: Intelligent Gap Detection
+- [X] Integrate all components
+- [X] Test with strain optimization request
+- [X] Verify correct gap identification
+
+## Success Criteria
+
+**Test Input:**
+"minimize strain using SOL101 and optuna varying v_ parameters"
+
+**Expected Output:**
+```
+Request Analysis Complete
+-------------------------
+
+Known Capabilities (80%):
+- Parameter identification (v_ prefix filter)
+- Parameter updating
+- SOL101 simulation execution
+- Optuna optimization loop
+
+Missing Capability (20%):
+- Strain extraction from OP2 files
+
+Recommendation:
+The only missing piece is extracting strain data from Nastran OP2 output files.
+I found a similar implementation for stress extraction in op2_extractor_example.py.
+
+Would you like me to:
+1. Research pyNastran strain extraction API
+2. Generate extract_max_strain() function following the stress extraction pattern
+3. Integrate into your optimization workflow
+
+Research needed: Minimal (1 function, ~50 lines of code)
+```
+
+## Benefits
+
+1. **Accurate Gap Detection**: Only identifies actual missing capabilities
+2. **Minimal Research**: Focuses effort on real unknowns
+3. **Leverages Existing Code**: Understands what you already have
+4. **Better UX**: Clear explanation of what's known vs unknown
+5. **Faster Iterations**: Doesn't waste time on known capabilities
+
+## Current Status
+
+- [X] Problem identified
+- [X] Solution architecture designed
+- [X] Implementation completed
+- [X] All tests passing
+
+## Implementation Summary
+
+Phase 2.5 has been successfully implemented with 4 core components:
+
+1. **CodebaseCapabilityAnalyzer** ([codebase_analyzer.py](../optimization_engine/codebase_analyzer.py))
+   - Scans Atomizer codebase for existing capabilities
+   - Identifies what's implemented vs missing
+   - Finds similar capabilities for pattern reuse
+
+2. **WorkflowDecomposer** ([workflow_decomposer.py](../optimization_engine/workflow_decomposer.py))
+   - Breaks user requests into atomic workflow steps
+   - Extracts parameters from natural language
+   - Classifies steps by domain
+
+3. **CapabilityMatcher** ([capability_matcher.py](../optimization_engine/capability_matcher.py))
+   - Matches workflow steps to existing code
+   - Identifies actual knowledge gaps
+   - Calculates confidence based on pattern similarity
+
+4. **TargetedResearchPlanner** ([targeted_research_planner.py](../optimization_engine/targeted_research_planner.py))
+   - Creates focused research plans
+   - Leverages similar capabilities when available
+   - Prioritizes research sources
+
+## Test Results
+
+Run the comprehensive test:
+```bash
+python tests/test_phase_2_5_intelligent_gap_detection.py
+```
+
+**Test Output (strain optimization request):**
+- Workflow: 5 steps identified
+- Known: 4/5 steps (80% coverage)
+- Missing: Only strain extraction
+- Similar: Can adapt from displacement/stress
+- Overall confidence: 90%
+- Research plan: 4 focused steps
+
+## Next Steps
+
+1. Integrate Phase 2.5 with existing Research Agent
+2. Update interactive session to use new gap detection
+3. Test with diverse optimization requests
+4. Build MCP integration for documentation search
--- a/docs/archive/phase_documents/PHASE_2_7_LLM_INTEGRATION.md
+++ b/docs/archive/phase_documents/PHASE_2_7_LLM_INTEGRATION.md
@@ -0,0 +1,245 @@
+# Phase 2.7: LLM-Powered Workflow Intelligence
+
+## Problem: Static Regex vs. Dynamic Intelligence
+
+**Previous Approach (Phase 2.5-2.6):**
+- ❌ Dumb regex patterns to extract workflow steps
+- ❌ Static rules for step classification
+- ❌ Missed intermediate calculations
+- ❌ Couldn't understand nuance (CBUSH vs CBAR, element forces vs reaction forces)
+
+**New Approach (Phase 2.7):**
+- ✅ **Use Claude LLM to analyze user requests**
+- ✅ **Understand engineering context dynamically**
+- ✅ **Detect ALL intermediate steps intelligently**
+- ✅ **Distinguish subtle differences (element types, directions, metrics)**
+
+## Architecture
+
+```
+User Request
+     ↓
+LLM Analyzer (Claude)
+     ↓
+Structured JSON Analysis
+     ↓
+┌────────────────────────────────────┐
+│ Engineering Features (FEA)         │
+│ Inline Calculations (Math)         │
+│ Post-Processing Hooks (Custom)     │
+│ Optimization Config                │
+└────────────────────────────────────┘
+     ↓
+Phase 2.5 Capability Matching
+     ↓
+Research Plan / Code Generation
+```
+
+## Example: CBAR Optimization Request
+
+**User Input:**
+```
+I want to extract forces in direction Z of all the 1D elements and find the average of it,
+then find the minimum value and compare it to the average, then assign it to a objective
+metric that needs to be minimized.
+
+I want to iterate on the FEA properties of the Cbar element stiffness in X to make the
+objective function minimized.
+
+I want to use genetic algorithm to iterate and optimize this
+```
+
+**LLM Analysis Output:**
+```json
+{
+  "engineering_features": [
+    {
+      "action": "extract_1d_element_forces",
+      "domain": "result_extraction",
+      "description": "Extract element forces from CBAR in Z direction from OP2",
+      "params": {
+        "element_types": ["CBAR"],
+        "result_type": "element_force",
+        "direction": "Z"
+      }
+    },
+    {
+      "action": "update_cbar_stiffness",
+      "domain": "fea_properties",
+      "description": "Modify CBAR stiffness in X direction",
+      "params": {
+        "element_type": "CBAR",
+        "property": "stiffness_x"
+      }
+    }
+  ],
+  "inline_calculations": [
+    {
+      "action": "calculate_average",
+      "params": {"input": "forces_z", "operation": "mean"},
+      "code_hint": "avg = sum(forces_z) / len(forces_z)"
+    },
+    {
+      "action": "find_minimum",
+      "params": {"input": "forces_z", "operation": "min"},
+      "code_hint": "min_val = min(forces_z)"
+    }
+  ],
+  "post_processing_hooks": [
+    {
+      "action": "custom_objective_metric",
+      "description": "Compare min to average",
+      "params": {
+        "inputs": ["min_force", "avg_force"],
+        "formula": "min_force / avg_force",
+        "objective": "minimize"
+      }
+    }
+  ],
+  "optimization": {
+    "algorithm": "genetic_algorithm",
+    "design_variables": [
+      {"parameter": "cbar_stiffness_x", "type": "FEA_property"}
+    ]
+  }
+}
+```
+
+## Key Intelligence Improvements
+
+### 1. Detects Intermediate Steps
+**Old (Regex):**
+- ❌ Only saw "extract forces" and "optimize"
+- ❌ Missed average, minimum, comparison
+
+**New (LLM):**
+- ✅ Identifies: extract → average → min → compare → optimize
+- ✅ Classifies each as engineering vs. simple math
+
+### 2. Understands Engineering Context
+**Old (Regex):**
+- ❌ "forces" → generic "reaction_force" extraction
+- ❌ Didn't distinguish CBUSH from CBAR
+
+**New (LLM):**
+- ✅ "1D element forces" → element forces (not reaction forces)
+- ✅ "CBAR stiffness in X" → specific property in specific direction
+- ✅ Understands these come from different sources (OP2 vs property cards)
+
+### 3. Smart Classification
+**Old (Regex):**
+```python
+if 'average' in text:
+    return 'simple_calculation'  # Dumb!
+```
+
+**New (LLM):**
+```python
+# LLM reasoning:
+# - "average of forces" → simple Python (sum/len)
+# - "extract forces from OP2" → engineering (pyNastran)
+# - "compare min to avg for objective" → hook (custom logic)
+```
+
+### 4. Generates Actionable Code Hints
+**Old:** Just action names like "calculate_average"
+
+**New:** Includes code hints for auto-generation:
+```json
+{
+  "action": "calculate_average",
+  "code_hint": "avg = sum(forces_z) / len(forces_z)"
+}
+```
+
+## Integration with Existing Phases
+
+### Phase 2.5 (Capability Matching)
+LLM output feeds directly into existing capability matcher:
+- Engineering features → check if implemented
+- If missing → create research plan
+- If similar → adapt existing code
+
+### Phase 2.6 (Step Classification)
+Now **replaced by LLM** for better accuracy:
+- No more static rules
+- Context-aware classification
+- Understands subtle differences
+
+## Implementation
+
+**File:** `optimization_engine/llm_workflow_analyzer.py`
+
+**Key Function:**
+```python
+analyzer = LLMWorkflowAnalyzer(api_key=os.getenv('ANTHROPIC_API_KEY'))
+analysis = analyzer.analyze_request(user_request)
+
+# Returns structured JSON with:
+# - engineering_features
+# - inline_calculations
+# - post_processing_hooks
+# - optimization config
+```
+
+## Benefits
+
+1. **Accurate**: Understands engineering nuance
+2. **Complete**: Detects ALL steps, including intermediate ones
+3. **Dynamic**: No hardcoded patterns to maintain
+4. **Extensible**: Automatically handles new request types
+5. **Actionable**: Provides code hints for auto-generation
+
+## LLM Integration Modes
+
+### Development Mode (Recommended)
+For development within Claude Code:
+- Use Claude Code directly for interactive workflow analysis
+- No API consumption or costs
+- Real-time feedback and iteration
+- Perfect for testing and refinement
+
+### Production Mode (Future)
+For standalone Atomizer execution:
+- Optional Anthropic API integration
+- Set `ANTHROPIC_API_KEY` environment variable
+- Falls back to heuristics if no key provided
+- Useful for automated batch processing
+
+**Current Status**: llm_workflow_analyzer.py supports both modes. For development, continue using Claude Code interactively.
+
+## Next Steps
+
+1. ✅ Install anthropic package
+2. ✅ Create LLM analyzer module
+3. ✅ Document integration modes
+4. ⏳ Integrate with Phase 2.5 capability matcher
+5. ⏳ Test with diverse optimization requests via Claude Code
+6. ⏳ Build code generator for inline calculations
+7. ⏳ Build hook generator for post-processing
+
+## Success Criteria
+
+**Input:**
+"Extract 1D forces, find average, find minimum, compare to average, optimize CBAR stiffness"
+
+**Output:**
+```
+Engineering Features: 2 (need research)
+  - extract_1d_element_forces
+  - update_cbar_stiffness
+
+Inline Calculations: 2 (auto-generate)
+  - calculate_average
+  - find_minimum
+
+Post-Processing: 1 (generate hook)
+  - custom_objective_metric (min/avg ratio)
+
+Optimization: 1
+  - genetic_algorithm
+
+✅ All steps detected
+✅ Correctly classified
+✅ Ready for implementation
+```
--- a/docs/archive/phase_documents/PHASE_3_2_INTEGRATION_PLAN.md
+++ b/docs/archive/phase_documents/PHASE_3_2_INTEGRATION_PLAN.md
@@ -0,0 +1,699 @@
+# Phase 3.2: LLM Integration Roadmap
+
+**Status**: ✅ **WEEK 1 COMPLETE** - 🎯 **Week 2 IN PROGRESS**
+**Timeline**: 2-4 weeks
+**Last Updated**: 2025-11-17
+**Current Progress**: 25% (Week 1/4 Complete)
+
+---
+
+## Executive Summary
+
+### The Problem
+We've built 85% of an LLM-native optimization system, but **it's not integrated into production**. The components exist but are disconnected islands:
+
+- ✅ **LLMWorkflowAnalyzer** - Parses natural language → workflow (Phase 2.7)
+- ✅ **ExtractorOrchestrator** - Auto-generates result extractors (Phase 3.1)
+- ✅ **InlineCodeGenerator** - Creates custom calculations (Phase 2.8)
+- ✅ **HookGenerator** - Generates post-processing hooks (Phase 2.9)
+- ✅ **LLMOptimizationRunner** - Orchestrates LLM workflow (Phase 3.2)
+- ⚠️ **ResearchAgent** - Learns from examples (Phase 2, partially complete)
+
+**Reality**: Users still write 100+ lines of JSON config manually instead of using 3 lines of natural language.
+
+### The Solution
+**Phase 3.2 Integration Sprint**: Wire LLM components into production workflow with a single `--llm` flag.
+
+---
+
+## Strategic Roadmap
+
+### Week 1: Make LLM Mode Accessible (16 hours)
+
+**Goal**: Users can invoke LLM mode with a single command
+
+#### Tasks
+
+**1.1 Create Unified Entry Point** (4 hours) ✅ COMPLETE
+- [x] Create `optimization_engine/run_optimization.py` as unified CLI
+- [x] Add `--llm` flag for natural language mode
+- [x] Add `--request` parameter for natural language input
+- [x] Preserve existing `--config` for traditional JSON mode
+- [x] Support both modes in parallel (no breaking changes)
+
+**Files**:
+- `optimization_engine/run_optimization.py` (NEW)
+
+**Success Metric**:
+```bash
+python optimization_engine/run_optimization.py --llm \
+  --request "Minimize stress for bracket. Vary wall thickness 3-8mm" \
+  --prt studies/bracket/model/Bracket.prt \
+  --sim studies/bracket/model/Bracket_sim1.sim
+```
+
+---
+
+**1.2 Wire LLMOptimizationRunner to Production** (8 hours) ✅ COMPLETE
+- [x] Connect LLMWorkflowAnalyzer to entry point
+- [x] Bridge LLMOptimizationRunner → OptimizationRunner for execution
+- [x] Pass model updater and simulation runner callables
+- [x] Integrate with existing hook system
+- [x] Preserve all logging (detailed logs, optimization.log)
+- [x] Add workflow validation and error handling
+- [x] Create comprehensive integration test suite (5/5 tests passing)
+
+**Files Modified**:
+- `optimization_engine/run_optimization.py`
+- `optimization_engine/llm_optimization_runner.py` (integration points)
+
+**Success Metric**: LLM workflow generates extractors → runs FEA → logs results
+
+---
+
+**1.3 Create Minimal Example** (2 hours) ✅ COMPLETE
+- [x] Create `examples/llm_mode_simple_example.py`
+- [x] Show: Natural language request → Optimization results
+- [x] Compare: Traditional mode (100 lines JSON) vs LLM mode (3 lines)
+- [x] Include troubleshooting tips
+
+**Files Created**:
+- `examples/llm_mode_simple_example.py`
+
+**Success Metric**: Example runs successfully, demonstrates value ✅
+
+---
+
+**1.4 End-to-End Integration Test** (2 hours) ✅ COMPLETE
+- [x] Test with simple_beam_optimization study
+- [x] Natural language → JSON workflow → NX solve → Results
+- [x] Verify all extractors generated correctly
+- [x] Check logs created properly
+- [x] Validate output matches manual mode
+- [x] Test graceful failure without API key
+- [x] Comprehensive verification of all output files
+
+**Files Created**:
+- `tests/test_phase_3_2_e2e.py`
+
+**Success Metric**: LLM mode completes beam optimization without errors ✅
+
+---
+
+### Week 2: Robustness & Safety (16 hours)
+
+**Goal**: LLM mode handles failures gracefully, never crashes
+
+#### Tasks
+
+**2.1 Code Validation Pipeline** (6 hours)
+- [ ] Create `optimization_engine/code_validator.py`
+- [ ] Implement syntax validation (ast.parse)
+- [ ] Implement security scanning (whitelist imports)
+- [ ] Implement test execution on example OP2
+- [ ] Implement output schema validation
+- [ ] Add retry with LLM feedback on validation failure
+
+**Files Created**:
+- `optimization_engine/code_validator.py`
+
+**Integration Points**:
+- `optimization_engine/extractor_orchestrator.py` (validate before saving)
+- `optimization_engine/inline_code_generator.py` (validate calculations)
+
+**Success Metric**: Generated code passes validation, or LLM fixes based on feedback
+
+---
+
+**2.2 Graceful Fallback Mechanisms** (4 hours)
+- [ ] Wrap all LLM calls in try/except
+- [ ] Provide clear error messages
+- [ ] Offer fallback to manual mode
+- [ ] Log failures to audit trail
+- [ ] Never crash on LLM failure
+
+**Files Modified**:
+- `optimization_engine/run_optimization.py`
+- `optimization_engine/llm_workflow_analyzer.py`
+- `optimization_engine/llm_optimization_runner.py`
+
+**Success Metric**: LLM failures degrade gracefully to manual mode
+
+---
+
+**2.3 LLM Audit Trail** (3 hours)
+- [ ] Create `optimization_engine/llm_audit.py`
+- [ ] Log all LLM requests and responses
+- [ ] Log generated code with prompts
+- [ ] Log validation results
+- [ ] Create `llm_audit.json` in study output directory
+
+**Files Created**:
+- `optimization_engine/llm_audit.py`
+
+**Integration Points**:
+- All LLM components log to audit trail
+
+**Success Metric**: Full LLM decision trace available for debugging
+
+---
+
+**2.4 Failure Scenario Testing** (3 hours)
+- [ ] Test: Invalid natural language request
+- [ ] Test: LLM unavailable (API down)
+- [ ] Test: Generated code has syntax error
+- [ ] Test: Generated code fails validation
+- [ ] Test: OP2 file format unexpected
+- [ ] Verify all fail gracefully
+
+**Files Created**:
+- `tests/test_llm_failure_modes.py`
+
+**Success Metric**: All failure scenarios handled without crashes
+
+---
+
+### Week 3: Learning System (12 hours)
+
+**Goal**: System learns from successful workflows and reuses patterns
+
+#### Tasks
+
+**3.1 Knowledge Base Implementation** (4 hours)
+- [ ] Create `optimization_engine/knowledge_base.py`
+- [ ] Implement `save_session()` - Save successful workflows
+- [ ] Implement `search_templates()` - Find similar past workflows
+- [ ] Implement `get_template()` - Retrieve reusable pattern
+- [ ] Add confidence scoring (user-validated > LLM-generated)
+
+**Files Created**:
+- `optimization_engine/knowledge_base.py`
+- `knowledge_base/sessions/` (directory for session logs)
+- `knowledge_base/templates/` (directory for reusable patterns)
+
+**Success Metric**: Successful workflows saved with metadata
+
+---
+
+**3.2 Template Extraction** (4 hours)
+- [ ] Analyze generated extractor code to identify patterns
+- [ ] Extract reusable template structure
+- [ ] Parameterize variable parts
+- [ ] Save template with usage examples
+- [ ] Implement template application to new requests
+
+**Files Modified**:
+- `optimization_engine/extractor_orchestrator.py`
+
+**Integration**:
+```python
+# After successful generation:
+template = extract_template(generated_code)
+knowledge_base.save_template(feature_name, template, confidence='medium')
+
+# On next request:
+existing_template = knowledge_base.search_templates(feature_name)
+if existing_template and existing_template.confidence > 0.7:
+    code = existing_template.apply(new_params)  # Reuse!
+```
+
+**Success Metric**: Second identical request reuses template (faster)
+
+---
+
+**3.3 ResearchAgent Integration** (4 hours)
+- [ ] Complete ResearchAgent implementation
+- [ ] Integrate into ExtractorOrchestrator error handling
+- [ ] Add user example collection workflow
+- [ ] Implement pattern learning from examples
+- [ ] Save learned knowledge to knowledge base
+
+**Files Modified**:
+- `optimization_engine/research_agent.py` (complete implementation)
+- `optimization_engine/llm_optimization_runner.py` (integrate ResearchAgent)
+
+**Workflow**:
+```
+Unknown feature requested
+  → ResearchAgent asks user for example
+  → Learns pattern from example
+  → Generates feature using pattern
+  → Saves to knowledge base
+  → Retry with new feature
+```
+
+**Success Metric**: Unknown feature request triggers learning loop successfully
+
+---
+
+### Week 4: Documentation & Discoverability (8 hours)
+
+**Goal**: Users discover and understand LLM capabilities
+
+#### Tasks
+
+**4.1 Update README** (2 hours)
+- [ ] Add "🤖 LLM-Powered Mode" section to README.md
+- [ ] Show example command with natural language
+- [ ] Explain what LLM mode can do
+- [ ] Link to detailed docs
+
+**Files Modified**:
+- `README.md`
+
+**Success Metric**: README clearly shows LLM capabilities upfront
+
+---
+
+**4.2 Create LLM Mode Documentation** (3 hours)
+- [ ] Create `docs/LLM_MODE.md`
+- [ ] Explain how LLM mode works
+- [ ] Provide usage examples
+- [ ] Document when to use LLM vs manual mode
+- [ ] Add troubleshooting guide
+- [ ] Explain learning system
+
+**Files Created**:
+- `docs/LLM_MODE.md`
+
+**Contents**:
+- How it works (architecture diagram)
+- Getting started (first LLM optimization)
+- Natural language patterns that work well
+- Troubleshooting common issues
+- How learning system improves over time
+
+**Success Metric**: Users understand LLM mode from docs
+
+---
+
+**4.3 Create Demo Video/GIF** (1 hour)
+- [ ] Record terminal session: Natural language → Results
+- [ ] Show before/after (100 lines JSON vs 3 lines)
+- [ ] Create animated GIF for README
+- [ ] Add to documentation
+
+**Files Created**:
+- `docs/demo/llm_mode_demo.gif`
+
+**Success Metric**: Visual demo shows value proposition clearly
+
+---
+
+**4.4 Update All Planning Docs** (2 hours)
+- [ ] Update DEVELOPMENT.md with Phase 3.2 completion status
+- [ ] Update DEVELOPMENT_GUIDANCE.md progress (80-90% → 90-95%)
+- [ ] Update DEVELOPMENT_ROADMAP.md Phase 3 status
+- [ ] Mark Phase 3.2 as ✅ Complete
+
+**Files Modified**:
+- `DEVELOPMENT.md`
+- `DEVELOPMENT_GUIDANCE.md`
+- `DEVELOPMENT_ROADMAP.md`
+
+**Success Metric**: All docs reflect completed Phase 3.2
+
+---
+
+## Implementation Details
+
+### Entry Point Architecture
+
+```python
+# optimization_engine/run_optimization.py (NEW)
+
+import argparse
+from pathlib import Path
+
+def main():
+    parser = argparse.ArgumentParser(
+        description="Atomizer Optimization Engine - Manual or LLM-powered mode"
+    )
+
+    # Mode selection
+    mode_group = parser.add_mutually_exclusive_group(required=True)
+    mode_group.add_argument('--llm', action='store_true',
+                           help='Use LLM-assisted workflow (natural language mode)')
+    mode_group.add_argument('--config', type=Path,
+                           help='JSON config file (traditional mode)')
+
+    # LLM mode parameters
+    parser.add_argument('--request', type=str,
+                       help='Natural language optimization request (required with --llm)')
+
+    # Common parameters
+    parser.add_argument('--prt', type=Path, required=True,
+                       help='Path to .prt file')
+    parser.add_argument('--sim', type=Path, required=True,
+                       help='Path to .sim file')
+    parser.add_argument('--output', type=Path,
+                       help='Output directory (default: auto-generated)')
+    parser.add_argument('--trials', type=int, default=50,
+                       help='Number of optimization trials')
+
+    args = parser.parse_args()
+
+    if args.llm:
+        run_llm_mode(args)
+    else:
+        run_traditional_mode(args)
+
+
+def run_llm_mode(args):
+    """LLM-powered natural language mode."""
+    from optimization_engine.llm_workflow_analyzer import LLMWorkflowAnalyzer
+    from optimization_engine.llm_optimization_runner import LLMOptimizationRunner
+    from optimization_engine.nx_updater import NXParameterUpdater
+    from optimization_engine.nx_solver import NXSolver
+    from optimization_engine.llm_audit import LLMAuditLogger
+
+    if not args.request:
+        raise ValueError("--request required with --llm mode")
+
+    print(f"🤖 LLM Mode: Analyzing request...")
+    print(f"   Request: {args.request}")
+
+    # Initialize audit logger
+    audit_logger = LLMAuditLogger(args.output / "llm_audit.json")
+
+    # Analyze natural language request
+    analyzer = LLMWorkflowAnalyzer(use_claude_code=True)
+
+    try:
+        workflow = analyzer.analyze_request(args.request)
+        audit_logger.log_analysis(args.request, workflow,
+                                  reasoning=workflow.get('llm_reasoning', ''))
+
+        print(f"✓ Workflow created:")
+        print(f"  - Design variables: {len(workflow['design_variables'])}")
+        print(f"  - Objectives: {len(workflow['objectives'])}")
+        print(f"  - Extractors: {len(workflow['engineering_features'])}")
+
+    except Exception as e:
+        print(f"✗ LLM analysis failed: {e}")
+        print("  Falling back to manual mode. Please provide --config instead.")
+        return
+
+    # Create model updater and solver callables
+    updater = NXParameterUpdater(args.prt)
+    solver = NXSolver()
+
+    def model_updater(design_vars):
+        updater.update_expressions(design_vars)
+
+    def simulation_runner():
+        result = solver.run_simulation(args.sim)
+        return result['op2_file']
+
+    # Run LLM-powered optimization
+    runner = LLMOptimizationRunner(
+        llm_workflow=workflow,
+        model_updater=model_updater,
+        simulation_runner=simulation_runner,
+        study_name=args.output.name if args.output else "llm_optimization",
+        output_dir=args.output
+    )
+
+    study = runner.run(n_trials=args.trials)
+
+    print(f"\n✓ Optimization complete!")
+    print(f"  Best trial: {study.best_trial.number}")
+    print(f"  Best value: {study.best_value:.6f}")
+    print(f"  Results: {args.output}")
+
+
+def run_traditional_mode(args):
+    """Traditional JSON configuration mode."""
+    from optimization_engine.runner import OptimizationRunner
+    import json
+
+    print(f"📄 Traditional Mode: Loading config...")
+
+    with open(args.config) as f:
+        config = json.load(f)
+
+    runner = OptimizationRunner(
+        config_file=args.config,
+        prt_file=args.prt,
+        sim_file=args.sim,
+        output_dir=args.output
+    )
+
+    study = runner.run(n_trials=args.trials)
+
+    print(f"\n✓ Optimization complete!")
+    print(f"  Results: {args.output}")
+
+
+if __name__ == '__main__':
+    main()
+```
+
+---
+
+### Validation Pipeline
+
+```python
+# optimization_engine/code_validator.py (NEW)
+
+import ast
+import subprocess
+import tempfile
+from pathlib import Path
+from typing import Dict, Any, List
+
+class CodeValidator:
+    """
+    Validates LLM-generated code before execution.
+
+    Checks:
+    1. Syntax (ast.parse)
+    2. Security (whitelist imports)
+    3. Test execution on example data
+    4. Output schema validation
+    """
+
+    ALLOWED_IMPORTS = {
+        'pyNastran', 'numpy', 'pathlib', 'typing', 'dataclasses',
+        'json', 'sys', 'os', 'math', 'collections'
+    }
+
+    FORBIDDEN_CALLS = {
+        'eval', 'exec', 'compile', '__import__', 'open',
+        'subprocess', 'os.system', 'os.popen'
+    }
+
+    def validate_extractor(self, code: str, test_op2_file: Path) -> Dict[str, Any]:
+        """
+        Validate generated extractor code.
+
+        Args:
+            code: Generated Python code
+            test_op2_file: Example OP2 file for testing
+
+        Returns:
+            {
+                'valid': bool,
+                'error': str (if invalid),
+                'test_result': dict (if valid)
+            }
+        """
+        # 1. Syntax check
+        try:
+            tree = ast.parse(code)
+        except SyntaxError as e:
+            return {
+                'valid': False,
+                'error': f'Syntax error: {e}',
+                'stage': 'syntax'
+            }
+
+        # 2. Security scan
+        security_result = self._check_security(tree)
+        if not security_result['safe']:
+            return {
+                'valid': False,
+                'error': security_result['error'],
+                'stage': 'security'
+            }
+
+        # 3. Test execution
+        try:
+            test_result = self._test_execution(code, test_op2_file)
+        except Exception as e:
+            return {
+                'valid': False,
+                'error': f'Runtime error: {e}',
+                'stage': 'execution'
+            }
+
+        # 4. Output schema validation
+        schema_result = self._validate_output_schema(test_result)
+        if not schema_result['valid']:
+            return {
+                'valid': False,
+                'error': schema_result['error'],
+                'stage': 'schema'
+            }
+
+        return {
+            'valid': True,
+            'test_result': test_result
+        }
+
+    def _check_security(self, tree: ast.AST) -> Dict[str, Any]:
+        """Check for dangerous imports and function calls."""
+        for node in ast.walk(tree):
+            # Check imports
+            if isinstance(node, ast.Import):
+                for alias in node.names:
+                    module = alias.name.split('.')[0]
+                    if module not in self.ALLOWED_IMPORTS:
+                        return {
+                            'safe': False,
+                            'error': f'Disallowed import: {alias.name}'
+                        }
+
+            # Check function calls
+            if isinstance(node, ast.Call):
+                if isinstance(node.func, ast.Name):
+                    if node.func.id in self.FORBIDDEN_CALLS:
+                        return {
+                            'safe': False,
+                            'error': f'Forbidden function call: {node.func.id}'
+                        }
+
+        return {'safe': True}
+
+    def _test_execution(self, code: str, test_file: Path) -> Dict[str, Any]:
+        """Execute code in sandboxed environment with test data."""
+        # Write code to temp file
+        with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
+            f.write(code)
+            temp_code_file = Path(f.name)
+
+        try:
+            # Execute in subprocess (sandboxed)
+            result = subprocess.run(
+                ['python', str(temp_code_file), str(test_file)],
+                capture_output=True,
+                text=True,
+                timeout=30
+            )
+
+            if result.returncode != 0:
+                raise RuntimeError(f"Execution failed: {result.stderr}")
+
+            # Parse JSON output
+            import json
+            output = json.loads(result.stdout)
+            return output
+
+        finally:
+            temp_code_file.unlink()
+
+    def _validate_output_schema(self, output: Dict[str, Any]) -> Dict[str, Any]:
+        """Validate output matches expected extractor schema."""
+        # All extractors must return dict with numeric values
+        if not isinstance(output, dict):
+            return {
+                'valid': False,
+                'error': 'Output must be a dictionary'
+            }
+
+        # Check for at least one result value
+        if not any(key for key in output if not key.startswith('_')):
+            return {
+                'valid': False,
+                'error': 'No result values found in output'
+            }
+
+        # All values must be numeric
+        for key, value in output.items():
+            if not key.startswith('_'):  # Skip metadata
+                if not isinstance(value, (int, float)):
+                    return {
+                        'valid': False,
+                        'error': f'Non-numeric value for {key}: {type(value)}'
+                    }
+
+        return {'valid': True}
+```
+
+---
+
+## Success Metrics
+
+### Week 1 Success
+- [ ] LLM mode accessible via `--llm` flag
+- [ ] Natural language request → Workflow generation works
+- [ ] End-to-end test passes (simple_beam_optimization)
+- [ ] Example demonstrates value (100 lines → 3 lines)
+
+### Week 2 Success
+- [ ] Generated code validated before execution
+- [ ] All failure scenarios degrade gracefully (no crashes)
+- [ ] Complete LLM audit trail in `llm_audit.json`
+- [ ] Test suite covers failure modes
+
+### Week 3 Success
+- [ ] Successful workflows saved to knowledge base
+- [ ] Second identical request reuses template (faster)
+- [ ] Unknown features trigger ResearchAgent learning loop
+- [ ] Knowledge base grows over time
+
+### Week 4 Success
+- [ ] README shows LLM mode prominently
+- [ ] docs/LLM_MODE.md complete and clear
+- [ ] Demo video/GIF shows value proposition
+- [ ] All planning docs updated
+
+---
+
+## Risk Mitigation
+
+### Risk: LLM generates unsafe code
+**Mitigation**: Multi-stage validation pipeline (syntax, security, test, schema)
+
+### Risk: LLM unavailable (API down)
+**Mitigation**: Graceful fallback to manual mode with clear error message
+
+### Risk: Generated code fails at runtime
+**Mitigation**: Sandboxed test execution before saving, retry with LLM feedback
+
+### Risk: Users don't discover LLM mode
+**Mitigation**: Prominent README section, demo video, clear examples
+
+### Risk: Learning system fills disk with templates
+**Mitigation**: Confidence-based pruning, max template limit, user confirmation for saves
+
+---
+
+## Next Steps After Phase 3.2
+
+Once integration is complete:
+
+1. **Validate with Real Studies**
+   - Run simple_beam_optimization in LLM mode
+   - Create new study using only natural language
+   - Compare results manual vs LLM mode
+
+2. **Fix atomizer Conda Environment**
+   - Rebuild clean environment
+   - Test visualization in atomizer env
+
+3. **NXOpen Documentation Integration** (Phase 2, remaining tasks)
+   - Research Siemens docs portal access
+   - Integrate NXOpen stub files for intellisense
+   - Enable LLM to reference NXOpen API
+
+4. **Phase 4: Dynamic Code Generation** (Roadmap)
+   - Journal script generator
+   - Custom function templates
+   - Safe execution sandbox
+
+---
+
+**Last Updated**: 2025-11-17
+**Owner**: Antoine Polvé
+**Status**: Ready to begin Week 1 implementation
--- a/docs/archive/phase_documents/PHASE_3_2_INTEGRATION_STATUS.md
+++ b/docs/archive/phase_documents/PHASE_3_2_INTEGRATION_STATUS.md
@@ -0,0 +1,346 @@
+# Phase 3.2 Integration Status
+
+> **Date**: 2025-11-17
+> **Status**: Partially Complete - Framework Ready, API Integration Pending
+
+---
+
+## Overview
+
+Phase 3.2 aims to integrate the LLM components (Phases 2.5-3.1) into the production optimization workflow, enabling users to run optimizations using natural language requests.
+
+**Goal**: Enable users to run:
+```bash
+python run_optimization.py --llm "maximize displacement, ensure safety factor > 4"
+```
+
+---
+
+## What's Been Completed ✅
+
+### 1. Generic Optimization Runner (`optimization_engine/run_optimization.py`)
+
+**Created**: 2025-11-17
+
+A flexible, command-line driven optimization runner supporting both LLM and manual modes:
+
+```bash
+# LLM Mode (Natural Language)
+python optimization_engine/run_optimization.py \
+    --llm "maximize displacement, ensure safety factor > 4" \
+    --prt model/Bracket.prt \
+    --sim model/Bracket_sim1.sim \
+    --trials 20
+
+# Manual Mode (JSON Config)
+python optimization_engine/run_optimization.py \
+    --config config.json \
+    --prt model/Bracket.prt \
+    --sim model/Bracket_sim1.sim \
+    --trials 50
+```
+
+**Features**:
+- ✅ Command-line argument parsing (`--llm`, `--config`, `--prt`, `--sim`, etc.)
+- ✅ Integration with `LLMWorkflowAnalyzer` for natural language parsing
+- ✅ Integration with `LLMOptimizationRunner` for automated extractor/hook generation
+- ✅ Proper error handling and user feedback
+- ✅ Comprehensive help message with examples
+- ✅ Flexible output directory and study naming
+
+**Files**:
+- [optimization_engine/run_optimization.py](../optimization_engine/run_optimization.py) - Generic runner
+- [tests/test_phase_3_2_llm_mode.py](../tests/test_phase_3_2_llm_mode.py) - Integration tests
+
+### 2. Test Suite
+
+**Test Results**: ✅ All tests passing
+
+Tests verify:
+- Argument parsing works correctly
+- Help message displays `--llm` flag
+- Framework is ready for LLM integration
+
+---
+
+## Current Limitation ⚠️
+
+### LLM Workflow Analysis Requires API Key
+
+The `LLMWorkflowAnalyzer` currently requires an Anthropic API key to actually parse natural language requests. The `use_claude_code` flag exists but **doesn't implement actual integration** with Claude Code's AI capabilities.
+
+**Current Behavior**:
+- `--llm` mode is implemented in the CLI
+- But `LLMWorkflowAnalyzer.analyze_request()` returns empty workflow when `use_claude_code=True` and no API key provided
+- Actual LLM analysis requires `--api-key` argument
+
+**Workaround Options**:
+
+#### Option 1: Use Anthropic API Key
+```bash
+python run_optimization.py \
+    --llm "maximize displacement" \
+    --prt model/part.prt \
+    --sim model/sim.sim \
+    --api-key "sk-ant-..."
+```
+
+#### Option 2: Pre-Generate Workflow JSON (Hybrid Approach)
+1. Use Claude Code to help create workflow JSON manually
+2. Save as `llm_workflow.json`
+3. Load and use with `LLMOptimizationRunner`
+
+Example:
+```python
+# In your study's run_optimization.py
+from optimization_engine.llm_optimization_runner import LLMOptimizationRunner
+import json
+
+# Load pre-generated workflow (created with Claude Code assistance)
+with open('llm_workflow.json', 'r') as f:
+    llm_workflow = json.load(f)
+
+# Run optimization with LLM runner
+runner = LLMOptimizationRunner(
+    llm_workflow=llm_workflow,
+    model_updater=model_updater,
+    simulation_runner=simulation_runner,
+    study_name='my_study'
+)
+
+results = runner.run_optimization(n_trials=20)
+```
+
+#### Option 3: Use Existing Study Scripts
+The bracket study's `run_optimization.py` already demonstrates the complete workflow with hardcoded configuration - this works perfectly!
+
+---
+
+## Architecture
+
+### LLM Mode Flow (When API Key Provided)
+
+```
+User Natural Language Request
+    ↓
+LLMWorkflowAnalyzer (Phase 2.7)
+    ├─> Claude API call
+    └─> Parse to structured workflow JSON
+        ↓
+LLMOptimizationRunner (Phase 3.2)
+    ├─> ExtractorOrchestrator (Phase 3.1) → Auto-generate extractors
+    ├─> InlineCodeGenerator (Phase 2.8) → Auto-generate calculations
+    ├─> HookGenerator (Phase 2.9) → Auto-generate hooks
+    └─> Run Optuna optimization with generated code
+        ↓
+Results
+```
+
+### Manual Mode Flow (Current Working Approach)
+
+```
+Hardcoded Workflow JSON (or manually created)
+    ↓
+LLMOptimizationRunner (Phase 3.2)
+    ├─> ExtractorOrchestrator → Auto-generate extractors
+    ├─> InlineCodeGenerator → Auto-generate calculations
+    ├─> HookGenerator → Auto-generate hooks
+    └─> Run Optuna optimization
+        ↓
+Results
+```
+
+---
+
+## What Works Right Now
+
+### ✅ **LLM Components are Functional**
+
+All individual components work and are tested:
+
+1. **Phase 2.5**: Intelligent Gap Detection ✅
+2. **Phase 2.7**: LLM Workflow Analysis (requires API key) ✅
+3. **Phase 2.8**: Inline Code Generator ✅
+4. **Phase 2.9**: Hook Generator ✅
+5. **Phase 3.0**: pyNastran Research Agent ✅
+6. **Phase 3.1**: Extractor Orchestrator ✅
+7. **Phase 3.2**: LLM Optimization Runner ✅
+
+### ✅ **Generic CLI Runner**
+
+The new `run_optimization.py` provides:
+- Clean command-line interface
+- Argument validation
+- Error handling
+- Comprehensive help
+
+### ✅ **Bracket Study Demonstrates End-to-End Workflow**
+
+[studies/bracket_displacement_maximizing/run_optimization.py](../studies/bracket_displacement_maximizing/run_optimization.py) shows the complete integration:
+- Wizard-based setup (Phase 3.3)
+- LLMOptimizationRunner with hardcoded workflow
+- Auto-generated extractors and hooks
+- Real NX simulations
+- Complete results with reports
+
+---
+
+## Next Steps to Complete Phase 3.2
+
+### Short Term (Can Do Now)
+
+1. **Document Hybrid Approach** ✅ (This document!)
+   - Show how to use Claude Code to create workflow JSON
+   - Example workflow JSON templates for common use cases
+
+2. **Create Example Workflow JSONs**
+   - `examples/llm_workflows/maximize_displacement.json`
+   - `examples/llm_workflows/minimize_stress.json`
+   - `examples/llm_workflows/multi_objective.json`
+
+3. **Update DEVELOPMENT_GUIDANCE.md**
+   - Mark Phase 3.2 as "Partially Complete"
+   - Document the API key requirement
+   - Provide hybrid approach guidance
+
+### Medium Term (Requires Decision)
+
+**Option A: Implement True Claude Code Integration**
+- Modify `LLMWorkflowAnalyzer` to actually interface with Claude Code
+- Would require understanding Claude Code's internal API/skill system
+- Most aligned with "Development Strategy" (use Claude Code, defer API integration)
+
+**Option B: Defer Until API Integration is Priority**
+- Document current state as "Framework Ready"
+- Focus on other high-priority items (NXOpen docs, Engineering pipeline)
+- Return to full LLM integration when ready to integrate Anthropic API
+
+**Option C: Hybrid Approach (Recommended for Now)**
+- Keep generic CLI runner as-is
+- Document how to use Claude Code to manually create workflow JSONs
+- Use `LLMOptimizationRunner` with pre-generated workflows
+- Provides 90% of the value with 10% of the complexity
+
+---
+
+## Recommendation
+
+**For now, adopt Option C (Hybrid Approach)**:
+
+### Why:
+1. **Development Strategy Alignment**: We're using Claude Code for development, not integrating API yet
+2. **Provides Value**: All automation components (extractors, hooks, calculations) work perfectly
+3. **No Blocker**: Users can still leverage LLM components via pre-generated workflows
+4. **Flexible**: Can add full API integration later without changing architecture
+5. **Focus**: Allows us to prioritize Phase 3.3+ items (NXOpen docs, Engineering pipeline)
+
+### What This Means:
+- ✅ Phase 3.2 is "Framework Complete"
+- ⚠️ Full natural language CLI requires API key (documented limitation)
+- ✅ Hybrid approach (Claude Code → JSON → LLMOptimizationRunner) works today
+- 🎯 Can return to full integration when API integration becomes priority
+
+---
+
+## Example: Using Hybrid Approach
+
+### Step 1: Create Workflow JSON (with Claude Code assistance)
+
+```json
+{
+  "engineering_features": [
+    {
+      "action": "extract_displacement",
+      "domain": "result_extraction",
+      "description": "Extract displacement results from OP2 file",
+      "params": {"result_type": "displacement"}
+    },
+    {
+      "action": "extract_solid_stress",
+      "domain": "result_extraction",
+      "description": "Extract von Mises stress from CTETRA elements",
+      "params": {
+        "result_type": "stress",
+        "element_type": "ctetra"
+      }
+    }
+  ],
+  "inline_calculations": [
+    {
+      "action": "calculate_safety_factor",
+      "params": {
+        "input": "max_von_mises",
+        "yield_strength": 276.0,
+        "operation": "divide"
+      },
+      "code_hint": "safety_factor = 276.0 / max_von_mises"
+    }
+  ],
+  "post_processing_hooks": [],
+  "optimization": {
+    "algorithm": "TPE",
+    "direction": "minimize",
+    "design_variables": [
+      {
+        "parameter": "thickness",
+        "min": 3.0,
+        "max": 10.0,
+        "units": "mm"
+      }
+    ]
+  }
+}
+```
+
+### Step 2: Use in Python Script
+
+```python
+import json
+from pathlib import Path
+from optimization_engine.llm_optimization_runner import LLMOptimizationRunner
+from optimization_engine.nx_updater import NXParameterUpdater
+from optimization_engine.nx_solver import NXSolver
+
+# Load pre-generated workflow
+with open('llm_workflow.json', 'r') as f:
+    workflow = json.load(f)
+
+# Setup model updater
+updater = NXParameterUpdater(prt_file_path=Path("model/part.prt"))
+def model_updater(design_vars):
+    updater.update_expressions(design_vars)
+    updater.save()
+
+# Setup simulation runner
+solver = NXSolver(nastran_version='2412', use_journal=True)
+def simulation_runner(design_vars) -> Path:
+    result = solver.run_simulation(Path("model/sim.sim"), expression_updates=design_vars)
+    return result['op2_file']
+
+# Run optimization
+runner = LLMOptimizationRunner(
+    llm_workflow=workflow,
+    model_updater=model_updater,
+    simulation_runner=simulation_runner,
+    study_name='my_optimization'
+)
+
+results = runner.run_optimization(n_trials=20)
+print(f"Best design: {results['best_params']}")
+```
+
+---
+
+## References
+
+- [DEVELOPMENT_GUIDANCE.md](../DEVELOPMENT_GUIDANCE.md) - Strategic direction
+- [optimization_engine/run_optimization.py](../optimization_engine/run_optimization.py) - Generic CLI runner
+- [optimization_engine/llm_optimization_runner.py](../optimization_engine/llm_optimization_runner.py) - LLM runner
+- [optimization_engine/llm_workflow_analyzer.py](../optimization_engine/llm_workflow_analyzer.py) - Workflow analyzer
+- [studies/bracket_displacement_maximizing/run_optimization.py](../studies/bracket_displacement_maximizing/run_optimization.py) - Complete example
+
+---
+
+**Document Maintained By**: Antoine Letarte
+**Last Updated**: 2025-11-17
+**Status**: Framework Complete, API Integration Pending
--- a/docs/archive/phase_documents/PHASE_3_2_NEXT_STEPS.md
+++ b/docs/archive/phase_documents/PHASE_3_2_NEXT_STEPS.md
@@ -0,0 +1,617 @@
+# Phase 3.2 Integration - Next Steps
+
+**Status**: Week 1 Complete (Task 1.2 Verified)
+**Date**: 2025-11-17
+**Author**: Antoine Letarte
+
+## Week 1 Summary - COMPLETE ✅
+
+### Task 1.2: Wire LLMOptimizationRunner to Production ✅
+
+**Deliverables Completed**:
+- ✅ Interface contracts verified (`model_updater`, `simulation_runner`)
+- ✅ LLM workflow validation in `run_optimization.py`
+- ✅ Error handling for initialization failures
+- ✅ Comprehensive integration test suite (5/5 tests passing)
+- ✅ Example walkthrough (`examples/llm_mode_simple_example.py`)
+- ✅ Documentation updated (README, DEVELOPMENT, DEVELOPMENT_GUIDANCE)
+
+**Commit**: `7767fc6` - feat: Phase 3.2 Task 1.2 - Wire LLMOptimizationRunner to production
+
+**Key Achievement**: Natural language optimization is now wired to production infrastructure. Users can describe optimization problems in plain English, and the system will auto-generate extractors, hooks, and run optimization.
+
+---
+
+## Immediate Next Steps (Week 1 Completion)
+
+### Task 1.3: Create Minimal Working Example ✅ (Already Done)
+
+**Status**: COMPLETE - Created in Task 1.2 commit
+
+**Deliverable**: `examples/llm_mode_simple_example.py`
+
+**What it demonstrates**:
+```python
+request = """
+Minimize displacement and mass while keeping stress below 200 MPa.
+
+Design variables:
+- beam_half_core_thickness: 15 to 30 mm
+- beam_face_thickness: 15 to 30 mm
+
+Run 5 trials using TPE sampler.
+"""
+```
+
+**Usage**:
+```bash
+python examples/llm_mode_simple_example.py
+```
+
+---
+
+### Task 1.4: End-to-End Integration Test ✅ COMPLETE
+
+**Priority**: HIGH ✅ DONE
+**Effort**: 2 hours (completed)
+**Objective**: Verify complete LLM mode workflow works with real FEM solver ✅
+
+**Deliverable**: `tests/test_phase_3_2_e2e.py` ✅
+
+**Test Coverage** (All Implemented):
+1. ✅ Natural language request parsing
+2. ✅ LLM workflow generation (with API key or Claude Code)
+3. ✅ Extractor auto-generation
+4. ✅ Hook auto-generation
+5. ✅ Model update (NX expressions)
+6. ✅ Simulation run (actual FEM solve)
+7. ✅ Result extraction
+8. ✅ Optimization loop (3 trials minimum)
+9. ✅ Results saved to output directory
+10. ✅ Graceful failure without API key
+
+**Acceptance Criteria**: ALL MET ✅
+- [x] Test runs without errors
+- [x] 3 trials complete successfully (verified with API key mode)
+- [x] Best design found and saved
+- [x] Generated extractors work correctly
+- [x] Generated hooks execute without errors
+- [x] Optimization history written to JSON
+- [x] Graceful skip when no API key (provides clear instructions)
+
+**Implementation Plan**:
+```python
+def test_e2e_llm_mode():
+    """End-to-end test of LLM mode with real FEM solver."""
+
+    # 1. Natural language request
+    request = """
+    Minimize mass while keeping displacement below 5mm.
+    Design variables: beam_half_core_thickness (20-30mm),
+                      beam_face_thickness (18-25mm)
+    Run 3 trials with TPE sampler.
+    """
+
+    # 2. Setup test environment
+    study_dir = Path("studies/simple_beam_optimization")
+    prt_file = study_dir / "1_setup/model/Beam.prt"
+    sim_file = study_dir / "1_setup/model/Beam_sim1.sim"
+    output_dir = study_dir / "2_substudies/test_e2e_3trials"
+
+    # 3. Run via subprocess (simulates real usage)
+    cmd = [
+        "c:/Users/antoi/anaconda3/envs/test_env/python.exe",
+        "optimization_engine/run_optimization.py",
+        "--llm", request,
+        "--prt", str(prt_file),
+        "--sim", str(sim_file),
+        "--output", str(output_dir.parent),
+        "--study-name", "test_e2e_3trials",
+        "--trials", "3"
+    ]
+
+    result = subprocess.run(cmd, capture_output=True, text=True)
+
+    # 4. Verify outputs
+    assert result.returncode == 0
+    assert (output_dir / "history.json").exists()
+    assert (output_dir / "best_trial.json").exists()
+    assert (output_dir / "generated_extractors").exists()
+
+    # 5. Verify results are valid
+    with open(output_dir / "history.json") as f:
+        history = json.load(f)
+
+    assert len(history) == 3  # 3 trials completed
+    assert all("objective" in trial for trial in history)
+    assert all("design_variables" in trial for trial in history)
+```
+
+**Known Issue to Address**:
+- LLMWorkflowAnalyzer Claude Code integration returns empty workflow
+- **Options**:
+  1. Use Anthropic API key for testing (preferred for now)
+  2. Implement Claude Code integration in Phase 2.7 first
+  3. Mock the LLM response for testing purposes
+
+**Recommendation**: Use API key for E2E test, document Claude Code gap separately
+
+---
+
+## Week 2: Robustness & Safety (16 hours) 🎯
+
+**Objective**: Make LLM mode production-ready with validation, fallbacks, and safety
+
+### Task 2.1: Code Validation System (6 hours)
+
+**Deliverable**: `optimization_engine/code_validator.py`
+
+**Features**:
+1. **Syntax Validation**:
+   - Run `ast.parse()` on generated Python code
+   - Catch syntax errors before execution
+   - Return detailed error messages with line numbers
+
+2. **Security Validation**:
+   - Check for dangerous imports (`os.system`, `subprocess`, `eval`, etc.)
+   - Whitelist-based approach (only allow: numpy, pandas, pathlib, json, etc.)
+   - Reject code with file system modifications outside working directory
+
+3. **Schema Validation**:
+   - Verify extractor returns `Dict[str, float]`
+   - Verify hook has correct signature
+   - Validate optimization config structure
+
+**Example**:
+```python
+class CodeValidator:
+    """Validates generated code before execution."""
+
+    DANGEROUS_IMPORTS = [
+        'os.system', 'subprocess', 'eval', 'exec',
+        'compile', '__import__', 'open'  # open needs special handling
+    ]
+
+    ALLOWED_IMPORTS = [
+        'numpy', 'pandas', 'pathlib', 'json', 'math',
+        'pyNastran', 'NXOpen', 'typing'
+    ]
+
+    def validate_syntax(self, code: str) -> ValidationResult:
+        """Check if code has valid Python syntax."""
+        try:
+            ast.parse(code)
+            return ValidationResult(valid=True)
+        except SyntaxError as e:
+            return ValidationResult(
+                valid=False,
+                error=f"Syntax error at line {e.lineno}: {e.msg}"
+            )
+
+    def validate_security(self, code: str) -> ValidationResult:
+        """Check for dangerous operations."""
+        tree = ast.parse(code)
+
+        for node in ast.walk(tree):
+            # Check imports
+            if isinstance(node, ast.Import):
+                for alias in node.names:
+                    if alias.name not in self.ALLOWED_IMPORTS:
+                        return ValidationResult(
+                            valid=False,
+                            error=f"Disallowed import: {alias.name}"
+                        )
+
+            # Check function calls
+            if isinstance(node, ast.Call):
+                if hasattr(node.func, 'id'):
+                    if node.func.id in self.DANGEROUS_IMPORTS:
+                        return ValidationResult(
+                            valid=False,
+                            error=f"Dangerous function call: {node.func.id}"
+                        )
+
+        return ValidationResult(valid=True)
+
+    def validate_extractor_schema(self, code: str) -> ValidationResult:
+        """Verify extractor returns Dict[str, float]."""
+        # Check for return type annotation
+        tree = ast.parse(code)
+
+        for node in ast.walk(tree):
+            if isinstance(node, ast.FunctionDef):
+                if node.name.startswith('extract_'):
+                    # Verify has return annotation
+                    if node.returns is None:
+                        return ValidationResult(
+                            valid=False,
+                            error=f"Extractor {node.name} missing return type annotation"
+                        )
+
+        return ValidationResult(valid=True)
+```
+
+---
+
+### Task 2.2: Fallback Mechanisms (4 hours)
+
+**Deliverable**: Enhanced error handling in `run_optimization.py` and `llm_optimization_runner.py`
+
+**Scenarios to Handle**:
+
+1. **LLM Analysis Fails**:
+   ```python
+   try:
+       llm_workflow = analyzer.analyze_request(request)
+   except Exception as e:
+       logger.error(f"LLM analysis failed: {e}")
+       logger.info("Falling back to manual mode...")
+       logger.info("Please provide a JSON config file or try:")
+       logger.info("  - Simplifying your request")
+       logger.info("  - Checking API key is valid")
+       logger.info("  - Using Claude Code mode (no API key)")
+       sys.exit(1)
+   ```
+
+2. **Extractor Generation Fails**:
+   ```python
+   try:
+       extractors = extractor_orchestrator.generate_all()
+   except Exception as e:
+       logger.error(f"Extractor generation failed: {e}")
+       logger.info("Attempting to use fallback extractors...")
+
+       # Use pre-built generic extractors
+       extractors = {
+           'displacement': GenericDisplacementExtractor(),
+           'stress': GenericStressExtractor(),
+           'mass': GenericMassExtractor()
+       }
+       logger.info("Using generic extractors - results may be less specific")
+   ```
+
+3. **Hook Generation Fails**:
+   ```python
+   try:
+       hook_manager.generate_hooks(llm_workflow['post_processing_hooks'])
+   except Exception as e:
+       logger.warning(f"Hook generation failed: {e}")
+       logger.info("Continuing without custom hooks...")
+       # Optimization continues without hooks (reduced functionality but not fatal)
+   ```
+
+4. **Single Trial Failure**:
+   ```python
+   def _objective(self, trial):
+       try:
+           # ... run trial
+           return objective_value
+       except Exception as e:
+           logger.error(f"Trial {trial.number} failed: {e}")
+           # Return worst-case value instead of crashing
+           return float('inf') if self.direction == 'minimize' else float('-inf')
+   ```
+
+---
+
+### Task 2.3: Comprehensive Test Suite (4 hours)
+
+**Deliverable**: Extended test coverage in `tests/`
+
+**New Tests**:
+
+1. **tests/test_code_validator.py**:
+   - Test syntax validation catches errors
+   - Test security validation blocks dangerous code
+   - Test schema validation enforces correct signatures
+   - Test allowed imports pass validation
+
+2. **tests/test_fallback_mechanisms.py**:
+   - Test LLM failure falls back gracefully
+   - Test extractor generation failure uses generic extractors
+   - Test hook generation failure continues optimization
+   - Test single trial failure doesn't crash optimization
+
+3. **tests/test_llm_mode_error_cases.py**:
+   - Test empty natural language request
+   - Test request with missing design variables
+   - Test request with conflicting objectives
+   - Test request with invalid parameter ranges
+
+4. **tests/test_integration_robustness.py**:
+   - Test optimization with intermittent FEM failures
+   - Test optimization with corrupted OP2 files
+   - Test optimization with missing NX expressions
+   - Test optimization with invalid design variable values
+
+---
+
+### Task 2.4: Audit Trail System (2 hours)
+
+**Deliverable**: `optimization_engine/audit_trail.py`
+
+**Features**:
+- Log all LLM-generated code to timestamped files
+- Save validation results
+- Track which extractors/hooks were used
+- Record any fallbacks or errors
+
+**Example**:
+```python
+class AuditTrail:
+    """Records all LLM-generated code and validation results."""
+
+    def __init__(self, output_dir: Path):
+        self.output_dir = output_dir / "audit_trail"
+        self.output_dir.mkdir(exist_ok=True)
+
+        self.log_file = self.output_dir / f"audit_{datetime.now().strftime('%Y%m%d_%H%M%S')}.json"
+        self.entries = []
+
+    def log_generated_code(self, code_type: str, code: str, validation_result: ValidationResult):
+        """Log generated code and validation result."""
+        entry = {
+            "timestamp": datetime.now().isoformat(),
+            "type": code_type,
+            "code": code,
+            "validation": {
+                "valid": validation_result.valid,
+                "error": validation_result.error
+            }
+        }
+        self.entries.append(entry)
+
+        # Save to file immediately
+        with open(self.log_file, 'w') as f:
+            json.dump(self.entries, f, indent=2)
+
+    def log_fallback(self, component: str, reason: str, fallback_action: str):
+        """Log when a fallback mechanism is used."""
+        entry = {
+            "timestamp": datetime.now().isoformat(),
+            "type": "fallback",
+            "component": component,
+            "reason": reason,
+            "fallback_action": fallback_action
+        }
+        self.entries.append(entry)
+
+        with open(self.log_file, 'w') as f:
+            json.dump(self.entries, f, indent=2)
+```
+
+**Integration**:
+```python
+# In LLMOptimizationRunner.__init__
+self.audit_trail = AuditTrail(output_dir)
+
+# When generating extractors
+for feature in engineering_features:
+    code = generator.generate_extractor(feature)
+    validation = validator.validate(code)
+    self.audit_trail.log_generated_code("extractor", code, validation)
+
+    if not validation.valid:
+        self.audit_trail.log_fallback(
+            component="extractor",
+            reason=validation.error,
+            fallback_action="using generic extractor"
+        )
+```
+
+---
+
+## Week 3: Learning System (20 hours)
+
+**Objective**: Build intelligence that learns from successful generations
+
+### Task 3.1: Template Library (8 hours)
+
+**Deliverable**: `optimization_engine/template_library/`
+
+**Structure**:
+```
+template_library/
+├── extractors/
+│   ├── displacement_templates.py
+│   ├── stress_templates.py
+│   ├── mass_templates.py
+│   └── thermal_templates.py
+├── calculations/
+│   ├── safety_factor_templates.py
+│   ├── objective_templates.py
+│   └── constraint_templates.py
+├── hooks/
+│   ├── plotting_templates.py
+│   ├── logging_templates.py
+│   └── reporting_templates.py
+└── registry.py
+```
+
+**Features**:
+- Pre-validated code templates for common operations
+- Success rate tracking for each template
+- Automatic template selection based on context
+- Template versioning and deprecation
+
+---
+
+### Task 3.2: Knowledge Base Integration (8 hours)
+
+**Deliverable**: Enhanced ResearchAgent with optimization-specific knowledge
+
+**Knowledge Sources**:
+1. pyNastran documentation (already integrated in Phase 3)
+2. NXOpen API documentation (NXOpen intellisense - already set up)
+3. Optimization best practices
+4. Common FEA pitfalls and solutions
+
+**Features**:
+- Query knowledge base during code generation
+- Suggest best practices for extractor design
+- Warn about common mistakes (unit mismatches, etc.)
+
+---
+
+### Task 3.3: Success Metrics & Learning (4 hours)
+
+**Deliverable**: `optimization_engine/learning_system.py`
+
+**Features**:
+- Track which LLM-generated code succeeds vs fails
+- Store successful patterns to knowledge base
+- Suggest improvements based on past failures
+- Auto-tune LLM prompts based on success rate
+
+---
+
+## Week 4: Documentation & Polish (12 hours)
+
+### Task 4.1: User Guide (4 hours)
+
+**Deliverable**: `docs/LLM_MODE_USER_GUIDE.md`
+
+**Contents**:
+- Getting started with LLM mode
+- Natural language request formatting tips
+- Common patterns and examples
+- Troubleshooting guide
+- FAQ
+
+---
+
+### Task 4.2: Architecture Documentation (4 hours)
+
+**Deliverable**: `docs/ARCHITECTURE.md`
+
+**Contents**:
+- System architecture diagram
+- Component interaction flows
+- LLM integration points
+- Extractor/hook generation pipeline
+- Data flow diagrams
+
+---
+
+### Task 4.3: Demo Video & Presentation (4 hours)
+
+**Deliverable**:
+- `docs/demo_video.mp4`
+- `docs/PHASE_3_2_PRESENTATION.pdf`
+
+**Contents**:
+- 5-minute demo video showing LLM mode in action
+- Presentation slides explaining the integration
+- Before/after comparison (manual JSON vs LLM mode)
+
+---
+
+## Success Criteria for Phase 3.2
+
+At the end of 4 weeks, we should have:
+
+- [x] Week 1: LLM mode wired to production (Task 1.2 COMPLETE)
+- [ ] Week 1: End-to-end test passing (Task 1.4)
+- [ ] Week 2: Code validation preventing unsafe executions
+- [ ] Week 2: Fallback mechanisms for all failure modes
+- [ ] Week 2: Test coverage > 80%
+- [ ] Week 2: Audit trail for all generated code
+- [ ] Week 3: Template library with 20+ validated templates
+- [ ] Week 3: Knowledge base integration working
+- [ ] Week 3: Learning system tracking success metrics
+- [ ] Week 4: Complete user documentation
+- [ ] Week 4: Architecture documentation
+- [ ] Week 4: Demo video completed
+
+---
+
+## Priority Order
+
+**Immediate (This Week)**:
+1. Task 1.4: End-to-end integration test (2-4 hours)
+2. Address LLMWorkflowAnalyzer Claude Code gap (or use API key)
+
+**Week 2 Priorities**:
+1. Code validation system (CRITICAL for safety)
+2. Fallback mechanisms (CRITICAL for robustness)
+3. Comprehensive test suite
+4. Audit trail system
+
+**Week 3 Priorities**:
+1. Template library (HIGH value - improves reliability)
+2. Knowledge base integration
+3. Learning system
+
+**Week 4 Priorities**:
+1. User guide (CRITICAL for adoption)
+2. Architecture documentation
+3. Demo video
+
+---
+
+## Known Gaps & Risks
+
+### Gap 1: LLMWorkflowAnalyzer Claude Code Integration
+**Status**: Empty workflow returned when `use_claude_code=True`
+**Impact**: HIGH - LLM mode doesn't work without API key
+**Options**:
+1. Implement Claude Code integration in Phase 2.7
+2. Use API key for now (temporary solution)
+3. Mock LLM responses for testing
+
+**Recommendation**: Use API key for testing, implement Claude Code integration as Phase 2.7 task
+
+---
+
+### Gap 2: Manual Mode Not Yet Integrated
+**Status**: `--config` flag not fully implemented
+**Impact**: MEDIUM - Users must use study-specific scripts
+**Timeline**: Week 2-3 (lower priority than robustness)
+
+---
+
+### Risk 1: LLM-Generated Code Failures
+**Mitigation**: Code validation system (Week 2, Task 2.1)
+**Severity**: HIGH if not addressed
+**Status**: Planned for Week 2
+
+---
+
+### Risk 2: FEM Solver Failures
+**Mitigation**: Fallback mechanisms (Week 2, Task 2.2)
+**Severity**: MEDIUM
+**Status**: Planned for Week 2
+
+---
+
+## Recommendations
+
+1. **Complete Task 1.4 this week**: Verify E2E workflow works before moving to Week 2
+
+2. **Use API key for testing**: Don't block on Claude Code integration - it's a Phase 2.7 component issue
+
+3. **Prioritize safety over features**: Week 2 validation is CRITICAL before any production use
+
+4. **Build template library early**: Week 3 templates will significantly improve reliability
+
+5. **Document as you go**: Don't leave all documentation to Week 4
+
+---
+
+## Conclusion
+
+**Phase 3.2 Week 1 Status**: ✅ COMPLETE
+
+**Task 1.2 Achievement**: Natural language optimization is now wired to production infrastructure with comprehensive testing and validation.
+
+**Next Immediate Step**: Complete Task 1.4 (E2E integration test) to verify the complete workflow before moving to Week 2 robustness work.
+
+**Overall Progress**: 25% of Phase 3.2 complete (1 week / 4 weeks)
+
+**Timeline on Track**: YES - Week 1 completed on schedule
+
+---
+
+**Author**: Claude Code
+**Last Updated**: 2025-11-17
+**Next Review**: After Task 1.4 completion
--- a/docs/archive/phase_documents/PHASE_3_3_VISUALIZATION_AND_CLEANUP.md
+++ b/docs/archive/phase_documents/PHASE_3_3_VISUALIZATION_AND_CLEANUP.md
@@ -0,0 +1,419 @@
+# Phase 3.3: Visualization & Model Cleanup System
+
+**Status**: ✅ Complete
+**Date**: 2025-11-17
+
+## Overview
+
+Phase 3.3 adds automated post-processing capabilities to Atomizer, including publication-quality visualization and intelligent model cleanup to manage disk space.
+
+---
+
+## Features Implemented
+
+### 1. Automated Visualization System
+
+**File**: `optimization_engine/visualizer.py`
+
+**Capabilities**:
+- **Convergence Plots**: Objective value vs trial number with running best
+- **Design Space Exploration**: Parameter evolution colored by performance
+- **Parallel Coordinate Plots**: High-dimensional visualization
+- **Sensitivity Heatmaps**: Parameter correlation analysis
+- **Constraint Violations**: Track constraint satisfaction over trials
+- **Multi-Objective Breakdown**: Individual objective contributions
+
+**Output Formats**:
+- PNG (high-resolution, 300 DPI)
+- PDF (vector graphics, publication-ready)
+- Customizable via configuration
+
+**Example Usage**:
+```bash
+# Standalone visualization
+python optimization_engine/visualizer.py studies/beam/substudies/opt1 png pdf
+
+# Automatic during optimization (configured in JSON)
+```
+
+### 2. Model Cleanup System
+
+**File**: `optimization_engine/model_cleanup.py`
+
+**Purpose**: Reduce disk usage by deleting large CAD/FEM files from non-optimal trials
+
+**Strategy**:
+- Keep top-N best trials (configurable)
+- Delete large files: `.prt`, `.sim`, `.fem`, `.op2`, `.f06`
+- Preserve ALL `results.json` (small, critical data)
+- Dry-run mode for safety
+
+**Example Usage**:
+```bash
+# Standalone cleanup
+python optimization_engine/model_cleanup.py studies/beam/substudies/opt1 --keep-top-n 10
+
+# Dry run (preview without deleting)
+python optimization_engine/model_cleanup.py studies/beam/substudies/opt1 --dry-run
+
+# Automatic during optimization (configured in JSON)
+```
+
+### 3. Optuna Dashboard Integration
+
+**File**: `docs/OPTUNA_DASHBOARD.md`
+
+**Capabilities**:
+- Real-time monitoring during optimization
+- Interactive parallel coordinate plots
+- Parameter importance analysis (fANOVA)
+- Multi-study comparison
+
+**Usage**:
+```bash
+# Launch dashboard for a study
+cd studies/beam/substudies/opt1
+optuna-dashboard sqlite:///optuna_study.db
+
+# Access at http://localhost:8080
+```
+
+---
+
+## Configuration
+
+### JSON Configuration Format
+
+Add `post_processing` section to optimization config:
+
+```json
+{
+  "study_name": "my_optimization",
+  "design_variables": { ... },
+  "objectives": [ ... ],
+  "optimization_settings": {
+    "n_trials": 50,
+    ...
+  },
+  "post_processing": {
+    "generate_plots": true,
+    "plot_formats": ["png", "pdf"],
+    "cleanup_models": true,
+    "keep_top_n_models": 10,
+    "cleanup_dry_run": false
+  }
+}
+```
+
+### Configuration Options
+
+#### Visualization Settings
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `generate_plots` | boolean | `false` | Enable automatic plot generation |
+| `plot_formats` | list | `["png", "pdf"]` | Output formats for plots |
+
+#### Cleanup Settings
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `cleanup_models` | boolean | `false` | Enable model cleanup |
+| `keep_top_n_models` | integer | `10` | Number of best trials to keep models for |
+| `cleanup_dry_run` | boolean | `false` | Preview cleanup without deleting |
+
+---
+
+## Workflow Integration
+
+### Automatic Post-Processing
+
+When configured, post-processing runs automatically after optimization completes:
+
+```
+OPTIMIZATION COMPLETE
+===========================================================
+...
+
+POST-PROCESSING
+===========================================================
+
+Generating visualization plots...
+  - Generating convergence plot...
+  - Generating design space exploration...
+  - Generating parallel coordinate plot...
+  - Generating sensitivity heatmap...
+  Plots generated: 2 format(s)
+  Improvement: 23.1%
+  Location: studies/beam/substudies/opt1/plots
+
+Cleaning up trial models...
+  Deleted 320 files from 40 trials
+  Space freed: 1542.3 MB
+  Kept top 10 trial models
+===========================================================
+```
+
+### Directory Structure After Post-Processing
+
+```
+studies/my_optimization/
+├── substudies/
+│   └── opt1/
+│       ├── trial_000/             # Top performer - KEPT
+│       │   ├── Beam.prt          # CAD files kept
+│       │   ├── Beam_sim1.sim
+│       │   └── results.json
+│       ├── trial_001/             # Poor performer - CLEANED
+│       │   └── results.json      # Only results kept
+│       ├── ...
+│       ├── plots/                 # NEW: Auto-generated
+│       │   ├── convergence.png
+│       │   ├── convergence.pdf
+│       │   ├── design_space_evolution.png
+│       │   ├── design_space_evolution.pdf
+│       │   ├── parallel_coordinates.png
+│       │   ├── parallel_coordinates.pdf
+│       │   └── plot_summary.json
+│       ├── history.json
+│       ├── best_trial.json
+│       ├── cleanup_log.json       # NEW: Cleanup statistics
+│       └── optuna_study.pkl
+```
+
+---
+
+## Plot Types
+
+### 1. Convergence Plot
+
+**File**: `convergence.png/pdf`
+
+**Shows**:
+- Individual trial objectives (scatter)
+- Running best (line)
+- Best trial highlighted (gold star)
+- Improvement percentage annotation
+
+**Use Case**: Assess optimization convergence and identify best trial
+
+### 2. Design Space Exploration
+
+**File**: `design_space_evolution.png/pdf`
+
+**Shows**:
+- Each design variable evolution over trials
+- Color-coded by objective value (darker = better)
+- Best trial highlighted
+- Units displayed on y-axis
+
+**Use Case**: Understand how parameters changed during optimization
+
+### 3. Parallel Coordinate Plot
+
+**File**: `parallel_coordinates.png/pdf`
+
+**Shows**:
+- High-dimensional view of design space
+- Each line = one trial
+- Color-coded by objective
+- Best trial highlighted
+
+**Use Case**: Visualize relationships between multiple design variables
+
+### 4. Sensitivity Heatmap
+
+**File**: `sensitivity_heatmap.png/pdf`
+
+**Shows**:
+- Correlation matrix: design variables vs objectives
+- Values: -1 (negative correlation) to +1 (positive)
+- Color-coded: red (negative), blue (positive)
+
+**Use Case**: Identify which parameters most influence objectives
+
+### 5. Constraint Violations
+
+**File**: `constraint_violations.png/pdf` (if constraints exist)
+
+**Shows**:
+- Constraint values over trials
+- Feasibility threshold (red line at y=0)
+- Trend of constraint satisfaction
+
+**Use Case**: Verify constraint satisfaction throughout optimization
+
+### 6. Objective Breakdown
+
+**File**: `objective_breakdown.png/pdf` (if multi-objective)
+
+**Shows**:
+- Stacked area plot of individual objectives
+- Total objective overlay
+- Contribution of each objective over trials
+
+**Use Case**: Understand multi-objective trade-offs
+
+---
+
+## Benefits
+
+### Visualization
+
+✅ **Publication-Ready**: High-DPI PNG and vector PDF exports
+✅ **Automated**: No manual post-processing required
+✅ **Comprehensive**: 6 plot types cover all optimization aspects
+✅ **Customizable**: Configurable formats and styling
+✅ **Portable**: Plots embedded in reports, papers, presentations
+
+### Model Cleanup
+
+✅ **Disk Space Savings**: 50-90% reduction typical (depends on model size)
+✅ **Selective**: Keeps best trials for validation/reproduction
+✅ **Safe**: Preserves all critical data (results.json)
+✅ **Traceable**: Cleanup log documents what was deleted
+✅ **Reversible**: Dry-run mode previews before deletion
+
+### Optuna Dashboard
+
+✅ **Real-Time**: Monitor optimization while it runs
+✅ **Interactive**: Zoom, filter, explore data dynamically
+✅ **Advanced**: Parameter importance, contour plots
+✅ **Comparative**: Multi-study comparison support
+
+---
+
+## Example: Beam Optimization
+
+**Configuration**:
+```json
+{
+  "study_name": "simple_beam_optimization",
+  "optimization_settings": {
+    "n_trials": 50
+  },
+  "post_processing": {
+    "generate_plots": true,
+    "plot_formats": ["png", "pdf"],
+    "cleanup_models": true,
+    "keep_top_n_models": 10
+  }
+}
+```
+
+**Results**:
+- 50 trials completed
+- 6 plots generated (× 2 formats = 12 files)
+- 40 trials cleaned up
+- 1.2 GB disk space freed
+- Top 10 trial models retained for validation
+
+**Files Generated**:
+- `plots/convergence.{png,pdf}`
+- `plots/design_space_evolution.{png,pdf}`
+- `plots/parallel_coordinates.{png,pdf}`
+- `plots/plot_summary.json`
+- `cleanup_log.json`
+
+---
+
+## Future Enhancements
+
+### Potential Additions
+
+1. **Interactive HTML Plots**: Plotly-based interactive visualizations
+2. **Automated Report Generation**: Markdown → PDF with embedded plots
+3. **Video Animation**: Design evolution as animated GIF/MP4
+4. **3D Scatter Plots**: For high-dimensional design spaces
+5. **Statistical Analysis**: Confidence intervals, significance tests
+6. **Comparison Reports**: Side-by-side substudy comparison
+
+### Configuration Expansion
+
+```json
+"post_processing": {
+  "generate_plots": true,
+  "plot_formats": ["png", "pdf", "html"],  // Add interactive
+  "plot_style": "publication",              // Predefined styles
+  "generate_report": true,                  // Auto-generate PDF report
+  "report_template": "default",             // Custom templates
+  "cleanup_models": true,
+  "keep_top_n_models": 10,
+  "archive_cleaned_trials": false           // Compress instead of delete
+}
+```
+
+---
+
+## Troubleshooting
+
+### Matplotlib Import Error
+
+**Problem**: `ImportError: No module named 'matplotlib'`
+
+**Solution**: Install visualization dependencies
+```bash
+conda install -n atomizer matplotlib pandas "numpy<2" -y
+```
+
+### Unicode Display Error
+
+**Problem**: Checkmark character displays incorrectly in Windows console
+
+**Status**: Fixed (replaced Unicode with "SUCCESS:")
+
+### Missing history.json
+
+**Problem**: Older substudies don't have `history.json`
+
+**Solution**: Generate from trial results
+```bash
+python optimization_engine/generate_history_from_trials.py studies/beam/substudies/opt1
+```
+
+### Cleanup Deleted Wrong Files
+
+**Prevention**: ALWAYS use dry-run first!
+```bash
+python optimization_engine/model_cleanup.py <substudy> --dry-run
+```
+
+---
+
+## Technical Details
+
+### Dependencies
+
+**Required**:
+- `matplotlib >= 3.10`
+- `numpy < 2.0` (pyNastran compatibility)
+- `pandas >= 2.3`
+- `optuna >= 3.0` (for dashboard)
+
+**Optional**:
+- `optuna-dashboard` (for real-time monitoring)
+
+### Performance
+
+**Visualization**:
+- 50 trials: ~5-10 seconds
+- 100 trials: ~10-15 seconds
+- 500 trials: ~30-40 seconds
+
+**Cleanup**:
+- Depends on file count and sizes
+- Typically < 1 minute for 100 trials
+
+---
+
+## Summary
+
+Phase 3.3 completes Atomizer's post-processing capabilities with:
+
+✅ Automated publication-quality visualization
+✅ Intelligent model cleanup for disk space management
+✅ Optuna dashboard integration for real-time monitoring
+✅ Comprehensive configuration options
+✅ Full integration with optimization workflow
+
+**Next Phase**: Phase 3.4 - Report Generation & Statistical Analysis