# Atomizer Development Guidance

> **Living Document**: Strategic direction, current status, and development priorities for Atomizer
>
> **Last Updated**: 2025-11-17 (Evening - Phase 3.2 Integration Planning Complete)
>
> **Status**: Alpha Development - 80-90% Complete, Integration Phase
>
> 🎯 **NOW IN PROGRESS**: Phase 3.2 Integration Sprint - [Integration Plan](docs/PHASE_3_2_INTEGRATION_PLAN.md)

---

## Table of Contents

1. [Executive Summary](#executive-summary)
2. [Comprehensive Status Report](#comprehensive-status-report)
3. [Development Strategy](#development-strategy)
4. [Priority Initiatives](#priority-initiatives)
5. [Foundation for Future](#foundation-for-future)
6. [Technical Roadmap](#technical-roadmap)
7. [Development Standards](#development-standards)
8. [Key Principles](#key-principles)

---

## Executive Summary

### Current State

**Status**: Alpha Development - Significant Progress Made ✅
**Readiness**: Foundation solid, LLM features partially implemented, ready for the integration phase
**Direction**: ✅ Aligned with the roadmap vision - moving toward an LLM-native optimization platform

### Quick Stats

- **110+ Python files** (~10,000+ lines in the core engine)
- **23 test files** covering major components
- **Phase 1 (Plugin System)**: ✅ 100% Complete & Production Ready
- **Phases 2.5-3.1 (LLM Intelligence)**: ✅ 85% Complete - Components Built, Integration Needed
- **Phase 3.3 (Visualization & Cleanup)**: ✅ 100% Complete & Production Ready
- **Study Organization v2.0**: ✅ 100% Complete with Templates
- **Working Example Study**: simple_beam_optimization (4 substudies, 56 trials, full documentation)

### Key Insight

**You've built more than the documentation suggests!** The roadmap says "Phase 2: 0% Complete", but you've actually built sophisticated LLM components through Phase 3.1 (85% complete). The challenge now is **integration**, not development.
---

## Comprehensive Status Report

### 🎯 What's Actually Working (Production Ready)

#### ✅ Core Optimization Engine

**Status**: FULLY FUNCTIONAL

The foundation is rock solid:

- **Optuna Integration**: TPE, CMA-ES, and GP samplers operational
- **NX Solver Integration**: Journal-based parameter updates and simulation execution
- **OP2 Result Extraction**: Stress and displacement extractors tested on real files
- **Study Management**: Complete folder structure with resume capability
- **Precision Control**: 4-decimal rounding for engineering units

**Evidence**:

- `studies/simple_beam_optimization/` - Complete 4D optimization study
  - 4 substudies (01-04) with numbered organization
  - 56 total trials across all substudies
  - 4 design variables (beam thickness, face thickness, hole diameter, hole count)
  - 3 objectives (displacement, stress, mass) + 1 constraint
  - Full documentation with substudy READMEs
- `studies/bracket_displacement_maximizing/` - Earlier study (20 trials)

#### ✅ Plugin System (Phase 1)

**Status**: PRODUCTION READY

This is exemplary architecture:

- **Hook Manager**: Priority-based execution at 7 lifecycle points
  - `pre_solve`, `post_solve`, `post_extraction`, `post_calculation`, etc.
- **Auto-discovery**: Plugins load automatically from directories
- **Context Passing**: Full trial data available to hooks
- **Logging Infrastructure**:
  - Per-trial detailed logs (`trial_logs/`)
  - High-level optimization log (`optimization.log`)
  - Clean, parseable format

**Evidence**: Hook system tested in `test_hooks_with_bracket.py` - all passing ✅

#### ✅ Substudy System

**Status**: WORKING & ELEGANT

NX-like hierarchical studies:

- **Shared models**, independent configurations
- **Continuation support** (fine-tuning builds on coarse exploration)
- **Live incremental history** tracking
- **Clean separation** of concerns

**File**: `studies/simple_beam_optimization/run_optimization.py`

#### ✅ Phase 3.3: Visualization & Model Cleanup

**Status**: PRODUCTION READY

Automated post-processing system for optimization results:

- **6 Plot Types**:
  - Convergence (objective vs trial with running best)
  - Design space evolution (parameter changes over time)
  - Parallel coordinates (high-dimensional visualization)
  - Sensitivity heatmap (parameter correlation analysis)
  - Constraint violation tracking
  - Multi-objective breakdown
- **Output Formats**: PNG (300 DPI) + PDF (vector graphics)
- **Model Cleanup**: Selective deletion of large CAD/FEM files
  - Keeps the top-N best trials (default: 10)
  - Preserves all results.json files
  - 50-90% disk space savings typical
- **Configuration**: JSON-based `post_processing` section

**Evidence**:

- Tested on a 50-trial beam optimization
- Generated 12 plot files (6 types × 2 formats)
- Plots saved to `studies/simple_beam_optimization/2_substudies/04_full_optimization_50trials/plots/`
- Documentation: `docs/PHASE_3_3_VISUALIZATION_AND_CLEANUP.md`

**Integration**: Runs automatically after optimization completes (if enabled in config)

#### ✅ Study Organization System v2.0

**Status**: PRODUCTION READY

Standardized directory structure for all optimization studies:

**Structure**:

```
studies/[study_name]/
├── 1_setup/       # Pre-optimization (model, benchmarking)
├── 2_substudies/  # Numbered runs (01_, 02_, 03_...)
└── 3_reports/     # Study-level analysis
```

**Features**:

- **Numbered Substudies**: Chronological ordering (01, 02, 03...)
- **Self-Documenting**: Each substudy has a README.md with purpose/results
- **Metadata Tracking**: study_metadata.json with a complete substudy registry
- **Templates**: Complete templates for new studies and substudies
- **Migration Tool**: reorganize_study.py for existing studies

**Evidence**:

- Applied to the simple_beam_optimization study
- 4 substudy READMEs documenting progression
- Complete template system in `templates/`
- How-to guide: `templates/HOW_TO_CREATE_A_STUDY.md`
- Documentation: `docs/STUDY_ORGANIZATION.md`

**File**: `studies/simple_beam_optimization/study_metadata.json`

### 🚧 What's Built But Not Yet Integrated

#### 🟡 Phase 2.5-3.1: LLM Intelligence Components

**Status**: 85% Complete - Individual Modules Working, Integration Pending

These are sophisticated, well-designed modules that are nearly ready but not yet connected to the main optimization loop:

##### ✅ Built & Tested:

1. **LLM Workflow Analyzer** (`llm_workflow_analyzer.py` - 14.5KB)
   - Uses the Claude API to analyze natural language optimization requests
   - Outputs structured JSON with engineering_features, inline_calculations, post_processing_hooks
   - Status: Fully functional standalone
2. **Extractor Orchestrator** (`extractor_orchestrator.py` - 12.7KB)
   - Processes LLM output and generates OP2 extractors
   - Dynamic loading and execution
   - Test: `test_phase_3_1_integration.py` - PASSING ✅
   - Evidence: Generated 3 working extractors in `result_extractors/generated/`
3. **pyNastran Research Agent** (`pynastran_research_agent.py` - 13.3KB)
   - Uses WebFetch to learn pyNastran API patterns
   - Knowledge base system stores learned patterns
   - 3 core extraction patterns: displacement, stress, force
   - Test: `test_complete_research_workflow.py` - PASSING ✅
4. **Hook Generator** (`hook_generator.py` - 27.8KB)
   - Auto-generates post-processing hook scripts
   - Weighted objectives, custom formulas, constraints, comparisons
   - Complete JSON I/O handling
   - Evidence: 4 working hooks in `plugins/post_calculation/`
5. **Inline Code Generator** (`inline_code_generator.py` - 17KB)
   - Generates Python code for simple math operations
   - Normalization, averaging, min/max calculations
6. **Codebase Analyzer & Capability Matcher** (Phase 2.5)
   - Scans existing code to detect gaps before requesting examples
   - 80-90% accuracy on complex optimization requests
   - Test: `test_phase_2_5_intelligent_gap_detection.py` - PASSING ✅

##### 🟡 What's Missing:

**Integration into the main runner!** The components exist but aren't connected to `runner.py`:

```python
# Current runner.py (Line 29-76):
class OptimizationRunner:
    def __init__(self, config_path, model_updater, simulation_runner, result_extractors):
        # Uses MANUAL config.json
        # Uses MANUAL result_extractors dict
        # No LLM workflow integration ❌
```

The new `LLMOptimizationRunner` exists (`llm_optimization_runner.py`) but:

- Is not used in any production study
- Is not tested end-to-end with real NX solves
- Is missing integration with `run_optimization.py` scripts

### 📊 Architecture Assessment

#### 🟢 Strengths

1. **Clean Separation of Concerns**
   - Each phase is a self-contained module
   - Dependencies flow in one direction (no circular imports)
   - Easy to test components independently
2. **Excellent Documentation**
   - Session summaries for each phase (`docs/SESSION_SUMMARY_PHASE_*.md`)
   - Comprehensive roadmap (`DEVELOPMENT_ROADMAP.md`)
   - Inline docstrings with examples
3. **Feature Registry** (`feature_registry.json` - 35KB)
   - Well-structured capability catalog
   - Each feature has: implementation, interface, usage examples, metadata
   - Perfect foundation for LLM navigation
4. **Knowledge Base System**
   - Research sessions stored with rationale
   - 9 markdown files documenting learned patterns
   - Enables a "learn once, use forever" approach
5. **Test Coverage**
   - 23 test files covering major components
   - Tests for individual phases (2.5, 2.9, 3.1)
   - Integration tests passing

#### 🟡 Areas for Improvement

1. **Integration Gap**
   - **Critical**: LLM components not connected to the main runner
   - Two parallel runners exist (`runner.py` vs `llm_optimization_runner.py`)
   - Production studies still use manual JSON config
2. **Documentation Drift**
   - `README.md` says "Phase 2" is the next priority
   - But Phases 2.5-3.1 are actually 85% complete
   - `DEVELOPMENT.md` shows "Phase 2: 0% Complete" - **INCORRECT**
3. **Test vs Production Gap**
   - LLM features tested in isolation
   - No end-to-end test: Natural language → LLM → Generated code → Real NX solve → Results
   - `test_bracket_llm_runner.py` exists but may not cover the full pipeline
4. **User Experience**
   - No simple way to run LLM-enhanced optimization yet
   - Users must manually edit JSON configs (old workflow)
   - Natural language interface exists but is not exposed
5. **Code Duplication Risk**
   - `runner.py` and `llm_optimization_runner.py` share a similar structure
   - Could consolidate into a single runner with an "LLM mode" flag

### 🎯 Phase 3.2 Integration Sprint - ACTIVE NOW

**Status**: 🟢 **IN PROGRESS** (2025-11-17)

**Goal**: Connect LLM components to the production workflow - make LLM mode accessible

**Detailed Plan**: See [docs/PHASE_3_2_INTEGRATION_PLAN.md](docs/PHASE_3_2_INTEGRATION_PLAN.md)

#### What's Being Built (4-Week Sprint)

**Week 1: Make LLM Mode Accessible** (16 hours)
- Create a unified entry point with an `--llm` flag
- Wire LLMOptimizationRunner to production
- Create a minimal working example
- End-to-end integration test

**Week 2: Robustness & Safety** (16 hours)
- Code validation pipeline (syntax, security, test execution)
- Graceful fallback mechanisms
- LLM audit trail for transparency
- Failure scenario testing

**Week 3: Learning System** (12 hours)
- Knowledge base implementation
- Template extraction and reuse
- ResearchAgent integration

**Week 4: Documentation & Discoverability** (8 hours)
- Update README with LLM capabilities
- Create docs/LLM_MODE.md
- Demo video/GIF
- Update all planning docs

#### Success Metrics

- [ ] Natural language request → Optimization results (single command)
- [ ] Generated code validated before execution (no crashes)
- [ ] Successful workflows saved and reused (learning system operational)
- [ ] Documentation shows LLM mode prominently (users discover it)

#### Impact

Once complete:

- **100 lines of JSON config** → **3 lines of natural language**
- Users describe goals → LLM generates code automatically
- The system learns from successful workflows → gets faster over time
- Complete audit trail for all LLM decisions

---

### 🎯 Gap Analysis: What's Missing for the Complete Vision

#### Critical Gaps (Being Addressed in Phase 3.2)

1. **Phase 3.2: Runner Integration** ✅ **IN PROGRESS**
   - Connect `LLMOptimizationRunner` to production workflows
   - Update `run_optimization.py` to support both manual and LLM modes
   - End-to-end test: Natural language → Actual NX solve → Results
   - **Timeline**: Week 1 of Phase 3.2 (2025-11-17 onwards)
2. **User-Facing Interface** ✅ **IN PROGRESS**
   - CLI command: `python run_optimization.py --llm --request "minimize stress"`
   - Dual-mode: LLM or traditional JSON config
   - **Timeline**: Week 1 of Phase 3.2
3. **Error Handling & Recovery** ✅ **IN PROGRESS**
   - Code validation before execution
   - Graceful fallback to manual mode
   - Complete audit trail
   - **Timeline**: Week 2 of Phase 3.2

#### Important Gaps (Should-Have)

1. **Dashboard Integration**
   - The dashboard exists (`dashboard/`) but may not show LLM-generated components
   - No visualization of generated code
   - No "LLM mode" toggle in the UI
2. **Performance Optimization**
   - LLM calls in the optimization loop could be slow
   - Caching for repeated patterns?
   - Batch code generation before optimization starts?
3. **Validation & Safety**
   - Generated code execution sandboxing?
   - Code review before running?
   - Unit tests for generated extractors?

#### Nice-to-Have Gaps

1. **Phase 4: Advanced Code Generation**
   - Complex FEA features (topology optimization, multi-physics)
   - NXOpen journal script generation
2. **Phase 5: Analysis & Decision Support**
   - Surrogate quality assessment (R², CV scores)
   - Sensitivity analysis
   - Engineering recommendations
3. **Phase 6: Automated Reporting**
   - HTML/PDF report generation
   - LLM-written narrative insights

### 🔍 Code Quality Assessment

**Excellent**:

- Modularity: Each component is self-contained (can be imported independently)
- Type Hints: Extensive use of `Dict[str, Any]`, `Path`, `Optional[...]`
- Error Messages: Clear, actionable error messages
- Logging: Comprehensive logging at appropriate levels

**Good**:

- Naming: Clear, descriptive function/variable names
- Documentation: Most functions have docstrings with examples
- Testing: Core components have tests

**Could Improve**:

- Consolidation: Some code duplication between runners
- Configuration Validation: Some JSON configs lack schema validation
- Async Operations: No async/await for potential concurrency
- Type Checking: Not using mypy or similar (no `mypy.ini` found)

---

## Development Strategy

### Current Approach: Claude Code + Manual Development

**Strategic Decision**: We are NOT integrating LLM API calls into Atomizer right now for development purposes.

#### Why This Makes Sense:

1. **Use What Works**: Claude Code (your subscription) already provides LLM assistance for development
2. **Avoid Premature Optimization**: Don't block on LLM API integration when you can develop without it
3. **Focus on Foundation**: Build the architecture first, add the LLM API later
4. **Keep Options Open**: The architecture supports an LLM API but doesn't require it for development

#### Future LLM Integration Strategy:

- **Near-term**: Possibly test simple use cases to validate that API integration works
- **Medium-term**: Integrate the LLM API for production user features (not the dev workflow)
- **Long-term**: Fully LLM-native optimization workflow for end users

**Bottom Line**: Continue using Claude Code for Atomizer development. LLM API integration is a "later" feature, not a blocker.
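The configuration-validation gap flagged in the code quality assessment ("Some JSON configs lack schema validation") could be closed with a small stdlib-only checker before a study starts. This is a minimal sketch under assumed field names (`study_name`, `n_trials`, `design_variables`, `objectives`) - Atomizer's actual config schema may differ.

```python
# Minimal config validator sketch (stdlib only).
# NOTE: the required keys below are illustrative assumptions,
# not Atomizer's actual config schema.
REQUIRED_KEYS = {
    "study_name": str,
    "n_trials": int,
    "design_variables": list,
    "objectives": list,
}

def validate_config(config: dict) -> list[str]:
    """Return a list of human-readable problems (empty list = valid)."""
    problems = []
    for key, expected_type in REQUIRED_KEYS.items():
        if key not in config:
            problems.append(f"missing required key: '{key}'")
        elif not isinstance(config[key], expected_type):
            problems.append(
                f"'{key}' should be {expected_type.__name__}, "
                f"got {type(config[key]).__name__}"
            )
    # Example of a domain check beyond type checking.
    if isinstance(config.get("n_trials"), int) and config["n_trials"] <= 0:
        problems.append("'n_trials' must be positive")
    return problems
```

Calling this at the top of the runner (and failing fast with the collected problem list) would give clearer errors than a `KeyError` deep inside a trial loop; a `jsonschema`-based version would be the natural next step.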
---

## Priority Initiatives

### ✅ Phase 3.2 Integration - Framework Complete (2025-11-17)

**Status**: ✅ 75% Complete - Framework implemented, API integration pending

**What's Done**:

- ✅ Generic `run_optimization.py` CLI with `--llm` flag support
- ✅ Integration with `LLMOptimizationRunner` for automated extractor/hook generation
- ✅ Argument parsing and validation
- ✅ Comprehensive help message and examples
- ✅ Test suite verifying framework functionality
- ✅ Documentation of the hybrid approach (Claude Code → JSON → LLMOptimizationRunner)

**Current Limitation**:

- ⚠️ `LLMWorkflowAnalyzer` requires an Anthropic API key for natural language parsing
- `--llm` mode works but needs an `--api-key` argument
- Without an API key, use the hybrid approach (pre-generated workflow JSON)

**Working Approaches**:

1. **With API Key**: `--llm "request" --api-key "sk-ant-..."`
2. **Hybrid (Recommended)**: Claude Code → workflow JSON → `LLMOptimizationRunner`
3. **Study-Specific**: Hardcoded workflow (see the bracket study example)

**Files**:

- [optimization_engine/run_optimization.py](../optimization_engine/run_optimization.py) - Generic CLI runner
- [docs/PHASE_3_2_INTEGRATION_STATUS.md](../docs/PHASE_3_2_INTEGRATION_STATUS.md) - Complete status report
- [tests/test_phase_3_2_llm_mode.py](../tests/test_phase_3_2_llm_mode.py) - Integration tests

**Next Steps** (when API integration becomes a priority):

- Implement true Claude Code integration in `LLMWorkflowAnalyzer`
- OR defer until Anthropic API integration is prioritized
- OR continue with the hybrid approach (90% of the value, 10% of the complexity)

**Recommendation**: ✅ Framework Complete - Proceed to other priorities (NXOpen docs, Engineering pipeline)

### 🔬 HIGH PRIORITY: NXOpen Documentation Access

**Goal**: Enable the LLM to reference NXOpen documentation when developing Atomizer features and generating NXOpen code

#### Options to Investigate:

1. **Authenticated Web Fetching**
   - Can we log in to the Siemens documentation portal?
   - Can the WebFetch tool use authenticated sessions?
   - Explore Siemens PLM API access
2. **Documentation Scraping**
   - Ethical/legal considerations
   - Caching locally for offline use
   - Structured extraction of API signatures
3. **Official API Access**
   - Does Siemens provide API documentation in a structured format?
   - JSON/XML schema files?
   - OpenAPI/Swagger specs?
4. **Community Resources**
   - TheScriptingEngineer blog content
   - NXOpen examples repository
   - Community-contributed documentation

#### Research Tasks:

- [ ] Investigate the Siemens documentation portal login mechanism
- [ ] Test WebFetch with authentication headers
- [ ] Explore Siemens PLM API documentation access
- [ ] Review legal/ethical considerations for documentation access
- [ ] Create a proof of concept: LLM + NXOpen docs → Generated code

**Success Criteria**: The LLM can fetch NXOpen documentation on demand when writing code

### 🔧 MEDIUM PRIORITY: NXOpen Intellisense Integration

**Goal**: Investigate whether NXOpen Python stub files can improve the Atomizer development workflow

#### Background:

From NX2406 onwards, Siemens provides stub files for Python intellisense:

- **Location**: `UGII_BASE_DIR\ugopen\pythonStubs`
- **Purpose**: Enable code completion, parameter info, and member lists for NXOpen objects
- **Integration**: Works with the VSCode Pylance extension

#### TheScriptingEngineer's Configuration:

```json
// settings.json
"python.analysis.typeCheckingMode": "basic",
"python.analysis.stubPath": "path_to_NX/ugopen/pythonStubs/Release2023/"
```

#### Questions to Answer:

1. **Development Workflow**:
   - Does this improve Atomizer development speed?
   - Can Claude Code leverage intellisense information?
   - Does it reduce NXOpen API lookup time?
2. **Code Generation**:
   - Can generated code use these stubs for validation?
   - Can we type-check generated NXOpen scripts before execution?
   - Does it catch errors earlier?
3. **Integration Points**:
   - Should this be part of the Atomizer setup process?
   - Can we distribute stubs with Atomizer?
   - Legal considerations for redistribution?

#### Implementation Plan:

- [ ] Locate stub files in the NX2412 installation
- [ ] Configure VSCode with the stub path
- [ ] Test intellisense with sample NXOpen code
- [ ] Evaluate the impact on the development workflow
- [ ] Document the setup process for contributors
- [ ] Decide: include in Atomizer, or document as an optional enhancement?

**Success Criteria**: Developers have working intellisense for NXOpen APIs

---

## Foundation for Future

### 🏗️ Engineering Feature Documentation Pipeline

**Purpose**: Establish a rigorous validation process for LLM-generated engineering features

**Important**: This is NOT for current software development. It is the foundation for future user-generated features.

#### Vision:

When a user asks Atomizer to create a new FEA feature (e.g., "calculate buckling safety factor"), the system should:

1. **Generate Code**: The LLM creates the implementation
2. **Generate Documentation**: Auto-create comprehensive markdown explaining the feature
3. **Human Review**: An engineer reviews and approves before integration
4. **Version Control**: Documentation and code are committed together

This ensures **scientific rigor** and **traceability** for production use.

#### Auto-Generated Documentation Format:

Each engineering feature should produce a markdown file with these sections:

```markdown
# Feature Name: [e.g., Buckling Safety Factor Calculator]

## Goal
What problem does this feature solve?
- Engineering context
- Use cases
- Expected outcomes

## Engineering Rationale
Why this approach?
- Design decisions
- Alternative approaches considered
- Why this method was chosen

## Mathematical Foundation

### Equations
\```
σ_buckling = (π² × E × I) / (K × L)²
Safety Factor = σ_buckling / σ_applied
\```

### Sources
- Euler Buckling Theory (1744)
- AISC Steel Construction Manual, 15th Edition, Chapter E
- Timoshenko & Gere, "Theory of Elastic Stability" (1961)

### Assumptions & Limitations
- Elastic buckling only
- Slender columns (L/r > 100)
- Perfect geometry assumed
- Material isotropy

## Implementation

### Code Structure
\```python
def calculate_buckling_safety_factor(
    youngs_modulus: float,
    moment_of_inertia: float,
    effective_length: float,
    applied_stress: float,
    k_factor: float = 1.0
) -> float:
    """
    Calculate buckling safety factor using the Euler formula.

    Parameters:
    ...
    """
\```

### Input Validation
- Positive values required
- Units: Pa, m⁴, m, Pa
- K-factor range: 0.5 to 2.0

### Error Handling
- Division-by-zero checks
- Physical validity checks
- Numerical stability considerations

## Testing & Validation

### Unit Tests
\```python
def test_euler_buckling_simple_case():
    # Steel column: E=200GPa, I=1e-6m⁴, L=3m, σ=100MPa
    sf = calculate_buckling_safety_factor(200e9, 1e-6, 3.0, 100e6)
    assert 2.0 < sf < 2.5  # Expected range
\```

### Validation Cases
1. **Benchmark Case 1**: AISC Manual Example 3.1 (page 45)
   - Input: [values]
   - Expected: [result]
   - Actual: [result]
   - Error: [%]
2. **Benchmark Case 2**: Timoshenko Example 2.3
   - ...

### Edge Cases Tested
- Very short columns (L/r < 50) - should warn/fail
- Very long columns - numerical stability
- Zero/negative inputs - should error gracefully

## Approval
- **Author**: [LLM Generated | Engineer Name]
- **Reviewer**: [Engineer Name]
- **Date Reviewed**: [YYYY-MM-DD]
- **Status**: [Pending | Approved | Rejected]
- **Notes**: [Reviewer comments]

## References
1. Euler, L. (1744). "Methodus inveniendi lineas curvas maximi minimive proprietate gaudentes"
2. American Institute of Steel Construction (2016). *Steel Construction Manual*, 15th Edition
3. Timoshenko, S.P. & Gere, J.M. (1961). *Theory of Elastic Stability*, 2nd Edition, McGraw-Hill

## Change Log
- **v1.0** (2025-11-17): Initial implementation
- **v1.1** (2025-11-20): Added K-factor validation per reviewer feedback
```

#### Implementation Requirements:

1. **Template System**:
   - A markdown template for each feature type
   - Auto-fill sections where possible
   - Highlight sections requiring human input
2. **Generation Pipeline**:
   ```
   User Request → LLM Analysis → Code Generation → Documentation Generation → Human Review → Approval → Integration
   ```
3. **Storage Structure**:
   ```
   atomizer/
   ├── engineering_features/
   │   ├── approved/
   │   │   ├── buckling_safety_factor/
   │   │   │   ├── implementation.py
   │   │   │   ├── tests.py
   │   │   │   └── FEATURE_DOCS.md
   │   │   └── ...
   │   └── pending_review/
   │       └── ...
   ```
4. **Validation Checklist**:
   - [ ] Equations match cited sources
   - [ ] Units are documented and validated
   - [ ] Edge cases are tested
   - [ ] Physical validity checks exist
   - [ ] Benchmarks pass within tolerance
   - [ ] Code matches documentation
   - [ ] References are credible and accessible

#### Who Uses This:

- **NOT YOU (current development)**: You're building Atomizer's software foundation - a different process
- **FUTURE USERS**: When users ask Atomizer to create custom FEA features
- **PRODUCTION DEPLOYMENTS**: Where engineering rigor and traceability matter

#### Development Now vs Foundation for Future:

| Aspect | Development Now | Foundation for Future |
|--------|-----------------|-----------------------|
| **Scope** | Building Atomizer software | User-generated FEA features |
| **Process** | Agile, iterate fast | Rigorous validation pipeline |
| **Documentation** | Code comments, dev docs | Full engineering documentation |
| **Review** | You approve | A human engineer approves |
| **Testing** | Unit tests, integration tests | Benchmark validation required |
| **Speed** | Move fast | Move carefully |

**Bottom Line**: Build the framework now, but don't use it yourself yet. It's for future credibility and production use.

### 🔐 Validation Pipeline Framework

**Goal**: Define the structure for rigorous validation of LLM-generated scientific tools

#### Pipeline Stages:

```mermaid
graph LR
    A[User Request] --> B[LLM Analysis]
    B --> C[Code Generation]
    C --> D[Documentation Generation]
    D --> E[Automated Tests]
    E --> F{Tests Pass?}
    F -->|No| G[Feedback Loop]
    G --> C
    F -->|Yes| H[Human Review Queue]
    H --> I{Approved?}
    I -->|No| J[Reject with Feedback]
    J --> G
    I -->|Yes| K[Integration]
    K --> L[Production Ready]
```

#### Components to Build:

1. **Request Parser**:
   - Natural language → Structured requirements
   - Identify required equations/standards
   - Classify the feature type (stress, displacement, buckling, etc.)
2. **Code Generator with Documentation**:
   - Generate implementation code
   - Generate test cases
   - Generate markdown documentation
   - Link code ↔ docs bidirectionally
3. **Automated Validation**:
   - Run unit tests
   - Check benchmark cases
   - Validate equation implementations
   - Verify unit consistency
4. **Review Queue System**:
   - Pending features awaiting approval
   - Review interface (CLI or web)
   - Approval/rejection workflow
   - Feedback mechanism to the LLM
5. **Integration Manager**:
   - Move approved features to production
   - Update the feature registry
   - Generate release notes
   - Version control integration

#### Current Status:

- [ ] Request parser - Not started
- [ ] Code generator with docs - Partially exists (hook_generator, extractor_orchestrator)
- [ ] Automated validation - Basic tests exist, need a benchmark framework
- [ ] Review queue - Not started
- [ ] Integration manager - Not started

**Priority**: Build the structure and interfaces now, implement the validation logic later.
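The review-queue states in the pipeline above (automated tests → human review → approved/rejected, with a feedback loop) can be sketched as a small state machine. This is an illustrative design, not an existing Atomizer API; all names here (`FeatureStatus`, `PendingFeature`) are assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

# Hypothetical sketch of the review-queue states described in the
# pipeline diagram; names are assumptions, not an existing Atomizer API.
class FeatureStatus(Enum):
    PENDING_TESTS = auto()    # generated, automated tests not yet passed
    PENDING_REVIEW = auto()   # tests passed, waiting for human review
    APPROVED = auto()
    REJECTED = auto()

@dataclass
class PendingFeature:
    name: str
    status: FeatureStatus = FeatureStatus.PENDING_TESTS
    feedback: list = field(default_factory=list)  # reviewer/LLM feedback loop

    def record_test_result(self, passed: bool) -> None:
        # Failed tests keep the feature in the automated-test loop.
        self.status = (FeatureStatus.PENDING_REVIEW if passed
                       else FeatureStatus.PENDING_TESTS)

    def review(self, approved: bool, note: str = "") -> None:
        # Enforce the pipeline order: no human review before tests pass.
        if self.status is not FeatureStatus.PENDING_REVIEW:
            raise RuntimeError("feature must pass automated tests before review")
        if note:
            self.feedback.append(note)
        self.status = (FeatureStatus.APPROVED if approved
                       else FeatureStatus.REJECTED)
```

Making the "tests before review" ordering an explicit invariant (rather than a convention) is the main point: the CLI or web review interface can then only ever see features that already passed automated validation.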
#### Example Workflow (Future):

```bash
# User creates a custom feature
$ atomizer create-feature --request "Calculate von Mises stress safety factor using Tresca criterion"

[LLM Analysis]
✓ Identified: Stress-based safety factor
✓ Standards: Tresca yield criterion
✓ Required inputs: stress_tensor, yield_strength
✓ Generating code...

[Code Generation]
✓ Created: engineering_features/pending_review/tresca_safety_factor/
  - implementation.py
  - tests.py
  - FEATURE_DOCS.md

[Automated Tests]
✓ Unit tests: 5/5 passed
✓ Benchmark cases: 3/3 passed
✓ Edge cases: 4/4 passed

[Status]
🟡 Pending human review
📋 Review with: atomizer review tresca_safety_factor

# Engineer reviews
$ atomizer review tresca_safety_factor

[Review Interface]
Feature: Tresca Safety Factor Calculator
Status: Automated tests PASSED

Documentation Preview: [shows FEATURE_DOCS.md]
Code Preview: [shows implementation.py]
Test Results: [shows test output]

Approve? [y/N]: y
Review Notes: Looks good, equations match the standard

[Approval]
✓ Feature approved
✓ Integrated into the feature registry
✓ Available for use

# Now users can use it
$ atomizer optimize --objective "maximize displacement" --constraint "tresca_sf > 2.0"
```

**This is the vision**. Build the foundation now for future implementation.
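The dual-mode entry point planned in Phase 3.2 (traditional JSON config vs `--llm` natural language, with a graceful fallback when no API key is available) might be wired roughly like this. Only the `--llm` and `--api-key` flags come from the plan; everything else here is an illustrative assumption.

```python
import argparse

# Hypothetical sketch of the Phase 3.2 dual-mode entry point.
# Only --llm/--api-key are taken from the plan; the rest is illustrative.
def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Run an Atomizer optimization study")
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--config",
                      help="path to a traditional JSON config")
    mode.add_argument("--llm", metavar="REQUEST",
                      help='natural language request, e.g. "minimize stress"')
    parser.add_argument("--api-key",
                        help="Anthropic API key (needed with --llm)")
    return parser

def select_mode(args) -> str:
    """Pick the execution mode, falling back gracefully without an API key."""
    if args.llm and not args.api_key:
        # Graceful fallback: suggest the hybrid workflow
        # (pre-generated workflow JSON) instead of crashing.
        return "hybrid"
    return "llm" if args.llm else "manual"
```

The mutually exclusive group makes the two modes impossible to mix at parse time, and the fallback keeps the "system works without LLM, better with LLM" principle intact.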
--- ## Technical Roadmap ### Revised Phase Timeline | Phase | Status | Description | Priority | |-------|--------|-------------|----------| | **Phase 1** | ✅ 100% | Plugin System | Complete | | **Phase 2.5** | ✅ 85% | Intelligent Gap Detection | Built, needs integration | | **Phase 2.6** | ✅ 85% | Workflow Decomposition | Built, needs integration | | **Phase 2.7** | ✅ 85% | Step Classification | Built, needs integration | | **Phase 2.9** | ✅ 85% | Hook Generation | Built, tested | | **Phase 3.0** | ✅ 85% | Research Agent | Built, tested | | **Phase 3.1** | ✅ 85% | Extractor Orchestration | Built, tested | | **Phase 3.2** | ✅ 75% | **Runner Integration** | Framework complete, API integration pending | | **Phase 3.3** | 🟡 50% | Optimization Setup Wizard | Partially built | | **Phase 3.4** | 🔵 0% | NXOpen Documentation Integration | Research phase | | **Phase 3.5** | 🔵 0% | Engineering Feature Pipeline | Foundation design | | **Phase 4+** | 🔵 0% | Advanced Features | Paused until 3.2 complete | ### Immediate Next Steps (Next 2 Weeks) #### Week 1: Integration & Testing **Monday-Tuesday**: Runner Integration - [ ] Add `--llm` flag to `run_optimization.py` - [ ] Connect `LLMOptimizationRunner` to production workflow - [ ] Implement fallback to manual mode - [ ] Test with bracket study **Wednesday-Thursday**: End-to-End Testing - [ ] Run complete LLM workflow: Request → Code → Solve → Results - [ ] Compare LLM-generated vs manual extractors - [ ] Performance profiling - [ ] Fix any integration bugs **Friday**: Polish & Documentation - [ ] Improve error messages - [ ] Add progress indicators - [ ] Create example script - [ ] Update inline documentation #### Week 2: NXOpen Documentation Research **Monday-Tuesday**: Investigation - [ ] Research Siemens documentation portal - [ ] Test authenticated WebFetch - [ ] Explore PLM API access - [ ] Review legal considerations **Wednesday**: Intellisense Setup - [ ] Locate NX2412 stub files - [ ] Configure VSCode with Pylance - [ ] 
Test intellisense with NXOpen code - [ ] Document setup process **Thursday-Friday**: Documentation Updates - [ ] Update `README.md` with LLM capabilities - [ ] Update `DEVELOPMENT.md` with accurate status - [ ] Create `NXOPEN_INTEGRATION.md` guide - [ ] Update this guidance document ### Medium-Term Goals (1-3 Months) 1. **Phase 3.4: NXOpen Documentation Integration** - Implement authenticated documentation access - Create NXOpen knowledge base - Test LLM code generation with docs 2. **Phase 3.5: Engineering Feature Pipeline** - Build documentation template system - Create review queue interface - Implement validation framework 3. **Dashboard Enhancement** - Add LLM mode toggle - Visualize generated code - Show approval workflow 4. **Performance Optimization** - LLM response caching - Batch code generation - Async operations ### Long-Term Vision (3-12 Months) 1. **Phase 4: Advanced Code Generation** - Complex FEA feature generation - Multi-physics setup automation - Topology optimization support 2. **Phase 5: Intelligent Analysis** - Surrogate quality assessment - Sensitivity analysis - Pareto front optimization 3. **Phase 6: Automated Reporting** - HTML/PDF generation - LLM-written insights - Executive summaries 4. **Production Hardening** - Security audits - Performance optimization - Enterprise features --- ## Development Standards ### Reference Hierarchy for Feature Implementation When implementing new features or capabilities in Atomizer, follow this **prioritized order** for consulting documentation and APIs: #### Tier 1: Primary References (ALWAYS CHECK FIRST) These are the authoritative sources that define the actual APIs and behaviors we work with: 1. 
**NXOpen Python Stub Files** (`C:\Program Files\Siemens\NX2412\UGOPEN\pythonStubs`) - **Why**: Exact method signatures, parameter types, return values for all NXOpen APIs - **When**: Writing NX journal scripts, updating part parameters, CAE operations - **Access**: VSCode Pylance intellisense (configured in `.vscode/settings.json`) - **Accuracy**: ~95% - this is the actual API definition - **Example**: For updating expressions, check `NXOpen/Part.pyi` → `ExpressionCollection` class → see `FindObject()` and `EditExpressionWithUnits()` methods 2. **Existing Atomizer Journals** (`optimization_engine/*.py`, `studies/*/`) - **Why**: Working, tested code that already solves similar problems - **When**: Before writing new NX integration code - **Files to Check**: - `optimization_engine/solve_simulation.py` - NX journal for running simulations - `optimization_engine/nx_updater.py` - Parameter update patterns - Any study-specific journals in `studies/*/` - **Pattern**: Search for similar functionality first, adapt existing code 3. **NXOpen API Patterns in Codebase** (`optimization_engine/`, `result_extractors/`) - **Why**: Established patterns for NX API usage in Atomizer - **When**: Implementing new NX operations - **What to Look For**: - Session management patterns - Part update workflows - Expression handling - Save/load patterns #### Tier 2: Specialized References (USE FOR SPECIFIC TASKS) These are secondary sources for specialized tasks - use **ONLY** for their specific domains: 1. 
   **pyNastran** (`knowledge_base/`, online docs)
   - **ONLY FOR**: OP2/F06 file post-processing (reading Nastran output files)
   - **NOT FOR**: NXOpen guidance, simulation setup, parameter updates
   - **Why Limited**: pyNastran is for reading results, not for NX API integration
   - **When to Use**: Creating result extractors, reading stress/displacement from OP2 files
   - **Example Valid Use**: `result_extractors/stress_extractor.py` - reads OP2 stress data
   - **Example INVALID Use**: ❌ Don't use pyNastran docs to learn how to update NX part expressions

2. **TheScriptingEngineer Blog** (https://thescriptingengineer.com)
   - **When**: Need working examples of NXOpen usage patterns
   - **Why**: High-quality, practical examples with explanations
   - **Best For**: Learning NXOpen workflow patterns, discovering API usage
   - **Limitation**: Blog may use different NX versions, verify against stub files

#### Tier 3: Last Resort References (USE SPARINGLY)

Use these only when Tier 1 and Tier 2 don't provide answers:

1. **Web Search / External Documentation**
   - **When**: Researching new concepts not covered by existing code
   - **Caution**: Verify information against stub files and existing code
   - **Best For**: Conceptual understanding, theory, background research

2. **Siemens Official Documentation Portal** (https://plm.sw.siemens.com)
   - **When**: Need detailed API documentation beyond stub files
   - **Status**: Authenticated access under investigation (see NXOpen Integration initiative)
   - **Future**: May become Tier 1 once integration is complete

### Reference Hierarchy Decision Tree

```
Need to implement NXOpen functionality?
│
├─> Check NXOpen stub files (.pyi) - Do exact methods exist?
│   ├─> YES: Use those method signatures ✅
│   └─> NO: Continue ↓
│
├─> Search existing Atomizer journals - Has this been done before?
│   ├─> YES: Adapt existing code ✅
│   └─> NO: Continue ↓
│
├─> Check TheScriptingEngineer - Are there examples?
│   ├─> YES: Adapt pattern, verify against stub files ✅
│   └─> NO: Continue ↓
│
└─> Web search for concept - Understand theory, then implement using stub files
    └─> ALWAYS verify final code against stub files before using ✅

Need to extract results from OP2/F06?
│
└─> Use pyNastran ✅
    └─> Check knowledge_base/ for existing patterns first

Need to understand FEA theory/equations?
│
└─> Web search / textbooks ✅
    └─> Document sources in feature documentation
```

### Why This Hierarchy Matters

**Before** (guessing/hallucinating):

```python
# ❌ Guessed API - might not exist or have wrong signature
work_part.Expressions.Edit("tip_thickness", "5.0")  # Wrong method name!
```

**After** (checking stub files):

```python
# ✅ Verified against NXOpen/Part.pyi stub file
expr = work_part.Expressions.FindObject("tip_thickness")  # Correct!
work_part.Expressions.EditExpressionWithUnits(expr, unit, "5.0")  # Correct!
```

**Improvement**: ~60% accuracy (guessing) → ~95% accuracy (stub files)

### NXOpen Integration Status

✅ **Completed** (2025-11-17):
- NXOpen stub files located and configured in VSCode
- Python 3.11 environment setup for NXOpen compatibility
- NXOpen module import enabled via `.pth` file
- Intellisense working for all NXOpen APIs
- Documentation: [NXOPEN_INTELLISENSE_SETUP.md](docs/NXOPEN_INTELLISENSE_SETUP.md)

🔜 **Future Work**:
- Authenticated Siemens documentation access (research phase)
- Documentation scraping for LLM knowledge base
- LLM-generated journal scripts with validation

---

## Key Principles

### Development Philosophy

1. **Ship Before Perfecting**: Integration is more valuable than new features
2. **User Value First**: Every feature must solve a real user problem
3. **Scientific Rigor**: Engineering features require validation and documentation
4. **Progressive Enhancement**: System works without LLM, better with LLM
5. **Learn and Improve**: Knowledge base grows with every use

### Decision Framework

When prioritizing work, ask:

1.
   **Does this unlock user value?** If yes, prioritize
2. **Does this require other work first?** If yes, do dependencies first
3. **Can we test this independently?** If no, split into testable pieces
4. **Will this create technical debt?** If yes, document and plan to address
5. **Does this align with long-term vision?** If no, reconsider

### Quality Standards

**For Software Development (Atomizer itself)**:
- Unit tests for core components
- Integration tests for workflows
- Code review by you (main developer)
- Documentation for contributors
- Move fast, iterate

**For Engineering Features (User-generated FEA)**:
- Comprehensive mathematical documentation
- Benchmark validation required
- Human engineer approval mandatory
- Traceability to standards/papers
- Move carefully, validate thoroughly

---

## Success Metrics

### Phase 3.2 Success Criteria

- [ ] Users can run: `python run_optimization.py --llm "maximize displacement"`
- [ ] End-to-end test passes: Natural language → NX solve → Results
- [ ] LLM-generated extractors produce same results as manual extractors
- [ ] Error handling works gracefully (fallback to manual mode)
- [ ] Documentation updated to reflect LLM capabilities
- [ ] Example workflow created and tested

### NXOpen Integration Success Criteria

- [ ] LLM can fetch NXOpen documentation on-demand
- [ ] Generated code references correct NXOpen API methods
- [ ] Intellisense working in VSCode for NXOpen development
- [ ] Setup documented for contributors
- [ ] Legal/ethical review completed

### Engineering Feature Pipeline Success Criteria

- [ ] Documentation template system implemented
- [ ] Example feature with full documentation created
- [ ] Review workflow interface built (CLI or web)
- [ ] Validation framework structure defined
- [ ] At least one feature goes through full pipeline (demo)

---

## Communication & Collaboration

### Stakeholders

- **Antoine Letarte**: Main developer, architect, decision maker
- **Claude Code**: Development assistant for
Atomizer software
- **Future Contributors**: Will follow established patterns and documentation
- **Future Users**: Will use LLM features for optimization workflows

### Documentation Strategy

1. **DEVELOPMENT_GUIDANCE.md** (this doc): Strategic direction, priorities, status
2. **README.md**: User-facing introduction, quick start, features
3. **DEVELOPMENT.md**: Detailed development status, todos, completed work
4. **DEVELOPMENT_ROADMAP.md**: Long-term vision, phases, future work
5. **Session summaries**: Detailed records of development sessions

Keep all documents synchronized and consistent.

### Review Cadence

- **Weekly**: Review progress against priorities
- **Monthly**: Update roadmap and adjust course if needed
- **Quarterly**: Major strategic reviews and planning

---

## Appendix: Quick Reference

### File Locations

**Core Engine**:
- `optimization_engine/runner.py` - Current production runner
- `optimization_engine/llm_optimization_runner.py` - LLM-enhanced runner (needs integration)
- `optimization_engine/nx_solver.py` - NX Simcenter integration
- `optimization_engine/nx_updater.py` - Parameter update system

**LLM Components**:
- `optimization_engine/llm_workflow_analyzer.py` - Natural language parser
- `optimization_engine/extractor_orchestrator.py` - Extractor generation
- `optimization_engine/pynastran_research_agent.py` - Documentation learning
- `optimization_engine/hook_generator.py` - Hook code generation

**Studies**:
- `studies/bracket_displacement_maximizing/` - Working example with substudies
- `studies/bracket_displacement_maximizing/run_substudy.py` - Substudy runner
- `studies/bracket_displacement_maximizing/SUBSTUDIES_README.md` - Substudy guide

**Tests**:
- `tests/test_phase_2_5_intelligent_gap_detection.py` - Gap detection tests
- `tests/test_phase_3_1_integration.py` - Extractor orchestration tests
- `tests/test_complete_research_workflow.py` - Research agent tests

**Documentation**:
- `docs/SESSION_SUMMARY_PHASE_*.md` - Development session
records
- `knowledge_base/` - Learned patterns and research sessions
- `feature_registry.json` - Complete capability catalog

### Common Commands

```bash
# Run optimization (current manual mode)
cd studies/bracket_displacement_maximizing
python run_optimization.py

# Run substudy
python run_substudy.py coarse_exploration

# Run tests
python -m pytest tests/test_phase_3_1_integration.py -v

# Start dashboard
python dashboard/start_dashboard.py
```

### Key Contacts & Resources

- **Siemens NX Documentation**: [PLM Portal](https://plm.sw.siemens.com)
- **TheScriptingEngineer**: [Blog](https://thescriptingengineer.com)
- **pyNastran Docs**: [GitHub](https://github.com/SteveDoyle2/pyNastran)
- **Optuna Docs**: [optuna.org](https://optuna.org)

---

**Document Maintained By**: Antoine Letarte (Main Developer)
**Last Review**: 2025-11-17
**Next Review**: 2025-11-24
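### Appendix Addendum: `--llm` Entry Point Sketch

The Phase 3.2 success criteria above target `python run_optimization.py --llm "maximize displacement"` with graceful fallback to manual mode when no goal is given. The sketch below is a minimal, hypothetical illustration of that entry point: the script name and `--llm` flag come from this document, while `parse_args`, `select_mode`, and the returned mode tuple are illustrative assumptions, not the actual implementation.

```python
# Hypothetical sketch of the planned `--llm` entry point for run_optimization.py.
# Only the flag name is from the guidance doc; everything else is illustrative.
import argparse


def parse_args(argv=None):
    """Parse CLI arguments; argv is injectable for testing."""
    parser = argparse.ArgumentParser(description="Run an Atomizer optimization study")
    parser.add_argument(
        "--llm",
        metavar="GOAL",
        default=None,
        help='Natural-language objective, e.g. --llm "maximize displacement"',
    )
    return parser.parse_args(argv)


def select_mode(args):
    """Return ("llm", goal) when a goal is given, else fall back to manual mode."""
    if args.llm:
        return ("llm", args.llm)
    return ("manual", None)


if __name__ == "__main__":
    mode, goal = select_mode(parse_args())
    print(f"mode={mode} goal={goal}")
```

In a real runner the manual branch would simply preserve today's behavior, keeping the flag strictly additive - consistent with the Progressive Enhancement principle (system works without LLM, better with LLM).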