Phase 3.2: LLM Integration Roadmap
Status: ✅ Week 1 complete, 🎯 Week 2 in progress
Timeline: 2-4 weeks
Last Updated: 2025-11-17
Current Progress: 25% (Week 1 of 4 complete)
Executive Summary
The Problem
We've built 85% of an LLM-native optimization system, but it's not integrated into production. The components exist but are disconnected islands:
- ✅ LLMWorkflowAnalyzer - Parses natural language → workflow (Phase 2.7)
- ✅ ExtractorOrchestrator - Auto-generates result extractors (Phase 3.1)
- ✅ InlineCodeGenerator - Creates custom calculations (Phase 2.8)
- ✅ HookGenerator - Generates post-processing hooks (Phase 2.9)
- ✅ LLMOptimizationRunner - Orchestrates LLM workflow (Phase 3.2)
- ⚠️ ResearchAgent - Learns from examples (Phase 2, partially complete)
Reality: Users still write 100+ lines of JSON config manually instead of using 3 lines of natural language.
The Solution
Phase 3.2 Integration Sprint: Wire LLM components into production workflow with a single --llm flag.
Strategic Roadmap
Week 1: Make LLM Mode Accessible (16 hours)
Goal: Users can invoke LLM mode with a single command
Tasks
1.1 Create Unified Entry Point (4 hours) ✅ COMPLETE
- Create `optimization_engine/run_optimization.py` as unified CLI
- Add `--llm` flag for natural language mode
- Add `--request` parameter for natural language input
- Preserve existing `--config` for traditional JSON mode
- Support both modes in parallel (no breaking changes)
Files:
- `optimization_engine/run_optimization.py` (NEW)
Success Metric:
```bash
python optimization_engine/run_optimization.py --llm \
    --request "Minimize stress for bracket. Vary wall thickness 3-8mm" \
    --prt studies/bracket/model/Bracket.prt \
    --sim studies/bracket/model/Bracket_sim1.sim
```
1.2 Wire LLMOptimizationRunner to Production (8 hours) ✅ COMPLETE
- Connect LLMWorkflowAnalyzer to entry point
- Bridge LLMOptimizationRunner → OptimizationRunner for execution
- Pass model updater and simulation runner callables
- Integrate with existing hook system
- Preserve all logging (detailed logs, optimization.log)
- Add workflow validation and error handling
- Create comprehensive integration test suite (5/5 tests passing)
Files Modified:
- `optimization_engine/run_optimization.py`
- `optimization_engine/llm_optimization_runner.py` (integration points)
Success Metric: LLM workflow generates extractors → runs FEA → logs results
1.3 Create Minimal Example (2 hours) ✅ COMPLETE
- Create `examples/llm_mode_simple_example.py`
- Show: natural language request → optimization results
- Compare: Traditional mode (100 lines JSON) vs LLM mode (3 lines)
- Include troubleshooting tips
Files Created:
- `examples/llm_mode_simple_example.py`
Success Metric: Example runs successfully, demonstrates value ✅
1.4 End-to-End Integration Test (2 hours) ✅ COMPLETE
- Test with simple_beam_optimization study
- Natural language → JSON workflow → NX solve → Results
- Verify all extractors generated correctly
- Check logs created properly
- Validate output matches manual mode
- Test graceful failure without API key
- Comprehensive verification of all output files
Files Created:
- `tests/test_phase_3_2_e2e.py`
Success Metric: LLM mode completes beam optimization without errors ✅
Week 2: Robustness & Safety (16 hours)
Goal: LLM mode handles failures gracefully, never crashes
Tasks
2.1 Code Validation Pipeline (6 hours)
- Create `optimization_engine/code_validator.py`
- Implement syntax validation (`ast.parse`)
- Implement security scanning (whitelist imports)
- Implement test execution on example OP2
- Implement output schema validation
- Add retry with LLM feedback on validation failure
Files Created:
- `optimization_engine/code_validator.py`
Integration Points:
- `optimization_engine/extractor_orchestrator.py` (validate before saving)
- `optimization_engine/inline_code_generator.py` (validate calculations)
Success Metric: Generated code passes validation, or LLM fixes based on feedback
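The retry-with-feedback step can be sketched as follows. `generate_validated_extractor`, the `generate` callable, and `MAX_ATTEMPTS` are illustrative names, not the planned API; only the validator's result shape (`valid`/`stage`/`error`) is taken from the `CodeValidator` design later in this document.

```python
# Sketch only: assumes a validator exposing
# validate_extractor(code, test_op2_file) -> {'valid', 'stage', 'error'}
# and a generate(prompt) callable wrapping the LLM.

MAX_ATTEMPTS = 3

def generate_validated_extractor(generate, validator, base_prompt, test_op2_file):
    """Generate extractor code, re-prompting the LLM with validation errors."""
    prompt = base_prompt
    for _ in range(MAX_ATTEMPTS):
        code = generate(prompt)
        result = validator.validate_extractor(code, test_op2_file)
        if result['valid']:
            return code
        # Feed the failing stage and error message back into the next prompt
        prompt = (f"{base_prompt}\n\nPrevious attempt failed at stage "
                  f"'{result['stage']}': {result['error']}\nPlease fix and retry.")
    raise RuntimeError(f"Validation failed after {MAX_ATTEMPTS} attempts")
```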
2.2 Graceful Fallback Mechanisms (4 hours)
- Wrap all LLM calls in try/except
- Provide clear error messages
- Offer fallback to manual mode
- Log failures to audit trail
- Never crash on LLM failure
Files Modified:
- `optimization_engine/run_optimization.py`
- `optimization_engine/llm_workflow_analyzer.py`
- `optimization_engine/llm_optimization_runner.py`
Success Metric: LLM failures degrade gracefully to manual mode
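A minimal sketch of the fallback pattern; `safe_llm_call` is a hypothetical helper, not existing code. Every LLM call is routed through it, failures are logged, and `None` tells the caller to degrade to manual mode instead of crashing.

```python
# Hypothetical wrapper: log the failure, return None, never raise to the user.
import logging

logger = logging.getLogger("llm_mode")

def safe_llm_call(fn, *args, fallback_hint="provide --config instead", **kwargs):
    """Run an LLM-backed callable; on any failure, log and return None."""
    try:
        return fn(*args, **kwargs)
    except Exception as exc:
        logger.error("LLM call failed: %s (%s)", exc, fallback_hint)
        return None
```

On the caller side, `workflow = safe_llm_call(analyzer.analyze_request, request)`; a `None` result triggers the drop back to traditional mode.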
2.3 LLM Audit Trail (3 hours)
- Create `optimization_engine/llm_audit.py`
- Log all LLM requests and responses
- Log generated code with prompts
- Log validation results
- Create `llm_audit.json` in study output directory
Files Created:
- `optimization_engine/llm_audit.py`
Integration Points:
- All LLM components log to audit trail
Success Metric: Full LLM decision trace available for debugging
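One possible shape for the audit logger, consistent with the `LLMAuditLogger(...).log_analysis(...)` usage in the entry-point architecture section of this document. The class body is an assumption about the eventual `optimization_engine/llm_audit.py`, not its final form.

```python
import json
import time
from pathlib import Path

class LLMAuditLogger:
    """Append-only audit trail for LLM requests, responses, and validation results."""

    def __init__(self, audit_file: Path):
        self.audit_file = Path(audit_file)
        self.entries = []

    def log_analysis(self, request, workflow, reasoning=""):
        self._append("analysis", {"request": request,
                                  "workflow": workflow,
                                  "reasoning": reasoning})

    def log_validation(self, code, result):
        self._append("validation", {"code": code, "result": result})

    def _append(self, kind, payload):
        # Rewrite the whole file each time so llm_audit.json is always valid JSON
        self.entries.append({"timestamp": time.time(), "kind": kind, **payload})
        self.audit_file.parent.mkdir(parents=True, exist_ok=True)
        self.audit_file.write_text(json.dumps(self.entries, indent=2, default=str))
```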
2.4 Failure Scenario Testing (3 hours)
- Test: Invalid natural language request
- Test: LLM unavailable (API down)
- Test: Generated code has syntax error
- Test: Generated code fails validation
- Test: OP2 file format unexpected
- Verify all fail gracefully
Files Created:
- `tests/test_llm_failure_modes.py`
Success Metric: All failure scenarios handled without crashes
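These scenarios could be covered by pytest-style tests along the following lines. `StubAnalyzer` and `analyze_or_fallback` are stand-ins for the real components, mirroring the try/except behavior described in task 2.2.

```python
# Sketch of tests/test_llm_failure_modes.py with stubbed components.

class StubAnalyzer:
    def __init__(self, error=None, workflow=None):
        self.error, self.workflow = error, workflow

    def analyze_request(self, request):
        if self.error:
            raise self.error
        return self.workflow

def analyze_or_fallback(analyzer, request):
    """Mirror of the entry point's try/except: return a workflow or None."""
    try:
        return analyzer.analyze_request(request)
    except Exception:
        return None

def test_api_down_degrades_gracefully():
    # LLM unavailable must not crash; caller gets None and falls back
    assert analyze_or_fallback(StubAnalyzer(error=ConnectionError("down")), "x") is None

def test_valid_request_passes_through():
    wf = {"design_variables": [], "objectives": []}
    assert analyze_or_fallback(StubAnalyzer(workflow=wf), "x") is wf
```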
Week 3: Learning System (12 hours)
Goal: System learns from successful workflows and reuses patterns
Tasks
3.1 Knowledge Base Implementation (4 hours)
- Create `optimization_engine/knowledge_base.py`
- Implement `save_session()`: save successful workflows
- Implement `search_templates()`: find similar past workflows
- Implement `get_template()`: retrieve reusable pattern
- Add confidence scoring (user-validated > LLM-generated)
Files Created:
- `optimization_engine/knowledge_base.py`
- `knowledge_base/sessions/` (directory for session logs)
- `knowledge_base/templates/` (directory for reusable patterns)
Success Metric: Successful workflows saved with metadata
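A file-backed sketch of the knowledge base under the assumptions above; the confidence scores, directory layout, and method names are illustrative, not the committed design.

```python
import json
from pathlib import Path

# Illustrative scores: user-validated patterns outrank LLM-generated ones
CONFIDENCE = {"user-validated": 0.9, "llm-generated": 0.5}

class KnowledgeBase:
    """Stores sessions and reusable templates as JSON files (sketch)."""

    def __init__(self, root: Path):
        self.sessions = Path(root) / "sessions"
        self.templates = Path(root) / "templates"
        for d in (self.sessions, self.templates):
            d.mkdir(parents=True, exist_ok=True)

    def save_session(self, name, workflow):
        (self.sessions / f"{name}.json").write_text(json.dumps(workflow))

    def save_template(self, feature_name, template, confidence="llm-generated"):
        record = {"template": template, "confidence": CONFIDENCE[confidence]}
        (self.templates / f"{feature_name}.json").write_text(json.dumps(record))

    def search_templates(self, feature_name):
        # Exact-name lookup only; real similarity search would go here
        path = self.templates / f"{feature_name}.json"
        return json.loads(path.read_text()) if path.exists() else None
```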
3.2 Template Extraction (4 hours)
- Analyze generated extractor code to identify patterns
- Extract reusable template structure
- Parameterize variable parts
- Save template with usage examples
- Implement template application to new requests
Files Modified:
- `optimization_engine/extractor_orchestrator.py`
Integration:
```python
# After successful generation:
template = extract_template(generated_code)
knowledge_base.save_template(feature_name, template, confidence='medium')

# On next request:
existing_template = knowledge_base.search_templates(feature_name)
if existing_template and existing_template.confidence > 0.7:
    code = existing_template.apply(new_params)  # Reuse!
```
Success Metric: Second identical request reuses template (faster)
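The parameterization step might look like this; the `<<name>>` placeholder convention and both helpers are illustrative sketches, not the planned implementation.

```python
def extract_template(generated_code, params):
    """Replace literal parameter values with <<name>> placeholders.

    params maps placeholder names to the literals that appear in the code,
    e.g. {"subcase_id": "1"}.
    """
    template = generated_code
    for name, literal in params.items():
        template = template.replace(literal, f"<<{name}>>")
    return template

def apply_template(template, new_params):
    """Instantiate a saved template for a new request."""
    code = template
    for name, value in new_params.items():
        code = code.replace(f"<<{name}>>", str(value))
    return code
```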
3.3 ResearchAgent Integration (4 hours)
- Complete ResearchAgent implementation
- Integrate into ExtractorOrchestrator error handling
- Add user example collection workflow
- Implement pattern learning from examples
- Save learned knowledge to knowledge base
Files Modified:
- `optimization_engine/research_agent.py` (complete implementation)
- `optimization_engine/llm_optimization_runner.py` (integrate ResearchAgent)
Workflow:
```text
Unknown feature requested
  → ResearchAgent asks user for example
  → Learns pattern from example
  → Generates feature using pattern
  → Saves to knowledge base
  → Retry with new feature
```
Success Metric: Unknown feature request triggers learning loop successfully
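The learning loop above, reduced to one function with placeholder hooks for the ResearchAgent and the generator (all names hypothetical):

```python
def handle_unknown_feature(feature, ask_user_for_example, learn_pattern,
                           generate_feature, knowledge_base):
    """One pass of the learning loop; every callable is a placeholder hook."""
    example = ask_user_for_example(feature)       # ResearchAgent asks user for example
    pattern = learn_pattern(feature, example)     # learns pattern from example
    code = generate_feature(feature, pattern)     # generates feature using pattern
    knowledge_base[feature] = pattern             # saves to knowledge base
    return code                                   # caller retries with the new feature
```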
Week 4: Documentation & Discoverability (8 hours)
Goal: Users discover and understand LLM capabilities
Tasks
4.1 Update README (2 hours)
- Add "🤖 LLM-Powered Mode" section to README.md
- Show example command with natural language
- Explain what LLM mode can do
- Link to detailed docs
Files Modified:
- `README.md`
Success Metric: README clearly shows LLM capabilities upfront
4.2 Create LLM Mode Documentation (3 hours)
- Create `docs/LLM_MODE.md`
- Explain how LLM mode works
- Provide usage examples
- Document when to use LLM vs manual mode
- Add troubleshooting guide
- Explain learning system
Files Created:
- `docs/LLM_MODE.md`
Contents:
- How it works (architecture diagram)
- Getting started (first LLM optimization)
- Natural language patterns that work well
- Troubleshooting common issues
- How learning system improves over time
Success Metric: Users understand LLM mode from docs
4.3 Create Demo Video/GIF (1 hour)
- Record terminal session: Natural language → Results
- Show before/after (100 lines JSON vs 3 lines)
- Create animated GIF for README
- Add to documentation
Files Created:
- `docs/demo/llm_mode_demo.gif`
Success Metric: Visual demo shows value proposition clearly
4.4 Update All Planning Docs (2 hours)
- Update DEVELOPMENT.md with Phase 3.2 completion status
- Update DEVELOPMENT_GUIDANCE.md progress (80-90% → 90-95%)
- Update DEVELOPMENT_ROADMAP.md Phase 3 status
- Mark Phase 3.2 as ✅ Complete
Files Modified:
- `DEVELOPMENT.md`
- `DEVELOPMENT_GUIDANCE.md`
- `DEVELOPMENT_ROADMAP.md`
Success Metric: All docs reflect completed Phase 3.2
Implementation Details
Entry Point Architecture
```python
# optimization_engine/run_optimization.py (NEW)
import argparse
from pathlib import Path


def main():
    parser = argparse.ArgumentParser(
        description="Atomizer Optimization Engine - Manual or LLM-powered mode"
    )
    # Mode selection
    mode_group = parser.add_mutually_exclusive_group(required=True)
    mode_group.add_argument('--llm', action='store_true',
                            help='Use LLM-assisted workflow (natural language mode)')
    mode_group.add_argument('--config', type=Path,
                            help='JSON config file (traditional mode)')
    # LLM mode parameters
    parser.add_argument('--request', type=str,
                        help='Natural language optimization request (required with --llm)')
    # Common parameters
    parser.add_argument('--prt', type=Path, required=True,
                        help='Path to .prt file')
    parser.add_argument('--sim', type=Path, required=True,
                        help='Path to .sim file')
    parser.add_argument('--output', type=Path,
                        help='Output directory (default: auto-generated)')
    parser.add_argument('--trials', type=int, default=50,
                        help='Number of optimization trials')
    args = parser.parse_args()

    if args.llm:
        run_llm_mode(args)
    else:
        run_traditional_mode(args)


def run_llm_mode(args):
    """LLM-powered natural language mode."""
    from optimization_engine.llm_workflow_analyzer import LLMWorkflowAnalyzer
    from optimization_engine.llm_optimization_runner import LLMOptimizationRunner
    from optimization_engine.nx_updater import NXParameterUpdater
    from optimization_engine.nx_solver import NXSolver
    from optimization_engine.llm_audit import LLMAuditLogger

    if not args.request:
        raise ValueError("--request required with --llm mode")

    # Resolve the output directory up front (auto-generate if not provided),
    # so the audit logger always has a valid path
    if args.output is None:
        args.output = Path("llm_optimization")

    print("🤖 LLM Mode: Analyzing request...")
    print(f"   Request: {args.request}")

    # Initialize audit logger
    audit_logger = LLMAuditLogger(args.output / "llm_audit.json")

    # Analyze natural language request
    analyzer = LLMWorkflowAnalyzer(use_claude_code=True)
    try:
        workflow = analyzer.analyze_request(args.request)
        audit_logger.log_analysis(args.request, workflow,
                                  reasoning=workflow.get('llm_reasoning', ''))
        print("✓ Workflow created:")
        print(f"   - Design variables: {len(workflow['design_variables'])}")
        print(f"   - Objectives: {len(workflow['objectives'])}")
        print(f"   - Extractors: {len(workflow['engineering_features'])}")
    except Exception as e:
        print(f"✗ LLM analysis failed: {e}")
        print("  Falling back to manual mode. Please provide --config instead.")
        return

    # Create model updater and simulation runner callables
    updater = NXParameterUpdater(args.prt)
    solver = NXSolver()

    def model_updater(design_vars):
        updater.update_expressions(design_vars)

    def simulation_runner():
        result = solver.run_simulation(args.sim)
        return result['op2_file']

    # Run LLM-powered optimization
    runner = LLMOptimizationRunner(
        llm_workflow=workflow,
        model_updater=model_updater,
        simulation_runner=simulation_runner,
        study_name=args.output.name,
        output_dir=args.output
    )
    study = runner.run(n_trials=args.trials)

    print("\n✓ Optimization complete!")
    print(f"   Best trial: {study.best_trial.number}")
    print(f"   Best value: {study.best_value:.6f}")
    print(f"   Results: {args.output}")


def run_traditional_mode(args):
    """Traditional JSON configuration mode."""
    from optimization_engine.runner import OptimizationRunner

    print("📄 Traditional Mode: Loading config...")
    # OptimizationRunner loads and validates the config file itself
    runner = OptimizationRunner(
        config_file=args.config,
        prt_file=args.prt,
        sim_file=args.sim,
        output_dir=args.output
    )
    study = runner.run(n_trials=args.trials)

    print("\n✓ Optimization complete!")
    print(f"   Results: {args.output}")


if __name__ == '__main__':
    main()
```
Validation Pipeline
```python
# optimization_engine/code_validator.py (NEW)
import ast
import json
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Any, Dict


class CodeValidator:
    """
    Validates LLM-generated code before execution.

    Checks:
    1. Syntax (ast.parse)
    2. Security (whitelist imports)
    3. Test execution on example data
    4. Output schema validation
    """

    ALLOWED_IMPORTS = {
        'pyNastran', 'numpy', 'pathlib', 'typing', 'dataclasses',
        'json', 'sys', 'os', 'math', 'collections'
    }

    FORBIDDEN_CALLS = {
        'eval', 'exec', 'compile', '__import__', 'open',
        'subprocess', 'os.system', 'os.popen'
    }

    def validate_extractor(self, code: str, test_op2_file: Path) -> Dict[str, Any]:
        """
        Validate generated extractor code.

        Args:
            code: Generated Python code
            test_op2_file: Example OP2 file for testing

        Returns:
            {
                'valid': bool,
                'error': str (if invalid),
                'test_result': dict (if valid)
            }
        """
        # 1. Syntax check
        try:
            tree = ast.parse(code)
        except SyntaxError as e:
            return {'valid': False, 'error': f'Syntax error: {e}', 'stage': 'syntax'}

        # 2. Security scan
        security_result = self._check_security(tree)
        if not security_result['safe']:
            return {'valid': False, 'error': security_result['error'],
                    'stage': 'security'}

        # 3. Test execution
        try:
            test_result = self._test_execution(code, test_op2_file)
        except Exception as e:
            return {'valid': False, 'error': f'Runtime error: {e}',
                    'stage': 'execution'}

        # 4. Output schema validation
        schema_result = self._validate_output_schema(test_result)
        if not schema_result['valid']:
            return {'valid': False, 'error': schema_result['error'],
                    'stage': 'schema'}

        return {'valid': True, 'test_result': test_result}

    def _check_security(self, tree: ast.AST) -> Dict[str, Any]:
        """Check for dangerous imports and function calls."""
        for node in ast.walk(tree):
            # Check imports (both `import x` and `from x import y`)
            if isinstance(node, ast.Import):
                for alias in node.names:
                    module = alias.name.split('.')[0]
                    if module not in self.ALLOWED_IMPORTS:
                        return {'safe': False,
                                'error': f'Disallowed import: {alias.name}'}
            if isinstance(node, ast.ImportFrom) and node.module:
                if node.module.split('.')[0] not in self.ALLOWED_IMPORTS:
                    return {'safe': False,
                            'error': f'Disallowed import: {node.module}'}
            # Check function calls, including dotted names like os.system
            if isinstance(node, ast.Call):
                name = None
                if isinstance(node.func, ast.Name):
                    name = node.func.id
                elif (isinstance(node.func, ast.Attribute)
                        and isinstance(node.func.value, ast.Name)):
                    name = f'{node.func.value.id}.{node.func.attr}'
                if name in self.FORBIDDEN_CALLS:
                    return {'safe': False,
                            'error': f'Forbidden function call: {name}'}
        return {'safe': True}

    def _test_execution(self, code: str, test_file: Path) -> Dict[str, Any]:
        """Execute code in an isolated subprocess with test data."""
        # Write code to temp file
        with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
            f.write(code)
            temp_code_file = Path(f.name)
        try:
            # Execute in subprocess (isolated from this interpreter)
            result = subprocess.run(
                [sys.executable, str(temp_code_file), str(test_file)],
                capture_output=True,
                text=True,
                timeout=30
            )
            if result.returncode != 0:
                raise RuntimeError(f"Execution failed: {result.stderr}")
            # Parse JSON output printed by the extractor
            return json.loads(result.stdout)
        finally:
            temp_code_file.unlink()

    def _validate_output_schema(self, output: Dict[str, Any]) -> Dict[str, Any]:
        """Validate output matches expected extractor schema."""
        # All extractors must return a dict with numeric values
        if not isinstance(output, dict):
            return {'valid': False, 'error': 'Output must be a dictionary'}
        # Check for at least one result value
        if not any(key for key in output if not key.startswith('_')):
            return {'valid': False, 'error': 'No result values found in output'}
        # All values must be numeric (keys starting with '_' are metadata)
        for key, value in output.items():
            if not key.startswith('_'):
                if not isinstance(value, (int, float)):
                    return {'valid': False,
                            'error': f'Non-numeric value for {key}: {type(value)}'}
        return {'valid': True}
```
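As a standalone distillation of the security stage, an import scan that also catches `from x import y` (which a bare `ast.Import` check would miss) might look like this; `first_disallowed_import` is an illustrative helper, not part of the validator's API.

```python
import ast

# Whitelist mirrors the roadmap's ALLOWED_IMPORTS
ALLOWED_IMPORTS = {'pyNastran', 'numpy', 'pathlib', 'typing', 'dataclasses',
                   'json', 'sys', 'os', 'math', 'collections'}

def first_disallowed_import(code: str):
    """Return the first import outside the whitelist, or None if clean."""
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split('.')[0] not in ALLOWED_IMPORTS:
                    return alias.name
        elif isinstance(node, ast.ImportFrom) and node.module:
            # `from x import y` must be checked separately from `import x`
            if node.module.split('.')[0] not in ALLOWED_IMPORTS:
                return node.module
    return None
```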
Success Metrics
Week 1 Success
- LLM mode accessible via `--llm` flag
- Natural language request → workflow generation works
- End-to-end test passes (simple_beam_optimization)
- Example demonstrates value (100 lines → 3 lines)
Week 2 Success
- Generated code validated before execution
- All failure scenarios degrade gracefully (no crashes)
- Complete LLM audit trail in `llm_audit.json`
- Test suite covers failure modes
Week 3 Success
- Successful workflows saved to knowledge base
- Second identical request reuses template (faster)
- Unknown features trigger ResearchAgent learning loop
- Knowledge base grows over time
Week 4 Success
- README shows LLM mode prominently
- docs/LLM_MODE.md complete and clear
- Demo video/GIF shows value proposition
- All planning docs updated
Risk Mitigation
Risk: LLM generates unsafe code
Mitigation: Multi-stage validation pipeline (syntax, security, test, schema)
Risk: LLM unavailable (API down)
Mitigation: Graceful fallback to manual mode with clear error message
Risk: Generated code fails at runtime
Mitigation: Sandboxed test execution before saving, retry with LLM feedback
Risk: Users don't discover LLM mode
Mitigation: Prominent README section, demo video, clear examples
Risk: Learning system fills disk with templates
Mitigation: Confidence-based pruning, max template limit, user confirmation for saves
Next Steps After Phase 3.2
Once integration is complete:
- Validate with Real Studies
- Run simple_beam_optimization in LLM mode
- Create new study using only natural language
- Compare results manual vs LLM mode
- Fix atomizer Conda Environment
- Rebuild clean environment
- Test visualization in atomizer env
- NXOpen Documentation Integration (Phase 2, remaining tasks)
- Research Siemens docs portal access
- Integrate NXOpen stub files for intellisense
- Enable LLM to reference NXOpen API
- Phase 4: Dynamic Code Generation (Roadmap)
- Journal script generator
- Custom function templates
- Safe execution sandbox
Last Updated: 2025-11-17
Owner: Antoine Polvé
Status: Week 1 complete; Week 2 in progress