
Feature Registry Architecture

Comprehensive guide to Atomizer's LLM-instructed feature database system

Last Updated: 2025-01-16
Status: Phase 2 - Design Document


Table of Contents

  1. Vision and Goals
  2. Feature Categorization System
  3. Feature Registry Structure
  4. LLM Instruction Format
  5. Feature Documentation Strategy
  6. Dynamic Tool Building
  7. Examples
  8. Implementation Plan

Vision and Goals

Core Philosophy

Atomizer's feature registry is not just a catalog - it's an LLM instruction system that enables:

  1. Self-Documentation: Features describe themselves to the LLM
  2. Intelligent Composition: LLM can combine features into workflows
  3. Autonomous Proposals: LLM suggests new features based on user needs
  4. Structured Customization: Users customize the tool through natural language
  5. Continuous Evolution: Feature database grows as users add capabilities

Key Principles

  • Feature Types Are First-Class: Engineering, software, UI, and analysis features are equally important
  • Location-Aware: Features know where their code lives and how to use it
  • Metadata-Rich: Each feature has enough context for LLM to understand and use it
  • Composable: Features can be combined into higher-level workflows
  • Extensible: New feature types can be added without breaking the system

Feature Categorization System

Primary Feature Dimensions

Features are organized along three dimensions:

Dimension 1: Domain (WHAT it does)

  • Engineering: Physics-based operations (stress, thermal, modal, etc.)
  • Software: Core algorithms and infrastructure (optimization, hooks, path resolution)
  • UI: User-facing components (dashboard, reports, visualization)
  • Analysis: Post-processing and decision support (sensitivity, Pareto, surrogate quality)

Dimension 2: Lifecycle Stage (WHEN it runs)

  • Pre-Mesh: Before meshing (geometry operations)
  • Pre-Solve: Before FEA solve (parameter updates, logging)
  • Solve: During FEA execution (solver control)
  • Post-Solve: After solve, before extraction (file validation)
  • Post-Extraction: After result extraction (logging, analysis)
  • Post-Optimization: After optimization completes (reporting, visualization)

Dimension 3: Abstraction Level (HOW it's used)

  • Primitive: Low-level functions (extract_stress, update_expression)
  • Composite: Mid-level workflows (RSS_metric, weighted_objective)
  • Workflow: High-level operations (run_optimization, generate_report)

Feature Type Classification

┌─────────────────────────────────────────────────────────────┐
│                     FEATURE UNIVERSE                        │
└─────────────────────────────────────────────────────────────┘
                              │
        ┌─────────────────────┼─────────────────────┐
        │                     │                     │
   ENGINEERING            SOFTWARE                UI
        │                     │                     │
    ┌───┴───┐           ┌────┴────┐          ┌─────┴─────┐
    │       │           │         │          │           │
Extractors  Metrics  Optimization Hooks  Dashboard  Reports
    │       │           │         │          │           │
  Stress   RSS        Optuna   Pre-Solve  Widgets    HTML
  Thermal  SCF         TPE     Post-Solve Controls   PDF
  Modal    FOS       Sampler  Post-Extract Charts   Markdown

Feature Registry Structure

JSON Schema

{
  "feature_registry": {
    "version": "0.2.0",
    "last_updated": "2025-01-16",
    "categories": {
      "engineering": { ... },
      "software": { ... },
      "ui": { ... },
      "analysis": { ... }
    }
  }
}
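
A consumer of this schema might load and enumerate it as follows; a minimal sketch, assuming each category bucket maps feature IDs to entries (the schema above elides the bucket contents as `{ ... }`):

```python
import json

# Minimal in-memory registry following the schema above; the category buckets
# are assumed to map feature_id -> feature entry (illustrative values).
REGISTRY_JSON = """
{
  "feature_registry": {
    "version": "0.2.0",
    "last_updated": "2025-01-16",
    "categories": {
      "engineering": {"stress_extractor": {"name": "Stress Extractor"}},
      "software": {},
      "ui": {},
      "analysis": {}
    }
  }
}
"""

def list_features(registry_text: str) -> dict:
    """Return {category: [feature_id, ...]} for discovery."""
    categories = json.loads(registry_text)["feature_registry"]["categories"]
    return {cat: sorted(bucket) for cat, bucket in categories.items()}

features_by_category = list_features(REGISTRY_JSON)
```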

Feature Entry Schema

Each feature has:

{
  "feature_id": "unique_identifier",
  "name": "Human-Readable Name",
  "description": "What this feature does (for LLM understanding)",
  "category": "engineering|software|ui|analysis",
  "subcategory": "extractors|metrics|optimization|hooks|...",
  "lifecycle_stage": "pre_solve|post_solve|post_extraction|...",
  "abstraction_level": "primitive|composite|workflow",
  "implementation": {
    "file_path": "relative/path/to/implementation.py",
    "function_name": "function_or_class_name",
    "entry_point": "how to invoke this feature"
  },
  "interface": {
    "inputs": [
      {
        "name": "parameter_name",
        "type": "str|int|float|dict|list",
        "required": true,
        "description": "What this parameter does",
        "units": "mm|MPa|Hz|none",
        "example": "example_value"
      }
    ],
    "outputs": [
      {
        "name": "output_name",
        "type": "float|dict|list",
        "description": "What this output represents",
        "units": "mm|MPa|Hz|none"
      }
    ]
  },
  "dependencies": {
    "features": ["feature_id_1", "feature_id_2"],
    "libraries": ["optuna", "pyNastran"],
    "nx_version": "2412"
  },
  "usage_examples": [
    {
      "description": "Example scenario",
      "code": "example_code_snippet",
      "natural_language": "How user would request this"
    }
  ],
  "composition_hints": {
    "combines_with": ["feature_id_3", "feature_id_4"],
    "typical_workflows": ["workflow_name_1"],
    "prerequisites": ["feature that must run before this"]
  },
  "metadata": {
    "author": "Antoine Polvé",
    "created": "2025-01-16",
    "status": "stable|experimental|deprecated",
    "tested": true,
    "documentation_url": "docs/features/feature_name.md"
  }
}
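
A lightweight validator could check the keys the LLM depends on before trusting an entry; a sketch, with the required-key set chosen illustratively from the schema above:

```python
# Keys the LLM relies on for discovery and invocation (illustrative subset).
REQUIRED_KEYS = {"feature_id", "name", "description", "category",
                 "lifecycle_stage", "abstraction_level", "implementation"}
VALID_CATEGORIES = {"engineering", "software", "ui", "analysis"}

def validate_feature(entry: dict) -> list:
    """Return a list of problems; an empty list means the entry passes."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - entry.keys())]
    if entry.get("category") not in VALID_CATEGORIES:
        problems.append(f"unknown category: {entry.get('category')!r}")
    impl = entry.get("implementation", {})
    for k in ("file_path", "function_name", "entry_point"):
        if k not in impl:
            problems.append(f"implementation missing: {k}")
    return problems

entry = {
    "feature_id": "stress_extractor",
    "name": "Stress Extractor",
    "description": "Extracts von Mises stress from OP2 files",
    "category": "engineering",
    "lifecycle_stage": "post_extraction",
    "abstraction_level": "primitive",
    "implementation": {
        "file_path": "optimization_engine/result_extractors/extractors.py",
        "function_name": "extract_stress_from_op2",
        "entry_point": "from optimization_engine.result_extractors.extractors import extract_stress_from_op2",
    },
}
problems = validate_feature(entry)
```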

LLM Instruction Format

How LLM Uses the Registry

The feature registry serves as a structured instruction manual for the LLM:

1. Discovery Phase

User: "I want to minimize stress on my bracket"

LLM reads registry:
  → Finds category="engineering", subcategory="extractors"
  → Discovers "stress_extractor" feature
  → Reads: "Extracts von Mises stress from OP2 files"
  → Checks composition_hints: combines_with=["optimization_runner"]

LLM response: "I'll use the stress_extractor feature to minimize stress.
               This requires an OP2 file from NX solve."

2. Composition Phase

User: "Add a custom RSS metric combining stress and displacement"

LLM reads registry:
  → Finds abstraction_level="composite" examples
  → Discovers "rss_metric" template feature
  → Reads interface: inputs=[stress_value, displacement_value]
  → Checks composition_hints: combines_with=["stress_extractor", "displacement_extractor"]

LLM generates new composite feature following the pattern

3. Proposal Phase

User: "What features could help me analyze fatigue life?"

LLM reads registry:
  → Searches category="engineering", subcategory="extractors"
  → Finds: stress_extractor, displacement_extractor (exist)
  → Doesn't find: fatigue_extractor (missing)
  → Reads composition_hints for similar features

LLM proposes: "I can create a fatigue_life_extractor that:
               1. Extracts stress history from OP2
               2. Applies rainflow counting algorithm
               3. Uses S-N curve to estimate fatigue life

               This would be similar to stress_extractor but with
               time-series analysis. Should I implement it?"

4. Execution Phase

User: "Run the optimization"

LLM reads registry:
  → Finds abstraction_level="workflow", feature_id="run_optimization"
  → Reads implementation.entry_point
  → Checks dependencies: ["optuna", "nx_solver", "stress_extractor"]
  → Reads lifecycle_stage to understand execution order

LLM executes: python optimization_engine/runner.py

Natural Language Mapping

Each feature includes natural_language examples showing how users might request it:

"usage_examples": [
  {
    "natural_language": [
      "minimize stress",
      "reduce von Mises stress",
      "find lowest stress configuration",
      "optimize for minimum stress"
    ],
    "maps_to": {
      "feature": "stress_extractor",
      "objective": "minimize",
      "metric": "max_von_mises"
    }
  }
]

This enables the LLM to understand user intent and select the correct features.
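
A naive sketch of this mapping, using substring matching against the natural_language phrases (a real implementation would rely on the LLM itself rather than string matching):

```python
# Natural-language mapping table, mirroring the usage_examples entry above.
NL_MAP = [
    {
        "phrases": ["minimize stress", "reduce von mises stress",
                    "find lowest stress configuration", "optimize for minimum stress"],
        "maps_to": {"feature": "stress_extractor", "objective": "minimize",
                    "metric": "max_von_mises"},
    },
]

def match_intent(request: str):
    """Return the first mapping whose phrase appears in the request."""
    text = request.lower()
    for entry in NL_MAP:
        if any(phrase in text for phrase in entry["phrases"]):
            return entry["maps_to"]
    return None

intent = match_intent("Please minimize stress on my bracket")
```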


Feature Documentation Strategy

Multi-Location Documentation

Features are documented in three places, each serving different purposes:

1. Feature Registry (feature_registry.json)

Purpose: LLM instruction and discovery
Location: optimization_engine/feature_registry.json
Content:

  • Structured metadata
  • Interface definitions
  • Composition hints
  • Usage examples

Example:

{
  "feature_id": "stress_extractor",
  "name": "Stress Extractor",
  "description": "Extracts von Mises stress from OP2 files",
  "category": "engineering",
  "subcategory": "extractors"
}

2. Code Implementation (*.py files)

Purpose: Actual functionality
Location: Codebase (e.g., optimization_engine/result_extractors/extractors.py)
Content:

  • Python code with docstrings
  • Type hints
  • Implementation details

Example:

def extract_stress_from_op2(op2_file: Path) -> dict:
    """
    Extracts von Mises stress from OP2 file.

    Args:
        op2_file: Path to OP2 file

    Returns:
        dict with max_von_mises, min_von_mises, avg_von_mises
    """
    # Implementation...

3. Feature Documentation (docs/features/*.md)

Purpose: Human-readable guides and tutorials
Location: docs/features/
Content:

  • Detailed explanations
  • Extended examples
  • Best practices
  • Troubleshooting

Example: docs/features/stress_extractor.md

# Stress Extractor

## Overview
Extracts von Mises stress from NX Nastran OP2 files.

## When to Use
- Structural optimization where stress is the objective
- Constraint checking (yield stress limits)
- Multi-objective with stress as one objective

## Example Workflows
[detailed examples...]

Documentation Flow

User Request
     ↓
LLM reads feature_registry.json (discovers feature)
     ↓
LLM reads code docstrings (understands interface)
     ↓
LLM reads docs/features/*.md (if complex usage needed)
     ↓
LLM composes workflow using features
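
The "reads code docstrings" step of this flow can be mechanized with inspect; a sketch (the module-path derivation from file_path is an assumption, and textwrap.dedent stands in for a feature function):

```python
import importlib
import inspect

def feature_docstring(implementation: dict) -> str:
    """Fetch the docstring of a feature's implementation for the LLM to read."""
    module_path = implementation["file_path"].removesuffix(".py").replace("/", ".")
    fn = getattr(importlib.import_module(module_path), implementation["function_name"])
    return inspect.getdoc(fn) or ""

# Demo: a stdlib function stands in for a feature implementation.
doc = feature_docstring({"file_path": "textwrap.py", "function_name": "dedent"})
```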

Dynamic Tool Building

How LLM Builds New Features

The registry enables autonomous feature creation through templates and patterns:

Step 1: Pattern Recognition

User: "I need thermal stress extraction"

LLM:
1. Reads existing feature: stress_extractor
2. Identifies pattern: OP2 parsing → result extraction → return dict
3. Finds similar features: displacement_extractor
4. Recognizes template: engineering.extractors

Step 2: Feature Generation

LLM generates new feature following pattern:
{
  "feature_id": "thermal_stress_extractor",
  "name": "Thermal Stress Extractor",
  "description": "Extracts thermal stress from OP2 files (steady-state heat transfer analysis)",
  "category": "engineering",
  "subcategory": "extractors",
  "lifecycle_stage": "post_extraction",
  "abstraction_level": "primitive",
  "implementation": {
    "file_path": "optimization_engine/result_extractors/thermal_extractors.py",
    "function_name": "extract_thermal_stress_from_op2",
    "entry_point": "from optimization_engine.result_extractors.thermal_extractors import extract_thermal_stress_from_op2"
  },
  # ... rest of schema
}

Step 3: Code Generation

# LLM writes implementation following stress_extractor pattern
def extract_thermal_stress_from_op2(op2_file: Path) -> dict:
    """
    Extracts thermal stress from OP2 file.

    Args:
        op2_file: Path to OP2 file from thermal analysis

    Returns:
        dict with max_thermal_stress, temperature_at_max_stress
    """
    from pyNastran.op2.op2 import OP2

    op2 = OP2()
    op2.read_op2(op2_file)

    # Extract thermal stress (element type depends on analysis)
    thermal_stress = op2.thermal_stress_data

    return {
        'max_thermal_stress': thermal_stress.max(),
        'temperature_at_max_stress': None,  # ...
    }

Step 4: Registration

LLM adds to feature_registry.json
LLM creates docs/features/thermal_stress_extractor.md
LLM updates CHANGELOG.md with new feature
LLM runs tests to validate implementation
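
Step 4's first action, adding the entry to feature_registry.json, amounts to inserting it into the right category bucket; a sketch, assuming the buckets are keyed by feature_id:

```python
def register_feature(registry: dict, entry: dict) -> dict:
    """Insert a generated feature entry into its category bucket (in place)."""
    categories = registry["feature_registry"]["categories"]
    categories.setdefault(entry["category"], {})[entry["feature_id"]] = entry
    return registry

registry = {"feature_registry": {"version": "0.2.0", "categories": {"engineering": {}}}}
register_feature(registry, {
    "feature_id": "thermal_stress_extractor",
    "category": "engineering",
    "name": "Thermal Stress Extractor",
})
```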

Feature Composition Examples

Example 1: RSS Metric (Composite Feature)

User: "Create RSS metric combining stress and displacement"

LLM composes from primitives:
  stress_extractor + displacement_extractor → rss_metric

Generated feature:
{
  "feature_id": "rss_stress_displacement",
  "abstraction_level": "composite",
  "dependencies": {
    "features": ["stress_extractor", "displacement_extractor"]
  },
  "composition_hints": {
    "composed_from": ["stress_extractor", "displacement_extractor"],
    "composition_type": "root_sum_square"
  }
}
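
The root_sum_square composition named above reduces to a one-line formula; a sketch, where the scale parameters are hypothetical additions for normalizing mixed units:

```python
import math

def rss_metric(stress_value: float, displacement_value: float,
               stress_scale: float = 1.0, disp_scale: float = 1.0) -> float:
    """Root-sum-square of scaled stress and displacement (composition_type above).
    The scale parameters are hypothetical, for normalizing mixed units."""
    return math.sqrt((stress_value / stress_scale) ** 2
                     + (displacement_value / disp_scale) ** 2)

value = rss_metric(3.0, 4.0)  # → 5.0
```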

Example 2: Complete Workflow

User: "Run bracket optimization minimizing stress"

LLM composes workflow from features:
  1. study_manager (create study folder)
  2. nx_updater (update wall_thickness parameter)
  3. nx_solver (run FEA)
  4. stress_extractor (extract results)
  5. optimization_runner (Optuna TPE loop)
  6. report_generator (create HTML report)

Each step uses a feature from the registry, with its position in the
sequence derived from lifecycle_stage metadata.
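
The sequencing rule (order steps by lifecycle_stage) can be sketched directly:

```python
# Ordinal positions for the lifecycle stages defined earlier in this document.
STAGE_ORDER = ["pre_mesh", "pre_solve", "solve", "post_solve",
               "post_extraction", "post_optimization"]

def sequence_workflow(features: list) -> list:
    """Sort feature entries into execution order by lifecycle_stage."""
    return sorted(features, key=lambda f: STAGE_ORDER.index(f["lifecycle_stage"]))

steps = sequence_workflow([
    {"feature_id": "report_generator", "lifecycle_stage": "post_optimization"},
    {"feature_id": "nx_updater", "lifecycle_stage": "pre_solve"},
    {"feature_id": "stress_extractor", "lifecycle_stage": "post_extraction"},
])
order = [f["feature_id"] for f in steps]
```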

Examples

Example 1: Engineering Feature (Stress Extractor)

{
  "feature_id": "stress_extractor",
  "name": "Stress Extractor",
  "description": "Extracts von Mises stress from NX Nastran OP2 files",
  "category": "engineering",
  "subcategory": "extractors",
  "lifecycle_stage": "post_extraction",
  "abstraction_level": "primitive",
  "implementation": {
    "file_path": "optimization_engine/result_extractors/extractors.py",
    "function_name": "extract_stress_from_op2",
    "entry_point": "from optimization_engine.result_extractors.extractors import extract_stress_from_op2"
  },
  "interface": {
    "inputs": [
      {
        "name": "op2_file",
        "type": "Path",
        "required": true,
        "description": "Path to OP2 file from NX solve",
        "example": "bracket_sim1-solution_1.op2"
      }
    ],
    "outputs": [
      {
        "name": "max_von_mises",
        "type": "float",
        "description": "Maximum von Mises stress across all elements",
        "units": "MPa"
      },
      {
        "name": "element_id_at_max",
        "type": "int",
        "description": "Element ID where max stress occurs"
      }
    ]
  },
  "dependencies": {
    "features": [],
    "libraries": ["pyNastran"],
    "nx_version": "2412"
  },
  "usage_examples": [
    {
      "description": "Minimize stress in bracket optimization",
      "code": "result = extract_stress_from_op2(Path('bracket.op2'))\nmax_stress = result['max_von_mises']",
      "natural_language": [
        "minimize stress",
        "reduce von Mises stress",
        "find lowest stress configuration"
      ]
    }
  ],
  "composition_hints": {
    "combines_with": ["displacement_extractor", "mass_extractor"],
    "typical_workflows": ["structural_optimization", "stress_minimization"],
    "prerequisites": ["nx_solver"]
  },
  "metadata": {
    "author": "Antoine Polvé",
    "created": "2025-01-10",
    "status": "stable",
    "tested": true,
    "documentation_url": "docs/features/stress_extractor.md"
  }
}

Example 2: Software Feature (Hook Manager)

{
  "feature_id": "hook_manager",
  "name": "Hook Manager",
  "description": "Manages plugin lifecycle hooks for optimization workflow",
  "category": "software",
  "subcategory": "infrastructure",
  "lifecycle_stage": "all",
  "abstraction_level": "composite",
  "implementation": {
    "file_path": "optimization_engine/plugins/hook_manager.py",
    "function_name": "HookManager",
    "entry_point": "from optimization_engine.plugins.hook_manager import HookManager"
  },
  "interface": {
    "inputs": [
      {
        "name": "hook_type",
        "type": "str",
        "required": true,
        "description": "Lifecycle point: pre_solve, post_solve, post_extraction",
        "example": "pre_solve"
      },
      {
        "name": "context",
        "type": "dict",
        "required": true,
        "description": "Context data passed to hooks (trial_number, design_variables, etc.)"
      }
    ],
    "outputs": [
      {
        "name": "execution_history",
        "type": "list",
        "description": "List of hooks executed with timestamps and success status"
      }
    ]
  },
  "dependencies": {
    "features": [],
    "libraries": [],
    "nx_version": null
  },
  "usage_examples": [
    {
      "description": "Execute pre-solve hooks before FEA",
      "code": "hook_manager.execute_hooks('pre_solve', context={'trial': 1})",
      "natural_language": [
        "run pre-solve plugins",
        "execute hooks before solving"
      ]
    }
  ],
  "composition_hints": {
    "combines_with": ["detailed_logger", "optimization_logger"],
    "typical_workflows": ["optimization_runner"],
    "prerequisites": []
  },
  "metadata": {
    "author": "Antoine Polvé",
    "created": "2025-01-16",
    "status": "stable",
    "tested": true,
    "documentation_url": "docs/features/hook_manager.md"
  }
}

Example 3: UI Feature (Dashboard Widget)

{
  "feature_id": "optimization_progress_chart",
  "name": "Optimization Progress Chart",
  "description": "Real-time chart showing optimization convergence",
  "category": "ui",
  "subcategory": "dashboard_widgets",
  "lifecycle_stage": "post_optimization",
  "abstraction_level": "composite",
  "implementation": {
    "file_path": "dashboard/frontend/components/ProgressChart.js",
    "function_name": "OptimizationProgressChart",
    "entry_point": "new OptimizationProgressChart(containerId)"
  },
  "interface": {
    "inputs": [
      {
        "name": "trial_data",
        "type": "list[dict]",
        "required": true,
        "description": "List of trial results with objective values",
        "example": "[{trial: 1, value: 45.3}, {trial: 2, value: 42.1}]"
      }
    ],
    "outputs": [
      {
        "name": "chart_element",
        "type": "HTMLElement",
        "description": "Rendered chart DOM element"
      }
    ]
  },
  "dependencies": {
    "features": [],
    "libraries": ["Chart.js"],
    "nx_version": null
  },
  "usage_examples": [
    {
      "description": "Display optimization progress in dashboard",
      "code": "chart = new OptimizationProgressChart('chart-container')\nchart.update(trial_data)",
      "natural_language": [
        "show optimization progress",
        "display convergence chart",
        "visualize trial results"
      ]
    }
  ],
  "composition_hints": {
    "combines_with": ["trial_history_table", "best_parameters_display"],
    "typical_workflows": ["dashboard_view", "result_monitoring"],
    "prerequisites": ["optimization_runner"]
  },
  "metadata": {
    "author": "Antoine Polvé",
    "created": "2025-01-10",
    "status": "stable",
    "tested": true,
    "documentation_url": "docs/features/dashboard_widgets.md"
  }
}

Example 4: Analysis Feature (Surrogate Quality Checker)

{
  "feature_id": "surrogate_quality_checker",
  "name": "Surrogate Quality Checker",
  "description": "Evaluates surrogate model quality using R², CV score, and confidence intervals",
  "category": "analysis",
  "subcategory": "decision_support",
  "lifecycle_stage": "post_optimization",
  "abstraction_level": "composite",
  "implementation": {
    "file_path": "optimization_engine/analysis/surrogate_quality.py",
    "function_name": "check_surrogate_quality",
    "entry_point": "from optimization_engine.analysis.surrogate_quality import check_surrogate_quality"
  },
  "interface": {
    "inputs": [
      {
        "name": "trial_data",
        "type": "list[dict]",
        "required": true,
        "description": "Trial history with design variables and objectives"
      },
      {
        "name": "min_r_squared",
        "type": "float",
        "required": false,
        "description": "Minimum acceptable R² threshold",
        "example": "0.9"
      }
    ],
    "outputs": [
      {
        "name": "r_squared",
        "type": "float",
        "description": "Coefficient of determination",
        "units": "none"
      },
      {
        "name": "cv_score",
        "type": "float",
        "description": "Cross-validation score",
        "units": "none"
      },
      {
        "name": "quality_verdict",
        "type": "str",
        "description": "EXCELLENT|GOOD|POOR based on metrics"
      }
    ]
  },
  "dependencies": {
    "features": ["optimization_runner"],
    "libraries": ["sklearn", "numpy"],
    "nx_version": null
  },
  "usage_examples": [
    {
      "description": "Check if surrogate is reliable for predictions",
      "code": "quality = check_surrogate_quality(trial_data)\nif quality['r_squared'] > 0.9:\n    print('Surrogate is reliable')",
      "natural_language": [
        "check surrogate quality",
        "is surrogate reliable",
        "can I trust the surrogate model"
      ]
    }
  ],
  "composition_hints": {
    "combines_with": ["sensitivity_analysis", "pareto_front_analyzer"],
    "typical_workflows": ["post_optimization_analysis", "decision_support"],
    "prerequisites": ["optimization_runner"]
  },
  "metadata": {
    "author": "Antoine Polvé",
    "created": "2025-01-16",
    "status": "experimental",
    "tested": false,
    "documentation_url": "docs/features/surrogate_quality_checker.md"
  }
}
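
The R² metric this checker reports can be computed directly from its definition; a pure-Python sketch, where the EXCELLENT/GOOD/POOR thresholds are illustrative assumptions:

```python
def r_squared(y_true, y_pred) -> float:
    """Coefficient of determination, from its textbook definition."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot

def quality_verdict(r2: float, min_r_squared: float = 0.9) -> str:
    """Map R² onto the verdict strings; thresholds are illustrative assumptions."""
    if r2 >= 0.95:
        return "EXCELLENT"
    return "GOOD" if r2 >= min_r_squared else "POOR"

verdict = quality_verdict(r_squared([1, 2, 3, 4], [1.0, 2.1, 2.9, 4.0]))
```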

Implementation Plan

Phase 2 Week 1: Foundation

Day 1-2: Create Initial Registry

  • Create optimization_engine/feature_registry.json
  • Document 15-20 existing features across all categories
  • Add engineering features (stress_extractor, displacement_extractor)
  • Add software features (hook_manager, optimization_runner, nx_solver)
  • Add UI features (dashboard widgets)

Day 3-4: LLM Skill Setup

  • Create .claude/skills/atomizer.md
  • Define how LLM should read and use feature_registry.json
  • Add feature discovery examples
  • Add feature composition examples
  • Test LLM's ability to navigate registry

Day 5: Documentation

  • Create docs/features/ directory
  • Write feature guides for key features
  • Link registry entries to documentation
  • Update DEVELOPMENT.md with registry usage

Phase 2 Week 2: LLM Integration

Natural Language Parser

  • Intent classification using registry metadata
  • Entity extraction for design variables, objectives
  • Feature selection based on user request
  • Workflow composition from features

Future Phases: Feature Expansion

Phase 3: Code Generation

  • Template features for common patterns
  • Validation rules for generated code
  • Auto-registration of new features

Phase 4-7: Continuous Evolution

  • User-contributed features
  • Pattern learning from usage
  • Best practices extraction
  • Self-documentation updates

Benefits of This Architecture

For Users

  • Natural language control: "minimize stress" → LLM selects stress_extractor
  • Intelligent suggestions: LLM proposes features based on context
  • No configuration files: LLM generates config from conversation

For Developers

  • Clear structure: Features organized by domain, lifecycle, abstraction
  • Easy extension: Add new features following templates
  • Self-documenting: Registry serves as API documentation

For LLM

  • Comprehensive context: All capabilities in one place
  • Composition guidance: Knows how features combine
  • Natural language mapping: Understands user intent
  • Pattern recognition: Can generate new features from templates

Next Steps

  1. Create initial feature_registry.json with 15-20 existing features
  2. Test LLM navigation with Claude skill
  3. Validate registry structure with real user requests
  4. Iterate on metadata based on LLM's needs
  5. Build out documentation in docs/features/

Maintained by: Antoine Polvé (antoine@atomaste.com)
Repository: GitHub - Atomizer