docs: Major documentation overhaul - restructure folders, update tagline, add Getting Started guide

- Restructure docs/ folder (remove numeric prefixes):
  - 04_USER_GUIDES -> guides/
  - 05_API_REFERENCE -> api/
  - 06_PHYSICS -> physics/
  - 07_DEVELOPMENT -> development/
  - 08_ARCHIVE -> archive/
  - 09_DIAGRAMS -> diagrams/

- Replace tagline 'Talk, don't click' with 'LLM-driven optimization framework' in 9 files

- Create comprehensive docs/GETTING_STARTED.md:
  - Prerequisites and quick setup
  - Project structure overview
  - First study tutorial (Claude or manual)
  - Dashboard usage guide
  - Neural acceleration introduction

- Rewrite docs/00_INDEX.md with correct paths and modern structure

- Archive obsolete files:
  - 01_PROTOCOLS.md -> archive/historical/01_PROTOCOLS_legacy.md
  - 03_GETTING_STARTED.md -> archive/historical/
  - ATOMIZER_PODCAST_BRIEFING.md -> archive/marketing/

- Update timestamps to 2026-01-20 across all key files

- Update .gitignore to exclude docs/generated/

- Version bump: ATOMIZER_CONTEXT v1.8 -> v2.0
2026-01-20 10:03:45 -05:00
parent 37f73cc2be
commit ea437d360e
103 changed files with 8980 additions and 327 deletions

@@ -0,0 +1,495 @@
# Neural Network Surrogate Automation Plan
## Vision: One-Click ML-Accelerated Optimization
Make neural network surrogates a **first-class citizen** in Atomizer, fully integrated into the optimization workflow so that:
1. Non-coders can enable/configure NN acceleration via JSON config
2. The system automatically builds, trains, and validates surrogates
3. Knowledge accumulates in a reusable "Physics Knowledge Base"
4. The dashboard provides full visibility and control
---
## Current State (What We Have)
```
Manual Steps Required Today:
1. Run optimization (30+ FEA trials)
2. Manually run: generate_training_data.py
3. Manually run: run_training_fea.py
4. Manually run: train_nn_surrogate.py
5. Manually run: generate_nn_report.py
6. Manually enable --enable-nn flag
7. No persistent knowledge storage
```
---
## Target State (What We Want)
```
Automated Flow:
1. User creates optimization_config.json with surrogate_settings
2. User runs: python run_optimization.py --trials 100
3. System automatically:
   - Runs initial FEA exploration (20-30 trials)
   - Generates space-filling training points
   - Runs parallel FEA on training points
   - Trains and validates surrogate
   - Switches to NN-accelerated optimization
   - Validates top candidates with real FEA
   - Stores learned physics in Knowledge Base
```
---
## Phase 1: Extended Configuration Schema
### Current optimization_config.json
```json
{
  "study_name": "uav_arm_optimization",
  "optimization_settings": {
    "protocol": "protocol_11_multi_objective",
    "n_trials": 30
  },
  "design_variables": [...],
  "objectives": [...],
  "constraints": [...]
}
```
### Proposed Extended Schema
```json
{
  "study_name": "uav_arm_optimization",
  "description": "UAV Camera Support Arm",
  "engineering_context": "Drone gimbal arm for 850g camera payload",
  "optimization_settings": {
    "protocol": "protocol_12_hybrid_surrogate",
    "n_trials": 200,
    "sampler": "NSGAIISampler"
  },
  "design_variables": [...],
  "objectives": [...],
  "constraints": [...],
  "surrogate_settings": {
    "enabled": true,
    "mode": "auto",
    "training": {
      "initial_fea_trials": 30,
      "space_filling_samples": 100,
      "sampling_method": "lhs_with_corners",
      "parallel_workers": 2
    },
    "model": {
      "architecture": "mlp",
      "hidden_layers": [64, 128, 64],
      "validation_method": "5_fold_cv",
      "min_accuracy_mape": 10.0,
      "retrain_threshold": 15.0
    },
    "optimization": {
      "nn_trials_per_fea": 50,
      "validate_top_n": 5,
      "adaptive_sampling": true
    },
    "knowledge_base": {
      "save_to_master": true,
      "master_db_path": "knowledge_base/physics_surrogates.db",
      "tags": ["cantilever", "aluminum", "modal", "static"],
      "reuse_similar": true
    }
  },
  "simulation": {...},
  "reporting": {...}
}
```
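
The roadmap calls for a config validator and a migration path for existing configs. A minimal sketch of both follows, assuming the values shown in the proposed schema act as defaults and that a config without `surrogate_settings` should simply run with NN acceleration disabled. The function names and the rule `min_accuracy_mape <= retrain_threshold` are illustrative assumptions, not part of the current codebase.

```python
import copy

# Assumption: values from the proposed schema double as defaults.
DEFAULT_SURROGATE_SETTINGS = {
    "enabled": False,  # legacy configs keep pure-FEA behaviour
    "mode": "auto",
    "training": {
        "initial_fea_trials": 30,
        "space_filling_samples": 100,
        "sampling_method": "lhs_with_corners",
        "parallel_workers": 2,
    },
    "model": {
        "architecture": "mlp",
        "hidden_layers": [64, 128, 64],
        "validation_method": "5_fold_cv",
        "min_accuracy_mape": 10.0,
        "retrain_threshold": 15.0,
    },
}


def merge_defaults(defaults: dict, overrides: dict) -> dict:
    """Recursively overlay user-provided values on top of the defaults."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_defaults(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def load_surrogate_settings(config: dict) -> dict:
    """Return surrogate settings with defaults filled in and basic checks applied."""
    settings = merge_defaults(DEFAULT_SURROGATE_SETTINGS,
                              config.get("surrogate_settings", {}))
    training, model = settings["training"], settings["model"]
    if training["initial_fea_trials"] < 1:
        raise ValueError("initial_fea_trials must be at least 1")
    if training["parallel_workers"] < 1:
        raise ValueError("parallel_workers must be at least 1")
    # Assumed rule: the acceptance limit should not exceed the retrain limit.
    if not 0 < model["min_accuracy_mape"] <= model["retrain_threshold"]:
        raise ValueError("expected 0 < min_accuracy_mape <= retrain_threshold")
    return settings
```

With this shape, a Protocol 11 config loads unchanged and reports `enabled: False`, while a partial `surrogate_settings` block only needs to list the values that differ from the defaults.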
---
## Phase 2: Protocol 12 - Hybrid Surrogate Optimization
### Workflow Stages
```
┌─────────────────────────────────────────────────────────────────────┐
│                    PROTOCOL 12: HYBRID SURROGATE                    │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  STAGE 1: EXPLORATION (FEA Only)                                    │
│  ├─ Run initial_fea_trials with real FEA                            │
│  ├─ Build baseline Pareto front                                     │
│  └─ Assess design space complexity                                  │
│                                                                     │
│  STAGE 2: TRAINING DATA GENERATION                                  │
│  ├─ Generate space_filling_samples (LHS + corners)                  │
│  ├─ Run parallel FEA on training points                             │
│  ├─ Store all results in training_data.db                           │
│  └─ Monitor for failures, retry if needed                           │
│                                                                     │
│  STAGE 3: SURROGATE TRAINING                                        │
│  ├─ Train NN on combined data (optimization + training)             │
│  ├─ Validate with k-fold cross-validation                           │
│  ├─ Check MAPE <= min_accuracy_mape                                 │
│  └─ Generate performance report                                     │
│                                                                     │
│  STAGE 4: NN-ACCELERATED OPTIMIZATION                               │
│  ├─ Run nn_trials_per_fea NN evaluations per FEA validation         │
│  ├─ Validate the top validate_top_n candidates with real FEA        │
│  ├─ Update surrogate with new data (adaptive)                       │
│  └─ Repeat until n_trials reached                                   │
│                                                                     │
│  STAGE 5: FINAL VALIDATION & REPORTING                              │
│  ├─ Validate all Pareto-optimal designs with FEA                    │
│  ├─ Generate comprehensive report                                   │
│  └─ Save learned physics to Knowledge Base                          │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
```
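
Stage 2 relies on the `lhs_with_corners` sampling method, which the plan names but does not define. One plausible reading, sketched below with only the standard library, is "every corner of the design space plus a Latin hypercube sample of the interior". The function name, the bounds format, and the example parameters `thickness` and `width` are hypothetical.

```python
import itertools
import random


def lhs_with_corners(bounds: dict, n_samples: int, seed: int = 0) -> list:
    """Return all corner points followed by n_samples Latin hypercube points.

    bounds maps each parameter name to a (min, max) pair.
    """
    rng = random.Random(seed)
    names = list(bounds)

    # Corners: every combination of lower/upper bound (2**d points).
    corners = [dict(zip(names, combo))
               for combo in itertools.product(*(bounds[n] for n in names))]

    # Latin hypercube: split each axis into n_samples equal strata, draw one
    # point inside each stratum, and shuffle the strata independently per axis.
    columns = {}
    for name in names:
        low, high = bounds[name]
        strata = list(range(n_samples))
        rng.shuffle(strata)
        columns[name] = [low + (s + rng.random()) / n_samples * (high - low)
                         for s in strata]

    lhs = [{name: columns[name][i] for name in names} for i in range(n_samples)]
    return corners + lhs


if __name__ == "__main__":
    points = lhs_with_corners({"thickness": (2.0, 6.0), "width": (10.0, 30.0)}, 5)
    for point in points:
        print(point)
```

The corner count doubles with every design variable, so for studies with many parameters a real implementation would likely cap or subsample the corners.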
### Implementation: runner_protocol_12.py
```python
class HybridSurrogateRunner:
    """Protocol 12: Automated hybrid FEA/NN optimization."""

    def __init__(self, config: dict):
        self.config = config
        self.surrogate_config = config.get('surrogate_settings', {})
        self.stage = "exploration"

    def run(self):
        # Stage 1: Exploration
        self.run_exploration_stage()

        # Stages 2-4 run only when the surrogate is enabled
        # (nesting inferred; the stage methods are still to be implemented)
        if self.surrogate_config.get('enabled', False):
            # Stage 2: Training Data
            self.generate_training_data()
            self.run_parallel_fea_training()

            # Stage 3: Train Surrogate
            self.train_and_validate_surrogate()

            # Stage 4: NN-Accelerated
            self.run_nn_accelerated_optimization()

        # Stage 5: Final
        self.validate_and_report()
        self.save_to_knowledge_base()
```
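
The Stage 3 check inside `train_and_validate_surrogate()` could look roughly like the sketch below: k-fold cross-validation produces a held-out MAPE, which is then compared with `min_accuracy_mape`. The `fit` callback, the helper names, and the reading of `retrain_threshold` as "retrain once error against fresh FEA results exceeds this value" are assumptions, since the plan does not spell them out.

```python
from statistics import mean
from typing import Callable, Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error in percent (actual values must be non-zero)."""
    return 100.0 * mean(abs((a - p) / a) for a, p in zip(actual, predicted))


def k_fold_mape(inputs: Sequence, targets: Sequence[float],
                fit: Callable, k: int = 5) -> float:
    """Average held-out MAPE over k folds.

    fit(train_inputs, train_targets) is a hypothetical callback that trains a
    model and returns a predict(x) function.
    """
    n = len(inputs)
    if n < k:
        raise ValueError("need at least k samples for k-fold cross-validation")
    fold_scores = []
    for fold in range(k):
        test_idx = range(fold, n, k)  # every k-th sample is held out
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        predict = fit([inputs[i] for i in train_idx],
                      [targets[i] for i in train_idx])
        fold_scores.append(mape([targets[i] for i in test_idx],
                                [predict(inputs[i]) for i in test_idx]))
    return mean(fold_scores)


def accept_surrogate(cv_mape: float, min_accuracy_mape: float = 10.0) -> bool:
    """Stage 3 gate: pass when the cross-validated error is at or below the limit."""
    return cv_mape <= min_accuracy_mape


def needs_retrain(validation_mape: float, retrain_threshold: float = 15.0) -> bool:
    """Assumed Stage 4 rule: retrain once error against new FEA results is too high."""
    return validation_mape > retrain_threshold
```

Because MAPE is an error measure, lower is better: a surrogate with 1.8% cross-validated MAPE passes a 10% limit, while one at 12% does not.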
---
## Phase 3: Physics Knowledge Base Architecture
### Purpose
Store learned physics relationships so future optimizations can:
1. **Warm-start** with pre-trained surrogates for similar problems
2. **Transfer learn** from related geometries/materials
3. **Build institutional knowledge** over time
### Database Schema: physics_surrogates.db
```sql
-- Master registry of all trained surrogates
CREATE TABLE surrogates (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    study_name TEXT,

    -- Problem characterization
    geometry_type TEXT,       -- 'cantilever', 'plate', 'shell', 'solid'
    material_family TEXT,     -- 'aluminum', 'steel', 'composite'
    analysis_types TEXT,      -- JSON: ['static', 'modal', 'buckling']

    -- Design space
    n_parameters INTEGER,
    parameter_names TEXT,     -- JSON array
    parameter_bounds TEXT,    -- JSON: {name: [min, max]}

    -- Objectives & Constraints
    objectives TEXT,          -- JSON: [{name, goal}]
    constraints TEXT,         -- JSON: [{name, type, threshold}]

    -- Model info
    model_path TEXT,          -- Path to .pt file
    architecture TEXT,        -- JSON: model architecture
    training_samples INTEGER,

    -- Performance metrics
    cv_mape_mass REAL,
    cv_mape_frequency REAL,
    cv_r2_mass REAL,
    cv_r2_frequency REAL,

    -- Metadata
    tags TEXT,                -- JSON array for search
    description TEXT,
    engineering_context TEXT
);

-- Training data for each surrogate
CREATE TABLE training_data (
    id INTEGER PRIMARY KEY,
    surrogate_id INTEGER REFERENCES surrogates(id),

    -- Input parameters (normalized 0-1)
    params_json TEXT,
    params_normalized TEXT,

    -- Output values
    mass REAL,
    frequency REAL,
    max_displacement REAL,
    max_stress REAL,

    -- Source
    source TEXT,              -- 'optimization', 'lhs', 'corner', 'adaptive'
    fea_timestamp TIMESTAMP
);

-- Similarity index for finding related problems
CREATE TABLE problem_similarity (
    surrogate_id INTEGER REFERENCES surrogates(id),

    -- Embedding for similarity search
    geometry_embedding BLOB,  -- Vector embedding of geometry type
    physics_embedding BLOB,   -- Vector embedding of physics signature

    -- Precomputed similarity features
    feature_vector TEXT       -- JSON: normalized features for matching
);
```
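
To make the schema concrete, here is a small sketch using Python's built-in `sqlite3` module against a reduced version of the `surrogates` table. The helper names are hypothetical, the tag filter runs in Python rather than SQL to avoid depending on SQLite's JSON functions, and the MAPE numbers in the demo are placeholders rather than measured results.

```python
import json
import sqlite3


def open_knowledge_base(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a knowledge base with a reduced surrogates table."""
    conn = sqlite3.connect(path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS surrogates (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            geometry_type TEXT,
            material_family TEXT,
            cv_mape_mass REAL,
            cv_mape_frequency REAL,
            tags TEXT
        )
    """)
    return conn


def register_surrogate(conn, name, geometry_type, material_family,
                       cv_mape_mass, cv_mape_frequency, tags) -> int:
    """Insert one surrogate record; tags are stored as a JSON array."""
    cur = conn.execute(
        "INSERT INTO surrogates (name, geometry_type, material_family,"
        " cv_mape_mass, cv_mape_frequency, tags) VALUES (?, ?, ?, ?, ?, ?)",
        (name, geometry_type, material_family,
         cv_mape_mass, cv_mape_frequency, json.dumps(tags)),
    )
    conn.commit()
    return cur.lastrowid


def find_by_tags(conn, required_tags, max_mape: float = 10.0) -> list:
    """Names of surrogates that carry all required tags and meet the MAPE limit."""
    rows = conn.execute(
        "SELECT name, tags FROM surrogates"
        " WHERE cv_mape_mass <= ? AND cv_mape_frequency <= ?"
        " ORDER BY cv_mape_mass",
        (max_mape, max_mape),
    ).fetchall()
    wanted = set(required_tags)
    return [name for name, tags in rows if wanted <= set(json.loads(tags))]


if __name__ == "__main__":
    conn = open_knowledge_base()
    # Placeholder values for illustration only.
    register_surrogate(conn, "uav_arm_v2", "cantilever", "aluminum",
                       1.8, 1.1, ["cantilever", "aluminum", "modal"])
    register_surrogate(conn, "bracket_opt", "solid", "steel",
                       4.2, 3.9, ["steel", "static"])
    print(find_by_tags(conn, ["aluminum", "modal"]))
```

Storing tags as JSON text keeps the table simple, at the cost of filtering in application code. A separate tag table would be the usual alternative if tag search becomes a hot path.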
### Knowledge Base API
```python
from dataclasses import dataclass
from typing import Optional

import torch.nn as nn


@dataclass
class SurrogateMatch:
    """Placeholder result type (assumed fields; not yet defined in this plan)."""
    surrogate_id: int
    similarity: float


class PhysicsKnowledgeBase:
    """Central repository for learned physics surrogates."""

    def __init__(self, db_path: str = "knowledge_base/physics_surrogates.db"):
        self.db_path = db_path

    def find_similar_surrogate(self, config: dict) -> Optional[SurrogateMatch]:
        """Find existing surrogate that could transfer to this problem."""
        # Extract features from config (helper still to be implemented)
        features = self._extract_problem_features(config)

        # Query similar problems, best match first (helper still to be implemented)
        matches = self._query_similar(features)

        # Return best match if similarity > threshold
        if matches and matches[0].similarity > 0.8:
            return matches[0]
        return None

    def save_surrogate(self, study_name: str, model_path: str,
                       config: dict, metrics: dict):
        """Save trained surrogate to knowledge base."""
        # Store model and metadata
        # Index for future similarity search
        raise NotImplementedError

    def transfer_learn(self, base_surrogate_id: int,
                       new_config: dict) -> nn.Module:
        """Create new surrogate by transfer learning from existing one."""
        # Load base model
        # Freeze early layers
        # Fine-tune on new data
        raise NotImplementedError
```
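
The similarity lookup behind `find_similar_surrogate` is left open in the plan. One simple option, sketched here as an assumption, is cosine similarity between the normalized feature vectors stored in `problem_similarity.feature_vector`, keeping the 0.8 threshold from the API above. The `best_match` function and the example vectors are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass
class SurrogateMatch:
    surrogate_id: int
    similarity: float


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(query: Sequence[float],
               stored: Dict[int, Sequence[float]],
               threshold: float = 0.8) -> Optional[SurrogateMatch]:
    """Return the most similar stored surrogate, or None if below the threshold."""
    matches = sorted(
        (SurrogateMatch(sid, cosine_similarity(query, vec))
         for sid, vec in stored.items()),
        key=lambda m: m.similarity,
        reverse=True,
    )
    if matches and matches[0].similarity > threshold:
        return matches[0]
    return None


if __name__ == "__main__":
    # Hypothetical feature vectors keyed by surrogate id.
    stored = {1: [1.0, 0.0, 1.0, 0.5], 2: [0.0, 1.0, 0.0, 0.2]}
    print(best_match([0.9, 0.1, 1.0, 0.4], stored))
```

A linear scan like this is adequate for the dozens of surrogates the dashboard mock-up shows; an approximate nearest-neighbour index would only matter at a much larger scale.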
---
## Phase 4: Dashboard Integration
### New Dashboard Pages
#### 1. Surrogate Status Panel (in existing Dashboard)
```
┌─────────────────────────────────────────────────────────┐
│                    SURROGATE STATUS                     │
├─────────────────────────────────────────────────────────┤
│ Mode: Hybrid (NN + FEA Validation)                      │
│ Stage: NN-Accelerated Optimization                      │
│                                                         │
│ Training Data: 150 samples (50 opt + 100 LHS)           │
│ Model Accuracy: MAPE 1.8% mass, 1.1% freq               │
│ Speedup: ~50x (10ms NN vs 500ms FEA)                    │
│                                                         │
│ [View Report] [Retrain] [Disable NN]                    │
└─────────────────────────────────────────────────────────┘
```
#### 2. Knowledge Base Browser
```
┌─────────────────────────────────────────────────────────┐
│                 PHYSICS KNOWLEDGE BASE                  │
├─────────────────────────────────────────────────────────┤
│ Stored Surrogates: 12                                   │
│                                                         │
│ [Cantilever Beams]  5 models, avg MAPE 2.1%             │
│ [Shell Structures]  3 models, avg MAPE 3.4%             │
│ [Solid Parts]       4 models, avg MAPE 4.2%             │
│                                                         │
│ Search: [aluminum modal_______] [Find Similar]          │
│                                                         │
│ Matching Models:                                        │
│ - uav_arm_v2 (92% match) - Transfer Learning Available  │
│ - bracket_opt (78% match)                               │
└─────────────────────────────────────────────────────────┘
```
---
## Phase 5: User Workflow (Non-Coder Experience)
### Scenario: New Optimization with NN Acceleration
```
Step 1: Create Study via Dashboard
┌─────────────────────────────────────────────────────────┐
│                 NEW OPTIMIZATION STUDY                  │
├─────────────────────────────────────────────────────────┤
│ Study Name: [drone_motor_mount___________]              │
│ Description: [Motor mount bracket________]              │
│                                                         │
│ Model File: [Browse...] drone_mount.prt                 │
│ Sim File:   [Browse...] drone_mount_sim.sim             │
│                                                         │
│ ☑ Enable Neural Network Acceleration                    │
│ ├─ Initial FEA Trials: [30____]                         │
│ ├─ Training Samples:   [100___]                         │
│ ├─ Target Accuracy:    [10% MAPE]                       │
│ └─ ☑ Save to Knowledge Base                             │
│                                                         │
│ Similar existing model found: "uav_arm_optimization"    │
│ ☑ Use as starting point (transfer learning)             │
│                                                         │
│ [Create Study]                                          │
└─────────────────────────────────────────────────────────┘

Step 2: System Automatically Executes Protocol 12
  - User sees progress in dashboard
  - No command-line needed
  - All stages automated

Step 3: Review Results
  - Pareto front with FEA-validated designs
  - NN performance report
  - Knowledge saved for future use
```
---
## Implementation Roadmap
### Phase 1: Config Schema Extension (1-2 days)
- [ ] Define surrogate_settings schema
- [ ] Update config validator
- [ ] Create migration for existing configs
### Phase 2: Protocol 12 Runner (3-5 days)
- [ ] Create HybridSurrogateRunner class
- [ ] Implement stage transitions
- [ ] Add progress callbacks for dashboard
- [ ] Integrate existing scripts as modules
### Phase 3: Knowledge Base (2-3 days)
- [ ] Create SQLite schema
- [ ] Implement PhysicsKnowledgeBase API
- [ ] Add similarity search
- [ ] Basic transfer learning
### Phase 4: Dashboard Integration (2-3 days)
- [ ] Surrogate status panel
- [ ] Knowledge base browser
- [ ] Study creation wizard with NN options
### Phase 5: Documentation & Testing (1-2 days)
- [ ] User guide for non-coders
- [ ] Integration tests
- [ ] Example workflows
---
## Data Flow Architecture
```
              ┌──────────────────────────────────────┐
              │       optimization_config.json       │
              │  (Single source of truth for study)  │
              └──────────────────┬───────────────────┘
              ┌──────────────────▼───────────────────┐
              │          Protocol 12 Runner          │
              │    (Orchestrates entire workflow)    │
              └──────────────────┬───────────────────┘
     ┌─────────────┬─────────────┼─────────────┬─────────────┐
     │             │             │             │             │
     ▼             ▼             ▼             ▼             ▼
┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐
│   FEA   │   │Training │   │Surrogate│   │   NN    │   │Knowledge│
│ Solver  │   │  Data   │   │ Trainer │   │  Optim  │   │  Base   │
└────┬────┘   └────┬────┘   └────┬────┘   └────┬────┘   └────┬────┘
     │             │             │             │             │
     ▼             ▼             ▼             ▼             ▼
┌─────────────────────────────────────────────────────────────────┐
│                            study.db                             │
│      (Optuna trials + training data + surrogate metadata)       │
└────────────────────────────────┬────────────────────────────────┘
              ┌──────────────────▼───────────────────┐
              │        physics_surrogates.db         │
              │   (Master knowledge base - global)   │
              └──────────────────────────────────────┘
```
---
## Key Benefits
### For Non-Coders
1. **Single JSON config** - No Python scripts to run manually
2. **Dashboard control** - Start/stop/monitor from browser
3. **Automatic recommendations** - System suggests best settings
4. **Knowledge reuse** - Similar problems get free speedup
### For the Organization
1. **Institutional memory** - Physics knowledge persists
2. **Faster iterations** - Each new study benefits from past work
3. **Reproducibility** - Everything tracked in databases
4. **Scalability** - Add more workers, train better models
### For the Workflow
1. **End-to-end automation** - No manual steps between stages
2. **Adaptive optimization** - System learns during run
3. **Validated results** - Top candidates always FEA-verified
4. **Rich reporting** - Performance metrics, comparisons, recommendations
---
## Next Steps
1. **Review this plan** - Get feedback on priorities
2. **Start with config schema** - Extend optimization_config.json
3. **Build Protocol 12** - Core automation logic
4. **Knowledge Base MVP** - Basic save/load functionality
5. **Dashboard integration** - Visual control panel
---
*Document Version: 1.0*
*Created: 2025-11-25*
*Author: Claude Code + Antoine*