docs: Major documentation overhaul - restructure folders, update tagline, add Getting Started guide

- Restructure docs/ folder (remove numeric prefixes):
  - 04_USER_GUIDES -> guides/
  - 05_API_REFERENCE -> api/
  - 06_PHYSICS -> physics/
  - 07_DEVELOPMENT -> development/
  - 08_ARCHIVE -> archive/
  - 09_DIAGRAMS -> diagrams/
- Replace tagline 'Talk, don't click' with 'LLM-driven optimization framework' in 9 files
- Create comprehensive docs/GETTING_STARTED.md:
  - Prerequisites and quick setup
  - Project structure overview
  - First study tutorial (Claude or manual)
  - Dashboard usage guide
  - Neural acceleration introduction
- Rewrite docs/00_INDEX.md with correct paths and modern structure
- Archive obsolete files:
  - 01_PROTOCOLS.md -> archive/historical/01_PROTOCOLS_legacy.md
  - 03_GETTING_STARTED.md -> archive/historical/
  - ATOMIZER_PODCAST_BRIEFING.md -> archive/marketing/
- Update timestamps to 2026-01-20 across all key files
- Update .gitignore to exclude docs/generated/
- Version bump: ATOMIZER_CONTEXT v1.8 -> v2.0
2026-01-20 10:03:45 -05:00
parent 37f73cc2be
commit ea437d360e
103 changed files with 8980 additions and 327 deletions
--- a/docs/development/NN_SURROGATE_AUTOMATION_PLAN.md
+++ b/docs/development/NN_SURROGATE_AUTOMATION_PLAN.md
@@ -0,0 +1,495 @@
+# Neural Network Surrogate Automation Plan
+
+## Vision: One-Click ML-Accelerated Optimization
+
+Make neural network surrogates a **first-class citizen** in Atomizer, fully integrated into the optimization workflow so that:
+1. Non-coders can enable/configure NN acceleration via JSON config
+2. The system automatically builds, trains, and validates surrogates
+3. Knowledge accumulates in a reusable "Physics Knowledge Base"
+4. The dashboard provides full visibility and control
+
+---
+
+## Current State (What We Have)
+
+```
+Manual Steps Required Today:
+1. Run optimization (30+ FEA trials)
+2. Manually run: generate_training_data.py
+3. Manually run: run_training_fea.py
+4. Manually run: train_nn_surrogate.py
+5. Manually run: generate_nn_report.py
+6. Manually enable --enable-nn flag
+7. No persistent knowledge storage
+```
+
+---
+
+## Target State (What We Want)
+
+```
+Automated Flow:
+1. User creates optimization_config.json with surrogate_settings
+2. User runs: python run_optimization.py --trials 100
+3. System automatically:
+   - Runs initial FEA exploration (20-30 trials)
+   - Generates space-filling training points
+   - Runs parallel FEA on training points
+   - Trains and validates surrogate
+   - Switches to NN-accelerated optimization
+   - Validates top candidates with real FEA
+   - Stores learned physics in Knowledge Base
+```
+
+---
+
+## Phase 1: Extended Configuration Schema
+
+### Current optimization_config.json
+```json
+{
+  "study_name": "uav_arm_optimization",
+  "optimization_settings": {
+    "protocol": "protocol_11_multi_objective",
+    "n_trials": 30
+  },
+  "design_variables": [...],
+  "objectives": [...],
+  "constraints": [...]
+}
+```
+
+### Proposed Extended Schema
+```json
+{
+  "study_name": "uav_arm_optimization",
+  "description": "UAV Camera Support Arm",
+  "engineering_context": "Drone gimbal arm for 850g camera payload",
+
+  "optimization_settings": {
+    "protocol": "protocol_12_hybrid_surrogate",
+    "n_trials": 200,
+    "sampler": "NSGAIISampler"
+  },
+
+  "design_variables": [...],
+  "objectives": [...],
+  "constraints": [...],
+
+  "surrogate_settings": {
+    "enabled": true,
+    "mode": "auto",
+
+    "training": {
+      "initial_fea_trials": 30,
+      "space_filling_samples": 100,
+      "sampling_method": "lhs_with_corners",
+      "parallel_workers": 2
+    },
+
+    "model": {
+      "architecture": "mlp",
+      "hidden_layers": [64, 128, 64],
+      "validation_method": "5_fold_cv",
+      "min_accuracy_mape": 10.0,
+      "retrain_threshold": 15.0
+    },
+
+    "optimization": {
+      "nn_trials_per_fea": 50,
+      "validate_top_n": 5,
+      "adaptive_sampling": true
+    },
+
+    "knowledge_base": {
+      "save_to_master": true,
+      "master_db_path": "knowledge_base/physics_surrogates.db",
+      "tags": ["cantilever", "aluminum", "modal", "static"],
+      "reuse_similar": true
+    }
+  },
+
+  "simulation": {...},
+  "reporting": {...}
+}
+```
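
The roadmap item "Update config validator" could start from something like the sketch below. It is a minimal illustration, not the project's validator: the function names, the decision to treat the values in the proposed schema as defaults, and the rule that `retrain_threshold` must not be lower than `min_accuracy_mape` are all assumptions made here.

```python
import copy

# Defaults copied from the proposed schema; using them as fallbacks is an
# assumption of this sketch. "enabled" defaults to False so legacy configs
# keep their current FEA-only behaviour.
SURROGATE_DEFAULTS = {
    "enabled": False,
    "mode": "auto",
    "training": {
        "initial_fea_trials": 30,
        "space_filling_samples": 100,
        "sampling_method": "lhs_with_corners",
        "parallel_workers": 2,
    },
    "model": {
        "architecture": "mlp",
        "hidden_layers": [64, 128, 64],
        "validation_method": "5_fold_cv",
        "min_accuracy_mape": 10.0,
        "retrain_threshold": 15.0,
    },
    "optimization": {
        "nn_trials_per_fea": 50,
        "validate_top_n": 5,
        "adaptive_sampling": True,
    },
}


def _merge(defaults: dict, overrides: dict) -> dict:
    """Recursively overlay user-supplied values on top of the defaults."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = _merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_surrogate_settings(config: dict) -> dict:
    """Return surrogate_settings with defaults filled in, or raise ValueError."""
    settings = _merge(SURROGATE_DEFAULTS, config.get("surrogate_settings", {}))
    training, model = settings["training"], settings["model"]
    if training["initial_fea_trials"] < 1:
        raise ValueError("initial_fea_trials must be at least 1")
    if training["parallel_workers"] < 1:
        raise ValueError("parallel_workers must be at least 1")
    if not model["hidden_layers"] or any(n < 1 for n in model["hidden_layers"]):
        raise ValueError("hidden_layers must be a non-empty list of positive sizes")
    # Assumed rule: the retrain trigger should be looser than the acceptance gate.
    if model["retrain_threshold"] < model["min_accuracy_mape"]:
        raise ValueError("retrain_threshold must not be below min_accuracy_mape")
    return settings
```

Under these assumptions, a config that predates `surrogate_settings` resolves to `enabled: False`, which may be enough to cover the "migration for existing configs" item without rewriting old files.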
+
+---
+
+## Phase 2: Protocol 12 - Hybrid Surrogate Optimization
+
+### Workflow Stages
+
+```
+┌─────────────────────────────────────────────────────────────────────┐
+│                    PROTOCOL 12: HYBRID SURROGATE                     │
+├─────────────────────────────────────────────────────────────────────┤
+│                                                                      │
+│  STAGE 1: EXPLORATION (FEA Only)                                    │
+│  ├─ Run initial_fea_trials with real FEA                            │
+│  ├─ Build baseline Pareto front                                     │
+│  └─ Assess design space complexity                                  │
+│                                                                      │
+│  STAGE 2: TRAINING DATA GENERATION                                  │
+│  ├─ Generate space_filling_samples (LHS + corners)                  │
+│  ├─ Run parallel FEA on training points                             │
+│  ├─ Store all results in training_data.db                           │
+│  └─ Monitor for failures, retry if needed                           │
+│                                                                      │
+│  STAGE 3: SURROGATE TRAINING                                        │
+│  ├─ Train NN on combined data (optimization + training)             │
+│  ├─ Validate with k-fold cross-validation                           │
+│  ├─ Check MAPE <= min_accuracy_mape                                 │
+│  └─ Generate performance report                                     │
+│                                                                      │
+│  STAGE 4: NN-ACCELERATED OPTIMIZATION                               │
+│  ├─ Run nn_trials_per_fea NN evaluations per FEA validation         │
+│  ├─ Validate top_n candidates with real FEA                         │
+│  ├─ Update surrogate with new data (adaptive)                       │
+│  └─ Repeat until n_trials reached                                   │
+│                                                                      │
+│  STAGE 5: FINAL VALIDATION & REPORTING                              │
+│  ├─ Validate all Pareto-optimal designs with FEA                    │
+│  ├─ Generate comprehensive report                                   │
+│  └─ Save learned physics to Knowledge Base                          │
+│                                                                      │
+└─────────────────────────────────────────────────────────────────────┘
+```
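
Stage 2 names "LHS + corners" as the sampling scheme. The sketch below shows one way that could work using only the standard library. The function name, the `bounds` layout (parameter name mapped to `(min, max)`), and the `max_corner_dims` cut-off are assumptions for illustration; the parameter names in the usage note are hypothetical.

```python
import itertools
import random


def lhs_with_corners(bounds: dict, n_samples: int, seed: int = 0,
                     max_corner_dims: int = 6) -> list:
    """Return the corner points followed by n_samples Latin hypercube points.

    bounds maps each parameter name to a (min, max) tuple.
    """
    rng = random.Random(seed)
    names = list(bounds)

    # Latin hypercube: split every axis into n_samples equal strata, shuffle
    # the strata independently per axis, and draw one value inside each.
    columns = {}
    for name in names:
        low, high = bounds[name]
        strata = list(range(n_samples))
        rng.shuffle(strata)
        columns[name] = [
            low + (s + rng.random()) / n_samples * (high - low) for s in strata
        ]
    points = [{name: columns[name][i] for name in names} for i in range(n_samples)]

    # Corners: every min/max combination, which is 2**d points, so they are
    # skipped once the number of parameters exceeds max_corner_dims.
    corners = []
    if len(names) <= max_corner_dims:
        for combo in itertools.product(*(bounds[name] for name in names)):
            corners.append(dict(zip(names, combo)))
    return corners + points
```

For example, `lhs_with_corners({"beam_thickness": (2.0, 10.0), "hole_diameter": (5.0, 20.0)}, 100)` would return 4 corner points followed by 100 stratified samples. Each parameter value falls in a different stratum, so no region of any single axis is left unsampled.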
+
+### Implementation: runner_protocol_12.py
+
+```python
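+class HybridSurrogateRunner:
+    """Protocol 12: Automated hybrid FEA/NN optimization."""
+
+    def __init__(self, config: dict):
+        self.config = config
+        self.surrogate_config = config.get('surrogate_settings', {})
+        self.stage = "exploration"
+
+    def run(self) -> None:
+        """Run the stages in order; the stage methods are still to be written."""
+        # Stage 1: Exploration (FEA only, always runs)
+        self.stage = "exploration"
+        self.run_exploration_stage()
+
+        surrogate_enabled = self.surrogate_config.get('enabled', False)
+        if surrogate_enabled:
+            # Stage 2: Training Data
+            self.stage = "training_data"
+            self.generate_training_data()
+            self.run_parallel_fea_training()
+
+            # Stage 3: Train Surrogate
+            self.stage = "surrogate_training"
+            self.train_and_validate_surrogate()
+
+            # Stage 4: NN-Accelerated
+            self.stage = "nn_optimization"
+            self.run_nn_accelerated_optimization()
+
+        # Stage 5: Final
+        self.stage = "final_validation"
+        self.validate_and_report()
+        if surrogate_enabled:
+            # Nothing to store when no surrogate was trained
+            self.save_to_knowledge_base()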
+```
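
Stage 3 gates the surrogate on a cross-validated MAPE. The sketch below shows how that check could be computed with no ML dependency. It is illustrative only: `fit_line` is a stand-in for the real NN training call, all function names are hypothetical, and treating `retrain_threshold` as an upper bound on error observed during the run is one possible reading of the schema, not something the plan states.

```python
from typing import Callable, Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error in percent; assumes no actual value is 0."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def k_fold_mape(xs: Sequence[float], ys: Sequence[float],
                fit: Callable, k: int = 5) -> float:
    """Average MAPE over k folds; fit(train_x, train_y) returns predict(x)."""
    if len(xs) < k:
        raise ValueError("need at least k samples")
    fold_scores = []
    for fold in range(k):
        # Every k-th sample, offset by the fold index, is held out.
        held_out = list(range(fold, len(xs), k))
        held_out_set = set(held_out)
        train_x = [x for i, x in enumerate(xs) if i not in held_out_set]
        train_y = [y for i, y in enumerate(ys) if i not in held_out_set]
        predict = fit(train_x, train_y)
        fold_scores.append(
            mape([ys[i] for i in held_out], [predict(xs[i]) for i in held_out])
        )
    return sum(fold_scores) / k


def fit_line(train_x, train_y):
    """Stand-in for NN training: least-squares line through one input."""
    n = len(train_x)
    mean_x, mean_y = sum(train_x) / n, sum(train_y) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
    sxx = sum((x - mean_x) ** 2 for x in train_x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def passes_accuracy_gate(cv_mape: float, model_cfg: dict) -> bool:
    """MAPE is an error, so lower is better: accept at or below the limit."""
    return cv_mape <= model_cfg["min_accuracy_mape"]


def needs_retrain(observed_mape: float, model_cfg: dict) -> bool:
    """Assumed meaning: retrain once error on FEA-checked designs exceeds this."""
    return observed_mape > model_cfg["retrain_threshold"]
```

With the schema values (`min_accuracy_mape: 10.0`, `retrain_threshold: 15.0`), a surrogate scoring 8% in cross-validation would be accepted, and under the assumed reading it would be retrained if its error on FEA-validated candidates later rose above 15%.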
+
+---
+
+## Phase 3: Physics Knowledge Base Architecture
+
+### Purpose
+Store learned physics relationships so future optimizations can:
+1. **Warm-start** with pre-trained surrogates for similar problems
+2. **Transfer learn** from related geometries/materials
+3. **Build institutional knowledge** over time
+
+### Database Schema: physics_surrogates.db
+
+```sql
+-- Master registry of all trained surrogates
+CREATE TABLE surrogates (
+    id INTEGER PRIMARY KEY,
+    name TEXT NOT NULL,
+    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
+    study_name TEXT,
+
+    -- Problem characterization
+    geometry_type TEXT,        -- 'cantilever', 'plate', 'shell', 'solid'
+    material_family TEXT,      -- 'aluminum', 'steel', 'composite'
+    analysis_types TEXT,       -- JSON: ["static", "modal", "buckling"]
+
+    -- Design space
+    n_parameters INTEGER,
+    parameter_names TEXT,      -- JSON array
+    parameter_bounds TEXT,     -- JSON: {name: [min, max]}
+
+    -- Objectives & Constraints
+    objectives TEXT,           -- JSON: [{name, goal}]
+    constraints TEXT,          -- JSON: [{name, type, threshold}]
+
+    -- Model info
+    model_path TEXT,           -- Path to .pt file
+    architecture TEXT,         -- JSON: model architecture
+    training_samples INTEGER,
+
+    -- Performance metrics
+    cv_mape_mass REAL,
+    cv_mape_frequency REAL,
+    cv_r2_mass REAL,
+    cv_r2_frequency REAL,
+
+    -- Metadata
+    tags TEXT,                 -- JSON array for search
+    description TEXT,
+    engineering_context TEXT
+);
+
+-- Training data for each surrogate
+CREATE TABLE training_data (
+    id INTEGER PRIMARY KEY,
+    surrogate_id INTEGER REFERENCES surrogates(id),
+
+    -- Input parameters
+    params_json TEXT,          -- JSON: {name: value}, raw values
+    params_normalized TEXT,    -- JSON: {name: value}, scaled to 0-1
+
+    -- Output values
+    mass REAL,
+    frequency REAL,
+    max_displacement REAL,
+    max_stress REAL,
+
+    -- Source
+    source TEXT,              -- 'optimization', 'lhs', 'corner', 'adaptive'
+    fea_timestamp TIMESTAMP
+);
+
+-- Similarity index for finding related problems
+CREATE TABLE problem_similarity (
+    surrogate_id INTEGER REFERENCES surrogates(id),
+
+    -- Embedding for similarity search
+    geometry_embedding BLOB,   -- Vector embedding of geometry type
+    physics_embedding BLOB,    -- Vector embedding of physics signature
+
+    -- Precomputed similarity features
+    feature_vector TEXT        -- JSON: normalized features for matching
+);
+```
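
To show how the JSON-in-TEXT columns might be used from Python, the sketch below works against a reduced copy of the `surrogates` table in an in-memory SQLite database. The helper names are hypothetical, the table keeps only a few of the columns from the schema above, and the metric values and the second entry's tags are placeholders rather than measured results.

```python
import json
import sqlite3

# In-memory database for illustration; a real run would open master_db_path.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE surrogates (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        geometry_type TEXT,
        material_family TEXT,
        tags TEXT,               -- JSON array for search
        cv_mape_mass REAL,
        cv_mape_frequency REAL
    )
""")


def register_surrogate(conn, name, geometry_type, material_family,
                       tags, cv_mape_mass, cv_mape_frequency) -> int:
    """Insert one surrogate row and return its id; tags are stored as JSON."""
    cur = conn.execute(
        "INSERT INTO surrogates (name, geometry_type, material_family, tags,"
        " cv_mape_mass, cv_mape_frequency) VALUES (?, ?, ?, ?, ?, ?)",
        (name, geometry_type, material_family, json.dumps(tags),
         cv_mape_mass, cv_mape_frequency),
    )
    conn.commit()
    return cur.lastrowid


def find_by_tags(conn, wanted_tags) -> list:
    """Return (name, shared_tag_count) pairs, best overlap first."""
    wanted = set(wanted_tags)
    rows = conn.execute("SELECT name, tags FROM surrogates ORDER BY id").fetchall()
    scored = [(name, len(wanted & set(json.loads(tags)))) for name, tags in rows]
    return sorted((row for row in scored if row[1] > 0), key=lambda row: -row[1])


# Placeholder entries; the metric values are illustrative only.
register_surrogate(conn, "uav_arm_v2", "cantilever", "aluminum",
                   ["cantilever", "aluminum", "modal", "static"], 1.8, 1.1)
register_surrogate(conn, "bracket_opt", "solid", "steel",
                   ["bracket", "steel", "static"], 4.0, 3.0)
```

Filtering tags in Python keeps the sketch independent of SQLite's optional JSON functions. With many stored surrogates, a separate tag table with an index would likely scale better than scanning every row.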
+
+### Knowledge Base API
+
+```python
+from dataclasses import dataclass
+from typing import Optional
+
+import torch.nn as nn
+
+
+@dataclass
+class SurrogateMatch:
+    """Result of a similarity query (fields are provisional)."""
+    surrogate_id: int
+    similarity: float  # 0.0 = unrelated, 1.0 = same problem
+
+
+class PhysicsKnowledgeBase:
+    """Central repository for learned physics surrogates."""
+
+    # Minimum similarity for an existing surrogate to be reused
+    SIMILARITY_THRESHOLD = 0.8
+
+    def __init__(self, db_path: str = "knowledge_base/physics_surrogates.db"):
+        self.db_path = db_path
+
+    def find_similar_surrogate(self, config: dict) -> Optional[SurrogateMatch]:
+        """Find existing surrogate that could transfer to this problem."""
+        # Extract features from config
+        features = self._extract_problem_features(config)
+
+        # Query similar problems (expected to be sorted best-first)
+        matches = self._query_similar(features)
+
+        # Return best match if similarity > threshold
+        if matches and matches[0].similarity > self.SIMILARITY_THRESHOLD:
+            return matches[0]
+        return None
+
+    def save_surrogate(self, study_name: str, model_path: str,
+                       config: dict, metrics: dict) -> None:
+        """Save trained surrogate to knowledge base."""
+        # Store model and metadata
+        # Index for future similarity search
+        raise NotImplementedError
+
+    def transfer_learn(self, base_surrogate_id: int,
+                       new_config: dict) -> nn.Module:
+        """Create new surrogate by transfer learning from existing one."""
+        # Load base model
+        # Freeze early layers
+        # Fine-tune on new data
+        raise NotImplementedError
+```
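
The plan leaves the similarity measure open. One simple candidate for the private feature-extraction and query helpers is a weighted Jaccard overlap of tags, parameter names, and objective names, sketched below. Everything here is an assumption for illustration: the function names, the weights, and the expectation that each entry in `design_variables` and `objectives` carries a `"name"` key.

```python
from typing import Optional


def extract_problem_features(config: dict) -> dict:
    """Collect the sets used for matching from an optimization config."""
    kb = config.get("surrogate_settings", {}).get("knowledge_base", {})
    return {
        "tags": set(kb.get("tags", [])),
        # Assumes each design variable / objective is a dict with a "name" key.
        "parameters": {v["name"] for v in config.get("design_variables", [])},
        "objectives": {o["name"] for o in config.get("objectives", [])},
    }


def jaccard(a: set, b: set) -> float:
    """Size of the intersection over size of the union; 1.0 if both are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(features_a: dict, features_b: dict,
               weights: Optional[dict] = None) -> float:
    """Weighted average of per-feature Jaccard scores, between 0.0 and 1.0."""
    # Illustrative weights that sum to 1; they would need tuning on real studies.
    weights = weights or {"tags": 0.4, "parameters": 0.3, "objectives": 0.3}
    return sum(w * jaccard(features_a[k], features_b[k]) for k, w in weights.items())
```

A score from this function could be compared against the 0.8 threshold used in `find_similar_surrogate`. The `geometry_embedding` and `physics_embedding` columns in the schema suggest a vector-based comparison is the longer-term intent, so set overlap would mainly serve as a first version.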
+
+---
+
+## Phase 4: Dashboard Integration
+
+### New Dashboard Pages
+
+#### 1. Surrogate Status Panel (in existing Dashboard)
+```
+┌─────────────────────────────────────────────────────────┐
+│ SURROGATE STATUS                                        │
+├─────────────────────────────────────────────────────────┤
+│ Mode: Hybrid (NN + FEA Validation)                      │
+│ Stage: NN-Accelerated Optimization                      │
+│                                                         │
+│ Training Data: 150 samples (50 opt + 100 LHS)          │
+│ Model Accuracy: MAPE 1.8% mass, 1.1% freq              │
+│ Speedup: ~50x (10ms NN vs 500ms FEA)                   │
+│                                                         │
+│ [View Report] [Retrain] [Disable NN]                   │
+└─────────────────────────────────────────────────────────┘
+```
+
+#### 2. Knowledge Base Browser
+```
+┌─────────────────────────────────────────────────────────┐
+│ PHYSICS KNOWLEDGE BASE                                  │
+├─────────────────────────────────────────────────────────┤
+│ Stored Surrogates: 12                                   │
+│                                                         │
+│ [Cantilever Beams]  5 models, avg MAPE 2.1%            │
+│ [Shell Structures]  3 models, avg MAPE 3.4%            │
+│ [Solid Parts]       4 models, avg MAPE 4.2%            │
+│                                                         │
+│ Search: [aluminum modal_______] [Find Similar]          │
+│                                                         │
+│ Matching Models:                                        │
+│ - uav_arm_v2 (92% match) - Transfer Learning Available │
+│ - bracket_opt (78% match)                              │
+└─────────────────────────────────────────────────────────┘
+```
+
+---
+
+## Phase 5: User Workflow (Non-Coder Experience)
+
+### Scenario: New Optimization with NN Acceleration
+
+```
+Step 1: Create Study via Dashboard
+┌─────────────────────────────────────────────────────────┐
+│ NEW OPTIMIZATION STUDY                                  │
+├─────────────────────────────────────────────────────────┤
+│ Study Name: [drone_motor_mount___________]              │
+│ Description: [Motor mount bracket________]              │
+│                                                         │
+│ Model File: [Browse...] drone_mount.prt                │
+│ Sim File:   [Browse...] drone_mount_sim.sim            │
+│                                                         │
+│ ☑ Enable Neural Network Acceleration                    │
+│   ├─ Initial FEA Trials: [30____]                      │
+│   ├─ Training Samples:   [100___]                      │
+│   ├─ Target Accuracy:    [10% MAPE]                    │
+│   └─ ☑ Save to Knowledge Base                          │
+│                                                         │
+│ Similar existing model found: "uav_arm_optimization"   │
+│ ☑ Use as starting point (transfer learning)            │
+│                                                         │
+│ [Create Study]                                          │
+└─────────────────────────────────────────────────────────┘
+
+Step 2: System Automatically Executes Protocol 12
+- User sees progress in dashboard
+- No command-line needed
+- All stages automated
+
+Step 3: Review Results
+- Pareto front with FEA-validated designs
+- NN performance report
+- Knowledge saved for future use
+```
+
+---
+
+## Implementation Roadmap
+
+### Phase 1: Config Schema Extension (1-2 days)
+- [ ] Define surrogate_settings schema
+- [ ] Update config validator
+- [ ] Create migration for existing configs
+
+### Phase 2: Protocol 12 Runner (3-5 days)
+- [ ] Create HybridSurrogateRunner class
+- [ ] Implement stage transitions
+- [ ] Add progress callbacks for dashboard
+- [ ] Integrate existing scripts as modules
+
+### Phase 3: Knowledge Base (2-3 days)
+- [ ] Create SQLite schema
+- [ ] Implement PhysicsKnowledgeBase API
+- [ ] Add similarity search
+- [ ] Basic transfer learning
+
+### Phase 4: Dashboard Integration (2-3 days)
+- [ ] Surrogate status panel
+- [ ] Knowledge base browser
+- [ ] Study creation wizard with NN options
+
+### Phase 5: Documentation & Testing (1-2 days)
+- [ ] User guide for non-coders
+- [ ] Integration tests
+- [ ] Example workflows
+
+---
+
+## Data Flow Architecture
+
+```
+                    ┌──────────────────────────────────────┐
+                    │      optimization_config.json        │
+                    │  (Single source of truth for study)  │
+                    └──────────────────┬───────────────────┘
+                                       │
+                    ┌──────────────────▼───────────────────┐
+                    │         Protocol 12 Runner           │
+                    │    (Orchestrates entire workflow)    │
+                    └──────────────────┬───────────────────┘
+                                       │
+         ┌─────────────────┬───────────┼───────────┬─────────────────┐
+         │                 │           │           │                 │
+         ▼                 ▼           ▼           ▼                 ▼
+    ┌─────────┐      ┌─────────┐ ┌─────────┐ ┌─────────┐      ┌─────────┐
+    │  FEA    │      │Training │ │Surrogate│ │   NN    │      │Knowledge│
+    │ Solver  │      │  Data   │ │ Trainer │ │  Optim  │      │  Base   │
+    └────┬────┘      └────┬────┘ └────┬────┘ └────┬────┘      └────┬────┘
+         │                │           │           │                 │
+         ▼                ▼           ▼           ▼                 ▼
+    ┌─────────────────────────────────────────────────────────────────┐
+    │                         study.db                                 │
+    │  (Optuna trials + training data + surrogate metadata)           │
+    └─────────────────────────────────────────────────────────────────┘
+                                       │
+                    ┌──────────────────▼───────────────────┐
+                    │        physics_surrogates.db         │
+                    │   (Master knowledge base - global)   │
+                    └──────────────────────────────────────┘
+```
+
+---
+
+## Key Benefits
+
+### For Non-Coders
+1. **Single JSON config** - No Python scripts to run manually
+2. **Dashboard control** - Start/stop/monitor from browser
+3. **Automatic recommendations** - System suggests best settings
+4. **Knowledge reuse** - Similar problems get free speedup
+
+### For the Organization
+1. **Institutional memory** - Physics knowledge persists
+2. **Faster iterations** - Each new study benefits from past work
+3. **Reproducibility** - Everything tracked in databases
+4. **Scalability** - Add more workers, train better models
+
+### For the Workflow
+1. **End-to-end automation** - No manual steps between stages
+2. **Adaptive optimization** - System learns during run
+3. **Validated results** - Top candidates always FEA-verified
+4. **Rich reporting** - Performance metrics, comparisons, recommendations
+
+---
+
+## Next Steps
+
+1. **Review this plan** - Get feedback on priorities
+2. **Start with config schema** - Extend optimization_config.json
+3. **Build Protocol 12** - Core automation logic
+4. **Knowledge Base MVP** - Basic save/load functionality
+5. **Dashboard integration** - Visual control panel
+
+---
+
+*Document Version: 1.0*
+*Created: 2025-11-25*
+*Author: Claude Code + Antoine*