# Neural Network Surrogate Automation Plan

## Vision: One-Click ML-Accelerated Optimization

Make neural network surrogates **first-class citizens** in Atomizer, fully integrated into the optimization workflow so that:

1. Non-coders can enable and configure NN acceleration via JSON config
2. The system automatically builds, trains, and validates surrogates
3. Knowledge accumulates in a reusable "Physics Knowledge Base"
4. The dashboard provides full visibility and control

---

## Current State (What We Have)

```
Manual Steps Required Today:
1. Run optimization (30+ FEA trials)
2. Manually run: generate_training_data.py
3. Manually run: run_training_fea.py
4. Manually run: train_nn_surrogate.py
5. Manually run: generate_nn_report.py
6. Manually enable --enable-nn flag
7. No persistent knowledge storage
```

---

## Target State (What We Want)

```
Automated Flow:
1. User creates optimization_config.json with surrogate_settings
2. User runs: python run_optimization.py --trials 100
3. System automatically:
   - Runs initial FEA exploration (20-30 trials)
   - Generates space-filling training points
   - Runs parallel FEA on training points
   - Trains and validates surrogate
   - Switches to NN-accelerated optimization
   - Validates top candidates with real FEA
   - Stores learned physics in Knowledge Base
```

---

## Phase 1: Extended Configuration Schema

### Current optimization_config.json

```json
{
  "study_name": "uav_arm_optimization",
  "optimization_settings": {
    "protocol": "protocol_11_multi_objective",
    "n_trials": 30
  },
  "design_variables": [...],
  "objectives": [...],
  "constraints": [...]
}
```

### Proposed Extended Schema

```json
{
  "study_name": "uav_arm_optimization",
  "description": "UAV Camera Support Arm",
  "engineering_context": "Drone gimbal arm for 850g camera payload",

  "optimization_settings": {
    "protocol": "protocol_12_hybrid_surrogate",
    "n_trials": 200,
    "sampler": "NSGAIISampler"
  },

  "design_variables": [...],
  "objectives": [...],
  "constraints": [...],

  "surrogate_settings": {
    "enabled": true,
    "mode": "auto",

    "training": {
      "initial_fea_trials": 30,
      "space_filling_samples": 100,
      "sampling_method": "lhs_with_corners",
      "parallel_workers": 2
    },

    "model": {
      "architecture": "mlp",
      "hidden_layers": [64, 128, 64],
      "validation_method": "5_fold_cv",
      "min_accuracy_mape": 10.0,
      "retrain_threshold": 15.0
    },

    "optimization": {
      "nn_trials_per_fea": 50,
      "validate_top_n": 5,
      "adaptive_sampling": true
    },

    "knowledge_base": {
      "save_to_master": true,
      "master_db_path": "knowledge_base/physics_surrogates.db",
      "tags": ["cantilever", "aluminum", "modal", "static"],
      "reuse_similar": true
    }
  },

  "simulation": {...},
  "reporting": {...}
}
```

---

## Phase 2: Protocol 12 - Hybrid Surrogate Optimization

### Workflow Stages

```
┌─────────────────────────────────────────────────────────────────────┐
│                    PROTOCOL 12: HYBRID SURROGATE                    │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  STAGE 1: EXPLORATION (FEA Only)                                    │
│  ├─ Run initial_fea_trials with real FEA                            │
│  ├─ Build baseline Pareto front                                     │
│  └─ Assess design space complexity                                  │
│                                                                     │
│  STAGE 2: TRAINING DATA GENERATION                                  │
│  ├─ Generate space_filling_samples (LHS + corners)                  │
│  ├─ Run parallel FEA on training points                             │
│  ├─ Store all results in training_data.db                           │
│  └─ Monitor for failures, retry if needed                           │
│                                                                     │
│  STAGE 3: SURROGATE TRAINING                                        │
│  ├─ Train NN on combined data (optimization + training)             │
│  ├─ Validate with k-fold cross-validation                           │
│  ├─ Check CV MAPE <= min_accuracy_mape                              │
│  └─ Generate performance report                                     │
│                                                                     │
│  STAGE 4: NN-ACCELERATED OPTIMIZATION                               │
│  ├─ Run nn_trials_per_fea NN evaluations per FEA validation         │
│  ├─ Validate top_n candidates with real FEA                         │
│  ├─ Update surrogate with new data (adaptive)                       │
│  └─ Repeat until n_trials reached                                   │
│                                                                     │
│  STAGE 5: FINAL VALIDATION & REPORTING                              │
│  ├─ Validate all Pareto-optimal designs with FEA                    │
│  ├─ Generate comprehensive report                                   │
│  └─ Save learned physics to Knowledge Base                          │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
```

### Implementation: runner_protocol_12.py

```python
class HybridSurrogateRunner:
    """Protocol 12: Automated hybrid FEA/NN optimization."""

    def __init__(self, config: dict):
        self.config = config
        self.surrogate_config = config.get('surrogate_settings', {})
        self.stage = "exploration"

    def run(self):
        # Stage 1: Exploration (always runs, FEA only)
        self.run_exploration_stage()

        # Stages 2-4 run only when the surrogate is enabled in the config
        if self.surrogate_config.get('enabled', False):
            # Stage 2: Training Data
            self.generate_training_data()
            self.run_parallel_fea_training()

            # Stage 3: Train Surrogate
            self.train_and_validate_surrogate()

            # Stage 4: NN-Accelerated
            self.run_nn_accelerated_optimization()

        # Stage 5: Final
        self.validate_and_report()
        self.save_to_knowledge_base()
```

---

## Phase 3: Physics Knowledge Base Architecture

### Purpose

Store learned physics relationships so future optimizations can:

1. **Warm-start** with pre-trained surrogates for similar problems
2. **Transfer learn** from related geometries/materials
3. **Build institutional knowledge** over time

### Database Schema: physics_surrogates.db

```sql
-- Master registry of all trained surrogates
CREATE TABLE surrogates (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    study_name TEXT,

    -- Problem characterization
    geometry_type TEXT,      -- 'cantilever', 'plate', 'shell', 'solid'
    material_family TEXT,    -- 'aluminum', 'steel', 'composite'
    analysis_types TEXT,     -- JSON: ['static', 'modal', 'buckling']

    -- Design space
    n_parameters INTEGER,
    parameter_names TEXT,    -- JSON array
    parameter_bounds TEXT,   -- JSON: {name: [min, max]}

    -- Objectives & Constraints
    objectives TEXT,         -- JSON: [{name, goal}]
    constraints TEXT,        -- JSON: [{name, type, threshold}]

    -- Model info
    model_path TEXT,         -- Path to .pt file
    architecture TEXT,       -- JSON: model architecture
    training_samples INTEGER,

    -- Performance metrics
    cv_mape_mass REAL,
    cv_mape_frequency REAL,
    cv_r2_mass REAL,
    cv_r2_frequency REAL,

    -- Metadata
    tags TEXT,               -- JSON array for search
    description TEXT,
    engineering_context TEXT
);

-- Training data for each surrogate
CREATE TABLE training_data (
    id INTEGER PRIMARY KEY,
    surrogate_id INTEGER REFERENCES surrogates(id),

    -- Input parameters (raw values and normalized 0-1)
    params_json TEXT,
    params_normalized TEXT,

    -- Output values
    mass REAL,
    frequency REAL,
    max_displacement REAL,
    max_stress REAL,

    -- Source
    source TEXT,             -- 'optimization', 'lhs', 'corner', 'adaptive'
    fea_timestamp TIMESTAMP
);

-- Similarity index for finding related problems
CREATE TABLE problem_similarity (
    surrogate_id INTEGER REFERENCES surrogates(id),

    -- Embedding for similarity search
    geometry_embedding BLOB, -- Vector embedding of geometry type
    physics_embedding BLOB,  -- Vector embedding of physics signature

    -- Precomputed similarity features
    feature_vector TEXT      -- JSON: normalized features for matching
);
```

### Knowledge Base API

```python
from typing import Optional

import torch.nn as nn


class PhysicsKnowledgeBase:
    """Central repository for learned physics surrogates."""

    def __init__(self, db_path: str = "knowledge_base/physics_surrogates.db"):
        self.db_path = db_path

    # SurrogateMatch is a result type with a `.similarity` score (definition not shown)
    def find_similar_surrogate(self, config: dict) -> Optional["SurrogateMatch"]:
        """Find existing surrogate that could transfer to this problem."""
        # Extract features from config
        features = self._extract_problem_features(config)

        # Query similar problems
        matches = self._query_similar(features)

        # Return best match if similarity > threshold
        if matches and matches[0].similarity > 0.8:
            return matches[0]
        return None

    def save_surrogate(self, study_name: str, model_path: str,
                       config: dict, metrics: dict):
        """Save trained surrogate to knowledge base."""
        # Store model and metadata
        # Index for future similarity search
        pass

    def transfer_learn(self, base_surrogate_id: int,
                       new_config: dict) -> nn.Module:
        """Create new surrogate by transfer learning from existing one."""
        # Load base model
        # Freeze early layers
        # Fine-tune on new data
        pass
```

---

## Phase 4: Dashboard Integration

### New Dashboard Pages

#### 1. Surrogate Status Panel (in existing Dashboard)

```
┌─────────────────────────────────────────────────────────┐
│                    SURROGATE STATUS                     │
├─────────────────────────────────────────────────────────┤
│ Mode: Hybrid (NN + FEA Validation)                      │
│ Stage: NN-Accelerated Optimization                      │
│                                                         │
│ Training Data: 150 samples (50 opt + 100 LHS)           │
│ Model Accuracy: MAPE 1.8% mass, 1.1% freq               │
│ Speedup: ~50x (10ms NN vs 500ms FEA)                    │
│                                                         │
│ [View Report] [Retrain] [Disable NN]                    │
└─────────────────────────────────────────────────────────┘
```

#### 2. Knowledge Base Browser

```
┌─────────────────────────────────────────────────────────┐
│                 PHYSICS KNOWLEDGE BASE                  │
├─────────────────────────────────────────────────────────┤
│ Stored Surrogates: 12                                   │
│                                                         │
│ [Cantilever Beams]   5 models, avg MAPE 2.1%            │
│ [Shell Structures]   3 models, avg MAPE 3.4%            │
│ [Solid Parts]        4 models, avg MAPE 4.2%            │
│                                                         │
│ Search: [aluminum modal_______] [Find Similar]          │
│                                                         │
│ Matching Models:                                        │
│ - uav_arm_v2 (92% match) - Transfer Learning Available  │
│ - bracket_opt (78% match)                               │
└─────────────────────────────────────────────────────────┘
```

---

## Phase 5: User Workflow (Non-Coder Experience)

### Scenario: New Optimization with NN Acceleration

```
Step 1: Create Study via Dashboard
┌─────────────────────────────────────────────────────────┐
│                 NEW OPTIMIZATION STUDY                  │
├─────────────────────────────────────────────────────────┤
│ Study Name:  [drone_motor_mount___________]             │
│ Description: [Motor mount bracket________]              │
│                                                         │
│ Model File:  [Browse...] drone_mount.prt                │
│ Sim File:    [Browse...] drone_mount_sim.sim            │
│                                                         │
│ ☑ Enable Neural Network Acceleration                    │
│ ├─ Initial FEA Trials: [30____]                         │
│ ├─ Training Samples: [100___]                           │
│ ├─ Target Accuracy: [10% MAPE]                          │
│ └─ ☑ Save to Knowledge Base                             │
│                                                         │
│ Similar existing model found: "uav_arm_optimization"    │
│ ☑ Use as starting point (transfer learning)             │
│                                                         │
│                     [Create Study]                      │
└─────────────────────────────────────────────────────────┘

Step 2: System Automatically Executes Protocol 12
- User sees progress in dashboard
- No command-line needed
- All stages automated

Step 3: Review Results
- Pareto front with FEA-validated designs
- NN performance report
- Knowledge saved for future use
```

---

## Implementation Roadmap

### Phase 1: Config Schema Extension (1-2 days)
- [ ] Define surrogate_settings schema
- [ ] Update config validator
- [ ] Create migration for existing configs

### Phase 2: Protocol 12 Runner (3-5 days)
- [ ] Create HybridSurrogateRunner class
- [ ] Implement stage transitions
- [ ] Add progress callbacks for dashboard
- [ ] Integrate existing scripts as modules

### Phase 3: Knowledge Base (2-3 days)
- [ ] Create SQLite schema
- [ ] Implement PhysicsKnowledgeBase API
- [ ] Add similarity search
- [ ] Basic transfer learning

### Phase 4: Dashboard Integration (2-3 days)
- [ ] Surrogate status panel
- [ ] Knowledge base browser
- [ ] Study creation wizard with NN options

### Phase 5: Documentation & Testing (1-2 days)
- [ ] User guide for non-coders
- [ ] Integration tests
- [ ] Example workflows

---

## Data Flow Architecture

```
                ┌──────────────────────────────────────┐
                │       optimization_config.json       │
                │  (Single source of truth for study)  │
                └──────────────────┬───────────────────┘
                                   │
                ┌──────────────────▼───────────────────┐
                │          Protocol 12 Runner          │
                │    (Orchestrates entire workflow)    │
                └──────────────────┬───────────────────┘
                                   │
     ┌─────────────────┬───────────┼───────────┬─────────────────┐
     │                 │           │           │                 │
     ▼                 ▼           ▼           ▼                 ▼
┌─────────┐       ┌─────────┐ ┌─────────┐ ┌─────────┐       ┌─────────┐
│   FEA   │       │Training │ │Surrogate│ │   NN    │       │Knowledge│
│ Solver  │       │  Data   │ │ Trainer │ │  Optim  │       │  Base   │
└────┬────┘       └────┬────┘ └────┬────┘ └────┬────┘       └────┬────┘
     │                 │           │           │                 │
     ▼                 ▼           ▼           ▼                 ▼
┌─────────────────────────────────────────────────────────────────────┐
│                              study.db                               │
│        (Optuna trials + training data + surrogate metadata)         │
└─────────────────────────────────────────────────────────────────────┘
                                   │
                ┌──────────────────▼───────────────────┐
                │        physics_surrogates.db         │
                │   (Master knowledge base - global)   │
                └──────────────────────────────────────┘
```

---

## Key Benefits

### For Non-Coders
1. **Single JSON config** - No Python scripts to run manually
2. **Dashboard control** - Start/stop/monitor from browser
3. **Automatic recommendations** - System suggests best settings
4. **Knowledge reuse** - Similar problems get free speedup

### For the Organization
1. **Institutional memory** - Physics knowledge persists
2. **Faster iterations** - Each new study benefits from past work
3. **Reproducibility** - Everything tracked in databases
4. **Scalability** - Add more workers, train better models

### For the Workflow
1. **End-to-end automation** - No manual steps between stages
2. **Adaptive optimization** - System learns during run
3. **Validated results** - Top candidates always FEA-verified
4. **Rich reporting** - Performance metrics, comparisons, recommendations

---

## Next Steps

1. **Review this plan** - Get feedback on priorities
2. **Start with config schema** - Extend optimization_config.json
3. **Build Protocol 12** - Core automation logic
4. **Knowledge Base MVP** - Basic save/load functionality
5. **Dashboard integration** - Visual control panel

---

*Document Version: 1.0*
*Created: 2025-11-25*
*Author: Claude Code + Antoine*
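
---

## Appendix: Sketch of `lhs_with_corners` Sampling

The proposed `surrogate_settings.training` block names a `sampling_method` of `"lhs_with_corners"`, and Stage 2 of Protocol 12 describes it only as "LHS + corners". The plan does not define the algorithm, so the following is a minimal sketch of one plausible reading: every corner of the design space plus a Latin hypercube over the interior. The function name `lhs_with_corners`, its signature, and the parameter names `beam_thickness` and `hole_diameter` are assumptions for illustration, not existing Atomizer code.

```python
import itertools
import random


def lhs_with_corners(bounds, n_samples, seed=0):
    """Return corner points followed by n_samples Latin hypercube points.

    bounds maps a parameter name to (low, high). Hypothetical helper:
    nothing here is taken from the existing Atomizer scripts.
    """
    rng = random.Random(seed)
    names = list(bounds)

    # Corners: every combination of lower/upper bound, i.e. 2**d points.
    corners = [
        dict(zip(names, combo))
        for combo in itertools.product(*(bounds[name] for name in names))
    ]

    # Latin hypercube: split each axis into n_samples equal strata, draw one
    # value inside each stratum, then shuffle the strata independently per axis.
    columns = {}
    for name in names:
        low, high = bounds[name]
        width = (high - low) / n_samples
        values = [low + (i + rng.random()) * width for i in range(n_samples)]
        rng.shuffle(values)
        columns[name] = values

    lhs = [{name: columns[name][i] for name in names} for i in range(n_samples)]
    return corners + lhs


# Example with two made-up design variables and space_filling_samples = 100.
example_bounds = {"beam_thickness": (2.0, 10.0), "hole_diameter": (5.0, 20.0)}
example_points = lhs_with_corners(example_bounds, n_samples=100, seed=42)
print(len(example_points))  # 4 corners + 100 LHS points = 104
```

The corner count grows as 2^d with the number of design variables, so for studies with many parameters a real implementation would probably need to cap or subsample the corners; that choice is left open here.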