# SYS_16: Self-Aware Turbo (SAT) Optimization

## Version: 3.0
## Status: VALIDATED
## Created: 2025-12-28
## Updated: 2025-12-31

---

## Quick Summary

**SAT v3 achieved WS=205.58, beating all previous methods (V7 TPE: 218.26, V6 TPE: 225.41).**

SAT is a surrogate-accelerated optimization method that:

1. Trains an **ensemble of 5 MLPs** on historical FEA data
2. Uses **adaptive exploration** that decreases over time (15% → 8% → 3%)
3. Filters candidates to prevent **duplicate evaluations**
4. Applies **soft mass constraints** in the acquisition function

---

## Version History

| Version | Study | Training Data | Key Fix | Best WS |
|---------|-------|---------------|---------|---------|
| v1 | V7 | 129 (V6 only) | - | 218.26 |
| v2 | V8 | 196 (V6 only) | Duplicate prevention | 271.38 |
| **v3** | **V9** | **556 (V5-V8)** | **Adaptive exploration + mass targeting** | **205.58** |

---

## Problem Statement

The V5 surrogate + L-BFGS approach failed catastrophically because:

1. The MLP predicted WS=280 but the actual value was WS=376 (30%+ error)
2. L-BFGS descended into regions **outside the training distribution**
3. The surrogate had no way to signal uncertainty
4. All L-BFGS solutions converged to the same "fake optimum"

**Root cause:** The surrogate is overconfident in regions where it has no data.

---

## Solution: Uncertainty-Aware Surrogate with Active Learning

### Core Principles

1. **Never trust a point prediction** - always require uncertainty bounds
2. **High uncertainty = run FEA** - don't optimize where you don't know
3. **Actively fill gaps** - prioritize FEA in high-uncertainty regions
4. **Validate gradient solutions** - check L-BFGS results against FEA before trusting them

---

## Architecture

### 1. Ensemble Surrogate (Epistemic Uncertainty)
Instead of one MLP, train **N independent models** with different initializations:

```python
import numpy as np

class EnsembleSurrogate:
    def __init__(self, n_models=5):
        self.models = [MLP() for _ in range(n_models)]

    def predict(self, x):
        preds = [m.predict(x) for m in self.models]
        mean = np.mean(preds, axis=0)
        std = np.std(preds, axis=0)  # Epistemic uncertainty
        return mean, std

    def is_confident(self, x, threshold=0.1):
        mean, std = self.predict(x)
        # Confident if std < 10% of mean
        return (std / (mean + 1e-6)) < threshold
```

**Why this works:** Models trained from different random seeds agree in well-sampled regions but disagree wildly in extrapolation regions.

### 2. Distance-Based OOD Detection

Track the training data distribution and flag points that are "too far":

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

class OODDetector:
    def __init__(self, X_train):
        self.X_train = X_train
        self.mean = X_train.mean(axis=0)
        self.std = X_train.std(axis=0)
        # Fit KNN for local density
        self.knn = NearestNeighbors(n_neighbors=5)
        self.knn.fit(X_train)

    def distance_to_training(self, x):
        """Return the mean distance to the nearest training points."""
        distances, _ = self.knn.kneighbors(x.reshape(1, -1))
        return distances.mean()

    def is_in_distribution(self, x, threshold=2.0):
        """Check if the point is within 2 std of the training data."""
        z_scores = np.abs((x - self.mean) / (self.std + 1e-6))
        return z_scores.max() < threshold
```

### 3. Trust-Region L-BFGS

Constrain L-BFGS to stay within the training distribution:

```python
from scipy.optimize import minimize

def trust_region_lbfgs(surrogate, ood_detector, x0, max_iter=100):
    """L-BFGS that respects training data boundaries."""
    def constrained_objective(x):
        # If OOD, return a large penalty
        if not ood_detector.is_in_distribution(x):
            return 1e9
        mean, std = surrogate.predict(x)
        # If uncertain, return the upper confidence bound (pessimistic)
        if std > 0.1 * mean:
            return mean + 2 * std  # Be conservative
        return mean

    result = minimize(constrained_objective, x0, method='L-BFGS-B',
                      options={'maxiter': max_iter})
    return result.x
```

### 4. Acquisition Function with Uncertainty
Use **Expected Improvement with an exploration bonus** (as in Bayesian optimization):

```python
import numpy as np

def acquisition_score(x, surrogate, best_so_far):
    """Score = potential improvement weighted by uncertainty."""
    mean, std = surrogate.predict(x)
    # Expected improvement (we minimize WS, so predictions below best are improvements)
    improvement = best_so_far - mean
    # Exploration bonus for uncertain regions
    exploration = 0.5 * std
    # High score = worth evaluating with FEA
    return improvement + exploration

def select_next_fea_candidates(surrogate, candidates, best_so_far, n=5):
    """Select candidates balancing exploitation and exploration."""
    scores = [acquisition_score(c, surrogate, best_so_far) for c in candidates]
    # Pick the top candidates by acquisition score
    top_indices = np.argsort(scores)[-n:]
    return [candidates[i] for i in top_indices]
```

---

## Algorithm: Self-Aware Turbo (SAT)

```
INITIALIZE:
  - Load existing FEA data (X_train, Y_train)
  - Train ensemble surrogate on the data
  - Fit OOD detector on X_train
  - Set best_ws = min(Y_train)

PHASE 1: UNCERTAINTY MAPPING (10% of budget)
  FOR i in 1..N_mapping:
    - Sample random point x
    - Get uncertainty: mean, std = surrogate.predict(x)
    - If std > threshold: run FEA, add to training data
    - Retrain ensemble periodically
  This fills in the "holes" in the surrogate's knowledge.
PHASE 2: EXPLOITATION WITH VALIDATION (80% of budget)
  FOR i in 1..N_exploit:
    - Generate 1000 TPE samples
    - Filter to keep only confident predictions (std < 10% of mean)
    - Filter to keep only in-distribution points (OOD check)
    - Rank by predicted WS
    - Take the top 5 candidates
    - Run FEA on all 5
    - For each FEA result:
      - Compare predicted vs actual
      - If error > 20%: mark the region as "unreliable", force exploration there
      - If error < 10%: update best, retrain surrogate
    - Every 10 iterations: retrain ensemble with new data

PHASE 3: L-BFGS REFINEMENT (10% of budget)
  - Only run L-BFGS if ensemble R² > 0.95 on a validation set
  - Use trust-region L-BFGS (stay within the training distribution)
  FOR each L-BFGS solution:
    - Check ensemble disagreement
    - If models agree (std < 5%): run FEA to validate
    - If models disagree: skip, too uncertain
    - Compare the L-BFGS prediction vs FEA
    - If error > 15%: ABORT the L-BFGS phase, return to Phase 2
    - If error < 10%: accept as a candidate

FINAL:
  - Return the best FEA-validated design
  - Report uncertainty bounds for all objectives
```

---

## Key Differences from V5

| Aspect | V5 (Failed) | SAT (Proposed) |
|--------|-------------|----------------|
| **Model** | Single MLP | Ensemble of 5 MLPs |
| **Uncertainty** | None | Ensemble disagreement + OOD detection |
| **L-BFGS** | Trust blindly | Trust-region, validate every step |
| **Extrapolation** | Accept | Reject or penalize |
| **Active learning** | No | Yes - prioritize uncertain regions |
| **Validation** | After L-BFGS | Throughout |

---

## Implementation Checklist

1. [ ] `EnsembleSurrogate` class with N=5 MLPs
2. [ ] `OODDetector` with KNN + z-score checks
3. [ ] `acquisition_score()` balancing exploitation/exploration
4. [ ] Trust-region L-BFGS with OOD penalties
5. [ ] Automatic retraining when new FEA data arrives
6. [ ] Logging of prediction errors to track surrogate quality
7. [ ] Early abort if L-BFGS predictions are consistently wrong

---

## Expected Behavior

**In well-sampled regions:**
- Ensemble agrees → low uncertainty → trust predictions
- L-BFGS finds valid optima → FEA confirms → success

**In poorly-sampled regions:**
- Ensemble disagrees → high uncertainty → run FEA instead
- L-BFGS is penalized → stays in the trusted zone → no fake optima

**At distribution boundaries:**
- OOD detector flags → reject predictions
- Acquisition prioritizes → active learning fills the gaps

---

## Metrics to Track

1. **Surrogate R² on a validation set** - target > 0.95 before L-BFGS
2. **Prediction error histogram** - should be centered at 0
3. **OOD rejection rate** - how often we refuse to predict
4. **Ensemble disagreement** - average std across predictions
5. **L-BFGS success rate** - % of L-BFGS solutions that validate

---

## When to Use SAT vs Pure TPE

| Scenario | Recommendation |
|----------|----------------|
| < 100 existing samples | Pure TPE (not enough data for a good surrogate) |
| 100-500 samples | SAT Phases 1-2 only (no L-BFGS) |
| > 500 samples | Full SAT with L-BFGS refinement |
| High-dimensional (> 20 params) | Pure TPE (curse of dimensionality) |
| Noisy FEA | Pure TPE (surrogates struggle with noise) |

---

## SAT v3 Implementation Details

### Adaptive Exploration Schedule

```python
def get_exploration_weight(trial_num):
    if trial_num <= 30:
        return 0.15  # Phase 1: 15% exploration
    elif trial_num <= 80:
        return 0.08  # Phase 2: 8% exploration
    else:
        return 0.03  # Phase 3: 3% exploration
```

### Acquisition Function (v3)

```python
# Normalize components
norm_ws = (pred_ws - pred_ws.min()) / (pred_ws.max() - pred_ws.min())
norm_dist = distances / distances.max()
mass_penalty = np.maximum(0, pred_mass - 118.0) * 5.0  # Soft threshold at 118 kg

# Adaptive acquisition (lower = better)
acquisition = norm_ws - exploration_weight * norm_dist + mass_penalty
```

### Candidate Generation (v3)

```python
for _ in range(1000):
    if random() < 0.7 and best_x is not None:  # 70%
        # exploitation: sample near the best point
        scale = uniform(0.05, 0.15)
        candidate = sample_near_point(best_x, scale)
    else:
        # 30% exploration: random sampling
        candidate = sample_random()
```

### Key Configuration (v3)

```json
{
  "n_ensemble_models": 5,
  "training_epochs": 800,
  "candidates_per_round": 1000,
  "min_distance_threshold": 0.03,
  "mass_soft_threshold": 118.0,
  "exploit_near_best_ratio": 0.7,
  "lbfgs_polish_trials": 10
}
```

---

## V9 Results

| Phase | Trials | Best WS | Mean WS |
|-------|--------|---------|---------|
| Phase 1 (explore) | 30 | 232.00 | 394.48 |
| Phase 2 (balanced) | 50 | 222.01 | 360.51 |
| Phase 3 (exploit) | 57+ | **205.58** | 262.57 |

**Key metrics:**
- 100% feasibility rate
- 100% unique designs (no duplicates)
- Surrogate R² = 0.99

---

## References

- Gaussian process literature on uncertainty quantification
- Deep Ensembles: Lakshminarayanan et al. (2017)
- Bayesian optimization with Expected Improvement
- Trust-region methods for constrained optimization

---

## Implementation

- **V9 Study:** `studies/M1_Mirror/m1_mirror_cost_reduction_flat_back_V9/`
- **Script:** `run_sat_optimization.py`
- **Ensemble:** `optimization_engine/surrogates/ensemble_surrogate.py`

---

*The key insight: a surrogate that knows when it doesn't know is infinitely more valuable than one that's confidently wrong.*
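## Appendix: Minimal SAT-Style Loop (Illustrative)

To show how the pieces above fit together, here is a toy, self-contained sketch of the SAT loop. It is not the production implementation: bootstrapped quadratic fits stand in for the MLP ensemble, and `fea()` is a hypothetical analytic stand-in for the real solver (its true minimum is WS = 205.0 at x = 0.3). The loop demonstrates the core mechanics: ensemble disagreement as uncertainty, duplicate prevention, an adaptive exploration weight, and FEA validation of every chosen candidate.

```python
import numpy as np

rng = np.random.default_rng(0)

def fea(x):
    # Hypothetical stand-in for the FEA solver (minimum WS = 205.0 at x = 0.3)
    return 1000.0 * (x - 0.3) ** 2 + 205.0

def fit_ensemble(X, y, n_models=5):
    # Bootstrap "ensemble": each member fits a quadratic to a resample,
    # mimicking MLPs trained from different random initializations
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X), len(X))
        models.append(np.polyfit(X[idx], y[idx], 2))
    return models

# Seed with a few "historical" evaluations
X = rng.uniform(0.0, 1.0, 12)
y = fea(X)
best_ws = y.min()

for trial in range(30):
    models = fit_ensemble(X, y)
    cands = rng.uniform(0.0, 1.0, 200)
    # Duplicate prevention: drop candidates too close to evaluated points
    cands = cands[np.min(np.abs(cands[:, None] - X[None, :]), axis=1) > 1e-3]
    # Ensemble prediction: mean plus disagreement (epistemic uncertainty)
    preds = np.array([np.polyval(m, cands) for m in models])
    mean, std = preds.mean(axis=0), preds.std(axis=0)
    # Adaptive acquisition (lower = better); the std bonus rewards exploration
    w = 0.15 if trial < 10 else 0.03
    x_next = cands[np.argmin(mean - w * std)]
    # Validate the chosen candidate with "FEA" and grow the training set
    y_next = fea(x_next)
    X, y = np.append(X, x_next), np.append(y, y_next)
    best_ws = min(best_ws, y_next)

print(round(best_ws, 2))
```

Because the ensemble agrees in well-sampled regions and disagrees elsewhere, the loop concentrates evaluations near the true optimum while still spending early trials exploring, converging on WS ≈ 205 without ever trusting an unvalidated surrogate prediction.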