# SYS_16: Self-Aware Turbo (SAT) Optimization

## Version: 1.0
## Status: PROPOSED
## Created: 2025-12-28

---

## Problem Statement

V5 surrogate + L-BFGS failed catastrophically because:
1. MLP predicted WS=280 but actual was WS=376 (30%+ error)
2. L-BFGS descended to regions **outside training distribution**
3. Surrogate had no way to signal uncertainty
4. All L-BFGS solutions converged to the same "fake optimum"

**Root cause:** The surrogate is overconfident in regions where it has no data.

---

## Solution: Uncertainty-Aware Surrogate with Active Learning

### Core Principles

1. **Never trust a point prediction** - Always require uncertainty bounds
2. **High uncertainty = run FEA** - Don't optimize where you don't know
3. **Actively fill gaps** - Prioritize FEA in high-uncertainty regions
4. **Validate gradient solutions** - Check L-BFGS results against FEA before trusting

---

## Architecture

### 1. Ensemble Surrogate (Epistemic Uncertainty)

Instead of one MLP, train **N independent models** with different initializations:

```python
import numpy as np


class EnsembleSurrogate:
    def __init__(self, n_models=5):
        # MLP is the project's surrogate model class (defined elsewhere)
        self.models = [MLP() for _ in range(n_models)]

    def predict(self, x):
        preds = [m.predict(x) for m in self.models]
        mean = np.mean(preds, axis=0)
        std = np.std(preds, axis=0)  # Epistemic uncertainty
        return mean, std

    def is_confident(self, x, threshold=0.1):
        mean, std = self.predict(x)
        # Confident if std < 10% of |mean| (abs keeps the ratio valid for negative means)
        return (std / (np.abs(mean) + 1e-6)) < threshold
```

**Why this works:** Models trained from different random seeds tend to agree where training data is dense and to diverge where they must extrapolate, so the spread of their predictions serves as a proxy for epistemic uncertainty.

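
The disagreement effect can be reproduced on a toy problem. The sketch below is an illustration only, under stated assumptions: it swaps the MLPs for straight-line fits and swaps random initializations for bootstrap resampling (a related but different way of building an ensemble), and every name in it (`fit_line`, `bootstrap_ensemble`, `ensemble_predict`) is hypothetical rather than part of SAT.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def bootstrap_ensemble(xs, ys, n_models=5, seed=0):
    """Fit n_models lines, each on a bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    while len(models) < n_models:
        idx = [rng.randrange(n) for _ in range(n)]
        if len({xs[i] for i in idx}) < 2:
            continue  # degenerate resample: a line needs two distinct x values
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def ensemble_predict(models, x):
    """Return (mean, std) of the member predictions at x."""
    preds = [a * x + b for a, b in models]
    return statistics.fmean(preds), statistics.pstdev(preds)


# Toy data: y = 2x + 1 plus noise, sampled only on [0, 1].
rng = random.Random(42)
xs = [i / 19 for i in range(20)]
ys = [2 * x + 1 + rng.gauss(0, 0.1) for x in xs]

models = bootstrap_ensemble(xs, ys)
_, std_inside = ensemble_predict(models, 0.5)    # inside the sampled range
_, std_outside = ensemble_predict(models, 10.0)  # far outside it
print(f"std at x=0.5: {std_inside:.4f}, std at x=10: {std_outside:.4f}")
```

The member lines pivot around the centre of the sampled data, so small differences in slope grow into large differences in prediction far from it. An ensemble can still agree on a wrong answer, which is one reason to pair it with a separate out-of-distribution check.
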
### 2. Distance-Based OOD Detection

Track training data distribution and flag points that are "too far":

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors


class OODDetector:
    def __init__(self, X_train):
        self.X_train = X_train
        self.mean = X_train.mean(axis=0)
        self.std = X_train.std(axis=0)
        # Fit KNN for local density
        self.knn = NearestNeighbors(n_neighbors=5)
        self.knn.fit(X_train)

    def distance_to_training(self, x):
        """Return the mean distance to the 5 nearest training points."""
        distances, _ = self.knn.kneighbors(x.reshape(1, -1))
        return distances.mean()

    def is_in_distribution(self, x, threshold=2.0):
        """Check that every feature is within `threshold` std of the training mean."""
        z_scores = np.abs((x - self.mean) / (self.std + 1e-6))
        return z_scores.max() < threshold
```

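
In the class above, `is_in_distribution` uses only the z-score test, and the KNN distance is computed but never consulted. A z-score test alone cannot see holes inside the bounding region of the data. The following sketch shows one possible way to combine both checks. It is an assumption-laden illustration in plain Python: the helper names and the rule for calibrating the distance threshold (a quantile of leave-one-out distances) are hypothetical and not specified by this document.

```python
import math
import statistics


def knn_mean_distance(x, X_train, k=5, skip_self=False):
    """Mean Euclidean distance from x to its k nearest training points."""
    dists = sorted(math.dist(x, p) for p in X_train)
    if skip_self:
        dists = dists[1:]  # drop the zero distance from a training point to itself
    return statistics.fmean(dists[:k])


def calibrate_knn_threshold(X_train, k=5, quantile=0.95):
    """Hypothetical rule: a quantile of the leave-one-out kNN distances."""
    loo = sorted(knn_mean_distance(p, X_train, k, skip_self=True) for p in X_train)
    return loo[min(len(loo) - 1, int(quantile * len(loo)))]


def max_abs_z(x, X_train):
    """Largest per-feature |z-score| of x relative to the training data."""
    cols = list(zip(*X_train))
    return max(
        abs(xi - statistics.fmean(c)) / (statistics.pstdev(c) + 1e-6)
        for xi, c in zip(x, cols)
    )


def combined_in_distribution(x, X_train, z_threshold=2.0, k=5):
    """In distribution only if both the z-score and the kNN distance checks pass."""
    if max_abs_z(x, X_train) >= z_threshold:
        return False
    return knn_mean_distance(x, X_train, k) <= calibrate_knn_threshold(X_train, k)


# Two tight clusters with an empty gap between them.
cluster = [(0.1 * i, 0.1 * j) for i in range(4) for j in range(4)]
X_train = cluster + [(a + 10.0, b + 10.0) for a, b in cluster]

gap_point = (5.0, 5.0)  # near the global mean, but far from every sample
print(max_abs_z(gap_point, X_train) < 2.0)           # z-score check alone
print(combined_in_distribution(gap_point, X_train))  # combined check
```

The point in the gap sits close to the global mean, so the z-score test accepts it even though no training sample is nearby; the distance test is what rejects it.
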
### 3. Trust-Region L-BFGS

Constrain L-BFGS to stay within training distribution:

```python
from scipy.optimize import minimize


def trust_region_lbfgs(surrogate, ood_detector, x0, max_iter=100):
    """L-BFGS that respects training data boundaries."""

    def constrained_objective(x):
        # If OOD, return large penalty
        if not ood_detector.is_in_distribution(x):
            return 1e9

        mean, std = surrogate.predict(x)
        # If uncertain, return upper confidence bound (pessimistic)
        if std > 0.1 * abs(mean):
            return mean + 2 * std  # Be conservative

        return mean

    # Cap the number of L-BFGS-B iterations at max_iter
    result = minimize(constrained_objective, x0, method='L-BFGS-B',
                      options={'maxiter': max_iter})
    return result.x
```

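
One caveat with the constant `1e9` penalty: outside the trusted region the objective is flat, so a gradient estimated there is zero and the optimizer gets no signal pointing back toward the data. A possible alternative, sketched here as an assumption rather than as part of the SAT design, is to express the z-score limit as box bounds (the `L-BFGS-B` method of `scipy.optimize.minimize` accepts a `bounds` argument) and, where a penalty is still wanted, to let it grow smoothly with the violation. The helper names below are hypothetical.

```python
def box_bounds(means, stds, z_threshold=2.0):
    """Per-parameter (low, high) bounds equivalent to |z| <= z_threshold."""
    return [(m - z_threshold * s, m + z_threshold * s) for m, s in zip(means, stds)]


def smooth_ood_penalty(x, means, stds, z_threshold=2.0, weight=100.0):
    """Quadratic penalty on z-score excess: zero inside, growing smoothly outside."""
    total = 0.0
    for xi, m, s in zip(x, means, stds):
        excess = abs(xi - m) / (s + 1e-6) - z_threshold
        if excess > 0:
            total += excess ** 2
    return weight * total


means, stds = [0.0, 5.0], [1.0, 2.0]
print(box_bounds(means, stds))                          # limits for bounds=...
print(smooth_ood_penalty([0.5, 5.0], means, stds))      # inside the region
print(smooth_ood_penalty([3.0, 5.0], means, stds))      # one z-unit outside
```

The `weight` value is arbitrary here and would need tuning against the scale of the objective.
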
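### 4. Acquisition Function with Uncertainty

Score candidates by **predicted improvement plus an uncertainty bonus**, in the spirit of the acquisition functions used in Bayesian Optimization (this is a simplified score, not the closed-form Expected Improvement):
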
```python
import numpy as np


def acquisition_score(x, surrogate, best_so_far):
    """Score = predicted improvement plus an exploration bonus."""
    mean, std = surrogate.predict(x)

    # Predicted improvement (positive when the predicted mean beats the best; we minimize)
    improvement = best_so_far - mean

    # Exploration bonus for uncertain regions
    exploration = 0.5 * std

    # High score = worth evaluating with FEA
    return improvement + exploration


def select_next_fea_candidates(surrogate, candidates, best_so_far, n=5):
    """Select candidates balancing exploitation and exploration."""
    scores = [acquisition_score(c, surrogate, best_so_far) for c in candidates]

    # Pick the n highest-scoring candidates, best first
    top_indices = np.argsort(scores)[::-1][:n]
    return [candidates[i] for i in top_indices]
```

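
For comparison, the standard closed-form Expected Improvement for a minimization problem can be written with the standard library alone. This is a sketch under one explicit assumption: the prediction at a point is treated as Gaussian with the ensemble mean and standard deviation, which an ensemble does not guarantee. The function name is illustrative.

```python
import math


def expected_improvement(mean, std, best_so_far):
    """Closed-form EI for minimization under a Gaussian predictive distribution."""
    if std <= 0.0:
        # No uncertainty: the improvement is deterministic and never negative.
        return max(best_so_far - mean, 0.0)
    z = (best_so_far - mean) / std
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))           # standard normal CDF
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)    # standard normal PDF
    return (best_so_far - mean) * cdf + std * pdf


# Same predicted mean, different uncertainty: the uncertain point scores higher.
print(expected_improvement(mean=310.0, std=1.0, best_so_far=300.0))
print(expected_improvement(mean=310.0, std=20.0, best_so_far=300.0))
```

Unlike the additive score above, EI is never negative and it weights the uncertainty by how likely the point is to beat the current best, so it needs no hand-tuned `0.5` exploration factor.
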

---

## Algorithm: Self-Aware Turbo (SAT)

```
INITIALIZE:
  - Load existing FEA data (X_train, Y_train)
  - Train ensemble surrogate on data
  - Fit OOD detector on X_train
  - Set best_ws = min(Y_train)

PHASE 1: UNCERTAINTY MAPPING (10% of budget)
  FOR i in 1..N_mapping:
    - Sample random point x
    - Get uncertainty: mean, std = surrogate.predict(x)
    - If std > threshold: run FEA, add to training data
    - Retrain ensemble periodically

  This fills in the "holes" in the surrogate's knowledge.

PHASE 2: EXPLOITATION WITH VALIDATION (80% of budget)
  FOR i in 1..N_exploit:
    - Generate 1000 TPE samples
    - Filter to keep only confident predictions (std < 10% of mean)
    - Filter to keep only in-distribution (OOD check)
    - Rank by predicted WS

    - Take top 5 candidates
    - Run FEA on all 5

    - For each FEA result:
      - Compare predicted vs actual
      - If error > 20%: mark region as "unreliable", force exploration there
      - If error < 10%: update best, retrain surrogate

    - Every 10 iterations: retrain ensemble with new data

PHASE 3: L-BFGS REFINEMENT (10% of budget)
  - Only run L-BFGS if ensemble R² > 0.95 on validation set
  - Use trust-region L-BFGS (stay within training distribution)

  FOR each L-BFGS solution:
    - Check ensemble disagreement
    - If models agree (std < 5%): run FEA to validate
    - If models disagree: skip, too uncertain

    - Compare L-BFGS prediction vs FEA
    - If error > 15%: ABORT L-BFGS phase, return to Phase 2
    - If error < 10%: accept as candidate

FINAL:
  - Return best FEA-validated design
  - Report uncertainty bounds for all objectives
```

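
The predicted-vs-actual check in Phase 2 can be sketched as a small helper. Two points are assumptions, not statements from the algorithm: the error is taken relative to the FEA value (the pseudocode does not say which value is the denominator), and errors between the two thresholds are reported as `"unspecified"` because the pseudocode defines no action for that band. The function names are hypothetical.

```python
def relative_error(predicted, actual):
    """Absolute prediction error as a fraction of the FEA (actual) value."""
    return abs(predicted - actual) / abs(actual)


def classify_validation(predicted, actual, accept_below=0.10, unreliable_above=0.20):
    """Map a predicted/actual pair onto the Phase 2 outcomes."""
    err = relative_error(predicted, actual)
    if err > unreliable_above:
        return "unreliable"   # mark region, force exploration there
    if err < accept_below:
        return "accept"       # update best, retrain surrogate
    return "unspecified"      # the pseudocode defines no action for this band


# The V5 failure case from the problem statement: predicted 280, FEA gave 376.
print(classify_validation(predicted=280.0, actual=376.0))
print(classify_validation(predicted=370.0, actual=376.0))
```

The same helper covers the Phase 3 check by passing `unreliable_above=0.15`.
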

---

## Key Differences from V5

| Aspect | V5 (Failed) | SAT (Proposed) |
|--------|-------------|----------------|
| **Model** | Single MLP | Ensemble of 5 MLPs |
| **Uncertainty** | None | Ensemble disagreement + OOD detection |
| **L-BFGS** | Trust blindly | Trust-region, validate every solution against FEA |
| **Extrapolation** | Accept | Reject or penalize |
| **Active learning** | No | Yes - prioritize uncertain regions |
| **Validation** | After L-BFGS | Throughout |

---

## Implementation Checklist

1. [ ] `EnsembleSurrogate` class with N=5 MLPs
2. [ ] `OODDetector` with KNN + z-score checks
3. [ ] `acquisition_score()` balancing exploitation/exploration
4. [ ] Trust-region L-BFGS with OOD penalties
5. [ ] Automatic retraining when new FEA data arrives
6. [ ] Logging of prediction errors to track surrogate quality
7. [ ] Early abort if L-BFGS predictions consistently wrong

---

## Expected Behavior

**In well-sampled regions:**
- Ensemble agrees → Low uncertainty → Trust predictions
- L-BFGS finds valid optima → FEA confirms → Success

**In poorly-sampled regions:**
- Ensemble disagrees → High uncertainty → Run FEA instead
- L-BFGS penalized → Stays in trusted zone → No fake optima

**At distribution boundaries:**
- OOD detector flags → Reject predictions
- Acquisition prioritizes → Active learning fills gaps

---

## Metrics to Track

1. **Surrogate R² on validation set** - Target > 0.95 before L-BFGS
2. **Prediction error histogram** - Should be centered at 0
3. **OOD rejection rate** - How often we refuse to predict
4. **Ensemble disagreement** - Average std across predictions
5. **L-BFGS success rate** - % of L-BFGS solutions that validate

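
The first two metrics are simple to compute from paired predictions and FEA results. The sketch below uses only the standard library; the function names and the `lbfgs_allowed` gate are illustrative assumptions, with the 0.95 threshold taken from the target above.

```python
import statistics


def r_squared(y_true, y_pred):
    """Coefficient of determination; y_true must not be constant."""
    mean_true = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_signed_error(y_true, y_pred):
    """Average of (predicted - actual); near 0 means no systematic bias."""
    return statistics.fmean(p - t for t, p in zip(y_true, y_pred))


def lbfgs_allowed(y_true, y_pred, min_r2=0.95):
    """Gate for Phase 3: only refine with L-BFGS above the R² target."""
    return r_squared(y_true, y_pred) > min_r2


# Hypothetical validation values, for illustration only.
actual = [300.0, 320.0, 350.0, 376.0]
predicted = [302.0, 318.0, 351.0, 370.0]
print(r_squared(actual, predicted), mean_signed_error(actual, predicted))
print(lbfgs_allowed(actual, predicted))
```

A mean signed error well away from zero suggests the surrogate is biased in one direction, as it was in the V5 case where the prediction came in below the FEA result.
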

---

## When to Use SAT vs Pure TPE

| Scenario | Recommendation |
|----------|----------------|
| < 100 existing samples | Pure TPE (not enough for good surrogate) |
| 100-500 samples | SAT Phase 1-2 only (no L-BFGS) |
| > 500 samples | Full SAT with L-BFGS refinement |
| High-dimensional (>20 params) | Pure TPE (curse of dimensionality) |
| Noisy FEA | Pure TPE (surrogates struggle with noise) |

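
The table can be encoded as a small decision helper. The table does not say which row wins when several apply, so the precedence used here (noise and dimensionality override sample count) is an assumption, as are the function name and the returned labels.

```python
def recommend_strategy(n_samples, n_params, noisy_fea=False):
    """Return a strategy label following the table above.

    Assumed precedence: noisy FEA and high dimensionality are checked first.
    """
    if noisy_fea or n_params > 20:
        return "pure_tpe"
    if n_samples < 100:
        return "pure_tpe"
    if n_samples <= 500:
        return "sat_phase_1_2"  # no L-BFGS refinement
    return "full_sat"


print(recommend_strategy(n_samples=300, n_params=10))
print(recommend_strategy(n_samples=1000, n_params=25))
```
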

---

## References

- Gaussian Process literature on uncertainty quantification
- Deep Ensembles: Lakshminarayanan et al. (2017)
- Bayesian Optimization with Expected Improvement
- Trust-region methods for constrained optimization

---

*The key insight: A surrogate that knows when it doesn't know is infinitely more valuable than one that's confidently wrong.*