feat: Add NN Quality Assessor with relative accuracy thresholds

The Method Selector now uses relative accuracy thresholds to assess NN
suitability by comparing NN error to problem variability (CV ratio).

NNQualityAssessor features:
- Physics-based objective classification (linear, smooth, nonlinear, chaotic)
- CV ratio computation: nn_error / coefficient_of_variation
- Turbo suitability score based on relative thresholds
- Data collection from validation_report.json, turbo_report.json, and study.db

Quality thresholds by objective type:
- Linear (mass, volume): max 2% error, CV ratio < 0.5
- Smooth (frequency): max 5% error, CV ratio < 1.0
- Nonlinear (stress, stiffness): max 10% error, CV ratio < 2.0
- Chaotic (contact, buckling): max 20% error, CV ratio < 3.0

CLI output now includes:
- Per-objective NN quality table with error, CV, ratio, and quality indicator
- Turbo suitability and hybrid suitability percentages
- Warnings when NN error exceeds physics-based thresholds

Updated SYS_15_METHOD_SELECTOR.md to v2.0 with full NN Quality Assessment
documentation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@@ -3,9 +3,9 @@
 <!--
 PROTOCOL: Adaptive Method Selector
 LAYER: System
-VERSION: 1.0
+VERSION: 2.0
 STATUS: Active
-LAST_UPDATED: 2025-12-06
+LAST_UPDATED: 2025-12-07
 PRIVILEGE: user
 LOAD_WITH: [SYS_10_IMSO, SYS_11_MULTI_OBJECTIVE, SYS_14_NEURAL_ACCELERATION]
 -->
@@ -16,9 +16,10 @@ The **Adaptive Method Selector (AMS)** analyzes optimization problems and recomm
 
 1. **Static Analysis**: Problem characteristics from config (dimensionality, objectives, constraints)
 2. **Dynamic Analysis**: Early FEA trial metrics (smoothness, correlations, feasibility)
-3. **Runtime Monitoring**: Continuous optimization performance assessment
+3. **NN Quality Assessment**: Relative accuracy thresholds comparing NN error to problem variability
+4. **Runtime Monitoring**: Continuous optimization performance assessment
 
-**Key Value**: Eliminates guesswork in choosing optimization strategies by providing data-driven recommendations.
+**Key Value**: Eliminates guesswork in choosing optimization strategies by providing data-driven recommendations with relative accuracy thresholds.
 
 ---
 
@@ -100,36 +101,90 @@ print(recommendation.alternatives) # Other methods with scores
 
 ---
 
+## NN Quality Assessment
+
+The method selector uses **relative accuracy thresholds** to assess NN suitability. Instead of absolute error limits, it compares NN error to the problem's natural variability (coefficient of variation).
+
+### Core Concept
+
+```
+NN Suitability = f(nn_error / coefficient_of_variation)
+
+If nn_error >> CV → NN is unreliable (not learning, just noise)
+If nn_error ≈ CV → NN captures the trend (hybrid recommended)
+If nn_error << CV → NN is excellent (turbo viable)
+```
+
+### Physics-Based Classification
+
+Objectives are classified by their expected predictability:
+
+| Objective Type | Examples | Max Expected Error | CV Ratio Limit |
+|----------------|----------|-------------------|----------------|
+| **Linear** | mass, volume | 2% | 0.5 |
+| **Smooth** | frequency, avg stress | 5% | 1.0 |
+| **Nonlinear** | max stress, stiffness | 10% | 2.0 |
+| **Chaotic** | contact, buckling | 20% | 3.0 |
+
+### CV Ratio Interpretation
+
+The **CV Ratio** = NN Error / (Coefficient of Variation × 100):
+
+| CV Ratio | Quality | Interpretation |
+|----------|---------|----------------|
+| < 0.5 | ✓ Great | NN captures physics much better than noise |
+| 0.5 - 1.0 | ✓ Good | NN adds significant value for exploration |
+| 1.0 - 2.0 | ~ OK | NN is marginal, use with validation |
+| > 2.0 | ✗ Poor | NN not learning effectively, use FEA |
+
+### Method Recommendations Based on Quality
+
+| Turbo Suitability | Hybrid Suitability | Recommendation |
+|-------------------|--------------------|-----------------------|
+| > 80% | any | **TURBO** - trust NN fully |
+| 50-80% | > 50% | **TURBO** with monitoring |
+| < 50% | > 50% | **HYBRID_LOOP** - verify periodically |
+| < 30% | < 50% | **PURE_FEA** or retrain first |
+
+### Data Sources
+
+NN quality metrics are collected from:
+1. `validation_report.json` - FEA validation results
+2. `turbo_report.json` - Turbo mode validation history
+3. `study.db` - Trial `nn_error_percent` user attributes
+
+---
+
 ## Architecture
 
 ```
-┌───────────────────────────────────────────────────────────────────┐
-│ AdaptiveMethodSelector │
-├───────────────────────────────────────────────────────────────────┤
-│ │
-│ ┌──────────────────┐ ┌──────────────────────┐ │
-│ │ ProblemProfiler │ │ EarlyMetricsCollector │ │
-│ │ (static analysis)│ │ (dynamic analysis) │ │
-│ └────────┬─────────┘ └──────────┬────────────┘ │
-│ │ │ │
-│ ▼ ▼ │
-│ ┌────────────────────────────────────────────────┐ │
-│ │ _score_methods() │ │
-│ │ (rule-based scoring with weighted factors) │ │
-│ └──────────────────────┬─────────────────────────┘ │
-│ │ │
-│ ▼ │
-│ ┌────────────────────────────────────────────────┐ │
-│ │ MethodRecommendation │ │
-│ │ method, confidence, parameters, reasoning │ │
-│ └────────────────────────────────────────────────┘ │
-│ │
-│ ┌──────────────────┐ │
-│ │ RuntimeAdvisor │ ← Monitors during optimization │
-│ │ (pivot advisor) │ │
-│ └──────────────────┘ │
-│ │
-└───────────────────────────────────────────────────────────────────┘
+┌─────────────────────────────────────────────────────────────────────────┐
+│ AdaptiveMethodSelector │
+├─────────────────────────────────────────────────────────────────────────┤
+│ │
+│ ┌─────────────────┐ ┌────────────────────┐ ┌───────────────────┐ │
+│ │ ProblemProfiler │ │EarlyMetricsCollector│ │ NNQualityAssessor │ │
+│ │(static analysis)│ │ (dynamic analysis) │ │ (NN accuracy) │ │
+│ └───────┬─────────┘ └─────────┬──────────┘ └─────────┬─────────┘ │
+│ │ │ │ │
+│ ▼ ▼ ▼ │
+│ ┌─────────────────────────────────────────────────────────────────┐ │
+│ │ _score_methods() │ │
+│ │ (rule-based scoring with static + dynamic + NN factors) │ │
+│ └───────────────────────────────┬─────────────────────────────────┘ │
+│ │ │
+│ ▼ │
+│ ┌─────────────────────────────────────────────────────────────────┐ │
+│ │ MethodRecommendation │ │
+│ │ method, confidence, parameters, reasoning, warnings │ │
+│ └─────────────────────────────────────────────────────────────────┘ │
+│ │
+│ ┌──────────────────┐ │
+│ │ RuntimeAdvisor │ ← Monitors during optimization │
+│ │ (pivot advisor) │ │
+│ └──────────────────┘ │
+│ │
+└─────────────────────────────────────────────────────────────────────────┘
 ```
 
 ---
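As a rough illustration of the CV ratio and quality bands documented in the hunk above, the following sketch computes a coefficient of variation from objective values and labels the resulting ratio. The function names are hypothetical and are not taken from `method_selector.py`. Using the population standard deviation, and treating the upper edge of each band as inclusive, are both assumptions.

```python
from statistics import mean, pstdev
from typing import Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Return std / |mean| as a fraction (0.16 means 16%).

    Assumption: population standard deviation; the real assessor may differ.
    """
    mu = mean(values)
    if mu == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return pstdev(values) / abs(mu)


def cv_ratio(nn_error_percent: float, cv_fraction: float) -> float:
    """CV Ratio = NN Error (%) / (Coefficient of Variation x 100)."""
    if cv_fraction <= 0:
        raise ValueError("CV must be positive")
    return nn_error_percent / (cv_fraction * 100.0)


def quality_label(ratio: float) -> str:
    """Map a CV ratio onto the bands of the CV Ratio Interpretation table."""
    if ratio < 0.5:
        return "Great"
    if ratio <= 1.0:  # assumption: band edges are inclusive on the upper side
        return "Good"
    if ratio <= 2.0:
        return "OK"
    return "Poor"
```

With the figures from the sample CLI output (3.7% NN error for mass against a 16.0% CV), `cv_ratio(3.7, 0.16)` is about 0.23, which falls in the Great band.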
@@ -173,16 +228,37 @@ class EarlyMetrics:
     variable_sensitivity: Dict[str, float]
 ```
 
-### 3. AdaptiveMethodSelector
+### 3. NNQualityAssessor
 
-Main entry point that combines static + dynamic analysis:
+Assesses NN surrogate quality relative to problem complexity:
 
 ```python
+@dataclass
+class NNQualityMetrics:
+    has_nn_data: bool = False
+    n_validations: int = 0
+    nn_errors: Dict[str, float]  # Absolute % error per objective
+    cv_ratios: Dict[str, float]  # nn_error / (CV * 100) per objective
+    expected_errors: Dict[str, float]  # Physics-based threshold
+    overall_quality: float  # 0-1, based on absolute thresholds
+    turbo_suitability: float  # 0-1, based on CV ratios
+    hybrid_suitability: float  # 0-1, more lenient threshold
+    objective_types: Dict[str, str]  # 'linear', 'smooth', 'nonlinear', 'chaotic'
+```
+
+### 4. AdaptiveMethodSelector
+
+Main entry point that combines static + dynamic + NN quality analysis:
+
+```python
 selector = AdaptiveMethodSelector(min_trials=20)
-recommendation = selector.recommend(config_path, db_path)
+recommendation = selector.recommend(config_path, db_path, results_dir=results_dir)
+
+# Access last NN quality for display
+print(f"Turbo suitability: {selector.last_nn_quality.turbo_suitability:.0%}")
 ```
 
-### 4. RuntimeAdvisor
+### 5. RuntimeAdvisor
 
 Monitors optimization progress and suggests pivots:
 
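As listed in the hunk above, `NNQualityMetrics` places fields without defaults after fields with defaults, which Python's `@dataclass` rejects with a `TypeError`. The sketch below shows one constructible variant; the `default_factory` and `0.0` defaults are assumptions. It also adds a hypothetical `recommend_from_quality()` helper that applies the Method Recommendations Based on Quality table. Combinations the table does not cover fall back to `PURE_FEA` here, which is also an assumption. The actual selector combines these scores with static and dynamic factors in `_score_methods()`, so this is not its real decision logic.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class NNQualityMetrics:
    # Every field carries a default so the dataclass can be constructed;
    # the specific default values are assumptions, not taken from the source.
    has_nn_data: bool = False
    n_validations: int = 0
    nn_errors: Dict[str, float] = field(default_factory=dict)
    cv_ratios: Dict[str, float] = field(default_factory=dict)
    expected_errors: Dict[str, float] = field(default_factory=dict)
    overall_quality: float = 0.0
    turbo_suitability: float = 0.0
    hybrid_suitability: float = 0.0
    objective_types: Dict[str, str] = field(default_factory=dict)


def recommend_from_quality(metrics: NNQualityMetrics) -> str:
    """Apply the documented suitability table (hypothetical helper)."""
    if not metrics.has_nn_data:
        return "PURE_FEA"  # assumption: no NN evidence means no NN method
    turbo = metrics.turbo_suitability
    hybrid = metrics.hybrid_suitability
    if turbo > 0.8:
        return "TURBO"
    if turbo >= 0.5 and hybrid > 0.5:
        return "TURBO (with monitoring)"
    if hybrid > 0.5:
        return "HYBRID_LOOP"
    # Covers the "< 30% / < 50%" row and every combination the table omits.
    return "PURE_FEA"
```

For the sample run documented in this commit (turbo suitability 77%, hybrid suitability 88%), the helper returns `"TURBO (with monitoring)"`.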
@@ -210,11 +286,24 @@ Problem Profile:
 Constraints: 1
 Max FEA budget: ~72 trials
 
+NN Quality Assessment:
+Validations analyzed: 10
+
+| Objective     | NN Error | CV     | Ratio | Type       | Quality |
+|---------------|----------|--------|-------|------------|---------|
+| mass          | 3.7%     | 16.0%  | 0.23  | linear     | ✓ Great |
+| stress        | 2.0%     | 7.7%   | 0.26  | nonlinear  | ✓ Great |
+| stiffness     | 7.8%     | 38.9%  | 0.20  | nonlinear  | ✓ Great |
+
+Overall Quality: 22%
+Turbo Suitability: 77%
+Hybrid Suitability: 88%
+
 ----------------------------------------------------------------------
 
 RECOMMENDED: TURBO
 Confidence: 100%
-Reason: low-dimensional design space; sufficient FEA budget; smooth landscape (79%)
+Reason: low-dimensional design space; sufficient FEA budget; smooth landscape (79%); good NN quality (77%)
 
 Suggested parameters:
 --nn-trials: 5000
@@ -223,9 +312,12 @@ Problem Profile:
 --epochs: 150
 
 Alternatives:
-- hybrid_loop (75%): uncertain landscape - hybrid adapts; adequate budget for iterations
+- hybrid_loop (90%): uncertain landscape - hybrid adapts; NN adds value with periodic retraining
 - pure_fea (50%): default recommendation
 
+Warnings:
+! mass: NN error (3.7%) above expected (2%) - consider retraining or using hybrid mode
+
 ======================================================================
 ```
 
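The single warning in the sample output is consistent with the physics-based error limits: mass is classified as linear (2% limit), while stress and stiffness are nonlinear (10% limit). Below is a minimal sketch of that check. The function name and dictionary are hypothetical, the message format is modeled on the CLI text, and the CV ratio limits from the same table are left out for brevity.

```python
from typing import Dict, List

# Max expected NN error (%) per objective type, from the classification table.
MAX_EXPECTED_ERROR: Dict[str, float] = {
    "linear": 2.0,
    "smooth": 5.0,
    "nonlinear": 10.0,
    "chaotic": 20.0,
}


def threshold_warnings(
    nn_errors: Dict[str, float], objective_types: Dict[str, str]
) -> List[str]:
    """Return one message per objective whose NN error exceeds its limit."""
    warnings = []
    for name, error in nn_errors.items():
        limit = MAX_EXPECTED_ERROR[objective_types[name]]
        if error > limit:
            warnings.append(
                f"{name}: NN error ({error:.1f}%) above expected ({limit:g}%)"
            )
    return warnings
```

Fed the three objectives from the sample table, the function flags only mass, because 3.7% exceeds the 2% linear limit while 2.0% and 7.8% stay under the 10% nonlinear limit.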
@@ -312,8 +404,11 @@ optimization_engine/
 └── method_selector.py # Complete AMS implementation
     ├── ProblemProfiler # Static config analysis
     ├── EarlyMetricsCollector # Dynamic FEA metrics
+    ├── NNQualityMetrics # NN accuracy dataclass
+    ├── NNQualityAssessor # Relative accuracy assessment
     ├── AdaptiveMethodSelector # Main recommendation engine
     ├── RuntimeAdvisor # Mid-run pivot advisor
+    ├── print_recommendation() # CLI output with NN quality table
     └── recommend_method() # Convenience function
 ```
 
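The commit does not show how `NNQualityAssessor` assigns an objective to a physics class. One plausible approach, sketched here purely as an assumption, is keyword matching on the objective name using the examples from the classification table. The keyword lists, their ordering, and the `nonlinear` fallback are all hypothetical.

```python
from typing import List, Tuple

# Checked in order, so "avg_stress" matches "smooth" before the broader
# "stress" keyword under "nonlinear". Keywords come from the doc's examples.
OBJECTIVE_KEYWORDS: List[Tuple[str, Tuple[str, ...]]] = [
    ("chaotic", ("contact", "buckling")),
    ("linear", ("mass", "volume")),
    ("smooth", ("frequency", "avg_stress", "avg stress")),
    ("nonlinear", ("stress", "stiffness")),
]


def classify_objective(name: str, default: str = "nonlinear") -> str:
    """Guess the physics class of an objective from its name."""
    lowered = name.lower()
    for objective_type, keywords in OBJECTIVE_KEYWORDS:
        if any(keyword in lowered for keyword in keywords):
            return objective_type
    return default  # assumption: unknown objectives get the nonlinear limits
```

Under these rules `mass` maps to `linear` and both `stress` and `stiffness` map to `nonlinear`, matching the Type column of the sample CLI table.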
@@ -323,4 +418,5 @@ optimization_engine/
 
 | Version | Date | Changes |
 |---------|------|---------|
+| 2.0 | 2025-12-07 | Added NNQualityAssessor with relative accuracy thresholds |
 | 1.0 | 2025-12-06 | Initial implementation with 4 methods |