# Study Continuation - Atomizer Standard Feature

**Date**: November 20, 2025

**Status**: ✅ Implemented as Standard Feature

---

## Overview

Study continuation is now a **standardized Atomizer feature** for dashboard integration. It provides a clean API for continuing existing optimization studies with additional trials.

Previously, continuation was improvised on-demand. Now it's a first-class feature alongside "Start New Optimization".

---

## Module

[optimization_engine/study_continuation.py](../optimization_engine/study_continuation.py)

---

## API

### Main Function: `continue_study()`
```python
from pathlib import Path

from optimization_engine.study_continuation import continue_study

# my_objective is the study's objective function, defined elsewhere
results = continue_study(
    study_dir=Path("studies/my_study"),
    additional_trials=50,
    objective_function=my_objective,
    design_variables={'param1': (0, 10), 'param2': (0, 100)},
    target_value=115.0,
    tolerance=0.1,
    verbose=True
)
```
**Returns**:
```python
{
    'study': optuna.Study,      # The study object
    'total_trials': 100,        # Total after continuation
    'successful_trials': 95,    # Completed trials
    'pruned_trials': 5,         # Failed trials
    'best_value': 0.05,         # Best objective value
    'best_params': {...},       # Best parameters
    'target_achieved': True     # If target specified
}
```
### Utility Functions
#### `can_continue_study()`
Check if a study is ready for continuation:
```python
from pathlib import Path

from optimization_engine.study_continuation import can_continue_study

can_continue, message = can_continue_study(Path("studies/my_study"))
if can_continue:
    print(f"Ready: {message}")
    # message: "Study 'my_study' ready (current trials: 50)"
else:
    print(f"Cannot continue: {message}")
    # message: "No study.db found. Run initial optimization first."
```
#### `get_study_status()`
Get current study information:
```python
from pathlib import Path

from optimization_engine.study_continuation import get_study_status

status = get_study_status(Path("studies/my_study"))
if status:
    print(f"Study: {status['study_name']}")
    print(f"Trials: {status['total_trials']}")
    print(f"Success rate: {status['successful_trials']/status['total_trials']*100:.1f}%")
    print(f"Best: {status['best_value']}")
else:
    print("Study not found or invalid")
```
**Returns**:
```python
{
    'study_name': 'my_study',
    'total_trials': 50,
    'successful_trials': 47,
    'pruned_trials': 3,
    'pruning_rate': 0.06,
    'best_value': 0.42,
    'best_params': {'param1': 5.2, 'param2': 78.3}
}
```
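The success-rate expression in the `get_study_status()` example divides by `total_trials`, which would raise `ZeroDivisionError` if a status with zero trials were ever returned. A minimal sketch of a guard, assuming only the dictionary keys shown above; `status_rates` is a hypothetical helper, not part of `study_continuation`:

```python
from typing import Any, Mapping, Tuple


def status_rates(status: Mapping[str, Any]) -> Tuple[float, float]:
    """Return (success_rate, pruning_rate) as fractions in [0, 1].

    Hypothetical helper: relies only on the 'total_trials',
    'successful_trials' and 'pruned_trials' keys documented above.
    """
    total = int(status["total_trials"])
    if total <= 0:
        # An empty study has no meaningful rates; avoid dividing by zero.
        return 0.0, 0.0
    successful = int(status["successful_trials"])
    pruned = int(status["pruned_trials"])
    return successful / total, pruned / total


# Sample values copied from the return example above.
example = {"total_trials": 50, "successful_trials": 47, "pruned_trials": 3}
success_rate, pruning_rate = status_rates(example)
print(f"Success: {success_rate:.1%}, pruned: {pruning_rate:.1%}")
```

With the sample values, the computed pruning fraction matches the documented `pruning_rate` of 0.06.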
---
## Dashboard Integration
### UI Workflow
When a user selects a study in the dashboard:
```
1. User clicks on study → Dashboard calls get_study_status()

2. Dashboard shows study info card:
   ┌──────────────────────────────────────┐
   │ Study: circular_plate_test           │
   │ Current Trials: 50                   │
   │ Success Rate: 94%                    │
   │ Best Result: 0.42 Hz error           │
   │                                      │
   │ [Continue Study]  [View Results]     │
   └──────────────────────────────────────┘

3. User clicks "Continue Study" → Shows form:
   ┌──────────────────────────────────────┐
   │ Continue Optimization                │
   │                                      │
   │ Additional Trials: [50]              │
   │ Target Value (optional): [115.0]     │
   │ Tolerance (optional): [0.1]          │
   │                                      │
   │ [Cancel]  [Start]                    │
   └──────────────────────────────────────┘

4. User clicks "Start" → Dashboard calls continue_study()

5. Progress shown in real-time (like initial optimization)
```
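Before step 4, the dashboard would presumably validate the values typed into the form. A minimal sketch, assuming the three fields shown in the form above; `ContinueForm` and `parse_continue_form` are hypothetical names, and the rules applied here (at least one trial, non-negative tolerance, blank optional fields meaning "not set") are assumptions rather than documented behavior:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContinueForm:
    """Validated values from the hypothetical 'Continue Optimization' form."""
    additional_trials: int
    target_value: Optional[float] = None
    tolerance: Optional[float] = None


def _optional_float(raw: str, field: str) -> Optional[float]:
    """Parse an optional numeric field; blank input means 'not set'."""
    text = raw.strip()
    if not text:
        return None
    try:
        return float(text)
    except ValueError:
        raise ValueError(f"{field} must be a number, got {raw!r}") from None


def parse_continue_form(trials: str, target: str = "", tolerance: str = "") -> ContinueForm:
    """Convert raw text inputs into a ContinueForm, raising ValueError on bad input."""
    try:
        n_trials = int(trials.strip())
    except ValueError:
        raise ValueError(f"Additional Trials must be an integer, got {trials!r}") from None
    if n_trials < 1:
        raise ValueError("Additional Trials must be at least 1")
    tol = _optional_float(tolerance, "Tolerance")
    if tol is not None and tol < 0:
        raise ValueError("Tolerance must not be negative")
    return ContinueForm(n_trials, _optional_float(target, "Target Value"), tol)


# Values taken from the form mock-up above.
form = parse_continue_form("50", "115.0", "0.1")
print(form)
```

The validated fields map directly onto the `additional_trials`, `target_value`, and `tolerance` arguments of `continue_study()`.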
### Example Dashboard Code
```python
from pathlib import Path

from optimization_engine.study_continuation import (
    get_study_status,
    can_continue_study,
    continue_study
)


def notify_user(message: str) -> None:
    """Placeholder for the dashboard's own notification mechanism."""
    print(message)


def show_study_panel(study_dir: Path):
    """Display study panel with continuation option."""
    # Get current status
    status = get_study_status(study_dir)
    if not status:
        print("Study not found or incomplete")
        return

    # Show study info
    print(f"Study: {status['study_name']}")
    print(f"Current Trials: {status['total_trials']}")
    print(f"Best Result: {status['best_value']:.4f}")

    # Check if can continue
    can_continue, message = can_continue_study(study_dir)
    if can_continue:
        # Enable "Continue" button
        print("✓ Ready to continue")
    else:
        # Disable "Continue" button, show reason
        print(f"✗ Cannot continue: {message}")


def handle_continue_button_click(study_dir: Path, additional_trials: int):
    """Handle user clicking 'Continue Study' button."""
    # Load the objective function for this study
    # (Dashboard needs to reconstruct this from study config)
    from studies.my_study.run_optimization import objective

    # Continue the study
    results = continue_study(
        study_dir=study_dir,
        additional_trials=additional_trials,
        objective_function=objective,
        verbose=True  # Stream output to dashboard
    )

    # Show completion notification
    if results.get('target_achieved'):
        notify_user(f"Target achieved! Best: {results['best_value']:.4f}")
    else:
        notify_user(f"Completed {additional_trials} trials. Best: {results['best_value']:.4f}")
```
---
## Comparison: Old vs New
### Before (Improvised)
Each study needed a custom `continue_optimization.py`:
```
studies/my_study/
├── run_optimization.py       # Standard (from protocol)
├── continue_optimization.py  # Improvised (custom for each study)
└── 2_results/
    └── study.db
```
**Problems**:
- Not standardized across studies
- Manual creation required
- No dashboard integration possible
- Inconsistent behavior
### After (Standardized)
All studies use the same continuation API:
```
studies/my_study/
├── run_optimization.py       # Standard (from protocol)
└── 2_results/
    └── study.db

# No continue_optimization.py needed!
# Just call continue_study() from anywhere
```
**Benefits**:
- ✅ Standardized behavior
- ✅ Dashboard-ready API
- ✅ Consistent across all studies
- ✅ No per-study custom code
---
## Usage Examples
### Example 1: Simple Continuation
```python
from pathlib import Path

from optimization_engine.study_continuation import continue_study
from studies.my_study.run_optimization import objective

# Continue with 50 more trials
results = continue_study(
    study_dir=Path("studies/my_study"),
    additional_trials=50,
    objective_function=objective
)

print(f"New best: {results['best_value']}")
```
### Example 2: With Target Checking
```python
from pathlib import Path

from optimization_engine.study_continuation import continue_study
from studies.circular_plate_test.run_optimization import objective

# Continue until target is met or 100 additional trials
results = continue_study(
    study_dir=Path("studies/circular_plate_test"),
    additional_trials=100,
    objective_function=objective,
    target_value=115.0,
    tolerance=0.1
)

if results['target_achieved']:
    print(f"Success! Achieved in {results['total_trials']} total trials")
else:
    print(f"Target not reached. Best: {results['best_value']}")
```
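How `target_achieved` is computed is not spelled out in this document. One plausible reading, given the "Hz error" wording in the dashboard card and the sample return value (`best_value` of 0.05 with `tolerance` of 0.1), is that the objective is the absolute deviation from `target_value` and the target counts as achieved once that deviation is within `tolerance`. The sketch below illustrates that assumption only; `frequency_error` and `target_reached` are hypothetical helpers, not functions from `study_continuation`:

```python
def frequency_error(measured_hz: float, target_hz: float) -> float:
    """Objective under the assumption above: absolute distance from the target."""
    return abs(measured_hz - target_hz)


def target_reached(best_value: float, tolerance: float) -> bool:
    """True when the best error found so far is within the tolerance."""
    return best_value <= tolerance


# Illustrative numbers: a 115.05 Hz result against a 115.0 Hz target.
best = frequency_error(measured_hz=115.05, target_hz=115.0)
print(target_reached(best, tolerance=0.1))
```

If the actual module compares values differently, only `target_reached` would need to change.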
### Example 3: Dashboard Batch Processing
```python
from pathlib import Path

from optimization_engine.study_continuation import get_study_status

# Report the status of every study and flag high pruning rates
studies_dir = Path("studies")
for study_dir in studies_dir.iterdir():
    if not study_dir.is_dir():
        continue

    status = get_study_status(study_dir)
    if status and status['pruning_rate'] > 0.10:
        print(f"⚠️ {status['study_name']}: High pruning rate ({status['pruning_rate']*100:.1f}%)")
        print("   Consider investigating before continuing")
    elif status:
        print(f"{status['study_name']}: {status['total_trials']} trials, best={status['best_value']:.4f}")
```
---
## File Structure
### Standard Study Directory
```
studies/my_study/
├── 1_setup/
│   ├── model/                    # FEA model files
│   ├── workflow_config.json      # Contains study_name
│   └── optimization_config.json
├── 2_results/
│   ├── study.db                  # Optuna database (required for continuation)
│   ├── optimization_history_incremental.json
│   └── intelligent_optimizer/
└── 3_reports/
    └── OPTIMIZATION_REPORT.md
```
**Required for Continuation**:
- `1_setup/workflow_config.json` (contains study_name)
- `2_results/study.db` (Optuna database with trial data)
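
A dashboard could check for these two files before offering continuation. A minimal sketch using only `pathlib`; `missing_continuation_files` is a hypothetical helper, and it checks file presence only, not whether the database actually contains trials (which `can_continue_study()` reports):

```python
from pathlib import Path
from typing import List

# Paths relative to the study directory, as listed above.
REQUIRED_FILES = (
    Path("1_setup") / "workflow_config.json",
    Path("2_results") / "study.db",
)


def missing_continuation_files(study_dir: Path) -> List[Path]:
    """Return the required files that do not exist under study_dir."""
    return [rel for rel in REQUIRED_FILES if not (study_dir / rel).is_file()]


missing = missing_continuation_files(Path("studies/my_study"))
if missing:
    print("Missing:", ", ".join(p.as_posix() for p in missing))
else:
    print("All files required for continuation are present")
```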
---
## Error Handling
`can_continue_study()` returns a `(bool, message)` tuple; when the study cannot be continued, the message states the reason:
```python
from pathlib import Path

from optimization_engine.study_continuation import can_continue_study

# Study doesn't exist
can_continue_study(Path("studies/nonexistent"))
# Returns: (False, "No workflow_config.json found in studies/nonexistent/1_setup")

# Study exists but not run yet
can_continue_study(Path("studies/new_study"))
# Returns: (False, "No study.db found. Run initial optimization first.")

# Study database corrupted
can_continue_study(Path("studies/bad_study"))
# Returns: (False, "Study 'bad_study' not found in database")

# Study has no trials
can_continue_study(Path("studies/empty_study"))
# Returns: (False, "Study exists but has no trials yet")
```
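Because the check returns a `(bool, str)` pair, a dashboard can map it directly onto a button state. A minimal sketch; `ButtonState` and `continue_button_state` are hypothetical names, and the tuple is passed in rather than fetched so the example runs without a study on disk:

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ButtonState:
    """Hypothetical view model for the 'Continue Study' button."""
    enabled: bool
    tooltip: str


def continue_button_state(check: Tuple[bool, str]) -> ButtonState:
    """Turn a (can_continue, message) pair into a button state."""
    can_continue, message = check
    if can_continue:
        return ButtonState(enabled=True, tooltip=message)
    # Surface the reason so the user knows what to fix.
    return ButtonState(enabled=False, tooltip=f"Cannot continue: {message}")


# Message text copied from the error examples above.
state = continue_button_state((False, "No study.db found. Run initial optimization first."))
print(state.enabled, "-", state.tooltip)
```

In a real dashboard the argument would be the result of `can_continue_study(study_dir)`.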
---
## Dashboard Buttons
### Two Standard Actions
Every study in the dashboard should have:

1. **"Start New Optimization"** → Calls `run_optimization.py`
   - Requires: Study setup complete
   - Creates: Fresh study database
   - Use when: Starting from scratch
2. **"Continue Study"** → Calls `continue_study()`
   - Requires: Existing study.db with trials
   - Preserves: All existing trial data
   - Use when: Adding more iterations

Both are now **standardized Atomizer features**.

---
## Testing
Test the continuation API:
```bash
# Test status check
python -c "
from pathlib import Path
from optimization_engine.study_continuation import get_study_status
status = get_study_status(Path('studies/circular_plate_protocol10_v2_1_test'))
if status:
    print(f\"Study: {status['study_name']}\")
    print(f\"Trials: {status['total_trials']}\")
    print(f\"Best: {status['best_value']}\")
"
# Test continuation check
python -c "
from pathlib import Path
from optimization_engine.study_continuation import can_continue_study
can_continue, msg = can_continue_study(Path('studies/circular_plate_protocol10_v2_1_test'))
print(f\"Can continue: {can_continue}\")
print(f\"Message: {msg}\")
"
```
---
## Summary
| Feature | Before | After |
|---------|--------|-------|
| Implementation | Improvised per study | Standardized module |
| Dashboard integration | Not possible | Full API support |
| Consistency | Varies by study | Uniform behavior |
| Error handling | Manual | Built-in with messages |
| Study status | Manual queries | `get_study_status()` |
| Continuation check | Manual | `can_continue_study()` |

**Status**: ✅ Ready for dashboard integration

**Module**: [optimization_engine/study_continuation.py](../optimization_engine/study_continuation.py)