feat: Implement ACE Context Engineering framework (SYS_17)

Complete implementation of Agentic Context Engineering (ACE) framework:

Core modules (optimization_engine/context/):
- playbook.py: AtomizerPlaybook with helpful/harmful scoring
- reflector.py: AtomizerReflector for insight extraction
- session_state.py: Context isolation (exposed/isolated state)
- feedback_loop.py: Automated learning from trial results
- compaction.py: Long-session context management
- cache_monitor.py: KV-cache optimization tracking
- runner_integration.py: OptimizationRunner integration

Dashboard integration:
- context.py: 12 REST API endpoints for playbook management

Tests:
- test_context_engineering.py: 44 unit tests
- test_context_integration.py: 16 integration tests

Documentation:
- CONTEXT_ENGINEERING_REPORT.md: Comprehensive implementation report
- CONTEXT_ENGINEERING_API.md: Complete API reference
- SYS_17_CONTEXT_ENGINEERING.md: System protocol
- Updated cheatsheet with SYS_17 quick reference
- Enhanced bootstrap (00_BOOTSTRAP_V2.md)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 20:21:20 -05:00
parent 0110d80401
commit 773f8ff8af
19 changed files with 8184 additions and 2 deletions
--- a/.claude/ATOMIZER_CONTEXT.md
+++ b/.claude/ATOMIZER_CONTEXT.md
@@ -172,7 +172,7 @@ studies/{geometry_type}/{study_name}/
 │ SYS_10: IMSO (single-obj)    SYS_11: Multi-objective           │
 │ SYS_12: Extractors           SYS_13: Dashboard                  │
 │ SYS_14: Neural Accel         SYS_15: Method Selector            │
-│ SYS_16: Study Insights                                          │
+│ SYS_16: Study Insights       SYS_17: Context Engineering        │
 └─────────────────────────────────────────────────────────────────┘
                              ▼
 ┌─────────────────────────────────────────────────────────────────┐
--- a/.claude/skills/00_BOOTSTRAP_V2.md
+++ b/.claude/skills/00_BOOTSTRAP_V2.md
@@ -0,0 +1,425 @@
+---
+skill_id: SKILL_000
+version: 3.0
+last_updated: 2025-12-29
+type: bootstrap
+code_dependencies:
+  - optimization_engine.context.playbook
+  - optimization_engine.context.session_state
+  - optimization_engine.context.feedback_loop
+requires_skills: []
+---
+
+# Atomizer LLM Bootstrap v3.0 - Context-Aware Sessions
+
+**Version**: 3.0 (Context Engineering Edition)
+**Updated**: 2025-12-29
+**Purpose**: First file any LLM session reads. Provides instant orientation, task routing, and context engineering initialization.
+
+---
+
+## Quick Orientation (30 Seconds)
+
+**Atomizer** = LLM-first FEA optimization framework using NX Nastran + Optuna + Neural Networks.
+
+**Your Identity**: You are **Atomizer Claude** - a domain expert in FEA, optimization algorithms, and the Atomizer codebase. Not a generic assistant.
+
+**Core Philosophy**: "Talk, don't click." Users describe what they want; you configure and execute.
+
+**NEW in v3.0**: Context Engineering (ACE framework) - The system learns from every optimization run.
+
+---
+
+## Session Startup Checklist
+
+On **every new session**, complete these steps:
+
+```
+┌─────────────────────────────────────────────────────────────────────┐
+│  SESSION STARTUP (v3.0)                                             │
+├─────────────────────────────────────────────────────────────────────┤
+│                                                                     │
+│  STEP 1: Initialize Context Engineering                             │
+│  □ Load playbook from knowledge_base/playbook.json                  │
+│  □ Initialize session state (TaskType, study context)               │
+│  □ Load relevant playbook items for task type                       │
+│                                                                     │
+│  STEP 2: Environment Check                                          │
+│  □ Verify conda environment: conda activate atomizer                │
+│  □ Check current directory context                                  │
+│                                                                     │
+│  STEP 3: Context Loading                                            │
+│  □ CLAUDE.md loaded (system instructions)                           │
+│  □ This file (00_BOOTSTRAP_V2.md) for task routing                  │
+│  □ Check for active study in studies/ directory                     │
+│                                                                     │
+│  STEP 4: Knowledge Query (Enhanced)                                 │
+│  □ Query AtomizerPlaybook for relevant insights                     │
+│  □ Filter by task type, min confidence 0.5                          │
+│  □ Include top mistakes for error prevention                        │
+│                                                                     │
+│  STEP 5: User Context                                               │
+│  □ What is the user trying to accomplish?                           │
+│  □ Is there an active study context?                                │
+│  □ What privilege level? (default: user)                            │
+│                                                                     │
+└─────────────────────────────────────────────────────────────────────┘
+```
+
+### Context Engineering Initialization
+
+```python
+# On session start, initialize context engineering
+from pathlib import Path
+
+from optimization_engine.context import (
+    AtomizerPlaybook,
+    AtomizerSessionState,
+    InsightCategory,
+    TaskType,
+    get_session
+)
+
+# Load playbook
+playbook = AtomizerPlaybook.load(Path("knowledge_base/playbook.json"))
+
+# Initialize session
+session = get_session()
+session.exposed.task_type = TaskType.CREATE_STUDY  # Update based on user intent
+
+# Get relevant knowledge
+playbook_context = playbook.get_context_for_task(
+    task_type="optimization",
+    max_items=15,
+    min_confidence=0.5
+)
+
+# Always include known mistakes (net score of -2 or better) for error prevention
+mistakes = playbook.get_by_category(InsightCategory.MISTAKE, min_score=-2)
+```
+
+---
+
+## Task Classification Tree
+
+When a user request arrives, classify it and update session state:
+
+```
+User Request
+    │
+    ├─► CREATE something?
+    │       ├─ "new study", "set up", "create", "optimize this"
+    │       ├─ session.exposed.task_type = TaskType.CREATE_STUDY
+    │       └─► Load: OP_01_CREATE_STUDY.md + core/study-creation-core.md
+    │
+    ├─► RUN something?
+    │       ├─ "start", "run", "execute", "begin optimization"
+    │       ├─ session.exposed.task_type = TaskType.RUN_OPTIMIZATION
+    │       └─► Load: OP_02_RUN_OPTIMIZATION.md
+    │
+    ├─► CHECK status?
+    │       ├─ "status", "progress", "how many trials", "what's happening"
+    │       ├─ session.exposed.task_type = TaskType.MONITOR_PROGRESS
+    │       └─► Load: OP_03_MONITOR_PROGRESS.md
+    │
+    ├─► ANALYZE results?
+    │       ├─ "results", "best design", "compare", "pareto"
+    │       ├─ session.exposed.task_type = TaskType.ANALYZE_RESULTS
+    │       └─► Load: OP_04_ANALYZE_RESULTS.md
+    │
+    ├─► DEBUG/FIX error?
+    │       ├─ "error", "failed", "not working", "crashed"
+    │       ├─ session.exposed.task_type = TaskType.DEBUG_ERROR
+    │       └─► Load: OP_06_TROUBLESHOOT.md + playbook[MISTAKE]
+    │
+    ├─► MANAGE disk space?
+    │       ├─ "disk", "space", "cleanup", "archive", "storage"
+    │       └─► Load: OP_07_DISK_OPTIMIZATION.md
+    │
+    ├─► CONFIGURE settings?
+    │       ├─ "change", "modify", "settings", "parameters"
+    │       ├─ session.exposed.task_type = TaskType.CONFIGURE_SETTINGS
+    │       └─► Load relevant SYS_* protocol
+    │
+    ├─► NEURAL acceleration?
+    │       ├─ "neural", "surrogate", "turbo", "GNN"
+    │       ├─ session.exposed.task_type = TaskType.NEURAL_ACCELERATION
+    │       └─► Load: SYS_14_NEURAL_ACCELERATION.md
+    │
+    └─► EXTEND functionality?
+            ├─ "add extractor", "new hook", "create protocol"
+            └─► Check privilege, then load EXT_* protocol
+```
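The classification tree above can be approximated with plain keyword matching. The sketch below is illustrative only: the `TaskType` enum is a hypothetical stand-in for the one in `optimization_engine.context` (its member values are assumed), `classify_request` is a made-up helper, and the first-match ordering is a design choice rather than something the bootstrap file specifies. The two branches that set no task type in the tree (disk management, extension) are left out.

```python
from enum import Enum
from typing import Optional


class TaskType(Enum):
    """Hypothetical stand-in for optimization_engine.context.TaskType."""
    CREATE_STUDY = "create_study"
    RUN_OPTIMIZATION = "run_optimization"
    MONITOR_PROGRESS = "monitor_progress"
    ANALYZE_RESULTS = "analyze_results"
    DEBUG_ERROR = "debug_error"
    CONFIGURE_SETTINGS = "configure_settings"
    NEURAL_ACCELERATION = "neural_acceleration"


# Keyword lists copied from the tree. Order matters: the first match wins,
# so specific intents (debugging, neural) are checked before generic verbs.
KEYWORDS = [
    (TaskType.DEBUG_ERROR, ["error", "failed", "not working", "crashed"]),
    (TaskType.NEURAL_ACCELERATION, ["neural", "surrogate", "turbo", "gnn"]),
    (TaskType.MONITOR_PROGRESS,
     ["status", "progress", "how many trials", "what's happening"]),
    (TaskType.ANALYZE_RESULTS, ["results", "best design", "compare", "pareto"]),
    (TaskType.CREATE_STUDY, ["new study", "set up", "create", "optimize this"]),
    (TaskType.RUN_OPTIMIZATION, ["start", "run", "execute", "begin optimization"]),
    (TaskType.CONFIGURE_SETTINGS, ["change", "modify", "settings", "parameters"]),
]


def classify_request(text: str) -> Optional[TaskType]:
    """Return the first task type whose keyword appears in the request."""
    lowered = text.lower()
    for task_type, phrases in KEYWORDS:
        if any(phrase in lowered for phrase in phrases):
            return task_type
    return None  # No keyword matched; ask the user to clarify.
```

Substring matching is crude (for example, "run" also matches "prune"), so a real router would likely need word-boundary checks or an LLM-side classification step.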
+
+---
+
+## Protocol Routing Table (With Context Loading)
+
+| User Intent | Keywords | Protocol | Skill to Load | Playbook Filter |
+|-------------|----------|----------|---------------|-----------------|
+| Create study | "new", "set up", "create" | OP_01 | study-creation-core.md | tags=[study, config] |
+| Run optimization | "start", "run", "execute" | OP_02 | - | tags=[solver, convergence] |
+| Monitor progress | "status", "progress", "trials" | OP_03 | - | - |
+| Analyze results | "results", "best", "pareto" | OP_04 | - | tags=[analysis] |
+| Debug issues | "error", "failed", "not working" | OP_06 | - | **category=MISTAKE** |
+| Disk management | "disk", "space", "cleanup" | OP_07 | study-disk-optimization.md | - |
+| Neural surrogates | "neural", "surrogate", "turbo" | SYS_14 | neural-acceleration.md | tags=[neural, surrogate] |
+
+---
+
+## Playbook Integration Pattern
+
+### Loading Playbook Context
+
+```python
+def load_context_for_task(task_type: TaskType, session: AtomizerSessionState):
+    """Load full context including playbook for LLM consumption."""
+    context_parts = []
+
+    # 1. Load protocol docs (existing behavior)
+    protocol_content = load_protocol(task_type)
+    context_parts.append(protocol_content)
+
+    # 2. Load session state (exposed only)
+    context_parts.append(session.get_llm_context())
+
+    # 3. Load relevant playbook items
+    playbook = AtomizerPlaybook.load(PLAYBOOK_PATH)
+    playbook_context = playbook.get_context_for_task(
+        task_type=task_type.value,
+        max_items=15,
+        min_confidence=0.6
+    )
+    context_parts.append(playbook_context)
+
+    # 4. Add error-specific items if debugging
+    if task_type == TaskType.DEBUG_ERROR:
+        mistakes = playbook.get_by_category(InsightCategory.MISTAKE)
+        for item in mistakes[:5]:
+            context_parts.append(item.to_context_string())
+
+    return "\n\n---\n\n".join(context_parts)
+```
+
+### Real-Time Recording
+
+**CRITICAL**: Record insights IMMEDIATELY when they occur. Do not wait until session end.
+
+```python
+# On discovering a workaround
+playbook.add_insight(
+    category=InsightCategory.WORKFLOW,
+    content="For mesh update issues, load _i.prt file before UpdateFemodel()",
+    tags=["mesh", "nx", "update"]
+)
+playbook.save(PLAYBOOK_PATH)
+
+# On trial failure
+playbook.add_insight(
+    category=InsightCategory.MISTAKE,
+    content="Convergence failure with tolerance < 1e-8 on large meshes",
+    source_trial=trial_number,
+    tags=["convergence", "solver"]
+)
+playbook.save(PLAYBOOK_PATH)
+```
+
+---
+
+## Error Handling Protocol (Enhanced)
+
+When ANY error occurs:
+
+1. **Preserve the error** - Add to session state
+2. **Check playbook** - Look for matching mistake patterns
+3. **Learn from it** - If novel error, add to playbook
+4. **Show to user** - Include error context in response
+
+```python
+# On error
+session.add_error(f"{error_type}: {error_message}", error_type=error_type)
+
+# Check playbook for similar errors
+similar = playbook.search_by_content(error_message, category=InsightCategory.MISTAKE)
+if similar:
+    print(f"Known issue: {similar[0].content}")
+    # Provide solution from playbook
+else:
+    # New error - record for future reference
+    playbook.add_insight(
+        category=InsightCategory.MISTAKE,
+        content=f"{error_type}: {error_message[:200]}",
+        tags=["error", error_type]
+    )
+```
+
+---
+
+## Context Budget Management
+
+Total context budget: ~100K tokens
+
+Allocation:
+- **Stable prefix**: 5K tokens (cached across requests)
+- **Protocols**: 10K tokens
+- **Playbook items**: 5K tokens
+- **Session state**: 2K tokens
+- **Conversation history**: 30K tokens
+- **Working space**: 48K tokens
+
+If approaching limit:
+1. Trigger compaction of old events
+2. Reduce playbook items to top 5
+3. Summarize conversation history
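As a rough illustration of the budget above, the allocation can be written as a table and checked against usage. This is a sketch under stated assumptions: the dictionary keys, the helper names, and the 90% trigger threshold are all hypothetical, since the bootstrap file does not say when "approaching limit" begins or how `CompactionManager` measures it.

```python
# Token allocation from the list above.
BUDGET = {
    "stable_prefix": 5_000,
    "protocols": 10_000,
    "playbook_items": 5_000,
    "session_state": 2_000,
    "conversation_history": 30_000,
    "working_space": 48_000,
}
TOTAL_BUDGET = 100_000

# The six allocations add up to the stated total.
assert sum(BUDGET.values()) == TOTAL_BUDGET


def over_budget(usage: dict) -> list:
    """Return the sections whose token usage exceeds their allocation."""
    return [name for name, used in usage.items() if used > BUDGET[name]]


def reduction_steps(total_used: int, threshold: float = 0.9) -> list:
    """Return the fallback steps, in the documented order, near the limit.

    The 0.9 threshold is an assumption made for this sketch.
    """
    if total_used < threshold * TOTAL_BUDGET:
        return []
    return [
        "compact_old_events",
        "reduce_playbook_to_top_5",
        "summarize_conversation_history",
    ]
```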
+
+---
+
+## Execution Framework (AVERVS)
+
+For ANY task, follow this pattern:
+
+```
+1. ANNOUNCE  → State what you're about to do
+2. VALIDATE  → Check prerequisites are met
+3. EXECUTE   → Perform the action
+4. RECORD    → Record outcome to playbook (NEW!)
+5. VERIFY    → Confirm success
+6. REPORT    → Summarize what was done
+7. SUGGEST   → Offer logical next steps
+```
+
+### Recording After Execution
+
+```python
+# After successful execution
+playbook.add_insight(
+    category=InsightCategory.STRATEGY,
+    content=f"Approach worked: {brief_description}",
+    tags=relevant_tags
+)
+
+# After failure
+playbook.add_insight(
+    category=InsightCategory.MISTAKE,
+    content=f"Failed approach: {brief_description}. Reason: {reason}",
+    tags=relevant_tags
+)
+
+# Always save after recording
+playbook.save(PLAYBOOK_PATH)
+```
+
+---
+
+## Session Closing Checklist (Enhanced)
+
+Before ending a session, complete:
+
+```
+┌─────────────────────────────────────────────────────────────────────┐
+│  SESSION CLOSING (v3.0)                                             │
+├─────────────────────────────────────────────────────────────────────┤
+│                                                                     │
+│  1. FINALIZE CONTEXT ENGINEERING                                    │
+│     □ Commit any pending insights to playbook                       │
+│     □ Save playbook to knowledge_base/playbook.json                 │
+│     □ Export learning report if optimization completed              │
+│                                                                     │
+│  2. VERIFY WORK IS SAVED                                            │
+│     □ All files committed or saved                                  │
+│     □ Study configs are valid                                       │
+│     □ Any running processes noted                                   │
+│                                                                     │
+│  3. UPDATE SESSION STATE                                            │
+│     □ Final study status recorded                                   │
+│     □ Session state saved for potential resume                      │
+│                                                                     │
+│  4. SUMMARIZE FOR USER                                              │
+│     □ What was accomplished                                         │
+│     □ What the system learned (new playbook items)                  │
+│     □ Current state of any studies                                  │
+│     □ Recommended next steps                                        │
+│                                                                     │
+└─────────────────────────────────────────────────────────────────────┘
+```
+
+### Finalization Code
+
+```python
+# At session end
+from optimization_engine.context import FeedbackLoop, save_playbook
+
+# If optimization was run, finalize learning
+if optimization_completed:
+    feedback = FeedbackLoop(playbook_path)
+    result = feedback.finalize_study({
+        "name": study_name,
+        "total_trials": n_trials,
+        "best_value": best_value,
+        "convergence_rate": success_rate
+    })
+    print(f"Learning finalized: {result['insights_added']} insights added")
+
+# Always save playbook
+save_playbook()
+```
+
+---
+
+## Context Engineering Components Reference
+
+| Component | Purpose | Location |
+|-----------|---------|----------|
+| **AtomizerPlaybook** | Knowledge store with helpful/harmful tracking | `optimization_engine/context/playbook.py` |
+| **AtomizerReflector** | Analyzes outcomes, extracts insights | `optimization_engine/context/reflector.py` |
+| **AtomizerSessionState** | Context isolation (exposed/isolated) | `optimization_engine/context/session_state.py` |
+| **FeedbackLoop** | Connects outcomes to playbook updates | `optimization_engine/context/feedback_loop.py` |
+| **CompactionManager** | Handles long sessions | `optimization_engine/context/compaction.py` |
+| **ContextCacheOptimizer** | KV-cache optimization | `optimization_engine/context/cache_monitor.py` |
+
+---
+
+## Quick Paths
+
+### "I just want to run an optimization"
+1. Initialize session state as RUN_OPTIMIZATION
+2. Load playbook items for [solver, convergence]
+3. Load OP_02_RUN_OPTIMIZATION.md
+4. After run, finalize feedback loop
+
+### "Something broke"
+1. Initialize session state as DEBUG_ERROR
+2. Load ALL mistake items from playbook
+3. Load OP_06_TROUBLESHOOT.md
+4. Record any new errors discovered
+
+### "What did my optimization find?"
+1. Initialize session state as ANALYZE_RESULTS
+2. Load OP_04_ANALYZE_RESULTS.md
+3. Query the study database
+4. Generate report
+
+---
+
+## Key Constraints (Always Apply)
+
+1. **Python Environment**: Always use `conda activate atomizer`
+2. **Never modify master files**: Copy NX files to study working directory first
+3. **Code reuse**: Check `optimization_engine/extractors/` before writing new extraction code
+4. **Validation**: Always validate config before running optimization
+5. **Record immediately**: Don't wait until session end to record insights
+6. **Save playbook**: After every insight, save the playbook
+
+---
+
+## Migration from v2.0
+
+If upgrading from BOOTSTRAP v2.0:
+
+1. The LAC system is now superseded by AtomizerPlaybook
+2. Session insights are now structured PlaybookItems
+3. Helpful/harmful tracking replaces simple confidence scores
+4. Context is now explicitly exposed vs isolated
+
+The old LAC files in `knowledge_base/lac/` are still readable, but new insights should use the playbook system.
+
+---
+
+*Atomizer v3.0: Where engineers talk, AI optimizes, and the system learns.*
--- a/.claude/skills/01_CHEATSHEET.md
+++ b/.claude/skills/01_CHEATSHEET.md
@@ -34,6 +34,7 @@ requires_skills:
 | Add custom physics extractor | EXT_01 | Create in `optimization_engine/extractors/` |
 | Add lifecycle hook | EXT_02 | Create in `optimization_engine/plugins/` |
 | Generate physics insight | SYS_16 | `python -m optimization_engine.insights generate <study>` |
+| **Manage knowledge/playbook** | **SYS_17** | `from optimization_engine.context import AtomizerPlaybook` |

 ---

@@ -366,6 +367,7 @@ Without it, `UpdateFemodel()` runs but the mesh doesn't change!
 | 14 | Neural | Surrogate model acceleration |
 | 15 | Method Selector | Recommends optimization strategy |
 | 16 | Study Insights | Physics visualizations (Zernike, stress, modal) |
+| 17 | Context Engineering | ACE framework - self-improving knowledge system |

 ---

@@ -549,3 +551,106 @@ convert_custom_to_optuna(db_path, study_name)
 - Trial numbers **NEVER reset** across study lifetime
 - Surrogate predictions (5K per batch) are NOT logged as trials
 - Only FEA-validated results become trials
+
+---
+
+## Context Engineering Quick Reference (SYS_17)
+
+The ACE (Agentic Context Engineering) framework enables self-improving optimization through structured knowledge capture.
+
+### Core Components
+
+| Component | Purpose | Key Function |
+|-----------|---------|--------------|
+| **AtomizerPlaybook** | Structured knowledge store | `playbook.add_insight()`, `playbook.get_context_for_task()` |
+| **AtomizerReflector** | Extracts insights from outcomes | `reflector.analyze_outcome()` |
+| **AtomizerSessionState** | Context isolation (exposed/isolated) | `session.get_llm_context()` |
+| **FeedbackLoop** | Automated learning | `feedback.process_trial_result()` |
+| **CompactionManager** | Long-session handling | `compactor.maybe_compact()` |
+| **ContextCacheOptimizer** | KV-cache optimization | `optimizer.track_completion()` |
+
+### Python API Quick Reference
+
+```python
+from pathlib import Path
+
+from optimization_engine.context import (
+    AtomizerPlaybook, AtomizerReflector, get_session,
+    InsightCategory, TaskType, FeedbackLoop
+)
+
+# Load playbook
+playbook_path = Path("knowledge_base/playbook.json")
+playbook = AtomizerPlaybook.load(playbook_path)
+
+# Add an insight
+playbook.add_insight(
+    category=InsightCategory.STRATEGY,  # str, mis, tool, cal, dom, wf
+    content="CMA-ES converges faster on smooth mirror surfaces",
+    tags=["mirror", "sampler", "convergence"]
+)
+playbook.save(Path("knowledge_base/playbook.json"))
+
+# Get context for LLM
+context = playbook.get_context_for_task(
+    task_type="optimization",
+    max_items=15,
+    min_confidence=0.5
+)
+
+# Record feedback
+playbook.record_outcome(item_id="str_001", helpful=True)
+
+# Session state
+session = get_session()
+session.exposed.task_type = TaskType.RUN_OPTIMIZATION
+session.add_action("Started optimization run")
+llm_context = session.get_llm_context()
+
+# Feedback loop (automated learning)
+feedback = FeedbackLoop(playbook_path)
+feedback.process_trial_result(
+    trial_number=42,
+    params={'thickness': 10.5},
+    objectives={'mass': 5.2},
+    is_feasible=True
+)
+```
+
+### Insight Categories
+
+| Category | Code | Use For |
+|----------|------|---------|
+| Strategy | `str` | Optimization approaches that work |
+| Mistake | `mis` | Common errors to avoid |
+| Tool | `tool` | Tool usage patterns |
+| Calculation | `cal` | Formulas and calculations |
+| Domain | `dom` | FEA/NX domain knowledge |
+| Workflow | `wf` | Process patterns |
+
+### Playbook Item Format
+
+```
+[str_001] helpful=5 harmful=0 :: CMA-ES converges faster on smooth surfaces
+```
+
+- `net_score = helpful - harmful`
+- `confidence = helpful / (helpful + harmful)`
+- Items with `net_score < -3` are pruned
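The item format and the three scoring rules above can be exercised with a small parser. This is a minimal sketch, not the `playbook.py` implementation: `ParsedItem`, `parse_item`, and the regular expression are hypothetical names introduced here, and returning a neutral 0.5 confidence for an item with no feedback is an assumption, because the formula as written divides by zero when `helpful + harmful == 0`.

```python
import re
from dataclasses import dataclass

# Matches lines such as: [str_001] helpful=5 harmful=0 :: some content
ITEM_PATTERN = re.compile(
    r"^\[(?P<id>\w+)\] helpful=(?P<helpful>\d+) "
    r"harmful=(?P<harmful>\d+) :: (?P<content>.+)$"
)


@dataclass
class ParsedItem:
    item_id: str
    helpful: int
    harmful: int
    content: str

    @property
    def net_score(self) -> int:
        return self.helpful - self.harmful

    @property
    def confidence(self) -> float:
        total = self.helpful + self.harmful
        # Assumption: items without feedback get a neutral 0.5.
        return self.helpful / total if total else 0.5

    @property
    def should_prune(self) -> bool:
        # Pruning rule from the list above: net_score < -3.
        return self.net_score < -3


def parse_item(line: str) -> ParsedItem:
    """Parse one playbook line in the documented text format."""
    match = ITEM_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"Not a playbook item line: {line!r}")
    return ParsedItem(
        item_id=match["id"],
        helpful=int(match["helpful"]),
        harmful=int(match["harmful"]),
        content=match["content"],
    )
```

Note that the pruning boundary is strict: an item with `helpful=0 harmful=3` has a net score of -3 and is kept, while one more harmful vote would remove it.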
+
+### REST API Endpoints
+
+| Endpoint | Method | Purpose |
+|----------|--------|---------|
+| `/api/context/playbook` | GET | Playbook summary stats |
+| `/api/context/playbook/items` | GET | List items with filters |
+| `/api/context/playbook/feedback` | POST | Record helpful/harmful |
+| `/api/context/playbook/insights` | POST | Add new insight |
+| `/api/context/playbook/prune` | POST | Remove harmful items |
+| `/api/context/session` | GET | Current session state |
+| `/api/context/learning/report` | GET | Comprehensive learning report |
+
+### Dashboard URL
+
+| Service | URL | Purpose |
+|---------|-----|---------|
+| Context API | `http://localhost:5000/api/context` | Playbook management |
+
+**Full documentation**: `docs/protocols/system/SYS_17_CONTEXT_ENGINEERING.md`