feat: Implement ACE Context Engineering framework (SYS_17)

Complete implementation of the Agentic Context Engineering (ACE) framework.

Core modules (optimization_engine/context/):
- playbook.py: AtomizerPlaybook with helpful/harmful scoring
- reflector.py: AtomizerReflector for insight extraction
- session_state.py: Context isolation (exposed/isolated state)
- feedback_loop.py: Automated learning from trial results
- compaction.py: Long-session context management
- cache_monitor.py: KV-cache optimization tracking
- runner_integration.py: OptimizationRunner integration

Dashboard integration:
- context.py: 12 REST API endpoints for playbook management

Tests:
- test_context_engineering.py: 44 unit tests
- test_context_integration.py: 16 integration tests

Documentation:
- CONTEXT_ENGINEERING_REPORT.md: Comprehensive implementation report
- CONTEXT_ENGINEERING_API.md: Complete API reference
- SYS_17_CONTEXT_ENGINEERING.md: System protocol
- Updated cheatsheet with SYS_17 quick reference
- Enhanced bootstrap (00_BOOTSTRAP_V2.md)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
docs/protocols/system/SYS_17_CONTEXT_ENGINEERING.md (new file, 307 lines)
---
protocol_id: SYS_17
version: 1.0
last_updated: 2025-12-29
status: active
owner: system
code_dependencies:
  - optimization_engine.context.*
requires_protocols: []
---
# SYS_17: Context Engineering System

## Overview

The Context Engineering System implements the **Agentic Context Engineering (ACE)** framework, enabling Atomizer to learn from every optimization run and accumulate institutional knowledge over time.

## When to Load This Protocol

Load SYS_17 when:

- User asks about "learning", "playbook", or "context engineering"
- Debugging why certain knowledge isn't being applied
- Configuring context behavior
- Analyzing what the system has learned

## Core Concepts

### The ACE Framework

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Generator  │────▶│  Reflector  │────▶│   Curator   │
│ (Opt Runs)  │     │ (Analysis)  │     │ (Playbook)  │
└─────────────┘     └─────────────┘     └─────────────┘
       │                                       │
       └───────────── Feedback ────────────────┘
```

1. **Generator**: OptimizationRunner produces trial outcomes
2. **Reflector**: Analyzes outcomes, extracts patterns
3. **Curator**: Playbook stores and manages insights
4. **Feedback**: Success/failure updates insight scores
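
The four steps above can be sketched as a single loop. The function and data shapes below are illustrative assumptions for exposition, not the actual `optimization_engine` API:

```python
# Hedged sketch of the Generator -> Reflector -> Curator -> Feedback cycle.
# Names here are simplified stand-ins; the real classes live in
# optimization_engine.context (AtomizerReflector, AtomizerPlaybook, ...).

def ace_cycle(outcomes, extract_insights, playbook):
    """One pass of the ACE loop over a batch of trial outcomes.

    outcomes:         list of dicts like {"success": bool, "used": [item_ids]}
    extract_insights: callable(outcome) -> list of new insight ids
    playbook:         dict mapping item id -> {"helpful": n, "harmful": n}
    """
    for outcome in outcomes:                     # Generator: trial outcome
        for item_id in extract_insights(outcome):  # Reflector: new insights
            playbook.setdefault(item_id, {"helpful": 0, "harmful": 0})
        key = "helpful" if outcome["success"] else "harmful"
        for item_id in outcome.get("used", []):  # Feedback: score updates
            if item_id in playbook:              # Curator: managed store
                playbook[item_id][key] += 1
    return playbook
```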

### Playbook Item Structure

```
[str-00001] helpful=8 harmful=0 :: "Use shell elements for thin walls"
    │           │         │        │
    │           │         │        └── Insight content
    │           │         └── Times advice led to failure
    │           └── Times advice led to success
    └── Unique ID (category-number)
```
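
As a reading aid, the line format above can be modeled with a small dataclass. Field and property names (`helpful`, `harmful`, `net_score`) follow this document, but the actual class in `playbook.py` may differ:

```python
# Minimal model of a playbook item matching the format shown above.
# This is an illustrative sketch, not the real AtomizerPlaybook item class.
from dataclasses import dataclass

@dataclass
class PlaybookItem:
    item_id: str      # e.g. "str-00001" (category-number)
    content: str      # the insight text
    helpful: int = 0  # times the advice led to success
    harmful: int = 0  # times the advice led to failure

    @property
    def net_score(self) -> int:
        # Net usefulness; items below zero are pruning candidates
        return self.helpful - self.harmful

    def __str__(self) -> str:
        return (f'[{self.item_id}] helpful={self.helpful} '
                f'harmful={self.harmful} :: "{self.content}"')
```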

### Categories

| Code | Name | Description | Example |
|------|------|-------------|---------|
| `str` | STRATEGY | Optimization approaches | "Start with TPE, switch to CMA-ES" |
| `mis` | MISTAKE | Things to avoid | "Don't use coarse mesh for stress" |
| `tool` | TOOL | Tool usage tips | "Use GP sampler for few-shot" |
| `cal` | CALCULATION | Formulas | "Safety factor = yield/max_stress" |
| `dom` | DOMAIN | Domain knowledge | "Zernike coefficients for mirrors" |
| `wf` | WORKFLOW | Workflow patterns | "Load _i.prt before UpdateFemodel()" |

## Key Components

### 1. AtomizerPlaybook

Location: `optimization_engine/context/playbook.py`

The central knowledge store. Handles:

- Adding insights (with auto-deduplication)
- Recording helpful/harmful outcomes
- Generating filtered context for the LLM
- Pruning consistently harmful items
- Persistence (JSON)

**Quick Usage:**
```python
from optimization_engine.context import get_playbook, save_playbook, InsightCategory

playbook = get_playbook()
playbook.add_insight(InsightCategory.STRATEGY, "Use shell elements for thin walls")
playbook.record_outcome("str-00001", helpful=True)
save_playbook()
```

### 2. AtomizerReflector

Location: `optimization_engine/context/reflector.py`

Analyzes optimization outcomes to extract insights:

- Classifies errors (convergence, mesh, singularity, etc.)
- Extracts success patterns
- Generates study-level insights

**Quick Usage:**
```python
from optimization_engine.context import AtomizerReflector, OptimizationOutcome

reflector = AtomizerReflector(playbook)
outcome = OptimizationOutcome(trial_number=42, success=True, ...)
insights = reflector.analyze_trial(outcome)
reflector.commit_insights()
```

### 3. FeedbackLoop

Location: `optimization_engine/context/feedback_loop.py`

Automated learning loop that:

- Processes trial results
- Updates playbook scores based on outcomes
- Tracks which items were active per trial
- Finalizes learning at study end

**Quick Usage:**
```python
from optimization_engine.context import FeedbackLoop

feedback = FeedbackLoop(playbook_path)
feedback.process_trial_result(trial_number=42, success=True, ...)
feedback.finalize_study({"name": "study", "total_trials": 100, ...})
```

### 4. SessionState

Location: `optimization_engine/context/session_state.py`

Manages context isolation:

- **Exposed**: Always in LLM context (task type, recent actions, errors)
- **Isolated**: On-demand access (full history, NX paths, F06 content)

**Quick Usage:**
```python
from optimization_engine.context import get_session, TaskType

session = get_session()
session.exposed.task_type = TaskType.RUN_OPTIMIZATION
session.add_action("Started trial 42")
context = session.get_llm_context()
```

### 5. CompactionManager

Location: `optimization_engine/context/compaction.py`

Handles long sessions:

- Triggers compaction at a threshold (default 50 events)
- Summarizes old events into statistics
- Preserves errors and milestones
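
The compaction policy above can be sketched as follows. This is a hedged illustration under assumed event shapes, not the real `compaction.py` implementation:

```python
# Illustrative compaction: once the event log exceeds a threshold, collapse
# old events into summary counts, keeping errors and milestones verbatim.
from collections import Counter

def compact_events(events, threshold=50, keep_recent=10):
    """events: list of dicts like {"kind": "action"|"error"|"milestone", "text": str}"""
    if len(events) <= threshold:
        return events                          # below threshold: untouched
    old, recent = events[:-keep_recent], events[-keep_recent:]
    # Preserve errors and milestones from the compacted region
    preserved = [e for e in old if e["kind"] in ("error", "milestone")]
    # Summarize everything else into per-kind statistics
    summary = {"kind": "summary", "text": dict(Counter(e["kind"] for e in old))}
    return [summary] + preserved + recent
```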

### 6. CacheOptimizer

Location: `optimization_engine/context/cache_monitor.py`

Optimizes for KV-cache:

- Three-tier context structure (stable/semi-stable/dynamic)
- Tracks cache hit rate
- Estimates cost savings
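
The idea behind the three-tier structure is that KV-caches reuse a shared prompt prefix, so the least frequently changing content should come first. The helpers below are a hedged sketch, not the `cache_monitor.py` API:

```python
# Illustrative three-tier context assembly: stable content (system prompt,
# playbook) first, dynamic content (current trial state) last, so successive
# LLM calls share the longest possible cacheable prefix.

def build_context(stable: str, semi_stable: str, dynamic: str) -> str:
    """Order tiers from least to most frequently changing."""
    return "\n\n".join([stable, semi_stable, dynamic])

def shared_prefix_len(a: str, b: str) -> int:
    """Rough proxy for cacheable content: length of the common prefix."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n
```

With this ordering, two calls that differ only in the dynamic tier share everything up to that tier.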

## Integration with OptimizationRunner

### Option 1: Mixin

```python
from optimization_engine.context.runner_integration import ContextEngineeringMixin

class MyRunner(ContextEngineeringMixin, OptimizationRunner):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.init_context_engineering()
```

### Option 2: Wrapper

```python
from optimization_engine.context.runner_integration import ContextAwareRunner

runner = OptimizationRunner(config_path=...)
context_runner = ContextAwareRunner(runner)
context_runner.run(n_trials=100)
```

## Dashboard API

Base URL: `/api/context`

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/playbook` | GET | Playbook summary |
| `/playbook/items` | GET | List items (with filters) |
| `/playbook/items/{id}` | GET | Get specific item |
| `/playbook/feedback` | POST | Record helpful/harmful |
| `/playbook/insights` | POST | Add new insight |
| `/playbook/prune` | POST | Prune harmful items |
| `/playbook/context` | GET | Get LLM context string |
| `/session` | GET | Session state |
| `/learning/report` | GET | Learning report |
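
As an example, the feedback endpoint can be called with the standard library alone. The field names (`item_id`, `helpful`) are an assumption inferred from the table above; check `CONTEXT_ENGINEERING_API.md` for the authoritative request schema:

```python
# Hedged example of recording feedback via POST /api/context/playbook/feedback.
# The JSON body shape here is assumed, not confirmed by this document.
import json
from urllib import request

def feedback_request(base_url: str, item_id: str, helpful: bool) -> request.Request:
    """Build (but do not send) the feedback POST request."""
    payload = json.dumps({"item_id": item_id, "helpful": helpful}).encode()
    return request.Request(
        f"{base_url}/api/context/playbook/feedback",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# To send against a running dashboard:
# with request.urlopen(feedback_request("http://localhost:8000", "str-00001", True)) as resp:
#     print(resp.status)
```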

## Best Practices

### 1. Record Immediately

Don't wait until session end:

```python
# RIGHT: Record immediately
playbook.add_insight(InsightCategory.MISTAKE, "Convergence failed with X")
playbook.save(path)

# WRONG: Wait until end
# (User might close session, learning lost)
```

### 2. Be Specific

```python
# GOOD: Specific and actionable
"For bracket optimization with >5 variables, TPE outperforms random search"

# BAD: Vague
"TPE is good"
```

### 3. Include Context

```python
playbook.add_insight(
    InsightCategory.STRATEGY,
    "Shell elements reduce solve time by 40% for thickness < 2mm",
    tags=["mesh", "shell", "performance"]
)
```

### 4. Review Harmful Items

Periodically check items with negative scores:

```python
harmful = [i for i in playbook.items.values() if i.net_score < 0]
for item in harmful:
    print(f"{item.id}: {item.content[:50]}... (score={item.net_score})")
```

## Troubleshooting

### Playbook Not Updating

1. Check the playbook path:
```python
print(playbook_path)  # Should be knowledge_base/playbook.json
```

2. Verify save is called:
```python
playbook.save(path)  # Must be explicit
```

### Insights Not Appearing in Context

1. Check the confidence threshold:
```python
# Default min_confidence is 0.5, and new items start at 0.5,
# so try lowering the threshold
context = playbook.get_context_for_task("opt", min_confidence=0.3)
```

2. Check if items exist:
```python
print(f"Total items: {len(playbook.items)}")
```

### Learning Not Working

1. Verify the FeedbackLoop is finalized:
```python
feedback.finalize_study(...)  # MUST be called
```

2. Check the context_items_used parameter:
```python
# Items must be explicitly tracked
feedback.process_trial_result(
    ...,
    context_items_used=list(playbook.items.keys())[:10]
)
```

## Files Reference

| File | Purpose |
|------|---------|
| `optimization_engine/context/__init__.py` | Module exports |
| `optimization_engine/context/playbook.py` | Knowledge store |
| `optimization_engine/context/reflector.py` | Outcome analysis |
| `optimization_engine/context/session_state.py` | Context isolation |
| `optimization_engine/context/feedback_loop.py` | Learning loop |
| `optimization_engine/context/compaction.py` | Long session management |
| `optimization_engine/context/cache_monitor.py` | KV-cache optimization |
| `optimization_engine/context/runner_integration.py` | Runner integration |
| `knowledge_base/playbook.json` | Persistent storage |

## See Also

- `docs/CONTEXT_ENGINEERING_REPORT.md` - Full implementation report
- `.claude/skills/00_BOOTSTRAP_V2.md` - Enhanced bootstrap
- `tests/test_context_engineering.py` - Unit tests
- `tests/test_context_integration.py` - Integration tests