docs/protocols/system/SYS_18_CONTEXT_ENGINEERING.md

---
protocol_id: SYS_17
version: 1.0
last_updated: 2025-12-29
status: active
owner: system
code_dependencies:
  - optimization_engine.context.*
requires_protocols: []
---

# SYS_17: Context Engineering System

## Overview

The Context Engineering System implements the **Agentic Context Engineering (ACE)** framework, enabling Atomizer to learn from every optimization run and accumulate institutional knowledge over time.

## When to Load This Protocol

Load SYS_17 when:
- User asks about "learning", "playbook", or "context engineering"
- Debugging why certain knowledge isn't being applied
- Configuring context behavior
- Analyzing what the system has learned

## Core Concepts

### The ACE Framework

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Generator  │────▶│  Reflector  │────▶│   Curator   │
│ (Opt Runs)  │     │ (Analysis)  │     │ (Playbook)  │
└─────────────┘     └─────────────┘     └─────────────┘
       │                                       │
       └───────────── Feedback ───────────────┘
```

1. **Generator**: OptimizationRunner produces trial outcomes
2. **Reflector**: Analyzes outcomes, extracts patterns
3. **Curator**: Playbook stores and manages insights
4. **Feedback**: Success/failure updates insight scores
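
The four stages above can be sketched as a single loop. This is a toy illustration under stated assumptions, not the Atomizer implementation: `run_trial`, `reflect`, `ace_loop`, and the dict-based playbook are hypothetical stand-ins for `OptimizationRunner`, `AtomizerReflector`, and `AtomizerPlaybook`.

```python
from dataclasses import dataclass


@dataclass
class Insight:
    content: str
    helpful: int = 0
    harmful: int = 0


def run_trial(trial_number: int) -> dict:
    """Generator stand-in: pretend every third trial fails to converge."""
    failed = trial_number % 3 == 0
    return {"trial": trial_number, "success": not failed,
            "error": "convergence" if failed else None}


def reflect(outcome: dict) -> list[str]:
    """Reflector stand-in: turn a failed outcome into a candidate insight."""
    if outcome["success"]:
        return []
    return [f"Trials can fail with a {outcome['error']} error"]


def ace_loop(n_trials: int) -> dict[str, Insight]:
    playbook: dict[str, Insight] = {}  # Curator stand-in, keyed by content
    for n in range(1, n_trials + 1):
        active = list(playbook.values())              # insights in context for this trial
        outcome = run_trial(n)                        # 1. Generator
        for text in reflect(outcome):                 # 2. Reflector
            playbook.setdefault(text, Insight(text))  # 3. Curator (deduplicated)
        for item in active:                           # 4. Feedback
            if outcome["success"]:
                item.helpful += 1
            else:
                item.harmful += 1
    return playbook
```

Only insights that were already in the playbook when a trial started receive credit or blame for that trial, which is the point of the feedback arrow in the diagram.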

### Playbook Item Structure

```
[str-00001] helpful=8 harmful=0 :: "Use shell elements for thin walls"
  │           │          │            │
  │           │          │            └── Insight content
  │           │          └── Times advice led to failure
  │           └── Times advice led to success
  └── Unique ID (category-number)
```
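
As a rough sketch of the record behind that line, the fields can be modelled as a small dataclass. The class name `PlaybookItem`, the `render` method, and the definition of `net_score` as helpful minus harmful are assumptions made for illustration; the real class in `playbook.py` may differ.

```python
from dataclasses import dataclass


@dataclass
class PlaybookItem:
    id: str           # e.g. "str-00001" (category code + sequence number)
    content: str      # the insight text
    helpful: int = 0  # times the advice led to success
    harmful: int = 0  # times the advice led to failure

    @property
    def net_score(self) -> int:
        # Assumption: net score is simply helpful minus harmful.
        return self.helpful - self.harmful

    def render(self) -> str:
        """Format the item the way it is shown in the diagram above."""
        return f'[{self.id}] helpful={self.helpful} harmful={self.harmful} :: "{self.content}"'


item = PlaybookItem("str-00001", "Use shell elements for thin walls", helpful=8)
print(item.render())
```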

### Categories

| Code | Name | Description | Example |
|------|------|-------------|---------|
| `str` | STRATEGY | Optimization approaches | "Start with TPE, switch to CMA-ES" |
| `mis` | MISTAKE | Things to avoid | "Don't use coarse mesh for stress" |
| `tool` | TOOL | Tool usage tips | "Use GP sampler for few-shot" |
| `cal` | CALCULATION | Formulas | "Safety factor = yield/max_stress" |
| `dom` | DOMAIN | Domain knowledge | "Zernike coefficients for mirrors" |
| `wf` | WORKFLOW | Workflow patterns | "Load _i.prt before UpdateFemodel()" |

## Key Components

### 1. AtomizerPlaybook

Location: `optimization_engine/context/playbook.py`

The central knowledge store. Handles:
- Adding insights (with auto-deduplication)
- Recording helpful/harmful outcomes
- Generating filtered context for LLM
- Pruning consistently harmful items
- Persistence (JSON)

**Quick Usage:**
```python
from optimization_engine.context import get_playbook, save_playbook, InsightCategory

playbook = get_playbook()
playbook.add_insight(InsightCategory.STRATEGY, "Use shell elements for thin walls")
playbook.record_outcome("str-00001", helpful=True)
save_playbook()
```
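
To make the deduplication and pruning responsibilities concrete, here is a minimal in-memory sketch. `TinyPlaybook`, `normalize`, and the `max_net_harm` threshold are hypothetical; the actual matching rule and pruning criterion used by `AtomizerPlaybook` are not documented in this protocol.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation/whitespace so near-identical insights match."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


class TinyPlaybook:
    def __init__(self) -> None:
        self.items: dict[str, dict] = {}     # id -> {"content", "helpful", "harmful"}
        self._by_text: dict[str, str] = {}   # normalized content -> id
        self._next = 1                       # never reused, even after pruning

    def add_insight(self, code: str, content: str) -> str:
        key = normalize(content)
        if key in self._by_text:             # duplicate: return the existing ID
            return self._by_text[key]
        item_id = f"{code}-{self._next:05d}"
        self._next += 1
        self.items[item_id] = {"content": content, "helpful": 0, "harmful": 0}
        self._by_text[key] = item_id
        return item_id

    def record_outcome(self, item_id: str, helpful: bool) -> None:
        self.items[item_id]["helpful" if helpful else "harmful"] += 1

    def prune(self, max_net_harm: int = 3) -> list[str]:
        """Drop items whose harmful count exceeds helpful by at least max_net_harm."""
        doomed = [i for i, it in self.items.items()
                  if it["harmful"] - it["helpful"] >= max_net_harm]
        for item_id in doomed:
            key = normalize(self.items.pop(item_id)["content"])
            self._by_text.pop(key, None)
        return doomed
```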

### 2. AtomizerReflector

Location: `optimization_engine/context/reflector.py`

Analyzes optimization outcomes to extract insights:
- Classifies errors (convergence, mesh, singularity, etc.)
- Extracts success patterns
- Generates study-level insights

**Quick Usage:**
```python
from optimization_engine.context import AtomizerReflector, OptimizationOutcome, get_playbook

playbook = get_playbook()
reflector = AtomizerReflector(playbook)
outcome = OptimizationOutcome(trial_number=42, success=True, ...)
insights = reflector.analyze_trial(outcome)  # extract candidate insights
reflector.commit_insights()                  # write them to the playbook
```
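
Error classification can be as simple as keyword matching on the solver message. The sketch below assumes that approach; the `ERROR_PATTERNS` table and `classify_error` are hypothetical, and the keywords are illustrative rather than taken from `reflector.py` or from real solver output.

```python
# Hypothetical keyword table: error class -> substrings that indicate it.
ERROR_PATTERNS = {
    "convergence": ("did not converge", "convergence"),
    "mesh": ("mesh", "element distortion"),
    "singularity": ("singular", "pivot"),
}


def classify_error(message: str) -> str:
    """Return the first error class whose keywords appear in the message."""
    text = message.lower()
    for label, keywords in ERROR_PATTERNS.items():
        if any(k in text for k in keywords):
            return label
    return "unknown"
```

Because the first matching class wins, the order of entries in the table decides how ambiguous messages are labelled.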

### 3. FeedbackLoop

Location: `optimization_engine/context/feedback_loop.py`

Automated learning loop that:
- Processes trial results
- Updates playbook scores based on outcomes
- Tracks which items were active per trial
- Finalizes learning at study end

**Quick Usage:**
```python
from pathlib import Path

from optimization_engine.context import FeedbackLoop

playbook_path = Path("knowledge_base/playbook.json")  # see Files Reference
feedback = FeedbackLoop(playbook_path)
feedback.process_trial_result(trial_number=42, success=True, ...)
feedback.finalize_study({"name": "study", "total_trials": 100, ...})
```
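
The core idea, crediting or blaming the items that were active during a trial, can be sketched as follows. `TinyFeedbackLoop` is a hypothetical stand-in that mirrors the method names above; the real signatures and the return value of `finalize_study` are assumptions.

```python
from collections import defaultdict


class TinyFeedbackLoop:
    def __init__(self) -> None:
        # item id -> outcome counters
        self.scores = defaultdict(lambda: {"helpful": 0, "harmful": 0})
        # trial number -> item ids that were in context for that trial
        self.active_by_trial: dict[int, list[str]] = {}

    def process_trial_result(self, trial_number: int, success: bool,
                             context_items_used: list[str]) -> None:
        self.active_by_trial[trial_number] = list(context_items_used)
        counter = "helpful" if success else "harmful"
        for item_id in context_items_used:
            self.scores[item_id][counter] += 1

    def finalize_study(self) -> dict:
        """Summarize what was learned over the whole study."""
        return {"total_trials": len(self.active_by_trial),
                "items_scored": len(self.scores)}
```

If `context_items_used` is empty, no item is scored, which is why tracking active items per trial matters.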

### 4. SessionState

Location: `optimization_engine/context/session_state.py`

Manages context isolation:
- **Exposed**: Always in LLM context (task type, recent actions, errors)
- **Isolated**: On-demand access (full history, NX paths, F06 content)

**Quick Usage:**
```python
from optimization_engine.context import get_session, TaskType

session = get_session()
session.exposed.task_type = TaskType.RUN_OPTIMIZATION
session.add_action("Started trial 42")
context = session.get_llm_context()
```
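
The exposed/isolated split can be pictured as two containers with different visibility. In this sketch, `Exposed`, `Isolated`, `TinySession`, and the `max_recent` window are hypothetical names and defaults chosen for illustration; only the general behaviour (a short exposed tail, a full isolated history) follows the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Exposed:
    task_type: str = "unknown"
    recent_actions: list[str] = field(default_factory=list)
    recent_errors: list[str] = field(default_factory=list)


@dataclass
class Isolated:
    full_history: list[str] = field(default_factory=list)
    nx_paths: dict[str, str] = field(default_factory=dict)


@dataclass
class TinySession:
    exposed: Exposed = field(default_factory=Exposed)
    isolated: Isolated = field(default_factory=Isolated)
    max_recent: int = 3  # assumed window size; must be >= 1

    def add_action(self, action: str) -> None:
        self.isolated.full_history.append(action)            # everything is kept here
        self.exposed.recent_actions.append(action)
        del self.exposed.recent_actions[:-self.max_recent]   # only the tail stays exposed

    def get_llm_context(self) -> str:
        """Render only the exposed state; isolated data is fetched on demand."""
        lines = [f"Task: {self.exposed.task_type}"]
        lines += [f"- {a}" for a in self.exposed.recent_actions]
        return "\n".join(lines)
```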

### 5. CompactionManager

Location: `optimization_engine/context/compaction.py`

Handles long sessions:
- Triggers compaction at threshold (default 50 events)
- Summarizes old events into statistics
- Preserves errors and milestones
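
A minimal sketch of that behaviour, assuming events are dicts with a `kind` field: once the log reaches the threshold, older routine events collapse into a count while errors and milestones survive. The `compact` function, the `keep_recent` parameter, and the event shape are hypothetical.

```python
from collections import Counter

PRESERVED_KINDS = ("error", "milestone")


def compact(events: list[dict], threshold: int = 50, keep_recent: int = 10) -> list[dict]:
    """Summarize old routine events once the log reaches the threshold."""
    if len(events) < threshold:
        return events
    old, recent = events[:-keep_recent], events[-keep_recent:]
    # Errors and milestones are kept verbatim.
    preserved = [e for e in old if e["kind"] in PRESERVED_KINDS]
    # Everything else is reduced to per-kind counts.
    counts = Counter(e["kind"] for e in old if e["kind"] not in PRESERVED_KINDS)
    summary = {"kind": "summary", "counts": dict(counts)}
    return [summary] + preserved + recent
```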

### 6. CacheOptimizer

Location: `optimization_engine/context/cache_monitor.py`

Structures the context for KV-cache reuse:
- Three-tier context structure (stable/semi-stable/dynamic)
- Tracks cache hit rate
- Estimates cost savings
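
The reason for the three tiers is that a KV-cache can only reuse a prompt's unchanged prefix, so the least volatile content should come first. The sketch below illustrates this with hypothetical helpers (`build_context`, `shared_prefix_len`, `HitRateTracker`) and uses characters as a crude proxy for tokens; it is not how `cache_monitor.py` measures hits.

```python
def build_context(stable: str, semi_stable: str, dynamic: str) -> str:
    """Order tiers from least to most volatile so the shared prefix stays long."""
    return "\n\n".join([stable, semi_stable, dynamic])


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading characters two prompts have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


class HitRateTracker:
    """Tracks what fraction of each prompt could have been served from cache."""

    def __init__(self) -> None:
        self.reused = 0
        self.total = 0

    def record(self, previous: str, current: str) -> None:
        self.reused += shared_prefix_len(previous, current)
        self.total += len(current)

    @property
    def hit_rate(self) -> float:
        return self.reused / self.total if self.total else 0.0
```

Putting the dynamic tier first instead would change the very first characters on every call and drive the reusable prefix toward zero.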

## Integration with OptimizationRunner

### Option 1: Mixin

```python
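from optimization_engine.context.runner_integration import ContextEngineeringMixin
# OptimizationRunner must also be imported from the project's runner module.

class MyRunner(ContextEngineeringMixin, OptimizationRunner):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Set up context engineering after the base runner is initialized
        self.init_context_engineering()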
```

### Option 2: Wrapper

```python
from optimization_engine.context.runner_integration import ContextAwareRunner

runner = OptimizationRunner(config_path=...)
context_runner = ContextAwareRunner(runner)
context_runner.run(n_trials=100)
```
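
The difference between the two options is inheritance versus delegation. As a rough sketch of the wrapper approach, with `FakeRunner` and `TinyContextAwareRunner` as hypothetical stand-ins (the real `ContextAwareRunner` hooks are not described here):

```python
class FakeRunner:
    """Stand-in for OptimizationRunner: returns one success flag per trial."""

    def run(self, n_trials: int) -> list[bool]:
        return [n % 2 == 0 for n in range(n_trials)]


class TinyContextAwareRunner:
    """Wraps a runner without subclassing it and records outcomes afterwards."""

    def __init__(self, runner) -> None:
        self.runner = runner
        self.outcomes: list[bool] = []

    def run(self, n_trials: int) -> list[bool]:
        results = self.runner.run(n_trials)  # delegate the actual work
        self.outcomes.extend(results)        # hook: hand results to the learning side
        return results
```

A wrapper leaves the wrapped runner class untouched, which helps when the runner cannot be subclassed; the mixin gives access to the runner's internals instead.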

## Dashboard API

Base URL: `/api/context`

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/playbook` | GET | Playbook summary |
| `/playbook/items` | GET | List items (with filters) |
| `/playbook/items/{id}` | GET | Get specific item |
| `/playbook/feedback` | POST | Record helpful/harmful |
| `/playbook/insights` | POST | Add new insight |
| `/playbook/prune` | POST | Prune harmful items |
| `/playbook/context` | GET | Get LLM context string |
| `/session` | GET | Session state |
| `/learning/report` | GET | Learning report |
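
For illustration, a feedback call could be built with the standard library as shown below. The host, port, and JSON field names (`item_id`, `helpful`) are assumptions, not the documented request schema; check the dashboard's route definitions before relying on them.

```python
import json
from urllib.parse import urljoin
from urllib.request import Request

BASE = "http://localhost:8000/api/context/"  # assumed host and port


def feedback_request(item_id: str, helpful: bool) -> Request:
    """Build (but do not send) a POST to /playbook/feedback."""
    # Hypothetical payload shape.
    body = json.dumps({"item_id": item_id, "helpful": helpful}).encode("utf-8")
    return Request(
        urljoin(BASE, "playbook/feedback"),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it to a running dashboard.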

## Best Practices

### 1. Record Immediately

Don't wait until session end:
```python
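# RIGHT: Record immediately, then persist
playbook.add_insight(InsightCategory.MISTAKE, "Convergence failed with X")
playbook.save(path)  # path: the playbook JSON file, e.g. knowledge_base/playbook.json

# WRONG: Collect insights and save only at session end
# (if the user closes the session first, the learning is lost)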
```

### 2. Be Specific

```python
# GOOD: Specific and actionable
"For bracket optimization with >5 variables, TPE outperforms random search"

# BAD: Vague
"TPE is good"
```

### 3. Include Context

```python
playbook.add_insight(
    InsightCategory.STRATEGY,
    "Shell elements reduce solve time by 40% for thickness < 2mm",
    tags=["mesh", "shell", "performance"]
)
```

### 4. Review Harmful Items

Periodically check items with negative scores:
```python
harmful = [i for i in playbook.items.values() if i.net_score < 0]
for item in harmful:
    print(f"{item.id}: {item.content[:50]}... (score={item.net_score})")
```

## Troubleshooting

### Playbook Not Updating

1. Check playbook path:
```python
print(playbook_path)  # Should be knowledge_base/playbook.json
```

2. Verify save is called:
```python
playbook.save(path)  # Must be explicit
```

### Insights Not Appearing in Context

1. Check confidence threshold:
```python
# The default threshold is 0.5 and new items start at 0.5.
# Lower min_confidence to check whether items are being filtered out.
context = playbook.get_context_for_task("opt", min_confidence=0.3)
```

2. Check if items exist:
```python
print(f"Total items: {len(playbook.items)}")
```
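
One way a score that starts at 0.5 can arise is the ratio of helpful outcomes to all recorded outcomes, with 0.5 as the value for items that have no outcomes yet. The sketch below assumes that formula and an inclusive comparison; `confidence` and `filter_items` are hypothetical, and the real calculation in `playbook.py` may differ.

```python
def confidence(helpful: int, harmful: int) -> float:
    """Fraction of recorded outcomes that were helpful; 0.5 when there are none."""
    total = helpful + harmful
    return helpful / total if total else 0.5


def filter_items(items: dict[str, tuple[int, int]],
                 min_confidence: float = 0.5) -> list[str]:
    """Return IDs whose confidence meets the threshold (inclusive comparison assumed)."""
    return [item_id for item_id, (helpful, harmful) in items.items()
            if confidence(helpful, harmful) >= min_confidence]
```

Under these assumptions, an item with one success and three failures scores 0.25, so it is hidden at the default threshold and reappears once `min_confidence` drops to 0.25 or below.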

### Learning Not Working

1. Verify FeedbackLoop is finalized:
```python
feedback.finalize_study(...)  # MUST be called
```

2. Check context_items_used parameter:
```python
# Items must be explicitly tracked
feedback.process_trial_result(
    ...,
    context_items_used=list(playbook.items.keys())[:10]
)
```

## Files Reference

| File | Purpose |
|------|---------|
| `optimization_engine/context/__init__.py` | Module exports |
| `optimization_engine/context/playbook.py` | Knowledge store |
| `optimization_engine/context/reflector.py` | Outcome analysis |
| `optimization_engine/context/session_state.py` | Context isolation |
| `optimization_engine/context/feedback_loop.py` | Learning loop |
| `optimization_engine/context/compaction.py` | Long session management |
| `optimization_engine/context/cache_monitor.py` | KV-cache optimization |
| `optimization_engine/context/runner_integration.py` | Runner integration |
| `knowledge_base/playbook.json` | Persistent storage |

## See Also

- `docs/CONTEXT_ENGINEERING_REPORT.md` - Full implementation report
- `.claude/skills/00_BOOTSTRAP_V2.md` - Enhanced bootstrap
- `tests/test_context_engineering.py` - Unit tests
- `tests/test_context_integration.py` - Integration tests