docs: Day 5 — extractor scope + stale follow-ups cleaned
Documents the LLM-assisted extractor's in-scope / out-of-scope categories derived from the first live triage pass (16 promoted, 35 rejected). Five in-scope classes, six explicit out-of-scope classes, trust model summary, multi-model future direction. Cleaned up stale follow-up items in next-steps.md: rule expansion marked deprioritized, LLM extractor marked done, retrieval harness marked done with expansion pending. Fixed docstring timeout (45s -> 90s) in extractor_llm.py. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -226,14 +226,53 @@ candidate was a synthetic test capture from earlier in the session
|
||||
- Capture → reinforce is working correctly on live data (length-aware
|
||||
matcher verified on live paraphrase of a p04 memory).
|
||||
|
||||
Follow-up candidates (not yet scheduled):
|
||||
Follow-up candidates:
|
||||
|
||||
1. Extractor rule expansion — add conversational-form rules so real
|
||||
session text has a chance of surfacing candidates.
|
||||
2. LLM-assisted extractor as a separate rule family, guarded by
|
||||
confidence and always landing in `status=candidate` (never active).
|
||||
3. Retrieval eval harness — diffable scorecard of
|
||||
`formatted_context` across a fixed question set per active project.
|
||||
1. ~~Extractor rule expansion~~ — Day 2 baseline showed 0% recall
|
||||
across 5 distinct miss classes; rule expansion cannot close a
|
||||
5-way miss. Deprioritized.
|
||||
2. ~~LLM-assisted extractor~~ — DONE 2026-04-12. `extractor_llm.py`
|
||||
shells out to `claude -p` (Haiku, OAuth, no API key). First live
|
||||
run: 100% recall, 2.55 yield/interaction on a 20-interaction
|
||||
labeled set. First triage: 51 candidates → 16 promoted, 35
|
||||
rejected (31% accept rate). Active memories 20 → 36.
|
||||
3. ~~Retrieval eval harness~~ — DONE 2026-04-11 (scripts/retrieval_eval.py,
|
||||
6/6 passing). Expansion to 15-20 fixtures is mini-phase Day 6.
|
||||
|
||||
## Extractor Scope — 2026-04-12
|
||||
|
||||
What the LLM-assisted extractor (`src/atocore/memory/extractor_llm.py`)
|
||||
extracts from conversational Claude Code captures:
|
||||
|
||||
**In scope:**
|
||||
|
||||
- Architectural commitments (e.g. "Z-axis is engage/retract, not
|
||||
continuous position")
|
||||
- Ratified decisions with project scope (e.g. "USB SSD mandatory on
|
||||
RPi for telemetry storage")
|
||||
- Durable engineering facts (e.g. "telemetry data rate ~29 MB/hour")
|
||||
- Working rules and adaptation patterns (e.g. "extraction stays off
|
||||
the capture hot path")
|
||||
- Interface invariants (e.g. "controller-job.v1 in, run-log.v1 out;
|
||||
no firmware change needed")
|
||||
|
||||
**Out of scope (intentionally rejected by triage):**
|
||||
|
||||
- Transient roadmap / plan steps that will be stale in a week
|
||||
- Operational instructions ("run this command to deploy")
|
||||
- Process rules that live in DEV-LEDGER.md / AGENTS.md, not in memory
|
||||
- Implementation details that are too granular (individual field names
|
||||
when the parent concept is already captured)
|
||||
- Already-fixed review findings (P1/P2 that no longer apply)
|
||||
- Duplicates of existing active memories with wrong project tags
|
||||
|
||||
**Trust model:**
|
||||
|
||||
- Extraction stays off the capture hot path (batch / manual only)
|
||||
- All candidates land as `status=candidate`, never auto-promoted
|
||||
- Human or auto-triage reviews before promotion to active
|
||||
- Future direction: multi-model extraction + triage (Codex/Gemini as
|
||||
second-pass reviewers for robustness against single-model bias)
|
||||
|
||||
## Long-Run Goal
|
||||
|
||||
|
||||
@@ -29,7 +29,7 @@ Configuration:
|
||||
- ``ATOCORE_LLM_EXTRACTOR_MODEL`` overrides the model alias (default
|
||||
``haiku``).
|
||||
- ``ATOCORE_LLM_EXTRACTOR_TIMEOUT_S`` overrides the per-call timeout
|
||||
(default 45 seconds — first invocation is slow because Node.js
|
||||
(default 90 seconds — first invocation is slow because Node.js
|
||||
startup plus OAuth check is non-trivial).
|
||||
|
||||
Implementation notes:
|
||||
|
||||
Reference in New Issue
Block a user