docs: Day 5 — extractor scope + stale follow-ups cleaned
Documents the LLM-assisted extractor's in-scope / out-of-scope categories derived from the first live triage pass (16 promoted, 35 rejected). Five in-scope classes, six explicit out-of-scope classes, trust model summary, multi-model future direction. Cleaned up stale follow-up items in next-steps.md: rule expansion marked deprioritized, LLM extractor marked done, retrieval harness marked done with expansion pending. Fixed docstring timeout (45s -> 90s) in extractor_llm.py. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -226,14 +226,53 @@ candidate was a synthetic test capture from earlier in the session
|
|||||||
- Capture → reinforce is working correctly on live data (length-aware
|
- Capture → reinforce is working correctly on live data (length-aware
|
||||||
matcher verified on live paraphrase of a p04 memory).
|
matcher verified on live paraphrase of a p04 memory).
|
||||||
|
|
||||||
Follow-up candidates (not yet scheduled):
|
Follow-up candidates:
|
||||||
|
|
||||||
1. Extractor rule expansion — add conversational-form rules so real
|
1. ~~Extractor rule expansion~~ — Day 2 baseline showed 0% recall
|
||||||
session text has a chance of surfacing candidates.
|
across 5 distinct miss classes; rule expansion cannot close a
|
||||||
2. LLM-assisted extractor as a separate rule family, guarded by
|
5-way miss. Deprioritized.
|
||||||
confidence and always landing in `status=candidate` (never active).
|
2. ~~LLM-assisted extractor~~ — DONE 2026-04-12. `extractor_llm.py`
|
||||||
3. Retrieval eval harness — diffable scorecard of
|
shells out to `claude -p` (Haiku, OAuth, no API key). First live
|
||||||
`formatted_context` across a fixed question set per active project.
|
run: 100% recall, 2.55 yield/interaction on a 20-interaction
|
||||||
|
labeled set. First triage: 51 candidates → 16 promoted, 35
|
||||||
|
rejected (31% accept rate). Active memories 20 → 36.
|
||||||
|
3. ~~Retrieval eval harness~~ — DONE 2026-04-11 (scripts/retrieval_eval.py,
|
||||||
|
6/6 passing). Expansion to 15-20 fixtures is mini-phase Day 6.
|
||||||
|
|
||||||
|
## Extractor Scope — 2026-04-12
|
||||||
|
|
||||||
|
What the LLM-assisted extractor (`src/atocore/memory/extractor_llm.py`)
|
||||||
|
extracts from conversational Claude Code captures:
|
||||||
|
|
||||||
|
**In scope:**
|
||||||
|
|
||||||
|
- Architectural commitments (e.g. "Z-axis is engage/retract, not
|
||||||
|
continuous position")
|
||||||
|
- Ratified decisions with project scope (e.g. "USB SSD mandatory on
|
||||||
|
RPi for telemetry storage")
|
||||||
|
- Durable engineering facts (e.g. "telemetry data rate ~29 MB/hour")
|
||||||
|
- Working rules and adaptation patterns (e.g. "extraction stays off
|
||||||
|
the capture hot path")
|
||||||
|
- Interface invariants (e.g. "controller-job.v1 in, run-log.v1 out;
|
||||||
|
no firmware change needed")
|
||||||
|
|
||||||
|
**Out of scope (intentionally rejected by triage):**
|
||||||
|
|
||||||
|
- Transient roadmap / plan steps that will be stale in a week
|
||||||
|
- Operational instructions ("run this command to deploy")
|
||||||
|
- Process rules that live in DEV-LEDGER.md / AGENTS.md, not in memory
|
||||||
|
- Implementation details that are too granular (individual field names
|
||||||
|
when the parent concept is already captured)
|
||||||
|
- Already-fixed review findings (P1/P2 that no longer apply)
|
||||||
|
- Duplicates of existing active memories with wrong project tags
|
||||||
|
|
||||||
|
**Trust model:**
|
||||||
|
|
||||||
|
- Extraction stays off the capture hot path (batch / manual only)
|
||||||
|
- All candidates land as `status=candidate`, never auto-promoted
|
||||||
|
- Human or auto-triage reviews before promotion to active
|
||||||
|
- Future direction: multi-model extraction + triage (Codex/Gemini as
|
||||||
|
second-pass reviewers for robustness against single-model bias)
|
||||||
|
|
||||||
## Long-Run Goal
|
## Long-Run Goal
|
||||||
|
|
||||||
|
|||||||
@@ -29,7 +29,7 @@ Configuration:
|
|||||||
- ``ATOCORE_LLM_EXTRACTOR_MODEL`` overrides the model alias (default
|
- ``ATOCORE_LLM_EXTRACTOR_MODEL`` overrides the model alias (default
|
||||||
``haiku``).
|
``haiku``).
|
||||||
- ``ATOCORE_LLM_EXTRACTOR_TIMEOUT_S`` overrides the per-call timeout
|
- ``ATOCORE_LLM_EXTRACTOR_TIMEOUT_S`` overrides the per-call timeout
|
||||||
(default 45 seconds — first invocation is slow because Node.js
|
(default 90 seconds — first invocation is slow because Node.js
|
||||||
startup plus OAuth check is non-trivial).
|
startup plus OAuth check is non-trivial).
|
||||||
|
|
||||||
Implementation notes:
|
Implementation notes:
|
||||||
|
|||||||
Reference in New Issue
Block a user