diff --git a/DEV-LEDGER.md b/DEV-LEDGER.md index 4a2825d..4f43963 100644 --- a/DEV-LEDGER.md +++ b/DEV-LEDGER.md @@ -6,11 +6,12 @@ ## Orientation -- **live_sha** (Dalidou `/health` build_sha): `38f6e52` -- **last_updated**: 2026-04-11 by Claude (Day 1+2 eval on working branch, Day 4 gate escalated) -- **main_tip**: `d9dc55f` (unchanged; Day 1+2 artifacts live on `claude/extractor-eval-loop @ 7d8d599`) -- **test_count**: 264 passing +- **live_sha** (Dalidou `/health` build_sha): `95daa5c` +- **last_updated**: 2026-04-12 by Claude (Day 4 complete, first triage done, 36 active memories) +- **main_tip**: `06792d8` +- **test_count**: 278 passing - **harness**: `6/6 PASS` (`python scripts/retrieval_eval.py` against live Dalidou) +- **active_memories**: 36 (was 20 before triage; p06-polisher 2->16, atocore 0->5) - **off_host_backup**: `papa@192.168.86.39:/home/papa/atocore-backups/` via cron env `ATOCORE_BACKUP_RSYNC`, verified ## Active Plan @@ -126,6 +127,10 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha ## Recent Decisions +- **2026-04-12** Day 4 gate cleared: LLM-assisted extraction via `claude -p` (OAuth, no API key) is the path forward. Rule extractor stays as default for structural cues. *Proposed by:* Claude. *Ratified by:* Antoine. +- **2026-04-12** First live triage: 16 promoted, 35 rejected from 51 LLM-extracted candidates. 31% accept rate. Active memory count 20->36. *Executed by:* Claude. *Ratified by:* Antoine. +- **2026-04-12** No API keys allowed in AtoCore — LLM-assisted features use OAuth via `claude -p` or equivalent CLI-authenticated paths. *Proposed by:* Antoine. +- **2026-04-12** Multi-model extraction direction: extraction/triage should be model-agnostic, with Codex/Gemini/Ollama as second-pass reviewers for robustness. *Proposed by:* Antoine. - **2026-04-11** Adopt this ledger as shared operating memory between Claude and Codex. *Proposed by:* Antoine. *Ratified by:* Antoine. - **2026-04-11** Accept Codex's 8-day mini-phase plan verbatim as Active Plan. *Proposed by:* Codex. *Ratified by:* Antoine. - **2026-04-11** Review findings live in `DEV-LEDGER.md` with Codex owning finding text and Claude updating status fields only. *Proposed by:* Codex. *Ratified by:* Antoine. @@ -136,6 +141,7 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha ## Session Log +- **2026-04-12 Claude** `330ecfb..06792d8` (merged eval-loop branch + triage). Day 1-4 of the mini-phase completed in one session. Day 2 baseline: rule extractor 0% recall, 5 distinct miss classes. Day 4 gate cleared: LLM extractor (claude -p haiku, OAuth) hit 100% recall, 2.55 yield/interaction. Refactored from anthropic SDK to subprocess after "no API key" rule. First live triage: 51 candidates -> 16 promoted, 35 rejected. Active memories 20->36. p06-polisher went from 2 to 16 memories (firmware/telemetry architecture set). POST /memory now accepts status field. Test count 264->278. - **2026-04-11 Claude** `claude/extractor-eval-loop @ 7d8d599` — Day 1+2 of the mini-phase. Froze a 64-interaction snapshot (`scripts/eval_data/interactions_snapshot_2026-04-11.json`) and labeled 20 by length-stratified random sample (5 positive, 15 zero; 7 total expected candidates). Built `scripts/extractor_eval.py` as a file-based eval runner. **Day 2 baseline: rule extractor hit 0% yield / 0% recall / 0% precision on the labeled set; 5 false negatives across 5 distinct miss classes (recommendation_prose, architectural_change_summary, spec_update_announcement, layered_recommendation, alignment_assertion).** This is the Day 4 hard-stop signal arriving two days early — a single rule expansion cannot close a 5-way miss, and widening rules blindly will collapse precision. The Day 4 decision gate is escalated to Antoine for ratification before Day 3 touches any extractor code. No extractor code on main has changed. - **2026-04-11 Codex (ledger audit)** fixed stale `main_tip`, retargeted R1 from the API surface to the live Claude Stop hook, and formalized the review write protocol so Claude can consume findings without rewriting them. - **2026-04-11 Claude** `b3253f3..59331e5` (1 commit). Wired the DEV-LEDGER, added session protocol to AGENTS.md, created project-local CLAUDE.md, deleted stale `codex/port-atocore-ops-client` remote branch. No code changes, no redeploy needed.