fix(extraction): R11 container 503 + R12 shared prompt module

R11: POST /admin/extract-batch with mode=llm now returns 503 when the claude CLI is unavailable (previously it silently returned success with 0 candidates), with a message pointing at the host-side script. +2 tests.

R12: extracted SYSTEM_PROMPT, parse_llm_json_array, normalize_candidate_item, and build_user_message into stdlib-only src/atocore/memory/_llm_prompt.py. Both the container extractor and scripts/batch_llm_extract_live.py now import from it, eliminating the prompt/parser drift risk.

Tests 297 -> 299.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
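
The R11 behavior described above can be sketched as a small availability check. This is an illustrative sketch only, not the code in routes.py: the function name `extract_batch_status`, the injectable `which` parameter, the payload shape, and the message wording are all assumptions. Only the 503 status, the `claude` CLI check, and the host script path come from the commit message.

```python
import shutil
from typing import Callable, Optional, Tuple

# Host-side script path as named in the commit message.
HOST_SCRIPT = "scripts/batch_llm_extract_live.py"


def extract_batch_status(
    mode: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Tuple[int, dict]:
    """Return a hypothetical (HTTP status, JSON body) pair for a batch request.

    `which` is injectable so the CLI lookup can be faked in tests.
    """
    if mode == "llm" and which("claude") is None:
        # Surface the missing CLI instead of reporting success with 0 candidates.
        return 503, {
            "detail": (
                "LLM extraction is unavailable in this runtime: the claude CLI "
                f"is not on PATH. Run {HOST_SCRIPT} on the host instead."
            )
        }
    # Placeholder success payload; the real extraction work is out of scope here.
    return 200, {"mode": mode, "candidates": []}
```

Passing `which=lambda name: None` simulates a container without the CLI, which roughly mirrors the two cases the new tests are said to cover: the 503 path and the rule mode that still works.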
2026-04-15 10:47:01 -04:00
parent dc9fdd3a38
commit c2e7064238
6 changed files with 310 additions and 302 deletions
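
As a rough illustration of the R12 refactor, the following sketch shows what a stdlib-only shared parser could look like. The name `parse_llm_json_array` comes from the commit message, but the body and its tolerance rules (ignoring surrounding prose or code fences, returning an empty list on any failure) are assumptions and may differ from the code in `_llm_prompt.py`.

```python
import json


def parse_llm_json_array(raw: str) -> list:
    """Extract a JSON array from raw model output (assumed behavior).

    Returns an empty list when no well-formed array is found, so callers
    treat unparseable output as "no candidates".
    """
    # Ignore anything outside the outermost brackets, such as prose or fences.
    start = raw.find("[")
    end = raw.rfind("]")
    if start == -1 or end <= start:
        return []
    try:
        parsed = json.loads(raw[start : end + 1])
    except json.JSONDecodeError:
        return []
    return parsed if isinstance(parsed, list) else []


# Host-side usage, per the session log: put the module's directory on sys.path
# and import it as a top-level module, so the full atocore package is not needed.
# The exact path handling below is hypothetical.
#
#   import sys
#   sys.path.insert(0, "src/atocore/memory")
#   from _llm_prompt import SYSTEM_PROMPT, parse_llm_json_array
```

Because the module depends only on the standard library, the container extractor and the host script can share one prompt and one parser without the host installing the package's other dependencies.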
--- a/DEV-LEDGER.md
+++ b/DEV-LEDGER.md
@@ -6,10 +6,10 @@

 ## Orientation

- **live_sha** (Dalidou `/health` build_sha): `3f23ca1` (signal-aggressive extractor live; fix needs redeploy)
- **last_updated**: 2026-04-14 by Claude (OpenClaw importer live, Karpathy upgrades shipped)
- **main_tip**: `58ea21d`
- **test_count**: 297 passing (+7 engineering layer tests)
+- **live_sha** (Dalidou `/health` build_sha): `58ea21d` (verified 2026-04-14 via /health)
+- **last_updated**: 2026-04-14 by Claude (R11+R12 closed, R3 declined)
+- **main_tip**: `dc9fdd3` (pre-R11/R12 commit; new commit pending for this session)
+- **test_count**: 299 passing (+2 R11 api-503 tests)
 - **harness**: `17/18 PASS` (only p06-tailscale — chunk bleed)
 - **vectors**: 33,253
 - **active_memories**: 84 (31 project, 23 knowledge, 10 episodic, 8 adaptation, 7 preference, 5 identity)
@@ -131,7 +131,7 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
 |-----|--------|----------|------------------------------------|-------------------------------------------------------------------------|--------------|--------|------------|-------------|
 | R1  | Codex  | P1       | deploy/hooks/capture_stop.py:76-85 | Live Claude capture still omits `extract`, so "loop closed both sides" remains overstated in practice even though the API supports it | fixed        | Claude | 2026-04-11 | c67bec0     |
 | R2  | Codex  | P1       | src/atocore/context/builder.py     | Project memories excluded from pack                                     | fixed        | Claude | 2026-04-11 | 8ea53f4     |
-| R3  | Claude | P2       | src/atocore/memory/extractor.py    | Rule cues (`## Decision:`) never fire on conversational LLM text        | open         | Claude | 2026-04-11 |             |
+| R3  | Claude | P2       | src/atocore/memory/extractor.py    | Rule cues (`## Decision:`) never fire on conversational LLM text        | declined     | Claude | 2026-04-11 | see 2026-04-14 session log |
 | R4  | Codex  | P2       | DEV-LEDGER.md:11                   | Orientation `main_tip` was stale versus `HEAD` / `origin/main`          | fixed        | Codex  | 2026-04-11 | 81307ce     |
 | R5  | Codex  | P1       | src/atocore/interactions/service.py:157-174 | The deployed extraction path still calls only the rule extractor; the new LLM extractor is eval/script-only, so Day 4 "gate cleared" is true as a benchmark result but not as an operational extraction path | fixed        | Claude | 2026-04-12 | c67bec0     |
 | R6  | Codex  | P1       | src/atocore/memory/extractor_llm.py:258-276 | LLM extraction accepts model-supplied `project` verbatim with no fallback to `interaction.project`; live triage promoted a clearly p06 memory (offline/network rule) as project=`""`, which explains the p06-offline-design harness miss and falsifies the current "all 3 failures are budget-contention" claim | fixed        | Claude | 2026-04-12 | 39d73e9     |
@@ -139,8 +139,8 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
 | R8  | Codex  | P2       | tests/test_extractor_llm.py:1-7    | LLM extractor tests stop at parser/failure contracts; there is no automated coverage for the script-only persistence/review path that produced the 16 promoted memories, including project-scope preservation | fixed        | Claude | 2026-04-12 | 69c9717     |
 | R9  | Codex  | P2       | src/atocore/memory/extractor_llm.py:258-259 | The R6 fallback only repairs empty project output. A wrong non-empty model project still overrides the interaction's known scope, so project attribution is improved but not yet trust-preserving. | fixed        | Claude | 2026-04-12 | e5e9a99     |
 | R10 | Codex  | P2       | docs/master-plan-status.md:31-33   | "Phase 8 - OpenClaw Integration" is fair as a baseline milestone, but not as a "primary" integration claim. `t420-openclaw/atocore.py` currently covers a narrow read-oriented subset (13 request shapes vs 32 API routes) plus fail-open health, while memory/interactions/admin write paths remain out of surface. | open         | Claude | 2026-04-12 |             |
-| R11 | Codex  | P2       | src/atocore/api/routes.py:773-845  | `POST /admin/extract-batch` still accepts `mode="llm"` inside the container and returns a successful 0-candidate result instead of surfacing that host-only LLM extraction is unavailable from this runtime. That is a misleading API contract for operators. | open         | Claude | 2026-04-12 |             |
-| R12 | Codex  | P2       | scripts/batch_llm_extract_live.py:39-190 | The host-side extractor duplicates the LLM system prompt and JSON parsing logic from `src/atocore/memory/extractor_llm.py`. It works today, but this is now a prompt/parser drift risk across the container and host implementations. | open         | Claude | 2026-04-12 |             |
+| R11 | Codex  | P2       | src/atocore/api/routes.py:773-845  | `POST /admin/extract-batch` still accepts `mode="llm"` inside the container and returns a successful 0-candidate result instead of surfacing that host-only LLM extraction is unavailable from this runtime. That is a misleading API contract for operators. | fixed        | Claude | 2026-04-12 | (pending)   |
+| R12 | Codex  | P2       | scripts/batch_llm_extract_live.py:39-190 | The host-side extractor duplicates the LLM system prompt and JSON parsing logic from `src/atocore/memory/extractor_llm.py`. It works today, but this is now a prompt/parser drift risk across the container and host implementations. | fixed        | Claude | 2026-04-12 | (pending)   |
 | R13 | Codex  | P2       | DEV-LEDGER.md:12                    | The new `286 passing` test-count claim is not reproducibly auditable from the current audit environments: neither Dalidou nor the clean worktree has `pytest` available. The claim may be true in Claude's dev shell, but it remains unverified in this audit. | open         | Claude | 2026-04-12 |             |

 ## Recent Decisions
@@ -159,6 +159,8 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha

 ## Session Log

+- **2026-04-14 Claude (pm)** Closed R11+R12, declined R3. **R11 (fixed):** `POST /admin/extract-batch` with `mode="llm"` now returns 503 when the `claude` CLI is not on PATH, with a message pointing at the host-side script. Previously it silently returned a success-0 payload, masking host-vs-container truth. 2 new tests in `test_extraction_pipeline.py` cover the 503 path and the rule-mode-still-works path. **R12 (fixed):** extracted shared `SYSTEM_PROMPT` + `parse_llm_json_array` + `normalize_candidate_item` + `build_user_message` into stdlib-only `src/atocore/memory/_llm_prompt.py`. Both `src/atocore/memory/extractor_llm.py` (container) and `scripts/batch_llm_extract_live.py` (host) now import from it. The host script uses `sys.path` to reach the stdlib-only module without needing the full atocore package. Project-attribution policy stays path-specific (container uses registry-check; host defers to server). **R3 (declined):** rule cues not firing on conversational LLM text is by design now — the LLM extractor (llm-0.4.0) is the production path for conversational content as of the Day 4 gate (2026-04-12). Expanding rules to match conversational prose risks the FP blowup Day 2 already showed. Rule extractor stays narrow for structural PKM text. Tests 297 → 299. Live `/health` still `58ea21d`; this session's changes need deploy.
+
 - **2026-04-14 Claude** MAJOR session: Engineering knowledge layer V1 (Layer 2) built — entity + relationship tables, 15 types, 12 relationship kinds, 35 bootstrapped entities across p04/p05/p06. Human Mirror (Layer 3) — GET /projects/{name}/mirror.html + navigable wiki at /wiki with search. Karpathy-inspired upgrades: contradiction detection in triage, weekly lint pass, weekly synthesis pass producing "current state" paragraphs at top of project pages. Auto-detection of new projects from extraction. Registry persistence fix (ATOCORE_PROJECT_REGISTRY_DIR env var). abb-space/p08 aliases added, atomizer-v2 ingested (568 docs, +12,472 vectors). Identity/preference seed (6 new), signal-aggressive extractor rewrite (llm-0.4.0), auto vault refresh in cron. **OpenClaw one-way pull importer** built per codex proposal — reads /home/papa/clawd SOUL.md, USER.md, MEMORY.md, MODEL-ROUTING.md, memory/*.md via SSH, hash-delta import, pipeline triages. First import: 10 candidates → 10 promoted with lenient triage rule. Active memories 47→84. State entries 61→78. Tests 290→297. Dashboard at /admin/dashboard. Wiki at /wiki.