fix(retrieval): enforce project-scoped context boundaries

2026-04-24 10:46:56 -04:00
parent c53e61eb67
commit c7212900b0
11 changed files with 737 additions and 68 deletions
--- a/docs/current-state.md
+++ b/docs/current-state.md
@@ -1,6 +1,7 @@
-# AtoCore — Current State (2026-04-22)
+# AtoCore - Current State (2026-04-24)

-Live deploy: `2712c5d` · Dalidou health: ok · Harness: 17/18 · Tests: 547 passing.
+Live deploy: `2b86543` · Dalidou health: ok · Harness: 18/20 with 1 known
+content gap and 1 current blocking project-bleed guard · Tests: 553 passing.

 ## V1-0 landed 2026-04-22

@@ -13,9 +14,8 @@ supersede) with Q-3 fail-open. Prod backfill ran cleanly — 31 legacy
 active/superseded entities flagged `hand_authored=1`, follow-up dry-run
 returned 0 remaining rows. Test count 533 → 547 (+14).

-R14 (P2, non-blocking): `POST /entities/{id}/promote` route fix translates
-the new `ValueError` into 400. Branch `claude/r14-promote-400` pending
-Codex review + squash-merge.
+R14 is closed: `POST /entities/{id}/promote` now translates the new
+caller-fixable V1-0 `ValueError` into HTTP 400.

 **Next in the V1 track:** V1-A (minimal query slice + Q-6 killer-correctness
 integration). Gated on pipeline soak (~2026-04-26) + 100+ active memory
@@ -65,10 +65,10 @@ Last nightly run (2026-04-19 03:00 UTC): **31 promoted · 39 rejected · 0 needs
 | 7G | Re-extraction on prompt version bump | pending |
 | 7H | Chroma vector hygiene (delete vectors for superseded memories) | pending |

-## Known gaps (honest)
+## Known gaps (honest, refreshed 2026-04-24)

 1. **Capture surface is Claude-Code-and-OpenClaw only.** Conversations in Claude Desktop, Claude.ai web, phone, or any other LLM UI are NOT captured. Example: the rotovap/mushroom chat yesterday never reached AtoCore because no hook fired. See Q4 below.
-2. **OpenClaw is capture-only, not context-grounded.** The plugin POSTs `/interactions` on `llm_output` but does NOT call `/context/build` on `before_agent_start`. OpenClaw's underlying agent runs blind. See Q2 below.
-3. **Human interface (wiki) is thin and static.** 5 project cards + a "System" line. No dashboard for the autonomous activity. No per-memory detail page. See Q3/Q5.
-4. **Harness 17/18** — the `p04-constraints` fixture wants "Zerodur" but retrieval surfaces related-not-exact terms. Content gap, not a retrieval regression.
-5. **Two projects under-populated**: p05-interferometer (4 memories, 18 state) and atomizer-v2 (1 memory, 6 state). Batch re-extract with the new llm-0.6.0 prompt would help.
+2. **Project-scoped retrieval still needs deployment verification.** The April 24 audit reproduced cross-project competition on broad p05 prompts. The current branch adds registry-aware project filtering and a harness guard; verify after deploy.
+3. **Human interface is useful but not yet the V1 Human Mirror.** Wiki/dashboard pages exist, but the spec routes, deterministic mirror files, disputed markers, and curated annotations remain V1-D work.
+4. **Harness known issue:** `p04-constraints` wants "Zerodur" and "1.2"; live retrieval surfaces related constraints but not those exact strings. Treat this as a content/state gap until fixed.
+5. **Formal docs lag the ledger during fast work.** Use `DEV-LEDGER.md` and `python scripts/live_status.py` for live truth, then copy verified claims into these docs.