fix(retrieval): enforce project-scoped context boundaries
This commit is contained in:
@@ -7,16 +7,16 @@
|
||||
## Orientation
|
||||
|
||||
- **live_sha** (Dalidou `/health` build_sha): `2b86543` (verified 2026-04-23T15:20:53Z post-R14 deploy; status=ok)
|
||||
- **last_updated**: 2026-04-23 by Claude (R14 squash-merged + deployed; Orientation refreshed)
|
||||
- **last_updated**: 2026-04-24 by Codex (audit-improvements foundation branch; live status refreshed)
|
||||
- **main_tip**: `2b86543`
|
||||
- **test_count**: 548 (547 + 1 R14 regression test)
|
||||
- **harness**: `17/18 PASS` on live Dalidou (p04-constraints expects "Zerodur" — known content gap, not regression; consistent since 2026-04-19)
|
||||
- **test_count**: 553
|
||||
- **harness**: `18/20 PASS` on live Dalidou plus 1 known content gap and 1 blocking project-bleed guard pending deploy of this branch
|
||||
- **vectors**: 33,253
|
||||
- **active_memories**: 784 (up from 84 pre-density-batch — density gate CRUSHED vs V1-A's 100-target)
|
||||
- **candidate_memories**: 2 (triage queue drained)
|
||||
- **interactions**: 500+ (limit=2000 query returned 500 — density batch has been running; actual may be higher, confirm via /stats next update)
|
||||
- **active_memories**: 290 (`/admin/dashboard` 2026-04-24; note integrity panel reports a separate active_memory_count=951 and needs reconciliation)
|
||||
- **candidate_memories**: 0 (triage queue drained)
|
||||
- **interactions**: 950 (`/admin/dashboard` 2026-04-24)
|
||||
- **registered_projects**: atocore, p04-gigabit, p05-interferometer, p06-polisher, atomizer-v2, abb-space (aliased p08)
|
||||
- **project_state_entries**: 63 (atocore alone; full cross-project count not re-sampled this update)
|
||||
- **project_state_entries**: 128 across registered projects (`/admin/dashboard` 2026-04-24)
|
||||
- **entities**: 66 (up from 35 — V1-0 backfill + ongoing work; 0 open conflicts)
|
||||
- **off_host_backup**: `papa@192.168.86.39:/home/papa/atocore-backups/` via cron, verified
|
||||
- **nightly_pipeline**: backup → cleanup → rsync → OpenClaw import → vault refresh → extract → auto-triage → **auto-promote/expire (NEW)** → weekly synth/lint Sundays → **retrieval harness (NEW)** → **pipeline summary (NEW)**
|
||||
@@ -170,6 +170,10 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
|
||||
|
||||
## Session Log
|
||||
|
||||
- **2026-04-24 Codex (audit improvements foundation)** Started implementation of the audit recommendations on branch `codex/audit-improvements-foundation` from `origin/main@c53e61e`. First tranche: registry-aware project-scoped retrieval filtering (`ATOCORE_RANK_PROJECT_SCOPE_FILTER`, widened candidate pull before filtering), eval harness known-issue lane, two p05 project-bleed fixtures, `scripts/live_status.py`, README/current-state/master-plan status refresh. Verified `pytest -q`: 550 passed in 67.11s. Live retrieval harness against undeployed production: 20 fixtures, 18 pass, 1 known issue (`p04-constraints` Zerodur/1.2 content gap), 1 blocking guard (`p05-broad-status-no-atomizer`) still failing because production has not yet deployed the retrieval filter and currently pulls `P04-GigaBIT-M1-KB-design` into broad p05 status context. Live dashboard refresh: health ok, build `2b86543`, docs 1748, chunks/vectors 33253, interactions 948, active memories 289, candidates 0, project_state total 128. Noted count discrepancy: dashboard memories.active=289 while integrity active_memory_count=951; schedule reconciliation in a follow-up.
|
||||
|
||||
- **2026-04-24 Codex (independent-audit hardening)** Applied the Opus independent audit's fast follow-ups before merge/deploy. Closed the two P1s by making project-scope ownership path/tag-based only, adding path-segment/tag-exact matching to avoid short-alias substring collisions, and keeping title/heading text out of provenance decisions. Added regression tests for title poisoning, substring collision, and unknown-project fallback. Added retrieval log fields `raw_results_count`, `post_filter_count`, `post_filter_dropped`, and `underfilled`. Added retrieval-eval run metadata (`generated_at`, `base_url`, `/health`) and `live_status.py` auth-token/status support. README now documents the ranking knobs and clarifies that the hard scope filter and soft project match boost are separate controls. Verified `pytest -q`: 553 passed in 66.07s. Live production remains expected-predeploy: 20 fixtures, 18 pass, 1 known content gap, 1 blocking p05 bleed guard. Latest live dashboard: build `2b86543`, docs 1748, chunks/vectors 33253, interactions 950, active memories 290, candidates 0, project_state total 128.
|
||||
|
||||
- **2026-04-23 Codex + Claude (R14 closed)** Codex reviewed `claude/r14-promote-400` at `3888db9`, no findings: "The route change is narrowly scoped: `promote_entity()` still returns False for not-found/not-candidate cases, so the existing 404 behavior remains intact, while caller-fixable validation failures now surface as 400." Ran `pytest tests/test_v1_0_write_invariants.py -q` from an isolated worktree: 15 passed in 1.91s. Claude squash-merged to main as `0989fed`, followed by ledger close-out `2b86543`, then deployed via canonical script. Dalidou `/health` reports build_sha=`2b86543e6ad26011b39a44509cc8df3809725171`, build_time `2026-04-23T15:20:53Z`, status=ok. R14 closed. Orientation refreshed earlier this session also reflected the V1-A gate status: **density gate CLEARED** (784 active memories vs 100 target — density batch-extract ran between 2026-04-22 and 2026-04-23 and more than crushed the gate), **soak gate at day 5 of ~7** (F4 first run 2026-04-19; nightly clean 2026-04-19 through 2026-04-23; only chronic failure is the known p04-constraints "Zerodur" content gap). V1-A branches from a clean V1-0 baseline as soon as the soak is called done.
|
||||
|
||||
- **2026-04-22 Codex + Antoine (V1-0 closed)** Codex approved `f16cd52` after re-running both original probes (legacy-candidate promote + supersede hook — both correct) and the three targeted regression suites (`test_v1_0_write_invariants.py`, `test_engineering_v1_phase5.py`, `test_inbox_crossproject.py` — all pass). Squash-merged to main as `2712c5d` ("feat(engineering): enforce V1-0 write invariants"). Deployed to Dalidou via the canonical deploy script; `/health` build_sha=`2712c5d2d03cb2a6af38b559664afd1c4cd0e050` status=ok. Validated backup snapshot at `/srv/storage/atocore/backups/snapshots/20260422T190624Z` taken BEFORE prod backfill. Prod backfill of `scripts/v1_0_backfill_provenance.py` against live DB: dry-run found 31 active/superseded entities with no provenance, list reviewed and looked sane; live run with default `hand_authored=1` flag path updated 31 rows; follow-up dry-run returned 0 rows remaining → no lingering F-8 violations in prod. Codex logged one residual P2 (R14): HTTP `POST /entities/{id}/promote` route doesn't translate the new service-layer `ValueError` into 400 — legacy bad candidate promoted through the API surfaces as 500. Not blocking. V1-0 closed. **Gates for V1-A**: soak window ends ~2026-04-26; 100-active-memory density target (currently 84 active + the ~31 newly flagged ones — need to check how those count in density math). V1-A holds until both gates clear.
|
||||
|
||||
Reference in New Issue
Block a user