docs: record fully green retrieval harness
This commit is contained in:
@@ -1,10 +1,10 @@
|
||||
# AtoCore - Current State (2026-04-25)
|
||||
|
||||
Update 2026-04-25: project-id chunk/vector metadata is deployed and backfilled.
|
||||
Live Dalidou is on `a87d984`; `/health` is ok with 33,253 vectors and sources
|
||||
ready. Live retrieval harness is 19/20 with 0 blocking failures and 1 known
|
||||
content gap (`p04-constraints` missing `Zerodur` / `1.2`). Full local suite:
|
||||
571 passed.
|
||||
Update 2026-04-25: project-id chunk/vector metadata is deployed and backfilled,
|
||||
and the final p04 Trusted Project State budget/ranking gap is closed. Live
|
||||
Dalidou is on `d3de9f6`; `/health` is ok with 33,253 vectors and sources
|
||||
ready. Live retrieval harness is 20/20 with 0 blocking failures and 0 known
|
||||
issues. Full local suite: 572 passed.
|
||||
|
||||
The project-id backfill was applied per populated project after a
|
||||
Chroma-inclusive backup at
|
||||
@@ -76,10 +76,10 @@ Last nightly run (2026-04-19 03:00 UTC): **31 promoted · 39 rejected · 0 needs
|
||||
| 7G | Re-extraction on prompt version bump | pending |
|
||||
| 7H | Chroma vector hygiene (delete vectors for superseded memories) | pending |
|
||||
|
||||
## Known gaps (honest, refreshed 2026-04-24)
|
||||
## Known gaps (honest, refreshed 2026-04-25)
|
||||
|
||||
1. **Capture surface is Claude-Code-and-OpenClaw only.** Conversations in Claude Desktop, Claude.ai web, phone, or any other LLM UI are NOT captured. Example: the rotovap/mushroom chat yesterday never reached AtoCore because no hook fired. See Q4 below.
|
||||
2. **Project-scoped retrieval guard is deployed and passing.** Explicit `project_id` chunk/vector metadata is now present in SQLite and Chroma for the 33,253-vector corpus. Retrieval prefers exact metadata ownership and keeps path/tag matching as a legacy fallback.
|
||||
3. **Human interface is useful but not yet the V1 Human Mirror.** Wiki/dashboard pages exist, but the spec routes, deterministic mirror files, disputed markers, and curated annotations remain V1-D work.
|
||||
4. **Harness known issue:** `p04-constraints` wants "Zerodur" and "1.2"; live retrieval surfaces related constraints but not those exact strings. Treat as content/state gap until fixed.
|
||||
4. **Harness is currently green.** The former `p04-constraints` known issue is closed; query-relevant Trusted Project State entries now rank before state-budget truncation.
|
||||
5. **Formal docs lag the ledger during fast work.** Use `DEV-LEDGER.md` and `python scripts/live_status.py` for live truth, then copy verified claims into these docs.
|
||||
|
||||
@@ -133,7 +133,7 @@ deferred from the shared client until their workflows are exercised.
|
||||
|
||||
## What Is Real Today (updated 2026-04-25)
|
||||
|
||||
- canonical AtoCore runtime on Dalidou (`a87d984`, deploy.sh verified)
|
||||
- canonical AtoCore runtime on Dalidou (`d3de9f6`, deploy.sh verified)
|
||||
- 33,253 vectors across 6 registered projects, with explicit `project_id`
|
||||
metadata backfilled into SQLite and Chroma after snapshot
|
||||
`/srv/storage/atocore/backups/snapshots/20260424T154358Z`
|
||||
@@ -154,10 +154,10 @@ deferred from the shared client until their workflows are exercised.
|
||||
- query-relevance memory ranking with overlap-density scoring and widened
|
||||
query-time candidate pools so older exact-intent project memories can rank
|
||||
ahead of generic high-confidence notes
|
||||
- retrieval eval harness: 20 fixtures; current live has 19 pass, 1 known
|
||||
content gap, and 0 blocking failures after the project-id backfill and
|
||||
memory-ranking stabilization deploy
|
||||
- 571 tests passing on `main`
|
||||
- query-relevance Trusted Project State ranking before state-budget truncation
|
||||
- retrieval eval harness: 20 fixtures; current live has 20 pass, 0 known
|
||||
issues, and 0 blocking failures
|
||||
- 572 tests passing on `main`
|
||||
- nightly pipeline: backup → cleanup → rsync → OpenClaw import → vault refresh → extract → triage → **auto-promote/expire** → weekly synth/lint → **retrieval harness** → **pipeline summary to project state**
|
||||
- Phase 10 operational: reinforcement-based auto-promotion (ref_count ≥ 3, confidence ≥ 0.7) + stale candidate expiry (14 days unreinforced)
|
||||
- pipeline health visible in dashboard: interaction totals by client, pipeline last_run, harness results, triage stats
|
||||
@@ -178,10 +178,7 @@ These are the current practical priorities.
|
||||
Target: 100+ active memories.
|
||||
3. **Multi-model triage** (Phase 11 entry) — switch auto-triage to a
|
||||
different model than the extractor for independent validation
|
||||
4. **Fix p04-constraints harness failure** — retrieval doesn't surface
|
||||
"Zerodur" for p04 constraint queries. Investigate if it's a missing
|
||||
memory or retrieval ranking issue.
|
||||
5. **Fix Dalidou Git credentials** — the host checkout can fetch but cannot
|
||||
4. **Fix Dalidou Git credentials** — the host checkout can fetch but cannot
|
||||
push to Gitea over HTTP in non-interactive SSH sessions. Prefer switching
|
||||
the deploy checkout to a Gitea SSH key; PAT-backed `credential.helper store`
|
||||
is the fallback.
|
||||
|
||||
Reference in New Issue
Block a user