From 4c7075650c8187473bc5992b306fb8fa67542074 Mon Sep 17 00:00:00 2001
From: Anto01
Date: Tue, 28 Apr 2026 21:57:08 -0400
Subject: [PATCH] =?UTF-8?q?fix(memory):=20Wave=201=20=E2=80=94=20SQL-aggre?=
 =?UTF-8?q?gate=20dashboard=20counts=20+=20memory=20write-path=20fixes?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Closes three live-affecting bugs surfaced by the 2026-04-29 Codex review,
all in the memory write/read path. Pre-deploy on Dalidou, the live
discrepancy was dashboard.memories.active=315 vs integrity active=1091.

1. /admin/dashboard counts are now SQL-aggregated (no sampling). New
   get_memory_count_summary() helper. Dashboard memories.{active,
   candidates,by_type,by_project,reinforced,by_status,total} all derive
   from full-table SQL, not a confidence-sorted limit=500 sample.
   Post-deploy, the dashboard active count must match the integrity
   panel.

2. PUT /memory/{id} accepts project; auto-triage now applies it. Added
   project to MemoryUpdateRequest and update_memory(), with
   resolve_project_name canonicalization, a before/after audit, and a
   duplicate-active check scoped to the new project. The
   scripts/auto_triage.py suggested-project correction now PUTs
   {"project": suggested}, so misattribution flags actually retarget
   the memory.

3. POST /memory/{id}/invalidate uses a direct id lookup. New
   get_memory(id) helper. Replaces the old
   _get_memories(status="active", limit=1) lookup, which only saw the
   highest-confidence active row; active memories outside slot 0 no
   longer 404. The same status-guard structure is applied to
   POST /memory/{id}/supersede so candidates can't silently flip to
   superseded.

14 regression tests added (572 -> 586 locally). Reviewed by Codex twice:
verdict GO on tip 9604c3e.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 DEV-LEDGER.md                      |  36 +++++----
 scripts/auto_triage.py             |  12 ++-
 src/atocore/api/routes.py          | 104 ++++++++++++++-----
 src/atocore/memory/service.py      |  89 +++++++++++++++++++++-
 tests/test_invalidate_supersede.py | 117 ++++++++++++++++++++++++++++
 tests/test_memory.py               | 118 +++++++++++++++++++++++++++++
 6 files changed, 412 insertions(+), 64 deletions(-)

diff --git a/DEV-LEDGER.md b/DEV-LEDGER.md
index 1f930e1..c8e6e32 100644
--- a/DEV-LEDGER.md
+++ b/DEV-LEDGER.md
@@ -6,26 +6,26 @@
 
 ## Orientation
 
-- **live_sha** (Dalidou `/health` build_sha): `d3de9f6` (verified 2026-04-25T01:01Z post trusted-state ranking deploy; status=ok)
-- **last_updated**: 2026-04-25 by Codex (retrieval harness fully green)
-- **main_tip**: `d3de9f6`
-- **test_count**: 572 on `main`
-- **harness**: `20/20 PASS` on live Dalidou, 0 blocking failures, 0 known issues
+- **live_sha** (Dalidou `/health` build_sha): `7042eae` (verified 2026-04-29T01:19Z; status=ok; deployed 2026-04-25T01:04Z, docs-only on top of `d3de9f6`)
+- **last_updated**: 2026-04-29 by Claude (Wave 1 debt-pay branch open; Codex review of plan applied)
+- **main_tip**: `7042eae`
+- **test_count**: 586 on `claude/wave1-dashboard-counts-and-memory-fixes` (572 on main + 14 Wave 1 regressions; +5 from Codex review amends)
+- **harness**: `20/20 PASS` on live Dalidou, 0 blocking failures, 0 known issues (last harness run nightly 2026-04-28T03:00:30Z)
 - **vectors**: 33,253
-- **active_memories**: 290 (`/admin/dashboard` 2026-04-24; note integrity panel reports a separate active_memory_count=951 and needs reconciliation)
-- **candidate_memories**: 0 (triage queue drained)
-- **interactions**: 951 (`/admin/dashboard` 2026-04-24)
-- **registered_projects**: atocore, p04-gigabit, p05-interferometer,
p06-polisher, atomizer-v2, abb-space (aliased p08) -- **project_state_entries**: 128 across registered projects (`/admin/dashboard` 2026-04-24) -- **entities**: 66 (up from 35 — V1-0 backfill + ongoing work; 0 open conflicts) +- **active_memories**: 315 dashboard / 1091 integrity (verified live 2026-04-29). Discrepancy was a dashboard sampling bug — Wave 1 commit `fb4d55c` replaces it with SQL aggregates; post-deploy the two will agree. +- **candidate_memories**: 0 (queue drained, +103 captures since 2026-04-25) +- **interactions**: 1054 (claude-code 474, openclaw 576; verified `/admin/dashboard` 2026-04-29) +- **registered_projects**: atocore, p04-gigabit, p05-interferometer, p06-polisher, atomizer-v2, abb-space (aliased p08). **Auto-detected, unregistered**: apm (63 active memories), openclaw (9), lead-space (2), drill (1), optiques-fullum (1) — see Wave 1 follow-up below. +- **project_state_entries**: 128 across registered projects (verified 2026-04-29) +- **entities**: 66 (V1-0 backfill complete; 0 open conflicts) - **off_host_backup**: `papa@192.168.86.39:/home/papa/atocore-backups/` via cron, verified -- **nightly_pipeline**: backup → cleanup → rsync → OpenClaw import → vault refresh → extract → auto-triage → **auto-promote/expire (NEW)** → weekly synth/lint Sundays → **retrieval harness (NEW)** → **pipeline summary (NEW)** +- **nightly_pipeline**: backup → cleanup → rsync → OpenClaw import → vault refresh → extract → auto-triage → auto-promote/expire → weekly synth/lint Sundays → retrieval harness → pipeline summary - **capture_clients**: claude-code (Stop hook + cwd project inference), openclaw (before_agent_start + llm_output plugin, verified live) - **wiki**: http://dalidou:8100/wiki (browse), /wiki/projects/{id}, /wiki/entities/{id}, /wiki/search -- **dashboard**: http://dalidou:8100/admin/dashboard (now shows pipeline health, interaction totals by client, all registered projects) -- **active_track**: Engineering V1 Completion (started 2026-04-22). V1-0 landed (`2712c5d`). V1-A density gate CLEARED (784 active ≫ 100 target as of 2026-04-23). V1-A soak gate at day 5/~7 (F4 first run 2026-04-19; nightly clean 2026-04-19 through 2026-04-23; live harness is now fully green as of 2026-04-25). Plan: `docs/plans/engineering-v1-completion-plan.md`. Resume map: `docs/plans/v1-resume-state.md`. -- **last_nightly_pipeline**: `2026-04-23T03:00:20Z` — harness 17/18, triage promoted=3 rejected=7 human=0, dedup 7 clusters (1 tier1 + 6 tier2 auto-merged), graduation 30-skipped 0-graduated 0-errors, auto-triage drained the queue (0 new candidates 2026-04-22T00:52Z run) -- **open_branches**: `codex/p04-constraints-state-gap` pushed and fast-forwarded into `main` as `d3de9f6`; no active unmerged code branch for this tranche. +- **dashboard**: http://dalidou:8100/admin/dashboard +- **active_track**: Wave 1 debt-pay (this session) → V1-A start. V1-A gates have cleared (soak ended 2026-04-26; density 315 ≫ 100). Plan: `docs/plans/engineering-v1-completion-plan.md`. Resume map: `docs/plans/v1-resume-state.md`. +- **last_nightly_pipeline**: `2026-04-28T03:00:30Z` — harness 20/20, triage promoted=1 rejected=1 human=0 +- **open_branches**: `claude/wave1-dashboard-counts-and-memory-fixes` (tip `3a474f7`) — three memory-write-path bugs + two follow-on P2 fixes from Codex's formal audit (`auto_triage.py` PUT body + `/memory/{id}/supersede` status guard). Awaiting Codex re-review of the amended branch before squash-merge/deploy. 
## Active Plan @@ -170,6 +170,10 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha ## Session Log +- **2026-04-29 Codex + Claude (Wave 1 formal audit closed + amends)** Codex's formal audit of `fb4d55c` (Wave 1 first commit on `claude/wave1-dashboard-counts-and-memory-fixes`): verdict GO WITH CONDITIONS. Two P1-prior closures confirmed (dashboard count bug + invalidate top-1 lookup). Project-update P2 was only "partially closed" — API/service plumbing fine, but `scripts/auto_triage.py:417` still PUT `{"content": cand["content"]}` so the operational suggested-project correction was unreachable even with `MemoryUpdateRequest.project` in place. Codex also flagged the symmetric supersede-route gap as same-class adjacent surface and recommended pulling it in here, not in Wave 1.5. Plus one P3: cover retarget-to-empty-project against a global active duplicate. Amended on `3a474f7`: (1) auto_triage PUT body now `{"project": suggested}` with a guard test that lints the script source for the new shape; (2) `/memory/{id}/supersede` mirrors the invalidate guard via `get_memory(id)` — 404 unknown / 200 already_superseded / 409 wrong-status / 200 superseded; (3) regression test for project-empty duplicate detection. Test count 581 → 586. Codex's recommended deployment checklist (post-deploy verifications, including the run-or-simulate auto-triage retarget probe) carried into the merge plan. Awaiting Codex re-review of the amended tip before squash-merge. + +- **2026-04-29 Claude (Wave 1 debt-pay started; Codex review of state-of-service plan)** Audited live state on Dalidou: `/health` build_sha `7042eae` (4d old), harness 20/20, 33,253 vectors, 1,748 docs, 1054 interactions (+103/4d), dashboard memories.active=315 vs integrity.active_memory_count=1091. Drafted a state-of-service assessment + Wave 1/2/3/4 plan, then asked Codex (gpt-5.5) for an adversarial review via `codex exec`. Codex verdict: assessment MIXED, plan ENDORSE WITH CHANGES. Codex caught two factual corrections (the 315-vs-1091 gap is a *sampling bug* not a definitional gap; my "R9 drops unregistered tags" framing is wrong — `extractor_llm.py:213-233` preserves them) and two new memory-write-path bugs I missed. Branched `claude/wave1-dashboard-counts-and-memory-fixes` from `7042eae`, fixed all three: (1) `/admin/dashboard` now uses a new `get_memory_count_summary()` SQL aggregate helper instead of counting inside a confidence-sorted `get_memories(limit=500)` sample; (2) `MemoryUpdateRequest` and `update_memory()` accept `project` with `resolve_project_name` canonicalization + before/after audit, so `auto_triage.py:407` suggested-project corrections will now actually apply; (3) `POST /memory/{id}/invalidate` replaces the `_get_memories(status="active", limit=1)` lookup (which only saw the highest-confidence active row) with a direct id lookup via new `get_memory(id)` helper. 9 regression tests added across `test_memory.py` and `test_invalidate_supersede.py`. Full local suite: 581 passed (572 → 581). Commit `fb4d55c`. Branch not pushed/deployed yet — awaiting Codex audit per working model. Refreshed Orientation block (live_sha, last_updated, test_count, memory counts, capture cadence, registered/unregistered project breakdown, open_branches). Wave 1 follow-up still open: (W1.2) one-click registration proposal for unregistered projects with ≥10 active memories (apm=63 is overdue); (W1.3 done by this entry) sync ledger; (W1.4) committed measurable-win probe fixture+JSON output. 
After this branch lands, V1-A is unblocked: gates have effectively cleared (soak ended 2026-04-26; density 315 ≫ 100 target).
+
 - **2026-04-25 Codex (p04 constraint gap closed; harness fully green)** Root-caused the remaining `p04-constraints` fixture failure: the `Zerodur` / `1.2` fact already existed in Trusted Project State (`requirement/key_constraints`), but project-state formatting was category/key ordered and then truncated to the 20% state budget, so contacts/decisions consumed the budget before the relevant requirement. Added query-relevance ranking for Trusted Project State entries before formatting/truncation, with regression coverage in `test_project_state_query_relevance_before_truncation`. Removed the fixture's `known_issue` lane so future p04 constraint regressions are blocking. Cleaned up a duplicate live requirement entry created during diagnosis by invalidating `requirement/mirror-blank-core-constraints`; canonical `requirement/key_constraints` remains active. Verified focused suite: 35 passed. Verified full local suite: 572 passed. Deployed `d3de9f67eaa08dfc5b2d86e8221b8c70fef266d3`; live exact p04 probe now surfaces `[REQUIREMENT] key_constraints` with `1.2` and `Zerodur`. Live retrieval harness: 20/20, 0 known issues, 0 blocking failures.
 - **2026-04-25 Codex (project_id backfill + retrieval stabilization closed)** Merged `codex/project-id-metadata-retrieval` into `main` (`867a1ab`) and deployed to Dalidou. Took Chroma-inclusive backup `/srv/storage/atocore/backups/snapshots/20260424T154358Z`, then ran `scripts/backfill_chunk_project_ids.py` per project; the per-project runs for `p04-gigabit`, `p05-interferometer`, `p06-polisher`, `atomizer-v2`, and `atocore` applied cleanly for 33,253 vectors total, with 0 missing/malformed and an immediate final dry-run showing 33,253 already tagged / 0 updates. Post-backfill harness exposed p06 memory-ranking misses (`Tailscale`, `encoder`), so Codex shipped `4744c69` then `a87d984 fix(memory): widen query-time context candidates`. Full local suite: 571 passed. Live `/health` reports `a87d9845a8c34395a02890f0cf22aa7a46afaf62`, vectors=33,253, sources_ready=true. Live retrieval harness: 19/20, 0 blocking failures, 1 known issue (`p04-constraints` missing `Zerodur` / `1.2`). A repeat backfill dry-run after the code-only stabilization deploy was aborted after the one-off container ran too long; the live service stayed healthy and the earlier post-apply idempotency result remains the migration acceptance record. Dalidou HTTP push credentials are still not configured; this session pushed through the Windows credential path.
diff --git a/scripts/auto_triage.py b/scripts/auto_triage.py
index 81665bb..6a12584 100644
--- a/scripts/auto_triage.py
+++ b/scripts/auto_triage.py
@@ -404,19 +404,23 @@ def process_candidate(cand, base_url, active_cache, state_cache, known_projects,
         known_projects,
         TIER1_MODEL,
         DEFAULT_TIMEOUT_S,
     )
-    # Project misattribution fix: suggested_project surfaces from tier 1
+    # Project misattribution fix: suggested_project surfaces from tier 1.
+    # Earlier code PUT only {"content": cand["content"]}, which left
+    # the project field unchanged because MemoryUpdateRequest had no
+    # project key and the service signature didn't accept one. Wave 1
+    # added project to MemoryUpdateRequest and update_memory(); this
+    # caller now actually applies the suggested project.
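+    # Corrected request, schematically (the id and project values here
+    # are illustrative, not from a live run):
+    #     PUT {base_url}/memory/{mid}   body: {"project": "p04-gigabit"}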
suggested = (v1.get("suggested_project") or "").strip() if suggested and suggested != project and suggested in known_projects: - # Try to re-canonicalize the memory's project if not dry_run: try: import urllib.request as _ur req = _ur.Request( f"{base_url}/memory/{mid}", method="PUT", headers={"Content-Type": "application/json"}, - data=json.dumps({"content": cand["content"]}).encode("utf-8"), + data=json.dumps({"project": suggested}).encode("utf-8"), ) - _ur.urlopen(req, timeout=10).read() # triggers canonicalization via update + _ur.urlopen(req, timeout=10).read() except Exception: pass print(f" ↺ misattribution flagged: {project!r} → {suggested!r}") diff --git a/src/atocore/api/routes.py b/src/atocore/api/routes.py index dcfa42f..906d6b4 100644 --- a/src/atocore/api/routes.py +++ b/src/atocore/api/routes.py @@ -303,6 +303,7 @@ class MemoryUpdateRequest(BaseModel): memory_type: str | None = None domain_tags: list[str] | None = None valid_until: str | None = None + project: str | None = None class ProjectStateSetRequest(BaseModel): @@ -636,6 +637,7 @@ def api_update_memory(memory_id: str, req: MemoryUpdateRequest) -> dict: memory_type=req.memory_type, domain_tags=req.domain_tags, valid_until=req.valid_until, + project=req.project, ) except ValueError as e: raise HTTPException(status_code=400, detail=str(e)) @@ -794,33 +796,25 @@ def api_invalidate_memory( req: MemoryInvalidateRequest | None = None, ) -> dict: """Retract an active memory (Issue E — active → invalid).""" - from atocore.memory.service import get_memories as _get_memories, invalidate_memory + from atocore.memory.service import get_memory, invalidate_memory reason = req.reason if req else "" - # Quick existence/status check for a clean 404 vs 409. - existing = [ - m for m in _get_memories(status="active", limit=1) - if m.id == memory_id - ] - if not existing: - # Fall through to generic not-active if the id exists in another status. - all_match = [ - m for m in _get_memories(status="candidate", limit=5000) - + _get_memories(status="invalid", limit=5000) - + _get_memories(status="superseded", limit=5000) - if m.id == memory_id - ] - if all_match: - if all_match[0].status == "invalid": - return {"status": "already_invalid", "id": memory_id} - raise HTTPException( - status_code=409, - detail=( - f"Memory {memory_id} is {all_match[0].status}; " - "use /reject for candidates" - ), - ) + # Direct id lookup — earlier code used get_memories(status='active', limit=1) + # which only saw the highest-confidence active row, so any other active + # memory would 404 here even though it existed. + target = get_memory(memory_id) + if target is None: raise HTTPException(status_code=404, detail=f"Memory not found: {memory_id}") + if target.status == "invalid": + return {"status": "already_invalid", "id": memory_id} + if target.status != "active": + raise HTTPException( + status_code=409, + detail=( + f"Memory {memory_id} is {target.status}; " + "use /reject for candidates" + ), + ) success = invalidate_memory(memory_id, actor="api-http", reason=reason) if not success: @@ -833,15 +827,33 @@ def api_supersede_memory( memory_id: str, req: MemorySupersedeRequest | None = None, ) -> dict: - """Supersede an active memory (Issue E — active → superseded).""" - from atocore.memory.service import supersede_memory + """Supersede an active memory (Issue E — active → superseded). + + Mirrors the invalidate route's status guard: candidates and other + non-active rows must not silently flip to superseded. 
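+
+    Outcome summary, mirroring invalidate (shorthand for the guards
+    below): 404 unknown id; 200 already_superseded on an idempotent
+    repeat; 409 any other non-active status; 200 superseded on success.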
+ """ + from atocore.memory.service import get_memory, supersede_memory reason = req.reason if req else "" + target = get_memory(memory_id) + if target is None: + raise HTTPException(status_code=404, detail=f"Memory not found: {memory_id}") + if target.status == "superseded": + return {"status": "already_superseded", "id": memory_id} + if target.status != "active": + raise HTTPException( + status_code=409, + detail=( + f"Memory {memory_id} is {target.status}; " + "only active memories can be superseded" + ), + ) + success = supersede_memory(memory_id, actor="api-http", reason=reason) if not success: raise HTTPException( - status_code=404, - detail=f"Memory not found or not active: {memory_id}", + status_code=409, + detail=f"Memory {memory_id} could not be superseded", ) return {"status": "superseded", "id": memory_id} @@ -1280,16 +1292,20 @@ def api_dashboard() -> dict: health beyond the basic /health endpoint. """ import json as _json - from collections import Counter from datetime import datetime as _dt, timezone as _tz - all_memories = get_memories(active_only=False, limit=500) - active = [m for m in all_memories if m.status == "active"] - candidates = [m for m in all_memories if m.status == "candidate"] + from atocore.memory.service import get_memory_count_summary - type_counts = dict(Counter(m.memory_type for m in active)) - project_counts = dict(Counter(m.project or "(none)" for m in active)) - reinforced = [m for m in active if m.reference_count > 0] + # SQL-backed counts. Earlier code derived these by sampling the top + # 500 rows of get_memories() ordered by confidence — anything past + # the cap was invisible, so /admin/dashboard silently undercounted + # active memories once the corpus crossed ~500 active rows. + counts = get_memory_count_summary() + active_total = counts["active"]["total"] + candidate_total = counts["by_status"].get("candidate", 0) + type_counts = counts["active"]["by_type"] + project_counts = counts["active"]["by_project"] + reinforced_total = counts["active"]["reinforced"] # Interaction stats — total + by_client from DB directly interaction_stats: dict = {"most_recent": None, "total": 0, "by_client": {}} @@ -1402,13 +1418,13 @@ def api_dashboard() -> dict: # Triage queue health triage: dict = { - "pending": len(candidates), + "pending": candidate_total, "review_url": "/admin/triage", } - if len(candidates) > 50: - triage["warning"] = f"High queue: {len(candidates)} candidates pending review." - elif len(candidates) > 20: - triage["notice"] = f"{len(candidates)} candidates awaiting triage." + if candidate_total > 50: + triage["warning"] = f"High queue: {candidate_total} candidates pending review." + elif candidate_total > 20: + triage["notice"] = f"{candidate_total} candidates awaiting triage." 
# Recent audit activity (Phase 4 V1) — last 10 mutations for operator recent_audit: list[dict] = [] @@ -1420,11 +1436,13 @@ def api_dashboard() -> dict: return { "memories": { - "active": len(active), - "candidates": len(candidates), + "active": active_total, + "candidates": candidate_total, "by_type": type_counts, "by_project": project_counts, - "reinforced": len(reinforced), + "reinforced": reinforced_total, + "by_status": counts["by_status"], + "total": counts["total"], }, "project_state": { "counts": ps_counts, diff --git a/src/atocore/memory/service.py b/src/atocore/memory/service.py index fdff6cd..7f08c5a 100644 --- a/src/atocore/memory/service.py +++ b/src/atocore/memory/service.py @@ -347,6 +347,83 @@ def get_memories( return [_row_to_memory(r) for r in rows] +def get_memory(memory_id: str) -> Memory | None: + """Return a single memory by id, or None if missing. + + Direct id lookup (no LIMIT, no confidence ordering) — the right + primitive for routes that need to check a specific memory's status + before acting. Avoids the sampling pitfall where ``get_memories`` + with a small ``limit`` could hide a target row sorted past the cap. + """ + with get_connection() as conn: + row = conn.execute( + "SELECT * FROM memories WHERE id = ?", (memory_id,) + ).fetchone() + return _row_to_memory(row) if row else None + + +def get_memory_count_summary() -> dict: + """Aggregate memory counts straight from SQL (no sampling). + + Returned shape: + { + "total": int, # all rows + "by_status": {status: int, ...}, # full table + "active": { + "total": int, + "reinforced": int, # active with reference_count > 0 + "by_type": {memory_type: int, ...}, + "by_project": {project_or_none: int, ...}, + }, + } + + Distinct from ``get_memories(...)``, which is a row-fetcher with a + confidence-sorted LIMIT and is therefore not safe for counting. 
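+
+    Illustrative use (values made up; 1091 echoes the live integrity
+    count this helper was built to match):
+
+        counts = get_memory_count_summary()
+        counts["active"]["total"]             # e.g. 1091
+        counts["by_status"].get("candidate")  # e.g. 0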
+ """ + summary: dict = { + "total": 0, + "by_status": {}, + "active": { + "total": 0, + "reinforced": 0, + "by_type": {}, + "by_project": {}, + }, + } + + with get_connection() as conn: + row = conn.execute("SELECT count(*) FROM memories").fetchone() + summary["total"] = row[0] if row else 0 + + rows = conn.execute( + "SELECT status, count(*) FROM memories GROUP BY status" + ).fetchall() + summary["by_status"] = {r[0]: r[1] for r in rows} + + active_total = summary["by_status"].get("active", 0) + summary["active"]["total"] = active_total + + rows = conn.execute( + "SELECT memory_type, count(*) FROM memories " + "WHERE status = 'active' GROUP BY memory_type" + ).fetchall() + summary["active"]["by_type"] = {r[0]: r[1] for r in rows} + + rows = conn.execute( + "SELECT COALESCE(NULLIF(project, ''), '(none)') AS project, count(*) " + "FROM memories WHERE status = 'active' GROUP BY project" + ).fetchall() + summary["active"]["by_project"] = {r[0]: r[1] for r in rows} + + row = conn.execute( + "SELECT count(*) FROM memories " + "WHERE status = 'active' AND reference_count > 0" + ).fetchone() + summary["active"]["reinforced"] = row[0] if row else 0 + + return summary + + def update_memory( memory_id: str, content: str | None = None, @@ -355,6 +432,7 @@ def update_memory( memory_type: str | None = None, domain_tags: list[str] | None = None, valid_until: str | None = None, + project: str | None = None, actor: str = "api", note: str = "", ) -> bool: @@ -368,6 +446,10 @@ def update_memory( next_content = content if content is not None else existing["content"] next_status = status if status is not None else existing["status"] + next_project = ( + resolve_project_name(project) if project is not None + else (existing["project"] or "") + ) if confidence is not None: _validate_confidence(confidence) @@ -375,7 +457,7 @@ def update_memory( duplicate = conn.execute( "SELECT id FROM memories " "WHERE memory_type = ? AND content = ? AND project = ? AND status = 'active' AND id != ?", - (existing["memory_type"], next_content, existing["project"] or "", memory_id), + (existing["memory_type"], next_content, next_project, memory_id), ).fetchone() if duplicate: raise ValueError("Update would create a duplicate active memory") @@ -386,6 +468,7 @@ def update_memory( "status": existing["status"], "confidence": existing["confidence"], "memory_type": existing["memory_type"], + "project": existing["project"] or "", } after_snapshot = dict(before_snapshot) @@ -422,6 +505,10 @@ def update_memory( updates.append("valid_until = ?") params.append(vu) after_snapshot["valid_until"] = vu or "" + if project is not None: + updates.append("project = ?") + params.append(next_project) + after_snapshot["project"] = next_project if not updates: return False diff --git a/tests/test_invalidate_supersede.py b/tests/test_invalidate_supersede.py index be6d004..9b2f853 100644 --- a/tests/test_invalidate_supersede.py +++ b/tests/test_invalidate_supersede.py @@ -192,3 +192,120 @@ def test_v1_aliases_present(env): "/v1/memory/{memory_id}/supersede", ): assert p in paths, f"{p} missing" + + +# --------------------------------------------------------------------------- +# Wave 1 (2026-04-29) — invalidation route used to do +# `_get_memories(status='active', limit=1)` and look for the target id +# inside that single highest-confidence row, so any active memory +# outside slot 0 fell through as 404. Direct id lookup fixes it. 
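+# Schematically: with actives confidence-sorted as [m_hi, m_lo, ...]
+# (names illustrative), the old lookup fetched only [m_hi], so
+# invalidating m_lo 404'd even though m_lo was active.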
+# --------------------------------------------------------------------------- + + +def test_api_invalidate_finds_active_memory_outside_top_one(env): + """An active memory not at the top of the confidence sort must still + be invalidatable via POST /memory/{id}/invalidate.""" + high = create_memory( + memory_type="knowledge", + content="high-confidence top row", + confidence=0.99, + ) + low = create_memory( + memory_type="knowledge", + content="lower-confidence target", + confidence=0.55, + ) + client = TestClient(app) + r = client.post(f"/memory/{low.id}/invalidate", json={"reason": "wave1 regression"}) + assert r.status_code == 200, r.text + assert r.json()["status"] == "invalidated" + # And confirm the high-confidence row is untouched + assert _get_memory(high.id).status == "active" + assert _get_memory(low.id).status == "invalid" + + +def test_api_invalidate_already_invalid_is_idempotent(env): + m = create_memory(memory_type="knowledge", content="already invalid") + client = TestClient(app) + r1 = client.post(f"/memory/{m.id}/invalidate", json={"reason": "first"}) + assert r1.status_code == 200 + r2 = client.post(f"/memory/{m.id}/invalidate", json={"reason": "again"}) + assert r2.status_code == 200 + assert r2.json()["status"] == "already_invalid" + + +def test_api_invalidate_candidate_returns_409(env): + m = create_memory( + memory_type="knowledge", content="candidate route", status="candidate" + ) + client = TestClient(app) + r = client.post(f"/memory/{m.id}/invalidate", json={"reason": "wrong route"}) + assert r.status_code == 409 + + +def test_api_invalidate_unknown_id_is_404(env): + client = TestClient(app) + r = client.post("/memory/no-such-id/invalidate", json={"reason": "ghost"}) + assert r.status_code == 404 + + +def test_api_supersede_candidate_returns_409(env): + """Mirror of the invalidate guard: candidates must not silently flip + to superseded via the active-only supersede route.""" + m = create_memory( + memory_type="knowledge", content="candidate target", status="candidate" + ) + client = TestClient(app) + r = client.post(f"/memory/{m.id}/supersede", json={"reason": "wrong route"}) + assert r.status_code == 409 + # Row should still be a candidate + assert _get_memory(m.id).status == "candidate" + + +def test_api_supersede_already_superseded_is_idempotent(env): + m = create_memory(memory_type="knowledge", content="will be superseded") + client = TestClient(app) + r1 = client.post(f"/memory/{m.id}/supersede", json={"reason": "first"}) + assert r1.status_code == 200 + r2 = client.post(f"/memory/{m.id}/supersede", json={"reason": "again"}) + assert r2.status_code == 200 + assert r2.json()["status"] == "already_superseded" + + +def test_api_supersede_unknown_id_is_404(env): + client = TestClient(app) + r = client.post("/memory/no-such-id/supersede", json={"reason": "ghost"}) + assert r.status_code == 404 + + +def test_admin_dashboard_active_count_matches_full_table(env): + """/admin/dashboard memories.active must match the SQL aggregate even + when there are more active memories than the legacy sample limit (500). + + This guards the Codex finding that the dashboard was deriving counts + from a confidence-sorted limit=500 fetch, hiding rows past the cap. + We don't need 500 rows in the test — a small corpus that exercises + the SQL-aggregate path is enough; the integrity-vs-dashboard equality + is the invariant being asserted. 
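+
+    Expected slice of the payload for this corpus (other keys elided):
+        {"memories": {"active": 2, "candidates": 1, "total": 4,
+                      "by_status": {..., "invalid": 1}}}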
+ """ + # Mix of statuses to exercise the by_status aggregate + create_memory(memory_type="knowledge", content="a") + create_memory(memory_type="knowledge", content="b", project="p06-polisher") + create_memory(memory_type="project", content="c-cand", status="candidate") + cand = create_memory(memory_type="project", content="d-cand", status="candidate") + # Invalidate one to seed an "invalid" bucket + from atocore.memory.service import invalidate_memory + target_id = cand.id + # Promote it first via direct DB so invalidate does flip a candidate + # to invalid via the service path (mirrors actual API trajectory). + invalidate_memory(target_id) + + client = TestClient(app) + dash = client.get("/admin/dashboard").json() + assert dash["memories"]["active"] == 2 + assert dash["memories"]["candidates"] == 1 + assert dash["memories"]["by_status"]["invalid"] == 1 + assert dash["memories"]["total"] == 4 + assert dash["memories"]["by_project"].get("p06-polisher") == 1 + # "(none)" bucket is the COALESCE label for empty/null project + assert "(none)" in dash["memories"]["by_project"] diff --git a/tests/test_memory.py b/tests/test_memory.py index 9b41a46..f512990 100644 --- a/tests/test_memory.py +++ b/tests/test_memory.py @@ -575,3 +575,121 @@ def test_expire_stale_candidates_keeps_reinforced(isolated_db): assert mid not in expired mem = _get_memory_by_id(mid) assert mem["status"] == "candidate" + + +# --------------------------------------------------------------------------- +# Wave 1 (2026-04-29) — counts come from SQL, not from the top-N sample. +# Exposed by Codex audit when prod /admin/dashboard reported 315 active +# while /admin/integrity-check reported 1091. The dashboard was building +# its counts from a confidence-sorted limit=500 fetch. +# --------------------------------------------------------------------------- + + +def test_get_memory_count_summary_returns_full_table_aggregates(isolated_db): + """Counts come from SQL aggregates, not a sampled fetch.""" + from atocore.memory.service import ( + create_memory, + get_memory_count_summary, + invalidate_memory, + ) + + # Create more rows than any reasonable sampling LIMIT so any + # LIMIT-based counter would visibly disagree with reality. + for i in range(120): + create_memory( + "knowledge", + f"fact-{i}", + project="p04-gigabit", + confidence=0.9, + status="active", + ) + for i in range(7): + create_memory("knowledge", f"cand-{i}", status="candidate") + invalid_obj = create_memory("knowledge", "to-invalidate", status="active") + invalidate_memory(invalid_obj.id) + + summary = get_memory_count_summary() + assert summary["total"] == 120 + 7 + 1 + assert summary["by_status"]["active"] == 120 + assert summary["by_status"]["candidate"] == 7 + assert summary["by_status"]["invalid"] == 1 + assert summary["active"]["total"] == 120 + assert summary["active"]["by_type"] == {"knowledge": 120} + assert summary["active"]["by_project"] == {"p04-gigabit": 120} + + +def test_get_memory_returns_single_row_or_none(isolated_db): + from atocore.memory.service import create_memory, get_memory + + mem = create_memory("knowledge", "single-row test") + fetched = get_memory(mem.id) + assert fetched is not None + assert fetched.id == mem.id + assert get_memory("non-existent-id") is None + + +def test_update_memory_can_change_project_with_canonicalization( + isolated_db, project_registry +): + """update_memory(project=...) 
canonicalizes aliases and writes audit.""" + project_registry(("p04-gigabit", ("p04", "gigabit"))) + from atocore.memory.service import ( + create_memory, + get_memory, + get_memory_audit, + update_memory, + ) + + mem = create_memory("knowledge", "retargetable fact", project="atocore") + ok = update_memory(mem.id, project="p04") # alias + assert ok is True + + refreshed = get_memory(mem.id) + assert refreshed.project == "p04-gigabit" # canonical, not "p04" + + audit_rows = get_memory_audit(mem.id, limit=10) + update_rows = [r for r in audit_rows if r.get("action") == "updated"] + assert update_rows, f"expected an updated audit row, got {audit_rows}" + head = update_rows[0] + assert head["before"]["project"] == "atocore" + assert head["after"]["project"] == "p04-gigabit" + + +def test_update_memory_project_unchanged_when_not_passed(isolated_db): + from atocore.memory.service import create_memory, get_memory, update_memory + + mem = create_memory("knowledge", "untouched project", project="p06-polisher") + update_memory(mem.id, content="edited content") + assert get_memory(mem.id).project == "p06-polisher" + + +def test_update_memory_to_empty_project_detects_global_duplicate(isolated_db): + """Codex P3: when retargeting to project='' (global), the duplicate + check must scope to the new project. If a global active memory with + the same content already exists, the update must raise.""" + import pytest as _pytest + from atocore.memory.service import create_memory, update_memory + + create_memory("knowledge", "shared global fact", project="") + scoped = create_memory("knowledge", "shared global fact", project="p04-gigabit") + + with _pytest.raises(ValueError, match="duplicate active memory"): + update_memory(scoped.id, project="") + + +def test_auto_triage_suggested_project_put_body_uses_project_key(): + """Regression: the auto_triage caller used to PUT {"content": ...} + which silently dropped the suggested project change. The fix sends + {"project": suggested}. Inspect the script source so we don't have + to spin up a live triage run.""" + from pathlib import Path + + src = Path(__file__).resolve().parents[1] / "scripts" / "auto_triage.py" + text = src.read_text(encoding="utf-8") + # The block that PUTs to /memory/{mid} for a suggested_project fix + assert 'json.dumps({"project": suggested})' in text, ( + "auto_triage.py must PUT {\"project\": suggested} so the " + "suggested-project correction actually applies. See Wave 1." + ) + # And must not be back to the old shape + assert 'json.dumps({"content": cand["content"]})' not in text