Compare commits
10 Commits
codex/proj
...
9604c3e9ae
| Author | SHA1 | Date | |
|---|---|---|---|
| 9604c3e9ae | |||
| 3a474f750c | |||
| 4e6fba7cb9 | |||
| fb4d55cbcd | |||
| 7042eaea46 | |||
| d3de9f67ea | |||
| 0fc6705173 | |||
| a87d9845a8 | |||
| 4744c69d10 | |||
| 867a1abfaa | |||
@@ -6,26 +6,26 @@
## Orientation
|
||||
|
||||
- **live_sha** (Dalidou `/health` build_sha): `f44a211` (verified 2026-04-24T14:48:44Z post audit-improvements deploy; status=ok)
|
||||
- **last_updated**: 2026-04-24 by Codex (retrieval boundary deployed; project_id metadata branch started)
|
||||
- **main_tip**: `f44a211`
|
||||
- **test_count**: 567 on `codex/project-id-metadata-retrieval` (deployed main baseline: 553)
|
||||
- **harness**: `19/20 PASS` on live Dalidou, 0 blocking failures, 1 known content gap (`p04-constraints`)
|
||||
- **live_sha** (Dalidou `/health` build_sha): `7042eae` (verified 2026-04-29T01:19Z; status=ok; deployed 2026-04-25T01:04Z, docs-only on top of `d3de9f6`)
|
||||
- **last_updated**: 2026-04-29 by Claude (Wave 1 debt-pay branch open; Codex review of plan applied)
|
||||
- **main_tip**: `7042eae`
|
||||
- **test_count**: 586 on `claude/wave1-dashboard-counts-and-memory-fixes` (572 on main + 14 Wave 1 regressions; +5 from Codex review amends)
|
||||
- **harness**: `20/20 PASS` on live Dalidou, 0 blocking failures, 0 known issues (last harness run nightly 2026-04-28T03:00:30Z)
|
||||
- **vectors**: 33,253
|
||||
- **active_memories**: 290 (`/admin/dashboard` 2026-04-24; note integrity panel reports a separate active_memory_count=951 and needs reconciliation)
|
||||
- **candidate_memories**: 0 (triage queue drained)
|
||||
- **interactions**: 951 (`/admin/dashboard` 2026-04-24)
|
||||
- **registered_projects**: atocore, p04-gigabit, p05-interferometer, p06-polisher, atomizer-v2, abb-space (aliased p08)
|
||||
- **project_state_entries**: 128 across registered projects (`/admin/dashboard` 2026-04-24)
|
||||
- **entities**: 66 (up from 35 — V1-0 backfill + ongoing work; 0 open conflicts)
|
||||
- **active_memories**: 315 dashboard / 1091 integrity (verified live 2026-04-29). Discrepancy was a dashboard sampling bug — Wave 1 commit `fb4d55c` replaces it with SQL aggregates; post-deploy the two will agree.
|
||||
- **candidate_memories**: 0 (queue drained, +103 captures since 2026-04-25)
|
||||
- **interactions**: 1054 (claude-code 474, openclaw 576; verified `/admin/dashboard` 2026-04-29)
|
||||
- **registered_projects**: atocore, p04-gigabit, p05-interferometer, p06-polisher, atomizer-v2, abb-space (aliased p08). **Auto-detected, unregistered**: apm (63 active memories), openclaw (9), lead-space (2), drill (1), optiques-fullum (1) — see Wave 1 follow-up below.
|
||||
- **project_state_entries**: 128 across registered projects (verified 2026-04-29)
|
||||
- **entities**: 66 (V1-0 backfill complete; 0 open conflicts)
|
||||
- **off_host_backup**: `papa@192.168.86.39:/home/papa/atocore-backups/` via cron, verified
|
||||
- **nightly_pipeline**: backup → cleanup → rsync → OpenClaw import → vault refresh → extract → auto-triage → **auto-promote/expire (NEW)** → weekly synth/lint Sundays → **retrieval harness (NEW)** → **pipeline summary (NEW)**
|
||||
- **nightly_pipeline**: backup → cleanup → rsync → OpenClaw import → vault refresh → extract → auto-triage → auto-promote/expire → weekly synth/lint Sundays → retrieval harness → pipeline summary
|
||||
- **capture_clients**: claude-code (Stop hook + cwd project inference), openclaw (before_agent_start + llm_output plugin, verified live)
|
||||
- **wiki**: http://dalidou:8100/wiki (browse), /wiki/projects/{id}, /wiki/entities/{id}, /wiki/search
|
||||
- **dashboard**: http://dalidou:8100/admin/dashboard (now shows pipeline health, interaction totals by client, all registered projects)
|
||||
- **active_track**: Engineering V1 Completion (started 2026-04-22). V1-0 landed (`2712c5d`). V1-A density gate CLEARED (784 active ≫ 100 target as of 2026-04-23). V1-A soak gate at day 5/~7 (F4 first run 2026-04-19; nightly clean 2026-04-19 through 2026-04-23; failures confined to the known p04-constraints content gap). Plan: `docs/plans/engineering-v1-completion-plan.md`. Resume map: `docs/plans/v1-resume-state.md`.
|
||||
- **last_nightly_pipeline**: `2026-04-23T03:00:20Z` — harness 17/18, triage promoted=3 rejected=7 human=0, dedup 7 clusters (1 tier1 + 6 tier2 auto-merged), graduation 30-skipped 0-graduated 0-errors, auto-triage drained the queue (0 new candidates 2026-04-22T00:52Z run)
|
||||
- **open_branches**: none — R14 squash-merged as `0989fed` and deployed 2026-04-23T15:20:53Z. V1-A is the next scheduled work
|
||||
- **dashboard**: http://dalidou:8100/admin/dashboard
|
||||
- **active_track**: Wave 1 debt-pay (this session) → V1-A start. V1-A gates have cleared (soak ended 2026-04-26; density 315 ≫ 100). Plan: `docs/plans/engineering-v1-completion-plan.md`. Resume map: `docs/plans/v1-resume-state.md`.
|
||||
- **last_nightly_pipeline**: `2026-04-28T03:00:30Z` — harness 20/20, triage promoted=1 rejected=1 human=0
|
||||
- **open_branches**: `claude/wave1-dashboard-counts-and-memory-fixes` (tip `3a474f7`) — three memory-write-path bugs + two follow-on P2 fixes from Codex's formal audit (`auto_triage.py` PUT body + `/memory/{id}/supersede` status guard). Awaiting Codex re-review of the amended branch before squash-merge/deploy.
## Active Plan
@@ -170,6 +170,14 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
## Session Log
|
||||
|
||||
- **2026-04-29 Codex + Claude (Wave 1 formal audit closed + amends)** Codex's formal audit of `fb4d55c` (Wave 1 first commit on `claude/wave1-dashboard-counts-and-memory-fixes`): verdict GO WITH CONDITIONS. Two P1-prior closures confirmed (dashboard count bug + invalidate top-1 lookup). Project-update P2 was only "partially closed" — API/service plumbing fine, but `scripts/auto_triage.py:417` still PUT `{"content": cand["content"]}` so the operational suggested-project correction was unreachable even with `MemoryUpdateRequest.project` in place. Codex also flagged the symmetric supersede-route gap as same-class adjacent surface and recommended pulling it in here, not in Wave 1.5. Plus one P3: cover retarget-to-empty-project against a global active duplicate. Amended on `3a474f7`: (1) auto_triage PUT body now `{"project": suggested}` with a guard test that lints the script source for the new shape; (2) `/memory/{id}/supersede` mirrors the invalidate guard via `get_memory(id)` — 404 unknown / 200 already_superseded / 409 wrong-status / 200 superseded; (3) regression test for project-empty duplicate detection. Test count 581 → 586. Codex's recommended deployment checklist (post-deploy verifications, including the run-or-simulate auto-triage retarget probe) carried into the merge plan. Awaiting Codex re-review of the amended tip before squash-merge.
|
||||
|
||||
- **2026-04-29 Claude (Wave 1 debt-pay started; Codex review of state-of-service plan)** Audited live state on Dalidou: `/health` build_sha `7042eae` (4d old), harness 20/20, 33,253 vectors, 1,748 docs, 1054 interactions (+103/4d), dashboard memories.active=315 vs integrity.active_memory_count=1091. Drafted a state-of-service assessment + Wave 1/2/3/4 plan, then asked Codex (gpt-5.5) for an adversarial review via `codex exec`. Codex verdict: assessment MIXED, plan ENDORSE WITH CHANGES. Codex caught two factual corrections (the 315-vs-1091 gap is a *sampling bug* not a definitional gap; my "R9 drops unregistered tags" framing is wrong — `extractor_llm.py:213-233` preserves them) and two new memory-write-path bugs I missed. Branched `claude/wave1-dashboard-counts-and-memory-fixes` from `7042eae`, fixed all three: (1) `/admin/dashboard` now uses a new `get_memory_count_summary()` SQL aggregate helper instead of counting inside a confidence-sorted `get_memories(limit=500)` sample; (2) `MemoryUpdateRequest` and `update_memory()` accept `project` with `resolve_project_name` canonicalization + before/after audit, so `auto_triage.py:407` suggested-project corrections will now actually apply; (3) `POST /memory/{id}/invalidate` replaces the `_get_memories(status="active", limit=1)` lookup (which only saw the highest-confidence active row) with a direct id lookup via new `get_memory(id)` helper. 9 regression tests added across `test_memory.py` and `test_invalidate_supersede.py`. Full local suite: 581 passed (572 → 581). Commit `fb4d55c`. Branch not pushed/deployed yet — awaiting Codex audit per working model. Refreshed Orientation block (live_sha, last_updated, test_count, memory counts, capture cadence, registered/unregistered project breakdown, open_branches). 
Wave 1 follow-up still open: (W1.2) one-click registration proposal for unregistered projects with ≥10 active memories (apm=63 is overdue); (W1.3 done by this entry) sync ledger; (W1.4) committed measurable-win probe fixture+JSON output. After this branch lands, V1-A is unblocked: gates have effectively cleared (soak ended 2026-04-26; density 315 ≫ 100 target).
|
||||
|
||||
- **2026-04-25 Codex (p04 constraint gap closed; harness fully green)** Root-caused the remaining `p04-constraints` fixture: the `Zerodur` / `1.2` fact already existed in Trusted Project State (`requirement/key_constraints`), but project-state formatting was category/key ordered and then truncated to the 20% state budget, so contacts/decisions consumed the budget before the relevant requirement. Added query-relevance ranking for Trusted Project State entries before formatting/truncation, with regression coverage in `test_project_state_query_relevance_before_truncation`. Removed the fixture's `known_issue` lane so future p04 constraint regressions are blocking. Cleaned up a duplicate live requirement entry created during diagnosis by invalidating `requirement/mirror-blank-core-constraints`; canonical `requirement/key_constraints` remains active. Verified focused suite: 35 passed. Verified full local suite: 572 passed. Deployed `d3de9f67eaa08dfc5b2d86e8221b8c70fef266d3`; live exact p04 probe now surfaces `[REQUIREMENT] key_constraints` with `1.2` and `Zerodur`. Live retrieval harness: 20/20, 0 known issues, 0 blocking failures.
|
||||
|
||||
- **2026-04-25 Codex (project_id backfill + retrieval stabilization closed)** Merged `codex/project-id-metadata-retrieval` into `main` (`867a1ab`) and deployed to Dalidou. Took Chroma-inclusive backup `/srv/storage/atocore/backups/snapshots/20260424T154358Z`, then ran `scripts/backfill_chunk_project_ids.py` per project; populated projects `p04-gigabit`, `p05-interferometer`, `p06-polisher`, `atomizer-v2`, and `atocore` applied cleanly for 33,253 vectors total, with 0 missing/malformed and an immediate final dry-run showing 33,253 already tagged / 0 updates. Post-backfill harness exposed p06 memory-ranking misses (`Tailscale`, `encoder`), so Codex shipped `4744c69` then `a87d984 fix(memory): widen query-time context candidates`. Full local suite: 571 passed. Live `/health` reports `a87d9845a8c34395a02890f0cf22aa7a46afaf62`, vectors=33,253, sources_ready=true. Live retrieval harness: 19/20, 0 blocking failures, 1 known issue (`p04-constraints` missing `Zerodur` / `1.2`). A repeat backfill dry-run after the code-only stabilization deploy was aborted after the one-off container ran too long; the live service stayed healthy and the earlier post-apply idempotency result remains the migration acceptance record. Dalidou HTTP push credentials are still not configured; this session pushed through the Windows credential path.
|
||||
|
||||
- **2026-04-24 Codex (retrieval boundary deployed + project_id metadata tranche)** Merged `codex/audit-improvements-foundation` to `main` as `f44a211` and pushed to Dalidou Gitea. Took pre-deploy runtime backup `/srv/storage/atocore/backups/snapshots/20260424T144810Z` (DB + registry, no Chroma). Deployed via `papa@dalidou` canonical `deploy/dalidou/deploy.sh`; live `/health` reports build_sha `f44a2114970008a7eec4e7fc2860c8f072914e38`, build_time `2026-04-24T14:48:44Z`, status ok. Post-deploy retrieval harness: 20 fixtures, 19 pass, 0 blocking failures, 1 known issue (`p04-constraints`). The former blocker `p05-broad-status-no-atomizer` now passes. Manual p05 `context-build "current status"` spot check shows no p04/Atomizer source bleed in retrieved chunks. Started follow-up branch `codex/project-id-metadata-retrieval`: registered-project ingestion now writes explicit `project_id` into DB chunk metadata and Chroma vector metadata; retrieval prefers exact `project_id` when present and keeps path/tag matching as legacy fallback; added dry-run-by-default `scripts/backfill_chunk_project_ids.py` to backfill SQLite + Chroma metadata; added tests for project-id ingestion, registered refresh propagation, exact project-id retrieval, and collision fallback. Verified targeted suite (`test_ingestion.py`, `test_project_registry.py`, `test_retrieval.py`): 36 passed. Verified full suite: 556 passed in 72.44s. Branch not merged or deployed yet.
|
||||
|
||||
- **2026-04-24 Codex (project_id audit response)** Applied independent-audit fixes on `codex/project-id-metadata-retrieval`. Closed the nightly `/ingest/sources` clobber risk by adding registry-level `derive_project_id_for_path()` and making unscoped `ingest_file()` derive ownership from registered ingest roots when possible; `refresh_registered_project()` still passes the canonical project id directly. Changed retrieval so empty `project_id` falls through to legacy path/tag ownership instead of short-circuiting as unowned. Hardened `scripts/backfill_chunk_project_ids.py`: `--apply` now requires `--chroma-snapshot-confirmed`, runs Chroma metadata updates before SQLite writes, batches updates, skips/report missing vectors, skips/report malformed metadata, reports already-tagged rows, and turns missing ingestion tables into a JSON `db_warning` instead of a traceback. Added tests for auto-derive ingestion, empty-project fallback, ingest-root overlap rejection, and backfill dry-run/apply/snapshot/missing-vector/malformed cases. Verified targeted suite (`test_backfill_chunk_project_ids.py`, `test_ingestion.py`, `test_project_registry.py`, `test_retrieval.py`): 45 passed. Verified full suite: 565 passed in 73.16s. Local dry-run on empty/default data returns 0 updates with `db_warning` rather than crashing. Branch still not merged/deployed.
|
||||
|
||||
@@ -1,11 +1,18 @@
-# AtoCore - Current State (2026-04-24)
+# AtoCore - Current State (2026-04-25)

-Update 2026-04-24: audit-improvements deployed as `f44a211`; live harness is
-19/20 with 0 blocking failures and 1 known content gap. Active follow-up branch
-`codex/project-id-metadata-retrieval` is at 567 passing tests.
+Update 2026-04-25: project-id chunk/vector metadata is deployed and backfilled,
+and the final p04 Trusted Project State budget/ranking gap is closed. Live
+Dalidou is on `d3de9f6`; `/health` is ok with 33,253 vectors and sources
+ready. Live retrieval harness is 20/20 with 0 blocking failures and 0 known
+issues. Full local suite: 572 passed.

-Live deploy: `2b86543` · Dalidou health: ok · Harness: 18/20 with 1 known
-content gap and 1 current blocking project-bleed guard · Tests: 553 passing.
+The project-id backfill was applied per populated project after a
+Chroma-inclusive backup at
+`/srv/storage/atocore/backups/snapshots/20260424T154358Z`. The immediate
+post-apply dry-run reported 33,253 already tagged, 0 updates, 0 missing, and 0
+malformed. A later repeat dry-run after the code-only ranking deploy was
+aborted because the one-off container ran too long; the earlier post-apply
+idempotency result remains the migration acceptance record.

 ## V1-0 landed 2026-04-22

@@ -69,10 +76,10 @@ Last nightly run (2026-04-19 03:00 UTC): **31 promoted · 39 rejected · 0 needs
|
||||
| 7G | Re-extraction on prompt version bump | pending |
|
||||
| 7H | Chroma vector hygiene (delete vectors for superseded memories) | pending |
|
||||
|
||||
## Known gaps (honest, refreshed 2026-04-24)
|
||||
## Known gaps (honest, refreshed 2026-04-25)
|
||||
|
||||
1. **Capture surface is Claude-Code-and-OpenClaw only.** Conversations in Claude Desktop, Claude.ai web, phone, or any other LLM UI are NOT captured. Example: the rotovap/mushroom chat yesterday never reached AtoCore because no hook fired. See Q4 below.
|
||||
2. **Project-scoped retrieval guard is deployed and passing.** The April 24 p05 broad-status bleed guard now passes on live Dalidou. The active follow-up branch adds explicit `project_id` chunk/vector metadata so the deployed path/tag heuristic can become a legacy fallback.
|
||||
2. **Project-scoped retrieval guard is deployed and passing.** Explicit `project_id` chunk/vector metadata is now present in SQLite and Chroma for the 33,253-vector corpus. Retrieval prefers exact metadata ownership and keeps path/tag matching as a legacy fallback.
|
||||
3. **Human interface is useful but not yet the V1 Human Mirror.** Wiki/dashboard pages exist, but the spec routes, deterministic mirror files, disputed markers, and curated annotations remain V1-D work.
|
||||
4. **Harness known issue:** `p04-constraints` wants "Zerodur" and "1.2"; live retrieval surfaces related constraints but not those exact strings. Treat as content/state gap until fixed.
|
||||
4. **Harness is currently green.** The former `p04-constraints` known issue is closed; query-relevant Trusted Project State entries now rank before state-budget truncation.
|
||||
5. **Formal docs lag the ledger during fast work.** Use `DEV-LEDGER.md` and `python scripts/live_status.py` for live truth, then copy verified claims into these docs.
|
||||
|
||||
@@ -131,10 +131,12 @@ This sits implicitly between Phase 8 (OpenClaw) and Phase 11
|
||||
(multi-model). Memory-review and engineering-entity commands are
|
||||
deferred from the shared client until their workflows are exercised.
|
||||
|
||||
## What Is Real Today (updated 2026-04-24)
|
||||
## What Is Real Today (updated 2026-04-25)
|
||||
|
||||
- canonical AtoCore runtime on Dalidou (`2b86543`, deploy.sh verified)
|
||||
- 33,253 vectors across 6 registered projects
|
||||
- canonical AtoCore runtime on Dalidou (`d3de9f6`, deploy.sh verified)
|
||||
- 33,253 vectors across 6 registered projects, with explicit `project_id`
|
||||
metadata backfilled into SQLite and Chroma after snapshot
|
||||
`/srv/storage/atocore/backups/snapshots/20260424T154358Z`
|
||||
- 951 captured interactions as of the 2026-04-24 live dashboard; refresh
|
||||
exact live counts with
|
||||
`python scripts/live_status.py`
|
||||
@@ -149,10 +151,13 @@ deferred from the shared client until their workflows are exercised.
|
||||
- 290 active memories and 0 candidate memories as of the 2026-04-24 live
|
||||
dashboard
|
||||
- context pack assembly with 4 tiers: Trusted Project State > identity/preference > project memories > retrieved chunks
|
||||
- query-relevance memory ranking with overlap-density scoring
|
||||
- retrieval eval harness: 20 fixtures; current live has 19 pass, 1 known
|
||||
content gap, and 0 blocking failures after the audit-improvements deploy
|
||||
- 567 tests passing on the active `codex/project-id-metadata-retrieval` branch
|
||||
- query-relevance memory ranking with overlap-density scoring and widened
|
||||
query-time candidate pools so older exact-intent project memories can rank
|
||||
ahead of generic high-confidence notes
|
||||
- query-relevance Trusted Project State ranking before state-budget truncation
|
||||
- retrieval eval harness: 20 fixtures; current live has 20 pass, 0 known
|
||||
issues, and 0 blocking failures
|
||||
- 572 tests passing on `main`
|
||||
- nightly pipeline: backup → cleanup → rsync → OpenClaw import → vault refresh → extract → triage → **auto-promote/expire** → weekly synth/lint → **retrieval harness** → **pipeline summary to project state**
|
||||
- Phase 10 operational: reinforcement-based auto-promotion (ref_count ≥ 3, confidence ≥ 0.7) + stale candidate expiry (14 days unreinforced)
|
||||
- pipeline health visible in dashboard: interaction totals by client, pipeline last_run, harness results, triage stats
|
||||
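The Phase 10 rule in the list above (auto-promote at ref_count ≥ 3 and confidence ≥ 0.7, expire candidates left unreinforced for 14 days) can be read as a small decision function. The following is only a sketch under stated assumptions: `Candidate`, `decide`, and the field names are hypothetical, and the real pipeline may measure staleness or order the checks differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Thresholds as quoted in the list above; the constant names are hypothetical.
PROMOTE_MIN_REFS = 3
PROMOTE_MIN_CONFIDENCE = 0.7
EXPIRE_AFTER = timedelta(days=14)


@dataclass
class Candidate:
    reference_count: int
    confidence: float
    # Assumption: falls back to creation time if the candidate was never reinforced.
    last_reinforced_at: datetime


def decide(cand: Candidate, now: datetime) -> str:
    """Return 'promote', 'expire', or 'keep' for one candidate memory."""
    if (
        cand.reference_count >= PROMOTE_MIN_REFS
        and cand.confidence >= PROMOTE_MIN_CONFIDENCE
    ):
        return "promote"
    # Assumption: "14 days unreinforced" is treated as an inclusive bound.
    if now - cand.last_reinforced_at >= EXPIRE_AFTER:
        return "expire"
    return "keep"


now = datetime(2026, 4, 29, tzinfo=timezone.utc)
print(decide(Candidate(3, 0.7, now), now))
print(decide(Candidate(1, 0.9, now - timedelta(days=15)), now))
```

Checking promotion before expiry means a well-reinforced candidate is never expired on age alone; that ordering is a design choice of this sketch, not something the ledger states.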
@@ -173,9 +178,10 @@ These are the current practical priorities.
|
||||
Target: 100+ active memories.
|
||||
3. **Multi-model triage** (Phase 11 entry) — switch auto-triage to a
|
||||
different model than the extractor for independent validation
|
||||
4. **Fix p04-constraints harness failure** — retrieval doesn't surface
|
||||
"Zerodur" for p04 constraint queries. Investigate if it's a missing
|
||||
memory or retrieval ranking issue.
|
||||
4. **Fix Dalidou Git credentials** — the host checkout can fetch but cannot
|
||||
push to Gitea over HTTP in non-interactive SSH sessions. Prefer switching
|
||||
the deploy checkout to a Gitea SSH key; PAT-backed `credential.helper store`
|
||||
is the fallback.
|
||||
|
||||
## Active — Engineering V1 Completion Track (started 2026-04-22)
|
||||
|
||||
|
||||
@@ -404,19 +404,23 @@ def process_candidate(cand, base_url, active_cache, state_cache, known_projects,
|
||||
known_projects, TIER1_MODEL, DEFAULT_TIMEOUT_S,
|
||||
)
|
||||
|
||||
# Project misattribution fix: suggested_project surfaces from tier 1
|
||||
# Project misattribution fix: suggested_project surfaces from tier 1.
|
||||
# Earlier code sent only {"content": cand["content"]} in the PUT, which left
|
||||
# the project field unchanged because MemoryUpdateRequest had no
|
||||
# project key and the service signature didn't accept one. Wave 1
|
||||
# added project to MemoryUpdateRequest and update_memory(); this
|
||||
# caller now actually applies the suggested project.
|
||||
suggested = (v1.get("suggested_project") or "").strip()
|
||||
if suggested and suggested != project and suggested in known_projects:
|
||||
# Try to re-canonicalize the memory's project
|
||||
if not dry_run:
|
||||
try:
|
||||
import urllib.request as _ur
|
||||
req = _ur.Request(
|
||||
f"{base_url}/memory/{mid}", method="PUT",
|
||||
headers={"Content-Type": "application/json"},
|
||||
data=json.dumps({"content": cand["content"]}).encode("utf-8"),
|
||||
data=json.dumps({"project": suggested}).encode("utf-8"),
|
||||
)
|
||||
_ur.urlopen(req, timeout=10).read() # triggers canonicalization via update
|
||||
_ur.urlopen(req, timeout=10).read()
|
||||
except Exception:
|
||||
pass
|
||||
print(f" ↺ misattribution flagged: {project!r} → {suggested!r}")
|
||||
|
||||
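The retarget call in the hunk above reduces to one PUT whose JSON body carries only the new project. A minimal sketch that builds the request without sending it is shown here; the helper name `build_retarget_request`, the example base URL, and the memory id are illustrative assumptions, not part of `auto_triage.py`.

```python
import json
import urllib.request


def build_retarget_request(
    base_url: str, memory_id: str, suggested: str
) -> urllib.request.Request:
    """Build (but do not send) the PUT that retargets a memory's project."""
    body = json.dumps({"project": suggested}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/memory/{memory_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical values, for illustration only.
req = build_retarget_request("http://localhost:8100", "mem-123", "p04-gigabit")
print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Sending it would be `urllib.request.urlopen(req, timeout=10)`. Because the hunk swallows every exception around that call, a caller that needs to know whether the retarget actually applied would have to check the response or re-read the memory afterwards.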
@@ -27,8 +27,7 @@
|
||||
"expect_absent": [
|
||||
"polisher suite"
|
||||
],
|
||||
"known_issue": true,
|
||||
"notes": "Known content gap as of 2026-04-24: live retrieval surfaces related constraints but not the exact Zerodur / 1.2 strings. Keep visible, but do not make nightly harness red until the source/state gap is fixed."
|
||||
"notes": "Regression guard: query-relevant Trusted Project State requirements must survive the project-state budget cap."
|
||||
},
|
||||
{
|
||||
"name": "p04-short-ambiguous",
|
||||
|
||||
@@ -303,6 +303,7 @@ class MemoryUpdateRequest(BaseModel):
|
||||
memory_type: str | None = None
|
||||
domain_tags: list[str] | None = None
|
||||
valid_until: str | None = None
|
||||
project: str | None = None
|
||||
|
||||
|
||||
class ProjectStateSetRequest(BaseModel):
|
||||
@@ -636,6 +637,7 @@ def api_update_memory(memory_id: str, req: MemoryUpdateRequest) -> dict:
|
||||
memory_type=req.memory_type,
|
||||
domain_tags=req.domain_tags,
|
||||
valid_until=req.valid_until,
|
||||
project=req.project,
|
||||
)
|
||||
except ValueError as e:
|
||||
raise HTTPException(status_code=400, detail=str(e))
|
||||
@@ -794,33 +796,25 @@ def api_invalidate_memory(
|
||||
req: MemoryInvalidateRequest | None = None,
|
||||
) -> dict:
|
||||
"""Retract an active memory (Issue E — active → invalid)."""
|
||||
from atocore.memory.service import get_memories as _get_memories, invalidate_memory
|
||||
from atocore.memory.service import get_memory, invalidate_memory
|
||||
|
||||
reason = req.reason if req else ""
|
||||
# Quick existence/status check for a clean 404 vs 409.
|
||||
existing = [
|
||||
m for m in _get_memories(status="active", limit=1)
|
||||
if m.id == memory_id
|
||||
]
|
||||
if not existing:
|
||||
# Fall through to generic not-active if the id exists in another status.
|
||||
all_match = [
|
||||
m for m in _get_memories(status="candidate", limit=5000)
|
||||
+ _get_memories(status="invalid", limit=5000)
|
||||
+ _get_memories(status="superseded", limit=5000)
|
||||
if m.id == memory_id
|
||||
]
|
||||
if all_match:
|
||||
if all_match[0].status == "invalid":
|
||||
# Direct id lookup — earlier code used get_memories(status='active', limit=1)
|
||||
# which only saw the highest-confidence active row, so any other active
|
||||
# memory would 404 here even though it existed.
|
||||
target = get_memory(memory_id)
|
||||
if target is None:
|
||||
raise HTTPException(status_code=404, detail=f"Memory not found: {memory_id}")
|
||||
if target.status == "invalid":
|
||||
return {"status": "already_invalid", "id": memory_id}
|
||||
if target.status != "active":
|
||||
raise HTTPException(
|
||||
status_code=409,
|
||||
detail=(
|
||||
f"Memory {memory_id} is {all_match[0].status}; "
|
||||
f"Memory {memory_id} is {target.status}; "
|
||||
"use /reject for candidates"
|
||||
),
|
||||
)
|
||||
raise HTTPException(status_code=404, detail=f"Memory not found: {memory_id}")
|
||||
|
||||
success = invalidate_memory(memory_id, actor="api-http", reason=reason)
|
||||
if not success:
|
||||
@@ -833,15 +827,33 @@ def api_supersede_memory(
|
||||
memory_id: str,
|
||||
req: MemorySupersedeRequest | None = None,
|
||||
) -> dict:
|
||||
"""Supersede an active memory (Issue E — active → superseded)."""
|
||||
from atocore.memory.service import supersede_memory
|
||||
"""Supersede an active memory (Issue E — active → superseded).
|
||||
|
||||
Mirrors the invalidate route's status guard: candidates and other
|
||||
non-active rows must not silently flip to superseded.
|
||||
"""
|
||||
from atocore.memory.service import get_memory, supersede_memory
|
||||
|
||||
reason = req.reason if req else ""
|
||||
target = get_memory(memory_id)
|
||||
if target is None:
|
||||
raise HTTPException(status_code=404, detail=f"Memory not found: {memory_id}")
|
||||
if target.status == "superseded":
|
||||
return {"status": "already_superseded", "id": memory_id}
|
||||
if target.status != "active":
|
||||
raise HTTPException(
|
||||
status_code=409,
|
||||
detail=(
|
||||
f"Memory {memory_id} is {target.status}; "
|
||||
"only active memories can be superseded"
|
||||
),
|
||||
)
|
||||
|
||||
success = supersede_memory(memory_id, actor="api-http", reason=reason)
|
||||
if not success:
|
||||
raise HTTPException(
|
||||
status_code=404,
|
||||
detail=f"Memory not found or not active: {memory_id}",
|
||||
status_code=409,
|
||||
detail=f"Memory {memory_id} could not be superseded",
|
||||
)
|
||||
return {"status": "superseded", "id": memory_id}
|
||||
|
||||
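The status guard that the supersede route now shares with the invalidate route can be summarized as a pure mapping from the memory's current status to an HTTP outcome. This is a sketch of the decision table only; `supersede_guard` and the outcome labels `not_found` and `wrong_status` are hypothetical names, and the real route raises `HTTPException` rather than returning tuples.

```python
from __future__ import annotations


def supersede_guard(status: str | None) -> tuple[int, str]:
    """Map a memory's current status to (HTTP code, outcome).

    ``status`` is None when no memory with the requested id exists.
    """
    if status is None:
        return 404, "not_found"
    if status == "superseded":
        # A repeated call is idempotent: report the state, change nothing.
        return 200, "already_superseded"
    if status != "active":
        # Candidates, invalid rows, and so on must not silently flip.
        return 409, "wrong_status"
    return 200, "superseded"


for s in (None, "superseded", "candidate", "active"):
    print(s, supersede_guard(s))
```

Keeping the decision separate from the write makes each of the four branches straightforward to cover with a unit test.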
@@ -1280,16 +1292,20 @@ def api_dashboard() -> dict:
|
||||
health beyond the basic /health endpoint.
|
||||
"""
|
||||
import json as _json
|
||||
from collections import Counter
|
||||
from datetime import datetime as _dt, timezone as _tz
|
||||
|
||||
all_memories = get_memories(active_only=False, limit=500)
|
||||
active = [m for m in all_memories if m.status == "active"]
|
||||
candidates = [m for m in all_memories if m.status == "candidate"]
|
||||
from atocore.memory.service import get_memory_count_summary
|
||||
|
||||
type_counts = dict(Counter(m.memory_type for m in active))
|
||||
project_counts = dict(Counter(m.project or "(none)" for m in active))
|
||||
reinforced = [m for m in active if m.reference_count > 0]
|
||||
# SQL-backed counts. Earlier code derived these by sampling the top
|
||||
# 500 rows of get_memories() ordered by confidence — anything past
|
||||
# the cap was invisible, so /admin/dashboard silently undercounted
|
||||
# active memories once the corpus crossed ~500 active rows.
|
||||
counts = get_memory_count_summary()
|
||||
active_total = counts["active"]["total"]
|
||||
candidate_total = counts["by_status"].get("candidate", 0)
|
||||
type_counts = counts["active"]["by_type"]
|
||||
project_counts = counts["active"]["by_project"]
|
||||
reinforced_total = counts["active"]["reinforced"]
|
||||
|
||||
# Interaction stats — total + by_client from DB directly
|
||||
interaction_stats: dict = {"most_recent": None, "total": 0, "by_client": {}}
|
||||
@@ -1402,13 +1418,13 @@ def api_dashboard() -> dict:
|
||||
|
||||
# Triage queue health
|
||||
triage: dict = {
|
||||
"pending": len(candidates),
|
||||
"pending": candidate_total,
|
||||
"review_url": "/admin/triage",
|
||||
}
|
||||
if len(candidates) > 50:
|
||||
triage["warning"] = f"High queue: {len(candidates)} candidates pending review."
|
||||
elif len(candidates) > 20:
|
||||
triage["notice"] = f"{len(candidates)} candidates awaiting triage."
|
||||
if candidate_total > 50:
|
||||
triage["warning"] = f"High queue: {candidate_total} candidates pending review."
|
||||
elif candidate_total > 20:
|
||||
triage["notice"] = f"{candidate_total} candidates awaiting triage."
|
||||
|
||||
# Recent audit activity (Phase 4 V1) — last 10 mutations for operator
|
||||
recent_audit: list[dict] = []
|
||||
@@ -1420,11 +1436,13 @@ def api_dashboard() -> dict:
|
||||
|
||||
return {
|
||||
"memories": {
|
||||
"active": len(active),
|
||||
"candidates": len(candidates),
|
||||
"active": active_total,
|
||||
"candidates": candidate_total,
|
||||
"by_type": type_counts,
|
||||
"by_project": project_counts,
|
||||
"reinforced": len(reinforced),
|
||||
"reinforced": reinforced_total,
|
||||
"by_status": counts["by_status"],
|
||||
"total": counts["total"],
|
||||
},
|
||||
"project_state": {
|
||||
"counts": ps_counts,
|
||||
|
||||
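The undercount described in the comment above is easy to reproduce in isolation. The sketch below uses an in-memory SQLite table with a deliberately simplified, hypothetical schema (`id`, `status`, `confidence`) and toy row counts; it is not the AtoCore schema and not the body of `get_memory_count_summary()`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories (id INTEGER PRIMARY KEY, status TEXT, confidence REAL)"
)
# Toy data: 700 active rows, plus 50 candidates that sort last by confidence.
rows = [("active", 0.9)] * 700 + [("candidate", 0.1)] * 50
conn.executemany("INSERT INTO memories (status, confidence) VALUES (?, ?)", rows)

# Old approach: count statuses inside a capped, confidence-sorted sample.
sample = conn.execute(
    "SELECT status FROM memories ORDER BY confidence DESC LIMIT 500"
).fetchall()
sampled_active = sum(1 for (status,) in sample if status == "active")
sampled_candidates = sum(1 for (status,) in sample if status == "candidate")

# New approach: let SQL aggregate over the whole table.
by_status = dict(
    conn.execute("SELECT status, COUNT(*) FROM memories GROUP BY status")
)

print(sampled_active, sampled_candidates)  # 500 0
print(by_status["active"], by_status["candidate"])  # 700 50
```

The sampled count can never exceed the cap, and low-confidence statuses can vanish from the sample entirely, which is why the triage queue figure was affected as well as the active total.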
@@ -11,7 +11,7 @@ from dataclasses import dataclass, field
|
||||
from pathlib import Path
|
||||
|
||||
import atocore.config as _config
|
||||
from atocore.context.project_state import format_project_state, get_state
|
||||
from atocore.context.project_state import ProjectStateEntry, format_project_state, get_state
|
||||
from atocore.memory.service import get_memories_for_context
|
||||
from atocore.observability.logger import get_logger
|
||||
from atocore.engineering.service import get_entities, get_entity_with_context
|
||||
@@ -116,6 +116,11 @@ def build_context(
|
||||
if canonical_project:
|
||||
state_entries = get_state(canonical_project)
|
||||
if state_entries:
|
||||
state_entries = _rank_project_state_entries(
|
||||
state_entries,
|
||||
query=user_prompt,
|
||||
project=canonical_project,
|
||||
)
|
||||
project_state_text = format_project_state(state_entries)
|
||||
project_state_text, project_state_chars = _truncate_text_block(
|
||||
project_state_text,
|
||||
@@ -284,6 +289,55 @@ def get_last_context_pack() -> ContextPack | None:
|
||||
return _last_context_pack
|
||||
|
||||
|
||||
def _rank_project_state_entries(
|
||||
entries: list[ProjectStateEntry],
|
||||
query: str,
|
||||
project: str,
|
||||
) -> list[ProjectStateEntry]:
|
||||
"""Promote query-relevant trusted state before the state band is truncated."""
|
||||
if not query or len(entries) <= 1:
|
||||
return entries
|
||||
|
||||
from atocore.memory.reinforcement import _normalize, _tokenize
|
||||
|
||||
query_text = _normalize(query.replace("_", " "))
|
||||
query_tokens = set(_tokenize(query_text))
|
||||
query_tokens -= {
|
||||
"how",
|
||||
"what",
|
||||
"when",
|
||||
"where",
|
||||
"which",
|
||||
"who",
|
||||
"why",
|
||||
"current",
|
||||
"status",
|
||||
"project",
|
||||
}
|
||||
for part in (project or "").lower().replace("_", "-").split("-"):
|
||||
query_tokens.discard(part)
|
||||
if not query_tokens:
|
||||
return entries
|
||||
|
||||
scored: list[tuple[int, float, float, int, ProjectStateEntry]] = []
|
||||
for index, entry in enumerate(entries):
|
||||
entry_text = " ".join(
|
||||
[
|
||||
entry.category,
|
||||
entry.key.replace("_", " "),
|
||||
entry.value,
|
||||
entry.source,
|
||||
]
|
||||
)
|
||||
entry_tokens = _tokenize(_normalize(entry_text))
|
||||
overlap = len(entry_tokens & query_tokens) if entry_tokens else 0
|
||||
density = overlap / len(entry_tokens) if entry_tokens else 0.0
|
||||
scored.append((overlap, density, entry.confidence, -index, entry))
|
||||
|
||||
scored.sort(key=lambda item: (item[0], item[1], item[2], item[3]), reverse=True)
|
||||
return [entry for _, _, _, _, entry in scored]
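The ranking step above can be sketched in isolation. This is a minimal approximation, not the project's implementation: the `Entry` dataclass and the `tokenize` helper are hypothetical stand-ins for `ProjectStateEntry` and `atocore.memory.reinforcement._tokenize`, whose real behavior is not shown in this diff, and the `source` field is omitted for brevity.

```python
import re
from dataclasses import dataclass


@dataclass
class Entry:
    # Hypothetical stand-in for ProjectStateEntry (fields assumed from the diff).
    category: str
    key: str
    value: str
    confidence: float = 1.0


# Same stop tokens as the diff's _rank_project_state_entries.
STOP = {"how", "what", "when", "where", "which", "who", "why",
        "current", "status", "project"}


def tokenize(text: str) -> set[str]:
    # Assumed tokenizer: lowercase alphanumeric words of length >= 3.
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if len(t) >= 3}


def rank_entries(entries: list[Entry], query: str, project: str) -> list[Entry]:
    query_tokens = tokenize(query.replace("_", " ")) - STOP
    # Project name parts are scope noise, not query intent.
    for part in project.lower().replace("_", "-").split("-"):
        query_tokens.discard(part)
    if not query_tokens or len(entries) <= 1:
        return entries

    scored = []
    for index, entry in enumerate(entries):
        text = " ".join([entry.category, entry.key.replace("_", " "), entry.value])
        tokens = tokenize(text)
        overlap = len(tokens & query_tokens)
        density = overlap / len(tokens) if tokens else 0.0
        # -index preserves the original order among otherwise equal entries.
        scored.append((overlap, density, entry.confidence, -index, entry))

    # Sort on the four numeric keys only, so Entry objects are never compared.
    scored.sort(key=lambda item: item[:4], reverse=True)
    return [item[4] for item in scored]


entries = [
    Entry("contact", "abb-space", "ABB Space is the primary vendor contact for polishing."),
    Entry("requirement", "key_constraints", "1.2 m Zerodur mirror, WFE below 15 nm."),
]
ranked = rank_entries(entries, "what are the key program constraints", "p04-gigabit")
print([e.category for e in ranked])
```

Because ranking happens before the state band is truncated, the requirement entry (two query hits) is placed ahead of the contact entry (one incidental hit) and is therefore more likely to survive a tight budget.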
|
||||
|
||||
|
||||
def _rank_chunks(
|
||||
candidates: list[ChunkResult],
|
||||
project_hint: str | None,
|
||||
|
||||
@@ -50,6 +50,9 @@ MEMORY_STATUSES = [
|
||||
"graduated", # Phase 5: memory has become an entity; content frozen, forward pointer in properties
|
||||
]
|
||||
|
||||
DEFAULT_CONTEXT_MEMORY_LIMIT = 30
|
||||
QUERY_CONTEXT_MEMORY_LIMIT = 120
|
||||
|
||||
|
||||
@dataclass
|
||||
class Memory:
|
||||
@@ -344,6 +347,83 @@ def get_memories(
|
||||
return [_row_to_memory(r) for r in rows]
|
||||
|
||||
|
||||
def get_memory(memory_id: str) -> Memory | None:
|
||||
"""Return a single memory by id, or None if missing.
|
||||
|
||||
Direct id lookup (no LIMIT, no confidence ordering) — the right
|
||||
primitive for routes that need to check a specific memory's status
|
||||
before acting. Avoids the sampling pitfall where ``get_memories``
|
||||
with a small ``limit`` could hide a target row sorted past the cap.
|
||||
"""
|
||||
with get_connection() as conn:
|
||||
row = conn.execute(
|
||||
"SELECT * FROM memories WHERE id = ?", (memory_id,)
|
||||
).fetchone()
|
||||
return _row_to_memory(row) if row else None
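The sampling pitfall that `get_memory` avoids can be reproduced with a small in-memory SQLite table. This is a sketch under assumptions: the three-column schema and the function names `sampled_lookup` and `direct_lookup` are illustrative only and do not mirror the real `memories` table or the real route code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed schema; the real table has more columns.
conn.execute("CREATE TABLE memories (id TEXT PRIMARY KEY, status TEXT, confidence REAL)")
conn.executemany(
    "INSERT INTO memories VALUES (?, ?, ?)",
    [("high", "active", 0.99), ("low", "active", 0.55)],
)


def sampled_lookup(memory_id: str):
    # Legacy pattern: fetch a confidence-sorted sample, then search inside it.
    rows = conn.execute(
        "SELECT id FROM memories WHERE status = 'active' "
        "ORDER BY confidence DESC LIMIT 1"
    ).fetchall()
    return next((r[0] for r in rows if r[0] == memory_id), None)


def direct_lookup(memory_id: str):
    # Direct primary-key lookup: no LIMIT, no ordering, no sampling.
    row = conn.execute(
        "SELECT id FROM memories WHERE id = ?", (memory_id,)
    ).fetchone()
    return row[0] if row else None


print(sampled_lookup("low"))  # the row exists but sits outside the sample
print(direct_lookup("low"))
```

With the sampled lookup, the lower-confidence row looks missing even though it exists, which is the shape of failure that would surface as a spurious 404.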
|
||||
|
||||
|
||||
def get_memory_count_summary() -> dict:
|
||||
"""Aggregate memory counts straight from SQL (no sampling).
|
||||
|
||||
Returned shape:
|
||||
{
|
||||
"total": int, # all rows
|
||||
"by_status": {status: int, ...}, # full table
|
||||
"active": {
|
||||
"total": int,
|
||||
"reinforced": int, # active with reference_count > 0
|
||||
"by_type": {memory_type: int, ...},
|
||||
"by_project": {project_or_none: int, ...},
|
||||
},
|
||||
}
|
||||
|
||||
Distinct from ``get_memories(...)``, which is a row-fetcher with a
|
||||
confidence-sorted LIMIT and is therefore not safe for counting.
|
||||
"""
|
||||
summary: dict = {
|
||||
"total": 0,
|
||||
"by_status": {},
|
||||
"active": {
|
||||
"total": 0,
|
||||
"reinforced": 0,
|
||||
"by_type": {},
|
||||
"by_project": {},
|
||||
},
|
||||
}
|
||||
|
||||
with get_connection() as conn:
|
||||
row = conn.execute("SELECT count(*) FROM memories").fetchone()
|
||||
summary["total"] = row[0] if row else 0
|
||||
|
||||
rows = conn.execute(
|
||||
"SELECT status, count(*) FROM memories GROUP BY status"
|
||||
).fetchall()
|
||||
summary["by_status"] = {r[0]: r[1] for r in rows}
|
||||
|
||||
active_total = summary["by_status"].get("active", 0)
|
||||
summary["active"]["total"] = active_total
|
||||
|
||||
rows = conn.execute(
|
||||
"SELECT memory_type, count(*) FROM memories "
|
||||
"WHERE status = 'active' GROUP BY memory_type"
|
||||
).fetchall()
|
||||
summary["active"]["by_type"] = {r[0]: r[1] for r in rows}
|
||||
|
||||
rows = conn.execute(
|
||||
"SELECT COALESCE(NULLIF(project, ''), '(none)') AS project, count(*) "
|
||||
"FROM memories WHERE status = 'active' GROUP BY project"
|
||||
).fetchall()
|
||||
summary["active"]["by_project"] = {r[0]: r[1] for r in rows}
|
||||
|
||||
row = conn.execute(
|
||||
"SELECT count(*) FROM memories "
|
||||
"WHERE status = 'active' AND reference_count > 0"
|
||||
).fetchone()
|
||||
summary["active"]["reinforced"] = row[0] if row else 0
|
||||
|
||||
return summary
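The aggregate queries above can be tried against an in-memory SQLite database. This is a minimal sketch with an assumed three-column schema, not the real `memories` table. One deliberate difference: the result alias is named `bucket` (a hypothetical name) and the query groups by that alias, so rows with a NULL project and rows with an empty-string project are guaranteed to land in the same `(none)` group regardless of how an alias that shadows a column name would be resolved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed schema for illustration.
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, status TEXT, project TEXT)")
rows = (
    [("active", "p04-gigabit")] * 3
    + [("active", ""), ("active", None), ("candidate", "")]
)
conn.executemany("INSERT INTO memories (status, project) VALUES (?, ?)", rows)

# Full-table counts: every row contributes, no LIMIT involved.
by_status = dict(
    conn.execute("SELECT status, count(*) FROM memories GROUP BY status").fetchall()
)

# Empty and NULL projects are folded into one '(none)' bucket.
by_project = dict(
    conn.execute(
        "SELECT COALESCE(NULLIF(project, ''), '(none)') AS bucket, count(*) "
        "FROM memories WHERE status = 'active' GROUP BY bucket"
    ).fetchall()
)

print(by_status)
print(by_project)
```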
|
||||
|
||||
|
||||
def update_memory(
|
||||
memory_id: str,
|
||||
content: str | None = None,
|
||||
@@ -352,6 +432,7 @@ def update_memory(
|
||||
memory_type: str | None = None,
|
||||
domain_tags: list[str] | None = None,
|
||||
valid_until: str | None = None,
|
||||
project: str | None = None,
|
||||
actor: str = "api",
|
||||
note: str = "",
|
||||
) -> bool:
|
||||
@@ -365,6 +446,10 @@ def update_memory(
|
||||
|
||||
next_content = content if content is not None else existing["content"]
|
||||
next_status = status if status is not None else existing["status"]
|
||||
next_project = (
|
||||
resolve_project_name(project) if project is not None
|
||||
else (existing["project"] or "")
|
||||
)
|
||||
if confidence is not None:
|
||||
_validate_confidence(confidence)
|
||||
|
||||
@@ -372,7 +457,7 @@ def update_memory(
|
||||
duplicate = conn.execute(
|
||||
"SELECT id FROM memories "
|
||||
"WHERE memory_type = ? AND content = ? AND project = ? AND status = 'active' AND id != ?",
|
||||
-(existing["memory_type"], next_content, existing["project"] or "", memory_id),
+(existing["memory_type"], next_content, next_project, memory_id),
|
||||
).fetchone()
|
||||
if duplicate:
|
||||
raise ValueError("Update would create a duplicate active memory")
|
||||
@@ -383,6 +468,7 @@ def update_memory(
|
||||
"status": existing["status"],
|
||||
"confidence": existing["confidence"],
|
||||
"memory_type": existing["memory_type"],
|
||||
"project": existing["project"] or "",
|
||||
}
|
||||
after_snapshot = dict(before_snapshot)
|
||||
|
||||
@@ -419,6 +505,10 @@ def update_memory(
|
||||
updates.append("valid_until = ?")
|
||||
params.append(vu)
|
||||
after_snapshot["valid_until"] = vu or ""
|
||||
if project is not None:
|
||||
updates.append("project = ?")
|
||||
params.append(next_project)
|
||||
after_snapshot["project"] = next_project
|
||||
|
||||
if not updates:
|
||||
return False
|
||||
@@ -896,6 +986,7 @@ def get_memories_for_context(
|
||||
from atocore.memory.reinforcement import _normalize, _tokenize
|
||||
|
||||
query_tokens = _tokenize(_normalize(query))
|
||||
query_tokens = _prepare_memory_query_tokens(query_tokens, project=project)
|
||||
if not query_tokens:
|
||||
query_tokens = None
|
||||
|
||||
@@ -908,12 +999,13 @@ def get_memories_for_context(
|
||||
# ``_rank_memories_for_query`` via Python's stable sort.
|
||||
pool: list[Memory] = []
|
||||
seen_ids: set[str] = set()
|
||||
candidate_limit = QUERY_CONTEXT_MEMORY_LIMIT if query_tokens is not None else DEFAULT_CONTEXT_MEMORY_LIMIT
|
||||
for mtype in memory_types:
|
||||
for mem in get_memories(
|
||||
memory_type=mtype,
|
||||
project=project,
|
||||
min_confidence=0.5,
|
||||
-limit=30,
+limit=candidate_limit,
|
||||
):
|
||||
if mem.id in seen_ids:
|
||||
continue
|
||||
@@ -980,11 +1072,11 @@ def _rank_memories_for_query(
|
||||
) -> list["Memory"]:
|
||||
"""Rerank a memory list by lexical overlap with a pre-tokenized query.
|
||||
|
||||
-Primary key: overlap_density (overlap_count / memory_token_count),
-which rewards short focused memories that match the query precisely
-over long overview memories that incidentally share a few tokens.
-Secondary: absolute overlap count. Tertiary: domain-tag match.
-Quaternary: confidence.
+Primary key: absolute overlap count, which keeps a richer memory
+matching multiple query-intent terms ahead of a short memory that
+only happens to share one term. Secondary: overlap_density
+(overlap_count / memory_token_count), so ties still prefer short
+focused memories. Tertiary: domain-tag match. Quaternary: confidence.
|
||||
|
||||
Phase 3: domain_tags contribute a boost when they appear in the
|
||||
query text. A memory tagged [optics, thermal] for a query about
|
||||
@@ -1010,10 +1102,46 @@ def _rank_memories_for_query(
|
||||
tag_hits += 1
|
||||
|
||||
scored.append((density, overlap, tag_hits, mem.confidence, mem))
|
||||
-scored.sort(key=lambda t: (t[0], t[1], t[2], t[3]), reverse=True)
+scored.sort(key=lambda t: (t[1], t[0], t[2], t[3]), reverse=True)
|
||||
return [mem for _, _, _, _, mem in scored]
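The effect of swapping the first two sort keys can be seen on a toy example. The tuples below are illustrative values only (not taken from real memories), reduced to `(density, overlap, label)`.

```python
# (density, overlap, label): density = overlap / token count of the memory.
scored = [
    (1.0, 1, "terse one-hit memory"),   # 1 of 1 tokens match the query
    (0.3, 3, "rich three-hit memory"),  # 3 of 10 tokens match the query
]

# Old ordering: density first, so the terse memory wins.
density_first = sorted(scored, key=lambda t: (t[0], t[1]), reverse=True)

# New ordering: absolute overlap first, density only breaks ties.
overlap_first = sorted(scored, key=lambda t: (t[1], t[0]), reverse=True)

print(density_first[0][2])
print(overlap_first[0][2])
```

Density still matters under the new ordering, but only among memories with the same overlap count.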
|
||||
|
||||
|
||||
_MEMORY_QUERY_STOP_TOKENS = {
|
||||
"how",
|
||||
"what",
|
||||
"when",
|
||||
"where",
|
||||
"which",
|
||||
"who",
|
||||
"why",
|
||||
"current",
|
||||
"status",
|
||||
"project",
|
||||
"machine",
|
||||
}
|
||||
|
||||
_MEMORY_QUERY_TOKEN_EXPANSIONS = {
|
||||
"remotely": {"remote"},
|
||||
}
|
||||
|
||||
|
||||
def _prepare_memory_query_tokens(
|
||||
query_tokens: set[str],
|
||||
project: str | None = None,
|
||||
) -> set[str]:
|
||||
"""Remove project-scope noise and add tiny intent-preserving expansions."""
|
||||
prepared = set(query_tokens)
|
||||
for token in list(prepared):
|
||||
prepared.update(_MEMORY_QUERY_TOKEN_EXPANSIONS.get(token, set()))
|
||||
|
||||
prepared -= _MEMORY_QUERY_STOP_TOKENS
|
||||
if project:
|
||||
for part in project.lower().replace("_", "-").split("-"):
|
||||
if part:
|
||||
prepared.discard(part)
|
||||
return prepared
|
||||
|
||||
|
||||
def _row_to_memory(row) -> Memory:
|
||||
"""Convert a DB row to Memory dataclass."""
|
||||
import json as _json
|
||||
|
||||
@@ -143,6 +143,52 @@ def test_project_state_respects_total_budget(tmp_data_dir, sample_markdown):
|
||||
assert len(pack.formatted_context) <= 120
|
||||
|
||||
|
||||
def test_project_state_query_relevance_before_truncation(tmp_data_dir, sample_markdown):
|
||||
"""Relevant trusted state should survive the project-state budget cap."""
|
||||
init_db()
|
||||
init_project_state_schema()
|
||||
ingest_file(sample_markdown)
|
||||
|
||||
set_state(
|
||||
"p04-gigabit",
|
||||
"contact",
|
||||
"abb-space",
|
||||
"ABB Space is the primary vendor contact for polishing, CCP, IBF, procurement coordination, "
|
||||
"contract administration, interface planning, and delivery discussions.",
|
||||
)
|
||||
set_state(
|
||||
"p04-gigabit",
|
||||
"decision",
|
||||
"back-structure",
|
||||
"Option B selected: conical isogrid back structure with variable rib density. "
|
||||
"Chosen over flat-back for stiffness-to-weight ratio and manufacturability.",
|
||||
)
|
||||
set_state(
|
||||
"p04-gigabit",
|
||||
"decision",
|
||||
"polishing-vendor",
|
||||
"ABB Space selected as polishing vendor. Contract includes computer-controlled polishing "
|
||||
"and ion beam figuring.",
|
||||
)
|
||||
set_state(
|
||||
"p04-gigabit",
|
||||
"requirement",
|
||||
"key_constraints",
|
||||
"The program targets a 1.2 m lightweight Zerodur mirror with filtered mechanical WFE below 15 nm "
|
||||
"and mass below 103.5 kg.",
|
||||
)
|
||||
|
||||
pack = build_context(
|
||||
"what are the key GigaBIT M1 program constraints",
|
||||
project_hint="p04-gigabit",
|
||||
budget=3000,
|
||||
)
|
||||
|
||||
assert "Zerodur" in pack.formatted_context
|
||||
assert "1.2" in pack.formatted_context
|
||||
assert pack.formatted_context.find("[REQUIREMENT]") < pack.formatted_context.find("[CONTACT]")
|
||||
|
||||
|
||||
def test_project_hint_matches_state_case_insensitively(tmp_data_dir, sample_markdown):
|
||||
"""Project state lookup should not depend on exact casing."""
|
||||
init_db()
|
||||
|
||||
@@ -192,3 +192,120 @@ def test_v1_aliases_present(env):
|
||||
"/v1/memory/{memory_id}/supersede",
|
||||
):
|
||||
assert p in paths, f"{p} missing"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Wave 1 (2026-04-29) — invalidation route used to do
|
||||
# `_get_memories(status='active', limit=1)` and look for the target id
|
||||
# inside that single highest-confidence row, so any active memory
|
||||
# outside slot 0 fell through as 404. Direct id lookup fixes it.
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_api_invalidate_finds_active_memory_outside_top_one(env):
|
||||
"""An active memory not at the top of the confidence sort must still
|
||||
be invalidatable via POST /memory/{id}/invalidate."""
|
||||
high = create_memory(
|
||||
memory_type="knowledge",
|
||||
content="high-confidence top row",
|
||||
confidence=0.99,
|
||||
)
|
||||
low = create_memory(
|
||||
memory_type="knowledge",
|
||||
content="lower-confidence target",
|
||||
confidence=0.55,
|
||||
)
|
||||
client = TestClient(app)
|
||||
r = client.post(f"/memory/{low.id}/invalidate", json={"reason": "wave1 regression"})
|
||||
assert r.status_code == 200, r.text
|
||||
assert r.json()["status"] == "invalidated"
|
||||
# And confirm the high-confidence row is untouched
|
||||
assert _get_memory(high.id).status == "active"
|
||||
assert _get_memory(low.id).status == "invalid"
|
||||
|
||||
|
||||
def test_api_invalidate_already_invalid_is_idempotent(env):
|
||||
m = create_memory(memory_type="knowledge", content="already invalid")
|
||||
client = TestClient(app)
|
||||
r1 = client.post(f"/memory/{m.id}/invalidate", json={"reason": "first"})
|
||||
assert r1.status_code == 200
|
||||
r2 = client.post(f"/memory/{m.id}/invalidate", json={"reason": "again"})
|
||||
assert r2.status_code == 200
|
||||
assert r2.json()["status"] == "already_invalid"
|
||||
|
||||
|
||||
def test_api_invalidate_candidate_returns_409(env):
|
||||
m = create_memory(
|
||||
memory_type="knowledge", content="candidate route", status="candidate"
|
||||
)
|
||||
client = TestClient(app)
|
||||
r = client.post(f"/memory/{m.id}/invalidate", json={"reason": "wrong route"})
|
||||
assert r.status_code == 409
|
||||
|
||||
|
||||
def test_api_invalidate_unknown_id_is_404(env):
|
||||
client = TestClient(app)
|
||||
r = client.post("/memory/no-such-id/invalidate", json={"reason": "ghost"})
|
||||
assert r.status_code == 404
|
||||
|
||||
|
||||
def test_api_supersede_candidate_returns_409(env):
|
||||
"""Mirror of the invalidate guard: candidates must not silently flip
|
||||
to superseded via the active-only supersede route."""
|
||||
m = create_memory(
|
||||
memory_type="knowledge", content="candidate target", status="candidate"
|
||||
)
|
||||
client = TestClient(app)
|
||||
r = client.post(f"/memory/{m.id}/supersede", json={"reason": "wrong route"})
|
||||
assert r.status_code == 409
|
||||
# Row should still be a candidate
|
||||
assert _get_memory(m.id).status == "candidate"
|
||||
|
||||
|
||||
def test_api_supersede_already_superseded_is_idempotent(env):
|
||||
m = create_memory(memory_type="knowledge", content="will be superseded")
|
||||
client = TestClient(app)
|
||||
r1 = client.post(f"/memory/{m.id}/supersede", json={"reason": "first"})
|
||||
assert r1.status_code == 200
|
||||
r2 = client.post(f"/memory/{m.id}/supersede", json={"reason": "again"})
|
||||
assert r2.status_code == 200
|
||||
assert r2.json()["status"] == "already_superseded"
|
||||
|
||||
|
||||
def test_api_supersede_unknown_id_is_404(env):
|
||||
client = TestClient(app)
|
||||
r = client.post("/memory/no-such-id/supersede", json={"reason": "ghost"})
|
||||
assert r.status_code == 404
|
||||
|
||||
|
||||
def test_admin_dashboard_active_count_matches_full_table(env):
|
||||
"""/admin/dashboard memories.active must match the SQL aggregate even
|
||||
when there are more active memories than the legacy sample limit (500).
|
||||
|
||||
This guards the Codex finding that the dashboard was deriving counts
|
||||
from a confidence-sorted limit=500 fetch, hiding rows past the cap.
|
||||
We don't need 500 rows in the test — a small corpus that exercises
|
||||
the SQL-aggregate path is enough; the integrity-vs-dashboard equality
|
||||
is the invariant being asserted.
|
||||
"""
|
||||
# Mix of statuses to exercise the by_status aggregate
|
||||
create_memory(memory_type="knowledge", content="a")
|
||||
create_memory(memory_type="knowledge", content="b", project="p06-polisher")
|
||||
create_memory(memory_type="project", content="c-cand", status="candidate")
|
||||
cand = create_memory(memory_type="project", content="d-cand", status="candidate")
|
||||
# Invalidate one to seed an "invalid" bucket
|
||||
from atocore.memory.service import invalidate_memory
|
||||
target_id = cand.id
|
||||
# No promotion step is performed here: invalidate_memory flips the
# candidate straight to invalid via the service path.
|
||||
invalidate_memory(target_id)
|
||||
|
||||
client = TestClient(app)
|
||||
dash = client.get("/admin/dashboard").json()
|
||||
assert dash["memories"]["active"] == 2
|
||||
assert dash["memories"]["candidates"] == 1
|
||||
assert dash["memories"]["by_status"]["invalid"] == 1
|
||||
assert dash["memories"]["total"] == 4
|
||||
assert dash["memories"]["by_project"].get("p06-polisher") == 1
|
||||
# "(none)" bucket is the COALESCE label for empty/null project
|
||||
assert "(none)" in dash["memories"]["by_project"]
|
||||
|
||||
@@ -428,6 +428,136 @@ def test_context_builder_tag_boost_orders_results(isolated_db):
|
||||
assert idx_tagged < idx_untagged
|
||||
|
||||
|
||||
def test_project_memory_ranking_ignores_scope_noise(isolated_db):
|
||||
"""Project words should not crowd out the actual query intent."""
|
||||
from atocore.memory.service import create_memory, get_memories_for_context
|
||||
|
||||
create_memory(
|
||||
"project",
|
||||
"Norman is the end operator for p06-polisher and requires an explicit manual mode to operate the machine.",
|
||||
project="p06-polisher",
|
||||
confidence=0.7,
|
||||
)
|
||||
create_memory(
|
||||
"project",
|
||||
"Polisher Control firmware spec document titled 'Fulum Polisher Machine Control Firmware Spec v1' lives in PKM.",
|
||||
project="p06-polisher",
|
||||
confidence=0.7,
|
||||
)
|
||||
create_memory(
|
||||
"project",
|
||||
"Machine design principle: works fully offline and independently; network connection is for remote access only",
|
||||
project="p06-polisher",
|
||||
confidence=0.5,
|
||||
)
|
||||
create_memory(
|
||||
"project",
|
||||
"Use Tailscale mesh for RPi remote access to provide SSH, file transfer, and NAT traversal without port forwarding.",
|
||||
project="p06-polisher",
|
||||
confidence=0.5,
|
||||
)
|
||||
|
||||
text, _ = get_memories_for_context(
|
||||
memory_types=["project"],
|
||||
project="p06-polisher",
|
||||
budget=360,
|
||||
query="how do we access the polisher machine remotely",
|
||||
)
|
||||
|
||||
assert "Tailscale" in text
|
||||
assert text.find("remote access only") < text.find("Tailscale")
|
||||
assert "manual mode" not in text
|
||||
|
||||
|
||||
def test_project_memory_ranking_prefers_multiple_intent_hits(isolated_db):
|
||||
"""A rich memory with several query hits should beat a terse one-hit memory."""
|
||||
from atocore.memory.service import create_memory, get_memories_for_context
|
||||
|
||||
create_memory(
|
||||
"project",
|
||||
"CGH vendor selected for p05. Active integration coordination with Katie/AOM.",
|
||||
project="p05-interferometer",
|
||||
confidence=0.7,
|
||||
)
|
||||
create_memory(
|
||||
"knowledge",
|
||||
"Vendor-summary current signal: 4D is the strongest technical Twyman-Green candidate; "
|
||||
"a certified used Zygo Verifire SV around $55k emerged as a strong value path.",
|
||||
project="p05-interferometer",
|
||||
confidence=0.9,
|
||||
)
|
||||
|
||||
text, _ = get_memories_for_context(
|
||||
memory_types=["project", "knowledge"],
|
||||
project="p05-interferometer",
|
||||
budget=220,
|
||||
query="what is the current vendor signal for the interferometer procurement",
|
||||
)
|
||||
|
||||
assert "4D" in text
|
||||
assert "Zygo" in text
|
||||
|
||||
|
||||
def test_project_memory_query_ranks_beyond_confidence_prefilter(isolated_db):
|
||||
"""Query-time ranking should see older low-confidence but exact-intent memories."""
|
||||
from atocore.memory.service import create_memory, get_memories_for_context
|
||||
|
||||
for idx in range(35):
|
||||
create_memory(
|
||||
"project",
|
||||
f"High confidence p06 filler memory {idx}: Polisher Control planning note.",
|
||||
project="p06-polisher",
|
||||
confidence=0.9,
|
||||
)
|
||||
create_memory(
|
||||
"project",
|
||||
"Use Tailscale mesh for RPi remote access to provide SSH, file transfer, and NAT traversal without port forwarding.",
|
||||
project="p06-polisher",
|
||||
confidence=0.5,
|
||||
)
|
||||
|
||||
text, _ = get_memories_for_context(
|
||||
memory_types=["project"],
|
||||
project="p06-polisher",
|
||||
budget=360,
|
||||
query="how do we access the polisher machine remotely",
|
||||
)
|
||||
|
||||
assert "Tailscale" in text
|
||||
|
||||
|
||||
def test_project_memory_query_prefers_exact_cam_fact(isolated_db):
|
||||
from atocore.memory.service import create_memory, get_memories_for_context
|
||||
|
||||
create_memory(
|
||||
"project",
|
||||
"Polisher Control firmware spec document titled 'Fulum Polisher Machine Control Firmware Spec v1' lives in PKM.",
|
||||
project="p06-polisher",
|
||||
confidence=0.9,
|
||||
)
|
||||
create_memory(
|
||||
"project",
|
||||
"Polisher Control doc must cover manual mode for Norman as a required deliverable per the plan.",
|
||||
project="p06-polisher",
|
||||
confidence=0.9,
|
||||
)
|
||||
create_memory(
|
||||
"project",
|
||||
"Cam amplitude and offset are mechanically set by operator and read via encoders; no actuators control them.",
|
||||
project="p06-polisher",
|
||||
confidence=0.5,
|
||||
)
|
||||
|
||||
text, _ = get_memories_for_context(
|
||||
memory_types=["project"],
|
||||
project="p06-polisher",
|
||||
budget=300,
|
||||
query="how is cam amplitude controlled on the polisher",
|
||||
)
|
||||
|
||||
assert "encoders" in text
|
||||
|
||||
|
||||
def test_expire_stale_candidates_keeps_reinforced(isolated_db):
|
||||
from atocore.memory.service import create_memory, expire_stale_candidates
|
||||
from atocore.models.database import get_connection
|
||||
@@ -445,3 +575,121 @@ def test_expire_stale_candidates_keeps_reinforced(isolated_db):
|
||||
assert mid not in expired
|
||||
mem = _get_memory_by_id(mid)
|
||||
assert mem["status"] == "candidate"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Wave 1 (2026-04-29) — counts come from SQL, not from the top-N sample.
|
||||
# Exposed by Codex audit when prod /admin/dashboard reported 315 active
|
||||
# while /admin/integrity-check reported 1091. The dashboard was building
|
||||
# its counts from a confidence-sorted limit=500 fetch.
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_get_memory_count_summary_returns_full_table_aggregates(isolated_db):
|
||||
"""Counts come from SQL aggregates, not a sampled fetch."""
|
||||
from atocore.memory.service import (
|
||||
create_memory,
|
||||
get_memory_count_summary,
|
||||
invalidate_memory,
|
||||
)
|
||||
|
||||
# Create more rows than any reasonable sampling LIMIT so any
|
||||
# LIMIT-based counter would visibly disagree with reality.
|
||||
for i in range(120):
|
||||
create_memory(
|
||||
"knowledge",
|
||||
f"fact-{i}",
|
||||
project="p04-gigabit",
|
||||
confidence=0.9,
|
||||
status="active",
|
||||
)
|
||||
for i in range(7):
|
||||
create_memory("knowledge", f"cand-{i}", status="candidate")
|
||||
invalid_obj = create_memory("knowledge", "to-invalidate", status="active")
|
||||
invalidate_memory(invalid_obj.id)
|
||||
|
||||
summary = get_memory_count_summary()
|
||||
assert summary["total"] == 120 + 7 + 1
|
||||
assert summary["by_status"]["active"] == 120
|
||||
assert summary["by_status"]["candidate"] == 7
|
||||
assert summary["by_status"]["invalid"] == 1
|
||||
assert summary["active"]["total"] == 120
|
||||
assert summary["active"]["by_type"] == {"knowledge": 120}
|
||||
assert summary["active"]["by_project"] == {"p04-gigabit": 120}
|
||||
|
||||
|
||||
def test_get_memory_returns_single_row_or_none(isolated_db):
|
||||
from atocore.memory.service import create_memory, get_memory
|
||||
|
||||
mem = create_memory("knowledge", "single-row test")
|
||||
fetched = get_memory(mem.id)
|
||||
assert fetched is not None
|
||||
assert fetched.id == mem.id
|
||||
assert get_memory("non-existent-id") is None
|
||||
|
||||
|
||||
def test_update_memory_can_change_project_with_canonicalization(
|
||||
isolated_db, project_registry
|
||||
):
|
||||
"""update_memory(project=...) canonicalizes aliases and writes audit."""
|
||||
project_registry(("p04-gigabit", ("p04", "gigabit")))
|
||||
from atocore.memory.service import (
|
||||
create_memory,
|
||||
get_memory,
|
||||
get_memory_audit,
|
||||
update_memory,
|
||||
)
|
||||
|
||||
mem = create_memory("knowledge", "retargetable fact", project="atocore")
|
||||
ok = update_memory(mem.id, project="p04") # alias
|
||||
assert ok is True
|
||||
|
||||
refreshed = get_memory(mem.id)
|
||||
assert refreshed.project == "p04-gigabit" # canonical, not "p04"
|
||||
|
||||
audit_rows = get_memory_audit(mem.id, limit=10)
|
||||
update_rows = [r for r in audit_rows if r.get("action") == "updated"]
|
||||
assert update_rows, f"expected an updated audit row, got {audit_rows}"
|
||||
head = update_rows[0]
|
||||
assert head["before"]["project"] == "atocore"
|
||||
assert head["after"]["project"] == "p04-gigabit"
|
||||
|
||||
|
||||
def test_update_memory_project_unchanged_when_not_passed(isolated_db):
|
||||
from atocore.memory.service import create_memory, get_memory, update_memory
|
||||
|
||||
mem = create_memory("knowledge", "untouched project", project="p06-polisher")
|
||||
update_memory(mem.id, content="edited content")
|
||||
assert get_memory(mem.id).project == "p06-polisher"
|
||||
|
||||
|
||||
def test_update_memory_to_empty_project_detects_global_duplicate(isolated_db):
|
||||
"""Codex P3: when retargeting to project='' (global), the duplicate
|
||||
check must scope to the new project. If a global active memory with
|
||||
the same content already exists, the update must raise."""
|
||||
import pytest as _pytest
|
||||
from atocore.memory.service import create_memory, update_memory
|
||||
|
||||
create_memory("knowledge", "shared global fact", project="")
|
||||
scoped = create_memory("knowledge", "shared global fact", project="p04-gigabit")
|
||||
|
||||
with _pytest.raises(ValueError, match="duplicate active memory"):
|
||||
update_memory(scoped.id, project="")
|
||||
|
||||
|
||||
def test_auto_triage_suggested_project_put_body_uses_project_key():
|
||||
"""Regression: the auto_triage caller used to PUT {"content": ...}
|
||||
which silently dropped the suggested project change. The fix sends
|
||||
{"project": suggested}. Inspect the script source so we don't have
|
||||
to spin up a live triage run."""
|
||||
from pathlib import Path
|
||||
|
||||
src = Path(__file__).resolve().parents[1] / "scripts" / "auto_triage.py"
|
||||
text = src.read_text(encoding="utf-8")
|
||||
# The block that PUTs to /memory/{mid} for a suggested_project fix
|
||||
assert 'json.dumps({"project": suggested})' in text, (
|
||||
"auto_triage.py must PUT {\"project\": suggested} so the "
|
||||
"suggested-project correction actually applies. See Wave 1."
|
||||
)
|
||||
# And must not be back to the old shape
|
||||
assert 'json.dumps({"content": cand["content"]})' not in text
|
||||
|
||||