Compare commits: 4e6fba7cb9...9604c3e9ae
2 commits: 9604c3e9ae, 3a474f750c
@@ -9,7 +9,7 @@
 - **live_sha** (Dalidou `/health` build_sha): `7042eae` (verified 2026-04-29T01:19Z; status=ok; deployed 2026-04-25T01:04Z, docs-only on top of `d3de9f6`)
 - **last_updated**: 2026-04-29 by Claude (Wave 1 debt-pay branch open; Codex review of plan applied)
 - **main_tip**: `7042eae`
-- **test_count**: 581 on `claude/wave1-dashboard-counts-and-memory-fixes` (572 on main + 9 Wave 1 regressions)
+- **test_count**: 586 on `claude/wave1-dashboard-counts-and-memory-fixes` (572 on main + 14 Wave 1 regressions; +5 from Codex review amends)
 - **harness**: `20/20 PASS` on live Dalidou, 0 blocking failures, 0 known issues (last harness run nightly 2026-04-28T03:00:30Z)
 - **vectors**: 33,253
 - **active_memories**: 315 dashboard / 1091 integrity (verified live 2026-04-29). Discrepancy was a dashboard sampling bug — Wave 1 commit `fb4d55c` replaces it with SQL aggregates; post-deploy the two will agree.
@@ -25,7 +25,7 @@
 - **dashboard**: http://dalidou:8100/admin/dashboard
 - **active_track**: Wave 1 debt-pay (this session) → V1-A start. V1-A gates have cleared (soak ended 2026-04-26; density 315 ≫ 100). Plan: `docs/plans/engineering-v1-completion-plan.md`. Resume map: `docs/plans/v1-resume-state.md`.
 - **last_nightly_pipeline**: `2026-04-28T03:00:30Z` — harness 20/20, triage promoted=1 rejected=1 human=0
-- **open_branches**: `claude/wave1-dashboard-counts-and-memory-fixes` (commit `fb4d55c`) — three memory-write-path bugs surfaced by Codex review; awaiting Codex audit before merge/deploy.
+- **open_branches**: `claude/wave1-dashboard-counts-and-memory-fixes` (tip `3a474f7`) — three memory-write-path bugs + two follow-on P2 fixes from Codex's formal audit (`auto_triage.py` PUT body + `/memory/{id}/supersede` status guard). Awaiting Codex re-review of the amended branch before squash-merge/deploy.
 
 ## Active Plan
 
@@ -170,6 +170,8 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
 
 ## Session Log
 
+- **2026-04-29 Codex + Claude (Wave 1 formal audit closed + amends)** Codex's formal audit of `fb4d55c` (Wave 1 first commit on `claude/wave1-dashboard-counts-and-memory-fixes`): verdict GO WITH CONDITIONS. Two P1-prior closures confirmed (dashboard count bug + invalidate top-1 lookup). Project-update P2 was only "partially closed" — API/service plumbing fine, but `scripts/auto_triage.py:417` still PUT `{"content": cand["content"]}` so the operational suggested-project correction was unreachable even with `MemoryUpdateRequest.project` in place. Codex also flagged the symmetric supersede-route gap as same-class adjacent surface and recommended pulling it in here, not in Wave 1.5. Plus one P3: cover retarget-to-empty-project against a global active duplicate. Amended on `3a474f7`: (1) auto_triage PUT body now `{"project": suggested}` with a guard test that lints the script source for the new shape; (2) `/memory/{id}/supersede` mirrors the invalidate guard via `get_memory(id)` — 404 unknown / 200 already_superseded / 409 wrong-status / 200 superseded; (3) regression test for project-empty duplicate detection. Test count 581 → 586. Codex's recommended deployment checklist (post-deploy verifications, including the run-or-simulate auto-triage retarget probe) carried into the merge plan. Awaiting Codex re-review of the amended tip before squash-merge.
+
 - **2026-04-29 Claude (Wave 1 debt-pay started; Codex review of state-of-service plan)** Audited live state on Dalidou: `/health` build_sha `7042eae` (4d old), harness 20/20, 33,253 vectors, 1,748 docs, 1054 interactions (+103/4d), dashboard memories.active=315 vs integrity.active_memory_count=1091. Drafted a state-of-service assessment + Wave 1/2/3/4 plan, then asked Codex (gpt-5.5) for an adversarial review via `codex exec`. Codex verdict: assessment MIXED, plan ENDORSE WITH CHANGES. Codex caught two factual corrections (the 315-vs-1091 gap is a *sampling bug* not a definitional gap; my "R9 drops unregistered tags" framing is wrong — `extractor_llm.py:213-233` preserves them) and two new memory-write-path bugs I missed. Branched `claude/wave1-dashboard-counts-and-memory-fixes` from `7042eae`, fixed all three: (1) `/admin/dashboard` now uses a new `get_memory_count_summary()` SQL aggregate helper instead of counting inside a confidence-sorted `get_memories(limit=500)` sample; (2) `MemoryUpdateRequest` and `update_memory()` accept `project` with `resolve_project_name` canonicalization + before/after audit, so `auto_triage.py:407` suggested-project corrections will now actually apply; (3) `POST /memory/{id}/invalidate` replaces the `_get_memories(status="active", limit=1)` lookup (which only saw the highest-confidence active row) with a direct id lookup via new `get_memory(id)` helper. 9 regression tests added across `test_memory.py` and `test_invalidate_supersede.py`. Full local suite: 581 passed (572 → 581). Commit `fb4d55c`. Branch not pushed/deployed yet — awaiting Codex audit per working model. Refreshed Orientation block (live_sha, last_updated, test_count, memory counts, capture cadence, registered/unregistered project breakdown, open_branches). Wave 1 follow-up still open: (W1.2) one-click registration proposal for unregistered projects with ≥10 active memories (apm=63 is overdue); (W1.3 done by this entry) sync ledger; (W1.4) committed measurable-win probe fixture+JSON output. After this branch lands, V1-A is unblocked: gates have effectively cleared (soak ended 2026-04-26; density 315 ≫ 100 target).
 
 - **2026-04-25 Codex (p04 constraint gap closed; harness fully green)** Root-caused the remaining `p04-constraints` fixture: the `Zerodur` / `1.2` fact already existed in Trusted Project State (`requirement/key_constraints`), but project-state formatting was category/key ordered and then truncated to the 20% state budget, so contacts/decisions consumed the budget before the relevant requirement. Added query-relevance ranking for Trusted Project State entries before formatting/truncation, with regression coverage in `test_project_state_query_relevance_before_truncation`. Removed the fixture's `known_issue` lane so future p04 constraint regressions are blocking. Cleaned up a duplicate live requirement entry created during diagnosis by invalidating `requirement/mirror-blank-core-constraints`; canonical `requirement/key_constraints` remains active. Verified focused suite: 35 passed. Verified full local suite: 572 passed. Deployed `d3de9f67eaa08dfc5b2d86e8221b8c70fef266d3`; live exact p04 probe now surfaces `[REQUIREMENT] key_constraints` with `1.2` and `Zerodur`. Live retrieval harness: 20/20, 0 known issues, 0 blocking failures.
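Reviewer note on the dashboard count fix: the session log attributes the 315 vs 1091 gap to counting active rows inside a confidence-sorted, limited sample instead of over the whole table. The sketch below reproduces that class of bug in miniature. It assumes a hypothetical `memories` table with `status` and `confidence` columns in SQLite; the real schema and the `get_memory_count_summary()` helper are not part of this diff and may look different.

```python
import sqlite3


def build_demo_db(n_active: int, n_candidate: int) -> sqlite3.Connection:
    """Create an in-memory table (hypothetical schema) with two statuses."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE memories ("
        "id INTEGER PRIMARY KEY, status TEXT, confidence REAL)"
    )
    # Candidates get the higher confidence so they crowd the top of the sample.
    rows = [("candidate", 0.9)] * n_candidate + [("active", 0.5)] * n_active
    conn.executemany(
        "INSERT INTO memories (status, confidence) VALUES (?, ?)", rows
    )
    return conn


def count_active_sampled(conn: sqlite3.Connection, limit: int = 500) -> int:
    """Legacy-style count: only looks at the top `limit` rows by confidence."""
    rows = conn.execute(
        "SELECT status FROM memories ORDER BY confidence DESC, id LIMIT ?",
        (limit,),
    ).fetchall()
    return sum(1 for (status,) in rows if status == "active")


def count_by_status(conn: sqlite3.Connection) -> dict:
    """Aggregate-style count: one GROUP BY over the full table."""
    return dict(
        conn.execute(
            "SELECT status, COUNT(*) FROM memories GROUP BY status"
        ).fetchall()
    )


conn = build_demo_db(n_active=1091, n_candidate=300)
sampled = count_active_sampled(conn)
summary = count_by_status(conn)
print(sampled, summary["active"])
```

With more rows than the sample limit, the sampled count undercounts while the aggregate stays exact, which is why the new regression test exercises a table larger than the legacy limit of 500.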
@@ -404,19 +404,23 @@ def process_candidate(cand, base_url, active_cache, state_cache, known_projects,
         known_projects, TIER1_MODEL, DEFAULT_TIMEOUT_S,
     )
 
-    # Project misattribution fix: suggested_project surfaces from tier 1
+    # Project misattribution fix: suggested_project surfaces from tier 1.
+    # Earlier code POSTed only {"content": cand["content"]}, which left
+    # the project field unchanged because MemoryUpdateRequest had no
+    # project key and the service signature didn't accept one. Wave 1
+    # added project to MemoryUpdateRequest and update_memory(); this
+    # caller now actually applies the suggested project.
     suggested = (v1.get("suggested_project") or "").strip()
     if suggested and suggested != project and suggested in known_projects:
-        # Try to re-canonicalize the memory's project
         if not dry_run:
             try:
                 import urllib.request as _ur
                 req = _ur.Request(
                     f"{base_url}/memory/{mid}", method="PUT",
                     headers={"Content-Type": "application/json"},
-                    data=json.dumps({"content": cand["content"]}).encode("utf-8"),
+                    data=json.dumps({"project": suggested}).encode("utf-8"),
                 )
-                _ur.urlopen(req, timeout=10).read()  # triggers canonicalization via update
+                _ur.urlopen(req, timeout=10).read()
             except Exception:
                 pass
         print(f" ↺ misattribution flagged: {project!r} → {suggested!r}")
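Reviewer note on the PUT body change: the fix hinges on the request payload carrying a `project` key rather than `content`. One way to make that shape checkable without a live server is to build the request in a small helper and inspect it. The helper name `build_project_update_request` and the example base URL and memory id are illustrative assumptions, not code from this branch.

```python
import json
import urllib.request


def build_project_update_request(
    base_url: str, memory_id: str, project: str
) -> urllib.request.Request:
    """Build (but do not send) a PUT that retargets a memory's project.

    Hypothetical helper: mirrors the request shape used in the hunk above.
    """
    body = json.dumps({"project": project}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/memory/{memory_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Example values are placeholders; nothing is sent over the network here.
req = build_project_update_request("http://localhost:8100", "abc123", "p04-gigabit")
print(req.get_method(), req.full_url, req.data)
```

Asserting on the built request is an alternative to the source-text lint used in the regression test; it checks the payload that would actually be sent rather than the literal spelling of the call.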
@@ -827,15 +827,33 @@ def api_supersede_memory(
     memory_id: str,
     req: MemorySupersedeRequest | None = None,
 ) -> dict:
-    """Supersede an active memory (Issue E — active → superseded)."""
-    from atocore.memory.service import supersede_memory
+    """Supersede an active memory (Issue E — active → superseded).
+
+    Mirrors the invalidate route's status guard: candidates and other
+    non-active rows must not silently flip to superseded.
+    """
+    from atocore.memory.service import get_memory, supersede_memory
 
     reason = req.reason if req else ""
+    target = get_memory(memory_id)
+    if target is None:
+        raise HTTPException(status_code=404, detail=f"Memory not found: {memory_id}")
+    if target.status == "superseded":
+        return {"status": "already_superseded", "id": memory_id}
+    if target.status != "active":
+        raise HTTPException(
+            status_code=409,
+            detail=(
+                f"Memory {memory_id} is {target.status}; "
+                "only active memories can be superseded"
+            ),
+        )
+
     success = supersede_memory(memory_id, actor="api-http", reason=reason)
     if not success:
         raise HTTPException(
-            status_code=404,
-            detail=f"Memory not found or not active: {memory_id}",
+            status_code=409,
+            detail=f"Memory {memory_id} could not be superseded",
         )
     return {"status": "superseded", "id": memory_id}
 
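Reviewer note on the supersede guard: the route now distinguishes four outcomes (404 unknown / 200 already_superseded / 409 wrong-status / 200 superseded). The decision can be read as a pure function of the row's current status, sketched below. The function name and the `not_found` / `wrong_status` labels are illustrative assumptions; only the HTTP codes and the two 200 status strings come from the hunk above.

```python
from typing import Optional, Tuple


def supersede_outcome(current_status: Optional[str]) -> Tuple[int, str]:
    """Map a memory's current status to (HTTP status code, outcome label).

    `None` stands for "no row with that id". Labels other than
    "already_superseded" and "superseded" are hypothetical.
    """
    if current_status is None:
        return 404, "not_found"
    if current_status == "superseded":
        # Idempotent: repeating the call is not an error.
        return 200, "already_superseded"
    if current_status != "active":
        # Candidates and other non-active rows must not flip silently.
        return 409, "wrong_status"
    return 200, "superseded"


for status in (None, "superseded", "candidate", "active"):
    print(status, supersede_outcome(status))
```

Keeping the guard order (missing, already done, wrong state, proceed) makes the idempotent case win over the conflict case, which matches the order of checks in the route.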
@@ -249,6 +249,35 @@ def test_api_invalidate_unknown_id_is_404(env):
     assert r.status_code == 404
 
 
+def test_api_supersede_candidate_returns_409(env):
+    """Mirror of the invalidate guard: candidates must not silently flip
+    to superseded via the active-only supersede route."""
+    m = create_memory(
+        memory_type="knowledge", content="candidate target", status="candidate"
+    )
+    client = TestClient(app)
+    r = client.post(f"/memory/{m.id}/supersede", json={"reason": "wrong route"})
+    assert r.status_code == 409
+    # Row should still be a candidate
+    assert _get_memory(m.id).status == "candidate"
+
+
+def test_api_supersede_already_superseded_is_idempotent(env):
+    m = create_memory(memory_type="knowledge", content="will be superseded")
+    client = TestClient(app)
+    r1 = client.post(f"/memory/{m.id}/supersede", json={"reason": "first"})
+    assert r1.status_code == 200
+    r2 = client.post(f"/memory/{m.id}/supersede", json={"reason": "again"})
+    assert r2.status_code == 200
+    assert r2.json()["status"] == "already_superseded"
+
+
+def test_api_supersede_unknown_id_is_404(env):
+    client = TestClient(app)
+    r = client.post("/memory/no-such-id/supersede", json={"reason": "ghost"})
+    assert r.status_code == 404
+
+
 def test_admin_dashboard_active_count_matches_full_table(env):
     """/admin/dashboard memories.active must match the SQL aggregate even
     when there are more active memories than the legacy sample limit (500).
@@ -661,3 +661,35 @@ def test_update_memory_project_unchanged_when_not_passed(isolated_db):
     mem = create_memory("knowledge", "untouched project", project="p06-polisher")
     update_memory(mem.id, content="edited content")
     assert get_memory(mem.id).project == "p06-polisher"
+
+
+def test_update_memory_to_empty_project_detects_global_duplicate(isolated_db):
+    """Codex P3: when retargeting to project='' (global), the duplicate
+    check must scope to the new project. If a global active memory with
+    the same content already exists, the update must raise."""
+    import pytest as _pytest
+    from atocore.memory.service import create_memory, update_memory
+
+    create_memory("knowledge", "shared global fact", project="")
+    scoped = create_memory("knowledge", "shared global fact", project="p04-gigabit")
+
+    with _pytest.raises(ValueError, match="duplicate active memory"):
+        update_memory(scoped.id, project="")
+
+
+def test_auto_triage_suggested_project_put_body_uses_project_key():
+    """Regression: the auto_triage caller used to PUT {"content": ...}
+    which silently dropped the suggested project change. The fix sends
+    {"project": suggested}. Inspect the script source so we don't have
+    to spin up a live triage run."""
+    from pathlib import Path
+
+    src = Path(__file__).resolve().parents[1] / "scripts" / "auto_triage.py"
+    text = src.read_text(encoding="utf-8")
+    # The block that PUTs to /memory/{mid} for a suggested_project fix
+    assert 'json.dumps({"project": suggested})' in text, (
+        "auto_triage.py must PUT {\"project\": suggested} so the "
+        "suggested-project correction actually applies. See Wave 1."
+    )
+    # And must not be back to the old shape
+    assert 'json.dumps({"content": cand["content"]})' not in text
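Reviewer note on the retarget duplicate test: the behaviour under test is that a project change re-runs the duplicate check against the destination project, including the empty (global) project. A toy in-memory model of that rule is sketched below. The `Memory` and `MemoryStore` classes are hypothetical stand-ins; the real `create_memory` / `update_memory` service functions are not shown in this diff and may differ in signature and storage.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Memory:
    id: int
    memory_type: str
    content: str
    project: str
    status: str = "active"


class MemoryStore:
    """Toy store (assumption): duplicates are scoped by type, content, project."""

    def __init__(self) -> None:
        self._rows = {}
        self._next_id = 1

    def _has_active_duplicate(
        self, memory_type: str, content: str, project: str,
        exclude_id: Optional[int] = None,
    ) -> bool:
        return any(
            m.id != exclude_id
            and m.status == "active"
            and (m.memory_type, m.content, m.project)
            == (memory_type, content, project)
            for m in self._rows.values()
        )

    def create(self, memory_type: str, content: str, project: str = "") -> Memory:
        if self._has_active_duplicate(memory_type, content, project):
            raise ValueError("duplicate active memory")
        mem = Memory(self._next_id, memory_type, content, project)
        self._rows[mem.id] = mem
        self._next_id += 1
        return mem

    def retarget(self, memory_id: int, project: str) -> Memory:
        """Move a memory to `project`, checking duplicates in the NEW scope."""
        mem = self._rows[memory_id]
        if self._has_active_duplicate(
            mem.memory_type, mem.content, project, exclude_id=mem.id
        ):
            raise ValueError("duplicate active memory")
        mem.project = project
        return mem


store = MemoryStore()
store.create("knowledge", "shared global fact", project="")
scoped = store.create("knowledge", "shared global fact", project="p04-gigabit")
try:
    store.retarget(scoped.id, "")
    raised = False
except ValueError:
    raised = True
print(raised, scoped.project)
```

The point of the sketch is the argument passed to the duplicate check: it uses the destination project, not the memory's current one, so an empty string is treated as a real scope rather than as "no change".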