ATOCore/docs/PHASE-7-MEMORY-CONSOLIDATION.md
Anto01 028d4c3594 feat: Phase 7A — semantic memory dedup ("sleep cycle" V1)
New table memory_merge_candidates + service functions to cluster
near-duplicate active memories within (project, memory_type) buckets,
draft a unified content via LLM, and merge on human approval. Source
memories become superseded (never deleted); merged memory carries
union of tags, max of confidence, sum of reference_count.

- schema migration for memory_merge_candidates
- atocore.memory.similarity: cosine + transitive clustering
- atocore.memory._dedup_prompt: stdlib-only LLM prompt preserving every specific
- service: merge_memories / create_merge_candidate / get_merge_candidates / reject_merge_candidate
- scripts/memory_dedup.py: host-side detector (HTTP-only, idempotent)
- 5 API endpoints under /admin/memory/merge-candidates* + /admin/memory/dedup-scan
- triage UI: purple "🔗 Merge Candidates" section + "🔗 Scan for duplicates" bar
- batch-extract.sh Step B3 (0.90 daily, 0.85 Sundays)
- deploy/dalidou/dedup-watcher.sh for UI-triggered scans
- 21 new tests (374 → 395)
- docs/PHASE-7-MEMORY-CONSOLIDATION.md covering 7A-7H roadmap

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 10:30:49 -04:00


Phase 7 — Memory Consolidation (the "Sleep Cycle")

Status: 7A in progress · 7B-H scoped, deferred
Design principle: "Like human memory while sleeping, but more robotic — never discard relevant details. Consolidate, update, supersede — don't delete."

Why

Phases 1–6 built capture + triage + graduation + emerging-project detection. What they don't solve:

| # | Problem | Fix |
|---|---------|-----|
| 1 | Redundancy — "APM uses NX" said 5 different ways across 5 memories | 7A Semantic dedup |
| 2 | Latent contradictions — "chose Zygo" + "switched from Zygo" both active | 7B Pair contradiction detection |
| 3 | Tag drift — firmware, fw, firmware-control fragment retrieval | 7C Tag canonicalization |
| 4 | Confidence staleness — 6-month unreferenced memory ranks as fresh | 7D Confidence decay |
| 5 | No memory drill-down page | 7E /wiki/memories/{id} |
| 6 | Domain knowledge siloed per project | 7F /wiki/domains/{tag} |
| 7 | Prompt upgrades (llm-0.5 → 0.6) don't re-process old interactions | 7G Re-extraction on version bump |
| 8 | Superseded memory vectors still in Chroma polluting retrieval | 7H Vector hygiene |

Collectively: the brain needs a nightly pass that looks at what it already knows and tidies up — dedup, resolve contradictions, canonicalize tags, decay stale facts — without losing information.

Subphases

7A — Semantic dedup + consolidation (this sprint)

Compute embeddings for active memories, find pairs within the same (project, memory_type) bucket whose similarity exceeds the threshold (default 0.88), cluster them, and draft a unified memory via LLM for a human to approve in the triage UI. On approve: sources become superseded, and a new merged memory is created with the union of source_refs, the sum of reference_count, and the max of confidence. Ships first because redundancy compounds: every new memory potentially duplicates an old one.
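The on-approve rule can be sketched as below. This is a minimal illustration, not the actual `merge_memories()` in service.py: the `Memory` dataclass, its field layout, and the `merge_sources` name are assumptions made for the example; only the union / sum / max rules and the superseded status come from the text above.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    # Hypothetical in-memory shape; the real table has more columns.
    id: str
    content: str
    confidence: float
    reference_count: int
    source_refs: set[str] = field(default_factory=set)
    status: str = "active"


def merge_sources(sources: list[Memory], merged_id: str, drafted_content: str) -> Memory:
    """Build the merged memory and mark the sources superseded (never deleted)."""
    if len(sources) < 2:
        raise ValueError("a merge needs at least two source memories")
    merged = Memory(
        id=merged_id,
        content=drafted_content,  # the LLM draft a human approved
        confidence=max(m.confidence for m in sources),
        reference_count=sum(m.reference_count for m in sources),
        source_refs=set().union(*(m.source_refs for m in sources)),
    )
    for m in sources:
        m.status = "superseded"  # sources stay queryable with a status filter
    return merged
```

Taking the max of confidence means a merge never lowers trust in a fact that one source already held strongly, while summing reference_count keeps the usage history of every source.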

Detailed spec lives in the working plan (dapper-cooking-tower.md) and across the files listed under "Files touched" below. Key decisions:

  • LLM drafts, human approves — no silent auto-merge.
  • Same (project, memory_type) bucket only. Cross-project merges are rare + risky → separate flow in 7B.
  • Recompute embeddings each scan (~2s / 335 memories). Persist only if scan time becomes a problem.
  • Cluster-based proposals (A, B, C → one merge), not pair-based.
  • status=superseded never deleted — still queryable with filter.
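A rough sketch of cluster-based proposals within one (project, memory_type) bucket follows. It uses plain cosine similarity and a union-find to close pairs transitively; the function names, the dict-of-vectors input, and the use of `>=` at the threshold are assumptions for illustration and may differ from `compute_memory_similarity()` in similarity.py.

```python
import math
from itertools import combinations


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0 or nv == 0:
        return 0.0
    return dot / (nu * nv)


def cluster(embeddings: dict[str, list[float]], threshold: float = 0.88) -> list[list[str]]:
    """Group memory ids whose pairwise similarity links them, directly or transitively."""
    parent = {k: k for k in embeddings}

    def find(x: str) -> str:
        # Walk to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(embeddings, 2):
        if cosine(embeddings[a], embeddings[b]) >= threshold:
            parent[find(a)] = find(b)  # union the two groups

    groups: dict[str, list[str]] = {}
    for k in embeddings:
        groups.setdefault(find(k), []).append(k)
    # Singletons are not merge candidates.
    return [sorted(g) for g in groups.values() if len(g) > 1]
```

With this approach A and C land in the same proposal when A is close to B and B is close to C, even if A and C alone fall below the threshold. That is the reason the human approval step matters: transitive chains can drift.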

Schema: new table memory_merge_candidates (pending | approved | rejected).
Cron: nightly at threshold 0.90 (tight); weekly (Sundays) at 0.85 (deeper cleanup).
UI: new "🔗 Merge Candidates" section in /admin/triage.
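The cron schedule reduces to a one-line threshold choice. The sketch below is only an illustration of that rule; the real selection lives in batch-extract.sh Step B3, and the `scan_threshold` name is hypothetical.

```python
from datetime import date


def scan_threshold(day: date) -> float:
    """Pick the dedup similarity threshold for a scan run on the given day."""
    # date.weekday(): Monday is 0, Sunday is 6.
    return 0.85 if day.weekday() == 6 else 0.90
```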

Files touched in 7A:

  • src/atocore/models/database.py — migration
  • src/atocore/memory/similarity.py — new, compute_memory_similarity()
  • src/atocore/memory/_dedup_prompt.py — new, shared LLM prompt
  • src/atocore/memory/service.py — merge_memories()
  • scripts/memory_dedup.py — new, host-side detector (HTTP-only)
  • src/atocore/api/routes.py — 5 new endpoints under /admin/memory/
  • src/atocore/engineering/triage_ui.py — merge cards section
  • deploy/dalidou/batch-extract.sh — Step B3
  • deploy/dalidou/dedup-watcher.sh — new, UI-triggered scans
  • tests/test_memory_dedup.py — ~10-15 new tests

7B — Memory-to-memory contradiction detection

Same embedding-pair machinery as 7A but within a different band (similarity 0.70–0.88: semantically related but different wording). LLM classifies each pair: duplicate | complementary | contradicts | supersedes-older. Contradictions write a memory_conflicts row + surface a triage badge. Clear supersessions (both tier 1 sonnet and tier 2 opus agree) auto-mark the older as superseded.
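One possible routing of classified pairs is sketched below. The band limits and the four labels come from the text; the action names, the half-open band, and the handling of tier disagreement (sent to human review) are assumptions, since 7B is only scoped here.

```python
LABELS = {"duplicate", "complementary", "contradicts", "supersedes-older"}


def in_contradiction_band(similarity: float, low: float = 0.70, high: float = 0.88) -> bool:
    """True for pairs that are related but not close enough for 7A dedup."""
    return low <= similarity < high


def route_pair(tier1_label: str, tier2_label: str) -> str:
    """Map the two classifier verdicts for one pair to a follow-up action."""
    for label in (tier1_label, tier2_label):
        if label not in LABELS:
            raise ValueError(f"unknown label: {label}")
    if tier1_label == tier2_label == "supersedes-older":
        return "auto-supersede-older"  # both tiers agree: a clear supersession
    if "contradicts" in (tier1_label, tier2_label):
        return "record-conflict"  # memory_conflicts row + triage badge
    if tier1_label != tier2_label:
        return "needs-review"  # assumption: disagreement goes to a human
    return "no-action"
```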

7C — Tag canonicalization

Weekly LLM pass over domain_tags distribution, proposes alias → canonical map (e.g. fw → firmware). Human approves via UI (one-click pattern, same as emerging-project registration). Bulk-rewrites domain_tags atomically across all memories.
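A sketch of the atomic bulk rewrite, assuming for illustration that memories live in a SQLite table named `memories` with `domain_tags` stored as a JSON array in a text column. The table layout and function names are hypothetical; the point is that the whole rewrite runs inside one transaction, so a failure part-way leaves every memory untouched.

```python
import json
import sqlite3


def canonicalize(tags: list[str], alias_map: dict[str, str]) -> list[str]:
    """Replace aliases with canonical tags, dropping duplicates, keeping order."""
    seen: set[str] = set()
    out: list[str] = []
    for tag in tags:
        canonical = alias_map.get(tag, tag)
        if canonical not in seen:
            seen.add(canonical)
            out.append(canonical)
    return out


def rewrite_all(conn: sqlite3.Connection, alias_map: dict[str, str]) -> int:
    """Apply an approved alias map to every memory; returns rows changed."""
    changed = 0
    with conn:  # commits on success, rolls back if anything raises
        rows = conn.execute("SELECT id, domain_tags FROM memories").fetchall()
        for mem_id, raw in rows:
            old = json.loads(raw)
            new = canonicalize(old, alias_map)
            if new != old:
                conn.execute(
                    "UPDATE memories SET domain_tags = ? WHERE id = ?",
                    (json.dumps(new), mem_id),
                )
                changed += 1
    return changed
```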

7D — Confidence decay

Daily lightweight job. For memories with reference_count=0 AND last_referenced_at older than 30 days: multiply confidence by 0.97/day (~23-day half-life, since 0.97^23 ≈ 0.5). Reinforcement already bumps confidence. Below 0.3 → auto-supersede with reason decayed, no references. Reversible (tune half-life), non-destructive (still searchable with status filter).
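The daily step can be sketched as follows. The constants come from the text above; the function names and the tuple return shape are assumptions. The helper also makes the half-life easy to check: at a factor of 0.97 per day it works out to ln 0.5 / ln 0.97, a little under 23 days, so a longer half-life would need a factor closer to 1.

```python
import math
from datetime import datetime, timedelta

DAILY_FACTOR = 0.97
STALE_AFTER = timedelta(days=30)
SUPERSEDE_BELOW = 0.3


def half_life_days(factor: float = DAILY_FACTOR) -> float:
    """Days until confidence halves under a constant daily multiplier."""
    return math.log(0.5) / math.log(factor)


def decay_step(
    confidence: float,
    reference_count: int,
    last_referenced_at: datetime,
    now: datetime,
) -> tuple[float, str]:
    """Apply one day of decay; returns (new_confidence, new_status)."""
    eligible = reference_count == 0 and now - last_referenced_at > STALE_AFTER
    if not eligible:
        return confidence, "active"
    decayed = confidence * DAILY_FACTOR
    if decayed < SUPERSEDE_BELOW:
        # Superseded with reason "decayed, no references"; the row is kept.
        return decayed, "superseded"
    return decayed, "active"
```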

7E — Memory detail page /wiki/memories/{id}

Provenance chain: source_chunk → interaction → graduated_to_entity. Audit trail (Phase 4 has the data). Related memories (same project + tag + semantic neighbors). Decay trajectory plot (if 7D ships). Link target from every memory surfaced anywhere in the wiki.

7F — Cross-project domain view /wiki/domains/{tag}

One page per domain_tag showing all memories + graduated entities with that tag, grouped by project. "Optics across p04+p05+p06" becomes a real navigable page. Answers the long-standing question the tag system was meant to enable.

7G — Re-extraction on prompt upgrade

batch_llm_extract_live.py --force-reextract --since DATE. Dedupe key: (interaction_id, extractor_version) — same run on same interaction doesn't double-create. Triggered manually when LLM_EXTRACTOR_VERSION bumps. Not automatic (destructive).
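The dedupe key makes a re-run idempotent. A minimal sketch, assuming the set of already-processed keys is available in memory; the `plan_reextraction` name and the in-memory set are illustrative, not how batch_llm_extract_live.py stores this.

```python
def plan_reextraction(
    interaction_ids: list[str],
    already_extracted: set[tuple[str, str]],
    extractor_version: str,
) -> list[str]:
    """Return the interactions that still need a pass at this extractor version."""
    todo: list[str] = []
    for interaction_id in interaction_ids:
        key = (interaction_id, extractor_version)
        if key in already_extracted:
            continue  # same run on same interaction: skip, don't double-create
        already_extracted.add(key)
        todo.append(interaction_id)
    return todo
```

Because the version is part of the key, bumping LLM_EXTRACTOR_VERSION makes every old interaction eligible again, while repeating a run at the same version is a no-op.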

7H — Vector store hygiene

Nightly: scan source_chunks and memory_embeddings (added in 7A V2) for status=superseded|invalid. Delete matching vectors from Chroma. Fail-open — the retrieval harness catches any real regression.
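A fail-open purge might look like the sketch below. The store is abstracted behind a `delete_vectors` callable, which is an assumption for the example; with Chroma it could wrap a collection's `delete(ids=...)` call. Fail-open here means a store error is logged and the nightly job carries on rather than aborting.

```python
import logging
from typing import Callable, Iterable

log = logging.getLogger("vector_hygiene")
PURGE_STATUSES = {"superseded", "invalid"}


def purge_vectors(
    rows: Iterable[tuple[str, str]],
    delete_vectors: Callable[[list[str]], None],
) -> int:
    """Delete vectors for superseded/invalid rows; returns how many were removed.

    rows yields (vector_id, status) pairs from source_chunks or memory_embeddings.
    """
    ids = [vector_id for vector_id, status in rows if status in PURGE_STATUSES]
    if not ids:
        return 0
    try:
        delete_vectors(ids)
    except Exception:
        # Fail-open: stale vectors stay for now and the next run retries.
        log.exception("vector purge failed for %d ids", len(ids))
        return 0
    return len(ids)
```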

Verification & ship order

  1. 7A — ship + observe 1 week → validate merge proposals are high-signal, rejection rate acceptable
  2. 7D — decay is low-risk + high-compounding value; ship second
  3. 7C — clean up tag fragmentation before 7F depends on canonical tags
  4. 7E + 7F — UX surfaces; ship together once data is clean
  5. 7B — contradictions flow (pairs harder than duplicates to classify; wait for 7A data to tune threshold)
  6. 7G — on-demand; no ship until we actually bump the extractor prompt
  7. 7H — housekeeping; after 7A + 7B + 7D have generated enough superseded rows to matter

Scope NOT in Phase 7

  • Graduated memories (entity-descended) are frozen — exempt from dedup/decay. Entity consolidation is a separate Phase (8+).
  • Auto-merging without human approval (always human-in-the-loop in V1).
  • Summarization / compression — a different problem (reducing the number of chunks per memory, not the number of memories).
  • Forgetting policies — there's no user-facing "delete this" flow in Phase 7. Supersede + filter covers the need.