feat: Phase 7A — semantic memory dedup ("sleep cycle" V1)
New table memory_merge_candidates + service functions to cluster
near-duplicate active memories within (project, memory_type) buckets,
draft a unified content via LLM, and merge on human approval. Source
memories become superseded (never deleted); merged memory carries
union of tags, max of confidence, sum of reference_count.

- schema migration for memory_merge_candidates
- atocore.memory.similarity: cosine + transitive clustering
- atocore.memory._dedup_prompt: stdlib-only LLM prompt preserving every specific
- service: merge_memories / create_merge_candidate / get_merge_candidates / reject_merge_candidate
- scripts/memory_dedup.py: host-side detector (HTTP-only, idempotent)
- 5 API endpoints under /admin/memory/merge-candidates* + /admin/memory/dedup-scan
- triage UI: purple "🔗 Merge Candidates" section + "🔗 Scan for duplicates" bar
- batch-extract.sh Step B3 (0.90 daily, 0.85 Sundays)
- deploy/dalidou/dedup-watcher.sh for UI-triggered scans
- 21 new tests (374 → 395)
- docs/PHASE-7-MEMORY-CONSOLIDATION.md covering 7A-7H roadmap

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
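The "cosine + transitive clustering" step named in the commit message can be sketched as follows. This is a minimal illustration, not the contents of atocore.memory.similarity: the function names `cosine` and `cluster_transitive`, the plain-list vector representation, and the union-find approach are all assumptions. It groups memory ids that are connected by any chain of pairs scoring at or above the threshold, so two memories can land in the same cluster without being directly similar to each other.

```python
import math
from itertools import combinations


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_transitive(vectors: dict[str, list[float]], threshold: float) -> list[list[str]]:
    """Group ids linked by a chain of pairs whose similarity is >= threshold."""
    parent = {mid: mid for mid in vectors}

    def find(x: str) -> str:
        # Walk up to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(vectors, 2):
        if cosine(vectors[a], vectors[b]) >= threshold:
            parent[find(a)] = find(b)

    groups: dict[str, list[str]] = {}
    for mid in vectors:
        groups.setdefault(find(mid), []).append(mid)
    # Only groups of two or more are merge candidates.
    return [sorted(g) for g in groups.values() if len(g) >= 2]


def unit(deg: float) -> list[float]:
    """Hypothetical 2-D stand-in for an embedding, pointing at the given angle."""
    return [math.cos(math.radians(deg)), math.sin(math.radians(deg))]


# m1~m2 and m2~m3 clear 0.90, m1~m3 does not; m4 is unrelated.
vectors = {"m1": unit(0), "m2": unit(20), "m3": unit(40), "m4": unit(90)}
clusters = cluster_transitive(vectors, threshold=0.90)
```

The pairwise loop is quadratic in bucket size, which is likely acceptable when clustering runs separately per (project, memory_type) bucket, though that is a guess about typical bucket sizes.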
@@ -251,6 +251,42 @@ def _apply_migrations(conn: sqlite3.Connection) -> None:
        "CREATE INDEX IF NOT EXISTS idx_interactions_created_at ON interactions(created_at)"
    )

    # Phase 7A (Memory Consolidation — "sleep cycle"): merge candidates.
    # When the dedup detector finds a cluster of semantically similar active
    # memories within the same (project, memory_type) bucket, it drafts a
    # unified content via LLM and writes a proposal here. The triage UI
    # surfaces these for human approval. On approve, source memories become
    # status=superseded and a new merged memory is created.
    # memory_ids is a JSON array (length >= 2) of the source memory ids.
    # proposed_* hold the LLM's draft; a human can edit before approve.
    # result_memory_id is filled on approve with the new merged memory's id.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS memory_merge_candidates (
            id TEXT PRIMARY KEY,
            status TEXT DEFAULT 'pending',
            memory_ids TEXT NOT NULL,
            similarity REAL,
            proposed_content TEXT,
            proposed_memory_type TEXT,
            proposed_project TEXT,
            proposed_tags TEXT DEFAULT '[]',
            proposed_confidence REAL,
            reason TEXT DEFAULT '',
            created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
            resolved_at DATETIME,
            resolved_by TEXT,
            result_memory_id TEXT
        )
        """
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_mmc_status ON memory_merge_candidates(status)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_mmc_created_at ON memory_merge_candidates(created_at)"
    )


def _column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
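The approval flow described in the migration comment and the commit message (sources become superseded, the merged memory takes the union of tags, the max of confidence, and the sum of reference_count) might look roughly like the sketch below. It is not the `merge_memories` service function from this commit. The `memories` table layout, the `approve` helper, and the `'approved'` status value are assumptions made for illustration; only the `memory_merge_candidates` column names come from the diff, and only a subset of them is used.

```python
import json
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
# Hypothetical, trimmed-down memories table; the real schema is not in this diff.
conn.execute(
    "CREATE TABLE memories (id TEXT PRIMARY KEY, content TEXT, tags TEXT DEFAULT '[]', "
    "confidence REAL, reference_count INTEGER DEFAULT 0, status TEXT DEFAULT 'active')"
)
# Subset of the memory_merge_candidates columns from the migration.
conn.execute(
    "CREATE TABLE memory_merge_candidates (id TEXT PRIMARY KEY, "
    "status TEXT DEFAULT 'pending', memory_ids TEXT NOT NULL, proposed_content TEXT, "
    "resolved_at DATETIME, resolved_by TEXT, result_memory_id TEXT)"
)


def approve(conn: sqlite3.Connection, candidate_id: str, actor: str) -> str:
    """Apply a pending merge candidate and return the new merged memory's id."""
    cand = conn.execute(
        "SELECT * FROM memory_merge_candidates WHERE id = ? AND status = 'pending'",
        (candidate_id,),
    ).fetchone()
    if cand is None:
        raise ValueError(f"no pending merge candidate {candidate_id!r}")
    ids = json.loads(cand["memory_ids"])
    marks = ",".join("?" for _ in ids)
    rows = conn.execute(f"SELECT * FROM memories WHERE id IN ({marks})", ids).fetchall()
    if len(rows) < 2:
        raise ValueError("a merge needs at least two existing source memories")
    tags = sorted({t for r in rows for t in json.loads(r["tags"])})  # union of tags
    merged_id = str(uuid.uuid4())
    with conn:  # commit all three writes together, or roll back on error
        conn.execute(
            "INSERT INTO memories (id, content, tags, confidence, reference_count) "
            "VALUES (?, ?, ?, ?, ?)",
            (
                merged_id,
                cand["proposed_content"],
                json.dumps(tags),
                max(r["confidence"] for r in rows),
                sum(r["reference_count"] for r in rows),
            ),
        )
        # Sources are kept for history, never deleted.
        conn.execute(f"UPDATE memories SET status = 'superseded' WHERE id IN ({marks})", ids)
        conn.execute(
            "UPDATE memory_merge_candidates SET status = 'approved', "  # assumed status value
            "resolved_at = CURRENT_TIMESTAMP, resolved_by = ?, result_memory_id = ? WHERE id = ?",
            (actor, merged_id, candidate_id),
        )
    return merged_id


conn.executemany(
    "INSERT INTO memories (id, content, tags, confidence, reference_count) VALUES (?, ?, ?, ?, ?)",
    [("a", "Uses SQLite", '["db"]', 0.6, 2), ("b", "Storage is SQLite", '["db", "storage"]', 0.8, 3)],
)
conn.execute(
    "INSERT INTO memory_merge_candidates (id, memory_ids, proposed_content) VALUES (?, ?, ?)",
    ("c1", json.dumps(["a", "b"]), "Storage uses SQLite"),
)
merged_id = approve(conn, "c1", "reviewer")
merged = conn.execute("SELECT * FROM memories WHERE id = ?", (merged_id,)).fetchone()
```

Selecting the candidate with `status = 'pending'` means a second approval of the same row raises instead of creating a second merged memory.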