feat: Phase 6 — Living Taxonomy + Universal Capture

Closes two real-use gaps:
1. "APM tool" gap: work done outside Claude Code (desktop, web, phone,
   other machine) was invisible to AtoCore.
2. Project discovery gap: manual JSON-file edits required to promote
   an emerging theme to a first-class project.

B — atocore_remember MCP tool (scripts/atocore_mcp.py):
- New MCP tool for universal capture from any MCP-aware client
  (Claude Desktop, Code, Cursor, Zed, Windsurf, etc.)
- Accepts content (required) + memory_type/project/confidence/
  valid_until/domain_tags (all optional with sensible defaults)
- Creates a candidate memory that goes through the existing 3-tier
  triage (no bypass — the quality gate catches noise)
- Detailed tool description guides Claude on when to invoke: "remember
  this", "save that for later", "don't lose this fact"
- Total tools exposed by MCP server: 14 → 15
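
The capture path above can be sketched roughly as follows. This is only an illustration of "required content plus optional fields with defaults, always landing as a candidate"; the RememberRequest class, the build_candidate helper, and the specific default values are hypothetical and are not taken from scripts/atocore_mcp.py.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RememberRequest:
    """Hypothetical argument shape for atocore_remember (defaults are assumed)."""
    content: str
    memory_type: str = "knowledge"      # assumed default, not from the source
    project: str = ""
    confidence: float = 0.5             # assumed default, not from the source
    valid_until: Optional[str] = None   # ISO date string, or None for no expiry
    domain_tags: list = field(default_factory=list)


def build_candidate(req: RememberRequest) -> dict:
    """Validate the request and return a candidate-memory record.

    The record is always a candidate: triage decides whether it is promoted.
    """
    text = req.content.strip()
    if not text:
        raise ValueError("content is required")
    if not 0.0 <= req.confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return {
        "content": text,
        "memory_type": req.memory_type,
        "project": req.project.strip().lower(),
        "confidence": req.confidence,
        "valid_until": req.valid_until,
        "domain_tags": [t.strip().lower() for t in req.domain_tags if t.strip()],
        "status": "candidate",  # never "active": no triage bypass
    }
```

Keeping the status fixed at "candidate" in one place is what makes the "no bypass" rule hard to break by accident.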

C.1 Emerging-concepts detector (scripts/detect_emerging.py):
- Nightly scan of active + candidate memories for:
  * Unregistered project names with ≥3 memory occurrences
  * Top 20 domain_tags by frequency (emerging categories)
  * Active memories with reference_count ≥ 5 + valid_until set
    (reinforced transients — candidates for extension)
- Writes findings to atocore/proposals/* project state entries
- Emits "warning" alert via Phase 4 framework the FIRST time a new
  project crosses the 5-memory alert threshold (avoids spam)
- Configurable via env vars: ATOCORE_EMERGING_PROJECT_MIN (default 3),
  ATOCORE_EMERGING_ALERT_THRESHOLD (default 5), TOP_TAGS_LIMIT (20)
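
A minimal sketch of the two thresholds described above, assuming memories are plain dicts with a "project" key. The function names and the already_alerted set are hypothetical; only the env var names and their defaults come from this commit message.

```python
import os
from collections import Counter

# Env var names and defaults as stated in the commit message.
PROJECT_MIN = int(os.environ.get("ATOCORE_EMERGING_PROJECT_MIN", "3"))
ALERT_THRESHOLD = int(os.environ.get("ATOCORE_EMERGING_ALERT_THRESHOLD", "5"))


def find_unregistered(memories, registered, project_min=PROJECT_MIN):
    """Count memories per project name and keep unregistered names at or above the minimum."""
    counts = Counter()
    for m in memories:
        name = (m.get("project") or "").strip().lower()
        if name:
            counts[name] += 1
    return {
        name: n
        for name, n in counts.items()
        if n >= project_min and name not in registered
    }


def newly_alertable(proposals, already_alerted, threshold=ALERT_THRESHOLD):
    """Return names that cross the alert threshold and have not been alerted on before."""
    return sorted(
        name
        for name, n in proposals.items()
        if n >= threshold and name not in already_alerted
    )
```

Tracking which names have already triggered an alert is one way to get the "first time only" behavior; how the real detector persists that state is not shown here.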

C.2 Registration surface (src/atocore/api/routes.py + wiki.py):
- POST /admin/projects/register-emerging — one-click register with
  sensible defaults (ingest_roots auto-filled with
  vault:incoming/projects/<id>/ convention). Clears the proposal
  from the dashboard list on success.
- Dashboard /admin/dashboard: new "proposals" section with
  unregistered_projects + emerging_categories + reinforced_transients.
- Wiki homepage: "📋 Emerging" section rendering each unregistered
  project as a card with count + 2 sample memory previews + inline
  "📌 Register as project" button that calls the endpoint via fetch,
  reloads the page on success.
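
The same endpoint can also be called from a script rather than the wiki button. The sketch below only builds the request; the base URL is an assumption and build_register_request is a hypothetical helper, while the path and the JSON field names come from this commit.

```python
import json
import urllib.request


def build_register_request(base_url: str, project_id: str, description: str = "") -> urllib.request.Request:
    """Build (but do not send) a POST to /admin/projects/register-emerging."""
    payload = {"project_id": project_id.strip().lower()}
    if description:
        payload["description"] = description
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/admin/projects/register-emerging",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it is then a matter of passing the request to urllib.request.urlopen against a running instance; a 400 response indicates that validation in register_project() rejected the id.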

C.3 Transient-to-durable extension
(src/atocore/memory/service.py + API + cron):
- New extend_reinforced_valid_until() function: scans active memories
  with valid_until within the next 30 days and reference_count ≥ 5.
  Sets expiry to 90 days from the run date. If reference_count ≥ 10,
  clears expiry entirely (makes permanent). Writes audit rows via the
  Phase 4 memory_audit framework with actor="transient-to-durable".
- POST /admin/memory/extend-reinforced — API wrapper for cron.
- Matches the user's intuition: "something transient becomes important
  if you keep coming back to it".
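
The decision rule above, reduced to a pure function for illustration. decide_extension is a hypothetical helper, not part of service.py; the thresholds mirror the defaults named in this commit, and the new expiry is computed from the run date, as in the diff.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def decide_extension(
    valid_until: date,
    reference_count: int,
    today: date,
    min_refs: int = 5,
    permanent_refs: int = 10,
    extension_days: int = 90,
    imminent_days: int = 30,
) -> Tuple[str, Optional[date]]:
    """Return (action, new_valid_until) for one memory."""
    horizon = today + timedelta(days=imminent_days)
    if reference_count < min_refs or valid_until > horizon:
        return ("skip", valid_until)  # not reinforced enough, or expiry is far off
    if reference_count >= permanent_refs:
        return ("made_permanent", None)  # expiry cleared entirely
    return ("extended", today + timedelta(days=extension_days))
```

For example, on 2026-04-18 a memory expiring on 2026-05-01 with 5 references would be extended to 2026-07-17, while the same memory with 10 references would lose its expiry.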

Nightly cron (deploy/dalidou/batch-extract.sh):
- Step F2: detect_emerging.py (after F pipeline summary)
- Step F3: /admin/memory/extend-reinforced (before integrity check)
- Both fail-open; errors don't break the pipeline.
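
The fail-open behavior lives in a shell script in this commit; the same pattern is sketched in Python below purely for illustration. run_fail_open and the logger name are hypothetical.

```python
import logging
from typing import Callable

log = logging.getLogger("nightly")


def run_fail_open(name: str, step: Callable[[], None]) -> bool:
    """Run one pipeline step; log and swallow any error so later steps still run."""
    try:
        step()
    except Exception:
        log.exception("step %s failed; continuing", name)
        return False
    return True
```

Returning a boolean rather than raising lets the caller report which steps failed in a summary without aborting the run.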

Tests: 366 → 374 (+8 for Phase 6):
- 6 tests for extend_reinforced_valid_until covering:
  extension path, permanent path, skip far-future, skip low-refs,
  skip permanent memories, audit row write
- 2 smoke tests for the detector (imports cleanly, handles empty DB)
- MCP tool changes don't need new tests — the wrapper is pure passthrough

Design decisions documented in plan file:
- atocore_remember deliberately doesn't bypass triage (quality gate)
- Detector is passive (surfaces proposals) not active (auto-registers)
- Sensible ingest-root defaults ("vault:incoming/projects/<id>/")
  so registration is one-click with no file-path thinking
- Extension adds 90 days rather than clearing expiry; expiry is cleared
  only at reference_count ≥ 10 (gradual permanence earned through
  sustained reinforcement)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 08:08:55 -04:00
parent cc68839306
commit 02055e8db3
7 changed files with 736 additions and 0 deletions


@@ -369,6 +369,72 @@ def api_project_registration(req: ProjectRegistrationProposalRequest) -> dict:
        raise HTTPException(status_code=400, detail=str(e))


class RegisterEmergingRequest(BaseModel):
    project_id: str
    description: str = ""
    aliases: list[str] | None = None


@router.post("/admin/projects/register-emerging")
def api_register_emerging_project(req: RegisterEmergingRequest) -> dict:
    """Phase 6 C.2 — one-click register a detected emerging project.

    Fills in sensible defaults so the user doesn't have to think about
    paths: ingest_roots defaults to vault:incoming/projects/<project_id>/
    (will be empty until the user creates content there, which is fine).

    Delegates to the existing register_project() for validation + file
    write. Clears the project from the unregistered_projects proposal
    list so it stops appearing in the dashboard.
    """
    import json as _json

    pid = (req.project_id or "").strip().lower()
    if not pid:
        raise HTTPException(status_code=400, detail="project_id is required")

    aliases = req.aliases or []
    description = req.description or f"Emerging project registered from dashboard: {pid}"
    ingest_roots = [{
        "source": "vault",
        "subpath": f"incoming/projects/{pid}/",
        "label": pid,
    }]

    try:
        result = register_project(
            project_id=pid,
            aliases=aliases,
            description=description,
            ingest_roots=ingest_roots,
        )
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))

    # Clear from proposals so dashboard doesn't keep showing it
    try:
        from atocore.context.project_state import get_state, set_state
        for e in get_state("atocore"):
            if e.category == "proposals" and e.key == "unregistered_projects":
                try:
                    current = _json.loads(e.value)
                except Exception:
                    current = []
                filtered = [p for p in current if p.get("project") != pid]
                set_state(
                    project_name="atocore",
                    category="proposals",
                    key="unregistered_projects",
                    value=_json.dumps(filtered),
                    source="register-emerging",
                )
                break
    except Exception:
        pass  # non-fatal

    result["message"] = f"Project {pid!r} registered. Now has a wiki page, system map, and killer queries."
    return result


@router.put("/projects/{project_name}")
def api_project_update(project_name: str, req: ProjectUpdateRequest) -> dict:
    """Update an existing project registration."""
@@ -1190,6 +1256,25 @@ def api_dashboard() -> dict:
    except Exception:
        pass

    # Phase 6 C.2: emerging-concepts proposals from the detector
    proposals: dict = {}
    try:
        for entry in get_state("atocore"):
            if entry.category != "proposals":
                continue
            try:
                data = _json.loads(entry.value)
            except Exception:
                continue
            if entry.key == "unregistered_projects":
                proposals["unregistered_projects"] = data
            elif entry.key == "emerging_categories":
                proposals["emerging_categories"] = data
            elif entry.key == "reinforced_transients":
                proposals["reinforced_transients"] = data
    except Exception:
        pass

    # Project state counts — include all registered projects
    ps_counts = {}
    try:
@@ -1248,6 +1333,7 @@ def api_dashboard() -> dict:
        "integrity": integrity,
        "alerts": alerts,
        "recent_audit": recent_audit,
        "proposals": proposals,
    }
@@ -1431,6 +1517,19 @@ def api_graduation_status() -> dict:
    return out


@router.post("/admin/memory/extend-reinforced")
def api_extend_reinforced() -> dict:
    """Phase 6 C.3 — batch transient-to-durable extension.

    Scans active memories with valid_until in the next 30 days and
    reference_count >= 5. Extends expiry by 90 days, or clears it
    entirely (permanent) if reference_count >= 10. Writes audit rows.
    """
    from atocore.memory.service import extend_reinforced_valid_until
    extended = extend_reinforced_valid_until()
    return {"extended_count": len(extended), "extensions": extended}


@router.get("/admin/graduation/stats")
def api_graduation_stats() -> dict:
    """Phase 5F graduation stats for dashboard."""


@@ -116,6 +116,40 @@ def render_homepage() -> str:
    lines.append('</a>')
    lines.append('</div>')

    # Phase 6 C.2: Emerging projects section
    try:
        import html as _html
        import json as _json
        emerging_projects = []
        state_entries = get_state("atocore")
        for e in state_entries:
            if e.category == "proposals" and e.key == "unregistered_projects":
                try:
                    emerging_projects = _json.loads(e.value)
                except Exception:
                    emerging_projects = []
                break
        if emerging_projects:
            lines.append('<h2>📋 Emerging</h2>')
            lines.append('<p class="emerging-intro">Projects that appear in memories but aren\'t yet registered. '
                         'One click to promote them to first-class projects.</p>')
            lines.append('<div class="emerging-grid">')
            for ep in emerging_projects[:10]:
                name = str(ep.get("project", "?"))
                count = ep.get("count", 0)
                samples = ep.get("sample_contents", [])
                # Names and samples come from memory content: escape before embedding in HTML.
                name_html = _html.escape(name)
                # JSON-encode for the JS call, then escape for the double-quoted attribute.
                name_js = _html.escape(_json.dumps(name), quote=True)
                samples_html = "".join(f'<li>{_html.escape(str(s)[:120])}</li>' for s in samples[:2])
                lines.append(
                    f'<div class="emerging-card">'
                    f'<h3>{name_html}</h3>'
                    f'<div class="emerging-count">{count} memories</div>'
                    f'<ul class="emerging-samples">{samples_html}</ul>'
                    f'<button class="btn-register-emerging" onclick="registerEmerging({name_js})">📌 Register as project</button>'
                    f'</div>'
                )
            lines.append('</div>')
    except Exception:
        pass

    # Quick stats
    all_entities = get_entities(limit=500)
    all_memories = get_memories(active_only=True, limit=500)
@@ -324,7 +358,41 @@ _TEMPLATE = """<!DOCTYPE html>
.tag-badge:hover { opacity: 0.85; text-decoration: none; }
.mem-expiry { font-size: 0.75rem; color: #d97706; font-style: italic; margin-left: 0.4rem; }
@media (prefers-color-scheme: dark) { .mem-expiry { color: #fbbf24; } }
/* Phase 6 C.2 — Emerging projects section */
.emerging-intro { font-size: 0.9rem; opacity: 0.75; margin-bottom: 0.8rem; }
.emerging-grid { display: grid; grid-template-columns: repeat(auto-fill, minmax(280px, 1fr)); gap: 1rem; margin-bottom: 1rem; }
.emerging-card { background: var(--card); border: 1px dashed var(--accent); border-radius: 8px; padding: 1rem; }
.emerging-card h3 { margin: 0 0 0.3rem 0; color: var(--accent); font-family: monospace; font-size: 1rem; }
.emerging-count { font-size: 0.8rem; opacity: 0.6; margin-bottom: 0.5rem; }
.emerging-samples { font-size: 0.85rem; margin: 0.5rem 0; padding-left: 1.2rem; opacity: 0.8; }
.emerging-samples li { margin-bottom: 0.25rem; }
.btn-register-emerging { width: 100%; padding: 0.45rem 0.9rem; background: var(--accent); color: white; border: 1px solid var(--accent); border-radius: 4px; cursor: pointer; font-size: 0.88rem; font-weight: 500; margin-top: 0.5rem; }
.btn-register-emerging:hover { opacity: 0.9; }
</style>
<script>
async function registerEmerging(projectId) {
  if (!confirm(`Register "${projectId}" as a first-class project?\n\nThis creates:\n• /wiki/projects/${projectId} page\n• System map + gaps + killer queries\n• Triage + graduation support\n\nIngest root defaults to vault:incoming/projects/${projectId}/`)) {
    return;
  }
  try {
    const r = await fetch('/admin/projects/register-emerging', {
      method: 'POST',
      headers: {'Content-Type': 'application/json'},
      body: JSON.stringify({project_id: projectId}),
    });
    if (r.ok) {
      const data = await r.json();
      alert(data.message || `Registered ${projectId}`);
      window.location.reload();
    } else {
      const err = await r.text();
      alert(`Registration failed: ${r.status}\n${err.substring(0, 300)}`);
    }
  } catch (e) {
    alert(`Network error: ${e.message}`);
  }
}
</script>
</head>
<body>
{{nav}}


@@ -604,6 +604,93 @@ def auto_promote_reinforced(
    return promoted


def extend_reinforced_valid_until(
    min_reference_count: int = 5,
    permanent_reference_count: int = 10,
    extension_days: int = 90,
    imminent_expiry_days: int = 30,
) -> list[dict]:
    """Phase 6 C.3 — transient-to-durable auto-extension.

    For active memories whose valid_until falls on or before
    today + imminent_expiry_days AND whose reference_count >=
    min_reference_count: set valid_until to today + extension_days.
    If reference_count >= permanent_reference_count, clear valid_until
    entirely (becomes permanent).

    Matches the user's intuition: "something transient becomes important
    if you keep coming back to it". The system watches reinforcement
    signals and extends expiry so context packs keep seeing durable
    facts instead of letting them decay out.

    Returns a list of {memory_id, action, old_valid_until,
    new_valid_until, reference_count} dicts, one per memory touched.
    """
    from datetime import timedelta

    now = datetime.now(timezone.utc)
    horizon = (now + timedelta(days=imminent_expiry_days)).strftime("%Y-%m-%d")
    # New expiry is measured from now, not from the old valid_until.
    new_expiry = (now + timedelta(days=extension_days)).strftime("%Y-%m-%d")
    now_str = now.strftime("%Y-%m-%d %H:%M:%S")

    extended: list[dict] = []
    with get_connection() as conn:
        # No lower bound on valid_until: active rows already past expiry also match.
        rows = conn.execute(
            "SELECT id, valid_until, reference_count FROM memories "
            "WHERE status = 'active' "
            "AND valid_until IS NOT NULL AND valid_until != '' "
            "AND substr(valid_until, 1, 10) <= ? "
            "AND COALESCE(reference_count, 0) >= ?",
            (horizon, min_reference_count),
        ).fetchall()

        for r in rows:
            mid = r["id"]
            old_vu = r["valid_until"]
            ref_count = int(r["reference_count"] or 0)

            if ref_count >= permanent_reference_count:
                # Permanent promotion
                conn.execute(
                    "UPDATE memories SET valid_until = NULL, updated_at = ? WHERE id = ?",
                    (now_str, mid),
                )
                extended.append({
                    "memory_id": mid, "action": "made_permanent",
                    "old_valid_until": old_vu, "new_valid_until": None,
                    "reference_count": ref_count,
                })
            else:
                # 90-day extension
                conn.execute(
                    "UPDATE memories SET valid_until = ?, updated_at = ? WHERE id = ?",
                    (new_expiry, now_str, mid),
                )
                extended.append({
                    "memory_id": mid, "action": "extended",
                    "old_valid_until": old_vu, "new_valid_until": new_expiry,
                    "reference_count": ref_count,
                })

    # Audit rows via the shared framework (fail-open)
    for ex in extended:
        try:
            _audit_memory(
                memory_id=ex["memory_id"],
                action="valid_until_extended",
                actor="transient-to-durable",
                before={"valid_until": ex["old_valid_until"]},
                after={"valid_until": ex["new_valid_until"]},
                note=f"reinforced {ex['reference_count']}x; {ex['action']}",
            )
        except Exception:
            pass

    if extended:
        log.info("reinforced_valid_until_extended", count=len(extended))
    return extended


def expire_stale_candidates(
    max_age_days: int = 14,
) -> list[str]: