feat: Phase 6 — Living Taxonomy + Universal Capture
Closes two real-use gaps:
1. "APM tool" gap: work done outside Claude Code (desktop, web, phone,
other machine) was invisible to AtoCore.
2. Project discovery gap: manual JSON-file edits required to promote
an emerging theme to a first-class project.
B — atocore_remember MCP tool (scripts/atocore_mcp.py):
- New MCP tool for universal capture from any MCP-aware client
(Claude Desktop, Code, Cursor, Zed, Windsurf, etc.)
- Accepts content (required) + memory_type/project/confidence/
valid_until/domain_tags (all optional with sensible defaults)
- Creates a candidate memory, goes through the existing 3-tier triage
(no bypass — the quality gate catches noise)
- Detailed tool description guides Claude on when to invoke: "remember
this", "save that for later", "don't lose this fact"
- Total tools exposed by MCP server: 14 → 15
C.1 Emerging-concepts detector (scripts/detect_emerging.py):
- Nightly scan of active + candidate memories for:
* Unregistered project names with ≥3 memory occurrences
* Top 20 domain_tags by frequency (emerging categories)
* Active memories with reference_count ≥ 5 + valid_until set
(reinforced transients — candidates for extension)
- Writes findings to atocore/proposals/* project state entries
- Emits "warning" alert via Phase 4 framework the FIRST time a new
project crosses the 5-memory alert threshold (avoids spam)
- Configurable via env vars: ATOCORE_EMERGING_PROJECT_MIN (default 3),
ATOCORE_EMERGING_ALERT_THRESHOLD (default 5), TOP_TAGS_LIMIT (20)
C.2 Registration surface (src/atocore/api/routes.py + wiki.py):
- POST /admin/projects/register-emerging — one-click register with
sensible defaults (ingest_roots auto-filled with
vault:incoming/projects/<id>/ convention). Clears the proposal
from the dashboard list on success.
- Dashboard /admin/dashboard: new "proposals" section with
unregistered_projects + emerging_categories + reinforced_transients.
- Wiki homepage: "📋 Emerging" section rendering each unregistered
project as a card with count + 2 sample memory previews + inline
"📌 Register as project" button that calls the endpoint via fetch,
reloads the page on success.
C.3 Transient-to-durable extension
(src/atocore/memory/service.py + API + cron):
- New extend_reinforced_valid_until() function — scans active memories
with valid_until in the next 30 days and reference_count ≥ 5.
Extends expiry by 90 days. If reference_count ≥ 10, clears expiry
entirely (makes permanent). Writes audit rows via the Phase 4
memory_audit framework with actor="transient-to-durable".
- POST /admin/memory/extend-reinforced — API wrapper for cron.
- Matches the user's intuition: "something transient becomes important
if you keep coming back to it".
Nightly cron (deploy/dalidou/batch-extract.sh):
- Step F2: detect_emerging.py (after F pipeline summary)
- Step F3: /admin/memory/extend-reinforced (before integrity check)
- Both fail-open; errors don't break the pipeline.
Tests: 366 → 374 (+8 for Phase 6):
- 6 tests for extend_reinforced_valid_until covering:
extension path, permanent path, skip far-future, skip low-refs,
skip permanent memories, audit row write
- 2 smoke tests for the detector (imports cleanly, handles empty DB)
- MCP tool changes don't need new tests — the wrapper is pure passthrough
Design decisions documented in plan file:
- atocore_remember deliberately doesn't bypass triage (quality gate)
- Detector is passive (surfaces proposals) not active (auto-registers)
- Sensible ingest-root defaults ("vault:incoming/projects/<id>/")
so registration is one-click with no file-path thinking
- Extension adds 90 days rather than clearing expiry (gradual
permanence earned through sustained reinforcement)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -150,6 +150,22 @@ print(f'Pipeline summary persisted: {json.dumps(summary)}')
|
||||
log "WARN: pipeline summary persistence failed (non-blocking)"
|
||||
}
|
||||
|
||||
# Step F2: Emerging-concepts detector (Phase 6 C.1)
|
||||
log "Step F2: emerging-concepts detector"
|
||||
python3 "$APP_DIR/scripts/detect_emerging.py" \
|
||||
--base-url "$ATOCORE_URL" \
|
||||
2>&1 || {
|
||||
log "WARN: emerging detector failed (non-blocking)"
|
||||
}
|
||||
|
||||
# Step F3: Transient-to-durable extension (Phase 6 C.3)
|
||||
log "Step F3: transient-to-durable extension"
|
||||
curl -sSf -X POST "$ATOCORE_URL/admin/memory/extend-reinforced" \
|
||||
-H 'Content-Type: application/json' \
|
||||
2>&1 | tail -5 || {
|
||||
log "WARN: extend-reinforced failed (non-blocking)"
|
||||
}
|
||||
|
||||
# Step G: Integrity check (Phase 4 V1)
|
||||
log "Step G: integrity check"
|
||||
python3 "$APP_DIR/scripts/integrity_check.py" \
|
||||
|
||||
@@ -243,6 +243,72 @@ def _tool_projects(args: dict) -> str:
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
def _tool_remember(args: dict) -> str:
|
||||
"""Phase 6 Part B — universal capture from any Claude session.
|
||||
|
||||
Wraps POST /memory to create a candidate memory tagged with
|
||||
source='mcp-remember'. The existing 3-tier triage is the quality
|
||||
gate: nothing becomes active until sonnet (+ opus if borderline)
|
||||
approves it. Returns the memory id so the caller can reference it
|
||||
in the same session.
|
||||
"""
|
||||
content = (args.get("content") or "").strip()
|
||||
if not content:
|
||||
return "Error: 'content' is required."
|
||||
|
||||
memory_type = (args.get("memory_type") or "knowledge").strip()
|
||||
valid_types = ["identity", "preference", "project", "episodic", "knowledge", "adaptation"]
|
||||
if memory_type not in valid_types:
|
||||
return f"Error: memory_type must be one of {valid_types}."
|
||||
|
||||
project = (args.get("project") or "").strip()
|
||||
try:
|
||||
confidence = float(args.get("confidence") or 0.6)
|
||||
except (TypeError, ValueError):
|
||||
confidence = 0.6
|
||||
confidence = max(0.0, min(1.0, confidence))
|
||||
|
||||
valid_until = (args.get("valid_until") or "").strip()
|
||||
tags = args.get("domain_tags") or []
|
||||
if not isinstance(tags, list):
|
||||
tags = []
|
||||
# Normalize tags: lowercase, dedupe, cap at 10
|
||||
clean_tags: list[str] = []
|
||||
for t in tags[:10]:
|
||||
if not isinstance(t, str):
|
||||
continue
|
||||
t = t.strip().lower()
|
||||
if t and t not in clean_tags:
|
||||
clean_tags.append(t)
|
||||
|
||||
payload = {
|
||||
"memory_type": memory_type,
|
||||
"content": content,
|
||||
"project": project,
|
||||
"confidence": confidence,
|
||||
"status": "candidate",
|
||||
}
|
||||
if valid_until:
|
||||
payload["valid_until"] = valid_until
|
||||
if clean_tags:
|
||||
payload["domain_tags"] = clean_tags
|
||||
|
||||
result, err = safe_call(http_post, "/memory", payload)
|
||||
if err:
|
||||
return f"AtoCore remember failed: {err}"
|
||||
|
||||
mid = result.get("id", "?")
|
||||
scope = project if project else "(global)"
|
||||
tag_str = f" tags=[{', '.join(clean_tags)}]" if clean_tags else ""
|
||||
expires = f" valid_until={valid_until}" if valid_until else ""
|
||||
return (
|
||||
f"Remembered as candidate: id={mid}\n"
|
||||
f" type={memory_type} project={scope} confidence={confidence:.2f}{tag_str}{expires}\n"
|
||||
f"Will flow through the standard triage pipeline within 24h "
|
||||
f"(or on next auto-process button click at /admin/triage)."
|
||||
)
|
||||
|
||||
|
||||
def _tool_health(args: dict) -> str:
|
||||
"""Check AtoCore service health."""
|
||||
result, err = safe_call(http_get, "/health")
|
||||
@@ -527,6 +593,58 @@ TOOLS = [
|
||||
},
|
||||
"handler": _tool_memory_create,
|
||||
},
|
||||
{
|
||||
"name": "atocore_remember",
|
||||
"description": (
|
||||
"Save a durable fact to AtoCore's memory layer from any conversation. "
|
||||
"Use when the user says 'remember this', 'save that for later', "
|
||||
"'don't lose this fact', or when you identify a decision/insight/"
|
||||
"preference worth persisting across future sessions. The fact "
|
||||
"goes through quality review before being consulted in future "
|
||||
"context packs (so durable facts get kept, noise gets rejected). "
|
||||
"Call multiple times if one conversation has multiple distinct "
|
||||
"facts worth remembering — one tool call per atomic fact. "
|
||||
"Prefer 'knowledge' type for cross-project engineering insights, "
|
||||
"'project' for facts specific to one project, 'preference' for "
|
||||
"user work-style notes, 'adaptation' for standing behavioral rules."
|
||||
),
|
||||
"inputSchema": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"content": {
|
||||
"type": "string",
|
||||
"description": "The atomic fact to remember. Under 250 chars. Should stand alone without session context.",
|
||||
},
|
||||
"memory_type": {
|
||||
"type": "string",
|
||||
"enum": ["identity", "preference", "project", "episodic", "knowledge", "adaptation"],
|
||||
"default": "knowledge",
|
||||
},
|
||||
"project": {
|
||||
"type": "string",
|
||||
"description": "Project id if scoped. Empty for cross-project. Unregistered names flagged by triage as 'emerging project' proposals.",
|
||||
},
|
||||
"confidence": {
|
||||
"type": "number",
|
||||
"minimum": 0,
|
||||
"maximum": 1,
|
||||
"default": 0.6,
|
||||
"description": "0.5-0.7 typical. 0.8+ only for ratified/committed claims.",
|
||||
},
|
||||
"valid_until": {
|
||||
"type": "string",
|
||||
"description": "ISO date YYYY-MM-DD if time-bounded (e.g. current state, scheduled event, quote expiry). Empty for permanent facts.",
|
||||
},
|
||||
"domain_tags": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": "Lowercase topical tags (optics, thermal, firmware, procurement, etc.) for cross-project retrieval. 2-5 tags typical.",
|
||||
},
|
||||
},
|
||||
"required": ["content"],
|
||||
},
|
||||
"handler": _tool_remember,
|
||||
},
|
||||
{
|
||||
"name": "atocore_project_state",
|
||||
"description": (
|
||||
|
||||
200
scripts/detect_emerging.py
Normal file
200
scripts/detect_emerging.py
Normal file
@@ -0,0 +1,200 @@
|
||||
#!/usr/bin/env python3
|
||||
"""Phase 6 C.1 — Emerging-concepts detector.
|
||||
|
||||
Scans active + candidate memories to surface:
|
||||
1. Unregistered projects — project strings appearing on 3+ memories
|
||||
that aren't in the project registry. Surface for one-click
|
||||
registration.
|
||||
2. Emerging categories — top 20 domain_tags by frequency, for
|
||||
"what themes are emerging in my work?" intelligence.
|
||||
3. Reinforced transients — active memories with reference_count >= 5
|
||||
AND valid_until set. These "were temporary but now durable";
|
||||
candidates for valid_until extension (handled by a sibling script).
|
||||
|
||||
Writes results to project_state under atocore/proposals/*. Emits a
|
||||
warning alert the FIRST time a project crosses the 5-memory threshold
|
||||
(so the user gets notified without being spammed on every run).
|
||||
|
||||
Usage:
|
||||
python3 scripts/detect_emerging.py [--base-url URL] [--dry-run]
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import os
|
||||
import sys
|
||||
from collections import Counter, defaultdict
|
||||
|
||||
# src/ importable so we can reuse service helpers
|
||||
_SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
|
||||
_SRC_DIR = os.path.abspath(os.path.join(_SCRIPT_DIR, "..", "src"))
|
||||
if _SRC_DIR not in sys.path:
|
||||
sys.path.insert(0, _SRC_DIR)
|
||||
|
||||
|
||||
PROJECT_MIN_MEMORIES = int(os.environ.get("ATOCORE_EMERGING_PROJECT_MIN", "3"))
|
||||
PROJECT_ALERT_THRESHOLD = int(os.environ.get("ATOCORE_EMERGING_ALERT_THRESHOLD", "5"))
|
||||
TOP_TAGS_LIMIT = int(os.environ.get("ATOCORE_EMERGING_TOP_TAGS", "20"))
|
||||
|
||||
|
||||
def main() -> None:
|
||||
parser = argparse.ArgumentParser(description="Detect emerging projects + categories")
|
||||
parser.add_argument("--base-url", default=os.environ.get("ATOCORE_BASE_URL", "http://127.0.0.1:8100"))
|
||||
parser.add_argument("--dry-run", action="store_true", help="Report without writing to project state")
|
||||
args = parser.parse_args()
|
||||
|
||||
from atocore.memory.service import get_memories
|
||||
from atocore.projects.registry import load_project_registry
|
||||
from atocore.context.project_state import set_state, get_state
|
||||
|
||||
# Registered project ids (including aliases — a memory tagged 'p04' should
|
||||
# NOT be flagged as emerging since 'p04' is a registered alias for p04-gigabit)
|
||||
registered = set()
|
||||
for p in load_project_registry():
|
||||
registered.add(p.project_id.lower())
|
||||
for alias in p.aliases:
|
||||
registered.add(alias.lower())
|
||||
|
||||
# Pull active + candidate memories (give ourselves a broad view)
|
||||
active = get_memories(active_only=True, limit=500)
|
||||
candidates = get_memories(status="candidate", limit=500)
|
||||
all_mems = list(active) + list(candidates)
|
||||
|
||||
# --- Unregistered projects ---
|
||||
project_mems: dict[str, list] = defaultdict(list)
|
||||
for m in all_mems:
|
||||
proj = (m.project or "").strip().lower()
|
||||
if not proj or proj in registered:
|
||||
continue
|
||||
project_mems[proj].append(m)
|
||||
|
||||
unregistered = []
|
||||
for proj, mems in sorted(project_mems.items()):
|
||||
if len(mems) < PROJECT_MIN_MEMORIES:
|
||||
continue
|
||||
unregistered.append({
|
||||
"project": proj,
|
||||
"count": len(mems),
|
||||
"sample_memory_ids": [m.id for m in mems[:3]],
|
||||
"sample_contents": [(m.content or "")[:150] for m in mems[:3]],
|
||||
})
|
||||
|
||||
# --- Emerging domain_tags (only active memories — candidates might be noise) ---
|
||||
tag_counter = Counter()
|
||||
for m in active:
|
||||
for t in (m.domain_tags or []):
|
||||
if isinstance(t, str) and t.strip():
|
||||
tag_counter[t.strip().lower()] += 1
|
||||
|
||||
emerging_tags = [
|
||||
{"tag": tag, "count": cnt}
|
||||
for tag, cnt in tag_counter.most_common(TOP_TAGS_LIMIT)
|
||||
]
|
||||
|
||||
# --- Reinforced transients ---
|
||||
reinforced = []
|
||||
for m in active:
|
||||
ref_count = getattr(m, "reference_count", 0) or 0
|
||||
vu = (getattr(m, "valid_until", "") or "").strip()
|
||||
if ref_count >= 5 and vu:
|
||||
reinforced.append({
|
||||
"memory_id": m.id,
|
||||
"reference_count": ref_count,
|
||||
"valid_until": vu,
|
||||
"content_preview": (m.content or "")[:150],
|
||||
"project": m.project or "",
|
||||
})
|
||||
|
||||
# --- Output ---
|
||||
result = {
|
||||
"unregistered_projects": unregistered,
|
||||
"emerging_categories": emerging_tags,
|
||||
"reinforced_transients": reinforced,
|
||||
"counts": {
|
||||
"active_memories": len(active),
|
||||
"candidate_memories": len(candidates),
|
||||
"unregistered_project_count": len(unregistered),
|
||||
"emerging_tag_count": len(emerging_tags),
|
||||
"reinforced_transient_count": len(reinforced),
|
||||
},
|
||||
}
|
||||
|
||||
print(json.dumps(result, indent=2))
|
||||
|
||||
if args.dry_run:
|
||||
return
|
||||
|
||||
# --- Persist to project state ---
|
||||
try:
|
||||
set_state(
|
||||
project_name="atocore",
|
||||
category="proposals",
|
||||
key="unregistered_projects",
|
||||
value=json.dumps(unregistered),
|
||||
source="emerging detector",
|
||||
)
|
||||
set_state(
|
||||
project_name="atocore",
|
||||
category="proposals",
|
||||
key="emerging_categories",
|
||||
value=json.dumps(emerging_tags),
|
||||
source="emerging detector",
|
||||
)
|
||||
set_state(
|
||||
project_name="atocore",
|
||||
category="proposals",
|
||||
key="reinforced_transients",
|
||||
value=json.dumps(reinforced),
|
||||
source="emerging detector",
|
||||
)
|
||||
except Exception as e:
|
||||
print(f"WARN: failed to persist to project state: {e}", file=sys.stderr)
|
||||
|
||||
# --- Alert on NEW projects crossing alert threshold ---
|
||||
try:
|
||||
# Read previous run's projects to detect "new" ones
|
||||
prev_unregistered: list = []
|
||||
for e in get_state("atocore"):
|
||||
if e.category == "proposals" and e.key == "unregistered_projects_prev":
|
||||
try:
|
||||
prev_unregistered = json.loads(e.value)
|
||||
except Exception:
|
||||
pass
|
||||
prev_names = {p.get("project") for p in prev_unregistered if isinstance(p, dict)}
|
||||
|
||||
newly_crossed = [
|
||||
p for p in unregistered
|
||||
if p["count"] >= PROJECT_ALERT_THRESHOLD
|
||||
and p["project"] not in prev_names
|
||||
]
|
||||
if newly_crossed:
|
||||
from atocore.observability.alerts import emit_alert
|
||||
names = ", ".join(p["project"] for p in newly_crossed)
|
||||
emit_alert(
|
||||
severity="warning",
|
||||
title=f"Emerging project(s) detected: {names}",
|
||||
message=(
|
||||
f"{len(newly_crossed)} unregistered project(s) have crossed "
|
||||
f"the {PROJECT_ALERT_THRESHOLD}-memory threshold and may "
|
||||
f"warrant registration: {names}. Review at /wiki or "
|
||||
f"/admin/dashboard."
|
||||
),
|
||||
context={"projects": [p["project"] for p in newly_crossed]},
|
||||
)
|
||||
|
||||
# Persist this run's list for next-run comparison
|
||||
set_state(
|
||||
project_name="atocore",
|
||||
category="proposals",
|
||||
key="unregistered_projects_prev",
|
||||
value=json.dumps(unregistered),
|
||||
source="emerging detector",
|
||||
)
|
||||
except Exception as e:
|
||||
print(f"WARN: alert/state write failed: {e}", file=sys.stderr)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -369,6 +369,72 @@ def api_project_registration(req: ProjectRegistrationProposalRequest) -> dict:
|
||||
raise HTTPException(status_code=400, detail=str(e))
|
||||
|
||||
|
||||
class RegisterEmergingRequest(BaseModel):
|
||||
project_id: str
|
||||
description: str = ""
|
||||
aliases: list[str] | None = None
|
||||
|
||||
|
||||
@router.post("/admin/projects/register-emerging")
|
||||
def api_register_emerging_project(req: RegisterEmergingRequest) -> dict:
|
||||
"""Phase 6 C.2 — one-click register a detected emerging project.
|
||||
|
||||
Fills in sensible defaults so the user doesn't have to think about
|
||||
paths: ingest_roots defaults to vault:incoming/projects/<project_id>/
|
||||
(will be empty until the user creates content there, which is fine).
|
||||
Delegates to the existing register_project() for validation + file
|
||||
write. Clears the project from the unregistered_projects proposal
|
||||
list so it stops appearing in the dashboard.
|
||||
"""
|
||||
import json as _json
|
||||
|
||||
pid = (req.project_id or "").strip().lower()
|
||||
if not pid:
|
||||
raise HTTPException(status_code=400, detail="project_id is required")
|
||||
|
||||
aliases = req.aliases or []
|
||||
description = req.description or f"Emerging project registered from dashboard: {pid}"
|
||||
ingest_roots = [{
|
||||
"source": "vault",
|
||||
"subpath": f"incoming/projects/{pid}/",
|
||||
"label": pid,
|
||||
}]
|
||||
|
||||
try:
|
||||
result = register_project(
|
||||
project_id=pid,
|
||||
aliases=aliases,
|
||||
description=description,
|
||||
ingest_roots=ingest_roots,
|
||||
)
|
||||
except ValueError as e:
|
||||
raise HTTPException(status_code=400, detail=str(e))
|
||||
|
||||
# Clear from proposals so dashboard doesn't keep showing it
|
||||
try:
|
||||
from atocore.context.project_state import get_state, set_state
|
||||
for e in get_state("atocore"):
|
||||
if e.category == "proposals" and e.key == "unregistered_projects":
|
||||
try:
|
||||
current = _json.loads(e.value)
|
||||
except Exception:
|
||||
current = []
|
||||
filtered = [p for p in current if p.get("project") != pid]
|
||||
set_state(
|
||||
project_name="atocore",
|
||||
category="proposals",
|
||||
key="unregistered_projects",
|
||||
value=_json.dumps(filtered),
|
||||
source="register-emerging",
|
||||
)
|
||||
break
|
||||
except Exception:
|
||||
pass # non-fatal
|
||||
|
||||
result["message"] = f"Project {pid!r} registered. Now has a wiki page, system map, and killer queries."
|
||||
return result
|
||||
|
||||
|
||||
@router.put("/projects/{project_name}")
|
||||
def api_project_update(project_name: str, req: ProjectUpdateRequest) -> dict:
|
||||
"""Update an existing project registration."""
|
||||
@@ -1190,6 +1256,25 @@ def api_dashboard() -> dict:
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Phase 6 C.2: emerging-concepts proposals from the detector
|
||||
proposals: dict = {}
|
||||
try:
|
||||
for entry in get_state("atocore"):
|
||||
if entry.category != "proposals":
|
||||
continue
|
||||
try:
|
||||
data = _json.loads(entry.value)
|
||||
except Exception:
|
||||
continue
|
||||
if entry.key == "unregistered_projects":
|
||||
proposals["unregistered_projects"] = data
|
||||
elif entry.key == "emerging_categories":
|
||||
proposals["emerging_categories"] = data
|
||||
elif entry.key == "reinforced_transients":
|
||||
proposals["reinforced_transients"] = data
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Project state counts — include all registered projects
|
||||
ps_counts = {}
|
||||
try:
|
||||
@@ -1248,6 +1333,7 @@ def api_dashboard() -> dict:
|
||||
"integrity": integrity,
|
||||
"alerts": alerts,
|
||||
"recent_audit": recent_audit,
|
||||
"proposals": proposals,
|
||||
}
|
||||
|
||||
|
||||
@@ -1431,6 +1517,19 @@ def api_graduation_status() -> dict:
|
||||
return out
|
||||
|
||||
|
||||
@router.post("/admin/memory/extend-reinforced")
|
||||
def api_extend_reinforced() -> dict:
|
||||
"""Phase 6 C.3 — batch transient-to-durable extension.
|
||||
|
||||
Scans active memories with valid_until in the next 30 days and
|
||||
reference_count >= 5. Extends expiry by 90 days, or clears it
|
||||
entirely (permanent) if reference_count >= 10. Writes audit rows.
|
||||
"""
|
||||
from atocore.memory.service import extend_reinforced_valid_until
|
||||
extended = extend_reinforced_valid_until()
|
||||
return {"extended_count": len(extended), "extensions": extended}
|
||||
|
||||
|
||||
@router.get("/admin/graduation/stats")
|
||||
def api_graduation_stats() -> dict:
|
||||
"""Phase 5F graduation stats for dashboard."""
|
||||
|
||||
@@ -116,6 +116,40 @@ def render_homepage() -> str:
|
||||
lines.append('</a>')
|
||||
lines.append('</div>')
|
||||
|
||||
# Phase 6 C.2: Emerging projects section
|
||||
try:
|
||||
import json as _json
|
||||
emerging_projects = []
|
||||
state_entries = get_state("atocore")
|
||||
for e in state_entries:
|
||||
if e.category == "proposals" and e.key == "unregistered_projects":
|
||||
try:
|
||||
emerging_projects = _json.loads(e.value)
|
||||
except Exception:
|
||||
emerging_projects = []
|
||||
break
|
||||
if emerging_projects:
|
||||
lines.append('<h2>📋 Emerging</h2>')
|
||||
lines.append('<p class="emerging-intro">Projects that appear in memories but aren\'t yet registered. '
|
||||
'One click to promote them to first-class projects.</p>')
|
||||
lines.append('<div class="emerging-grid">')
|
||||
for ep in emerging_projects[:10]:
|
||||
name = ep.get("project", "?")
|
||||
count = ep.get("count", 0)
|
||||
samples = ep.get("sample_contents", [])
|
||||
samples_html = "".join(f'<li>{s[:120]}</li>' for s in samples[:2])
|
||||
lines.append(
|
||||
f'<div class="emerging-card">'
|
||||
f'<h3>{name}</h3>'
|
||||
f'<div class="emerging-count">{count} memories</div>'
|
||||
f'<ul class="emerging-samples">{samples_html}</ul>'
|
||||
f'<button class="btn-register-emerging" onclick="registerEmerging({name!r})">📌 Register as project</button>'
|
||||
f'</div>'
|
||||
)
|
||||
lines.append('</div>')
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Quick stats
|
||||
all_entities = get_entities(limit=500)
|
||||
all_memories = get_memories(active_only=True, limit=500)
|
||||
@@ -324,7 +358,41 @@ _TEMPLATE = """<!DOCTYPE html>
|
||||
.tag-badge:hover { opacity: 0.85; text-decoration: none; }
|
||||
.mem-expiry { font-size: 0.75rem; color: #d97706; font-style: italic; margin-left: 0.4rem; }
|
||||
@media (prefers-color-scheme: dark) { .mem-expiry { color: #fbbf24; } }
|
||||
/* Phase 6 C.2 — Emerging projects section */
|
||||
.emerging-intro { font-size: 0.9rem; opacity: 0.75; margin-bottom: 0.8rem; }
|
||||
.emerging-grid { display: grid; grid-template-columns: repeat(auto-fill, minmax(280px, 1fr)); gap: 1rem; margin-bottom: 1rem; }
|
||||
.emerging-card { background: var(--card); border: 1px dashed var(--accent); border-radius: 8px; padding: 1rem; }
|
||||
.emerging-card h3 { margin: 0 0 0.3rem 0; color: var(--accent); font-family: monospace; font-size: 1rem; }
|
||||
.emerging-count { font-size: 0.8rem; opacity: 0.6; margin-bottom: 0.5rem; }
|
||||
.emerging-samples { font-size: 0.85rem; margin: 0.5rem 0; padding-left: 1.2rem; opacity: 0.8; }
|
||||
.emerging-samples li { margin-bottom: 0.25rem; }
|
||||
.btn-register-emerging { width: 100%; padding: 0.45rem 0.9rem; background: var(--accent); color: white; border: 1px solid var(--accent); border-radius: 4px; cursor: pointer; font-size: 0.88rem; font-weight: 500; margin-top: 0.5rem; }
|
||||
.btn-register-emerging:hover { opacity: 0.9; }
|
||||
</style>
|
||||
<script>
|
||||
async function registerEmerging(projectId) {
|
||||
if (!confirm(`Register "${projectId}" as a first-class project?\n\nThis creates:\n• /wiki/projects/${projectId} page\n• System map + gaps + killer queries\n• Triage + graduation support\n\nIngest root defaults to vault:incoming/projects/${projectId}/`)) {
|
||||
return;
|
||||
}
|
||||
try {
|
||||
const r = await fetch('/admin/projects/register-emerging', {
|
||||
method: 'POST',
|
||||
headers: {'Content-Type': 'application/json'},
|
||||
body: JSON.stringify({project_id: projectId}),
|
||||
});
|
||||
if (r.ok) {
|
||||
const data = await r.json();
|
||||
alert(data.message || `Registered ${projectId}`);
|
||||
window.location.reload();
|
||||
} else {
|
||||
const err = await r.text();
|
||||
alert(`Registration failed: ${r.status}\n${err.substring(0, 300)}`);
|
||||
}
|
||||
} catch (e) {
|
||||
alert(`Network error: ${e.message}`);
|
||||
}
|
||||
}
|
||||
</script>
|
||||
</head>
|
||||
<body>
|
||||
{{nav}}
|
||||
|
||||
@@ -604,6 +604,93 @@ def auto_promote_reinforced(
|
||||
return promoted
|
||||
|
||||
|
||||
def extend_reinforced_valid_until(
|
||||
min_reference_count: int = 5,
|
||||
permanent_reference_count: int = 10,
|
||||
extension_days: int = 90,
|
||||
imminent_expiry_days: int = 30,
|
||||
) -> list[dict]:
|
||||
"""Phase 6 C.3 — transient-to-durable auto-extension.
|
||||
|
||||
For active memories with valid_until within the next N days AND
|
||||
reference_count >= min_reference_count: extend valid_until by
|
||||
extension_days. If reference_count >= permanent_reference_count,
|
||||
clear valid_until entirely (becomes permanent).
|
||||
|
||||
Matches the user's intuition: "something transient becomes important
|
||||
if you keep coming back to it". The system watches reinforcement
|
||||
signals and extends expiry so context packs keep seeing durable
|
||||
facts instead of letting them decay out.
|
||||
|
||||
Returns a list of {memory_id, action, old, new} dicts for each
|
||||
memory touched.
|
||||
"""
|
||||
from datetime import timedelta
|
||||
|
||||
now = datetime.now(timezone.utc)
|
||||
horizon = (now + timedelta(days=imminent_expiry_days)).strftime("%Y-%m-%d")
|
||||
new_expiry = (now + timedelta(days=extension_days)).strftime("%Y-%m-%d")
|
||||
now_str = now.strftime("%Y-%m-%d %H:%M:%S")
|
||||
|
||||
extended: list[dict] = []
|
||||
|
||||
with get_connection() as conn:
|
||||
rows = conn.execute(
|
||||
"SELECT id, valid_until, reference_count FROM memories "
|
||||
"WHERE status = 'active' "
|
||||
"AND valid_until IS NOT NULL AND valid_until != '' "
|
||||
"AND substr(valid_until, 1, 10) <= ? "
|
||||
"AND COALESCE(reference_count, 0) >= ?",
|
||||
(horizon, min_reference_count),
|
||||
).fetchall()
|
||||
|
||||
for r in rows:
|
||||
mid = r["id"]
|
||||
old_vu = r["valid_until"]
|
||||
ref_count = int(r["reference_count"] or 0)
|
||||
|
||||
if ref_count >= permanent_reference_count:
|
||||
# Permanent promotion
|
||||
conn.execute(
|
||||
"UPDATE memories SET valid_until = NULL, updated_at = ? WHERE id = ?",
|
||||
(now_str, mid),
|
||||
)
|
||||
extended.append({
|
||||
"memory_id": mid, "action": "made_permanent",
|
||||
"old_valid_until": old_vu, "new_valid_until": None,
|
||||
"reference_count": ref_count,
|
||||
})
|
||||
else:
|
||||
# 90-day extension
|
||||
conn.execute(
|
||||
"UPDATE memories SET valid_until = ?, updated_at = ? WHERE id = ?",
|
||||
(new_expiry, now_str, mid),
|
||||
)
|
||||
extended.append({
|
||||
"memory_id": mid, "action": "extended",
|
||||
"old_valid_until": old_vu, "new_valid_until": new_expiry,
|
||||
"reference_count": ref_count,
|
||||
})
|
||||
|
||||
# Audit rows via the shared framework (fail-open)
|
||||
for ex in extended:
|
||||
try:
|
||||
_audit_memory(
|
||||
memory_id=ex["memory_id"],
|
||||
action="valid_until_extended",
|
||||
actor="transient-to-durable",
|
||||
before={"valid_until": ex["old_valid_until"]},
|
||||
after={"valid_until": ex["new_valid_until"]},
|
||||
note=f"reinforced {ex['reference_count']}x; {ex['action']}",
|
||||
)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
if extended:
|
||||
log.info("reinforced_valid_until_extended", count=len(extended))
|
||||
return extended
|
||||
|
||||
|
||||
def expire_stale_candidates(
|
||||
max_age_days: int = 14,
|
||||
) -> list[str]:
|
||||
|
||||
148
tests/test_phase6_living_taxonomy.py
Normal file
148
tests/test_phase6_living_taxonomy.py
Normal file
@@ -0,0 +1,148 @@
|
||||
"""Phase 6 tests — Living Taxonomy: detector + transient-to-durable extension."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from datetime import datetime, timedelta, timezone
|
||||
|
||||
import pytest
|
||||
|
||||
from atocore.memory.service import (
|
||||
create_memory,
|
||||
extend_reinforced_valid_until,
|
||||
)
|
||||
from atocore.models.database import get_connection, init_db
|
||||
|
||||
|
||||
def _set_memory_fields(mem_id, reference_count=None, valid_until=None):
|
||||
"""Helper to force memory state for tests."""
|
||||
with get_connection() as conn:
|
||||
fields, params = [], []
|
||||
if reference_count is not None:
|
||||
fields.append("reference_count = ?")
|
||||
params.append(reference_count)
|
||||
if valid_until is not None:
|
||||
fields.append("valid_until = ?")
|
||||
params.append(valid_until)
|
||||
params.append(mem_id)
|
||||
conn.execute(
|
||||
f"UPDATE memories SET {', '.join(fields)} WHERE id = ?",
|
||||
params,
|
||||
)
|
||||
|
||||
|
||||
# --- Transient-to-durable extension (C.3) ---
|
||||
|
||||
|
||||
def test_extend_extends_imminent_valid_until(tmp_data_dir):
|
||||
init_db()
|
||||
mem = create_memory("knowledge", "Reinforced content for extension")
|
||||
soon = (datetime.now(timezone.utc) + timedelta(days=7)).strftime("%Y-%m-%d")
|
||||
_set_memory_fields(mem.id, reference_count=6, valid_until=soon)
|
||||
|
||||
result = extend_reinforced_valid_until()
|
||||
assert len(result) == 1
|
||||
assert result[0]["memory_id"] == mem.id
|
||||
assert result[0]["action"] == "extended"
|
||||
# New expiry should be ~90 days out
|
||||
new_date = datetime.strptime(result[0]["new_valid_until"], "%Y-%m-%d")
|
||||
days_out = (new_date - datetime.now(timezone.utc).replace(tzinfo=None)).days
|
||||
assert 85 <= days_out <= 92 # ~90 days, some slop for test timing
|
||||
|
||||
|
||||
def test_extend_makes_permanent_at_high_reference_count(tmp_data_dir):
|
||||
init_db()
|
||||
mem = create_memory("knowledge", "Heavy-referenced content")
|
||||
soon = (datetime.now(timezone.utc) + timedelta(days=7)).strftime("%Y-%m-%d")
|
||||
_set_memory_fields(mem.id, reference_count=15, valid_until=soon)
|
||||
|
||||
result = extend_reinforced_valid_until()
|
||||
assert len(result) == 1
|
||||
assert result[0]["action"] == "made_permanent"
|
||||
assert result[0]["new_valid_until"] is None
|
||||
|
||||
# Verify the DB reflects the cleared expiry
|
||||
with get_connection() as conn:
|
||||
row = conn.execute(
|
||||
"SELECT valid_until FROM memories WHERE id = ?", (mem.id,)
|
||||
).fetchone()
|
||||
assert row["valid_until"] is None
|
||||
|
||||
|
||||
def test_extend_skips_not_expiring_soon(tmp_data_dir):
|
||||
init_db()
|
||||
mem = create_memory("knowledge", "Far-future expiry")
|
||||
far = (datetime.now(timezone.utc) + timedelta(days=365)).strftime("%Y-%m-%d")
|
||||
_set_memory_fields(mem.id, reference_count=6, valid_until=far)
|
||||
|
||||
result = extend_reinforced_valid_until(imminent_expiry_days=30)
|
||||
assert result == []
|
||||
|
||||
|
||||
def test_extend_skips_low_reference_count(tmp_data_dir):
|
||||
init_db()
|
||||
mem = create_memory("knowledge", "Not reinforced enough")
|
||||
soon = (datetime.now(timezone.utc) + timedelta(days=7)).strftime("%Y-%m-%d")
|
||||
_set_memory_fields(mem.id, reference_count=2, valid_until=soon)
|
||||
|
||||
result = extend_reinforced_valid_until(min_reference_count=5)
|
||||
assert result == []
|
||||
|
||||
|
||||
def test_extend_skips_permanent_memory(tmp_data_dir):
|
||||
"""Memory with no valid_until is already permanent — shouldn't touch."""
|
||||
init_db()
|
||||
mem = create_memory("knowledge", "Already permanent")
|
||||
_set_memory_fields(mem.id, reference_count=20)
|
||||
# no valid_until
|
||||
|
||||
result = extend_reinforced_valid_until()
|
||||
assert result == []
|
||||
|
||||
|
||||
def test_extend_writes_audit_row(tmp_data_dir):
|
||||
init_db()
|
||||
mem = create_memory("knowledge", "Audited extension")
|
||||
soon = (datetime.now(timezone.utc) + timedelta(days=7)).strftime("%Y-%m-%d")
|
||||
_set_memory_fields(mem.id, reference_count=6, valid_until=soon)
|
||||
|
||||
extend_reinforced_valid_until()
|
||||
|
||||
from atocore.memory.service import get_memory_audit
|
||||
audit = get_memory_audit(mem.id)
|
||||
actions = [a["action"] for a in audit]
|
||||
assert "valid_until_extended" in actions
|
||||
entry = next(a for a in audit if a["action"] == "valid_until_extended")
|
||||
assert entry["actor"] == "transient-to-durable"
|
||||
|
||||
|
||||
# --- Emerging detector (smoke tests — detector runs against live DB state
|
||||
# so we test the shape of results rather than full integration here) ---
|
||||
|
||||
|
||||
def test_detector_imports_cleanly():
|
||||
"""Detector module must import without errors (it's called from nightly cron)."""
|
||||
import importlib.util
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
# Load the detector script as a module
|
||||
script = Path(__file__).resolve().parent.parent / "scripts" / "detect_emerging.py"
|
||||
assert script.exists()
|
||||
spec = importlib.util.spec_from_file_location("detect_emerging", script)
|
||||
mod = importlib.util.module_from_spec(spec)
|
||||
# Don't actually run main() — just verify it parses and defines expected names
|
||||
spec.loader.exec_module(mod)
|
||||
assert hasattr(mod, "main")
|
||||
assert hasattr(mod, "PROJECT_MIN_MEMORIES")
|
||||
assert hasattr(mod, "PROJECT_ALERT_THRESHOLD")
|
||||
|
||||
|
||||
def test_detector_handles_empty_db(tmp_data_dir):
|
||||
"""Detector should handle zero memories without crashing."""
|
||||
init_db()
|
||||
# Don't create any memories. Just verify the queries work via the service layer.
|
||||
from atocore.memory.service import get_memories
|
||||
active = get_memories(active_only=True, limit=500)
|
||||
candidates = get_memories(status="candidate", limit=500)
|
||||
assert active == []
|
||||
assert candidates == []
|
||||
Reference in New Issue
Block a user