feat: Phase 5F/5G/5H — graduation, conflicts, MCP engineering tools
The population move + the safety net + the universal consumer hookup,
all shipped together. This is where the engineering graph becomes
genuinely useful against the real 262-memory corpus.
5F: Memory → Entity graduation (THE population move)
- src/atocore/engineering/_graduation_prompt.py: stdlib-only shared
prompt module mirroring _llm_prompt.py pattern (container + host
use same system prompt, no drift)
- scripts/graduate_memories.py: host-side batch driver that asks
claude-p "does this memory describe a typed entity?" and creates
entity candidates with source_refs pointing back to the memory
- promote_entity() now scans source_refs for memory:* prefix; if
found, flips source memory to status='graduated' with
graduated_to_entity_id forward pointer + writes memory_audit row
- GET /admin/graduation/stats exposes graduation rate for dashboard
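The promote-time graduation flip described above can be sketched roughly like this. This is a minimal illustration, not the actual promote_entity implementation: the function and storage names here are assumptions, only the memory:* prefix scan and the graduated/graduated_to_entity_id fields come from this commit.

```python
# Hypothetical sketch of the post-promote graduation hook. Names like
# mark_graduated_sources and the in-memory dicts are illustrative only.
def mark_graduated_sources(entity: dict, memories: dict[str, dict]) -> list[str]:
    """Scan source_refs for memory:* prefixes; flip those memories to graduated."""
    flipped = []
    for ref in entity.get("source_refs", []):
        if not ref.startswith("memory:"):
            continue
        mem_id = ref.split(":", 1)[1]
        mem = memories.get(mem_id)
        if mem is None:
            continue
        mem["status"] = "graduated"
        mem["graduated_to_entity_id"] = entity["id"]  # forward pointer
        flipped.append(mem_id)
    return flipped
```

An entity with no memory:* refs leaves every memory untouched, which matches the "entity without memory refs promotes cleanly" test below.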
5G: Sync conflict detection on entity promote
- src/atocore/engineering/conflicts.py: detect_conflicts_for_entity()
runs on every active promote. V1 checks 3 slot kinds narrowly to
avoid false positives:
* component.material (multiple USES_MATERIAL edges)
* component.part_of (multiple PART_OF edges)
* requirement.name (duplicate active Requirements in same project)
- Conflicts + members persist via the tables built in 5A
- Fires a "warning" alert via Phase 4 framework
- Deduplicates: same (slot_kind, slot_key) won't get a new row
- resolve_conflict(action="dismiss|supersede_others|no_action"):
supersede_others marks non-winner members as status='superseded'
- GET /admin/conflicts + POST /admin/conflicts/{id}/resolve
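The dedup rule above reduces to a membership check keyed on (slot_kind, slot_key); a minimal sketch, with the row shape assumed rather than taken from the 5A schema:

```python
# Hypothetical sketch of conflict dedup: don't open a second row for the
# same (slot_kind, slot_key). The dict shape is assumed, not the 5A schema.
def should_open_conflict(existing: list[dict], slot_kind: str, slot_key: str) -> bool:
    """True only when no open conflict row already covers this slot."""
    return not any(
        c["slot_kind"] == slot_kind and c["slot_key"] == slot_key
        for c in existing
    )
```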
5H: MCP + context pack integration
- scripts/atocore_mcp.py: 7 new engineering tools exposed to every
MCP-aware client (Claude Desktop, Claude Code, Cursor, Zed):
* atocore_engineering_map (Q-001/004 system tree)
* atocore_engineering_gaps (Q-006/009/011 killer queries — THE
director's question surfaced as a built-in tool)
* atocore_engineering_requirements_for_component (Q-005)
* atocore_engineering_decisions (Q-008)
* atocore_engineering_changes (Q-013 — reads entity audit log)
* atocore_engineering_impact (Q-016 BFS downstream)
* atocore_engineering_evidence (Q-017 inbound provenance)
- MCP tools total: 14 (7 memory/state/health + 7 engineering)
- context/builder.py _build_engineering_context now appends a compact
gaps summary ("Gaps: N orphan reqs, M risky decisions, K unsupported
claims") so every project-scoped LLM call sees "what we're missing"
Tests: 341 → 356 (15 new):
- 5F: graduation prompt parses positive/negative decisions, rejects
unknown entity types, tolerates markdown fences; promote_entity
marks source memory graduated with forward pointer; entity without
memory refs promotes cleanly
- 5G: component.material + component.part_of + requirement.name
conflicts detected; clean component triggers nothing; dedup works;
supersede_others resolution marks losers; dismiss leaves both
active; end-to-end promote triggers detection
- 5H: graduation user message includes project + type + content
No regressions across the 341 prior tests. The MCP server now answers
"which p05 requirements aren't satisfied?" directly from any Claude
session — no user prompt engineering, no context hacks.
Next step (user to kick off): run the graduation script on Dalidou to
populate the graph from the 262 existing memories:
ssh papa@dalidou 'cd /srv/storage/atocore/app && PYTHONPATH=src \
python3 scripts/graduate_memories.py --project p05-interferometer --limit 30 --dry-run'
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -243,6 +243,197 @@ def _tool_health(args: dict) -> str:
    return f"AtoCore healthy: sha={sha} vectors={vectors} env={env}"


# --- Phase 5H: Engineering query tools ---

def _tool_system_map(args: dict) -> str:
    """Q-001 + Q-004: subsystem/component tree for a project."""
    project = (args.get("project") or "").strip()
    if not project:
        return "Error: 'project' is required."
    result, err = safe_call(
        http_get, f"/engineering/projects/{urllib.parse.quote(project)}/systems"
    )
    if err:
        return f"Engineering query failed: {err}"
    subs = result.get("subsystems", []) or []
    orphans = result.get("orphan_components", []) or []
    if not subs and not orphans:
        return f"No subsystems or components registered for {project}."
    lines = [f"System map for {project}:"]
    for s in subs:
        lines.append(f"\n[{s['name']}] — {s.get('description') or '(no description)'}")
        for c in s.get("components", []):
            mats = ", ".join(c.get("materials", [])) or "-"
            lines.append(f" • {c['name']} (materials: {mats})")
    if orphans:
        lines.append("\nOrphan components (not attached to any subsystem):")
        for c in orphans:
            lines.append(f" • {c['name']}")
    return "\n".join(lines)

def _tool_gaps(args: dict) -> str:
    """Q-006 + Q-009 + Q-011: find coverage gaps. Director's most-used query."""
    project = (args.get("project") or "").strip()
    if not project:
        return "Error: 'project' is required."
    result, err = safe_call(
        http_get, "/engineering/gaps",
        params={"project": project},
    )
    if err:
        return f"Gap query failed: {err}"

    orphan = result.get("orphan_requirements", {})
    risky = result.get("risky_decisions", {})
    unsup = result.get("unsupported_claims", {})

    counts = f"{orphan.get('count',0)}/{risky.get('count',0)}/{unsup.get('count',0)}"
    lines = [f"Coverage gaps for {project} (orphan reqs / risky decisions / unsupported claims: {counts}):\n"]

    if orphan.get("count", 0):
        lines.append(f"ORPHAN REQUIREMENTS ({orphan['count']}) — no component claims to satisfy:")
        for g in orphan.get("gaps", [])[:10]:
            lines.append(f" • {g['name']}: {(g.get('description') or '')[:120]}")
        lines.append("")
    if risky.get("count", 0):
        lines.append(f"RISKY DECISIONS ({risky['count']}) — based on flagged assumptions:")
        for g in risky.get("gaps", [])[:10]:
            lines.append(f" • {g['decision_name']} (assumption: {g['assumption_name']} — {g['assumption_status']})")
        lines.append("")
    if unsup.get("count", 0):
        lines.append(f"UNSUPPORTED CLAIMS ({unsup['count']}) — no Result entity backs them:")
        for g in unsup.get("gaps", [])[:10]:
            lines.append(f" • {g['name']}: {(g.get('description') or '')[:120]}")

    if orphan.get("count", 0) == 0 and risky.get("count", 0) == 0 and unsup.get("count", 0) == 0:
        lines.append("✓ No gaps detected — every requirement satisfied, no flagged assumptions, all claims have evidence.")

    return "\n".join(lines)

def _tool_requirements_for(args: dict) -> str:
    """Q-005: requirements that a component satisfies."""
    component_id = (args.get("component_id") or "").strip()
    if not component_id:
        return "Error: 'component_id' is required."
    result, err = safe_call(
        http_get, f"/engineering/components/{urllib.parse.quote(component_id)}/requirements"
    )
    if err:
        return f"Query failed: {err}"
    reqs = result.get("requirements", []) or []
    if not reqs:
        return "No requirements associated with this component."
    lines = [f"Component satisfies {len(reqs)} requirement(s):"]
    for r in reqs:
        lines.append(f" • {r['name']}: {(r.get('description') or '')[:150]}")
    return "\n".join(lines)

def _tool_decisions_affecting(args: dict) -> str:
    """Q-008: decisions affecting a project or subsystem."""
    project = (args.get("project") or "").strip()
    subsystem = args.get("subsystem_id") or args.get("subsystem") or ""
    if not project:
        return "Error: 'project' is required."
    params = {"project": project}
    if subsystem:
        params["subsystem"] = subsystem
    result, err = safe_call(http_get, "/engineering/decisions", params=params)
    if err:
        return f"Query failed: {err}"
    decisions = result.get("decisions", []) or []
    if not decisions:
        scope = f"subsystem {subsystem}" if subsystem else f"project {project}"
        return f"No decisions recorded for {scope}."
    scope = f"subsystem {subsystem}" if subsystem else project
    lines = [f"{len(decisions)} decision(s) affecting {scope}:"]
    for d in decisions:
        lines.append(f" • {d['name']}: {(d.get('description') or '')[:150]}")
    return "\n".join(lines)

def _tool_recent_changes(args: dict) -> str:
    """Q-013: what changed recently in the engineering graph."""
    project = (args.get("project") or "").strip()
    since = args.get("since") or ""
    limit = int(args.get("limit") or 20)
    if not project:
        return "Error: 'project' is required."
    params = {"project": project, "limit": limit}
    if since:
        params["since"] = since
    result, err = safe_call(http_get, "/engineering/changes", params=params)
    if err:
        return f"Query failed: {err}"
    changes = result.get("changes", []) or []
    if not changes:
        return f"No entity changes in {project} since {since or '(all time)'}."
    lines = [f"Recent changes in {project} ({len(changes)}):"]
    for c in changes:
        lines.append(
            f" [{c['timestamp'][:16]}] {c['action']:10s} "
            f"[{c.get('entity_type','?')}] {c.get('entity_name','?')} "
            f"by {c.get('actor','?')}"
        )
    return "\n".join(lines)

def _tool_impact(args: dict) -> str:
    """Q-016: impact of changing an entity (downstream BFS)."""
    entity = (args.get("entity_id") or args.get("entity") or "").strip()
    if not entity:
        return "Error: 'entity_id' is required."
    max_depth = int(args.get("max_depth") or 3)
    result, err = safe_call(
        http_get, "/engineering/impact",
        params={"entity": entity, "max_depth": max_depth},
    )
    if err:
        return f"Query failed: {err}"
    root = result.get("root") or {}
    impacted = result.get("impacted", []) or []
    if not impacted:
        return f"Nothing downstream of [{root.get('entity_type','?')}] {root.get('name','?')}."
    lines = [
        f"Changing [{root.get('entity_type')}] {root.get('name')} "
        f"would affect {len(impacted)} entity(ies) (max depth {max_depth}):"
    ]
    for i in impacted[:25]:
        indent = " " * i.get("depth", 1)
        lines.append(f"{indent}→ [{i['entity_type']}] {i['name']} (via {i['relationship']})")
    if len(impacted) > 25:
        lines.append(f" ... and {len(impacted)-25} more")
    return "\n".join(lines)

def _tool_evidence(args: dict) -> str:
    """Q-017: evidence chain for an entity."""
    entity = (args.get("entity_id") or args.get("entity") or "").strip()
    if not entity:
        return "Error: 'entity_id' is required."
    result, err = safe_call(http_get, "/engineering/evidence", params={"entity": entity})
    if err:
        return f"Query failed: {err}"
    root = result.get("root") or {}
    chain = result.get("evidence_chain", []) or []
    lines = [f"Evidence for [{root.get('entity_type','?')}] {root.get('name','?')}:"]
    if not chain:
        lines.append(" (no inbound provenance edges)")
    else:
        for e in chain:
            lines.append(
                f" {e['via']} ← [{e['source_type']}] {e['source_name']}: "
                f"{(e.get('source_description') or '')[:100]}"
            )
    refs = result.get("direct_source_refs") or []
    if refs:
        lines.append(f"\nDirect source_refs: {refs[:5]}")
    return "\n".join(lines)

TOOLS = [
    {
        "name": "atocore_context",
@@ -358,6 +549,121 @@ TOOLS = [
        "inputSchema": {"type": "object", "properties": {}},
        "handler": _tool_health,
    },
    # --- Phase 5H: Engineering knowledge graph tools ---
    {
        "name": "atocore_engineering_map",
        "description": (
            "Get the subsystem/component tree for an engineering project. "
            "Returns the full system architecture: subsystems, their components, "
            "materials, and any orphan components not attached to a subsystem. "
            "Use when the user asks about project structure or system design."
        ),
        "inputSchema": {
            "type": "object",
            "properties": {
                "project": {"type": "string", "description": "Project id (e.g. p04-gigabit)"},
            },
            "required": ["project"],
        },
        "handler": _tool_system_map,
    },
    {
        "name": "atocore_engineering_gaps",
        "description": (
            "Find coverage gaps in a project's engineering graph: orphan "
            "requirements (no component satisfies them), risky decisions "
            "(based on flagged assumptions), and unsupported claims (no "
            "Result evidence). This is the director's most useful query — "
            "answers 'what am I forgetting?' in seconds."
        ),
        "inputSchema": {
            "type": "object",
            "properties": {
                "project": {"type": "string"},
            },
            "required": ["project"],
        },
        "handler": _tool_gaps,
    },
    {
        "name": "atocore_engineering_requirements_for_component",
        "description": "List the requirements a specific component claims to satisfy (Q-005).",
        "inputSchema": {
            "type": "object",
            "properties": {
                "component_id": {"type": "string"},
            },
            "required": ["component_id"],
        },
        "handler": _tool_requirements_for,
    },
    {
        "name": "atocore_engineering_decisions",
        "description": (
            "Decisions that affect a project, optionally scoped to a specific "
            "subsystem. Use when the user asks 'what did we decide about X?'"
        ),
        "inputSchema": {
            "type": "object",
            "properties": {
                "project": {"type": "string"},
                "subsystem_id": {"type": "string", "description": "optional subsystem entity id"},
            },
            "required": ["project"],
        },
        "handler": _tool_decisions_affecting,
    },
    {
        "name": "atocore_engineering_changes",
        "description": (
            "Recent changes to the engineering graph for a project: which "
            "entities were created/promoted/rejected/updated, by whom, when. "
            "Use for 'what changed recently?' type questions."
        ),
        "inputSchema": {
            "type": "object",
            "properties": {
                "project": {"type": "string"},
                "since": {"type": "string", "description": "ISO timestamp; optional"},
                "limit": {"type": "integer", "minimum": 1, "maximum": 200, "default": 20},
            },
            "required": ["project"],
        },
        "handler": _tool_recent_changes,
    },
    {
        "name": "atocore_engineering_impact",
        "description": (
            "Impact analysis: what's downstream of a given entity. BFS over "
            "outbound relationships up to max_depth. Use to answer 'what would "
            "break if I change X?'"
        ),
        "inputSchema": {
            "type": "object",
            "properties": {
                "entity_id": {"type": "string"},
                "max_depth": {"type": "integer", "minimum": 1, "maximum": 5, "default": 3},
            },
            "required": ["entity_id"],
        },
        "handler": _tool_impact,
    },
    {
        "name": "atocore_engineering_evidence",
        "description": (
            "Evidence chain for an entity: what supports it? Walks inbound "
            "SUPPORTS / EVIDENCED_BY / DESCRIBED_BY / VALIDATED_BY / ANALYZED_BY "
            "edges. Use for 'how do we know X is true?' type questions."
        ),
        "inputSchema": {
            "type": "object",
            "properties": {
                "entity_id": {"type": "string"},
            },
            "required": ["entity_id"],
        },
        "handler": _tool_evidence,
    },
]
scripts/graduate_memories.py (new file, 237 lines)
@@ -0,0 +1,237 @@
#!/usr/bin/env python3
"""Phase 5F — Memory → Entity graduation batch pass.

Takes active memories, asks claude-p whether each describes a typed
engineering entity, and creates entity candidates for the ones that do.
Each candidate carries source_refs back to its source memory so human
review can trace provenance.

Human reviews the entity candidates via /admin/triage (same UI as memory
triage). When a candidate is promoted, a post-promote hook marks the source
memory as `graduated` and sets `graduated_to_entity_id` for traceability.

This is THE population move: without it, the engineering graph stays sparse
and the killer queries (Q-006/009/011) have nothing to find gaps in.

Usage:
    python3 scripts/graduate_memories.py --base-url http://127.0.0.1:8100 \\
        --project p05-interferometer --limit 20

    # Dry run (don't create entities, just show decisions):
    python3 scripts/graduate_memories.py --project p05-interferometer --dry-run

    # Process all active memories across all projects (big run):
    python3 scripts/graduate_memories.py --limit 200

Host-side because claude CLI lives on Dalidou, not in the container.
"""

from __future__ import annotations

import argparse
import json
import os
import shutil
import subprocess
import sys
import tempfile
import time
import urllib.error
import urllib.request
from typing import Any

# Make src/ importable so we can reuse the stdlib-only prompt module
_SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
_SRC_DIR = os.path.abspath(os.path.join(_SCRIPT_DIR, "..", "src"))
if _SRC_DIR not in sys.path:
    sys.path.insert(0, _SRC_DIR)

from atocore.engineering._graduation_prompt import (  # noqa: E402
    GRADUATION_PROMPT_VERSION,
    SYSTEM_PROMPT,
    build_user_message,
    parse_graduation_output,
)


DEFAULT_BASE_URL = os.environ.get("ATOCORE_BASE_URL", "http://127.0.0.1:8100")
DEFAULT_MODEL = os.environ.get("ATOCORE_LLM_EXTRACTOR_MODEL", "sonnet")
DEFAULT_TIMEOUT_S = float(os.environ.get("ATOCORE_GRADUATION_TIMEOUT_S", "90"))

_sandbox_cwd = None


def get_sandbox_cwd() -> str:
    """Temp cwd so claude CLI doesn't auto-discover project CLAUDE.md files."""
    global _sandbox_cwd
    if _sandbox_cwd is None:
        _sandbox_cwd = tempfile.mkdtemp(prefix="ato-graduate-")
    return _sandbox_cwd

def api_get(base_url: str, path: str) -> dict:
    req = urllib.request.Request(f"{base_url}{path}")
    with urllib.request.urlopen(req, timeout=15) as resp:
        return json.loads(resp.read().decode("utf-8"))


def api_post(base_url: str, path: str, body: dict | None = None) -> dict:
    data = json.dumps(body or {}).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}{path}", method="POST",
        headers={"Content-Type": "application/json"}, data=data,
    )
    with urllib.request.urlopen(req, timeout=15) as resp:
        return json.loads(resp.read().decode("utf-8"))

def graduate_one(memory: dict, model: str, timeout_s: float) -> dict[str, Any] | None:
    """Ask claude whether this memory describes a typed entity.

    Returns None on any failure (parse error, timeout, exit != 0).
    Applies retry + pacing to match the pattern in auto_triage/batch_extract.
    """
    if not shutil.which("claude"):
        return None

    user_msg = build_user_message(
        memory_content=memory.get("content", "") or "",
        memory_project=memory.get("project", "") or "",
        memory_type=memory.get("memory_type", "") or "",
    )

    args = [
        "claude", "-p",
        "--model", model,
        "--append-system-prompt", SYSTEM_PROMPT,
        "--disable-slash-commands",
        user_msg,
    ]

    last_error = ""
    for attempt in range(3):
        if attempt > 0:
            time.sleep(2 ** attempt)
        try:
            completed = subprocess.run(
                args, capture_output=True, text=True,
                timeout=timeout_s, cwd=get_sandbox_cwd(),
                encoding="utf-8", errors="replace",
            )
        except subprocess.TimeoutExpired:
            last_error = "timeout"
            continue
        except Exception as exc:
            last_error = f"subprocess error: {exc}"
            continue

        if completed.returncode == 0:
            return parse_graduation_output(completed.stdout or "")

        stderr = (completed.stderr or "").strip()[:200]
        last_error = f"exit_{completed.returncode}: {stderr}" if stderr else f"exit_{completed.returncode}"

    print(f" ! claude failed after 3 tries: {last_error}", file=sys.stderr)
    return None

def create_entity_candidate(
    base_url: str,
    decision: dict,
    memory: dict,
) -> str | None:
    """Create an entity candidate with source_refs pointing at the memory."""
    try:
        result = api_post(base_url, "/entities", {
            "entity_type": decision["entity_type"],
            "name": decision["name"],
            "project": memory.get("project", "") or "",
            "description": decision["description"],
            "properties": {
                "graduated_from_memory": memory["id"],
                "proposed_relationships": decision["relationships"],
                "prompt_version": GRADUATION_PROMPT_VERSION,
            },
            "status": "candidate",
            "confidence": decision["confidence"],
            "source_refs": [f"memory:{memory['id']}"],
        })
        return result.get("id")
    except Exception as e:
        print(f" ! entity create failed: {e}", file=sys.stderr)
        return None

def main() -> None:
    parser = argparse.ArgumentParser(description="Graduate active memories into entity candidates")
    parser.add_argument("--base-url", default=DEFAULT_BASE_URL)
    parser.add_argument("--model", default=DEFAULT_MODEL)
    parser.add_argument("--project", default=None, help="Only graduate memories in this project")
    parser.add_argument("--limit", type=int, default=50, help="Max memories to process")
    parser.add_argument("--min-confidence", type=float, default=0.3,
                        help="Skip memories with confidence below this (they're probably noise)")
    parser.add_argument("--dry-run", action="store_true", help="Show decisions without creating entities")
    args = parser.parse_args()

    # Fetch active memories
    query = "status=active"
    query += f"&limit={args.limit}"
    if args.project:
        query += f"&project={args.project}"
    result = api_get(args.base_url, f"/memory?{query}")
    memories = result.get("memories", [])

    # Filter by min_confidence + skip already-graduated
    memories = [m for m in memories
                if m.get("confidence", 0) >= args.min_confidence
                and m.get("status") != "graduated"]

    print(f"graduating: {len(memories)} memories project={args.project or '(all)'} "
          f"model={args.model} dry_run={args.dry_run}")

    graduated = 0
    skipped = 0
    errors = 0
    entities_created: list[str] = []

    for i, mem in enumerate(memories, 1):
        if i > 1:
            time.sleep(0.5)  # light pacing, matches auto_triage
        mid = mem["id"]
        label = f"[{i:3d}/{len(memories)}] {mid[:8]} [{mem.get('memory_type','?')}]"

        decision = graduate_one(mem, args.model, DEFAULT_TIMEOUT_S)
        if decision is None:
            print(f" ERROR {label} (graduate_one returned None)")
            errors += 1
            continue

        if not decision.get("graduate"):
            reason = decision.get("reason", "(no reason)")
            print(f" skip {label} {reason}")
            skipped += 1
            continue

        etype = decision["entity_type"]
        ename = decision["name"]
        nrel = len(decision.get("relationships", []))

        if args.dry_run:
            print(f" WOULD {label} → [{etype}] {ename!r} ({nrel} rels)")
            graduated += 1
        else:
            entity_id = create_entity_candidate(args.base_url, decision, mem)
            if entity_id:
                print(f" CREATE {label} → [{etype}] {ename!r} ({nrel} rels) entity={entity_id[:8]}")
                graduated += 1
                entities_created.append(entity_id)
            else:
                errors += 1

    print(f"\ntotal: graduated={graduated} skipped={skipped} errors={errors}")
    if entities_created:
        print(f"Review at /admin/triage ({len(entities_created)} entity candidates created)")


if __name__ == "__main__":
    main()