feat: Phase 5F/5G/5H — graduation, conflicts, MCP engineering tools

The population move + the safety net + the universal consumer hookup,
all shipped together. This is where the engineering graph becomes
genuinely useful against the real 262-memory corpus.

5F: Memory → Entity graduation (THE population move)
- src/atocore/engineering/_graduation_prompt.py: stdlib-only shared
  prompt module mirroring _llm_prompt.py pattern (container + host
  use same system prompt, no drift)
- scripts/graduate_memories.py: host-side batch driver that asks
  claude-p "does this memory describe a typed entity?" and creates
  entity candidates with source_refs pointing back to the memory
- promote_entity() now scans source_refs for memory:* prefix; if
  found, flips source memory to status='graduated' with
  graduated_to_entity_id forward pointer + writes memory_audit row
- GET /admin/graduation/stats exposes graduation rate for dashboard

5G: Sync conflict detection on entity promote
- src/atocore/engineering/conflicts.py: detect_conflicts_for_entity()
  runs on every active promote. V1 checks 3 slot kinds narrowly to
  avoid false positives:
  * component.material (multiple USES_MATERIAL edges)
  * component.part_of (multiple PART_OF edges)
  * requirement.name (duplicate active Requirements in same project)
- Conflicts + members persist via the tables built in 5A
- Fires a "warning" alert via Phase 4 framework
- Deduplicates: same (slot_kind, slot_key) won't get a new row
- resolve_conflict(action="dismiss|supersede_others|no_action"):
  supersede_others marks non-winner members as status='superseded'
- GET /admin/conflicts + POST /admin/conflicts/{id}/resolve

5H: MCP + context pack integration
- scripts/atocore_mcp.py: 7 new engineering tools exposed to every
  MCP-aware client (Claude Desktop, Claude Code, Cursor, Zed):
  * atocore_engineering_map (Q-001/004 system tree)
  * atocore_engineering_gaps (Q-006/009/011 killer queries: THE
    director's question surfaced as a built-in tool)
  * atocore_engineering_requirements_for_component (Q-005)
  * atocore_engineering_decisions (Q-008)
  * atocore_engineering_changes (Q-013, reads entity audit log)
  * atocore_engineering_impact (Q-016 BFS downstream)
  * atocore_engineering_evidence (Q-017 inbound provenance)
- MCP tools total: 14 (7 memory/state/health + 7 engineering)
- context/builder.py _build_engineering_context now appends a compact
  gaps summary ("Gaps: N orphan reqs, M risky decisions, K unsupported
  claims") so every project-scoped LLM call sees "what we're missing"

Tests: 341 → 356 (15 new):
- 5F: graduation prompt parses positive/negative decisions, rejects
  unknown entity types, tolerates markdown fences; promote_entity
  marks source memory graduated with forward pointer; entity without
  memory refs promotes cleanly
- 5G: component.material + component.part_of + requirement.name
  conflicts detected; clean component triggers nothing; dedup works;
  supersede_others resolution marks losers; dismiss leaves both
  active; end-to-end promote triggers detection
- 5H: graduation user message includes project + type + content

No regressions across the 341 prior tests. The MCP server now answers
"which p05 requirements aren't satisfied?" directly from any Claude
session: no user prompt engineering, no context hacks.

Next to kick off from user: run graduation script on Dalidou to
populate the graph from 262 existing memories:

  ssh papa@dalidou 'cd /srv/storage/atocore/app && PYTHONPATH=src \
    python3 scripts/graduate_memories.py --project p05-interferometer --limit 30 --dry-run'

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
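
The 5G notes describe a narrow check (a component with multiple USES_MATERIAL edges) plus deduplication on (slot_kind, slot_key). conflicts.py is not part of the diff shown here, so the following is only a minimal sketch of that idea. The function name `detect_material_conflicts`, the edge tuple shape, the in-memory `existing_slots` set, and the choice of the component id as the slot key are all assumptions made for illustration.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable

# Hypothetical edge shape: (source_entity_id, relationship, target_entity_id)
Edge = tuple[str, str, str]


def detect_material_conflicts(
    edges: Iterable[Edge],
    existing_slots: set[tuple[str, str]],
) -> list[dict]:
    """Flag components that point at more than one material.

    existing_slots stands in for the persisted conflict rows: a slot that is
    already recorded does not produce a second conflict.
    """
    materials: dict[str, set[str]] = defaultdict(set)
    for source_id, relationship, target_id in edges:
        if relationship == "USES_MATERIAL":
            materials[source_id].add(target_id)

    conflicts: list[dict] = []
    for component_id, targets in sorted(materials.items()):
        if len(targets) < 2:
            continue  # a single material is not a conflict
        slot = ("component.material", component_id)  # assumed slot key
        if slot in existing_slots:
            continue  # dedup: same (slot_kind, slot_key) gets no new row
        existing_slots.add(slot)
        conflicts.append(
            {"slot_kind": slot[0], "slot_key": slot[1], "members": sorted(targets)}
        )
    return conflicts
```

Running the sketch twice over the same edges with the same `existing_slots` set returns the conflict once and then nothing, which is the dedup behaviour the notes describe.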
2026-04-17 07:53:03 -04:00
parent 53b71639ad
commit 3316ff99f9
8 changed files with 1425 additions and 7 deletions
--- a/scripts/atocore_mcp.py
+++ b/scripts/atocore_mcp.py
@@ -243,6 +243,197 @@ def _tool_health(args: dict) -> str:
    return f"AtoCore healthy: sha={sha} vectors={vectors} env={env}"


+# --- Phase 5H: Engineering query tools ---
+
+
+def _tool_system_map(args: dict) -> str:
+    """Q-001 + Q-004: subsystem/component tree for a project."""
+    project = (args.get("project") or "").strip()
+    if not project:
+        return "Error: 'project' is required."
+    result, err = safe_call(
+        http_get, f"/engineering/projects/{urllib.parse.quote(project)}/systems"
+    )
+    if err:
+        return f"Engineering query failed: {err}"
+    subs = result.get("subsystems", []) or []
+    orphans = result.get("orphan_components", []) or []
+    if not subs and not orphans:
+        return f"No subsystems or components registered for {project}."
+    lines = [f"System map for {project}:"]
+    for s in subs:
+        lines.append(f"\n[{s['name']}] — {s.get('description') or '(no description)'}")
+        for c in s.get("components", []):
+            mats = ", ".join(c.get("materials", [])) or "-"
+            lines.append(f"  • {c['name']} (materials: {mats})")
+    if orphans:
+        lines.append("\nOrphan components (not attached to any subsystem):")
+        for c in orphans:
+            lines.append(f"  • {c['name']}")
+    return "\n".join(lines)
+
+
+def _tool_gaps(args: dict) -> str:
+    """Q-006 + Q-009 + Q-011: find coverage gaps. Director's most-used query."""
+    project = (args.get("project") or "").strip()
+    if not project:
+        return "Error: 'project' is required."
+    result, err = safe_call(
+        http_get, "/engineering/gaps",
+        params={"project": project},
+    )
+    if err:
+        return f"Gap query failed: {err}"
+
+    orphan = result.get("orphan_requirements", {})
+    risky = result.get("risky_decisions", {})
+    unsup = result.get("unsupported_claims", {})
+
+    counts = f"{orphan.get('count',0)}/{risky.get('count',0)}/{unsup.get('count',0)}"
+    lines = [f"Coverage gaps for {project} (orphan reqs / risky decisions / unsupported claims: {counts}):\n"]
+
+    if orphan.get("count", 0):
+        lines.append(f"ORPHAN REQUIREMENTS ({orphan['count']}) — no component claims to satisfy:")
+        for g in orphan.get("gaps", [])[:10]:
+            lines.append(f"  • {g['name']}: {(g.get('description') or '')[:120]}")
+        lines.append("")
+    if risky.get("count", 0):
+        lines.append(f"RISKY DECISIONS ({risky['count']}) — based on flagged assumptions:")
+        for g in risky.get("gaps", [])[:10]:
+            lines.append(f"  • {g['decision_name']} (assumption: {g['assumption_name']} — {g['assumption_status']})")
+        lines.append("")
+    if unsup.get("count", 0):
+        lines.append(f"UNSUPPORTED CLAIMS ({unsup['count']}) — no Result entity backs them:")
+        for g in unsup.get("gaps", [])[:10]:
+            lines.append(f"  • {g['name']}: {(g.get('description') or '')[:120]}")
+
+    if orphan.get("count", 0) == 0 and risky.get("count", 0) == 0 and unsup.get("count", 0) == 0:
+        lines.append("✓ No gaps detected — every requirement satisfied, no flagged assumptions, all claims have evidence.")
+
+    return "\n".join(lines)
+
+
+def _tool_requirements_for(args: dict) -> str:
+    """Q-005: requirements that a component satisfies."""
+    component_id = (args.get("component_id") or "").strip()
+    if not component_id:
+        return "Error: 'component_id' is required."
+    result, err = safe_call(
+        http_get, f"/engineering/components/{urllib.parse.quote(component_id)}/requirements"
+    )
+    if err:
+        return f"Query failed: {err}"
+    reqs = result.get("requirements", []) or []
+    if not reqs:
+        return "No requirements associated with this component."
+    lines = [f"Component satisfies {len(reqs)} requirement(s):"]
+    for r in reqs:
+        lines.append(f"  • {r['name']}: {(r.get('description') or '')[:150]}")
+    return "\n".join(lines)
+
+
+def _tool_decisions_affecting(args: dict) -> str:
+    """Q-008: decisions affecting a project or subsystem."""
+    project = (args.get("project") or "").strip()
+    subsystem = args.get("subsystem_id") or args.get("subsystem") or ""
+    if not project:
+        return "Error: 'project' is required."
+    params = {"project": project}
+    if subsystem:
+        params["subsystem"] = subsystem
+    result, err = safe_call(http_get, "/engineering/decisions", params=params)
+    if err:
+        return f"Query failed: {err}"
+    decisions = result.get("decisions", []) or []
+    if not decisions:
+        scope = f"subsystem {subsystem}" if subsystem else f"project {project}"
+        return f"No decisions recorded for {scope}."
+    scope = f"subsystem {subsystem}" if subsystem else project
+    lines = [f"{len(decisions)} decision(s) affecting {scope}:"]
+    for d in decisions:
+        lines.append(f"  • {d['name']}: {(d.get('description') or '')[:150]}")
+    return "\n".join(lines)
+
+
+def _tool_recent_changes(args: dict) -> str:
+    """Q-013: what changed recently in the engineering graph."""
+    project = (args.get("project") or "").strip()
+    since = args.get("since") or ""
+    limit = int(args.get("limit") or 20)
+    if not project:
+        return "Error: 'project' is required."
+    params = {"project": project, "limit": limit}
+    if since:
+        params["since"] = since
+    result, err = safe_call(http_get, "/engineering/changes", params=params)
+    if err:
+        return f"Query failed: {err}"
+    changes = result.get("changes", []) or []
+    if not changes:
+        return f"No entity changes in {project} since {since or '(all time)'}."
+    lines = [f"Recent changes in {project} ({len(changes)}):"]
+    for c in changes:
+        lines.append(
+            f"  [{c['timestamp'][:16]}] {c['action']:10s} "
+            f"[{c.get('entity_type','?')}] {c.get('entity_name','?')} "
+            f"by {c.get('actor','?')}"
+        )
+    return "\n".join(lines)
+
+
+def _tool_impact(args: dict) -> str:
+    """Q-016: impact of changing an entity (downstream BFS)."""
+    entity = (args.get("entity_id") or args.get("entity") or "").strip()
+    if not entity:
+        return "Error: 'entity_id' is required."
+    max_depth = int(args.get("max_depth") or 3)
+    result, err = safe_call(
+        http_get, "/engineering/impact",
+        params={"entity": entity, "max_depth": max_depth},
+    )
+    if err:
+        return f"Query failed: {err}"
+    root = result.get("root") or {}
+    impacted = result.get("impacted", []) or []
+    if not impacted:
+        return f"Nothing downstream of [{root.get('entity_type','?')}] {root.get('name','?')}."
+    lines = [
+        f"Changing [{root.get('entity_type','?')}] {root.get('name','?')} "
+        f"would affect {len(impacted)} entity(ies) (max depth {max_depth}):"
+    ]
+    for i in impacted[:25]:
+        indent = "  " * i.get("depth", 1)
+        lines.append(f"{indent}→ [{i['entity_type']}] {i['name']} (via {i['relationship']})")
+    if len(impacted) > 25:
+        lines.append(f"  ... and {len(impacted)-25} more")
+    return "\n".join(lines)
+
+
+def _tool_evidence(args: dict) -> str:
+    """Q-017: evidence chain for an entity."""
+    entity = (args.get("entity_id") or args.get("entity") or "").strip()
+    if not entity:
+        return "Error: 'entity_id' is required."
+    result, err = safe_call(http_get, "/engineering/evidence", params={"entity": entity})
+    if err:
+        return f"Query failed: {err}"
+    root = result.get("root") or {}
+    chain = result.get("evidence_chain", []) or []
+    lines = [f"Evidence for [{root.get('entity_type','?')}] {root.get('name','?')}:"]
+    if not chain:
+        lines.append("  (no inbound provenance edges)")
+    else:
+        for e in chain:
+            lines.append(
+                f"  {e['via']} ← [{e['source_type']}] {e['source_name']}: "
+                f"{(e.get('source_description') or '')[:100]}"
+            )
+    refs = result.get("direct_source_refs") or []
+    if refs:
+        lines.append(f"\nDirect source_refs: {refs[:5]}")
+    return "\n".join(lines)
+
+
 TOOLS = [
    {
        "name": "atocore_context",
@@ -358,6 +549,121 @@ TOOLS = [
        "inputSchema": {"type": "object", "properties": {}},
        "handler": _tool_health,
    },
+    # --- Phase 5H: Engineering knowledge graph tools ---
+    {
+        "name": "atocore_engineering_map",
+        "description": (
+            "Get the subsystem/component tree for an engineering project. "
+            "Returns the full system architecture: subsystems, their components, "
+            "materials, and any orphan components not attached to a subsystem. "
+            "Use when the user asks about project structure or system design."
+        ),
+        "inputSchema": {
+            "type": "object",
+            "properties": {
+                "project": {"type": "string", "description": "Project id (e.g. p04-gigabit)"},
+            },
+            "required": ["project"],
+        },
+        "handler": _tool_system_map,
+    },
+    {
+        "name": "atocore_engineering_gaps",
+        "description": (
+            "Find coverage gaps in a project's engineering graph: orphan "
+            "requirements (no component satisfies them), risky decisions "
+            "(based on flagged assumptions), and unsupported claims (no "
+            "Result evidence). This is the director's most useful query — "
+            "answers 'what am I forgetting?' in seconds."
+        ),
+        "inputSchema": {
+            "type": "object",
+            "properties": {
+                "project": {"type": "string"},
+            },
+            "required": ["project"],
+        },
+        "handler": _tool_gaps,
+    },
+    {
+        "name": "atocore_engineering_requirements_for_component",
+        "description": "List the requirements a specific component claims to satisfy (Q-005).",
+        "inputSchema": {
+            "type": "object",
+            "properties": {
+                "component_id": {"type": "string"},
+            },
+            "required": ["component_id"],
+        },
+        "handler": _tool_requirements_for,
+    },
+    {
+        "name": "atocore_engineering_decisions",
+        "description": (
+            "Decisions that affect a project, optionally scoped to a specific "
+            "subsystem. Use when the user asks 'what did we decide about X?'"
+        ),
+        "inputSchema": {
+            "type": "object",
+            "properties": {
+                "project": {"type": "string"},
+                "subsystem_id": {"type": "string", "description": "optional subsystem entity id"},
+            },
+            "required": ["project"],
+        },
+        "handler": _tool_decisions_affecting,
+    },
+    {
+        "name": "atocore_engineering_changes",
+        "description": (
+            "Recent changes to the engineering graph for a project: which "
+            "entities were created/promoted/rejected/updated, by whom, when. "
+            "Use for 'what changed recently?' type questions."
+        ),
+        "inputSchema": {
+            "type": "object",
+            "properties": {
+                "project": {"type": "string"},
+                "since": {"type": "string", "description": "ISO timestamp; optional"},
+                "limit": {"type": "integer", "minimum": 1, "maximum": 200, "default": 20},
+            },
+            "required": ["project"],
+        },
+        "handler": _tool_recent_changes,
+    },
+    {
+        "name": "atocore_engineering_impact",
+        "description": (
+            "Impact analysis: what's downstream of a given entity. BFS over "
+            "outbound relationships up to max_depth. Use to answer 'what would "
+            "break if I change X?'"
+        ),
+        "inputSchema": {
+            "type": "object",
+            "properties": {
+                "entity_id": {"type": "string"},
+                "max_depth": {"type": "integer", "minimum": 1, "maximum": 5, "default": 3},
+            },
+            "required": ["entity_id"],
+        },
+        "handler": _tool_impact,
+    },
+    {
+        "name": "atocore_engineering_evidence",
+        "description": (
+            "Evidence chain for an entity: what supports it? Walks inbound "
+            "SUPPORTS / EVIDENCED_BY / DESCRIBED_BY / VALIDATED_BY / ANALYZED_BY "
+            "edges. Use for 'how do we know X is true?' type questions."
+        ),
+        "inputSchema": {
+            "type": "object",
+            "properties": {
+                "entity_id": {"type": "string"},
+            },
+            "required": ["entity_id"],
+        },
+        "handler": _tool_evidence,
+    },
 ]
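
Each TOOLS entry pairs a tool name and input schema with a `handler` callable that takes an `args` dict and returns a string. The server's dispatch code is outside this hunk, so the sketch below only illustrates one plausible way such entries could be routed. `call_tool`, `EXAMPLE_TOOLS`, and `_tool_example` are hypothetical names and are not taken from atocore_mcp.py.

```python
from __future__ import annotations

from typing import Callable

ToolHandler = Callable[[dict], str]


def _tool_example(args: dict) -> str:
    """Hypothetical handler using the same argument validation style as the diff."""
    project = (args.get("project") or "").strip()
    if not project:
        return "Error: 'project' is required."
    return f"Looked up {project}."


# Same entry shape as the TOOLS list: name, inputSchema, handler.
EXAMPLE_TOOLS = [
    {
        "name": "example_lookup",
        "inputSchema": {
            "type": "object",
            "properties": {"project": {"type": "string"}},
            "required": ["project"],
        },
        "handler": _tool_example,
    },
]


def call_tool(tools: list[dict], name: str, arguments: dict | None) -> str:
    """Route a tool call to its handler; unknown names yield an error string."""
    handlers: dict[str, ToolHandler] = {t["name"]: t["handler"] for t in tools}
    handler = handlers.get(name)
    if handler is None:
        return f"Error: unknown tool '{name}'."
    return handler(arguments or {})
```

Returning error strings instead of raising matches the handlers in the diff, which report failures as text the client can show directly.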


--- a/scripts/graduate_memories.py
+++ b/scripts/graduate_memories.py
@@ -0,0 +1,237 @@
+#!/usr/bin/env python3
+"""Phase 5F — Memory → Entity graduation batch pass.
+
+Takes active memories, asks claude-p whether each describes a typed
+engineering entity, and creates entity candidates for the ones that do.
+Each candidate carries source_refs back to its source memory so human
+review can trace provenance.
+
+Human reviews the entity candidates via /admin/triage (same UI as memory
+triage). When a candidate is promoted, a post-promote hook marks the source
+memory as `graduated` and sets `graduated_to_entity_id` for traceability.
+
+This is THE population move: without it, the engineering graph stays sparse
+and the killer queries (Q-006/009/011) have nothing to find gaps in.
+
+Usage:
+  python3 scripts/graduate_memories.py --base-url http://127.0.0.1:8100 \\
+      --project p05-interferometer --limit 20
+
+  # Dry run (don't create entities, just show decisions):
+  python3 scripts/graduate_memories.py --project p05-interferometer --dry-run
+
+  # Process all active memories across all projects (big run):
+  python3 scripts/graduate_memories.py --limit 200
+
+Host-side because claude CLI lives on Dalidou, not in the container.
+"""
+
+from __future__ import annotations
+
+import argparse
+import json
+import os
+import shutil
+import subprocess
+import sys
+import tempfile
+import time
+import urllib.error
+import urllib.request
+from typing import Any
+
+# Make src/ importable so we can reuse the stdlib-only prompt module
+_SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
+_SRC_DIR = os.path.abspath(os.path.join(_SCRIPT_DIR, "..", "src"))
+if _SRC_DIR not in sys.path:
+    sys.path.insert(0, _SRC_DIR)
+
+from atocore.engineering._graduation_prompt import (  # noqa: E402
+    GRADUATION_PROMPT_VERSION,
+    SYSTEM_PROMPT,
+    build_user_message,
+    parse_graduation_output,
+)
+
+
+DEFAULT_BASE_URL = os.environ.get("ATOCORE_BASE_URL", "http://127.0.0.1:8100")
+DEFAULT_MODEL = os.environ.get("ATOCORE_LLM_EXTRACTOR_MODEL", "sonnet")
+DEFAULT_TIMEOUT_S = float(os.environ.get("ATOCORE_GRADUATION_TIMEOUT_S", "90"))
+
+_sandbox_cwd = None
+
+
+def get_sandbox_cwd() -> str:
+    """Temp cwd so claude CLI doesn't auto-discover project CLAUDE.md files."""
+    global _sandbox_cwd
+    if _sandbox_cwd is None:
+        _sandbox_cwd = tempfile.mkdtemp(prefix="ato-graduate-")
+    return _sandbox_cwd
+
+
+def api_get(base_url: str, path: str) -> dict:
+    req = urllib.request.Request(f"{base_url}{path}")
+    with urllib.request.urlopen(req, timeout=15) as resp:
+        return json.loads(resp.read().decode("utf-8"))
+
+
+def api_post(base_url: str, path: str, body: dict | None = None) -> dict:
+    data = json.dumps(body or {}).encode("utf-8")
+    req = urllib.request.Request(
+        f"{base_url}{path}", method="POST",
+        headers={"Content-Type": "application/json"}, data=data,
+    )
+    with urllib.request.urlopen(req, timeout=15) as resp:
+        return json.loads(resp.read().decode("utf-8"))
+
+
+def graduate_one(memory: dict, model: str, timeout_s: float) -> dict[str, Any] | None:
+    """Ask claude whether this memory describes a typed entity.
+
+    Returns None on any failure (parse error, timeout, exit!=0).
+    Applies retry+pacing to match the pattern in auto_triage/batch_extract.
+    """
+    if not shutil.which("claude"):
+        return None
+
+    user_msg = build_user_message(
+        memory_content=memory.get("content", "") or "",
+        memory_project=memory.get("project", "") or "",
+        memory_type=memory.get("memory_type", "") or "",
+    )
+
+    args = [
+        "claude", "-p",
+        "--model", model,
+        "--append-system-prompt", SYSTEM_PROMPT,
+        "--disable-slash-commands",
+        user_msg,
+    ]
+
+    last_error = ""
+    for attempt in range(3):
+        if attempt > 0:
+            time.sleep(2 ** attempt)
+        try:
+            completed = subprocess.run(
+                args, capture_output=True, text=True,
+                timeout=timeout_s, cwd=get_sandbox_cwd(),
+                encoding="utf-8", errors="replace",
+            )
+        except subprocess.TimeoutExpired:
+            last_error = "timeout"
+            continue
+        except Exception as exc:
+            last_error = f"subprocess error: {exc}"
+            continue
+
+        if completed.returncode == 0:
+            return parse_graduation_output(completed.stdout or "")
+
+        stderr = (completed.stderr or "").strip()[:200]
+        last_error = f"exit_{completed.returncode}: {stderr}" if stderr else f"exit_{completed.returncode}"
+
+    print(f"  ! claude failed after 3 tries: {last_error}", file=sys.stderr)
+    return None
+
+
+def create_entity_candidate(
+    base_url: str,
+    decision: dict,
+    memory: dict,
+) -> str | None:
+    """Create an entity candidate with source_refs pointing at the memory."""
+    try:
+        result = api_post(base_url, "/entities", {
+            "entity_type": decision["entity_type"],
+            "name": decision["name"],
+            "project": memory.get("project", "") or "",
+            "description": decision["description"],
+            "properties": {
+                "graduated_from_memory": memory["id"],
+                "proposed_relationships": decision["relationships"],
+                "prompt_version": GRADUATION_PROMPT_VERSION,
+            },
+            "status": "candidate",
+            "confidence": decision["confidence"],
+            "source_refs": [f"memory:{memory['id']}"],
+        })
+        return result.get("id")
+    except Exception as e:
+        print(f"  ! entity create failed: {e}", file=sys.stderr)
+        return None
+
+
+def main() -> None:
+    parser = argparse.ArgumentParser(description="Graduate active memories into entity candidates")
+    parser.add_argument("--base-url", default=DEFAULT_BASE_URL)
+    parser.add_argument("--model", default=DEFAULT_MODEL)
+    parser.add_argument("--project", default=None, help="Only graduate memories in this project")
+    parser.add_argument("--limit", type=int, default=50, help="Max memories to process")
+    parser.add_argument("--min-confidence", type=float, default=0.3,
+                        help="Skip memories with confidence below this (they're probably noise)")
+    parser.add_argument("--dry-run", action="store_true", help="Show decisions without creating entities")
+    args = parser.parse_args()
+
+    # Fetch active memories
+    query = "status=active"
+    query += f"&limit={args.limit}"
+    if args.project:
+        query += f"&project={args.project}"
+    result = api_get(args.base_url, f"/memory?{query}")
+    memories = result.get("memories", [])
+
+    # Filter by min_confidence + skip already-graduated
+    memories = [m for m in memories
+                if m.get("confidence", 0) >= args.min_confidence
+                and m.get("status") != "graduated"]
+
+    print(f"graduating: {len(memories)} memories  project={args.project or '(all)'}  "
+          f"model={args.model}  dry_run={args.dry_run}")
+
+    graduated = 0
+    skipped = 0
+    errors = 0
+    entities_created: list[str] = []
+
+    for i, mem in enumerate(memories, 1):
+        if i > 1:
+            time.sleep(0.5)  # light pacing, matches auto_triage
+        mid = mem["id"]
+        label = f"[{i:3d}/{len(memories)}] {mid[:8]} [{mem.get('memory_type','?')}]"
+
+        decision = graduate_one(mem, args.model, DEFAULT_TIMEOUT_S)
+        if decision is None:
+            print(f"  ERROR  {label}  (graduate_one returned None)")
+            errors += 1
+            continue
+
+        if not decision.get("graduate"):
+            reason = decision.get("reason", "(no reason)")
+            print(f"  skip   {label}  {reason}")
+            skipped += 1
+            continue
+
+        etype = decision["entity_type"]
+        ename = decision["name"]
+        nrel = len(decision.get("relationships", []))
+
+        if args.dry_run:
+            print(f"  WOULD  {label}  → [{etype}] {ename!r}  ({nrel} rels)")
+            graduated += 1
+        else:
+            entity_id = create_entity_candidate(args.base_url, decision, mem)
+            if entity_id:
+                print(f"  CREATE {label}  → [{etype}] {ename!r}  ({nrel} rels)  entity={entity_id[:8]}")
+                graduated += 1
+                entities_created.append(entity_id)
+            else:
+                errors += 1
+
+    print(f"\ntotal: graduated={graduated} skipped={skipped} errors={errors}")
+    if entities_created:
+        print(f"Review at /admin/triage ({len(entities_created)} entity candidates created)")
+
+
+if __name__ == "__main__":
+    main()
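
The script docstring says a post-promote hook marks the source memory as `graduated` and sets `graduated_to_entity_id`, and the candidates it creates carry `source_refs` of the form `memory:<id>`. The hook itself lives in promote_entity(), which this diff does not show. The following is a rough sketch of the ref-scanning step under stated assumptions: the dict-based memory store, the audit row shape, the function names, and the rule that only active memories are flipped are all hypothetical.

```python
from __future__ import annotations

from typing import Iterable

MEMORY_REF_PREFIX = "memory:"  # prefix written by create_entity_candidate()


def memory_ids_from_refs(source_refs: Iterable[str]) -> list[str]:
    """Return memory ids named by memory:* refs, in order, without duplicates."""
    seen: set[str] = set()
    ids: list[str] = []
    for ref in source_refs:
        if not ref.startswith(MEMORY_REF_PREFIX):
            continue  # other provenance kinds are ignored here
        memory_id = ref[len(MEMORY_REF_PREFIX):].strip()
        if memory_id and memory_id not in seen:
            seen.add(memory_id)
            ids.append(memory_id)
    return ids


def graduate_source_memories(
    memories: dict[str, dict],
    entity_id: str,
    source_refs: Iterable[str],
) -> list[dict]:
    """Flip referenced memories to 'graduated' and return audit rows.

    `memories` is an in-memory stand-in for the real store (an assumption).
    """
    audit: list[dict] = []
    for memory_id in memory_ids_from_refs(source_refs):
        memory = memories.get(memory_id)
        if memory is None or memory.get("status") != "active":
            continue  # assumed rule: only active memories graduate
        memory["status"] = "graduated"
        memory["graduated_to_entity_id"] = entity_id  # forward pointer
        audit.append(
            {"memory_id": memory_id, "action": "graduated", "entity_id": entity_id}
        )
    return audit
```

An entity whose `source_refs` contain no `memory:` entries produces no audit rows and changes nothing, which corresponds to the "entity without memory refs promotes cleanly" case in the test notes.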