Compare commits: `ba36a28453 ... akc-wiki-h` (38 commits)
| SHA1 |
|---|
| fbf3e9c806 |
| e147ab2abd |
| b94f9dff56 |
| 081c058f77 |
| 069d155585 |
| b1a3dd071e |
| 5fbd7e6094 |
| 90001c1956 |
| 6a2471d509 |
| 83b4d78cb7 |
| 9c91d778d9 |
| 6e43cc7383 |
| 877b97ec78 |
| e840ef4be3 |
| 56d5df0ab4 |
| 028d4c3594 |
| 9f262a21b0 |
| 7863ab3825 |
| 3ba49e92a9 |
| 02055e8db3 |
| cc68839306 |
| 45196f352f |
| d456d3c86a |
| 0dfecb3c14 |
| 3ca19724a5 |
| 3316ff99f9 |
| 53b71639ad |
| 07664bd743 |
| bb46e21c9b |
| 88f2f7c4e1 |
| bfa7dba4de |
| 271ee25d99 |
| d8b370fd0a |
| 86637f8eee |
| c49363fccc |
| 33a6c61ca6 |
| 33a106732f |
| 3011aa77da |
.gitignore (vendored, 1 line)

```diff
@@ -6,6 +6,7 @@ __pycache__/
 dist/
 build/
 .pytest_cache/
 .mypy_cache/
 htmlcov/
 .coverage
+venv/
```
```diff
@@ -7,9 +7,9 @@
 ## Orientation
 
 - **live_sha** (Dalidou `/health` build_sha): `775960c` (verified 2026-04-16 via /health, build_time 2026-04-16T17:59:30Z)
-- **last_updated**: 2026-04-16 by Claude ("Make It Actually Useful" sprint — observability + Phase 10)
+- **last_updated**: 2026-04-18 by Claude (Phase 7A — Memory Consolidation "sleep cycle" V1 on branch, not yet deployed)
 - **main_tip**: `999788b`
-- **test_count**: 303 (4 new Phase 10 tests)
+- **test_count**: 533 (prior 521 + 12 new wikilink/redlink tests)
 - **harness**: `17/18 PASS` on live Dalidou (p04-constraints expects "Zerodur" — retrieval content gap, not regression)
 - **vectors**: 33,253
 - **active_memories**: 84 (31 project, 23 knowledge, 10 episodic, 8 adaptation, 7 preference, 5 identity)
```
@@ -160,6 +160,24 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha

## Session Log
- **2026-04-22 Claude (pm)** Issue B (wiki redlinks) landed — last remaining P2 from Antoine's sprint plan. `_wikilink_transform(text, current_project)` in `src/atocore/engineering/wiki.py` replaces `[[Name]]` / `[[Name|Display]]` tokens (pre-markdown) with HTML anchors. Resolution order: same-project exact-name match → live `wikilink`; other-project match → live link with `(in project X)` scope indicator (`wikilink-cross`); no match → `redlink` pointing at `/wiki/new?name=<quoted>&project=<current>`. New route `GET /wiki/new` renders a pre-filled "create this entity" form that POSTs to `/v1/entities` via a minimal inline fetch() and redirects to the new entity's wiki page on success. Transform applied in `render_project` (over the mirror markdown) and `render_entity` (over the description body). CSS: dashed-underline accent for live wikilinks, red italic + dashed for redlinks. 12 new tests including the regression from the spec (entity A references `[[EntityB]]` → initial render has `class="redlink"`; after EntityB is created, re-render no longer has redlink and includes `/wiki/entities/{b.id}`). Tests 521 → 533. All 6 acceptance criteria from the sprint plan ("daily-usable") now green: retract/supersede, edit without cloning, cross-project has a home, visual evidence, wiki readable, AKC can capture reliably.
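The resolution chain above can be sketched in a few lines. This is an illustrative sketch, not the actual `_wikilink_transform` source: `resolve` stands in for the real same-project/cross-project entity lookup, and the `wikilink`/`redlink` class names mirror the CSS hooks the entry mentions.

```python
import re
from html import escape
from urllib.parse import quote

# Matches [[Name]] and [[Name|Display]] tokens (pre-markdown).
WIKILINK_RE = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")

def wikilink_transform(text, resolve):
    """Replace wikilink tokens with HTML anchors.

    `resolve(name)` is a hypothetical lookup: it returns a URL for a
    known entity, or None — in which case we emit a redlink pointing
    at the create form.
    """
    def _sub(m):
        name = m.group(1).strip()
        display = escape((m.group(2) or name).strip())
        url = resolve(name)
        if url is not None:
            return f'<a class="wikilink" href="{escape(url, quote=True)}">{display}</a>'
        return f'<a class="redlink" href="/wiki/new?name={quote(name)}">{display}</a>'
    return WIKILINK_RE.sub(_sub, text)
```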
- **2026-04-22 Claude** PATCH `/entities/{id}` + Issue D (/v1/engineering/* aliases) landed. New `update_entity()` in `src/atocore/engineering/service.py` supports partial updates to description (replace), properties (shallow merge — `null` value deletes a key), confidence (0..1, 400 on bounds violation), source_refs (append + dedup). Writes an `updated` audit row with full before/after snapshots. Forbidden via this path: entity_type / project / name / status — those require supersede+create or the dedicated status endpoints, by design. New route `PATCH /entities/{id}` aliased under `/v1`. Issue D: all 10 `/engineering/*` query paths (decisions, systems, components/{id}/requirements, changes, gaps + sub-paths, impact, evidence) added to the `/v1` allowlist. 12 new PATCH tests (merge, null-delete, confidence bounds, source_refs dedup, 404, audit row, v1 alias). Tests 509 → 521. Next: commit + deploy, then Issue B (wiki redlinks) as the last remaining P2 per Antoine's sprint order.
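The null-deletes-a-key shallow-merge semantics can be illustrated with a minimal helper (assumed behavior distilled from the entry, not the actual `update_entity()` code):

```python
def merge_properties(current, patch):
    """Shallow-merge a PATCH payload into existing properties:
    a key with value None deletes the key, any other value replaces
    it, and untouched keys are kept as-is."""
    merged = dict(current)
    for key, value in patch.items():
        if value is None:
            merged.pop(key, None)  # null value deletes the key
        else:
            merged[key] = value
    return merged
```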
- **2026-04-21 Claude (night)** Issue E (retraction path for active entities + memories) landed. Two new entity endpoints and two new memory endpoints, all aliased under `/v1`: `POST /entities/{id}/invalidate` (active→invalid, 200 idempotent on already-invalid, 409 if candidate/superseded, 404 if missing), `POST /entities/{id}/supersede` (active→superseded + auto-creates `supersedes` relationship from the new entity to the old one; rejects self-supersede and unknown superseded_by with 400), `POST /memory/{id}/invalidate`, `POST /memory/{id}/supersede`. `invalidate_memory`/`supersede_memory` in service.py now take a `reason` string that lands in the audit `note`. New service helper `invalidate_active_entity(id, reason)` returns `(ok, code)` where code is one of `invalidated | already_invalid | not_active | not_found` for a clean HTTP-status mapping. 15 new tests. Tests 494 → 509. Unblocks correction workflows — no more SQL required to retract mistakes.
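The `(ok, code)` → HTTP-status mapping the entry describes might look like this; the codes come from the entry, the table itself is an illustrative sketch, not the real handler:

```python
# Service result codes from invalidate_active_entity(id, reason),
# mapped to the HTTP statuses the endpoint returns.
STATUS_BY_CODE = {
    "invalidated": 200,      # active → invalid
    "already_invalid": 200,  # idempotent repeat
    "not_active": 409,       # candidate/superseded cannot be invalidated
    "not_found": 404,
}

def http_status(code):
    return STATUS_BY_CODE[code]
```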
- **2026-04-21 Claude (cleanup)** One-time SQL cleanup on live Dalidou: flipped 8 `status='active' → 'invalid'` rows in `entities` (CGH, tower, "interferometer mirror tower", steel, "steel (likely)" in p05-interferometer + 3 remaining `AKC-E2E-Test-*` rows that were still active). Each update paired with a `memory_audit` row (action=`invalidated`, actor=`sql-cleanup`, note references Issue E pending). Executed inside the `atocore` container via `docker exec` since `/srv/storage/atocore/data/db/atocore.db` is root-owned and the service holds write perms. Verification: `GET /entities?project=p05-interferometer&scope_only=true` now 21 active, zero pollution. Issue E (public `POST /v1/entities/{id}/invalidate` for active→invalid) remains open — this cleanup should not be needed again once E ships.
- **2026-04-21 Claude (evening)** Issue F (visual evidence) landed. New `src/atocore/assets/` module provides hash-dedup binary storage (`<assets_dir>/<hash[:2]>/<hash>.<ext>`) with on-demand JPEG thumbnails cached under `.thumbnails/<size>/`. New `assets` table (hash_sha256 unique, mime_type, size, width/height, source_refs, status). `artifact` added to `ENTITY_TYPES`; no schema change needed on entities (`properties` stays free-form JSON carrying `kind`/`asset_id`/`caption`/`capture_context`). `EVIDENCED_BY` already in the relationship enum — no change. New API: `POST /assets` (multipart, 20 MB cap, MIME allowlist: png/jpeg/webp/gif/pdf/step/iges), `GET /assets/{id}` (streams original), `GET /assets/{id}/thumbnail?size=N` (Pillow, 16-2048 px clamp), `GET /assets/{id}/meta`, `GET /admin/assets/orphans`, `DELETE /assets/{id}` (409 if referenced), `GET /entities/{id}/evidence` (returns EVIDENCED_BY artifacts with asset metadata resolved). All aliased under `/v1`. Wiki: artifact entity pages render full image + caption + capture_context; other entity pages render a "Visual evidence" strip of EVIDENCED_BY thumbnails linking to full-res + artifact detail page. PDFs render as a link; other artifact kinds render as labeled chips. Added `python-multipart` + `Pillow>=10.0.0` to deps; docker-compose gets an `${ATOCORE_ASSETS_DIR}` bind mount; Dalidou `.env` updated with `ATOCORE_ASSETS_DIR=/srv/storage/atocore/data/assets`. 16 new tests (hash dedup, size cap, mime allowlist, thumbnail cache, orphan detection, invalidate gating, multipart upload, evidence API, v1 aliases, wiki rendering). Tests 478 → 494.
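The content-addressed layout (`<assets_dir>/<hash[:2]>/<hash>.<ext>`) gives dedup by construction: uploading the same bytes twice yields the same path. A minimal sketch — `asset_path` is a hypothetical helper name, not the real assets module:

```python
import hashlib
from pathlib import Path

def asset_path(assets_dir, data, ext):
    """Return the content-addressed storage path for a binary blob:
    <assets_dir>/<sha256[:2]>/<sha256>.<ext>."""
    digest = hashlib.sha256(data).hexdigest()
    return Path(assets_dir) / digest[:2] / f"{digest}.{ext}"
```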
- **2026-04-21 Claude (pm)** Issue C (inbox + cross-project entities) landed. `inbox` is a reserved pseudo-project: auto-exists, cannot be registered/updated/aliased (enforced in `src/atocore/projects/registry.py` via `is_reserved_project` + `register_project`/`update_project` guards). `project=""` remains the cross-project/global bucket for facts that apply to every project. `resolve_project_name("inbox")` is stable and does not hit the registry. `get_entities` now scopes: `project=""` → only globals; `project="inbox"` → only inbox; `project="<real>"` default → that project plus globals; `scope_only=true` → strict. `POST /entities` accepts `project=null` as equivalent to `""`. `POST /entities/{id}/promote` accepts `{target_project}` to retarget an inbox/global lead into a real project on promote (new "retargeted" audit action). Wiki homepage shows a new "📥 Inbox & Global" section with live counts, linking to scoped `/entities` lists. 15 new tests in `test_inbox_crossproject.py` cover reserved-name enforcement, scoping rules, API shape, and promote retargeting. Tests 463 → 478. Pending: commit, push, deploy. Issue B (wiki redlinks) deferred per AKC thread — P1 cosmetic, not a blocker.
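The scoping rules can be restated as a predicate. This is a sketch of the behavior described above, not the actual `get_entities` filter; `scope_filter` is an illustrative name:

```python
def scope_filter(entity_project, project, scope_only):
    """Decide whether an entity is visible for a query scope:
    project "" → globals only; "inbox" → inbox only; a real project →
    that project plus globals, unless scope_only=True (strict)."""
    project = "" if project is None else project  # null ≡ ""
    if project in ("", "inbox") or scope_only:
        return entity_project == project
    return entity_project in (project, "")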
- **2026-04-21 Claude** Issue A (API versioning) landed on `main` working tree (not yet committed/deployed). `src/atocore/main.py` now mounts a second `/v1` router that re-registers an explicit allowlist of public handlers (`_V1_PUBLIC_PATHS`) against the same endpoint functions — entities, relationships, ingest, context/build, query, projects, memory, interactions, project/state, health, sources, stats, and their sub-paths. Unversioned paths are untouched; OpenClaw and hooks keep working. Added `tests/test_v1_aliases.py` (5 tests: health parity, projects parity, entities reachable, v1 paths present in OpenAPI, unversioned paths still present in OpenAPI) and a "API versioning" section in the README documenting the rule (new endpoints at latest prefix, breaking changes bump prefix, unversioned retained for internal callers). Tests 459 → 463. Next: commit + deploy, then relay to the AKC thread so Phase 2 can code against `/v1`. Issues B (wiki redlinks) and C (inbox/cross-project) remain open, unstarted.
- **2026-04-19 Claude** Shipped Phases 7A.1 (tiered auto-merge), 7C (tag canonicalization), 7D (confidence decay), 7I (OpenClaw context injection), UI refresh (memory/domain/activity pages + topnav), and closed the Claude Code retrieval asymmetry. Builds deployed: `028d4c3` → `56d5df0` → `e840ef4` → `877b97e` → `6e43cc7` → `9c91d77`. New capture-surface scope: Claude Code (Stop + UserPromptSubmit hooks, both installed and verified live) + OpenClaw (v0.2.0 plugin with capture + context injection, verified loaded on T420 gateway). `/wiki/capture` paste form removed from topnav; kept as labeled fallback. Anthropic API polling explicitly out of scope per user. Tests 414 → 459. `docs/capture-surfaces.md` documents the sanctioned scope.
- **2026-04-18 Claude** **Phase 7A — Memory Consolidation V1 ("sleep cycle") landed on branch.** New `docs/PHASE-7-MEMORY-CONSOLIDATION.md` covers all 8 subphases (7A dedup, 7B contradictions, 7C tag canon, 7D confidence decay, 7E memory detail, 7F domain view, 7G re-extract, 7H vector hygiene). 7A implementation: schema migration `memory_merge_candidates`, `atocore.memory.similarity` (cosine + transitive cluster), stdlib-only `atocore.memory._dedup_prompt` (LLM drafts unified content preserving all specifics), `merge_memories()` + `create_merge_candidate()` + `get_merge_candidates()` + `reject_merge_candidate()` in service.py, host-side `scripts/memory_dedup.py` (HTTP + claude -p, idempotent via sorted-id set), 5 new endpoints under `/admin/memory/merge-candidates*` + `/admin/memory/dedup-scan` + `/admin/memory/dedup-status`, purple-themed "🔗 Merge Candidates" section in /admin/triage with editable draft + approve/reject buttons, "🔗 Scan for duplicates" control bar with threshold slider, nightly Step B3 in batch-extract.sh (0.90 daily, 0.85 Sundays deep), `deploy/dalidou/dedup-watcher.sh` host watcher for UI-triggered scans (mirrors graduation-watcher pattern). 21 new tests (similarity, prompt parse, idempotency, merge happy path, override content/tags, audit rows, abort-if-source-tampered, reject leaves sources alone, schema). Tests 374 → 395. Not yet deployed; harness not re-run. Next: push + deploy, install `dedup-watcher.sh` in host cron, trigger first scan, review proposals in UI.
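Cosine similarity plus transitive closure is the core of the merge-candidate clustering. A self-contained sketch of the idea (union-find over above-threshold pairs), not the `atocore.memory.similarity` source:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def clusters(vectors, threshold):
    """Group ids whose embeddings link pairwise above `threshold`,
    closing transitively; singletons are dropped."""
    parent = {i: i for i in vectors}
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    ids = list(vectors)
    for pos, i in enumerate(ids):
        for j in ids[pos + 1:]:
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(i)] = find(j)
    groups = {}
    for i in ids:
        groups.setdefault(find(i), set()).add(i)
    return [g for g in groups.values() if len(g) > 1]
```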
- **2026-04-16 Claude** `b687e7f..999788b` **"Make It Actually Useful" sprint.** Two-part session: ops fixes then consolidation sprint.
**Part 1 — Ops fixes:** Deployed `b687e7f` (project inference from cwd). Fixed cron logging (was `/dev/null` — redirected to `~/atocore-logs/`). Fixed OpenClaw gateway crash-loop (`discord.replyToMode: "any"` invalid → `"all"`). Deployed `atocore-capture` plugin on T420 OpenClaw using `before_agent_start` + `llm_output` hooks — verified end-to-end: 38 `client=openclaw` interactions captured. Backfilled project tags on 179/181 unscoped interactions (165 atocore, 8 p06, 6 p04).
README.md (20 lines)

```diff
@@ -40,6 +40,26 @@ python scripts/atocore_client.py audit-query "gigabit" 5
 | GET | /health | Health check |
 | GET | /debug/context | Inspect last context pack |
 
+## API versioning
+
+The public contract for external clients (AKC, OpenClaw, future tools) is
+served under a `/v1` prefix. Every public endpoint is available at both its
+unversioned path and under `/v1` — e.g. `POST /entities` and `POST /v1/entities`
+route to the same handler.
+
+Rules:
+
+- New public endpoints land at the latest version prefix.
+- Backwards-compatible additions stay on the current version.
+- Breaking schema changes to an existing endpoint bump the prefix (`/v2/...`)
+  and leave the prior version in place until clients migrate.
+- Unversioned paths are retained for internal callers (hooks, scripts, the
+  wiki/admin UI). Do not rely on them from external clients — use `/v1`.
+
+The authoritative list of versioned paths is in `src/atocore/main.py`
+(`_V1_PUBLIC_PATHS`). `GET /openapi.json` reflects both the versioned and
+unversioned forms.
+
 ## Architecture
 
 ```text
```
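The aliasing rule reduces to registering every allowlisted path a second time against the same handler object. A framework-neutral sketch of that idea — the real implementation lives in `src/atocore/main.py` (`_V1_PUBLIC_PATHS`); `Router` here is a toy stand-in, and the example paths are illustrative:

```python
class Router:
    """Toy route table: path → handler."""
    def __init__(self):
        self.routes = {}

    def add(self, path, handler):
        self.routes[path] = handler

V1_PUBLIC_PATHS = {"/health", "/entities"}  # illustrative allowlist

def mount_v1_aliases(router):
    # Re-register each allowlisted path under /v1 against the SAME
    # handler — no duplication, both paths stay in sync by construction.
    for path in V1_PUBLIC_PATHS:
        router.add("/v1" + path, router.routes[path])
```

Paths not on the allowlist (admin, debug, wiki UI) stay unversioned only, which is what keeps internal callers working while external clients pin to `/v1`.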
deploy/dalidou/auto-triage-watcher.sh (new executable file, 108 lines)

```shell
#!/usr/bin/env bash
#
# deploy/dalidou/auto-triage-watcher.sh
# --------------------------------------
# Host-side watcher for on-demand auto-triage requests from the web UI.
#
# The web UI at /admin/triage has an "Auto-process queue" button that
# POSTs to /admin/triage/request-drain, which writes a timestamp to
# AtoCore project state (atocore/config/auto_triage_requested_at).
#
# This script runs on the Dalidou HOST (where the claude CLI is
# available), polls for the flag, and runs auto_triage.py when seen.
#
# Installed via cron to run every 2 minutes:
#   */2 * * * * /srv/storage/atocore/app/deploy/dalidou/auto-triage-watcher.sh
#
# Safety:
#   - Lock file prevents concurrent runs
#   - Flag is cleared after processing so one request = one run
#   - If auto_triage hangs, the lock prevents pileup until manual cleanup

set -euo pipefail

ATOCORE_URL="${ATOCORE_URL:-http://127.0.0.1:8100}"
APP_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
LOCK_FILE="/tmp/atocore-auto-triage.lock"
LOG_DIR="/home/papa/atocore-logs"
mkdir -p "$LOG_DIR"

TS="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
log() { printf '[%s] %s\n' "$TS" "$*"; }

# Fetch the request flag via API (read-only, no lock needed)
STATE_JSON=$(curl -sSf --max-time 5 "$ATOCORE_URL/project/state/atocore" 2>/dev/null || echo "{}")
REQUESTED=$(echo "$STATE_JSON" | python3 -c "
import sys, json
try:
    d = json.load(sys.stdin)
    for e in d.get('entries', d.get('state', [])):
        if e.get('category') == 'config' and e.get('key') == 'auto_triage_requested_at':
            print(e.get('value', ''))
            break
except Exception:
    pass
" 2>/dev/null || echo "")

if [[ -z "$REQUESTED" ]]; then
    # No request — silent exit
    exit 0
fi

# Acquire lock (non-blocking)
exec 9>"$LOCK_FILE" || exit 0
if ! flock -n 9; then
    log "auto-triage already running, skipping"
    exit 0
fi

# Record we're starting
curl -sSf -X POST "$ATOCORE_URL/project/state" \
    -H 'Content-Type: application/json' \
    -d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"auto_triage_running\",\"value\":\"1\",\"source\":\"host watcher\"}" \
    >/dev/null 2>&1 || true
curl -sSf -X POST "$ATOCORE_URL/project/state" \
    -H 'Content-Type: application/json' \
    -d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"auto_triage_last_started_at\",\"value\":\"$TS\",\"source\":\"host watcher\"}" \
    >/dev/null 2>&1 || true

LOG_FILE="$LOG_DIR/auto-triage-ondemand-$(date -u +%Y%m%d-%H%M%S).log"
log "Starting auto-triage (request: $REQUESTED, log: $LOG_FILE)"

# Clear the request flag FIRST so duplicate clicks queue at most one re-run
# (the next watcher tick would then see a fresh request, not this one)
curl -sSf -X DELETE "$ATOCORE_URL/project/state" \
    -H 'Content-Type: application/json' \
    -d "{\"project\":\"atocore\",\"category\":\"config\",\"key\":\"auto_triage_requested_at\"}" \
    >/dev/null 2>&1 || true

# Run the drain
cd "$APP_DIR"
export PYTHONPATH="$APP_DIR/src:${PYTHONPATH:-}"
if python3 scripts/auto_triage.py --base-url "$ATOCORE_URL" >> "$LOG_FILE" 2>&1; then
    RESULT_LINE=$(tail -5 "$LOG_FILE" | grep "total:" | tail -1 || tail -1 "$LOG_FILE")
    RESULT="${RESULT_LINE:-completed}"
    log "auto-triage finished: $RESULT"
else
    RESULT="ERROR — see $LOG_FILE"
    log "auto-triage FAILED — see $LOG_FILE"
fi

FINISH_TS="$(date -u +%Y-%m-%dT%H:%M:%SZ)"

# Mark done
curl -sSf -X POST "$ATOCORE_URL/project/state" \
    -H 'Content-Type: application/json' \
    -d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"auto_triage_running\",\"value\":\"0\",\"source\":\"host watcher\"}" \
    >/dev/null 2>&1 || true
curl -sSf -X POST "$ATOCORE_URL/project/state" \
    -H 'Content-Type: application/json' \
    -d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"auto_triage_last_finished_at\",\"value\":\"$FINISH_TS\",\"source\":\"host watcher\"}" \
    >/dev/null 2>&1 || true

# Escape quotes in result for JSON
SAFE_RESULT=$(printf '%s' "$RESULT" | python3 -c "import sys,json; print(json.dumps(sys.stdin.read())[1:-1])")
curl -sSf -X POST "$ATOCORE_URL/project/state" \
    -H 'Content-Type: application/json' \
    -d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"auto_triage_last_result\",\"value\":\"$SAFE_RESULT\",\"source\":\"host watcher\"}" \
    >/dev/null 2>&1 || true
```
deploy/dalidou/batch-extract.sh (138 lines, Normal file → Executable file)

```diff
@@ -65,15 +65,16 @@ python3 "$APP_DIR/scripts/auto_promote_reinforced.py" \
     log "WARN: auto-promote/expire failed (non-blocking)"
 }
 
-# Step C: Weekly synthesis (Sundays only)
-if [[ "$(date -u +%u)" == "7" ]]; then
-    log "Step C: weekly project synthesis"
-    python3 "$APP_DIR/scripts/synthesize_projects.py" \
-        --base-url "$ATOCORE_URL" \
-        2>&1 || {
-        log "WARN: synthesis failed (non-blocking)"
-    }
-fi
+# Step C: Daily project synthesis (keeps wiki/mirror pages fresh)
+log "Step C: project synthesis (daily)"
+python3 "$APP_DIR/scripts/synthesize_projects.py" \
+    --base-url "$ATOCORE_URL" \
+    2>&1 || {
+    log "WARN: synthesis failed (non-blocking)"
+}
 
 # Step D: Weekly lint pass (Sundays only — heavier, not needed daily)
 if [[ "$(date -u +%u)" == "7" ]]; then
     log "Step D: weekly lint pass"
     python3 "$APP_DIR/scripts/lint_knowledge_base.py" \
         --base-url "$ATOCORE_URL" \
```

```diff
@@ -149,4 +150,125 @@ print(f'Pipeline summary persisted: {json.dumps(summary)}')
     log "WARN: pipeline summary persistence failed (non-blocking)"
 }
 
+# Step F2: Emerging-concepts detector (Phase 6 C.1)
+log "Step F2: emerging-concepts detector"
+python3 "$APP_DIR/scripts/detect_emerging.py" \
+    --base-url "$ATOCORE_URL" \
+    2>&1 || {
+    log "WARN: emerging detector failed (non-blocking)"
+}
+
+# Step F3: Transient-to-durable extension (Phase 6 C.3)
+log "Step F3: transient-to-durable extension"
+curl -sSf -X POST "$ATOCORE_URL/admin/memory/extend-reinforced" \
+    -H 'Content-Type: application/json' \
+    2>&1 | tail -5 || {
+    log "WARN: extend-reinforced failed (non-blocking)"
+}
+
+# Step F4: Confidence decay on unreferenced cold memories (Phase 7D)
+# Daily: memories with reference_count=0 AND idle > 30 days → confidence × 0.97.
+# Below 0.3 → auto-supersede with audit. Reversible via reinforcement.
+log "Step F4: confidence decay"
+curl -sSf -X POST "$ATOCORE_URL/admin/memory/decay-run" \
+    -H 'Content-Type: application/json' \
+    -d '{"idle_days_threshold": 30, "daily_decay_factor": 0.97, "supersede_confidence_floor": 0.30}' \
+    2>&1 | tail -5 || {
+    log "WARN: decay-run failed (non-blocking)"
+}
+
+# Step B3: Memory dedup scan (Phase 7A)
+# Nightly at 0.90 (tight — only near-duplicates). Sundays run a deeper
+# pass at 0.85 to catch semantically-similar-but-differently-worded memories.
+if [[ "$(date -u +%u)" == "7" ]]; then
+    DEDUP_THRESHOLD="0.85"
+    DEDUP_BATCH="80"
+    log "Step B3: memory dedup (Sunday deep pass, threshold $DEDUP_THRESHOLD)"
+else
+    DEDUP_THRESHOLD="0.90"
+    DEDUP_BATCH="50"
+    log "Step B3: memory dedup (daily, threshold $DEDUP_THRESHOLD)"
+fi
+python3 "$APP_DIR/scripts/memory_dedup.py" \
+    --base-url "$ATOCORE_URL" \
+    --similarity-threshold "$DEDUP_THRESHOLD" \
+    --max-batch "$DEDUP_BATCH" \
+    2>&1 || {
+    log "WARN: memory dedup failed (non-blocking)"
+}
+
+# Step B4: Tag canonicalization (Phase 7C, weekly Sundays)
+# Autonomous: LLM proposes alias→canonical maps, auto-applies confidence >= 0.8.
+# Project tokens are protected (skipped on both sides). Borderline proposals
+# land in /admin/tags/aliases for human review.
+if [[ "$(date -u +%u)" == "7" ]]; then
+    log "Step B4: tag canonicalization (Sunday)"
+    python3 "$APP_DIR/scripts/canonicalize_tags.py" \
+        --base-url "$ATOCORE_URL" \
+        2>&1 || {
+        log "WARN: tag canonicalization failed (non-blocking)"
+    }
+fi
+
+# Step G: Integrity check (Phase 4 V1)
+log "Step G: integrity check"
+python3 "$APP_DIR/scripts/integrity_check.py" \
+    --base-url "$ATOCORE_URL" \
+    2>&1 || {
+    log "WARN: integrity check failed (non-blocking)"
+}
+
+# Step H: Pipeline-level alerts — detect conditions that warrant attention
+log "Step H: pipeline alerts"
+python3 -c "
+import json, os, sys, urllib.request
+sys.path.insert(0, '$APP_DIR/src')
+from atocore.observability.alerts import emit_alert
+
+base = '$ATOCORE_URL'
+
+def get_state(project='atocore'):
+    try:
+        req = urllib.request.Request(f'{base}/project/state/{project}')
+        resp = urllib.request.urlopen(req, timeout=10)
+        return json.loads(resp.read()).get('entries', [])
+    except Exception:
+        return []
+
+def get_dashboard():
+    try:
+        req = urllib.request.Request(f'{base}/admin/dashboard')
+        resp = urllib.request.urlopen(req, timeout=10)
+        return json.loads(resp.read())
+    except Exception:
+        return {}
+
+state = {(e['category'], e['key']): e['value'] for e in get_state()}
+dash = get_dashboard()
+
+# Harness regression check
+harness_raw = state.get(('status', 'retrieval_harness_result'))
+if harness_raw:
+    try:
+        h = json.loads(harness_raw)
+        passed, total = h.get('passed', 0), h.get('total', 0)
+        if total > 0:
+            rate = passed / total
+            if rate < 0.85:
+                emit_alert('warning', 'Retrieval harness below 85%',
+                           f'Only {passed}/{total} fixtures passing ({rate:.0%}). Failures: {h.get(\"failures\", [])[:5]}',
+                           context={'pass_rate': rate})
+    except Exception:
+        pass
+
+# Candidate queue pileup
+candidates = dash.get('memories', {}).get('candidates', 0)
+if candidates > 200:
+    emit_alert('warning', 'Candidate queue not draining',
+               f'{candidates} candidates pending. Auto-triage may be stuck or rate-limited.',
+               context={'candidates': candidates})
+
+print('pipeline alerts check complete')
+" 2>&1 || true
+
 log "=== AtoCore batch extraction + triage complete ==="
```
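The Step F4 decay rule reduces to a small pure function. A sketch under the parameters shown in the `decay-run` payload (`idle_days_threshold=30`, `daily_decay_factor=0.97`, floor `0.30`); the function name and shape are illustrative, not the server's actual code:

```python
def decay(confidence, idle_days, threshold=30, factor=0.97, floor=0.30):
    """Apply one daily decay tick to an unreferenced memory.

    Returns (new_confidence, should_supersede): memories idle for more
    than `threshold` days lose confidence by `factor` per run, and a
    value below `floor` triggers auto-supersede (with audit)."""
    if idle_days <= threshold:
        return confidence, False
    confidence *= factor
    return confidence, confidence < floor
```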
deploy/dalidou/dedup-watcher.sh (new executable file, 110 lines)

```shell
#!/usr/bin/env bash
#
# deploy/dalidou/dedup-watcher.sh
# -------------------------------
# Host-side watcher for on-demand memory dedup scans (Phase 7A).
#
# The /admin/triage page has a "🔗 Scan for duplicates" button that POSTs
# to /admin/memory/dedup-scan with {project, similarity_threshold, max_batch}.
# The container writes this to project_state (atocore/config/dedup_requested_at).
#
# This script runs on the Dalidou HOST (where claude CLI lives), polls
# for the flag, and runs memory_dedup.py when seen.
#
# Installed via cron every 2 minutes:
#   */2 * * * * /srv/storage/atocore/app/deploy/dalidou/dedup-watcher.sh \
#       >> /home/papa/atocore-logs/dedup-watcher.log 2>&1
#
# Mirrors deploy/dalidou/graduation-watcher.sh exactly.

set -euo pipefail

ATOCORE_URL="${ATOCORE_URL:-http://127.0.0.1:8100}"
APP_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
LOCK_FILE="/tmp/atocore-dedup.lock"
LOG_DIR="/home/papa/atocore-logs"
mkdir -p "$LOG_DIR"

TS="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
log() { printf '[%s] %s\n' "$TS" "$*"; }

# Fetch the flag via API
STATE_JSON=$(curl -sSf --max-time 5 "$ATOCORE_URL/project/state/atocore" 2>/dev/null || echo "{}")
REQUESTED=$(echo "$STATE_JSON" | python3 -c "
import sys, json
try:
    d = json.load(sys.stdin)
    for e in d.get('entries', d.get('state', [])):
        if e.get('category') == 'config' and e.get('key') == 'dedup_requested_at':
            print(e.get('value', ''))
            break
except Exception:
    pass
" 2>/dev/null || echo "")

if [[ -z "$REQUESTED" ]]; then
    exit 0
fi

PROJECT=$(echo "$REQUESTED" | python3 -c "import sys,json; print(json.loads(sys.stdin.read() or '{}').get('project',''))" 2>/dev/null || echo "")
THRESHOLD=$(echo "$REQUESTED" | python3 -c "import sys,json; print(json.loads(sys.stdin.read() or '{}').get('similarity_threshold',0.88))" 2>/dev/null || echo "0.88")
MAX_BATCH=$(echo "$REQUESTED" | python3 -c "import sys,json; print(json.loads(sys.stdin.read() or '{}').get('max_batch',50))" 2>/dev/null || echo "50")

# Acquire lock
exec 9>"$LOCK_FILE" || exit 0
if ! flock -n 9; then
    log "dedup already running, skipping"
    exit 0
fi

# Mark running
curl -sSf -X POST "$ATOCORE_URL/project/state" \
    -H 'Content-Type: application/json' \
    -d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"dedup_running\",\"value\":\"1\",\"source\":\"dedup watcher\"}" \
    >/dev/null 2>&1 || true
curl -sSf -X POST "$ATOCORE_URL/project/state" \
    -H 'Content-Type: application/json' \
    -d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"dedup_last_started_at\",\"value\":\"$TS\",\"source\":\"dedup watcher\"}" \
    >/dev/null 2>&1 || true

LOG_FILE="$LOG_DIR/dedup-ondemand-$(date -u +%Y%m%d-%H%M%S).log"
log "Starting dedup (project='$PROJECT' threshold=$THRESHOLD max_batch=$MAX_BATCH, log: $LOG_FILE)"

# Clear the flag BEFORE running so duplicate clicks queue at most one
curl -sSf -X DELETE "$ATOCORE_URL/project/state" \
    -H 'Content-Type: application/json' \
    -d "{\"project\":\"atocore\",\"category\":\"config\",\"key\":\"dedup_requested_at\"}" \
    >/dev/null 2>&1 || true

cd "$APP_DIR"
export PYTHONPATH="$APP_DIR/src:${PYTHONPATH:-}"
ARGS=(--base-url "$ATOCORE_URL" --similarity-threshold "$THRESHOLD" --max-batch "$MAX_BATCH")
if [[ -n "$PROJECT" ]]; then
    ARGS+=(--project "$PROJECT")
fi

if python3 scripts/memory_dedup.py "${ARGS[@]}" >> "$LOG_FILE" 2>&1; then
    RESULT=$(grep "^summary:" "$LOG_FILE" | tail -1 || tail -1 "$LOG_FILE")
    RESULT="${RESULT:-completed}"
    log "dedup finished: $RESULT"
else
    RESULT="ERROR — see $LOG_FILE"
    log "dedup FAILED"
fi

FINISH_TS="$(date -u +%Y-%m-%dT%H:%M:%SZ)"

curl -sSf -X POST "$ATOCORE_URL/project/state" \
    -H 'Content-Type: application/json' \
    -d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"dedup_running\",\"value\":\"0\",\"source\":\"dedup watcher\"}" \
    >/dev/null 2>&1 || true
curl -sSf -X POST "$ATOCORE_URL/project/state" \
    -H 'Content-Type: application/json' \
    -d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"dedup_last_finished_at\",\"value\":\"$FINISH_TS\",\"source\":\"dedup watcher\"}" \
    >/dev/null 2>&1 || true

SAFE_RESULT=$(printf '%s' "$RESULT" | python3 -c "import sys,json; print(json.dumps(sys.stdin.read())[1:-1])")
curl -sSf -X POST "$ATOCORE_URL/project/state" \
    -H 'Content-Type: application/json' \
    -d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"dedup_last_result\",\"value\":\"$SAFE_RESULT\",\"source\":\"dedup watcher\"}" \
    >/dev/null 2>&1 || true
```
```diff
@@ -27,6 +27,7 @@ services:
       - ${ATOCORE_BACKUP_DIR}:${ATOCORE_BACKUP_DIR}
       - ${ATOCORE_RUN_DIR}:${ATOCORE_RUN_DIR}
       - ${ATOCORE_PROJECT_REGISTRY_DIR}:${ATOCORE_PROJECT_REGISTRY_DIR}
+      - ${ATOCORE_ASSETS_DIR}:${ATOCORE_ASSETS_DIR}
       - ${ATOCORE_VAULT_SOURCE_DIR}:${ATOCORE_VAULT_SOURCE_DIR}:ro
       - ${ATOCORE_DRIVE_SOURCE_DIR}:${ATOCORE_DRIVE_SOURCE_DIR}:ro
     healthcheck:
```
deploy/dalidou/graduation-watcher.sh (new executable file, 117 lines)

```shell
#!/usr/bin/env bash
#
# deploy/dalidou/graduation-watcher.sh
# ------------------------------------
# Host-side watcher for on-demand memory→entity graduation from the web UI.
#
# The /admin/triage page has a "🎓 Graduate memories" button that POSTs
# to /admin/graduation/request with {project, limit}. The container
# writes this to project_state (atocore/config/graduation_requested_at).
#
# This script runs on the Dalidou HOST (where claude CLI lives), polls
# for the flag, and runs graduate_memories.py when seen.
#
# Installed via cron every 2 minutes:
#   */2 * * * * /srv/storage/atocore/app/deploy/dalidou/graduation-watcher.sh \
#       >> /home/papa/atocore-logs/graduation-watcher.log 2>&1
#
# Safety:
#   - Lock file prevents concurrent runs
#   - Flag cleared before processing so duplicate clicks queue at most one re-run
#   - Fail-open: any error logs but doesn't break the host

set -euo pipefail

ATOCORE_URL="${ATOCORE_URL:-http://127.0.0.1:8100}"
APP_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
LOCK_FILE="/tmp/atocore-graduation.lock"
LOG_DIR="/home/papa/atocore-logs"
mkdir -p "$LOG_DIR"

TS="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
log() { printf '[%s] %s\n' "$TS" "$*"; }

# Fetch the flag via API
STATE_JSON=$(curl -sSf --max-time 5 "$ATOCORE_URL/project/state/atocore" 2>/dev/null || echo "{}")
REQUESTED=$(echo "$STATE_JSON" | python3 -c "
import sys, json
try:
    d = json.load(sys.stdin)
    for e in d.get('entries', d.get('state', [])):
        if e.get('category') == 'config' and e.get('key') == 'graduation_requested_at':
            print(e.get('value', ''))
            break
except Exception:
    pass
" 2>/dev/null || echo "")

if [[ -z "$REQUESTED" ]]; then
    exit 0
fi

# Parse JSON: {project, limit, requested_at}
PROJECT=$(echo "$REQUESTED" | python3 -c "import sys,json; print(json.loads(sys.stdin.read() or '{}').get('project',''))" 2>/dev/null || echo "")
LIMIT=$(echo "$REQUESTED" | python3 -c "import sys,json; print(json.loads(sys.stdin.read() or '{}').get('limit',30))" 2>/dev/null || echo "30")
```
||||
|
||||
# Acquire lock
|
||||
exec 9>"$LOCK_FILE" || exit 0
|
||||
if ! flock -n 9; then
|
||||
log "graduation already running, skipping"
|
||||
exit 0
|
||||
fi
|
||||
|
||||
# Mark running
|
||||
curl -sSf -X POST "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"graduation_running\",\"value\":\"1\",\"source\":\"host watcher\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
curl -sSf -X POST "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"graduation_last_started_at\",\"value\":\"$TS\",\"source\":\"host watcher\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
|
||||
LOG_FILE="$LOG_DIR/graduation-ondemand-$(date -u +%Y%m%d-%H%M%S).log"
|
||||
log "Starting graduation (project='$PROJECT' limit=$LIMIT, log: $LOG_FILE)"
|
||||
|
||||
# Clear the flag BEFORE running so duplicate clicks queue at most one
|
||||
curl -sSf -X DELETE "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"config\",\"key\":\"graduation_requested_at\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
|
||||
# Build script args
|
||||
cd "$APP_DIR"
|
||||
export PYTHONPATH="$APP_DIR/src:${PYTHONPATH:-}"
|
||||
ARGS=(--base-url "$ATOCORE_URL" --limit "$LIMIT")
|
||||
if [[ -n "$PROJECT" ]]; then
|
||||
ARGS+=(--project "$PROJECT")
|
||||
fi
|
||||
|
||||
if python3 scripts/graduate_memories.py "${ARGS[@]}" >> "$LOG_FILE" 2>&1; then
|
||||
RESULT=$(tail -3 "$LOG_FILE" | grep "^total:" | tail -1 || tail -1 "$LOG_FILE")
|
||||
RESULT="${RESULT:-completed}"
|
||||
log "graduation finished: $RESULT"
|
||||
else
|
||||
RESULT="ERROR — see $LOG_FILE"
|
||||
log "graduation FAILED"
|
||||
fi
|
||||
|
||||
FINISH_TS="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
|
||||
|
||||
# Mark done
|
||||
curl -sSf -X POST "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"graduation_running\",\"value\":\"0\",\"source\":\"host watcher\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
curl -sSf -X POST "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"graduation_last_finished_at\",\"value\":\"$FINISH_TS\",\"source\":\"host watcher\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
|
||||
SAFE_RESULT=$(printf '%s' "$RESULT" | python3 -c "import sys,json; print(json.dumps(sys.stdin.read())[1:-1])")
|
||||
curl -sSf -X POST "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"graduation_last_result\",\"value\":\"$SAFE_RESULT\",\"source\":\"host watcher\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
64 deploy/dalidou/hourly-extract.sh Executable file
@@ -0,0 +1,64 @@
#!/usr/bin/env bash
#
# deploy/dalidou/hourly-extract.sh
# ---------------------------------
# Lightweight hourly extraction + triage so autonomous capture stays
# current (not a 24h-latency nightly-only affair).
#
# Does ONLY:
#   Step A: LLM extraction over recent interactions (last 2h window)
#   Step B: 3-tier auto-triage on the resulting candidates
#
# Skips the heavy nightly stuff (backup, rsync, OpenClaw import,
# synthesis, harness, integrity check, emerging detector). Those stay
# in cron-backup.sh at 03:00 UTC.
#
# Runs every hour via cron:
#   0 * * * * /srv/storage/atocore/app/deploy/dalidou/hourly-extract.sh \
#     >> /home/papa/atocore-logs/hourly-extract.log 2>&1
#
# Lock file prevents overlap if a previous run is still going (which
# can happen if the claude CLI rate-limits and retries).

set -euo pipefail

ATOCORE_URL="${ATOCORE_URL:-http://127.0.0.1:8100}"
# 50 recent interactions is enough for an hour — typical usage is under 20/h.
LIMIT="${ATOCORE_HOURLY_EXTRACT_LIMIT:-50}"
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
APP_DIR="$(cd "$SCRIPT_DIR/../.." && pwd)"
TIMESTAMP="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
LOCK_FILE="/tmp/atocore-hourly-extract.lock"

log() { printf '[%s] %s\n' "$TIMESTAMP" "$*"; }

# Acquire lock (non-blocking)
exec 9>"$LOCK_FILE" || exit 0
if ! flock -n 9; then
    log "hourly extract already running, skipping"
    exit 0
fi

export PYTHONPATH="$APP_DIR/src:${PYTHONPATH:-}"

log "=== hourly extract+triage starting ==="

# Step A — Extract candidates from recent interactions
log "Step A: LLM extraction (since last run)"
python3 "$APP_DIR/scripts/batch_llm_extract_live.py" \
    --base-url "$ATOCORE_URL" \
    --limit "$LIMIT" \
    2>&1 || {
    log "WARN: batch extraction failed (non-blocking)"
}

# Step B — 3-tier auto-triage (sonnet → opus → discard)
log "Step B: auto-triage (3-tier)"
python3 "$APP_DIR/scripts/auto_triage.py" \
    --base-url "$ATOCORE_URL" \
    --max-batches 3 \
    2>&1 || {
    log "WARN: auto-triage failed (non-blocking)"
}

log "=== hourly extract+triage complete ==="
0 deploy/hooks/capture_stop.py Normal file → Executable file
174 deploy/hooks/inject_context.py Executable file
@@ -0,0 +1,174 @@
#!/usr/bin/env python3
"""Claude Code UserPromptSubmit hook: inject AtoCore context.

Mirrors the OpenClaw 7I pattern on the Claude Code side. Every user
prompt submitted to Claude Code is (a) sent to /context/build on the
AtoCore API, and (b) the returned context pack is prepended to the
prompt the LLM sees — so Claude Code answers grounded in what AtoCore
already knows, same as OpenClaw now does.

Contract per Claude Code hooks spec:
  stdin: JSON with `prompt`, `session_id`, `transcript_path`, `cwd`,
         `hook_event_name`, etc.
  stdout on success: JSON
    {"hookSpecificOutput":
        {"hookEventName": "UserPromptSubmit",
         "additionalContext": "<pack>"}}
  exit 0 always — fail open. An unreachable AtoCore must never block
  the user's prompt.

Environment variables:
  ATOCORE_URL               base URL (default http://dalidou:8100)
  ATOCORE_CONTEXT_DISABLED  set to "1" to disable injection
  ATOCORE_CONTEXT_BUDGET    max chars of injected pack (default 4000)
  ATOCORE_CONTEXT_TIMEOUT   HTTP timeout in seconds (default 5)

Usage in ~/.claude/settings.json:
  "UserPromptSubmit": [{
      "matcher": "",
      "hooks": [{
          "type": "command",
          "command": "python /path/to/inject_context.py",
          "timeout": 10
      }]
  }]
"""

from __future__ import annotations

import json
import os
import sys
import urllib.error
import urllib.request

ATOCORE_URL = os.environ.get("ATOCORE_URL", "http://dalidou:8100")
CONTEXT_TIMEOUT = float(os.environ.get("ATOCORE_CONTEXT_TIMEOUT", "5"))
CONTEXT_BUDGET = int(os.environ.get("ATOCORE_CONTEXT_BUDGET", "4000"))

# Don't spend an API call on trivial acks or slash commands.
MIN_PROMPT_LENGTH = 15


# Project inference table — kept in sync with capture_stop.py so both
# hooks agree on what project a Claude Code session belongs to.
_VAULT = "C:\\Users\\antoi\\antoine\\My Libraries\\Antoine Brain Extension"
_PROJECT_PATH_MAP: dict[str, str] = {
    f"{_VAULT}\\2-Projects\\P04-GigaBIT-M1": "p04-gigabit",
    f"{_VAULT}\\2-Projects\\P10-Interferometer": "p05-interferometer",
    f"{_VAULT}\\2-Projects\\P11-Polisher-Fullum": "p06-polisher",
    f"{_VAULT}\\2-Projects\\P08-ABB-Space-Mirror": "abb-space",
    f"{_VAULT}\\2-Projects\\I01-Atomizer": "atomizer-v2",
    f"{_VAULT}\\2-Projects\\I02-AtoCore": "atocore",
    "C:\\Users\\antoi\\ATOCore": "atocore",
    "C:\\Users\\antoi\\Polisher-Sim": "p06-polisher",
    "C:\\Users\\antoi\\Fullum-Interferometer": "p05-interferometer",
    "C:\\Users\\antoi\\Atomizer-V2": "atomizer-v2",
}


def _infer_project(cwd: str) -> str:
    if not cwd:
        return ""
    norm = os.path.normpath(cwd).lower()
    for path_prefix, project_id in _PROJECT_PATH_MAP.items():
        if norm.startswith(os.path.normpath(path_prefix).lower()):
            return project_id
    return ""


def _emit_empty() -> None:
    """Exit 0 with no additionalContext — equivalent to a no-op."""
    sys.exit(0)


def _emit_context(pack: str) -> None:
    """Write the hook output JSON and exit 0."""
    out = {
        "hookSpecificOutput": {
            "hookEventName": "UserPromptSubmit",
            "additionalContext": pack,
        }
    }
    sys.stdout.write(json.dumps(out))
    sys.exit(0)


def main() -> None:
    if os.environ.get("ATOCORE_CONTEXT_DISABLED") == "1":
        _emit_empty()

    try:
        raw = sys.stdin.read()
        if not raw.strip():
            _emit_empty()
        hook_data = json.loads(raw)
    except Exception as exc:
        # Bad stdin → nothing to do
        print(f"inject_context: bad stdin: {exc}", file=sys.stderr)
        _emit_empty()

    prompt = (hook_data.get("prompt") or "").strip()
    cwd = hook_data.get("cwd", "")

    if len(prompt) < MIN_PROMPT_LENGTH:
        _emit_empty()

    # Skip meta / system prompts that start with '<' (XML tags etc.)
    if prompt.startswith("<"):
        _emit_empty()

    project = _infer_project(cwd)

    body = json.dumps({
        "prompt": prompt,
        "project": project,
        "char_budget": CONTEXT_BUDGET,
    }).encode("utf-8")

    req = urllib.request.Request(
        f"{ATOCORE_URL}/context/build",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

    try:
        with urllib.request.urlopen(req, timeout=CONTEXT_TIMEOUT) as resp:
            data = json.loads(resp.read().decode("utf-8"))
    except urllib.error.URLError as exc:
        # AtoCore unreachable — fail open
        print(f"inject_context: atocore unreachable: {exc}", file=sys.stderr)
        _emit_empty()
    except Exception as exc:
        print(f"inject_context: request failed: {exc}", file=sys.stderr)
        _emit_empty()

    pack = (data.get("formatted_context") or "").strip()
    if not pack:
        _emit_empty()

    # Safety truncate. /context/build respects the budget we sent, but
    # be defensive in case of a regression.
    if len(pack) > CONTEXT_BUDGET + 500:
        pack = pack[:CONTEXT_BUDGET] + "\n\n[context truncated]"

    # Wrap so the LLM knows this is injected grounding, not user text.
    wrapped = (
        "---\n"
        "AtoCore-injected context for this prompt "
        f"(project={project or '(none)'}):\n\n"
        f"{pack}\n"
        "---"
    )

    print(
        f"inject_context: injected {len(pack)} chars "
        f"(project={project or 'none'}, prompt_chars={len(prompt)})",
        file=sys.stderr,
    )
    _emit_context(wrapped)


if __name__ == "__main__":
    main()
284 docs/MASTER-BRAIN-PLAN.md Normal file
@@ -0,0 +1,284 @@
# AtoCore Master Brain Plan

> Vision: AtoCore becomes the **single source of truth** that grounds every LLM
> interaction across the entire ecosystem (Claude, OpenClaw, Codex, Ollama, future
> agents). Every prompt is automatically enriched with full project context. The
> brain self-grows from daily work, auto-organizes its metadata, and stays
> flawlessly reliable.

## The Core Insight

AtoCore today is a **well-architected capture + curation system with a critical
gap on the consumption side**. We pour water into the bucket (capture from the
Claude Code Stop hook + OpenClaw message hooks) but nothing is drinking from it
at prompt time. Fixing that gap is the single highest-leverage move.

**Once every LLM call is AtoCore-grounded automatically, the feedback loop
closes**: LLMs use the context → produce better responses → those responses
reference the injected memories → reinforcement fires → knowledge curates
itself. The capture side is already working. The pull side is what's missing.

## Universal Consumption Strategy

MCP is great for Claude (Claude Desktop, Claude Code, Cursor, Zed, Windsurf) but
is **not universal**. OpenClaw has its own plugin SDK. Codex, Ollama, and GPT
don't natively support MCP. The right strategy:

**The HTTP API is the truth; every client gets the thinnest possible adapter.**

```
                 ┌─────────────────────┐
                 │  AtoCore HTTP API   │ ← canonical interface
                 │   /context/build    │
                 │   /query            │
                 │   /memory           │
                 │   /project/state    │
                 └──────────┬──────────┘
                            │
   ┌────────────┬───────────┼──────────┬───────────┐
   │            │           │          │           │
┌──┴───┐   ┌────┴────┐  ┌───┴───┐  ┌───┴────┐  ┌───┴────┐
│ MCP  │   │OpenClaw │  │Claude │  │ Codex  │  │ Ollama │
│server│   │ plugin  │  │ Code  │  │ skill  │  │ proxy  │
│      │   │ (pull)  │  │ hook  │  │        │  │        │
└──┬───┘   └────┬────┘  └───┬───┘  └────┬───┘  └────┬───┘
   │            │           │           │           │
Claude      OpenClaw    Claude Code  Codex CLI   Ollama
Desktop,     agent                               local
Cursor,                                          models
Zed,
Windsurf
```

Each adapter's only job: accept a prompt, call the AtoCore HTTP API, and prepend
the returned context pack. The adapter itself carries no logic.
## Three Integration Tiers

### Tier 1: MCP-native clients (Claude ecosystem)
Build **atocore-mcp** — a standalone MCP server that wraps the HTTP API. Exposes:
- `context(query, project)` → context pack
- `search(query)` → raw retrieval
- `remember(type, content, project)` → create candidate memory
- `recall(project, key)` → project state lookup
- `list_projects()` → registered projects

Works with Claude Desktop, Claude Code (via `claude mcp add atocore`), Cursor,
Zed, and Windsurf without any per-client work beyond config.

### Tier 2: Custom plugin ecosystems (OpenClaw)
Extend the existing `atocore-capture` plugin on T420 to also register a
**`before_prompt_build`** hook that pulls context from AtoCore and injects it
into the agent's system prompt. The plugin already has the HTTP client, the
authentication, and the fail-open pattern. This is ~30 lines of added code.

### Tier 3: Everything else (Codex, Ollama, custom agents)
For clients without plugin/hook systems, ship a **thin proxy/middleware** the
user configures as the LLM endpoint:
- `atocore-proxy` listens on `localhost:PORT`
- Intercepts OpenAI-compatible chat/completion calls
- Pulls context from AtoCore, injects it into the system prompt
- Forwards to the real model endpoint (OpenAI, Ollama, Anthropic, etc.)
- Returns the response, then captures the interaction back to AtoCore

This makes AtoCore a "drop-in" layer for anything that speaks
OpenAI-compatible HTTP — which is nearly every modern LLM runtime.
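The proxy's core injection step can be sketched as a pure function over the standard OpenAI-style `messages` array. This is a minimal illustration, not the real `atocore-proxy` code; `inject_pack` and the `"AtoCore context:"` framing are assumptions, and fetching the pack from `/context/build` is left out.

```python
# Sketch of the proxy's injection step: given an OpenAI-style chat payload
# and a context pack already fetched from AtoCore, prepend the pack as (or
# into) the system message. Names here are illustrative.
def inject_pack(payload: dict, pack: str) -> dict:
    """Return a copy of the chat payload with the context pack prepended."""
    messages = list(payload.get("messages", []))
    grounding = f"AtoCore context:\n{pack}"
    if messages and messages[0].get("role") == "system":
        # Extend the existing system message rather than adding a second one.
        messages[0] = {
            "role": "system",
            "content": f"{grounding}\n\n{messages[0]['content']}",
        }
    else:
        messages.insert(0, {"role": "system", "content": grounding})
    return {**payload, "messages": messages}
```

The original payload is left untouched, so the proxy can still log the user's raw request when capturing the interaction back to AtoCore.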

## Knowledge Density Plan

The brain is only as smart as what it knows. Current state: 80 active memories
across 6 projects, with 324 candidates in the queue being processed. Target:
**1,000+ curated memories** to become a real master brain.

Mechanisms:
1. **Finish the current triage pass** (324 → ~80 more promotions expected).
2. **Re-extract with a stronger prompt on the existing 236 interactions** — tune
   the LLM extractor system prompt to pull more durable facts and fewer
   ephemeral snapshots.
3. **Ingest all drive/vault documents as memory candidates** (not just chunks).
   Every structured markdown section with a decision/fact/requirement header
   becomes a candidate memory.
4. **Multi-source triangulation**: the same fact in 3+ sources → auto-promote to
   confidence 0.95.
5. **Cross-project synthesis**: facts appearing in multiple project contexts
   get promoted to global domain knowledge.

## Auto-Organization of Metadata

Currently: `type`, `project`, `confidence`, `status`, `reference_count`. For a
master brain we need more structure, inferred automatically:

| Addition | Purpose | Mechanism |
|---|---|---|
| **Domain tags** (optics, mechanics, firmware, business…) | Cross-cutting retrieval | LLM inference during triage |
| **Temporal scope** (permanent, valid_until_X, transient) | Avoid stale truth | LLM classifies during triage |
| **Source refs** (chunk_id[], interaction_id[]) | Provenance for every fact | Enforced at creation time |
| **Relationships** (contradicts, updates, depends_on) | Memory graph | Triage infers during review |
| **Semantic clusters** | Detect duplicates, find gaps | Weekly HDBSCAN pass on embeddings |

Layer these in progressively — none of them require schema rewrites, just
additional fields and batch jobs.
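The triangulation rule (mechanism 4) reduces to a small check over a memory's source references. A minimal sketch under assumed field names (`source_refs`, `source_kind`, `triangulated` are illustrative, not the real schema):

```python
# Sketch of multi-source triangulation: count distinct source kinds behind a
# fact and auto-promote at 3+. The 0.95 confidence value is from the plan;
# the record shape is a placeholder for the real memory schema.
def triangulate(memory: dict) -> dict:
    """Promote a memory to confidence 0.95 once 3+ distinct sources back it."""
    sources = {ref["source_kind"] for ref in memory.get("source_refs", [])}
    if len(sources) >= 3:
        return {**memory, "confidence": 0.95, "triangulated": True}
    return memory
```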

## Self-Growth Mechanisms

Four loops that make AtoCore grow autonomously:

### 1. Drift detection (nightly)
Compare new chunk embeddings to the existing vector distribution. Centroids more
than X cosine distance from any existing centroid signal a new knowledge area.
Log to the dashboard; a human decides whether it's noise or a domain worth
curating.

### 2. Gap identification (continuous)
Every `/context/build` call logs `query + chunks_returned + memories_returned`.
A weekly report lists the "top 10 queries with weak coverage." Those are
targeted curation opportunities.

### 3. Multi-source triangulation (weekly)
Scan memory content similarity across sources. When a fact appears in 3+
independent sources (vault doc + drive doc + interaction), auto-promote it to
high confidence and mark it "triangulated."

### 4. Active learning prompts (monthly)
Surface "you have 200 p06 memories but only 15 p04 memories. Spend 30 min
curating p04?" via the dashboard digest.
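The drift-detection loop boils down to one distance test per new embedding. A minimal sketch, assuming centroids are available as plain vectors and using an illustrative 0.35 threshold for "X":

```python
import math

# Minimal sketch of the nightly drift check: flag a new chunk embedding whose
# cosine distance to every existing centroid exceeds a threshold. How the
# centroids are stored, and the threshold value, are assumptions.
def cosine_distance(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def is_drift(embedding: list[float], centroids: list[list[float]],
             threshold: float = 0.35) -> bool:
    """True when the embedding is far from every known centroid."""
    return all(cosine_distance(embedding, c) > threshold for c in centroids)
```

In practice the threshold would be tuned against the dashboard's false-positive rate rather than fixed up front.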

## Robustness Strategy (Flawless Operation Bar)

Current: nightly backup, off-host rsync, health endpoint, 303 tests, the
retrieval harness, and an enhanced dashboard with pipeline health (this
session).

To reach "flawless":

| Gap | Fix | Priority |
|---|---|---|
| Silent pipeline failures | Alerting webhook on harness drop / pipeline skip | P1 |
| Memory mutations untracked | Append-only audit log table | P1 |
| Integrity drift | Nightly FK + vector-chunk parity checks | P1 |
| Schema migrations ad-hoc | Formal migration framework with rollback | P2 |
| Single point of failure | Daily backup to user's main computer (new) | P1 |
| No hot standby | Second instance following primary via WAL | P3 |
| No temporal history | Memory audit + valid_until fields | P2 |

### Daily Backup to Main Computer

Currently: Dalidou → T420 (192.168.86.39) via rsync.

Add: Dalidou → main computer via a pull (the main computer runs the rsync and
pulls from Dalidou). Pull-based is simpler than push — no need for SSH keys on
Dalidou to reach the Windows machine.

```bash
# On the main computer, as a daily scheduled task:
rsync -a papa@dalidou:/srv/storage/atocore/backups/snapshots/ \
    /path/to/local/atocore-backups/
```

Configure via Windows Task Scheduler or a cron-like runner. Verify weekly
that the latest snapshot is present.

## Human Interface Auto-Evolution

Current: wiki at `/wiki`, regenerated on every request from the DB. Synthesis
(the "current state" paragraph at the top of project pages) runs **weekly on
Sundays only**. That's why it feels stalled.

Fixes:
1. **Run synthesis daily, not weekly.** It's cheap (one claude call per
   project) and keeps the human-readable overview fresh.
2. **Trigger synthesis on major events** — when 5+ new memories land for a
   project, regenerate its synthesis.
3. **Add a "What's New" feed** — the wiki homepage shows recent additions
   across all projects (last 7 days of memory promotions, state entries,
   entities).
4. **Memory timeline view** — the project page gets a chronological list of
   what we learned when.

## Phased Roadmap (8-10 weeks)

### Phase 1 (weeks 1-2): Universal Consumption
**Goal: every LLM call is AtoCore-grounded automatically.**

- [ ] Build the `atocore-mcp` server (wraps the HTTP API, stdio transport)
- [ ] Publish to npm, or run via `pipx` / stdlib HTTP
- [ ] Configure in Claude Desktop (`~/.claude/mcp_servers.json`)
- [ ] Configure in Claude Code (`claude mcp add atocore …`)
- [ ] Extend the OpenClaw plugin with a `before_prompt_build` pull
- [ ] Write `atocore-proxy` middleware for Codex/Ollama/generic clients
- [ ] Document configuration for each client

**Success:** open a fresh Claude Code session, ask a project question, and
verify the response references AtoCore memories without manual context
commands.

### Phase 2 (weeks 2-3): Knowledge Density + Wiki Evolution
- [ ] Finish the current triage pass (324 candidates → active)
- [ ] Tune the extractor prompt for a higher promotion rate on durable facts
- [ ] Daily synthesis in cron (not just Sundays)
- [ ] Event-triggered synthesis on significant project changes
- [ ] Wiki "What's New" feed
- [ ] Memory timeline per project

**Target:** 300+ active memories; the wiki feels alive daily.

### Phase 3 (weeks 3-4): Auto-Organization
- [ ] Schema: add `domain_tags`, `valid_until`, `source_refs`, `triangulated_count`
- [ ] Upgraded triage prompt: infer tags + temporal scope + relationships
- [ ] Weekly HDBSCAN clustering of embeddings → dup detection + gap reports
- [ ] Relationship edges in a new `memory_relationships` table

### Phase 4 (weeks 4-5): Robustness Hardening
- [ ] Append-only `memory_audit` table + retrofit mutations
- [ ] Nightly integrity checks (FK validation, orphan detection, parity)
- [ ] Alerting webhook (Discord/email) on pipeline anomalies
- [ ] Daily backup to the user's main computer (pull-based)
- [ ] Formal migration framework

### Phase 5 (weeks 6-7): Engineering V1 Implementation
Execute the 23 acceptance criteria in `docs/architecture/engineering-v1-acceptance.md`
against p06-polisher as the test bed. The ontology and queries are designed;
this phase implements them.

### Phase 6 (weeks 8-9): Self-Growth Loops
- [ ] Drift detection (nightly)
- [ ] Gap identification from `/context/build` logs
- [ ] Multi-source triangulation
- [ ] Active learning digest (monthly)
- [ ] Cross-project synthesis

### Phase 7 (ongoing): Scale & Polish
- [ ] Multi-model validation (sonnet triages, opus cross-checks on disagreements)
- [ ] AtoDrive integration (Google Drive as a trusted source)
- [ ] Hot standby when real production dependence materializes
- [ ] More MCP tools (write-back, memory search, entity queries)
## Success Criteria

AtoCore is a master brain when:

1. **Zero manual context commands.** A fresh Claude/OpenClaw session answers
   a project question without being told "use AtoCore context."
2. **1,000+ active memories** with >90% provenance coverage (every fact
   traceable to a source).
3. **Every project has a current, human-readable overview** updated within 24h
   of significant changes.
4. **The harness stays >95%** across 20+ fixtures covering all active projects.
5. **Zero silent pipeline failures** for 30 consecutive days (all failures
   surface via an alert within the hour).
6. **Claude on any task knows what we know** — the user asks "what did we
   decide about X?" and the answer is grounded in AtoCore, not reconstructed
   from scratch.

## Where We Are Now (2026-04-16)

- ✅ Core infrastructure: HTTP API, SQLite, Chroma, deploy pipeline
- ✅ Capture pipes: Claude Code Stop hook, OpenClaw message hooks
- ✅ Nightly pipeline: backup, extract, triage, synthesis, lint, harness, summary
- ✅ Phase 10: auto-promotion from reinforcement + candidate expiry
- ✅ Dashboard shows pipeline health + interaction totals + all projects
- ⚡ 324 candidates being triaged (down from 439), ~80 active memories, growing
- ❌ No consumption at prompt time (capture-only)
- ❌ Wiki auto-evolves only on Sundays (synthesis cadence)
- ❌ No MCP adapter
- ❌ No daily backup to main computer
- ❌ Engineering V1 not implemented
- ❌ No alerting on pipeline failures

The path is clear. Phase 1 is the keystone.
96 docs/PHASE-7-MEMORY-CONSOLIDATION.md Normal file
@@ -0,0 +1,96 @@
# Phase 7 — Memory Consolidation (the "Sleep Cycle")

**Status**: 7A in progress · 7B-H scoped, deferred
**Design principle**: *"Like human memory while sleeping, but more robotic — never discard relevant details. Consolidate, update, supersede — don't delete."*

## Why

Phases 1–6 built capture + triage + graduation + emerging-project detection. What they don't solve:

| # | Problem | Fix |
|---|---|---|
| 1 | Redundancy — "APM uses NX" said 5 different ways across 5 memories | **7A** Semantic dedup |
| 2 | Latent contradictions — "chose Zygo" + "switched from Zygo" both active | **7B** Pair contradiction detection |
| 3 | Tag drift — `firmware`, `fw`, `firmware-control` fragment retrieval | **7C** Tag canonicalization |
| 4 | Confidence staleness — a 6-month unreferenced memory ranks as fresh | **7D** Confidence decay |
| 5 | No memory drill-down page | **7E** `/wiki/memories/{id}` |
| 6 | Domain knowledge siloed per project | **7F** `/wiki/domains/{tag}` |
| 7 | Prompt upgrades (llm-0.5 → 0.6) don't re-process old interactions | **7G** Re-extraction on version bump |
| 8 | Superseded memory vectors still in Chroma polluting retrieval | **7H** Vector hygiene |

Collectively: the brain needs a nightly pass that looks at what it already knows and tidies up — dedup, resolve contradictions, canonicalize tags, decay stale facts — **without losing information**.

## Subphases

### 7A — Semantic dedup + consolidation *(this sprint)*

Compute embeddings on active memories, find pairs within a `(project, memory_type)` bucket above a similarity threshold (default 0.88), cluster them, have the LLM draft a unified memory, and have a human approve it in the triage UI. On approval: the source memories become `superseded`, and a new merged memory is created with the union of `source_refs`, the sum of `reference_count`, and the max of `confidence`. **Ships first** because redundancy compounds — every new memory potentially duplicates an old one.

The detailed spec lives in the working plan (`dapper-cooking-tower.md`) and across the files listed under "Files touched" below. Key decisions:

- LLM drafts, human approves — no silent auto-merge.
- Same `(project, memory_type)` bucket only. Cross-project merges are rare + risky → separate flow in 7B.
- Recompute embeddings each scan (~2s / 335 memories). Persist only if scan time becomes a problem.
- Cluster-based proposals (A~B~C → one merge), not pair-based.
- `status=superseded` is never deleted — still queryable with a filter.

**Schema**: new table `memory_merge_candidates` (pending | approved | rejected).
**Cron**: nightly at threshold 0.90 (tight); weekly (Sundays) at 0.85 (deeper cleanup).
**UI**: new "🔗 Merge Candidates" section in `/admin/triage`.

**Files touched in 7A**:
- `src/atocore/models/database.py` — migration
- `src/atocore/memory/similarity.py` — new, `compute_memory_similarity()`
- `src/atocore/memory/_dedup_prompt.py` — new, shared LLM prompt
- `src/atocore/memory/service.py` — `merge_memories()`
- `scripts/memory_dedup.py` — new, host-side detector (HTTP-only)
- `src/atocore/api/routes.py` — 5 new endpoints under `/admin/memory/`
- `src/atocore/engineering/triage_ui.py` — merge cards section
- `deploy/dalidou/batch-extract.sh` — Step B3
- `deploy/dalidou/dedup-watcher.sh` — new, UI-triggered scans
- `tests/test_memory_dedup.py` — ~10-15 new tests
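The 7A pair scan can be sketched in a few lines: bucket by `(project, memory_type)`, then compare embeddings pairwise against the threshold. Memory records are plain dicts here for illustration; the real scanner (`scripts/memory_dedup.py`) reads memories and embeddings over the HTTP API.

```python
import itertools
import math

# Minimal sketch of the 7A pair scan. Same-bucket pairs only, threshold 0.88
# per the spec; the dict shape is illustrative, not the real schema.
def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def find_merge_pairs(memories: list[dict], threshold: float = 0.88) -> list[tuple]:
    """Return (id_a, id_b, similarity) for same-bucket pairs above threshold."""
    buckets: dict[tuple, list[dict]] = {}
    for m in memories:
        buckets.setdefault((m["project"], m["memory_type"]), []).append(m)
    pairs = []
    for bucket in buckets.values():
        for a, b in itertools.combinations(bucket, 2):
            sim = cosine(a["embedding"], b["embedding"])
            if sim >= threshold:
                pairs.append((a["id"], b["id"], sim))
    return pairs
```

Clustering the resulting pairs (A~B~C → one merge proposal) would sit on top of this, e.g. as connected components over the pair graph.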
|
||||
### 7B — Memory-to-memory contradiction detection

Same embedding-pair machinery as 7A but within a *different* band (similarity 0.70–0.88 — semantically related but differently worded). The LLM classifies each pair: `duplicate | complementary | contradicts | supersedes-older`. Contradictions write a `memory_conflicts` row + surface a triage badge. Clear supersessions (both tier-1 sonnet and tier-2 opus agree) auto-mark the older memory as `superseded`.
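The 7A/7B band split can be sketched as a small routing function. This is an illustration only — the thresholds come from the spec above, but `route_pair` and the constant names are hypothetical, and the real scan operates on the stored memory embeddings:

```python
import math

# Thresholds from the Phase 7 spec (names hypothetical).
DEDUP_THRESHOLD = 0.90             # nightly tight scan (7A)
CONTRADICTION_BAND = (0.70, 0.88)  # related-but-different band (7B)

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def route_pair(emb_a: list[float], emb_b: list[float]) -> str:
    sim = cosine(emb_a, emb_b)
    if sim >= DEDUP_THRESHOLD:
        return "merge-candidate"   # 7A: propose a merge cluster
    lo, hi = CONTRADICTION_BAND
    if lo <= sim <= hi:
        return "classify"          # 7B: LLM labels the pair
    return "ignore"                # too far apart to matter

print(route_pair([1.0, 0.0], [1.0, 0.0]))  # merge-candidate
print(route_pair([1.0, 0.0], [0.8, 0.6]))  # classify (cosine = 0.8)
```

Pairs below 0.70 are never sent to the LLM, which keeps the nightly scan cheap.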
### 7C — Tag canonicalization

Weekly LLM pass over the `domain_tags` distribution that proposes an `alias → canonical` map (e.g. `fw → firmware`). A human approves via the UI (one-click pattern, same as emerging-project registration). Approved maps bulk-rewrite `domain_tags` atomically across all memories.
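Applying an approved alias map is a simple rewrite per memory. A minimal sketch — the function name and map contents are illustrative, not the real service API:

```python
# Hypothetical helper: rewrite one memory's tag list under an approved
# alias -> canonical map. Merging aliases can create duplicates, so we
# de-duplicate while preserving first-seen order.
def canonicalize_tags(tags: list[str], alias_map: dict[str, str]) -> list[str]:
    seen: set[str] = set()
    out: list[str] = []
    for tag in tags:
        canonical = alias_map.get(tag, tag)
        if canonical not in seen:
            seen.add(canonical)
            out.append(canonical)
    return out

alias_map = {"fw": "firmware"}
print(canonicalize_tags(["fw", "firmware", "optics"], alias_map))
# ['firmware', 'optics']
```

The atomic part is doing this for all memories inside one transaction so a failed run never leaves a half-renamed vocabulary.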
### 7D — Confidence decay

Daily lightweight job. For memories with `reference_count=0` AND `last_referenced_at` older than 30 days: multiply confidence by 0.97/day (half-life ≈ 23 days of decay; with the 30-day grace period, a never-referenced memory at confidence 1.0 drops below the floor after roughly 70 days). Reinforcement already bumps confidence. Below 0.3 → auto-supersede with reason `decayed, no references`. Reversible (tune the decay rate), non-destructive (still searchable with a status filter).
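The decay numbers work out as follows — a quick check of the schedule, with `confidence_after` as a hypothetical helper, not the real job:

```python
import math

# Rates from the 7D spec (constant names hypothetical).
DECAY_PER_DAY = 0.97
FLOOR = 0.3
GRACE_DAYS = 30

def confidence_after(initial: float, idle_days: int) -> float:
    """Confidence after idle_days with zero references (decay starts after grace)."""
    decaying_days = max(0, idle_days - GRACE_DAYS)
    return initial * DECAY_PER_DAY ** decaying_days

# 0.97/day halves confidence in about 23 days of decay...
half_life = math.log(0.5) / math.log(DECAY_PER_DAY)
print(round(half_life))      # 23

# ...so a memory at 1.0 hits the 0.3 auto-supersede floor about 70 idle days in.
days_to_floor = GRACE_DAYS + math.log(FLOOR) / math.log(DECAY_PER_DAY)
print(round(days_to_floor))  # 70
```

Tuning `DECAY_PER_DAY` is the single knob: 0.98/day would stretch the half-life to ~34 days of decay.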
### 7E — Memory detail page `/wiki/memories/{id}`

Provenance chain: source_chunk → interaction → graduated_to_entity. Audit trail (Phase 4 has the data). Related memories (same project + tag + semantic neighbors). Decay trajectory plot (if 7D ships). Link target from every memory surfaced anywhere in the wiki.
### 7F — Cross-project domain view `/wiki/domains/{tag}`

One page per `domain_tag` showing all memories + graduated entities with that tag, grouped by project. "Optics across p04+p05+p06" becomes a real navigable page. Answers the long-standing question the tag system was meant to enable.
### 7G — Re-extraction on prompt upgrade

`batch_llm_extract_live.py --force-reextract --since DATE`. Dedupe key: `(interaction_id, extractor_version)` — the same run on the same interaction doesn't double-create. Triggered manually when `LLM_EXTRACTOR_VERSION` bumps. Not automatic (destructive).
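The dedupe key is naturally expressed as a uniqueness constraint. A self-contained sketch — the table and column names here are hypothetical, not the real AtoCore schema:

```python
import sqlite3

# UNIQUE on (interaction_id, extractor_version) makes re-running the same
# prompt version a no-op, while a version bump re-extracts cleanly.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE extraction_runs (
        interaction_id    INTEGER NOT NULL,
        extractor_version TEXT    NOT NULL,
        UNIQUE (interaction_id, extractor_version)
    )
""")

def record_run(interaction_id: int, version: str) -> bool:
    """Return True if this (interaction, version) pair had not been run yet."""
    try:
        conn.execute(
            "INSERT INTO extraction_runs VALUES (?, ?)",
            (interaction_id, version),
        )
        return True
    except sqlite3.IntegrityError:
        return False  # already extracted with this prompt version

print(record_run(1, "llm-0.5.0"))  # True
print(record_run(1, "llm-0.5.0"))  # False — same version, skipped
print(record_run(1, "llm-0.6.0"))  # True — version bump re-extracts
```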
### 7H — Vector store hygiene

Nightly: scan `source_chunks` and `memory_embeddings` (added in 7A V2) for `status=superseded|invalid`. Delete the matching vectors from Chroma. Fail-open — the retrieval harness catches any real regression.
## Verification & ship order

1. **7A** — ship + observe 1 week → validate that merge proposals are high-signal and the rejection rate is acceptable
2. **7D** — decay is low-risk with high compounding value; ship second
3. **7C** — clean up tag fragmentation before 7F comes to depend on canonical tags
4. **7E** + **7F** — UX surfaces; ship together once the data is clean
5. **7B** — contradictions flow (pairs are harder to classify than duplicates; wait for 7A data to tune the threshold)
6. **7G** — on-demand; don't ship until we actually bump the extractor prompt
7. **7H** — housekeeping; after 7A + 7B + 7D have generated enough `superseded` rows to matter
## Scope NOT in Phase 7

- Graduated memories (entity-descended) are **frozen** — exempt from dedup/decay. Entity consolidation is a separate phase (8+).
- Auto-merging without human approval (always human-in-the-loop in V1).
- Summarization / compression — a different problem (reducing the number of chunks per memory, not the number of memories).
- Forgetting policies — there's no user-facing "delete this" flow in Phase 7. Supersede + filter covers the need.
45
docs/capture-surfaces.md
Normal file
@@ -0,0 +1,45 @@
# AtoCore — sanctioned capture surfaces

**Scope statement**: AtoCore captures conversations from **two surfaces only**. Everything else is intentionally out of scope.

| Surface | Hooks | Status |
|---|---|---|
| **Claude Code** (local CLI) | `Stop` (capture) + `UserPromptSubmit` (context injection) | both installed |
| **OpenClaw** (agent framework on T420) | `before_agent_start` (context injection) + `llm_output` (capture) | both installed (v0.2.0 plugin, Phase 7I) |

Both surfaces are **symmetric** — push (capture) and pull (context injection on prompt submit) — so AtoCore learns from every turn AND every turn is grounded in what AtoCore already knows.
## Why these two?

- **Stable hook APIs.** Claude Code exposes `Stop` and `UserPromptSubmit` lifecycle hooks with documented JSON contracts. OpenClaw exposes `before_agent_start` and `llm_output`. Both run locally, where we control the process.
- **Passive from the user's perspective.** No paste, no manual capture command, no "remember this" prompt. You just use the tool and AtoCore absorbs everything durable.
- **Failure is graceful.** If AtoCore is down, hooks exit 0 with no output — the user's turn proceeds uninterrupted.
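That fail-open contract fits in a few lines. A hypothetical illustration of the shape (not the shipped `capture_stop.py`; the URL and payload keys are assumptions):

```python
import json
import os
import sys
import urllib.error
import urllib.request

def main() -> int:
    # Kill switch: documented env var disables capture entirely.
    if os.environ.get("ATOCORE_CAPTURE_DISABLED") == "1":
        return 0
    payload = json.dumps({"transcript": sys.stdin.read()}).encode()
    req = urllib.request.Request(
        "http://dalidou:8100/interactions",  # assumed endpoint
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    try:
        urllib.request.urlopen(req, timeout=5)
    except (urllib.error.URLError, OSError):
        pass  # fail open: AtoCore being down is not the user's problem
    return 0     # always exit 0 — never interrupt the turn

# A real hook would end with: sys.exit(main())
```

The key property: every code path returns 0 and prints nothing, so the hosting CLI never blocks the user on AtoCore's availability.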
## Why not Claude Desktop / Claude.ai web / Claude mobile / ChatGPT / …?

- Claude Desktop has MCP but no `Stop`-equivalent hook for auto-capture; auto-capture would require system-prompt coercion ("call atocore_remember every turn"), which is fragile.
- Claude.ai web has no hook surface — it would need a browser extension (a real project, not shipped).
- The Claude mobile app has neither hooks nor MCP — nothing to wire into.
- ChatGPT etc. — same story.

**Anthropic API log polling is explicitly prohibited.**

If you find yourself wanting to capture from one of these, the real answer is: use Claude Code or OpenClaw for the work that matters. Don't paste chat transcripts into AtoCore — that contradicts the whole design principle of passive capture.

A `/wiki/capture` fallback form still exists (the `/interactions` endpoint is public) but it is **not promoted in the UI** and is documented as a last-resort escape hatch. If you're reaching for it, something is wrong with your workflow, not with AtoCore.
## Hook files

- `deploy/hooks/capture_stop.py` — Claude Code Stop → POSTs `/interactions`
- `deploy/hooks/inject_context.py` — Claude Code UserPromptSubmit → POSTs `/context/build`, returns the pack via `hookSpecificOutput.additionalContext`
- `openclaw-plugins/atocore-capture/index.js` — OpenClaw plugin v0.2.0: capture + context injection

Both Claude Code hooks share a `_infer_project` table mapping cwd to project slug. Keep them in sync when adding a new project path.
## Kill switches

- `ATOCORE_CAPTURE_DISABLED=1` → skip Stop capture
- `ATOCORE_CONTEXT_DISABLED=1` → skip UserPromptSubmit injection
- OpenClaw plugin config `injectContext: false` → skip context injection (capture still fires)

All three are documented in the respective hook/plugin files.
@@ -1,275 +1,49 @@
# AtoCore Current State

# AtoCore — Current State (2026-04-19)

## Status Summary

Live deploy: `877b97e` · Dalidou health: ok · Harness: 17/18.

AtoCore is no longer just a proof of concept. The local engine exists, the
correctness pass is complete, Dalidou now hosts the canonical runtime and
machine-storage location, and the T420/OpenClaw side now has a safe read-only
path to consume AtoCore. The live corpus is no longer just self-knowledge: it
now includes a first curated ingestion batch for the active projects.

## The numbers
## Phase Assessment

| | count |
|---|---|
| Active memories | 266 (180 project, 31 preference, 24 knowledge, 17 adaptation, 11 episodic, 3 identity) |
| Candidates pending | **0** (autonomous triage drained the queue) |
| Interactions captured | 605 (250 claude-code, 351 openclaw) |
| Entities (typed graph) | 50 |
| Vectors in Chroma | 33K+ |
| Projects | 6 registered (p04, p05, p06, abb-space, atomizer-v2, atocore) + apm emerging (2 memories, below the auto-register threshold) |
| Unique domain tags | 210 |
| Tests | 440 passing |
- completed
  - Phase 0
  - Phase 0.5
  - Phase 1
    - baseline complete
  - Phase 2
  - Phase 3
  - Phase 5
  - Phase 7
  - Phase 9 (Commits A/B/C: capture, reinforcement, extractor + review queue)
- partial
  - Phase 4
  - Phase 8
- not started
  - Phase 6
  - Phase 10
  - Phase 11
  - Phase 12
  - Phase 13
## Autonomous pipeline — what runs without me

## What Exists Today

| When | Job | Does |
|---|---|---|
| every hour | `hourly-extract.sh` | Pulls new interactions → LLM extraction → 3-tier auto-triage (sonnet → opus → discard/human). 0 pending candidates right now = autonomy is working. |
| every 2 min | `dedup-watcher.sh` | Services UI-triggered dedup scans |
| daily 03:00 UTC | Full nightly (`batch-extract.sh`) | Extract · triage · auto-promote reinforced · synthesis · harness · dedup (0.90) · emerging detector · transient→durable · **confidence decay (7D)** · integrity check · alerts |
| Sundays | +Weekly deep pass | Knowledge-base lint · dedup @ 0.85 · **tag canonicalization (7C)** |
- ingestion pipeline
- parser and chunker
- SQLite-backed memory and project state
- vector retrieval
- context builder
- API routes for query, context, health, and source status
- project registry and per-project refresh foundation
- project registration lifecycle:
  - template
  - proposal preview
  - approved registration
  - safe update of existing project registrations
  - refresh
- implementation-facing architecture notes for:
  - engineering knowledge hybrid architecture
  - engineering ontology v1
- env-driven storage and deployment paths
- Dalidou Docker deployment foundation
- initial AtoCore self-knowledge corpus ingested on Dalidou
- T420/OpenClaw read-only AtoCore helper skill
- full active-project markdown/text corpus wave for:
  - `p04-gigabit`
  - `p05-interferometer`
  - `p06-polisher`

Last nightly run (2026-04-19 03:00 UTC): **31 promoted · 39 rejected · 0 needs human**. That's the brain self-organizing.
## What Is True On Dalidou

## Phase 7 — Memory Consolidation status

- deployed repo location:
  - `/srv/storage/atocore/app`
- canonical machine DB location:
  - `/srv/storage/atocore/data/db/atocore.db`
- canonical vector store location:
  - `/srv/storage/atocore/data/chroma`
- source input locations:
  - `/srv/storage/atocore/sources/vault`
  - `/srv/storage/atocore/sources/drive`

| Subphase | What | Status |
|---|---|---|
| 7A | Semantic dedup + merge lifecycle | live |
| 7A.1 | Tiered auto-approve (sonnet ≥0.8 + sim ≥0.92 → merge; opus escalation; human only for ambiguous) | live |
| 7B | Memory-to-memory contradiction detection (0.70–0.88 band, classify duplicate/contradicts/supersedes) | deferred, needs 7A signal |
| 7C | Tag canonicalization (weekly; auto-apply ≥0.8 confidence; protects project tokens) | live (first run: 0 proposals — vocabulary is clean) |
| 7D | Confidence decay (0.97/day on idle unreferenced; auto-supersede below 0.3) | live (first run: 0 decayed — nothing idle+unreferenced yet) |
| 7E | `/wiki/memories/{id}` detail page | pending |
| 7F | `/wiki/domains/{tag}` cross-project view | pending (wants 7C + more usage first) |
| 7G | Re-extraction on prompt version bump | pending |
| 7H | Chroma vector hygiene (delete vectors for superseded memories) | pending |
The service and storage foundation are live on Dalidou.

## Known gaps (honest)

The machine-data host is real and canonical.

The project registry is now also persisted in a canonical mounted config path on
Dalidou:

- `/srv/storage/atocore/config/project-registry.json`

The content corpus is partially populated now.

The Dalidou instance already contains:

- AtoCore ecosystem and hosting docs
- current-state and OpenClaw integration docs
- Master Plan V3
- Build Spec V1
- trusted project-state entries for `atocore`
- full staged project markdown/text corpora for:
  - `p04-gigabit`
  - `p05-interferometer`
  - `p06-polisher`
- curated repo-context docs for:
  - `p05`: `Fullum-Interferometer`
  - `p06`: `polisher-sim`
- trusted project-state entries for:
  - `p04-gigabit`
  - `p05-interferometer`
  - `p06-polisher`

Current live stats after the full active-project wave are now far beyond the
initial seed stage:

- more than `1,100` source documents
- more than `20,000` chunks
- a matching vector count

The broader long-term corpus is still not fully populated. Wider project and
vault ingestion remains a deliberate next step rather than something already
completed, but the corpus is now meaningfully seeded beyond AtoCore's own docs.

For human-readable quality review, the current staged project markdown corpus is
primarily visible under:

- `/srv/storage/atocore/sources/vault/incoming/projects`

This staged area is now useful for review because it contains the markdown/text
project docs that were actually ingested for the full active-project wave.

It is important to read this staged area correctly:

- it is a readable ingestion input layer
- it is not the final machine-memory representation itself
- seeing familiar PKM-style notes there is expected
- the machine-processed intelligence lives in the DB, chunks, vectors, memory,
  trusted project state, and context-builder outputs
## What Is True On The T420

- SSH access is working
- OpenClaw workspace inspected at `/home/papa/clawd`
- OpenClaw's own memory system remains unchanged
- a read-only AtoCore integration skill exists in the workspace:
  - `/home/papa/clawd/skills/atocore-context/`
- the T420 can successfully reach Dalidou AtoCore over network/Tailscale
- fail-open behavior has been verified for the helper path
- OpenClaw can now seed AtoCore in two distinct ways:
  - project-scoped memory entries
  - staged document ingestion into the retrieval corpus
- the helper now supports the practical registered-project lifecycle:
  - projects
  - project-template
  - propose-project
  - register-project
  - update-project
  - refresh-project
- the helper now also supports the first organic routing layer:
  - `detect-project "<prompt>"`
  - `auto-context "<prompt>" [budget] [project]`
- OpenClaw can now default to AtoCore for project-knowledge questions without
  requiring explicit helper commands from the human every time
## What Exists In Memory vs Corpus

These remain separate, and that is intentional.

In `/memory`:

- project-scoped curated memories now exist for:
  - `p04-gigabit`: 5 memories
  - `p05-interferometer`: 6 memories
  - `p06-polisher`: 8 memories

These are curated summaries and extracted stable project signals.

In `source_documents` / retrieval corpus:

- full project markdown/text corpora are now present for the active project set
- retrieval is no longer limited to AtoCore self-knowledge only
- the current corpus is broad enough that ranking quality matters more than
  corpus presence alone
- underspecified prompts can still pull in historical or archive material, so
  project-aware routing and better ranking remain important

The source refresh model now has a concrete foundation in code:

- a project registry file defines known project ids, aliases, and ingest roots
- the API can list registered projects
- the API can return a registration template
- the API can preview a registration without mutating state
- the API can persist an approved registration
- the API can update an existing registered project without changing its canonical id
- the API can refresh one registered project at a time
This lifecycle is now coherent end to end for normal use.

The first live update passes on existing registered projects have now been
verified against `p04-gigabit` and `p05-interferometer`:

- the registration description can be updated safely
- the canonical project id remains unchanged
- refresh still behaves cleanly after the update
- `context/build` still returns useful project-specific context afterward
## Reliability Baseline

The runtime has now been hardened in a few practical ways:

- SQLite connections use a configurable busy timeout
- SQLite uses WAL mode to reduce transient lock pain under normal concurrent use
- project registry writes are atomic file replacements rather than in-place rewrites
- a full runtime backup and restore path now exists and has been exercised on
  live Dalidou:
  - SQLite (hot online backup via `conn.backup()`)
  - project registry (file copy)
  - Chroma vector store (cold directory copy under `exclusive_ingestion()`)
  - backup metadata
  - `restore_runtime_backup()` with CLI entry point
    (`python -m atocore.ops.backup restore <STAMP> --confirm-service-stopped`),
    pre-restore safety snapshot for rollback, WAL/SHM sidecar cleanup,
    `PRAGMA integrity_check` on the restored file
  - the first live drill on 2026-04-09 surfaced and fixed a Chroma restore bug
    on Docker bind-mounted volumes (`shutil.rmtree` on a mount point); a
    regression test now asserts the destination inode is stable across restore
- deploy provenance is visible end-to-end:
  - `/health` reports `build_sha`, `build_time`, `build_branch` from env vars
    wired by `deploy.sh`
  - `deploy.sh` Step 6 verifies the live `build_sha` matches the just-built
    commit (exit code 6 on drift), so "live is current?" can be answered
    precisely, not just by `__version__`
  - `deploy.sh` Step 1.5 detects that the script itself changed in the pulled
    commit and re-execs into the fresh copy, so the deploy never silently runs
    the old script against new source

This does not eliminate every concurrency edge, but it materially improves the
current operational baseline.
In `Trusted Project State`:

- each active seeded project now has a conservative trusted-state set
- promoted facts cover:
  - summary
  - core architecture or boundary decision
  - key constraints
  - next focus

This separation is healthy:

- memory stores distilled project facts
- corpus stores the underlying retrievable documents
## Immediate Next Focus

1. ~~Re-run the full backup/restore drill~~ — DONE 2026-04-11, full pass
   (db, registry, chroma, integrity all true)
2. ~~Turn on auto-capture of Claude Code sessions in conservative mode~~ —
   DONE 2026-04-11, Stop hook wired via `deploy/hooks/capture_stop.py` →
   `POST /interactions` with `reinforce=false`; kill switch via
   `ATOCORE_CAPTURE_DISABLED=1`
3. Run a short real-use pilot with auto-capture on, verify interactions are
   landing in Dalidou, review quality
4. Use the new T420-side organic routing layer in real OpenClaw workflows
5. Tighten retrieval quality for the now fully ingested active project corpora
6. Move to Wave 2 trusted-operational ingestion instead of blindly widening the raw corpus further
7. Keep the new engineering-knowledge architecture docs as implementation guidance while avoiding premature schema work
8. Expand the remaining boring operations baseline:
   - retention policy cleanup script
   - off-Dalidou backup target (rsync or similar)
9. Only later consider write-back, reflection, or deeper autonomous behaviors
See also:

- [ingestion-waves.md](C:/Users/antoi/ATOCore/docs/ingestion-waves.md)
- [master-plan-status.md](C:/Users/antoi/ATOCore/docs/master-plan-status.md)
## Guiding Constraints

- bad memory is worse than no memory
- trusted project state must remain highest priority
- human-readable sources and machine storage stay separate
- OpenClaw integration must not degrade OpenClaw baseline behavior

1. **Capture surface is Claude-Code-and-OpenClaw only.** Conversations in Claude Desktop, Claude.ai web, phone, or any other LLM UI are NOT captured. Example: the rotovap/mushroom chat yesterday never reached AtoCore because no hook fired. See Q4 below.
2. **OpenClaw is capture-only, not context-grounded.** The plugin POSTs `/interactions` on `llm_output` but does NOT call `/context/build` on `before_agent_start`. OpenClaw's underlying agent runs blind. See Q2 below.
3. **Human interface (wiki) is thin and static.** 5 project cards + a "System" line. No dashboard for the autonomous activity. No per-memory detail page. See Q3/Q5.
4. **Harness 17/18** — the `p04-constraints` fixture wants "Zerodur" but retrieval surfaces related-not-exact terms. A content gap, not a retrieval regression.
5. **Two projects under-populated**: p05-interferometer (4 memories, 18 state) and atomizer-v2 (1 memory, 6 state). A batch re-extract with the new llm-0.6.0 prompt would help.
274
docs/universal-consumption.md
Normal file
@@ -0,0 +1,274 @@
# Universal Consumption — Connecting LLM Clients to AtoCore

Phase 1 of the Master Brain plan. Every LLM interaction across the ecosystem
pulls context from AtoCore automatically, without the user or agent having
to remember to ask for it.
## Architecture

```
              ┌─────────────────────┐
              │  AtoCore HTTP API   │ ← single source of truth
              │ http://dalidou:8100 │
              └──────────┬──────────┘
                         │
    ┌────────────────────┼────────────────────┐
    │                    │                    │
┌───┴────┐         ┌─────┴────┐          ┌────┴────┐
│  MCP   │         │ OpenClaw │          │  HTTP   │
│ server │         │  plugin  │          │  proxy  │
└───┬────┘         └──────┬───┘          └────┬────┘
    │                     │                   │
Claude/Cursor/        OpenClaw          Codex/Ollama/
 Zed/Windsurf                      any OpenAI-compat client
```

Three adapters, one HTTP backend. Each adapter is a thin passthrough — no
business logic duplicated.
---

## Adapter 1: MCP Server (Claude Desktop, Claude Code, Cursor, Zed, Windsurf)

The MCP server is `scripts/atocore_mcp.py` — stdlib-only Python, stdio
transport, wrapping the HTTP API. Claude-family clients see AtoCore as built-in
tools just like `Read` or `Bash`.

### Tools exposed

- **`atocore_context`** (most important): Full context pack for a query —
  Trusted Project State + memories + retrieved chunks. Use at the start of
  any project-related conversation to ground it.
- **`atocore_search`**: Semantic search over ingested documents (top-K chunks).
- **`atocore_memory_list`**: List active memories, filterable by project + type.
- **`atocore_memory_create`**: Propose a candidate memory (enters the triage queue).
- **`atocore_project_state`**: Get Trusted Project State entries by category.
- **`atocore_projects`**: List registered projects + aliases.
- **`atocore_health`**: Service status check.

### Registration
#### Claude Code (CLI)

```bash
claude mcp add atocore -- python C:/Users/antoi/ATOCore/scripts/atocore_mcp.py
claude mcp list  # verify: "atocore ... ✓ Connected"
```
#### Claude Desktop (GUI)

Edit `~/Library/Application Support/Claude/claude_desktop_config.json`
(macOS) or `%APPDATA%\Claude\claude_desktop_config.json` (Windows):

```json
{
  "mcpServers": {
    "atocore": {
      "command": "python",
      "args": ["C:/Users/antoi/ATOCore/scripts/atocore_mcp.py"],
      "env": {
        "ATOCORE_URL": "http://dalidou:8100"
      }
    }
  }
}
```

Restart Claude Desktop.
#### Cursor / Zed / Windsurf

Similar JSON config in each tool's MCP settings. Consult their docs —
the config schema is standard MCP.

### Configuration

Environment variables the MCP server honors:

| Var | Default | Purpose |
|---|---|---|
| `ATOCORE_URL` | `http://dalidou:8100` | Where to reach AtoCore |
| `ATOCORE_TIMEOUT` | `10` | Per-request HTTP timeout (seconds) |

### Behavior

- Fail-open: if Dalidou is unreachable, tools return "AtoCore unavailable"
  error messages but don't crash the client.
- Zero business logic: every tool is a direct HTTP passthrough.
- stdlib only: no MCP SDK dependency.
---

## Adapter 2: OpenClaw Plugin (`openclaw-plugins/atocore-capture/handler.js`)

The plugin on T420 OpenClaw has two responsibilities:

1. **CAPTURE**: On `before_agent_start` + `llm_output`, POST completed turns
   to AtoCore `/interactions` (existing).
2. **PULL**: On `before_prompt_build`, call `/context/build` and inject the
   context pack via `prependContext` so that the agent's system prompt includes
   AtoCore knowledge.
### Deployment

The plugin is loaded from
`/tmp/atocore-openclaw-capture-plugin/openclaw-plugins/atocore-capture/`
on the T420 (per OpenClaw's plugin config at `~/.openclaw/openclaw.json`).

To update:

```bash
scp openclaw-plugins/atocore-capture/handler.js \
  papa@192.168.86.39:/tmp/atocore-openclaw-capture-plugin/openclaw-plugins/atocore-capture/index.js
ssh papa@192.168.86.39 'systemctl --user restart openclaw-gateway'
```

Verify in the gateway logs: look for "ready (7 plugins: acpx, atocore-capture, ...)".
### Configuration (env vars set on T420)

| Var | Default | Purpose |
|---|---|---|
| `ATOCORE_BASE_URL` | `http://dalidou:8100` | AtoCore HTTP endpoint |
| `ATOCORE_PULL_DISABLED` | (unset) | Set to `1` to disable context pull |

### Behavior

- Fail-open: AtoCore unreachable = no injection, no capture, agent runs
  normally.
- 6 s timeout on context pull, 10 s on capture — won't stall the agent.
- Context pack prepended as a clearly bracketed block so the agent can see
  it's auto-injected grounding info.
---

## Adapter 3: HTTP Proxy (`scripts/atocore_proxy.py`)

A stdlib-only OpenAI-compatible HTTP proxy. It sits between any
OpenAI-API-speaking client and the real provider, enriching every
`/chat/completions` request with AtoCore context.

Works with:

- **Codex CLI** (OpenAI-compatible endpoint)
- **Ollama** (has an OpenAI-compatible `/v1` endpoint since 0.1.24)
- **LiteLLM**, **llama.cpp server**, custom agents
- anything that can be pointed at a custom base URL
### Start it

```bash
# For Ollama (local models):
ATOCORE_UPSTREAM=http://localhost:11434/v1 \
  python scripts/atocore_proxy.py

# For OpenAI cloud:
ATOCORE_UPSTREAM=https://api.openai.com/v1 \
  ATOCORE_CLIENT_LABEL=codex \
  python scripts/atocore_proxy.py

# Test:
curl http://127.0.0.1:11435/healthz
```
### Point a client at it

Set the client's OpenAI base URL to `http://127.0.0.1:11435/v1`.

#### Ollama example:

```bash
OPENAI_BASE_URL=http://127.0.0.1:11435/v1 \
  some-openai-client --model llama3:8b
```

#### Codex CLI:

Set `OPENAI_BASE_URL=http://127.0.0.1:11435/v1` in your codex config.
### Configuration

| Var | Default | Purpose |
|---|---|---|
| `ATOCORE_URL` | `http://dalidou:8100` | AtoCore HTTP endpoint |
| `ATOCORE_UPSTREAM` | (required) | Real provider base URL |
| `ATOCORE_PROXY_PORT` | `11435` | Proxy listen port |
| `ATOCORE_PROXY_HOST` | `127.0.0.1` | Proxy bind address |
| `ATOCORE_CLIENT_LABEL` | `proxy` | Client id in captures |
| `ATOCORE_INJECT` | `1` | Inject context (set `0` to disable) |
| `ATOCORE_CAPTURE` | `1` | Capture interactions (set `0` to disable) |
### Behavior

- GET requests (model listing etc.) pass through unchanged
- POST to `/chat/completions` (or `/v1/chat/completions`) gets enriched:
  1. Last user message extracted as the query
  2. AtoCore `/context/build` called with a 6 s timeout
  3. Pack injected as a system message (or prepended to the existing system message)
  4. Enriched body forwarded to the upstream
  5. After success, the interaction is POSTed to `/interactions` in the background
- Fail-open: AtoCore unreachable = pass through without injection
- Streaming responses: currently buffered (not a true stream). Good enough for
  most cases; can be upgraded later if needed.
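The injection step (3 above) can be sketched as a pure function on the request body. A hypothetical illustration of the shape, not the code in `atocore_proxy.py` — `enrich_body` and its signature are assumptions:

```python
# Inject the AtoCore context pack as the system message, or prepend it to
# an existing one, leaving the rest of the body untouched.
def enrich_body(body: dict, context_pack: str) -> dict:
    messages = list(body.get("messages", []))
    if messages and messages[0].get("role") == "system":
        # Existing system prompt: prepend the pack rather than replace it.
        messages[0] = {
            "role": "system",
            "content": context_pack + "\n\n" + messages[0]["content"],
        }
    else:
        messages.insert(0, {"role": "system", "content": context_pack})
    return {**body, "messages": messages}

body = {"messages": [{"role": "user", "content": "status of p04?"}]}
out = enrich_body(body, "[AtoCore context]\n...")
print(out["messages"][0]["role"])  # system
```

Keeping this a pure transform means the fail-open path is trivial: if the context fetch fails, the original body is forwarded unmodified.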
### Running as a service

On Linux, create `~/.config/systemd/user/atocore-proxy.service`:

```ini
[Unit]
Description=AtoCore HTTP proxy

[Service]
Environment=ATOCORE_UPSTREAM=http://localhost:11434/v1
Environment=ATOCORE_CLIENT_LABEL=ollama
ExecStart=/usr/bin/python3 /path/to/scripts/atocore_proxy.py
Restart=on-failure

[Install]
WantedBy=default.target
```

Then: `systemctl --user enable --now atocore-proxy`

On Windows, register via Task Scheduler (a similar pattern to the backup task)
or use NSSM to install it as a service.
---

## Verification Checklist

Fresh end-to-end test to confirm Phase 1 is working:

### For Claude Code (MCP)

1. Open a new Claude Code session (not this one).
2. Ask: "what do we know about p06 polisher's control architecture?"
3. Claude should invoke `atocore_context` or `atocore_project_state`
   on its own and answer grounded in AtoCore data.

### For OpenClaw (plugin pull)

1. Send a Discord message to OpenClaw: "what's the status on p04?"
2. Check T420 logs: `journalctl --user -u openclaw-gateway --since "1 min ago" | grep atocore-pull`
3. Expect: `atocore-pull:injected project=p04-gigabit chars=NNN`

### For proxy (any OpenAI-compat client)

1. Start the proxy with the appropriate upstream
2. Run a client query through it
3. Check stderr: `[atocore-proxy] inject: project=... chars=...`
4. Check `curl http://127.0.0.1:8100/interactions?client=proxy` — it should
   show the captured turn
---
|
||||
|
||||
## Why not just MCP everywhere?
|
||||
|
||||
MCP is great for Claude-family clients but:
|
||||
- Not supported natively by Codex CLI, Ollama, or OpenAI's own API
|
||||
- No universal "attach MCP" mechanism in all LLM runtimes
|
||||
- HTTP APIs are truly universal
|
||||
|
||||
The HTTP API is the single source of truth; each adapter is the thinnest
possible shim for its ecosystem. When new adapters are needed (Gemini CLI,
Claude Code plugin system, etc.), they follow the same pattern.
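
The adapter pattern above can be sketched in a few lines. This is an illustration, not a shipped adapter: `call_llm` is a placeholder for whatever runtime the shim wraps, and the payload fields mirror the `POST /context/build` and `POST /interactions` calls used elsewhere in this document:

```python
"""Minimal AtoCore adapter sketch: pull context, run the model, capture the turn."""
import json
import urllib.request

BASE = "http://dalidou:8100"


def _post(path: str, payload: dict, timeout: float = 5.0) -> dict:
    req = urllib.request.Request(
        BASE + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def grounded_prompt(pack: str, prompt: str) -> str:
    # Prepend the context pack; fall back to the bare prompt if it's empty.
    if not pack.strip():
        return prompt
    return f"--- AtoCore Context (auto-injected) ---\n{pack}\n--- End AtoCore Context ---\n\n{prompt}"


def handle_turn(prompt: str, call_llm) -> str:
    try:
        pack = _post("/context/build", {"prompt": prompt, "project": ""}).get("formatted_context", "")
    except Exception:
        pack = ""  # fail-open: no context, the agent runs as before
    response = call_llm(grounded_prompt(pack, prompt))
    try:
        _post("/interactions", {"prompt": prompt, "response": response,
                                "client": "custom-adapter", "reinforce": True})
    except Exception:
        pass  # fail-open: capture must never block the turn
    return response
```

Pull context, ground the prompt, capture the turn, and fail open at every step — that is the whole adapter contract.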

---

## Future enhancements

- **Streaming passthrough** in the proxy (currently buffered for simplicity)
- **Response grounding check**: parse assistant output for references to
  injected context, count reinforcement events
- **Per-client metrics** in the dashboard: how often each client pulls,
  context pack size, injection rate
- **Smart project detection**: today we use keyword matching; could use
  AtoCore's own project resolver endpoint
140
docs/windows-backup-setup.md
Normal file
@@ -0,0 +1,140 @@
# Windows Main-Computer Backup Setup

The AtoCore backup pipeline runs nightly on Dalidou and already pushes snapshots
off-host to the T420 (`papa@192.168.86.39`). This doc sets up a **second**,
pull-based daily backup to your Windows main computer at
`C:\Users\antoi\Documents\ATOCore_Backups\`.

Pull-based means the Windows machine pulls from Dalidou. This is simpler than
push because Dalidou doesn't need SSH keys to reach Windows, and the backup
only runs when the Windows machine is powered on and can reach Dalidou.

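The pull flow is small enough to sketch: probe reachability first, exit cleanly (fail-open) when Dalidou is away, otherwise let scp copy the snapshots. A Python stand-in for the actual `atocore-backup-pull.ps1` (the real script is PowerShell; the ssh/scp command lines here are assumptions for illustration):

```python
"""Illustrative fail-open pull, mirroring what atocore-backup-pull.ps1 does."""
import subprocess


def pull_backup(run=subprocess.run) -> str:
    # Reachability probe. Fail-open: an unreachable Dalidou is not an error.
    probe = run(["ssh", "-o", "ConnectTimeout=5", "papa@dalidou", "true"])
    if probe.returncode != 0:
        return "Dalidou unreachable"
    # Date-stamped snapshot directories make repeated runs idempotent,
    # so a catch-up run after days offline is safe.
    copy = run(["scp", "-r",
                "papa@dalidou:/srv/storage/atocore/backups/snapshots/",
                r"C:\Users\antoi\Documents\ATOCore_Backups\snapshots"])
    return "backup complete" if copy.returncode == 0 else "scp failed"
```

The `run` parameter exists only so the branch logic can be exercised without a network.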
## Prerequisites

- Windows 10/11 with OpenSSH client (built-in since Win10 1809)
- SSH key-based auth to `papa@dalidou` already working (you're using it today)
- `C:\Users\antoi\ATOCore\scripts\windows\atocore-backup-pull.ps1` present

## Test the script manually

```powershell
powershell.exe -ExecutionPolicy Bypass -File `
  C:\Users\antoi\ATOCore\scripts\windows\atocore-backup-pull.ps1
```

Expected output:
```
[timestamp] === AtoCore backup pull starting ===
[timestamp] Dalidou reachable.
[timestamp] Pulling snapshots via scp...
[timestamp] Pulled N snapshots successfully (total X MB, latest: ...)
[timestamp] === backup complete ===
```

Target directory: `C:\Users\antoi\Documents\ATOCore_Backups\snapshots\`
Logs: `C:\Users\antoi\Documents\ATOCore_Backups\_logs\backup-*.log`

## Register the Task Scheduler task

### Option A — automatic registration (recommended)

Run this PowerShell command **as your user** (no admin needed — uses HKCU task):

```powershell
$action = New-ScheduledTaskAction -Execute 'powershell.exe' `
  -Argument '-ExecutionPolicy Bypass -NonInteractive -WindowStyle Hidden -File C:\Users\antoi\ATOCore\scripts\windows\atocore-backup-pull.ps1'

# Run daily at 10:00 local time; if missed (computer off), run at next logon
$trigger = New-ScheduledTaskTrigger -Daily -At 10:00AM
$trigger.StartBoundary = (Get-Date -Format 'yyyy-MM-ddTHH:mm:ss')

$settings = New-ScheduledTaskSettingsSet `
  -AllowStartIfOnBatteries `
  -DontStopIfGoingOnBatteries `
  -StartWhenAvailable `
  -ExecutionTimeLimit (New-TimeSpan -Minutes 10) `
  -RestartCount 2 `
  -RestartInterval (New-TimeSpan -Minutes 30)

Register-ScheduledTask -TaskName 'AtoCore Backup Pull' `
  -Description 'Daily pull of AtoCore backup snapshots from Dalidou' `
  -Action $action -Trigger $trigger -Settings $settings `
  -User $env:USERNAME
```

Key settings:
- `-StartWhenAvailable`: if the computer was off at 10:00, run as soon as it
  comes online
- `-AllowStartIfOnBatteries`: works on laptop battery too
- `-ExecutionTimeLimit 10min`: kill hung tasks
- `-RestartCount 2`: retry twice if it fails (Dalidou temporarily unreachable)

### Option B — Task Scheduler GUI

1. Open Task Scheduler (`taskschd.msc`)
2. Create Basic Task -> name: `AtoCore Backup Pull`
3. Trigger: Daily, 10:00 AM, recur every 1 day
4. Action: Start a program
   - Program: `powershell.exe`
   - Arguments: `-ExecutionPolicy Bypass -NonInteractive -WindowStyle Hidden -File "C:\Users\antoi\ATOCore\scripts\windows\atocore-backup-pull.ps1"`
5. Finish, then edit the task:
   - Settings tab: check "Run task as soon as possible after a scheduled start is missed"
   - Settings tab: "If the task fails, restart every 30 minutes, up to 2 times"
   - Conditions tab: uncheck "Start only if computer is on AC power" (if you want it on battery)

## Verify

After the first scheduled run:

```powershell
# Most recent log
Get-ChildItem C:\Users\antoi\Documents\ATOCore_Backups\_logs\ |
  Sort-Object Name -Descending |
  Select-Object -First 1 |
  Get-Content

# Latest snapshot present?
Get-ChildItem C:\Users\antoi\Documents\ATOCore_Backups\snapshots\ |
  Sort-Object Name -Descending |
  Select-Object -First 3
```

## Unregister (if needed)

```powershell
Unregister-ScheduledTask -TaskName 'AtoCore Backup Pull' -Confirm:$false
```

## How it behaves

- **Computer on, Dalidou reachable**: pulls latest snapshots silently in ~15s
- **Computer on, Dalidou unreachable** (remote work, network down): fail-open,
  exits without error, logs "Dalidou unreachable"
- **Computer off at scheduled time**: Task Scheduler runs it as soon as the
  computer wakes up
- **Many days off**: one run catches up; scp only transfers files not already
  present (snapshots are date-stamped directories, idempotent overwrites)

## What gets backed up

The snapshots tree contains:
- `YYYYMMDDTHHMMSSZ/config/` — project registry, AtoCore config
- `YYYYMMDDTHHMMSSZ/db/` — SQLite snapshot of all memory, state, interactions
- `YYYYMMDDTHHMMSSZ/backup-metadata.json` — SHA, timestamp, source info

Chroma vectors are **not** in the snapshot by default
(`ATOCORE_BACKUP_CHROMA=false` on Dalidou). They can be rebuilt from the
source documents if lost. To include them, set `ATOCORE_BACKUP_CHROMA=true`
in the Dalidou cron environment.

## Three-tier backup summary

After this setup:

| Tier | Location | Cadence | Purpose |
|---|---|---|---|
| Live | Dalidou `/srv/storage/atocore/backups/snapshots/` | Nightly 03:00 UTC | Fast restore |
| Off-host | T420 `papa@192.168.86.39:/home/papa/atocore-backups/` | Nightly after Dalidou | Dalidou dies |
| User machine | `C:\Users\antoi\Documents\ATOCore_Backups\` | Daily 10:00 local | Full home-network failure |

Three independent copies. Any two can be lost simultaneously without data loss.

@@ -1,29 +1,40 @@
# AtoCore Capture Plugin for OpenClaw
# AtoCore Capture + Context Plugin for OpenClaw

Minimal OpenClaw plugin that mirrors Claude Code's `capture_stop.py` behavior:
Two-way bridge between OpenClaw agents and AtoCore:

**Capture (since v1)**
- watches user-triggered assistant turns
- POSTs `prompt` + `response` to `POST /interactions`
- sets `client="openclaw"`
- sets `reinforce=true`
- sets `client="openclaw"`, `reinforce=true`
- fails open on network or API errors

## Config
**Context injection (Phase 7I, v2+)**
- on `before_agent_start`, fetches a context pack from `POST /context/build`
- prepends the pack to the agent's prompt so whatever LLM runs underneath
  (sonnet, opus, codex, local model — whichever OpenClaw delegates to)
  answers grounded in what AtoCore already knows
- original user prompt is still what gets captured later (no recursion)
- fails open: context unreachable → agent runs as before

Optional plugin config:
## Config

```json
{
  "baseUrl": "http://dalidou:8100",
  "minPromptLength": 15,
  "maxResponseLength": 50000
  "maxResponseLength": 50000,
  "injectContext": true,
  "contextCharBudget": 4000
}
```

If `baseUrl` is omitted, the plugin uses `ATOCORE_BASE_URL` or defaults to `http://dalidou:8100`.
- `baseUrl` — defaults to `ATOCORE_BASE_URL` env or `http://dalidou:8100`
- `injectContext` — set to `false` to disable the Phase 7I context injection and make this a pure one-way capture plugin again
- `contextCharBudget` — cap on injected context size. `/context/build` respects it too; this is a client-side safety net. Default 4000 chars (~1000 tokens).

## Notes

- Project detection is intentionally left empty for now. Unscoped capture is acceptable because AtoCore's extraction pipeline handles unscoped interactions.
- Extraction is **not** part of the capture path. This plugin only records interactions and lets AtoCore reinforcement run automatically.
- The plugin captures only user-triggered turns, not heartbeats or system-only runs.
- Project detection is intentionally left empty — AtoCore's extraction pipeline handles unscoped interactions and infers the project from content.
- Extraction is **not** part of this plugin. Interactions are captured; batch extraction runs via cron on the AtoCore host.
- Context injection only fires for user-triggered turns (not heartbeats or system-only runs).
- Timeouts: context fetch is 5s (short so a slow AtoCore never blocks a user turn); capture post is 10s.

@@ -1,63 +1,154 @@
/**
 * AtoCore capture hook for OpenClaw.
 * AtoCore OpenClaw plugin — capture + pull.
 *
 * Listens on message:received (buffer prompt) and message:sent (POST pair).
 * Fail-open: errors are caught silently.
 * Two responsibilities:
 *
 * 1. CAPTURE (existing): On before_agent_start, buffer the user prompt.
 *    On llm_output, POST prompt+response to AtoCore /interactions.
 *    This is the "write" side — OpenClaw turns feed AtoCore's memory.
 *
 * 2. PULL (Phase 1 master brain): On before_prompt_build, call AtoCore
 *    /context/build and inject the returned context via prependContext.
 *    Every OpenClaw response is automatically grounded in what AtoCore
 *    knows (project state, memories, relevant chunks).
 *
 * Fail-open throughout: AtoCore unreachable = no injection, no capture,
 * never blocks the agent.
 */

import { definePluginEntry } from "openclaw/plugin-sdk/core";

const BASE_URL = process.env.ATOCORE_BASE_URL || "http://dalidou:8100";
const MIN_LEN = 15;
const MAX_RESP = 50000;
const CONTEXT_TIMEOUT_MS = 6000;
const CAPTURE_TIMEOUT_MS = 10000;

let lastPrompt = null; // simple single-slot buffer
function trim(v) { return typeof v === "string" ? v.trim() : ""; }
function trunc(t, m) { return !t || t.length <= m ? t : t.slice(0, m) + "\n\n[truncated]"; }

const atocoreCaptureHook = async (event) => {
  try {
    if (process.env.ATOCORE_CAPTURE_DISABLED === "1") return;
function detectProject(prompt) {
  const lower = (prompt || "").toLowerCase();
  const hints = [
    ["p04", "p04-gigabit"],
    ["gigabit", "p04-gigabit"],
    ["p05", "p05-interferometer"],
    ["interferometer", "p05-interferometer"],
    ["p06", "p06-polisher"],
    ["polisher", "p06-polisher"],
    ["fullum", "p06-polisher"],
    ["abb", "abb-space"],
    ["atomizer", "atomizer-v2"],
    ["atocore", "atocore"],
  ];
  for (const [token, proj] of hints) {
    if (lower.includes(token)) return proj;
  }
  return "";
}

    if (event.type === "message" && event.action === "received") {
      const content = (event.context?.content || "").trim();
      if (content.length >= MIN_LEN && !content.startsWith("<")) {
        lastPrompt = { text: content, ts: Date.now() };
export default definePluginEntry({
  register(api) {
    const log = api.logger;
    let lastPrompt = null;

    // --- PULL: inject AtoCore context into every prompt ---
    api.on("before_prompt_build", async (event, ctx) => {
      if (process.env.ATOCORE_PULL_DISABLED === "1") return;
      const prompt = trim(event?.prompt || "");
      if (prompt.length < MIN_LEN) return;

      const project = detectProject(prompt);

      try {
        const res = await fetch(BASE_URL.replace(/\/$/, "") + "/context/build", {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify({ prompt, project }),
          signal: AbortSignal.timeout(CONTEXT_TIMEOUT_MS),
        });
        if (!res.ok) {
          log.info("atocore-pull:http_error", { status: res.status });
          return;
        }
        const data = await res.json();
        const contextPack = data.formatted_context || "";
        if (!contextPack.trim()) return;

        log.info("atocore-pull:injected", {
          project: project || "(none)",
          chars: contextPack.length,
        });

        return {
          prependContext:
            "--- AtoCore Context (auto-injected) ---\n" +
            contextPack +
            "\n--- End AtoCore Context ---\n",
        };
      } catch (err) {
        log.info("atocore-pull:error", { error: String(err).slice(0, 200) });
      }
      return;
    }
    });

    if (event.type === "message" && event.action === "sent") {
      if (!event.context?.success) return;
      const response = (event.context?.content || "").trim();
      if (!response || !lastPrompt) return;

      // Discard stale prompts (>5 min old)
      if (Date.now() - lastPrompt.ts > 300000) {
    // --- CAPTURE: buffer user prompts on agent start ---
    api.on("before_agent_start", async (event, ctx) => {
      const prompt = trim(event?.prompt || event?.cleanedBody || "");
      if (prompt.length < MIN_LEN || prompt.startsWith("<")) {
        lastPrompt = null;
        return;
      }
      // Filter cron-initiated agent runs. OpenClaw's scheduled tasks fire
      // agent sessions with prompts that begin "[cron:<id> ...]". These are
      // automated polls (DXF email watcher, calendar reminders, etc.), not
      // real user turns — they're pure noise in the AtoCore capture stream.
      if (prompt.startsWith("[cron:")) {
        lastPrompt = null;
        return;
      }
      lastPrompt = { text: prompt, sessionKey: ctx?.sessionKey || "", ts: Date.now() };
      log.info("atocore-capture:prompt_buffered", { len: prompt.length });
    });

    // --- CAPTURE: send completed turns to AtoCore ---
    api.on("llm_output", async (event, ctx) => {
      if (!lastPrompt) return;
      const texts = Array.isArray(event?.assistantTexts) ? event.assistantTexts : [];
      const response = trunc(trim(texts.join("\n\n")), MAX_RESP);
      if (!response) return;

      const prompt = lastPrompt.text;
      const sessionKey = lastPrompt.sessionKey || ctx?.sessionKey || "";
      const project = detectProject(prompt);
      lastPrompt = null;

      const body = JSON.stringify({
        prompt,
        response: response.length > MAX_RESP
          ? response.slice(0, MAX_RESP) + "\n\n[truncated]"
          : response,
        client: "openclaw",
        session_id: event.sessionKey || "",
        project: "",
        reinforce: true,
      log.info("atocore-capture:posting", {
        promptLen: prompt.length,
        responseLen: response.length,
        project: project || "(none)",
      });

      fetch(BASE_URL.replace(/\/$/, "") + "/interactions", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body,
        signal: AbortSignal.timeout(10000),
      }).catch(() => {});
    }
  } catch {
    // fail-open: never crash the gateway
  }
};
      body: JSON.stringify({
        prompt,
        response,
        client: "openclaw",
        session_id: sessionKey,
        project,
        reinforce: true,
      }),
      signal: AbortSignal.timeout(CAPTURE_TIMEOUT_MS),
    }).then(res => {
      log.info("atocore-capture:posted", { status: res.status });
    }).catch(err => {
      log.warn("atocore-capture:post_error", { error: String(err).slice(0, 200) });
    });
    });

export default atocoreCaptureHook;
    api.on("session_end", async () => {
      lastPrompt = null;
    });
  }
});

@@ -3,6 +3,11 @@ import { definePluginEntry } from "openclaw/plugin-sdk/core";
const DEFAULT_BASE_URL = process.env.ATOCORE_BASE_URL || "http://dalidou:8100";
const DEFAULT_MIN_PROMPT_LENGTH = 15;
const DEFAULT_MAX_RESPONSE_LENGTH = 50_000;
// Phase 7I — context injection: cap how much AtoCore context we stuff
// back into the prompt. The /context/build endpoint respects a budget
// parameter too, but we keep a client-side safety net.
const DEFAULT_CONTEXT_CHAR_BUDGET = 4_000;
const DEFAULT_INJECT_CONTEXT = true;

function trimText(value) {
  return typeof value === "string" ? value.trim() : "";
@@ -41,6 +46,37 @@ async function postInteraction(baseUrl, payload, logger) {
  }
}

// Phase 7I — fetch a context pack for the incoming prompt so the agent
// answers grounded in what AtoCore already knows. Fail-open: if the
// request times out or errors, we just don't inject; the agent runs as
// before. Never block the user's turn on AtoCore availability.
async function fetchContextPack(baseUrl, prompt, project, charBudget, logger) {
  try {
    const res = await fetch(`${baseUrl.replace(/\/$/, "")}/context/build`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        prompt,
        project: project || "",
        char_budget: charBudget
      }),
      signal: AbortSignal.timeout(5_000)
    });
    if (!res.ok) {
      logger?.debug?.("atocore_context_fetch_failed", { status: res.status });
      return null;
    }
    const data = await res.json();
    const pack = trimText(data?.formatted_context || "");
    return pack || null;
  } catch (error) {
    logger?.debug?.("atocore_context_fetch_error", {
      error: error instanceof Error ? error.message : String(error)
    });
    return null;
  }
}

export default definePluginEntry({
  register(api) {
    const logger = api.logger;
@@ -55,6 +91,28 @@ export default definePluginEntry({
      pendingBySession.delete(ctx.sessionId);
      return;
    }

    // Phase 7I — inject AtoCore context into the agent's prompt so it
    // answers grounded in what the brain already knows. Config-gated
    // (injectContext: false disables). Fail-open.
    const baseUrl = trimText(config.baseUrl) || DEFAULT_BASE_URL;
    const injectContext = config.injectContext !== false && DEFAULT_INJECT_CONTEXT;
    const charBudget = Number(config.contextCharBudget || DEFAULT_CONTEXT_CHAR_BUDGET);
    if (injectContext && event && typeof event === "object") {
      const pack = await fetchContextPack(baseUrl, prompt, "", charBudget, logger);
      if (pack) {
        // Prepend to the event's prompt so the agent sees grounded info
        // before the user's question. OpenClaw's agent receives
        // event.prompt as its primary input; modifying it here grounds
        // whatever LLM the agent delegates to (sonnet, opus, codex,
        // local model — doesn't matter).
        event.prompt = `${pack}\n\n---\n\n${prompt}`;
        logger?.debug?.("atocore_context_injected", { chars: pack.length });
      }
    }

    // Record the ORIGINAL user prompt (not the injected version) so
    // captured interactions stay clean for later extraction.
    pendingBySession.set(ctx.sessionId, {
      prompt,
      sessionId: ctx.sessionId,

@@ -1,7 +1,7 @@
{
  "name": "@atomaste/atocore-openclaw-capture",
  "private": true,
  "version": "0.0.0",
  "version": "0.2.0",
  "type": "module",
  "description": "OpenClaw plugin that captures assistant turns to AtoCore interactions"
  "description": "OpenClaw plugin: captures assistant turns to AtoCore interactions AND injects AtoCore context into agent prompts before they run (Phase 7I two-way bridge)"
}

@@ -17,6 +17,8 @@ dependencies = [
    "pydantic-settings>=2.1.0",
    "structlog>=24.1.0",
    "markdown>=3.5.0",
    "python-multipart>=0.0.9",
    "Pillow>=10.0.0",
]

[project.optional-dependencies]

@@ -7,3 +7,5 @@ pydantic>=2.6.0
pydantic-settings>=2.1.0
structlog>=24.1.0
markdown>=3.5.0
python-multipart>=0.0.9
Pillow>=10.0.0

914
scripts/atocore_mcp.py
Normal file
@@ -0,0 +1,914 @@
#!/usr/bin/env python3
|
||||
"""AtoCore MCP server — stdio transport, stdlib-only.
|
||||
|
||||
Exposes the AtoCore HTTP API as MCP tools so any MCP-aware client
|
||||
(Claude Desktop, Claude Code, Cursor, Zed, Windsurf) can pull
|
||||
context + memories automatically at prompt time.
|
||||
|
||||
Design:
|
||||
- stdlib only (no mcp SDK dep) — MCP protocol is simple JSON-RPC
|
||||
over stdio, and AtoCore's philosophy prefers stdlib.
|
||||
- Thin wrapper: every tool is a direct pass-through to an HTTP
|
||||
endpoint. Zero business logic here — the AtoCore server is
|
||||
the single source of truth.
|
||||
- Fail-open: if AtoCore is unreachable, tools return a graceful
|
||||
"unavailable" message rather than crashing the client.
|
||||
|
||||
Protocol: MCP 2024-11-05 / 2025-03-26 compatible
|
||||
https://spec.modelcontextprotocol.io/specification/
|
||||
|
||||
Usage (standalone test):
|
||||
echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"0"}}}' | python atocore_mcp.py
|
||||
|
||||
Register with Claude Code:
|
||||
claude mcp add atocore -- python /path/to/atocore_mcp.py
|
||||
|
||||
Environment:
|
||||
ATOCORE_URL base URL of the AtoCore HTTP API (default http://dalidou:8100)
|
||||
ATOCORE_TIMEOUT per-request HTTP timeout seconds (default 10)
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import os
|
||||
import sys
|
||||
import urllib.error
|
||||
import urllib.parse
|
||||
import urllib.request
|
||||
|
||||
# Force UTF-8 on stdio — MCP protocol expects UTF-8 but Windows Python
|
||||
# defaults stdout to cp1252, which crashes on any non-ASCII char (emojis,
|
||||
# ≥, →, etc.) in tool responses. This call is a no-op on Linux/macOS
|
||||
# where UTF-8 is already the default.
|
||||
try:
|
||||
sys.stdin.reconfigure(encoding="utf-8")
|
||||
sys.stdout.reconfigure(encoding="utf-8")
|
||||
sys.stderr.reconfigure(encoding="utf-8")
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# --- Configuration ---
|
||||
|
||||
ATOCORE_URL = os.environ.get("ATOCORE_URL", "http://dalidou:8100").rstrip("/")
|
||||
HTTP_TIMEOUT = float(os.environ.get("ATOCORE_TIMEOUT", "10"))
|
||||
SERVER_NAME = "atocore"
|
||||
SERVER_VERSION = "0.1.0"
|
||||
PROTOCOL_VERSION = "2024-11-05"
|
||||
|
||||
|
||||
# --- stderr logging (stdout is reserved for JSON-RPC) ---
|
||||
|
||||
def log(msg: str) -> None:
|
||||
print(f"[atocore-mcp] {msg}", file=sys.stderr, flush=True)
|
||||
|
||||
|
||||
# --- HTTP helpers ---
|
||||
|
||||
def http_get(path: str, params: dict | None = None) -> dict:
|
||||
"""GET a JSON response from AtoCore. Raises on HTTP error."""
|
||||
url = ATOCORE_URL + path
|
||||
if params:
|
||||
# Drop empty params so the URL stays clean
|
||||
clean = {k: v for k, v in params.items() if v not in (None, "", [], {})}
|
||||
if clean:
|
||||
url += "?" + urllib.parse.urlencode(clean)
|
||||
req = urllib.request.Request(url, headers={"Accept": "application/json"})
|
||||
with urllib.request.urlopen(req, timeout=HTTP_TIMEOUT) as resp:
|
||||
return json.loads(resp.read().decode("utf-8"))
|
||||
|
||||
|
||||
def http_post(path: str, body: dict) -> dict:
|
||||
url = ATOCORE_URL + path
|
||||
data = json.dumps(body).encode("utf-8")
|
||||
req = urllib.request.Request(
|
||||
url, data=data, method="POST",
|
||||
headers={"Content-Type": "application/json", "Accept": "application/json"},
|
||||
)
|
||||
with urllib.request.urlopen(req, timeout=HTTP_TIMEOUT) as resp:
|
||||
return json.loads(resp.read().decode("utf-8"))
|
||||
|
||||
|
||||
def safe_call(fn, *args, **kwargs) -> tuple[dict | None, str | None]:
|
||||
"""Run an HTTP call, return (result, error_message_or_None)."""
|
||||
try:
|
||||
return fn(*args, **kwargs), None
|
||||
except urllib.error.HTTPError as e:
|
||||
try:
|
||||
body = e.read().decode("utf-8", errors="replace")
|
||||
except Exception:
|
||||
body = ""
|
||||
return None, f"AtoCore HTTP {e.code}: {body[:200]}"
|
||||
except urllib.error.URLError as e:
|
||||
return None, f"AtoCore unreachable at {ATOCORE_URL}: {e.reason}"
|
||||
except Exception as e:
|
||||
return None, f"AtoCore error: {type(e).__name__}: {str(e)[:200]}"
|
||||
|
||||
|
||||
# --- Tool definitions ---
|
||||
# Each tool: name, description, inputSchema (JSON Schema), handler
|
||||
|
||||
def _tool_context(args: dict) -> str:
|
||||
"""Build a full context pack for a query — state + memories + retrieved chunks."""
|
||||
query = (args.get("query") or "").strip()
|
||||
project = args.get("project") or ""
|
||||
if not query:
|
||||
return "Error: 'query' is required."
|
||||
result, err = safe_call(http_post, "/context/build", {
|
||||
"prompt": query, "project": project,
|
||||
})
|
||||
if err:
|
||||
return f"AtoCore context unavailable: {err}"
|
||||
pack = result.get("formatted_context", "") or ""
|
||||
if not pack.strip():
|
||||
return "(AtoCore returned an empty context pack — no matching state, memories, or chunks.)"
|
||||
return pack
|
||||
|
||||
|
||||
def _tool_search(args: dict) -> str:
|
||||
"""Retrieval only — raw chunks ranked by semantic similarity."""
|
||||
query = (args.get("query") or "").strip()
|
||||
project = args.get("project") or ""
|
||||
top_k = int(args.get("top_k") or 5)
|
||||
if not query:
|
||||
return "Error: 'query' is required."
|
||||
result, err = safe_call(http_post, "/query", {
|
||||
"prompt": query, "project": project, "top_k": top_k,
|
||||
})
|
||||
if err:
|
||||
return f"AtoCore search unavailable: {err}"
|
||||
chunks = result.get("results", []) or []
|
||||
if not chunks:
|
||||
return "No results."
|
||||
lines = []
|
||||
for i, c in enumerate(chunks, 1):
|
||||
src = c.get("source_file") or c.get("title") or "unknown"
|
||||
heading = c.get("heading_path") or ""
|
||||
snippet = (c.get("content") or "")[:300]
|
||||
score = c.get("score", 0.0)
|
||||
head_str = f" ({heading})" if heading else ""
|
||||
lines.append(f"[{i}] score={score:.3f} source={src}{head_str}\n{snippet}")
|
||||
return "\n\n".join(lines)
|
||||
|
||||
|
||||
def _tool_memory_list(args: dict) -> str:
|
||||
"""List active memories, optionally filtered by project and type."""
|
||||
params = {
|
||||
"status": "active",
|
||||
"limit": int(args.get("limit") or 20),
|
||||
}
|
||||
if args.get("project"):
|
||||
params["project"] = args["project"]
|
||||
if args.get("memory_type"):
|
||||
params["memory_type"] = args["memory_type"]
|
||||
result, err = safe_call(http_get, "/memory", params=params)
|
||||
if err:
|
||||
return f"AtoCore memory list unavailable: {err}"
|
||||
memories = result.get("memories", []) or []
|
||||
if not memories:
|
||||
return "No memories match."
|
||||
lines = []
|
||||
for m in memories:
|
||||
mt = m.get("memory_type", "?")
|
||||
proj = m.get("project") or "(global)"
|
||||
conf = m.get("confidence", 0.0)
|
||||
refs = m.get("reference_count", 0)
|
||||
content = (m.get("content") or "")[:250]
|
||||
lines.append(f"[{mt}/{proj}] conf={conf:.2f} refs={refs}\n {content}")
|
||||
return "\n\n".join(lines)
|
||||
|
||||
|
||||
def _tool_memory_create(args: dict) -> str:
|
||||
"""Create a candidate memory (enters the triage queue)."""
|
||||
memory_type = (args.get("memory_type") or "").strip()
|
||||
content = (args.get("content") or "").strip()
|
||||
project = args.get("project") or ""
|
||||
confidence = float(args.get("confidence") or 0.5)
|
||||
if not memory_type or not content:
|
||||
return "Error: 'memory_type' and 'content' are required."
|
||||
valid_types = ["identity", "preference", "project", "episodic", "knowledge", "adaptation"]
|
||||
if memory_type not in valid_types:
|
||||
return f"Error: memory_type must be one of {valid_types}."
|
||||
result, err = safe_call(http_post, "/memory", {
|
||||
"memory_type": memory_type,
|
||||
"content": content,
|
||||
"project": project,
|
||||
"confidence": confidence,
|
||||
"status": "candidate",
|
||||
})
|
||||
if err:
|
||||
return f"AtoCore memory create failed: {err}"
|
||||
mid = result.get("id", "?")
|
||||
return f"Candidate memory created: id={mid} type={memory_type} project={project or '(global)'}"
|
||||
|
||||
|
||||
def _tool_project_state(args: dict) -> str:
|
||||
"""Get Trusted Project State entries for a project."""
|
||||
project = (args.get("project") or "").strip()
|
||||
category = args.get("category") or ""
|
||||
if not project:
|
||||
return "Error: 'project' is required."
|
||||
path = f"/project/state/{urllib.parse.quote(project)}"
|
||||
params = {"category": category} if category else None
|
||||
result, err = safe_call(http_get, path, params=params)
|
||||
if err:
|
||||
return f"AtoCore project state unavailable: {err}"
|
||||
entries = result.get("entries", []) or result.get("state", []) or []
|
||||
if not entries:
|
||||
return f"No state entries for project '{project}'."
|
||||
lines = []
|
||||
for e in entries:
|
||||
cat = e.get("category", "?")
|
||||
key = e.get("key", "?")
|
||||
value = (e.get("value") or "")[:300]
|
||||
src = e.get("source") or ""
|
||||
lines.append(f"[{cat}/{key}] (source: {src})\n {value}")
|
||||
return "\n\n".join(lines)
|
||||
|
||||
|
||||
def _tool_projects(args: dict) -> str:
|
||||
"""List registered AtoCore projects."""
|
||||
result, err = safe_call(http_get, "/projects")
|
||||
if err:
|
||||
return f"AtoCore projects unavailable: {err}"
|
||||
projects = result.get("projects", []) or []
|
||||
if not projects:
|
||||
return "No projects registered."
|
||||
lines = []
|
||||
for p in projects:
|
||||
pid = p.get("project_id") or p.get("id") or p.get("name") or "?"
|
||||
aliases = p.get("aliases", []) or []
|
||||
alias_str = f" (aliases: {', '.join(aliases)})" if aliases else ""
|
||||
lines.append(f"- {pid}{alias_str}")
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
def _tool_remember(args: dict) -> str:
    """Phase 6 Part B — universal capture from any Claude session.

    Wraps POST /memory to create a candidate memory tagged with
    source='mcp-remember'. The existing 3-tier triage is the quality
    gate: nothing becomes active until sonnet (+ opus if borderline)
    approves it. Returns the memory id so the caller can reference it
    in the same session.
    """
    content = (args.get("content") or "").strip()
    if not content:
        return "Error: 'content' is required."

    memory_type = (args.get("memory_type") or "knowledge").strip()
    valid_types = ["identity", "preference", "project", "episodic", "knowledge", "adaptation"]
    if memory_type not in valid_types:
        return f"Error: memory_type must be one of {valid_types}."

    project = (args.get("project") or "").strip()
    try:
        confidence = float(args.get("confidence") or 0.6)
    except (TypeError, ValueError):
        confidence = 0.6
    confidence = max(0.0, min(1.0, confidence))

    valid_until = (args.get("valid_until") or "").strip()
    tags = args.get("domain_tags") or []
    if not isinstance(tags, list):
        tags = []
    # Normalize tags: lowercase, dedupe, cap at 10
    clean_tags: list[str] = []
    for t in tags[:10]:
        if not isinstance(t, str):
            continue
        t = t.strip().lower()
        if t and t not in clean_tags:
            clean_tags.append(t)

    payload = {
        "memory_type": memory_type,
        "content": content,
        "project": project,
        "confidence": confidence,
        "status": "candidate",
    }
    if valid_until:
        payload["valid_until"] = valid_until
    if clean_tags:
        payload["domain_tags"] = clean_tags

    result, err = safe_call(http_post, "/memory", payload)
    if err:
        return f"AtoCore remember failed: {err}"

    mid = result.get("id", "?")
    scope = project if project else "(global)"
    tag_str = f" tags=[{', '.join(clean_tags)}]" if clean_tags else ""
    expires = f" valid_until={valid_until}" if valid_until else ""
    return (
        f"Remembered as candidate: id={mid}\n"
        f" type={memory_type} project={scope} confidence={confidence:.2f}{tag_str}{expires}\n"
        f"Will flow through the standard triage pipeline within 24h "
        f"(or on next auto-process button click at /admin/triage)."
    )


def _tool_health(args: dict) -> str:
    """Check AtoCore service health."""
    result, err = safe_call(http_get, "/health")
    if err:
        return f"AtoCore unreachable: {err}"
    sha = result.get("build_sha", "?")[:8]
    vectors = result.get("vectors_count", "?")
    env = result.get("env", "?")
    return f"AtoCore healthy: sha={sha} vectors={vectors} env={env}"


# --- Phase 5H: Engineering query tools ---


def _tool_system_map(args: dict) -> str:
    """Q-001 + Q-004: subsystem/component tree for a project."""
    project = (args.get("project") or "").strip()
    if not project:
        return "Error: 'project' is required."
    result, err = safe_call(
        http_get, f"/engineering/projects/{urllib.parse.quote(project)}/systems"
    )
    if err:
        return f"Engineering query failed: {err}"
    subs = result.get("subsystems", []) or []
    orphans = result.get("orphan_components", []) or []
    if not subs and not orphans:
        return f"No subsystems or components registered for {project}."
    lines = [f"System map for {project}:"]
    for s in subs:
        lines.append(f"\n[{s['name']}] — {s.get('description') or '(no description)'}")
        for c in s.get("components", []):
            mats = ", ".join(c.get("materials", [])) or "-"
            lines.append(f" • {c['name']} (materials: {mats})")
    if orphans:
        lines.append("\nOrphan components (not attached to any subsystem):")
        for c in orphans:
            lines.append(f" • {c['name']}")
    return "\n".join(lines)


def _tool_gaps(args: dict) -> str:
    """Q-006 + Q-009 + Q-011: find coverage gaps. Director's most-used query."""
    project = (args.get("project") or "").strip()
    if not project:
        return "Error: 'project' is required."
    result, err = safe_call(
        http_get, "/engineering/gaps",
        params={"project": project},
    )
    if err:
        return f"Gap query failed: {err}"

    orphan = result.get("orphan_requirements", {})
    risky = result.get("risky_decisions", {})
    unsup = result.get("unsupported_claims", {})

    counts = f"{orphan.get('count',0)}/{risky.get('count',0)}/{unsup.get('count',0)}"
    lines = [f"Coverage gaps for {project} (orphan reqs / risky decisions / unsupported claims: {counts}):\n"]

    if orphan.get("count", 0):
        lines.append(f"ORPHAN REQUIREMENTS ({orphan['count']}) — no component claims to satisfy:")
        for g in orphan.get("gaps", [])[:10]:
            lines.append(f" • {g['name']}: {(g.get('description') or '')[:120]}")
        lines.append("")
    if risky.get("count", 0):
        lines.append(f"RISKY DECISIONS ({risky['count']}) — based on flagged assumptions:")
        for g in risky.get("gaps", [])[:10]:
            lines.append(f" • {g['decision_name']} (assumption: {g['assumption_name']} — {g['assumption_status']})")
        lines.append("")
    if unsup.get("count", 0):
        lines.append(f"UNSUPPORTED CLAIMS ({unsup['count']}) — no Result entity backs them:")
        for g in unsup.get("gaps", [])[:10]:
            lines.append(f" • {g['name']}: {(g.get('description') or '')[:120]}")

    if orphan.get("count", 0) == 0 and risky.get("count", 0) == 0 and unsup.get("count", 0) == 0:
        lines.append("✓ No gaps detected — every requirement satisfied, no flagged assumptions, all claims have evidence.")

    return "\n".join(lines)


def _tool_requirements_for(args: dict) -> str:
    """Q-005: requirements that a component satisfies."""
    component_id = (args.get("component_id") or "").strip()
    if not component_id:
        return "Error: 'component_id' is required."
    result, err = safe_call(
        http_get, f"/engineering/components/{urllib.parse.quote(component_id)}/requirements"
    )
    if err:
        return f"Query failed: {err}"
    reqs = result.get("requirements", []) or []
    if not reqs:
        return "No requirements associated with this component."
    lines = [f"Component satisfies {len(reqs)} requirement(s):"]
    for r in reqs:
        lines.append(f" • {r['name']}: {(r.get('description') or '')[:150]}")
    return "\n".join(lines)


def _tool_decisions_affecting(args: dict) -> str:
    """Q-008: decisions affecting a project or subsystem."""
    project = (args.get("project") or "").strip()
    subsystem = args.get("subsystem_id") or args.get("subsystem") or ""
    if not project:
        return "Error: 'project' is required."
    params = {"project": project}
    if subsystem:
        params["subsystem"] = subsystem
    result, err = safe_call(http_get, "/engineering/decisions", params=params)
    if err:
        return f"Query failed: {err}"
    decisions = result.get("decisions", []) or []
    if not decisions:
        scope = f"subsystem {subsystem}" if subsystem else f"project {project}"
        return f"No decisions recorded for {scope}."
    scope = f"subsystem {subsystem}" if subsystem else project
    lines = [f"{len(decisions)} decision(s) affecting {scope}:"]
    for d in decisions:
        lines.append(f" • {d['name']}: {(d.get('description') or '')[:150]}")
    return "\n".join(lines)


def _tool_recent_changes(args: dict) -> str:
    """Q-013: what changed recently in the engineering graph."""
    project = (args.get("project") or "").strip()
    since = args.get("since") or ""
    limit = int(args.get("limit") or 20)
    if not project:
        return "Error: 'project' is required."
    params = {"project": project, "limit": limit}
    if since:
        params["since"] = since
    result, err = safe_call(http_get, "/engineering/changes", params=params)
    if err:
        return f"Query failed: {err}"
    changes = result.get("changes", []) or []
    if not changes:
        return f"No entity changes in {project} since {since or '(all time)'}."
    lines = [f"Recent changes in {project} ({len(changes)}):"]
    for c in changes:
        lines.append(
            f" [{c['timestamp'][:16]}] {c['action']:10s} "
            f"[{c.get('entity_type','?')}] {c.get('entity_name','?')} "
            f"by {c.get('actor','?')}"
        )
    return "\n".join(lines)


def _tool_impact(args: dict) -> str:
    """Q-016: impact of changing an entity (downstream BFS)."""
    entity = (args.get("entity_id") or args.get("entity") or "").strip()
    if not entity:
        return "Error: 'entity_id' is required."
    max_depth = int(args.get("max_depth") or 3)
    result, err = safe_call(
        http_get, "/engineering/impact",
        params={"entity": entity, "max_depth": max_depth},
    )
    if err:
        return f"Query failed: {err}"
    root = result.get("root") or {}
    impacted = result.get("impacted", []) or []
    if not impacted:
        return f"Nothing downstream of [{root.get('entity_type','?')}] {root.get('name','?')}."
    lines = [
        f"Changing [{root.get('entity_type')}] {root.get('name')} "
        f"would affect {len(impacted)} entity(ies) (max depth {max_depth}):"
    ]
    for i in impacted[:25]:
        indent = " " * i.get("depth", 1)
        lines.append(f"{indent}→ [{i['entity_type']}] {i['name']} (via {i['relationship']})")
    if len(impacted) > 25:
        lines.append(f" ... and {len(impacted)-25} more")
    return "\n".join(lines)


def _tool_evidence(args: dict) -> str:
    """Q-017: evidence chain for an entity."""
    entity = (args.get("entity_id") or args.get("entity") or "").strip()
    if not entity:
        return "Error: 'entity_id' is required."
    result, err = safe_call(http_get, "/engineering/evidence", params={"entity": entity})
    if err:
        return f"Query failed: {err}"
    root = result.get("root") or {}
    chain = result.get("evidence_chain", []) or []
    lines = [f"Evidence for [{root.get('entity_type','?')}] {root.get('name','?')}:"]
    if not chain:
        lines.append(" (no inbound provenance edges)")
    else:
        for e in chain:
            lines.append(
                f" {e['via']} ← [{e['source_type']}] {e['source_name']}: "
                f"{(e.get('source_description') or '')[:100]}"
            )
    refs = result.get("direct_source_refs") or []
    if refs:
        lines.append(f"\nDirect source_refs: {refs[:5]}")
    return "\n".join(lines)


TOOLS = [
    {
        "name": "atocore_context",
        "description": (
            "Get the full AtoCore context pack for a user query. Returns "
            "Trusted Project State (high trust), relevant memories, and "
            "retrieved source chunks formatted for prompt injection. "
            "Use this FIRST on any project-related query to ground the "
            "conversation in what AtoCore already knows."
        ),
        "inputSchema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "The user's question or task"},
                "project": {"type": "string", "description": "Project hint (e.g. 'p04-gigabit'); optional"},
            },
            "required": ["query"],
        },
        "handler": _tool_context,
    },
    {
        "name": "atocore_search",
        "description": (
            "Semantic search over AtoCore's ingested source documents. "
            "Returns top-K ranked chunks. Use this when you need raw "
            "references rather than a full context pack."
        ),
        "inputSchema": {
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                "project": {"type": "string", "description": "optional project filter"},
                "top_k": {"type": "integer", "minimum": 1, "maximum": 20, "default": 5},
            },
            "required": ["query"],
        },
        "handler": _tool_search,
    },
    {
        "name": "atocore_memory_list",
        "description": (
            "List active memories (curated facts, decisions, preferences). "
            "Filter by project and/or memory_type. Use this to inspect what "
            "AtoCore currently remembers about a topic."
        ),
        "inputSchema": {
            "type": "object",
            "properties": {
                "project": {"type": "string"},
                "memory_type": {
                    "type": "string",
                    "enum": ["identity", "preference", "project", "episodic", "knowledge", "adaptation"],
                },
                "limit": {"type": "integer", "minimum": 1, "maximum": 100, "default": 20},
            },
        },
        "handler": _tool_memory_list,
    },
    {
        "name": "atocore_memory_create",
        "description": (
            "Propose a new memory for AtoCore. Creates a CANDIDATE that "
            "enters the triage queue for human/auto review — not immediately "
            "active. Use this to capture durable facts/decisions that "
            "should persist across sessions. Do NOT use for transient state "
            "or session-specific notes."
        ),
        "inputSchema": {
            "type": "object",
            "properties": {
                "memory_type": {
                    "type": "string",
                    "enum": ["identity", "preference", "project", "episodic", "knowledge", "adaptation"],
                },
                "content": {"type": "string", "description": "The fact/decision/preference to remember"},
                "project": {"type": "string", "description": "project id if project-scoped; empty for global"},
                "confidence": {"type": "number", "minimum": 0, "maximum": 1, "default": 0.5},
            },
            "required": ["memory_type", "content"],
        },
        "handler": _tool_memory_create,
    },
    {
        "name": "atocore_remember",
        "description": (
            "Save a durable fact to AtoCore's memory layer from any conversation. "
            "Use when the user says 'remember this', 'save that for later', "
            "'don't lose this fact', or when you identify a decision/insight/"
            "preference worth persisting across future sessions. The fact "
            "goes through quality review before being consulted in future "
            "context packs (so durable facts get kept, noise gets rejected). "
            "Call multiple times if one conversation has multiple distinct "
            "facts worth remembering — one tool call per atomic fact. "
            "Prefer 'knowledge' type for cross-project engineering insights, "
            "'project' for facts specific to one project, 'preference' for "
            "user work-style notes, 'adaptation' for standing behavioral rules."
        ),
        "inputSchema": {
            "type": "object",
            "properties": {
                "content": {
                    "type": "string",
                    "description": "The atomic fact to remember. Under 250 chars. Should stand alone without session context.",
                },
                "memory_type": {
                    "type": "string",
                    "enum": ["identity", "preference", "project", "episodic", "knowledge", "adaptation"],
                    "default": "knowledge",
                },
                "project": {
                    "type": "string",
                    "description": "Project id if scoped. Empty for cross-project. Unregistered names flagged by triage as 'emerging project' proposals.",
                },
                "confidence": {
                    "type": "number",
                    "minimum": 0,
                    "maximum": 1,
                    "default": 0.6,
                    "description": "0.5-0.7 typical. 0.8+ only for ratified/committed claims.",
                },
                "valid_until": {
                    "type": "string",
                    "description": "ISO date YYYY-MM-DD if time-bounded (e.g. current state, scheduled event, quote expiry). Empty for permanent facts.",
                },
                "domain_tags": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "Lowercase topical tags (optics, thermal, firmware, procurement, etc.) for cross-project retrieval. 2-5 tags typical.",
                },
            },
            "required": ["content"],
        },
        "handler": _tool_remember,
    },
    {
        "name": "atocore_project_state",
        "description": (
            "Get Trusted Project State entries for a given project — the "
            "highest-trust tier with curated decisions, requirements, "
            "facts, contacts, milestones. Use this to look up authoritative "
            "project info."
        ),
        "inputSchema": {
            "type": "object",
            "properties": {
                "project": {"type": "string"},
                "category": {
                    "type": "string",
                    "enum": ["status", "decision", "requirement", "contact", "milestone", "fact", "config"],
                },
            },
            "required": ["project"],
        },
        "handler": _tool_project_state,
    },
    {
        "name": "atocore_projects",
        "description": "List all registered AtoCore projects (id + aliases).",
        "inputSchema": {"type": "object", "properties": {}},
        "handler": _tool_projects,
    },
    {
        "name": "atocore_health",
        "description": "Check AtoCore service health (build SHA, vector count, env).",
        "inputSchema": {"type": "object", "properties": {}},
        "handler": _tool_health,
    },
    # --- Phase 5H: Engineering knowledge graph tools ---
    {
        "name": "atocore_engineering_map",
        "description": (
            "Get the subsystem/component tree for an engineering project. "
            "Returns the full system architecture: subsystems, their components, "
            "materials, and any orphan components not attached to a subsystem. "
            "Use when the user asks about project structure or system design."
        ),
        "inputSchema": {
            "type": "object",
            "properties": {
                "project": {"type": "string", "description": "Project id (e.g. p04-gigabit)"},
            },
            "required": ["project"],
        },
        "handler": _tool_system_map,
    },
    {
        "name": "atocore_engineering_gaps",
        "description": (
            "Find coverage gaps in a project's engineering graph: orphan "
            "requirements (no component satisfies them), risky decisions "
            "(based on flagged assumptions), and unsupported claims (no "
            "Result evidence). This is the director's most useful query — "
            "answers 'what am I forgetting?' in seconds."
        ),
        "inputSchema": {
            "type": "object",
            "properties": {
                "project": {"type": "string"},
            },
            "required": ["project"],
        },
        "handler": _tool_gaps,
    },
    {
        "name": "atocore_engineering_requirements_for_component",
        "description": "List the requirements a specific component claims to satisfy (Q-005).",
        "inputSchema": {
            "type": "object",
            "properties": {
                "component_id": {"type": "string"},
            },
            "required": ["component_id"],
        },
        "handler": _tool_requirements_for,
    },
    {
        "name": "atocore_engineering_decisions",
        "description": (
            "Decisions that affect a project, optionally scoped to a specific "
            "subsystem. Use when the user asks 'what did we decide about X?'"
        ),
        "inputSchema": {
            "type": "object",
            "properties": {
                "project": {"type": "string"},
                "subsystem_id": {"type": "string", "description": "optional subsystem entity id"},
            },
            "required": ["project"],
        },
        "handler": _tool_decisions_affecting,
    },
    {
        "name": "atocore_engineering_changes",
        "description": (
            "Recent changes to the engineering graph for a project: which "
            "entities were created/promoted/rejected/updated, by whom, when. "
            "Use for 'what changed recently?' type questions."
        ),
        "inputSchema": {
            "type": "object",
            "properties": {
                "project": {"type": "string"},
                "since": {"type": "string", "description": "ISO timestamp; optional"},
                "limit": {"type": "integer", "minimum": 1, "maximum": 200, "default": 20},
            },
            "required": ["project"],
        },
        "handler": _tool_recent_changes,
    },
    {
        "name": "atocore_engineering_impact",
        "description": (
            "Impact analysis: what's downstream of a given entity. BFS over "
            "outbound relationships up to max_depth. Use to answer 'what would "
            "break if I change X?'"
        ),
        "inputSchema": {
            "type": "object",
            "properties": {
                "entity_id": {"type": "string"},
                "max_depth": {"type": "integer", "minimum": 1, "maximum": 5, "default": 3},
            },
            "required": ["entity_id"],
        },
        "handler": _tool_impact,
    },
    {
        "name": "atocore_engineering_evidence",
        "description": (
            "Evidence chain for an entity: what supports it? Walks inbound "
            "SUPPORTS / EVIDENCED_BY / DESCRIBED_BY / VALIDATED_BY / ANALYZED_BY "
            "edges. Use for 'how do we know X is true?' type questions."
        ),
        "inputSchema": {
            "type": "object",
            "properties": {
                "entity_id": {"type": "string"},
            },
            "required": ["entity_id"],
        },
        "handler": _tool_evidence,
    },
]


# --- JSON-RPC handlers ---

def handle_initialize(params: dict) -> dict:
    return {
        "protocolVersion": PROTOCOL_VERSION,
        "capabilities": {
            "tools": {"listChanged": False},
        },
        "serverInfo": {"name": SERVER_NAME, "version": SERVER_VERSION},
    }


def handle_tools_list(params: dict) -> dict:
    return {
        "tools": [
            {"name": t["name"], "description": t["description"], "inputSchema": t["inputSchema"]}
            for t in TOOLS
        ]
    }


def handle_tools_call(params: dict) -> dict:
    tool_name = params.get("name", "")
    args = params.get("arguments", {}) or {}
    tool = next((t for t in TOOLS if t["name"] == tool_name), None)
    if tool is None:
        return {
            "content": [{"type": "text", "text": f"Unknown tool: {tool_name}"}],
            "isError": True,
        }
    try:
        text = tool["handler"](args)
    except Exception as e:
        log(f"tool {tool_name} raised: {e}")
        return {
            "content": [{"type": "text", "text": f"Tool error: {type(e).__name__}: {e}"}],
            "isError": True,
        }
    return {"content": [{"type": "text", "text": text}]}


def handle_ping(params: dict) -> dict:
    return {}


METHODS = {
    "initialize": handle_initialize,
    "tools/list": handle_tools_list,
    "tools/call": handle_tools_call,
    "ping": handle_ping,
}


# --- stdio main loop ---

def send(obj: dict) -> None:
    """Write a single-line JSON message to stdout and flush."""
    sys.stdout.write(json.dumps(obj, ensure_ascii=False) + "\n")
    sys.stdout.flush()


def make_response(req_id, result=None, error=None) -> dict:
    resp = {"jsonrpc": "2.0", "id": req_id}
    if error is not None:
        resp["error"] = error
    else:
        resp["result"] = result if result is not None else {}
    return resp


def main() -> int:
    log(f"starting (AtoCore at {ATOCORE_URL})")
    for line in sys.stdin:
        line = line.strip()
        if not line:
            continue
        try:
            msg = json.loads(line)
        except json.JSONDecodeError as e:
            log(f"parse error: {e}")
            continue

        method = msg.get("method", "")
        req_id = msg.get("id")
        params = msg.get("params", {}) or {}

        # Notifications (no id) don't need a response
        if req_id is None:
            if method == "notifications/initialized":
                log("client initialized")
            continue

        handler = METHODS.get(method)
        if handler is None:
            send(make_response(req_id, error={
                "code": -32601,
                "message": f"Method not found: {method}",
            }))
            continue

        try:
            result = handler(params)
            send(make_response(req_id, result=result))
        except Exception as e:
            log(f"handler {method} raised: {e}")
            send(make_response(req_id, error={
                "code": -32603,
                "message": f"Internal error: {type(e).__name__}: {e}",
            }))

    log("stdin closed, exiting")
    return 0


if __name__ == "__main__":
    sys.exit(main())
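The server above frames its protocol as one JSON-RPC object per line over stdio. A minimal standalone sketch of that framing, reusing the same `make_response` shape as the server's helper (everything here is illustrative, not part of the diff):

```python
import json

def make_response(req_id, result=None, error=None) -> dict:
    # Same shape as the server's helper: "error" and "result" are mutually exclusive.
    resp = {"jsonrpc": "2.0", "id": req_id}
    if error is not None:
        resp["error"] = error
    else:
        resp["result"] = result if result is not None else {}
    return resp

# One request per line in, one response per line out.
request_line = json.dumps({"jsonrpc": "2.0", "id": 1, "method": "ping", "params": {}})
msg = json.loads(request_line)
reply = make_response(msg["id"], result={})
print(json.dumps(reply))  # {"jsonrpc": "2.0", "id": 1, "result": {}}
```

Because every message is a single line, the main loop never needs a streaming JSON parser; a malformed line is logged and skipped without desynchronizing the channel.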
321	scripts/atocore_proxy.py	Normal file
@@ -0,0 +1,321 @@
#!/usr/bin/env python3
"""AtoCore Proxy — OpenAI-compatible HTTP middleware.

Acts as a drop-in layer for any client that speaks the OpenAI Chat
Completions API (Codex, Ollama, LiteLLM, custom agents). Sits between
the client and the real model provider:

    client -> atocore_proxy -> real_provider (OpenAI, Ollama, Anthropic, ...)

For each chat completion request:
  1. Extract the user's last message as the "query"
  2. Call AtoCore /context/build to get a context pack
  3. Inject the pack as a system message (or prepend to existing system)
  4. Forward the enriched request to the real provider
  5. Capture the full interaction back to AtoCore /interactions

Fail-open: if AtoCore is unreachable, the request passes through
unchanged. If the real provider fails, the error is propagated to the
client as-is.

Configuration (env vars):
  ATOCORE_URL           AtoCore base URL (default http://dalidou:8100)
  ATOCORE_UPSTREAM      real provider base URL (e.g. http://localhost:11434/v1 for Ollama)
  ATOCORE_PROXY_PORT    port to listen on (default 11435)
  ATOCORE_PROXY_HOST    bind address (default 127.0.0.1)
  ATOCORE_CLIENT_LABEL  client id recorded in captures (default "proxy")
  ATOCORE_CAPTURE       "1" to capture interactions back (default "1")
  ATOCORE_INJECT        "1" to inject context (default "1")

Usage:
    # Proxy for Ollama:
    ATOCORE_UPSTREAM=http://localhost:11434/v1 python atocore_proxy.py

    # Then point your client at http://localhost:11435/v1 instead of the
    # real provider.

Stdlib only — deliberate to keep the dependency footprint at zero.
"""

from __future__ import annotations

import http.server
import json
import os
import socketserver
import sys
import threading
import urllib.error
import urllib.parse
import urllib.request
from typing import Any

ATOCORE_URL = os.environ.get("ATOCORE_URL", "http://dalidou:8100").rstrip("/")
UPSTREAM_URL = os.environ.get("ATOCORE_UPSTREAM", "").rstrip("/")
PROXY_PORT = int(os.environ.get("ATOCORE_PROXY_PORT", "11435"))
PROXY_HOST = os.environ.get("ATOCORE_PROXY_HOST", "127.0.0.1")
CLIENT_LABEL = os.environ.get("ATOCORE_CLIENT_LABEL", "proxy")
CAPTURE_ENABLED = os.environ.get("ATOCORE_CAPTURE", "1") == "1"
INJECT_ENABLED = os.environ.get("ATOCORE_INJECT", "1") == "1"
ATOCORE_TIMEOUT = float(os.environ.get("ATOCORE_TIMEOUT", "6"))
UPSTREAM_TIMEOUT = float(os.environ.get("ATOCORE_UPSTREAM_TIMEOUT", "300"))

PROJECT_HINTS = [
    ("p04-gigabit", ["p04", "gigabit"]),
    ("p05-interferometer", ["p05", "interferometer"]),
    ("p06-polisher", ["p06", "polisher", "fullum"]),
    ("abb-space", ["abb"]),
    ("atomizer-v2", ["atomizer"]),
    ("atocore", ["atocore", "dalidou"]),
]


def log(msg: str) -> None:
    print(f"[atocore-proxy] {msg}", file=sys.stderr, flush=True)


def detect_project(text: str) -> str:
    lower = (text or "").lower()
    for proj, tokens in PROJECT_HINTS:
        if any(t in lower for t in tokens):
            return proj
    return ""


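The project-routing rule is simple substring matching against the hint table, first match wins. A self-contained sketch with a trimmed-down hint table (the two entries shown are taken from the file; the example queries are hypothetical):

```python
PROJECT_HINTS = [
    ("p04-gigabit", ["p04", "gigabit"]),
    ("atocore", ["atocore", "dalidou"]),
]

def detect_project(text: str) -> str:
    # First hint list with any token present wins, so list order encodes priority.
    lower = (text or "").lower()
    for proj, tokens in PROJECT_HINTS:
        if any(t in lower for t in tokens):
            return proj
    return ""

print(detect_project("What is the Gigabit link budget?"))  # p04-gigabit
print(detect_project("unrelated question"))                # empty string -> no project filter
```

Matching is case-insensitive and tolerates `None` input; an empty return means the context pack is built without a project filter.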
def get_last_user_message(body: dict) -> str:
    messages = body.get("messages", []) or []
    for m in reversed(messages):
        if m.get("role") == "user":
            content = m.get("content", "")
            if isinstance(content, list):
                # OpenAI multi-part content: extract text parts
                parts = [p.get("text", "") for p in content if p.get("type") == "text"]
                return "\n".join(parts)
            return str(content)
    return ""


def get_assistant_text(response: dict) -> str:
    """Extract assistant text from an OpenAI-style completion response."""
    choices = response.get("choices", []) or []
    if not choices:
        return ""
    msg = choices[0].get("message", {}) or {}
    content = msg.get("content", "")
    if isinstance(content, list):
        parts = [p.get("text", "") for p in content if p.get("type") == "text"]
        return "\n".join(parts)
    return str(content)


def fetch_context(query: str, project: str) -> str:
    """Pull a context pack from AtoCore. Returns '' on any failure."""
    if not INJECT_ENABLED or not query:
        return ""
    try:
        data = json.dumps({"prompt": query, "project": project}).encode("utf-8")
        req = urllib.request.Request(
            ATOCORE_URL + "/context/build",
            data=data,
            method="POST",
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req, timeout=ATOCORE_TIMEOUT) as resp:
            result = json.loads(resp.read().decode("utf-8"))
            return result.get("formatted_context", "") or ""
    except Exception as e:
        log(f"context fetch failed: {type(e).__name__}: {e}")
        return ""


def capture_interaction(prompt: str, response: str, project: str) -> None:
    """POST the completed turn back to AtoCore. Fire-and-forget."""
    if not CAPTURE_ENABLED or not prompt or not response:
        return

    def _post():
        try:
            data = json.dumps({
                "prompt": prompt,
                "response": response,
                "client": CLIENT_LABEL,
                "project": project,
                "reinforce": True,
            }).encode("utf-8")
            req = urllib.request.Request(
                ATOCORE_URL + "/interactions",
                data=data,
                method="POST",
                headers={"Content-Type": "application/json"},
            )
            urllib.request.urlopen(req, timeout=ATOCORE_TIMEOUT)
        except Exception as e:
            log(f"capture failed: {type(e).__name__}: {e}")

    threading.Thread(target=_post, daemon=True).start()


def inject_context(body: dict, context_pack: str) -> dict:
    """Prepend the AtoCore context as a system message, or augment existing."""
    if not context_pack.strip():
        return body
    header = "--- AtoCore Context (auto-injected) ---\n"
    footer = "\n--- End AtoCore Context ---\n"
    injection = header + context_pack + footer

    messages = list(body.get("messages", []) or [])
    if messages and messages[0].get("role") == "system":
        # Augment existing system message
        existing = messages[0].get("content", "") or ""
        if isinstance(existing, list):
            # multi-part: prepend a text part
            messages[0]["content"] = [{"type": "text", "text": injection}] + existing
        else:
            messages[0]["content"] = injection + "\n" + str(existing)
    else:
        messages.insert(0, {"role": "system", "content": injection})

    body["messages"] = messages
    return body


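The injection rule can be exercised in isolation. A minimal restatement covering the string-content path (the multi-part list branch is omitted; the request body and context string are made up for the demo):

```python
def inject_context(body: dict, context_pack: str) -> dict:
    # Restates the proxy's rule: prepend a system message,
    # or prefix an existing string-valued system message.
    if not context_pack.strip():
        return body
    injection = ("--- AtoCore Context (auto-injected) ---\n"
                 + context_pack
                 + "\n--- End AtoCore Context ---\n")
    messages = list(body.get("messages", []) or [])
    if messages and messages[0].get("role") == "system":
        messages[0]["content"] = injection + "\n" + str(messages[0].get("content", "") or "")
    else:
        messages.insert(0, {"role": "system", "content": injection})
    body["messages"] = messages
    return body

body = {"messages": [{"role": "user", "content": "status of p04?"}]}
out = inject_context(body, "TRUSTED STATE: milestone M2 complete")
print(out["messages"][0]["role"])  # system
```

With no system message present, the pack becomes a new first message; an empty pack leaves the body untouched, which is what makes the fail-open path a no-op.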
def forward_to_upstream(body: dict, headers: dict[str, str], path: str) -> tuple[int, dict]:
|
||||
"""Forward the enriched body to the upstream provider. Returns (status, response_dict)."""
|
||||
if not UPSTREAM_URL:
|
||||
return 503, {"error": {"message": "ATOCORE_UPSTREAM not configured"}}
|
||||
url = UPSTREAM_URL + path
|
||||
data = json.dumps(body).encode("utf-8")
|
||||
# Strip hop-by-hop / host-specific headers
|
||||
fwd_headers = {"Content-Type": "application/json"}
|
||||
for k, v in headers.items():
|
||||
lk = k.lower()
|
||||
if lk in ("authorization", "x-api-key", "anthropic-version"):
|
||||
fwd_headers[k] = v
|
||||
req = urllib.request.Request(url, data=data, method="POST", headers=fwd_headers)
|
||||
try:
|
||||
with urllib.request.urlopen(req, timeout=UPSTREAM_TIMEOUT) as resp:
|
||||
return resp.status, json.loads(resp.read().decode("utf-8"))
|
||||
except urllib.error.HTTPError as e:
|
||||
try:
|
||||
body_bytes = e.read()
|
||||
payload = json.loads(body_bytes.decode("utf-8"))
|
||||
except Exception:
|
||||
payload = {"error": {"message": f"upstream HTTP {e.code}"}}
|
||||
return e.code, payload
|
||||
except Exception as e:
|
||||
log(f"upstream error: {e}")
|
||||
return 502, {"error": {"message": f"upstream unreachable: {e}"}}
|
||||
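The header allow-list above is easy to verify in isolation; a sketch of the same filtering rule (assuming the same three pass-through header names):

```python
ALLOWED = ("authorization", "x-api-key", "anthropic-version")

def filter_headers(headers: dict[str, str]) -> dict[str, str]:
    """Keep only auth-related headers; hop-by-hop/host headers are dropped."""
    fwd = {"Content-Type": "application/json"}
    for k, v in headers.items():
        if k.lower() in ALLOWED:
            fwd[k] = v
    return fwd

fwd = filter_headers({"Authorization": "Bearer t", "Host": "example.com", "X-API-Key": "k"})
```

Matching is case-insensitive on the header name but the original casing is forwarded, which keeps clients that send `Authorization` vs `authorization` working unchanged.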


class ProxyHandler(http.server.BaseHTTPRequestHandler):
    # Silence default request logging (we log what matters ourselves)
    def log_message(self, format: str, *args: Any) -> None:
        pass

    def _read_body(self) -> dict:
        length = int(self.headers.get("Content-Length", "0") or "0")
        if length <= 0:
            return {}
        raw = self.rfile.read(length)
        try:
            return json.loads(raw.decode("utf-8"))
        except Exception:
            return {}

    def _send_json(self, status: int, payload: dict) -> None:
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.send_header("Access-Control-Allow-Origin", "*")
        self.end_headers()
        self.wfile.write(body)

    def do_OPTIONS(self) -> None:  # CORS preflight
        self.send_response(204)
        self.send_header("Access-Control-Allow-Origin", "*")
        self.send_header("Access-Control-Allow-Methods", "POST, GET, OPTIONS")
        self.send_header("Access-Control-Allow-Headers", "Content-Type, Authorization, X-API-Key")
        self.end_headers()

    def do_GET(self) -> None:
        parsed = urllib.parse.urlparse(self.path)
        if parsed.path == "/healthz":
            self._send_json(200, {
                "status": "ok",
                "atocore": ATOCORE_URL,
                "upstream": UPSTREAM_URL or "(not configured)",
                "inject": INJECT_ENABLED,
                "capture": CAPTURE_ENABLED,
            })
            return
        # Pass through GET to upstream (model listing etc)
        if not UPSTREAM_URL:
            self._send_json(503, {"error": {"message": "ATOCORE_UPSTREAM not configured"}})
            return
        try:
            req = urllib.request.Request(UPSTREAM_URL + parsed.path + (f"?{parsed.query}" if parsed.query else ""))
            for k in ("Authorization", "X-API-Key"):
                v = self.headers.get(k)
                if v:
                    req.add_header(k, v)
            with urllib.request.urlopen(req, timeout=UPSTREAM_TIMEOUT) as resp:
                data = resp.read()
                self.send_response(resp.status)
                self.send_header("Content-Type", resp.headers.get("Content-Type", "application/json"))
                self.send_header("Content-Length", str(len(data)))
                self.end_headers()
                self.wfile.write(data)
        except Exception as e:
            self._send_json(502, {"error": {"message": f"upstream error: {e}"}})

    def do_POST(self) -> None:
        parsed = urllib.parse.urlparse(self.path)
        body = self._read_body()

        # Only enrich chat completions; other endpoints pass through
        if parsed.path.endswith("/chat/completions") or parsed.path == "/v1/chat/completions":
            prompt = get_last_user_message(body)
            project = detect_project(prompt)
            context = fetch_context(prompt, project) if prompt else ""
            if context:
                log(f"inject: project={project or '(none)'} chars={len(context)}")
                body = inject_context(body, context)

            status, response = forward_to_upstream(body, dict(self.headers), parsed.path)
            self._send_json(status, response)

            if status == 200:
                assistant_text = get_assistant_text(response)
                capture_interaction(prompt, assistant_text, project)
        else:
            # Non-chat endpoints (embeddings, completions, etc.) — pure passthrough
            status, response = forward_to_upstream(body, dict(self.headers), parsed.path)
            self._send_json(status, response)


class ThreadedServer(socketserver.ThreadingMixIn, http.server.HTTPServer):
    daemon_threads = True
    allow_reuse_address = True


def main() -> int:
    if not UPSTREAM_URL:
        log("WARNING: ATOCORE_UPSTREAM not set. Chat completions will fail.")
        log("Example: ATOCORE_UPSTREAM=http://localhost:11434/v1 for Ollama")
    server = ThreadedServer((PROXY_HOST, PROXY_PORT), ProxyHandler)
    log(f"listening on {PROXY_HOST}:{PROXY_PORT}")
    log(f"AtoCore: {ATOCORE_URL} inject={INJECT_ENABLED} capture={CAPTURE_ENABLED}")
    log(f"Upstream: {UPSTREAM_URL or '(not configured)'}")
    log(f"Client label: {CLIENT_LABEL}")
    log("Ready. Point your OpenAI-compatible client at /v1/chat/completions")
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        log("stopping")
        server.server_close()
    return 0


if __name__ == "__main__":
    sys.exit(main())
@@ -29,25 +29,61 @@ import os
import shutil
import subprocess
import sys
import time
import tempfile
import urllib.error
import urllib.parse
import urllib.request

DEFAULT_BASE_URL = os.environ.get("ATOCORE_BASE_URL", "http://localhost:8100")
DEFAULT_MODEL = os.environ.get("ATOCORE_TRIAGE_MODEL", "sonnet")

# 3-tier escalation config (Phase "Triage Quality")
TIER1_MODEL = os.environ.get("ATOCORE_TRIAGE_MODEL_TIER1",
                             os.environ.get("ATOCORE_TRIAGE_MODEL", "sonnet"))
TIER2_MODEL = os.environ.get("ATOCORE_TRIAGE_MODEL_TIER2", "opus")
# Tier 3: default "discard" (auto-reject uncertain after opus disagrees/wavers),
# alternative "human" routes them to /admin/triage.
TIER3_ACTION = os.environ.get("ATOCORE_TRIAGE_TIER3", "discard").lower()
DEFAULT_TIMEOUT_S = float(os.environ.get("ATOCORE_TRIAGE_TIMEOUT_S", "60"))
TIER2_TIMEOUT_S = float(os.environ.get("ATOCORE_TRIAGE_TIER2_TIMEOUT_S", "120"))
AUTO_PROMOTE_MIN_CONFIDENCE = 0.8
# Below this, tier 1 decision is "not confident enough" and we escalate
ESCALATION_CONFIDENCE_THRESHOLD = float(
    os.environ.get("ATOCORE_TRIAGE_ESCALATION_THRESHOLD", "0.75")
)

# Kept for legacy callers that reference DEFAULT_MODEL
DEFAULT_MODEL = TIER1_MODEL

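The config above implies a simple routing rule: act on a decisive, confident tier-1 verdict, otherwise escalate to tier 2. A sketch of that rule (the function name is illustrative; the 0.8 cutoff mirrors `AUTO_PROMOTE_MIN_CONFIDENCE` above):

```python
def route(verdict: str, confidence: float, act_threshold: float = 0.8) -> str:
    """Return 'act' when the verdict is promote/reject and confident enough,
    otherwise 'escalate' to the tier-2 model."""
    if verdict in ("promote", "reject") and confidence >= act_threshold:
        return "act"
    return "escalate"

decisions = [route("promote", 0.95), route("reject", 0.7), route("needs_human", 0.9)]
```

Note that `needs_human` always escalates regardless of confidence: only decisive verdicts short-circuit the pipeline.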
TRIAGE_SYSTEM_PROMPT = """You are a memory triage reviewer for a personal context engine called AtoCore. You review candidate memories extracted from LLM conversations and decide whether each should be promoted to active status, rejected, or flagged for human review.

You will receive:
- The candidate memory content and type
- A list of existing active memories for the same project (to check for duplicates)
- The candidate memory content, type, and claimed project
- A list of existing active memories for the same project (to check for duplicates + contradictions)
- Trusted project state entries (curated ground truth — higher trust than memories)
- Known project ids so you can flag misattribution

For each candidate, output exactly one JSON object:

{"verdict": "promote|reject|needs_human|contradicts", "confidence": 0.0-1.0, "reason": "one sentence", "conflicts_with": "id of existing memory if contradicts"}
{"verdict": "promote|reject|needs_human|contradicts", "confidence": 0.0-1.0, "reason": "one sentence", "conflicts_with": "id of existing memory if contradicts", "domain_tags": ["tag1","tag2"], "valid_until": null}

DOMAIN TAGS (Phase 3): A lowercase list of 2-5 topical keywords describing
the SUBJECT matter (not the project). This enables cross-project retrieval:
a query about "optics" can pull matches from p04 + p05 + p06.

Good tags are single lowercase words or hyphenated terms. Mix:
- domain keywords (optics, thermal, firmware, materials, controls)
- project tokens when clearly scoped (p04, p05, p06, abb)
- lifecycle/activity words (procurement, design, validation, vendor)

Always emit domain_tags on a promote. For reject, empty list is fine.

VALID_UNTIL (Phase 3): ISO date "YYYY-MM-DD" OR null (permanent).
Set to a near-future date when the candidate is time-bounded:
- Status snapshots ("current blocker is X") → ~2 weeks out
- Scheduled events ("meeting Friday") → event date
- Quotes with expiry → quote expiry date
Leave null for durable decisions, engineering insights, ratified requirements.

Rules:

@@ -65,9 +101,24 @@ Rules:

4. OPENCLAW-CURATED content (candidate content starts with "From OpenClaw/"): apply a MUCH LOWER bar. OpenClaw's SOUL.md, USER.md, MEMORY.md, MODEL-ROUTING.md, and dated memory/*.md files are ALREADY curated by OpenClaw as canonical continuity. Promote unless clearly wrong or a genuine duplicate. Do NOT reject OpenClaw content as "process rule belongs elsewhere" or "session log" — that's exactly what AtoCore wants to absorb. Session events, project updates, stakeholder notes, and decisions from OpenClaw daily memory files ARE valuable context and should promote.

5. NEEDS_HUMAN when you're genuinely unsure — the candidate might be valuable but you can't tell without domain knowledge. This should be rare (< 20% of candidates).
5. NEEDS_HUMAN when you're genuinely unsure — the candidate might be valuable but you can't tell without domain knowledge. This should be rare (< 20% of candidates). If this is just noise/filler, prefer REJECT with low confidence.

6. Output ONLY the JSON object. No prose, no markdown, no explanation outside the reason field."""
6. PROJECT VALIDATION: The candidate has a "claimed project". You'll see the list of registered project ids. If the claimed project doesn't match any registered id AND the content clearly belongs to a registered project, include "suggested_project": "<correct_id>" in your output so the caller can auto-fix the attribution. If the content is genuinely cross-project or global, leave project empty (suggested_project=""). Misattribution is the #1 pollution source — flag it.

7. TEMPORAL SENSITIVITY: Be aggressive with valid_until for anything that reads like "current state", "right now", "this week", "as of". Stale facts pollute context. When in doubt, set a 2-4 week expiry rather than null.

8. CONFIDENCE GRADING:
- 0.9+: crystal clear durable fact or clear noise
- 0.75-0.9: confident but not cryptographic-certain
- 0.6-0.75: borderline — will escalate to opus for second opinion
- <0.6: genuinely ambiguous — needs human or will be discarded

9. Output ONLY the JSON object. No prose, no markdown, no explanation outside the reason field. Include optional "suggested_project" field when misattribution detected."""


TIER2_SECOND_OPINION_PROMPT = TRIAGE_SYSTEM_PROMPT + """

ESCALATED REVIEW: You are seeing this candidate because the tier-1 (sonnet) reviewer could not decide confidently. You will be shown tier-1's verdict + reason as additional context. Your job is to resolve the uncertainty with more careful thinking. Use your full context window to cross-reference the existing memories. If you ALSO cannot decide with confidence >= 0.8, output verdict="needs_human" with a clear explanation of what information would break the tie. That signal will route to a human (or auto-discard, depending on config)."""

_sandbox_cwd = None

@@ -104,48 +155,129 @@ def fetch_active_memories_for_project(base_url, project):
    return result.get("memories", [])


def triage_one(candidate, active_memories, model, timeout_s):
    """Ask the triage model to classify one candidate."""
    if not shutil.which("claude"):
        return {"verdict": "needs_human", "confidence": 0.0, "reason": "claude CLI not available"}
def fetch_project_state(base_url, project):
    """Fetch trusted project state for ground-truth context."""
    if not project:
        return []
    try:
        result = api_get(base_url, f"/project/state/{urllib.parse.quote(project)}")
        return result.get("entries", result.get("state", []))
    except Exception:
        return []


def fetch_registered_projects(base_url):
    """Return list of registered project ids + aliases for misattribution check."""
    try:
        result = api_get(base_url, "/projects")
        projects = result.get("projects", [])
        out = {}
        for p in projects:
            pid = p.get("project_id") or p.get("id") or p.get("name")
            if pid:
                out[pid] = p.get("aliases", []) or []
        return out
    except Exception:
        return {}


def build_triage_user_message(candidate, active_memories, project_state, known_projects):
    """Richer context for the triage model: memories + state + project registry."""
    active_summary = "\n".join(
        f"- [{m['memory_type']}] {m['content'][:150]}"
        for m in active_memories[:20]
        f"- [{m['memory_type']}] {m['content'][:200]}"
        for m in active_memories[:30]
    ) or "(no active memories for this project)"

    user_message = (
    state_summary = ""
    if project_state:
        lines = []
        for e in project_state[:20]:
            cat = e.get("category", "?")
            key = e.get("key", "?")
            val = (e.get("value") or "")[:200]
            lines.append(f"- [{cat}/{key}] {val}")
        state_summary = "\n".join(lines)
    else:
        state_summary = "(no trusted state entries for this project)"

    projects_line = ", ".join(sorted(known_projects.keys())) if known_projects else "(none)"

    return (
        f"CANDIDATE TO TRIAGE:\n"
        f" type: {candidate['memory_type']}\n"
        f" project: {candidate.get('project') or '(none)'}\n"
        f" claimed project: {candidate.get('project') or '(none)'}\n"
        f" content: {candidate['content']}\n\n"
        f"REGISTERED PROJECT IDS: {projects_line}\n\n"
        f"TRUSTED PROJECT STATE (ground truth, higher trust than memories):\n{state_summary}\n\n"
        f"EXISTING ACTIVE MEMORIES FOR THIS PROJECT:\n{active_summary}\n\n"
        f"Return the JSON verdict now."
    )


def _call_claude(system_prompt, user_message, model, timeout_s):
    """Shared CLI caller with retry + stderr capture."""
    args = [
        "claude", "-p",
        "--model", model,
        "--append-system-prompt", TRIAGE_SYSTEM_PROMPT,
        "--append-system-prompt", system_prompt,
        "--disable-slash-commands",
        user_message,
    ]
    last_error = ""
    for attempt in range(3):
        if attempt > 0:
            time.sleep(2 ** attempt)
        try:
            completed = subprocess.run(
                args, capture_output=True, text=True,
                timeout=timeout_s, cwd=get_sandbox_cwd(),
                encoding="utf-8", errors="replace",
            )
        except subprocess.TimeoutExpired:
            last_error = f"{model} timed out"
            continue
        except Exception as exc:
            last_error = f"subprocess error: {exc}"
            continue

    try:
        completed = subprocess.run(
            args, capture_output=True, text=True,
            timeout=timeout_s, cwd=get_sandbox_cwd(),
            encoding="utf-8", errors="replace",
        )
    except subprocess.TimeoutExpired:
        return {"verdict": "needs_human", "confidence": 0.0, "reason": "triage model timed out"}
    except Exception as exc:
        return {"verdict": "needs_human", "confidence": 0.0, "reason": f"subprocess error: {exc}"}
        if completed.returncode == 0:
            return (completed.stdout or "").strip(), None

    if completed.returncode != 0:
        return {"verdict": "needs_human", "confidence": 0.0, "reason": f"claude exit {completed.returncode}"}
        stderr = (completed.stderr or "").strip()[:200]
        last_error = f"{model} exit {completed.returncode}: {stderr}" if stderr else f"{model} exit {completed.returncode}"
    return None, last_error

    raw = (completed.stdout or "").strip()

def triage_one(candidate, active_memories, project_state, known_projects, model, timeout_s):
    """Tier-1 triage: ask the cheap model for a verdict."""
    if not shutil.which("claude"):
        return {"verdict": "needs_human", "confidence": 0.0, "reason": "claude CLI not available"}

    user_message = build_triage_user_message(candidate, active_memories, project_state, known_projects)
    raw, err = _call_claude(TRIAGE_SYSTEM_PROMPT, user_message, model, timeout_s)
    if err:
        return {"verdict": "needs_human", "confidence": 0.0, "reason": err}
    return parse_verdict(raw)


def triage_escalation(candidate, tier1_verdict, active_memories, project_state, known_projects, model, timeout_s):
    """Tier-2 escalation: opus sees tier-1's verdict + reasoning, tries again."""
    if not shutil.which("claude"):
        return {"verdict": "needs_human", "confidence": 0.0, "reason": "claude CLI not available"}

    base_msg = build_triage_user_message(candidate, active_memories, project_state, known_projects)
    tier1_context = (
        f"\nTIER-1 REVIEW (sonnet, for your reference):\n"
        f" verdict: {tier1_verdict.get('verdict')}\n"
        f" confidence: {tier1_verdict.get('confidence', 0.0):.2f}\n"
        f" reason: {tier1_verdict.get('reason', '')[:300]}\n\n"
        f"Resolve the uncertainty. If you also can't decide with confidence ≥ 0.8, "
        f"return verdict='needs_human' with a specific explanation of what information "
        f"would break the tie.\n\nReturn the JSON verdict now."
    )
    raw, err = _call_claude(TIER2_SECOND_OPINION_PROMPT, base_msg + tier1_context, model, timeout_s)
    if err:
        return {"verdict": "needs_human", "confidence": 0.0, "reason": f"tier2: {err}"}
    return parse_verdict(raw)


@@ -184,81 +316,235 @@ def parse_verdict(raw):

    reason = str(parsed.get("reason", "")).strip()[:200]
    conflicts_with = str(parsed.get("conflicts_with", "")).strip()

    # Phase 3: domain tags + expiry
    raw_tags = parsed.get("domain_tags") or []
    if isinstance(raw_tags, str):
        raw_tags = [t.strip() for t in raw_tags.split(",") if t.strip()]
    if not isinstance(raw_tags, list):
        raw_tags = []
    domain_tags = []
    for t in raw_tags[:10]:
        if not isinstance(t, str):
            continue
        tag = t.strip().lower()
        if tag and tag not in domain_tags:
            domain_tags.append(tag)

    valid_until = parsed.get("valid_until")
    if valid_until is None:
        valid_until = ""
    else:
        valid_until = str(valid_until).strip()
        if valid_until.lower() in ("", "null", "none", "permanent"):
            valid_until = ""

    # Triage Quality: project misattribution flag
    suggested_project = str(parsed.get("suggested_project", "")).strip()

    return {
        "verdict": verdict,
        "confidence": confidence,
        "reason": reason,
        "conflicts_with": conflicts_with,
        "domain_tags": domain_tags,
        "valid_until": valid_until,
        "suggested_project": suggested_project,
    }
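The Phase 3 normalization in `parse_verdict` (comma-string tags split, lowercased, deduped, capped at 10) can be exercised standalone; a self-contained mirror of just the tag cleanup:

```python
def norm_tags(raw) -> list[str]:
    """Mirror of parse_verdict's tag cleanup: accept a list or a comma
    string, lowercase, strip, dedupe, cap at 10."""
    if isinstance(raw, str):
        raw = [t.strip() for t in raw.split(",") if t.strip()]
    if not isinstance(raw, list):
        return []
    out = []
    for t in raw[:10]:
        if isinstance(t, str):
            tag = t.strip().lower()
            if tag and tag not in out:
                out.append(tag)
    return out

tags = norm_tags("Optics, thermal, optics")
```

Anything that is neither a list nor a string (e.g. `null` from a malformed verdict) collapses to an empty list rather than raising.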


def main():
    parser = argparse.ArgumentParser(description="Auto-triage candidate memories")
    parser.add_argument("--base-url", default=DEFAULT_BASE_URL)
    parser.add_argument("--model", default=DEFAULT_MODEL)
    parser.add_argument("--dry-run", action="store_true", help="preview without executing")
    args = parser.parse_args()
def _apply_metadata_update(base_url, mid, verdict_obj):
    """Persist tags + valid_until + suggested_project before the promote call."""
    tags = verdict_obj.get("domain_tags") or []
    valid_until = verdict_obj.get("valid_until") or ""
    suggested = verdict_obj.get("suggested_project") or ""

    # Fetch candidates
    result = api_get(args.base_url, "/memory?status=candidate&limit=100")
    candidates = result.get("memories", [])
    print(f"candidates: {len(candidates)} model: {args.model} dry_run: {args.dry_run}")

    if not candidates:
        print("queue empty, nothing to triage")
    body = {}
    if tags:
        body["domain_tags"] = tags
    if valid_until:
        body["valid_until"] = valid_until
    if not body and not suggested:
        return

    # Cache active memories per project for dedup
    active_cache = {}
    promoted = rejected = needs_human = errors = 0
    if body:
        try:
            import urllib.request as _ur
            req = _ur.Request(
                f"{base_url}/memory/{mid}", method="PUT",
                headers={"Content-Type": "application/json"},
                data=json.dumps(body).encode("utf-8"),
            )
            _ur.urlopen(req, timeout=10).read()
        except Exception:
            pass

    for i, cand in enumerate(candidates, 1):
        project = cand.get("project") or ""
        if project not in active_cache:
            active_cache[project] = fetch_active_memories_for_project(args.base_url, project)
    # Project auto-fix via direct SQLite update would bypass audit; use PUT if supported.
    # For now we log the suggestion — operator script can apply it in batch.
    if suggested:
        # noop here — handled by caller which tracks suggested_project_fixes
        pass

        verdict_obj = triage_one(cand, active_cache[project], args.model, DEFAULT_TIMEOUT_S)
        verdict = verdict_obj["verdict"]
        conf = verdict_obj["confidence"]
        reason = verdict_obj["reason"]
        conflicts_with = verdict_obj.get("conflicts_with", "")

        mid = cand["id"]
        label = f"[{i:2d}/{len(candidates)}] {mid[:8]} [{cand['memory_type']}]"
def process_candidate(cand, base_url, active_cache, state_cache, known_projects, dry_run):
    """Run the 3-tier triage and apply the resulting action.

        if verdict == "promote" and conf >= AUTO_PROMOTE_MIN_CONFIDENCE:
            if args.dry_run:
                print(f"  WOULD PROMOTE {label} conf={conf:.2f} {reason}")
    Returns (action, note) where action in {promote, reject, discard, human, error}.
    """
    mid = cand["id"]
    project = cand.get("project") or ""
    if project not in active_cache:
        active_cache[project] = fetch_active_memories_for_project(base_url, project)
    if project not in state_cache:
        state_cache[project] = fetch_project_state(base_url, project)

    # === Tier 1 ===
    v1 = triage_one(
        cand, active_cache[project], state_cache[project],
        known_projects, TIER1_MODEL, DEFAULT_TIMEOUT_S,
    )

    # Project misattribution fix: suggested_project surfaces from tier 1
    suggested = (v1.get("suggested_project") or "").strip()
    if suggested and suggested != project and suggested in known_projects:
        # Try to re-canonicalize the memory's project
        if not dry_run:
            try:
                import urllib.request as _ur
                req = _ur.Request(
                    f"{base_url}/memory/{mid}", method="PUT",
                    headers={"Content-Type": "application/json"},
                    data=json.dumps({"content": cand["content"]}).encode("utf-8"),
                )
                _ur.urlopen(req, timeout=10).read()  # triggers canonicalization via update
            except Exception:
                pass
        print(f"  ↺ misattribution flagged: {project!r} → {suggested!r}")

    # High-confidence tier 1 decision → act
    if v1["verdict"] in ("promote", "reject") and v1["confidence"] >= AUTO_PROMOTE_MIN_CONFIDENCE:
        return _apply_verdict(v1, cand, base_url, active_cache, dry_run, tier="sonnet")

    # Borderline or uncertain → escalate to tier 2 (opus)
    print(f"  ↑ escalating (tier1 verdict={v1['verdict']} conf={v1['confidence']:.2f})")
    v2 = triage_escalation(
        cand, v1, active_cache[project], state_cache[project],
        known_projects, TIER2_MODEL, TIER2_TIMEOUT_S,
    )

    # Tier 2 is confident → act
    if v2["verdict"] in ("promote", "reject") and v2["confidence"] >= AUTO_PROMOTE_MIN_CONFIDENCE:
        return _apply_verdict(v2, cand, base_url, active_cache, dry_run, tier="opus")

    # Tier 3: still uncertain — route per config
    if TIER3_ACTION == "discard":
        reason = f"tier1+tier2 uncertain: {v2.get('reason', '')[:150]}"
        if dry_run:
            return ("discard", reason)
        try:
            api_post(base_url, f"/memory/{mid}/reject")
        except Exception:
            return ("error", reason)
        return ("discard", reason)
    else:
        # "human" — leave in queue for /admin/triage review
        return ("human", v2.get("reason", "no reason")[:200])


def _apply_verdict(verdict_obj, cand, base_url, active_cache, dry_run, tier):
    """Execute the promote/reject action and update metadata."""
    mid = cand["id"]
    verdict = verdict_obj["verdict"]
    conf = verdict_obj["confidence"]
    reason = f"[{tier}] {verdict_obj['reason']}"

    if verdict == "promote":
        if dry_run:
            return ("promote", reason)
        _apply_metadata_update(base_url, mid, verdict_obj)
        try:
            api_post(base_url, f"/memory/{mid}/promote")
            project = cand.get("project") or ""
            if project in active_cache:
                active_cache[project].append(cand)
            return ("promote", reason)
        except Exception as e:
            return ("error", f"promote failed: {e}")
    else:
        if dry_run:
            return ("reject", reason)
        try:
            api_post(base_url, f"/memory/{mid}/reject")
            return ("reject", reason)
        except Exception as e:
            return ("error", f"reject failed: {e}")


def main():
    parser = argparse.ArgumentParser(description="Auto-triage candidate memories (3-tier escalation)")
    parser.add_argument("--base-url", default=DEFAULT_BASE_URL)
    parser.add_argument("--dry-run", action="store_true", help="preview without executing")
    parser.add_argument("--max-batches", type=int, default=20,
                        help="Max batches of 100 to process per run")
    parser.add_argument("--no-escalation", action="store_true",
                        help="Disable tier-2 escalation (legacy single-model behavior)")
    args = parser.parse_args()

    seen_ids: set[str] = set()
    active_cache: dict[str, list] = {}
    state_cache: dict[str, list] = {}

    known_projects = fetch_registered_projects(args.base_url)
    print(f"Registered projects: {sorted(known_projects.keys())}")
    print(f"Tier1: {TIER1_MODEL} Tier2: {TIER2_MODEL} Tier3: {TIER3_ACTION} "
          f"escalation_threshold: {ESCALATION_CONFIDENCE_THRESHOLD}")

    counts = {"promote": 0, "reject": 0, "discard": 0, "human": 0, "error": 0}
    batch_num = 0

    while batch_num < args.max_batches:
        batch_num += 1
        result = api_get(args.base_url, "/memory?status=candidate&limit=100")
        all_candidates = result.get("memories", [])
        candidates = [c for c in all_candidates if c["id"] not in seen_ids]

        if not candidates:
            if batch_num == 1:
                print("queue empty, nothing to triage")
            else:
            try:
                api_post(args.base_url, f"/memory/{mid}/promote")
                print(f"  PROMOTED {label} conf={conf:.2f} {reason}")
                active_cache[project].append(cand)
            except Exception:
                errors += 1
            promoted += 1
        elif verdict == "reject":
            if args.dry_run:
                print(f"  WOULD REJECT {label} conf={conf:.2f} {reason}")
            else:
                try:
                    api_post(args.base_url, f"/memory/{mid}/reject")
                    print(f"  REJECTED {label} conf={conf:.2f} {reason}")
                except Exception:
                    errors += 1
            rejected += 1
        elif verdict == "contradicts":
            # Leave candidate in queue but flag the conflict in content
            # so the wiki/triage shows it. This is conservative: we
            # don't silently merge or reject when sources disagree.
            print(f"  CONTRADICTS {label} vs {conflicts_with[:8] if conflicts_with else '?'} {reason}")
            contradicts_count = locals().get('contradicts_count', 0) + 1
            needs_human += 1
        else:
            print(f"  NEEDS_HUMAN {label} conf={conf:.2f} {reason}")
            needs_human += 1
                print(f"\nQueue drained after batch {batch_num-1}.")
            break

    print(f"\npromoted={promoted} rejected={rejected} needs_human={needs_human} errors={errors}")
        print(f"\n=== batch {batch_num}: {len(candidates)} candidates dry_run: {args.dry_run} ===")

        for i, cand in enumerate(candidates, 1):
            if i > 1:
                time.sleep(0.5)
            seen_ids.add(cand["id"])
            mid = cand["id"]
            label = f"[{i:2d}/{len(candidates)}] {mid[:8]} [{cand['memory_type']}]"

            try:
                action, note = process_candidate(
                    cand, args.base_url, active_cache, state_cache,
                    known_projects, args.dry_run,
                )
            except Exception as e:
                action, note = ("error", f"exception: {e}")

            counts[action] = counts.get(action, 0) + 1
            verb = {"promote": "PROMOTED ", "reject": "REJECTED ",
                    "discard": "DISCARDED ", "human": "NEEDS_HUM ",
                    "error": "ERROR "}.get(action, action.upper())
            if args.dry_run and action in ("promote", "reject", "discard"):
                verb = "WOULD " + verb.strip()
            print(f"  {verb} {label} {note[:120]}")

    print(
        f"\ntotal: promoted={counts['promote']} rejected={counts['reject']} "
        f"discarded={counts['discard']} human={counts['human']} errors={counts['error']} "
        f"batches={batch_num}"
    )


if __name__ == "__main__":

@@ -126,22 +126,34 @@ def extract_one(prompt, response, project, model, timeout_s):
        user_message,
    ]

    try:
        completed = subprocess.run(
            args, capture_output=True, text=True,
            timeout=timeout_s, cwd=get_sandbox_cwd(),
            encoding="utf-8", errors="replace",
        )
    except subprocess.TimeoutExpired:
        return [], "timeout"
    except Exception as exc:
        return [], f"subprocess_error: {exc}"
    # Retry with exponential backoff on transient failures (rate limits etc)
    import time as _time
    last_error = ""
    for attempt in range(3):
        if attempt > 0:
            _time.sleep(2 ** attempt)  # 2s, 4s
        try:
            completed = subprocess.run(
                args, capture_output=True, text=True,
                timeout=timeout_s, cwd=get_sandbox_cwd(),
                encoding="utf-8", errors="replace",
            )
        except subprocess.TimeoutExpired:
            last_error = "timeout"
            continue
        except Exception as exc:
            last_error = f"subprocess_error: {exc}"
            continue

    if completed.returncode != 0:
        return [], f"exit_{completed.returncode}"
        if completed.returncode == 0:
            raw = (completed.stdout or "").strip()
            return parse_candidates(raw, project), ""

    raw = (completed.stdout or "").strip()
    return parse_candidates(raw, project), ""
        # Capture stderr for diagnostics (truncate to 200 chars)
        stderr = (completed.stderr or "").strip()[:200]
        last_error = f"exit_{completed.returncode}: {stderr}" if stderr else f"exit_{completed.returncode}"

    return [], last_error
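The retry loop above sleeps `2 ** attempt` seconds before each re-attempt (nothing before the first); a sketch of the resulting delay schedule for three attempts:

```python
def backoff_delays(attempts: int) -> list[float]:
    """Delay before each attempt: 0 for the first, then 2s, 4s, ..."""
    return [0.0 if a == 0 else float(2 ** a) for a in range(attempts)]

delays = backoff_delays(3)
```

With three attempts the worst case adds 6 seconds of sleep on top of the per-call timeout, which bounds how long one candidate can stall the batch.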
|
||||
|
||||
def parse_candidates(raw, interaction_project):
|
||||
@@ -164,6 +176,8 @@ def parse_candidates(raw, interaction_project):
            "content": normalized["content"],
            "project": project,
            "confidence": normalized["confidence"],
            "domain_tags": normalized.get("domain_tags") or [],
            "valid_until": normalized.get("valid_until") or "",
        })
    return results

@@ -192,10 +206,14 @@ def main():
    total_persisted = 0
    errors = 0

    import time as _time
    for ix, summary in enumerate(interaction_summaries):
        resp_chars = summary.get("response_chars", 0) or 0
        if resp_chars < 50:
            continue
        # Light pacing between calls to avoid bursting the claude CLI
        if ix > 0:
            _time.sleep(0.5)
        iid = summary["id"]
        try:
            raw = api_get(
@@ -234,6 +252,8 @@ def main():
                "project": c["project"],
                "confidence": c["confidence"],
                "status": "candidate",
                "domain_tags": c.get("domain_tags") or [],
                "valid_until": c.get("valid_until") or "",
            })
            total_persisted += 1
        except urllib.error.HTTPError as exc:
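The retry loop added to `extract_one` above follows a simple pattern: up to three attempts, an exponential `2 ** attempt` sleep before each retry, and the last error string returned if every attempt fails. A minimal standalone sketch of that pattern; the names `run_with_backoff` and `flaky` are illustrative, not part of the repo:

```python
import time


def run_with_backoff(fn, attempts=3, sleep=time.sleep):
    """Call fn() up to `attempts` times, sleeping 2**attempt before retries.

    Returns (result, "") on success or (None, last_error) after exhausting
    retries — mirroring extract_one's ([], last_error) convention.
    """
    last_error = ""
    for attempt in range(attempts):
        if attempt > 0:
            sleep(2 ** attempt)  # 2s, 4s
        try:
            return fn(), ""
        except Exception as exc:
            last_error = f"subprocess_error: {exc}"
    return None, last_error


calls = {"n": 0}


def flaky():
    # Fails twice, then succeeds — simulates transient rate-limit errors.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("rate limited")
    return "ok"
```

Passing a no-op `sleep` (as the test below does) keeps unit tests instant while the production default still backs off.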
254
scripts/canonicalize_tags.py
Normal file
@@ -0,0 +1,254 @@
#!/usr/bin/env python3
"""Phase 7C — tag canonicalization detector.

Weekly (or on-demand) LLM pass that:
1. Fetches the tag distribution across all active memories via HTTP
2. Asks claude-p to propose alias→canonical mappings
3. AUTO-APPLIES aliases with confidence >= AUTO_APPROVE_CONF (0.8)
4. Submits lower-confidence proposals as pending for human review

Autonomous by default — matches the Phase 7A.1 pattern. Set
--no-auto-approve to force every proposal into human review.

Host-side because claude CLI lives on Dalidou, not the container.
Reuses the PYTHONPATH=src pattern from scripts/memory_dedup.py.

Usage:
    python3 scripts/canonicalize_tags.py [--base-url URL] [--dry-run] [--no-auto-approve]
"""

from __future__ import annotations

import argparse
import json
import os
import shutil
import subprocess
import sys
import tempfile
import time
import urllib.error
import urllib.request

_SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
_SRC_DIR = os.path.abspath(os.path.join(_SCRIPT_DIR, "..", "src"))
if _SRC_DIR not in sys.path:
    sys.path.insert(0, _SRC_DIR)

from atocore.memory._tag_canon_prompt import (  # noqa: E402
    PROTECTED_PROJECT_TOKENS,
    SYSTEM_PROMPT,
    TAG_CANON_PROMPT_VERSION,
    build_user_message,
    normalize_alias_item,
    parse_canon_output,
)

DEFAULT_BASE_URL = os.environ.get("ATOCORE_BASE_URL", "http://127.0.0.1:8100")
DEFAULT_MODEL = os.environ.get("ATOCORE_TAG_CANON_MODEL", "sonnet")
DEFAULT_TIMEOUT_S = float(os.environ.get("ATOCORE_TAG_CANON_TIMEOUT_S", "90"))

AUTO_APPROVE_CONF = float(os.environ.get("ATOCORE_TAG_CANON_AUTO_APPROVE_CONF", "0.8"))
MIN_ALIAS_COUNT = int(os.environ.get("ATOCORE_TAG_CANON_MIN_ALIAS_COUNT", "1"))

_sandbox_cwd = None


def get_sandbox_cwd() -> str:
    global _sandbox_cwd
    if _sandbox_cwd is None:
        _sandbox_cwd = tempfile.mkdtemp(prefix="ato-tagcanon-")
    return _sandbox_cwd


def api_get(base_url: str, path: str) -> dict:
    req = urllib.request.Request(f"{base_url}{path}")
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def api_post(base_url: str, path: str, body: dict | None = None) -> dict:
    data = json.dumps(body or {}).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}{path}", method="POST",
        headers={"Content-Type": "application/json"}, data=data,
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def call_claude(user_message: str, model: str, timeout_s: float) -> tuple[str | None, str | None]:
    if not shutil.which("claude"):
        return None, "claude CLI not available"
    args = [
        "claude", "-p",
        "--model", model,
        "--append-system-prompt", SYSTEM_PROMPT,
        "--disable-slash-commands",
        user_message,
    ]
    last_error = ""
    for attempt in range(3):
        if attempt > 0:
            time.sleep(2 ** attempt)
        try:
            completed = subprocess.run(
                args, capture_output=True, text=True,
                timeout=timeout_s, cwd=get_sandbox_cwd(),
                encoding="utf-8", errors="replace",
            )
        except subprocess.TimeoutExpired:
            last_error = f"{model} timed out"
            continue
        except Exception as exc:
            last_error = f"subprocess error: {exc}"
            continue
        if completed.returncode == 0:
            return (completed.stdout or "").strip(), None
        stderr = (completed.stderr or "").strip()[:200]
        last_error = f"{model} exit {completed.returncode}: {stderr}"
    return None, last_error


def fetch_tag_distribution(base_url: str) -> dict[str, int]:
    """Count tag occurrences across active memories (client-side)."""
    try:
        result = api_get(base_url, "/memory?active_only=true&limit=2000")
    except Exception as e:
        print(f"ERROR: could not fetch memories: {e}", file=sys.stderr)
        return {}
    mems = result.get("memories", [])
    counts: dict[str, int] = {}
    for m in mems:
        tags = m.get("domain_tags") or []
        if isinstance(tags, str):
            try:
                tags = json.loads(tags)
            except Exception:
                tags = []
        if not isinstance(tags, list):
            continue
        for t in tags:
            if not isinstance(t, str):
                continue
            key = t.strip().lower()
            if key:
                counts[key] = counts.get(key, 0) + 1
    return counts


def main() -> None:
    parser = argparse.ArgumentParser(description="Phase 7C tag canonicalization detector")
    parser.add_argument("--base-url", default=DEFAULT_BASE_URL)
    parser.add_argument("--model", default=DEFAULT_MODEL)
    parser.add_argument("--timeout-s", type=float, default=DEFAULT_TIMEOUT_S)
    parser.add_argument("--no-auto-approve", action="store_true",
                        help="Disable autonomous apply; all proposals → human queue")
    parser.add_argument("--dry-run", action="store_true",
                        help="Print decisions without touching state")
    args = parser.parse_args()

    base = args.base_url.rstrip("/")
    autonomous = not args.no_auto_approve

    print(
        f"canonicalize_tags {TAG_CANON_PROMPT_VERSION} | model={args.model} | "
        f"autonomous={autonomous} | auto-approve conf>={AUTO_APPROVE_CONF}"
    )

    dist = fetch_tag_distribution(base)
    print(f"tag distribution: {len(dist)} unique tags, "
          f"{sum(dist.values())} total references")
    if not dist:
        print("no tags found — nothing to canonicalize")
        return

    user_msg = build_user_message(dist)
    raw, err = call_claude(user_msg, args.model, args.timeout_s)
    if err or raw is None:
        print(f"ERROR: LLM call failed: {err}", file=sys.stderr)
        return

    aliases_raw = parse_canon_output(raw)
    print(f"LLM returned {len(aliases_raw)} raw alias proposals")

    auto_applied = 0
    auto_skipped_missing_canonical = 0
    proposals_created = 0
    duplicates_skipped = 0

    for item in aliases_raw:
        norm = normalize_alias_item(item)
        if norm is None:
            continue
        alias = norm["alias"]
        canonical = norm["canonical"]
        confidence = norm["confidence"]

        alias_count = dist.get(alias, 0)
        canonical_count = dist.get(canonical, 0)

        # Sanity: alias must actually exist in the current distribution
        if alias_count < MIN_ALIAS_COUNT:
            print(f"  SKIP {alias!r} → {canonical!r}: alias not in distribution")
            continue
        if canonical_count == 0:
            auto_skipped_missing_canonical += 1
            print(f"  SKIP {alias!r} → {canonical!r}: canonical missing from distribution")
            continue

        label = f"{alias!r} ({alias_count}) → {canonical!r} ({canonical_count}) conf={confidence:.2f}"

        auto_apply = autonomous and confidence >= AUTO_APPROVE_CONF
        if auto_apply:
            if args.dry_run:
                auto_applied += 1
                print(f"  [dry-run] would auto-apply: {label}")
                continue
            try:
                result = api_post(base, "/admin/tags/aliases/apply", {
                    "alias": alias, "canonical": canonical,
                    "confidence": confidence, "reason": norm["reason"],
                    "alias_count": alias_count, "canonical_count": canonical_count,
                    "actor": "auto-tag-canon",
                })
                touched = result.get("memories_touched", 0)
                auto_applied += 1
                print(f"  ✅ auto-applied: {label} ({touched} memories)")
            except Exception as e:
                print(f"  ⚠️ auto-apply failed: {label} — {e}", file=sys.stderr)
            time.sleep(0.2)
            continue

        # Lower confidence → human review
        if args.dry_run:
            proposals_created += 1
            print(f"  [dry-run] would propose for review: {label}")
            continue
        try:
            result = api_post(base, "/admin/tags/aliases/propose", {
                "alias": alias, "canonical": canonical,
                "confidence": confidence, "reason": norm["reason"],
                "alias_count": alias_count, "canonical_count": canonical_count,
            })
            if result.get("proposal_id"):
                proposals_created += 1
                print(f"  → pending proposal: {label}")
            else:
                duplicates_skipped += 1
                print(f"  (duplicate pending proposal): {label}")
        except Exception as e:
            print(f"  ⚠️ propose failed: {label} — {e}", file=sys.stderr)
        time.sleep(0.2)

    print(
        f"\nsummary: proposals_seen={len(aliases_raw)} "
        f"auto_applied={auto_applied} "
        f"proposals_created={proposals_created} "
        f"duplicates_skipped={duplicates_skipped} "
        f"skipped_missing_canonical={auto_skipped_missing_canonical}"
    )


if __name__ == "__main__":
    main()
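The main loop in canonicalize_tags.py routes each LLM proposal into one of four outcomes: skip because the alias is absent, skip because the canonical target is absent, auto-apply, or queue for human review. A condensed sketch of that decision; `route_alias` and the sample `dist` are illustrative, not names from the repo:

```python
AUTO_APPROVE_CONF = 0.8  # mirrors ATOCORE_TAG_CANON_AUTO_APPROVE_CONF


def route_alias(alias, canonical, confidence, dist, autonomous=True):
    """Classify one alias→canonical proposal the way the main loop does.

    dist maps lowercased tag → occurrence count across active memories.
    Returns one of: "skip_missing_alias", "skip_missing_canonical",
    "auto_apply", "propose".
    """
    if dist.get(alias, 0) < 1:          # alias must exist in the distribution
        return "skip_missing_alias"
    if dist.get(canonical, 0) == 0:     # canonical must exist too
        return "skip_missing_canonical"
    if autonomous and confidence >= AUTO_APPROVE_CONF:
        return "auto_apply"
    return "propose"


dist = {"ml": 12, "machine-learning": 3, "optics": 7}
```

The sanity checks run before the confidence gate, so a high-confidence proposal against a nonexistent canonical tag is still rejected.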
223
scripts/detect_emerging.py
Normal file
@@ -0,0 +1,223 @@
#!/usr/bin/env python3
"""Phase 6 C.1 — Emerging-concepts detector (HTTP-only).

Scans active + candidate memories via the HTTP API to surface:
1. Unregistered projects — project strings appearing on 3+ memories
   that aren't in the project registry. Surfaced for one-click
   registration.
2. Emerging categories — top 20 domain_tags by frequency, for
   "what themes are emerging in my work?" intelligence.
3. Reinforced transients — active memories with reference_count >= 5
   AND valid_until set. These "were temporary but now durable"; a
   sibling endpoint (/admin/memory/extend-reinforced) actually
   performs the extension.

Writes results to project_state under atocore/proposals/* via the API.
Runs host-side (cron calls it) so uses stdlib only — no atocore deps.

Usage:
    python3 scripts/detect_emerging.py [--base-url URL] [--dry-run]
"""

from __future__ import annotations

import argparse
import json
import os
import sys
import urllib.error
import urllib.request
from collections import Counter, defaultdict

PROJECT_MIN_MEMORIES = int(os.environ.get("ATOCORE_EMERGING_PROJECT_MIN", "3"))
PROJECT_ALERT_THRESHOLD = int(os.environ.get("ATOCORE_EMERGING_ALERT_THRESHOLD", "5"))
TOP_TAGS_LIMIT = int(os.environ.get("ATOCORE_EMERGING_TOP_TAGS", "20"))


def api_get(base_url: str, path: str, timeout: int = 30) -> dict:
    req = urllib.request.Request(f"{base_url}{path}")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def api_post(base_url: str, path: str, body: dict, timeout: int = 10) -> dict:
    data = json.dumps(body).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}{path}", method="POST",
        headers={"Content-Type": "application/json"}, data=data,
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def fetch_registered_project_names(base_url: str) -> set[str]:
    """Set of all registered project ids + aliases, lowercased."""
    try:
        result = api_get(base_url, "/projects")
    except Exception as e:
        print(f"WARN: could not load project registry: {e}", file=sys.stderr)
        return set()
    registered = set()
    for p in result.get("projects", []):
        pid = (p.get("project_id") or p.get("id") or p.get("name") or "").strip()
        if pid:
            registered.add(pid.lower())
        for alias in p.get("aliases", []) or []:
            if isinstance(alias, str) and alias.strip():
                registered.add(alias.strip().lower())
    return registered


def fetch_memories(base_url: str, status: str, limit: int = 500) -> list[dict]:
    try:
        params = f"limit={limit}"
        if status == "active":
            params += "&active_only=true"
        else:
            params += f"&status={status}"
        result = api_get(base_url, f"/memory?{params}")
        return result.get("memories", [])
    except Exception as e:
        print(f"WARN: could not fetch {status} memories: {e}", file=sys.stderr)
        return []


def fetch_previous_proposals(base_url: str) -> list[dict]:
    """Read last run's unregistered_projects to diff against this run."""
    try:
        result = api_get(base_url, "/project/state/atocore")
        entries = result.get("entries", result.get("state", []))
        for e in entries:
            if e.get("category") == "proposals" and e.get("key") == "unregistered_projects_prev":
                try:
                    return json.loads(e.get("value") or "[]")
                except Exception:
                    return []
    except Exception:
        pass
    return []


def set_state(base_url: str, category: str, key: str, value: str, source: str = "emerging detector") -> None:
    api_post(base_url, "/project/state", {
        "project": "atocore",
        "category": category,
        "key": key,
        "value": value,
        "source": source,
    })


def main() -> None:
    parser = argparse.ArgumentParser(description="Detect emerging projects + categories")
    parser.add_argument("--base-url", default=os.environ.get("ATOCORE_BASE_URL", "http://127.0.0.1:8100"))
    parser.add_argument("--dry-run", action="store_true", help="Report without writing to project state")
    args = parser.parse_args()

    base = args.base_url.rstrip("/")

    registered = fetch_registered_project_names(base)
    active = fetch_memories(base, "active")
    candidates = fetch_memories(base, "candidate")
    all_mems = active + candidates

    # --- Unregistered projects ---
    project_mems: dict[str, list] = defaultdict(list)
    for m in all_mems:
        proj = (m.get("project") or "").strip().lower()
        if not proj or proj in registered:
            continue
        project_mems[proj].append(m)

    unregistered = []
    for proj, mems in sorted(project_mems.items()):
        if len(mems) < PROJECT_MIN_MEMORIES:
            continue
        unregistered.append({
            "project": proj,
            "count": len(mems),
            "sample_memory_ids": [m.get("id") for m in mems[:3]],
            "sample_contents": [(m.get("content") or "")[:150] for m in mems[:3]],
        })

    # --- Emerging domain_tags (active only) ---
    tag_counter: Counter = Counter()
    for m in active:
        for t in (m.get("domain_tags") or []):
            if isinstance(t, str) and t.strip():
                tag_counter[t.strip().lower()] += 1
    emerging_tags = [{"tag": tag, "count": cnt} for tag, cnt in tag_counter.most_common(TOP_TAGS_LIMIT)]

    # --- Reinforced transients (active, high refs, has expiry) ---
    reinforced = []
    for m in active:
        ref_count = int(m.get("reference_count") or 0)
        vu = (m.get("valid_until") or "").strip()
        if ref_count >= 5 and vu:
            reinforced.append({
                "memory_id": m.get("id"),
                "reference_count": ref_count,
                "valid_until": vu,
                "content_preview": (m.get("content") or "")[:150],
                "project": m.get("project") or "",
            })

    result = {
        "unregistered_projects": unregistered,
        "emerging_categories": emerging_tags,
        "reinforced_transients": reinforced,
        "counts": {
            "active_memories": len(active),
            "candidate_memories": len(candidates),
            "unregistered_project_count": len(unregistered),
            "emerging_tag_count": len(emerging_tags),
            "reinforced_transient_count": len(reinforced),
        },
    }

    print(json.dumps(result, indent=2))

    if args.dry_run:
        return

    # --- Persist to project state via HTTP ---
    try:
        set_state(base, "proposals", "unregistered_projects", json.dumps(unregistered))
        set_state(base, "proposals", "emerging_categories", json.dumps(emerging_tags))
        set_state(base, "proposals", "reinforced_transients", json.dumps(reinforced))
    except Exception as e:
        print(f"WARN: failed to persist proposals: {e}", file=sys.stderr)

    # --- Alert on NEW projects crossing the threshold ---
    try:
        prev = fetch_previous_proposals(base)
        prev_names = {p.get("project") for p in prev if isinstance(p, dict)}
        newly_crossed = [
            p for p in unregistered
            if p["count"] >= PROJECT_ALERT_THRESHOLD
            and p["project"] not in prev_names
        ]
        if newly_crossed:
            names = ", ".join(p["project"] for p in newly_crossed)
            # Use existing alert mechanism via state (Phase 4 infra)
            try:
                set_state(base, "alert", "last_warning", json.dumps({
                    "title": f"Emerging project(s) detected: {names}",
                    "message": (
                        f"{len(newly_crossed)} unregistered project(s) crossed "
                        f"the {PROJECT_ALERT_THRESHOLD}-memory threshold. "
                        f"Review at /wiki or /admin/dashboard."
                    ),
                    "timestamp": "",
                }))
            except Exception:
                pass

        # Snapshot for next run's diff
        set_state(base, "proposals", "unregistered_projects_prev", json.dumps(unregistered))
    except Exception as e:
        print(f"WARN: alert/state write failed: {e}", file=sys.stderr)


if __name__ == "__main__":
    main()
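The emerging-categories pass above is essentially `collections.Counter.most_common` over normalized tags. A minimal sketch with hypothetical sample data; `top_tags` is an illustrative name, not a repo function:

```python
from collections import Counter


def top_tags(memories, limit=20):
    """Frequency-rank domain_tags across memories, as detect_emerging does.

    Tags are stripped and lowercased before counting, so "Optics" and
    "optics" collapse into one bucket.
    """
    counter = Counter()
    for m in memories:
        for t in (m.get("domain_tags") or []):
            if isinstance(t, str) and t.strip():
                counter[t.strip().lower()] += 1
    return [{"tag": tag, "count": cnt} for tag, cnt in counter.most_common(limit)]


mems = [
    {"domain_tags": ["Optics", "vibration"]},
    {"domain_tags": ["optics"]},
    {"domain_tags": None},  # missing tags are tolerated, not an error
]
```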
237
scripts/graduate_memories.py
Normal file
@@ -0,0 +1,237 @@
#!/usr/bin/env python3
"""Phase 5F — Memory → Entity graduation batch pass.

Takes active memories, asks claude-p whether each describes a typed
engineering entity, and creates entity candidates for the ones that do.
Each candidate carries source_refs back to its source memory so human
review can trace provenance.

Human reviews the entity candidates via /admin/triage (same UI as memory
triage). When a candidate is promoted, a post-promote hook marks the source
memory as `graduated` and sets `graduated_to_entity_id` for traceability.

This is THE population move: without it, the engineering graph stays sparse
and the killer queries (Q-006/009/011) have nothing to find gaps in.

Usage:
    python3 scripts/graduate_memories.py --base-url http://127.0.0.1:8100 \\
        --project p05-interferometer --limit 20

    # Dry run (don't create entities, just show decisions):
    python3 scripts/graduate_memories.py --project p05-interferometer --dry-run

    # Process all active memories across all projects (big run):
    python3 scripts/graduate_memories.py --limit 200

Host-side because claude CLI lives on Dalidou, not in the container.
"""

from __future__ import annotations

import argparse
import json
import os
import shutil
import subprocess
import sys
import tempfile
import time
import urllib.error
import urllib.request
from typing import Any

# Make src/ importable so we can reuse the stdlib-only prompt module
_SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
_SRC_DIR = os.path.abspath(os.path.join(_SCRIPT_DIR, "..", "src"))
if _SRC_DIR not in sys.path:
    sys.path.insert(0, _SRC_DIR)

from atocore.engineering._graduation_prompt import (  # noqa: E402
    GRADUATION_PROMPT_VERSION,
    SYSTEM_PROMPT,
    build_user_message,
    parse_graduation_output,
)


DEFAULT_BASE_URL = os.environ.get("ATOCORE_BASE_URL", "http://127.0.0.1:8100")
DEFAULT_MODEL = os.environ.get("ATOCORE_LLM_EXTRACTOR_MODEL", "sonnet")
DEFAULT_TIMEOUT_S = float(os.environ.get("ATOCORE_GRADUATION_TIMEOUT_S", "90"))

_sandbox_cwd = None


def get_sandbox_cwd() -> str:
    """Temp cwd so claude CLI doesn't auto-discover project CLAUDE.md files."""
    global _sandbox_cwd
    if _sandbox_cwd is None:
        _sandbox_cwd = tempfile.mkdtemp(prefix="ato-graduate-")
    return _sandbox_cwd


def api_get(base_url: str, path: str) -> dict:
    req = urllib.request.Request(f"{base_url}{path}")
    with urllib.request.urlopen(req, timeout=15) as resp:
        return json.loads(resp.read().decode("utf-8"))


def api_post(base_url: str, path: str, body: dict | None = None) -> dict:
    data = json.dumps(body or {}).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}{path}", method="POST",
        headers={"Content-Type": "application/json"}, data=data,
    )
    with urllib.request.urlopen(req, timeout=15) as resp:
        return json.loads(resp.read().decode("utf-8"))


def graduate_one(memory: dict, model: str, timeout_s: float) -> dict[str, Any] | None:
    """Ask claude whether this memory describes a typed entity.

    Returns None on any failure (parse error, timeout, exit != 0).
    Applies retry+pacing to match the pattern in auto_triage/batch_extract.
    """
    if not shutil.which("claude"):
        return None

    user_msg = build_user_message(
        memory_content=memory.get("content", "") or "",
        memory_project=memory.get("project", "") or "",
        memory_type=memory.get("memory_type", "") or "",
    )

    args = [
        "claude", "-p",
        "--model", model,
        "--append-system-prompt", SYSTEM_PROMPT,
        "--disable-slash-commands",
        user_msg,
    ]

    last_error = ""
    for attempt in range(3):
        if attempt > 0:
            time.sleep(2 ** attempt)
        try:
            completed = subprocess.run(
                args, capture_output=True, text=True,
                timeout=timeout_s, cwd=get_sandbox_cwd(),
                encoding="utf-8", errors="replace",
            )
        except subprocess.TimeoutExpired:
            last_error = "timeout"
            continue
        except Exception as exc:
            last_error = f"subprocess error: {exc}"
            continue

        if completed.returncode == 0:
            return parse_graduation_output(completed.stdout or "")

        stderr = (completed.stderr or "").strip()[:200]
        last_error = f"exit_{completed.returncode}: {stderr}" if stderr else f"exit_{completed.returncode}"

    print(f"  ! claude failed after 3 tries: {last_error}", file=sys.stderr)
    return None


def create_entity_candidate(
    base_url: str,
    decision: dict,
    memory: dict,
) -> str | None:
    """Create an entity candidate with source_refs pointing at the memory."""
    try:
        result = api_post(base_url, "/entities", {
            "entity_type": decision["entity_type"],
            "name": decision["name"],
            "project": memory.get("project", "") or "",
            "description": decision["description"],
            "properties": {
                "graduated_from_memory": memory["id"],
                "proposed_relationships": decision["relationships"],
                "prompt_version": GRADUATION_PROMPT_VERSION,
            },
            "status": "candidate",
            "confidence": decision["confidence"],
            "source_refs": [f"memory:{memory['id']}"],
        })
        return result.get("id")
    except Exception as e:
        print(f"  ! entity create failed: {e}", file=sys.stderr)
        return None


def main() -> None:
    parser = argparse.ArgumentParser(description="Graduate active memories into entity candidates")
    parser.add_argument("--base-url", default=DEFAULT_BASE_URL)
    parser.add_argument("--model", default=DEFAULT_MODEL)
    parser.add_argument("--project", default=None, help="Only graduate memories in this project")
    parser.add_argument("--limit", type=int, default=50, help="Max memories to process")
    parser.add_argument("--min-confidence", type=float, default=0.3,
                        help="Skip memories with confidence below this (they're probably noise)")
    parser.add_argument("--dry-run", action="store_true", help="Show decisions without creating entities")
    args = parser.parse_args()

    # Fetch active memories
    query = "status=active"
    query += f"&limit={args.limit}"
    if args.project:
        query += f"&project={args.project}"
    result = api_get(args.base_url, f"/memory?{query}")
    memories = result.get("memories", [])

    # Filter by min_confidence + skip already-graduated
    memories = [m for m in memories
                if m.get("confidence", 0) >= args.min_confidence
                and m.get("status") != "graduated"]

    print(f"graduating: {len(memories)} memories project={args.project or '(all)'} "
          f"model={args.model} dry_run={args.dry_run}")

    graduated = 0
    skipped = 0
    errors = 0
    entities_created: list[str] = []

    for i, mem in enumerate(memories, 1):
        if i > 1:
            time.sleep(0.5)  # light pacing, matches auto_triage
        mid = mem["id"]
        label = f"[{i:3d}/{len(memories)}] {mid[:8]} [{mem.get('memory_type', '?')}]"

        decision = graduate_one(mem, args.model, DEFAULT_TIMEOUT_S)
        if decision is None:
            print(f"  ERROR {label} (graduate_one returned None)")
            errors += 1
            continue

        if not decision.get("graduate"):
            reason = decision.get("reason", "(no reason)")
            print(f"  skip  {label} {reason}")
            skipped += 1
            continue

        etype = decision["entity_type"]
        ename = decision["name"]
        nrel = len(decision.get("relationships", []))

        if args.dry_run:
            print(f"  WOULD {label} → [{etype}] {ename!r} ({nrel} rels)")
            graduated += 1
        else:
            entity_id = create_entity_candidate(args.base_url, decision, mem)
            if entity_id:
                print(f"  CREATE {label} → [{etype}] {ename!r} ({nrel} rels) entity={entity_id[:8]}")
                graduated += 1
                entities_created.append(entity_id)
            else:
                errors += 1

    print(f"\ntotal: graduated={graduated} skipped={skipped} errors={errors}")
    if entities_created:
        print(f"Review at /admin/triage ({len(entities_created)} entity candidates created)")


if __name__ == "__main__":
    main()
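The eligibility filter in graduate_memories.py's `main()` combines a confidence floor with an already-graduated skip. Isolated as a sketch with hypothetical data; `eligible_for_graduation` is an illustrative name, not a repo function:

```python
def eligible_for_graduation(memories, min_confidence=0.3):
    """Filter step from main(): drop low-confidence and already-graduated memories.

    min_confidence defaults to 0.3, matching the script's --min-confidence default.
    """
    return [m for m in memories
            if m.get("confidence", 0) >= min_confidence
            and m.get("status") != "graduated"]


mems = [
    {"id": "a1", "confidence": 0.9, "status": "active"},      # kept
    {"id": "b2", "confidence": 0.1, "status": "active"},      # below the floor
    {"id": "c3", "confidence": 0.9, "status": "graduated"},   # already promoted
]
```

The graduated-status check is what makes repeated cron runs idempotent: a memory that already produced an entity is never re-submitted to the LLM.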
49
scripts/integrity_check.py
Normal file
@@ -0,0 +1,49 @@
#!/usr/bin/env python3
"""Trigger the integrity check inside the AtoCore container.

The scan itself lives in the container (needs direct DB access via the
already-loaded sqlite connection). This host-side wrapper just POSTs to
/admin/integrity-check so the nightly cron can kick it off from bash
without needing the container's Python deps on the host.

Usage:
    python3 scripts/integrity_check.py [--base-url URL] [--dry-run]
"""

from __future__ import annotations

import argparse
import json
import os
import sys
import urllib.parse
import urllib.request


def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--base-url", default=os.environ.get("ATOCORE_BASE_URL", "http://127.0.0.1:8100"))
    parser.add_argument("--dry-run", action="store_true",
                        help="Report without persisting findings to state")
    args = parser.parse_args()

    url = args.base_url.rstrip("/") + "/admin/integrity-check"
    if args.dry_run:
        url += "?persist=false"

    req = urllib.request.Request(url, method="POST")
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            result = json.loads(resp.read().decode("utf-8"))
    except Exception as e:
        print(f"ERROR: could not reach {url}: {e}", file=sys.stderr)
        sys.exit(1)

    print(json.dumps(result, indent=2))
    if not result.get("ok", True):
        # Non-zero exit so cron logs flag it
        sys.exit(2)


if __name__ == "__main__":
    main()
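The wrapper above relies on two small conventions: append `?persist=false` for dry runs, and exit with code 2 when the check reports findings (versus 1 for a connection failure and 0 for a clean pass). Sketched as pure helpers with hypothetical names:

```python
def build_check_url(base_url, dry_run=False):
    """URL construction used by integrity_check.py (illustrative helper)."""
    url = base_url.rstrip("/") + "/admin/integrity-check"
    if dry_run:
        url += "?persist=false"
    return url


def exit_code_for(result):
    """Exit-code convention: 0 = ok (or key absent), 2 = findings reported."""
    return 0 if result.get("ok", True) else 2
```

Distinct exit codes let the nightly cron entry distinguish "server unreachable" from "integrity findings" in its logs without parsing JSON.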
374
scripts/memory_dedup.py
Normal file
@@ -0,0 +1,374 @@
#!/usr/bin/env python3
"""Phase 7A — semantic memory dedup detector (stdlib-only host script).

Finds clusters of near-duplicate active memories and writes merge-
candidate proposals for human (or autonomous) approval.

Algorithm:
1. POST /admin/memory/dedup-cluster on the AtoCore server — it
   computes embeddings + transitive clusters under the (project,
   memory_type) bucket rule (sentence-transformers lives in the
   container, not on the host)
2. For each returned cluster of size >= 2, ask claude-p (host-side
   CLI) to draft unified content preserving all specifics
3. Server-side tiering:
   - TIER-1 auto-approve: sonnet confidence >= 0.8 AND min_sim >= 0.92
     AND all sources share project+type → immediately submit and
     approve (actor="auto-dedup-tier1")
   - TIER-2 escalation: opus confirms with conf >= 0.8 → auto-approve
     (actor="auto-dedup-tier2"); opus rejects → skip silently
   - HUMAN: pending proposal lands in /admin/triage

Host-only dep: the `claude` CLI. No python packages beyond stdlib.
Reuses atocore.memory._dedup_prompt (stdlib-only shared prompt).

Usage:
    python3 scripts/memory_dedup.py --base-url http://127.0.0.1:8100 \\
        --similarity-threshold 0.88 --max-batch 50
"""

from __future__ import annotations

import argparse
import json
import os
import shutil
import subprocess
import sys
import tempfile
import time
import urllib.error
import urllib.request
from typing import Any

# Make src/ importable for the stdlib-only prompt module.
# We DO NOT import anything that pulls in pydantic_settings or
# sentence-transformers; those live on the server side.
_SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
_SRC_DIR = os.path.abspath(os.path.join(_SCRIPT_DIR, "..", "src"))
if _SRC_DIR not in sys.path:
    sys.path.insert(0, _SRC_DIR)

from atocore.memory._dedup_prompt import (  # noqa: E402
    DEDUP_PROMPT_VERSION,
    SYSTEM_PROMPT,
    TIER2_SYSTEM_PROMPT,
    build_tier2_user_message,
    build_user_message,
    normalize_merge_verdict,
    parse_merge_verdict,
)

DEFAULT_BASE_URL = os.environ.get("ATOCORE_BASE_URL", "http://127.0.0.1:8100")
DEFAULT_MODEL = os.environ.get("ATOCORE_DEDUP_MODEL", "sonnet")
DEFAULT_TIER2_MODEL = os.environ.get("ATOCORE_DEDUP_TIER2_MODEL", "opus")
DEFAULT_TIMEOUT_S = float(os.environ.get("ATOCORE_DEDUP_TIMEOUT_S", "60"))

AUTO_APPROVE_CONF = float(os.environ.get("ATOCORE_DEDUP_AUTO_APPROVE_CONF", "0.8"))
AUTO_APPROVE_SIM = float(os.environ.get("ATOCORE_DEDUP_AUTO_APPROVE_SIM", "0.92"))
TIER2_MIN_CONF = float(os.environ.get("ATOCORE_DEDUP_TIER2_MIN_CONF", "0.5"))
TIER2_MIN_SIM = float(os.environ.get("ATOCORE_DEDUP_TIER2_MIN_SIM", "0.85"))

_sandbox_cwd = None


def get_sandbox_cwd() -> str:
    global _sandbox_cwd
    if _sandbox_cwd is None:
        _sandbox_cwd = tempfile.mkdtemp(prefix="ato-dedup-")
    return _sandbox_cwd


def api_get(base_url: str, path: str) -> dict:
    req = urllib.request.Request(f"{base_url}{path}")
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def api_post(base_url: str, path: str, body: dict | None = None, timeout: int = 60) -> dict:
    data = json.dumps(body or {}).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}{path}", method="POST",
        headers={"Content-Type": "application/json"}, data=data,
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def call_claude(system_prompt: str, user_message: str, model: str, timeout_s: float) -> tuple[str | None, str | None]:
|
||||
if not shutil.which("claude"):
|
||||
return None, "claude CLI not available"
|
||||
args = [
|
||||
"claude", "-p",
|
||||
"--model", model,
|
||||
"--append-system-prompt", system_prompt,
|
||||
"--disable-slash-commands",
|
||||
user_message,
|
||||
]
|
||||
last_error = ""
|
||||
for attempt in range(3):
|
||||
if attempt > 0:
|
||||
time.sleep(2 ** attempt)
|
||||
try:
|
||||
completed = subprocess.run(
|
||||
args, capture_output=True, text=True,
|
||||
timeout=timeout_s, cwd=get_sandbox_cwd(),
|
||||
encoding="utf-8", errors="replace",
|
||||
)
|
||||
except subprocess.TimeoutExpired:
|
||||
last_error = f"{model} timed out"
|
||||
continue
|
||||
except Exception as exc:
|
||||
last_error = f"subprocess error: {exc}"
|
||||
continue
|
||||
if completed.returncode == 0:
|
||||
return (completed.stdout or "").strip(), None
|
||||
stderr = (completed.stderr or "").strip()[:200]
|
||||
last_error = f"{model} exit {completed.returncode}: {stderr}"
|
||||
return None, last_error
|
||||
|
||||
|
||||
def fetch_clusters(base_url: str, project: str, threshold: float, max_clusters: int) -> list[dict]:
|
||||
"""Ask the server to compute near-duplicate clusters. The server
|
||||
owns sentence-transformers; host stays lean."""
|
||||
try:
|
||||
result = api_post(base_url, "/admin/memory/dedup-cluster", {
|
||||
"project": project,
|
||||
"similarity_threshold": threshold,
|
||||
"max_clusters": max_clusters,
|
||||
}, timeout=120)
|
||||
except Exception as e:
|
||||
print(f"ERROR: dedup-cluster fetch failed: {e}", file=sys.stderr)
|
||||
return []
|
||||
clusters = result.get("clusters", [])
|
||||
print(
|
||||
f"server returned {len(clusters)} clusters "
|
||||
f"(total_active={result.get('total_active_scanned')}, "
|
||||
f"buckets={result.get('bucket_count')})"
|
||||
)
|
||||
return clusters
|
||||
|
||||
|
||||
def draft_merge(sources: list[dict], model: str, timeout_s: float) -> dict[str, Any] | None:
|
||||
user_msg = build_user_message(sources)
|
||||
raw, err = call_claude(SYSTEM_PROMPT, user_msg, model, timeout_s)
|
||||
if err:
|
||||
print(f" WARN: claude tier-1 failed: {err}", file=sys.stderr)
|
||||
return None
|
||||
parsed = parse_merge_verdict(raw or "")
|
||||
if parsed is None:
|
||||
print(f" WARN: could not parse tier-1 verdict: {(raw or '')[:200]}", file=sys.stderr)
|
||||
return None
|
||||
return normalize_merge_verdict(parsed)
|
||||
|
||||
|
||||
def tier2_review(sources: list[dict], tier1_verdict: dict, model: str, timeout_s: float) -> dict | None:
|
||||
user_msg = build_tier2_user_message(sources, tier1_verdict)
|
||||
raw, err = call_claude(TIER2_SYSTEM_PROMPT, user_msg, model, timeout_s)
|
||||
if err:
|
||||
print(f" WARN: claude tier-2 failed: {err}", file=sys.stderr)
|
||||
return None
|
||||
parsed = parse_merge_verdict(raw or "")
|
||||
if parsed is None:
|
||||
print(f" WARN: could not parse tier-2 verdict: {(raw or '')[:200]}", file=sys.stderr)
|
||||
return None
|
||||
return normalize_merge_verdict(parsed)
|
||||
|
||||
|
||||
def submit_candidate(base_url: str, memory_ids: list[str], similarity: float, verdict: dict, dry_run: bool) -> str | None:
|
||||
body = {
|
||||
"memory_ids": memory_ids,
|
||||
"similarity": similarity,
|
||||
"proposed_content": verdict["content"],
|
||||
"proposed_memory_type": verdict["memory_type"],
|
||||
"proposed_project": verdict["project"],
|
||||
"proposed_tags": verdict["domain_tags"],
|
||||
"proposed_confidence": verdict["confidence"],
|
||||
"reason": verdict["reason"],
|
||||
}
|
||||
if dry_run:
|
||||
print(f" [dry-run] would POST: {json.dumps(body)[:200]}...")
|
||||
return "dry-run"
|
||||
try:
|
||||
result = api_post(base_url, "/admin/memory/merge-candidates/create", body)
|
||||
return result.get("candidate_id")
|
||||
except urllib.error.HTTPError as e:
|
||||
print(f" ERROR: submit failed: {e.code} {e.read().decode()[:200]}", file=sys.stderr)
|
||||
return None
|
||||
except Exception as e:
|
||||
print(f" ERROR: submit failed: {e}", file=sys.stderr)
|
||||
return None
|
||||
|
||||
|
||||
def auto_approve(base_url: str, candidate_id: str, actor: str, dry_run: bool) -> str | None:
|
||||
if dry_run:
|
||||
return "dry-run"
|
||||
try:
|
||||
result = api_post(
|
||||
base_url,
|
||||
f"/admin/memory/merge-candidates/{candidate_id}/approve",
|
||||
{"actor": actor},
|
||||
)
|
||||
return result.get("result_memory_id")
|
||||
except Exception as e:
|
||||
print(f" ERROR: auto-approve failed: {e}", file=sys.stderr)
|
||||
return None
|
||||
|
||||
|
||||
def main() -> None:
|
||||
parser = argparse.ArgumentParser(description="Phase 7A semantic dedup detector (tiered, stdlib-only host)")
|
||||
parser.add_argument("--base-url", default=DEFAULT_BASE_URL)
|
||||
parser.add_argument("--project", default="", help="Only scan this project (empty = all)")
|
||||
parser.add_argument("--similarity-threshold", type=float, default=0.88)
|
||||
parser.add_argument("--max-batch", type=int, default=50,
|
||||
help="Max clusters to process per run")
|
||||
parser.add_argument("--model", default=DEFAULT_MODEL)
|
||||
parser.add_argument("--tier2-model", default=DEFAULT_TIER2_MODEL)
|
||||
parser.add_argument("--timeout-s", type=float, default=DEFAULT_TIMEOUT_S)
|
||||
parser.add_argument("--no-auto-approve", action="store_true",
|
||||
help="Disable autonomous merging; all merges land in human triage queue")
|
||||
parser.add_argument("--dry-run", action="store_true")
|
||||
args = parser.parse_args()
|
||||
|
||||
base = args.base_url.rstrip("/")
|
||||
autonomous = not args.no_auto_approve
|
||||
|
||||
print(
|
||||
f"memory_dedup {DEDUP_PROMPT_VERSION} | threshold={args.similarity_threshold} | "
|
||||
f"tier1={args.model} tier2={args.tier2_model} | "
|
||||
f"autonomous={autonomous} | "
|
||||
f"auto-approve: conf>={AUTO_APPROVE_CONF} sim>={AUTO_APPROVE_SIM}"
|
||||
)
|
||||
|
||||
clusters = fetch_clusters(
|
||||
base, args.project, args.similarity_threshold, args.max_batch,
|
||||
)
|
||||
if not clusters:
|
||||
print("no clusters — nothing to dedup")
|
||||
return
|
||||
|
||||
auto_merged_tier1 = 0
|
||||
auto_merged_tier2 = 0
|
||||
human_candidates = 0
|
||||
tier1_rejections = 0
|
||||
tier2_overrides = 0
|
||||
skipped_existing = 0
|
||||
processed = 0
|
||||
|
||||
for cluster in clusters:
|
||||
if processed >= args.max_batch:
|
||||
break
|
||||
processed += 1
|
||||
sources = cluster["sources"]
|
||||
ids = cluster["memory_ids"]
|
||||
min_sim = float(cluster["min_similarity"])
|
||||
proj = cluster.get("project") or "(global)"
|
||||
mtype = cluster.get("memory_type") or "?"
|
||||
|
||||
print(f"\n[{proj}/{mtype}] cluster size={cluster['size']} min_sim={min_sim:.3f} "
|
||||
f"{[s['id'][:8] for s in sources]}")
|
||||
|
||||
tier1 = draft_merge(sources, args.model, args.timeout_s)
|
||||
if tier1 is None:
|
||||
continue
|
||||
if tier1["action"] == "reject":
|
||||
tier1_rejections += 1
|
||||
print(f" TIER-1 rejected: {tier1['reason'][:100]}")
|
||||
continue
|
||||
|
||||
# All sources share the bucket by construction from the server
|
||||
bucket_ok = True
|
||||
tier1_ok = (
|
||||
tier1["confidence"] >= AUTO_APPROVE_CONF
|
||||
and min_sim >= AUTO_APPROVE_SIM
|
||||
and bucket_ok
|
||||
)
|
||||
|
||||
if autonomous and tier1_ok:
|
||||
cid = submit_candidate(base, ids, min_sim, tier1, args.dry_run)
|
||||
if cid == "dry-run":
|
||||
auto_merged_tier1 += 1
|
||||
print(" [dry-run] would auto-merge (tier-1)")
|
||||
elif cid:
|
||||
new_id = auto_approve(base, cid, actor="auto-dedup-tier1", dry_run=args.dry_run)
|
||||
if new_id:
|
||||
auto_merged_tier1 += 1
|
||||
print(f" ✅ auto-merged (tier-1) → {str(new_id)[:8]}")
|
||||
else:
|
||||
human_candidates += 1
|
||||
print(f" ⚠️ tier-1 approve failed; candidate {cid[:8]} pending")
|
||||
else:
|
||||
skipped_existing += 1
|
||||
time.sleep(0.3)
|
||||
continue
|
||||
|
||||
tier2_eligible = (
|
||||
autonomous
|
||||
and min_sim >= TIER2_MIN_SIM
|
||||
and tier1["confidence"] >= TIER2_MIN_CONF
|
||||
)
|
||||
|
||||
if tier2_eligible:
|
||||
print(" → escalating to tier-2 (opus)…")
|
||||
tier2 = tier2_review(sources, tier1, args.tier2_model, args.timeout_s)
|
||||
if tier2 is None:
|
||||
cid = submit_candidate(base, ids, min_sim, tier1, args.dry_run)
|
||||
if cid and cid != "dry-run":
|
||||
human_candidates += 1
|
||||
print(f" → candidate {cid[:8]} (tier-2 errored, human review)")
|
||||
time.sleep(0.5)
|
||||
continue
|
||||
|
||||
if tier2["action"] == "reject":
|
||||
tier2_overrides += 1
|
||||
print(f" ❌ TIER-2 override (reject): {tier2['reason'][:100]}")
|
||||
time.sleep(0.5)
|
||||
continue
|
||||
|
||||
if tier2["confidence"] >= AUTO_APPROVE_CONF:
|
||||
cid = submit_candidate(base, ids, min_sim, tier2, args.dry_run)
|
||||
if cid == "dry-run":
|
||||
auto_merged_tier2 += 1
|
||||
elif cid:
|
||||
new_id = auto_approve(base, cid, actor="auto-dedup-tier2", dry_run=args.dry_run)
|
||||
if new_id:
|
||||
auto_merged_tier2 += 1
|
||||
print(f" ✅ auto-merged (tier-2) → {str(new_id)[:8]}")
|
||||
else:
|
||||
human_candidates += 1
|
||||
else:
|
||||
skipped_existing += 1
|
||||
time.sleep(0.5)
|
||||
continue
|
||||
|
||||
cid = submit_candidate(base, ids, min_sim, tier2, args.dry_run)
|
||||
if cid and cid != "dry-run":
|
||||
human_candidates += 1
|
||||
print(f" → candidate {cid[:8]} (tier-2 low-conf, human review)")
|
||||
time.sleep(0.5)
|
||||
continue
|
||||
|
||||
# Below tier-2 thresholds — human review with tier-1 draft
|
||||
cid = submit_candidate(base, ids, min_sim, tier1, args.dry_run)
|
||||
if cid == "dry-run":
|
||||
human_candidates += 1
|
||||
elif cid:
|
||||
human_candidates += 1
|
||||
print(f" → candidate {cid[:8]} (human review)")
|
||||
else:
|
||||
skipped_existing += 1
|
||||
time.sleep(0.3)
|
||||
|
||||
print(
|
||||
f"\nsummary: clusters_processed={processed} "
|
||||
f"auto_merged_tier1={auto_merged_tier1} "
|
||||
f"auto_merged_tier2={auto_merged_tier2} "
|
||||
f"human_candidates={human_candidates} "
|
||||
f"tier1_rejections={tier1_rejections} "
|
||||
f"tier2_overrides={tier2_overrides} "
|
||||
f"skipped_existing={skipped_existing}"
|
||||
)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
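The tiering described in the script's docstring reduces to a small pure routing decision over two numbers: the draft model's confidence and the cluster's minimum pairwise similarity. A minimal standalone sketch of that decision (illustrative names; the real script interleaves this with API calls rather than isolating it as one function):

```python
# Thresholds mirror the script's env-overridable defaults.
AUTO_APPROVE_CONF = 0.8   # tier-1: minimum draft-model confidence
AUTO_APPROVE_SIM = 0.92   # tier-1: minimum intra-cluster similarity
TIER2_MIN_CONF = 0.5      # below this, skip opus entirely
TIER2_MIN_SIM = 0.85      # below this, skip opus entirely


def route(confidence: float, min_sim: float, autonomous: bool = True) -> str:
    """Decide where a tier-1 merge draft goes.

    Returns "tier1-auto" (submit + approve immediately),
    "tier2-escalate" (ask the stronger model to confirm), or
    "human" (pending proposal for the triage queue).
    """
    if not autonomous:
        return "human"
    if confidence >= AUTO_APPROVE_CONF and min_sim >= AUTO_APPROVE_SIM:
        return "tier1-auto"
    if confidence >= TIER2_MIN_CONF and min_sim >= TIER2_MIN_SIM:
        return "tier2-escalate"
    return "human"
```

A high-confidence draft over a merely "close" cluster (sim 0.85 to 0.92) still gets a second opinion; only agreement on both axes merges without review.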
87 scripts/windows/atocore-backup-pull.ps1 Normal file
@@ -0,0 +1,87 @@
# atocore-backup-pull.ps1
#
# Pull the latest AtoCore backup snapshot from Dalidou to this Windows machine.
# Designed to be run by Windows Task Scheduler. Fail-open by design -- if
# Dalidou is unreachable (laptop on the road, etc.), exit cleanly without error.
#
# Usage (manual test):
#   powershell.exe -ExecutionPolicy Bypass -File atocore-backup-pull.ps1
#
# Scheduled task: see docs/windows-backup-setup.md for Task Scheduler config.

$ErrorActionPreference = "Continue"

# --- Configuration ---
$Remote = "papa@dalidou"
$RemoteSnapshots = "/srv/storage/atocore/backups/snapshots"
$LocalBackupDir = "$env:USERPROFILE\Documents\ATOCore_Backups"
$LogDir = "$LocalBackupDir\_logs"
$ReachabilityTest = 5  # seconds timeout for SSH probe

# --- Setup ---
if (-not (Test-Path $LocalBackupDir)) {
    New-Item -ItemType Directory -Path $LocalBackupDir -Force | Out-Null
}
if (-not (Test-Path $LogDir)) {
    New-Item -ItemType Directory -Path $LogDir -Force | Out-Null
}

$Timestamp = Get-Date -Format "yyyy-MM-dd_HHmmss"
$LogFile = "$LogDir\backup-$Timestamp.log"

function Log($msg) {
    $line = "[{0}] {1}" -f (Get-Date -Format "yyyy-MM-dd HH:mm:ss"), $msg
    Write-Host $line
    Add-Content -Path $LogFile -Value $line
}

Log "=== AtoCore backup pull starting ==="
Log "Remote: $Remote"
Log "Local target: $LocalBackupDir"

# --- Reachability check: fail open if Dalidou is offline ---
Log "Checking Dalidou reachability..."
$probe = & ssh -o ConnectTimeout=$ReachabilityTest -o BatchMode=yes `
    -o StrictHostKeyChecking=accept-new `
    $Remote "echo ok" 2>&1
if ($LASTEXITCODE -ne 0 -or $probe -ne "ok") {
    Log "Dalidou unreachable ($probe) -- fail-open exit"
    exit 0
}
Log "Dalidou reachable."

# --- Pull the entire snapshots directory ---
# Dalidou's retention policy (7 daily + 4 weekly + 6 monthly) already caps
# the snapshot count, so pulling the whole dir is bounded and simple. scp
# will overwrite local files -- we rely on this to pick up new snapshots.
Log "Pulling snapshots via scp..."
$LocalSnapshotsDir = Join-Path $LocalBackupDir "snapshots"
if (-not (Test-Path $LocalSnapshotsDir)) {
    New-Item -ItemType Directory -Path $LocalSnapshotsDir -Force | Out-Null
}

& scp -o BatchMode=yes -r "${Remote}:${RemoteSnapshots}/*" "$LocalSnapshotsDir\" 2>&1 |
    ForEach-Object { Add-Content -Path $LogFile -Value $_ }

if ($LASTEXITCODE -ne 0) {
    Log "scp failed with exit $LASTEXITCODE"
    exit 0  # fail-open
}

# --- Stats ---
$snapshots = Get-ChildItem -Path $LocalSnapshotsDir -Directory |
    Where-Object { $_.Name -match "^\d{8}T\d{6}Z$" } |
    Sort-Object Name -Descending

$totalSize = (Get-ChildItem $LocalSnapshotsDir -Recurse -File | Measure-Object -Property Length -Sum).Sum
$SizeMB = [math]::Round($totalSize / 1MB, 2)
$latest = if ($snapshots.Count -gt 0) { $snapshots[0].Name } else { "(none)" }

Log ("Pulled {0} snapshots successfully (total {1} MB, latest: {2})" -f $snapshots.Count, $SizeMB, $latest)
Log "=== backup complete ==="

# --- Log retention: keep last 30 log files ---
Get-ChildItem -Path $LogDir -Filter "backup-*.log" |
    Sort-Object Name -Descending |
    Select-Object -Skip 30 |
    ForEach-Object { Remove-Item $_.FullName -Force -ErrorAction SilentlyContinue }
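The script's log-retention tail works because `backup-YYYY-MM-DD_HHMMSS.log` names sort lexicographically in chronological order, so sorting descending and skipping the first 30 leaves exactly the prune set. The same selection logic, sketched in Python for clarity (this helper is illustrative, not part of the repo):

```python
def logs_to_prune(log_names: list[str], keep: int = 30) -> list[str]:
    """Return the log filenames to delete, keeping the `keep` newest.

    Relies on backup-YYYY-MM-DD_HHMMSS.log names sorting
    lexicographically by age: descending order puts the newest
    first, and everything past index `keep` is prunable.
    """
    return sorted(log_names, reverse=True)[keep:]
```

This is the filename-as-timestamp trick: no date parsing is needed as long as the format is zero-padded and fixed-width.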
File diff suppressed because it is too large
31 src/atocore/assets/__init__.py Normal file
@@ -0,0 +1,31 @@
"""Binary asset store (Issue F — visual evidence)."""

from atocore.assets.service import (
    ALLOWED_MIME_TYPES,
    Asset,
    AssetError,
    AssetNotFound,
    AssetTooLarge,
    AssetTypeNotAllowed,
    get_asset,
    get_asset_binary,
    get_thumbnail,
    invalidate_asset,
    list_orphan_assets,
    store_asset,
)

__all__ = [
    "ALLOWED_MIME_TYPES",
    "Asset",
    "AssetError",
    "AssetNotFound",
    "AssetTooLarge",
    "AssetTypeNotAllowed",
    "get_asset",
    "get_asset_binary",
    "get_thumbnail",
    "invalidate_asset",
    "list_orphan_assets",
    "store_asset",
]
367 src/atocore/assets/service.py Normal file
@@ -0,0 +1,367 @@
"""Binary asset storage with hash-dedup and on-demand thumbnails.

Issue F — visual evidence. Stores uploaded images / PDFs / CAD exports
under ``<assets_dir>/<hash[:2]>/<hash>.<ext>``. Re-uploads are idempotent
on SHA-256. Thumbnails are generated on first request and cached under
``<assets_dir>/.thumbnails/<size>/<hash>.jpg``.

Kept deliberately small: no authentication, no background jobs, no
image transformations beyond thumbnailing. Callers (API layer) own
MIME validation and size caps.
"""

from __future__ import annotations

import hashlib
import json
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from io import BytesIO
from pathlib import Path

import atocore.config as _config
from atocore.models.database import get_connection
from atocore.observability.logger import get_logger

log = get_logger("assets")


# Whitelisted mime types. Start conservative; extend when a real use
# case lands rather than speculatively.
ALLOWED_MIME_TYPES: dict[str, str] = {
    "image/png": "png",
    "image/jpeg": "jpg",
    "image/webp": "webp",
    "image/gif": "gif",
    "application/pdf": "pdf",
    "model/step": "step",
    "model/iges": "iges",
}


class AssetError(Exception):
    """Base class for asset errors."""


class AssetTooLarge(AssetError):
    pass


class AssetTypeNotAllowed(AssetError):
    pass


class AssetNotFound(AssetError):
    pass


@dataclass
class Asset:
    id: str
    hash_sha256: str
    mime_type: str
    size_bytes: int
    stored_path: str
    width: int | None = None
    height: int | None = None
    original_filename: str = ""
    project: str = ""
    caption: str = ""
    source_refs: list[str] = field(default_factory=list)
    status: str = "active"
    created_at: str = ""
    updated_at: str = ""

    def to_dict(self) -> dict:
        return {
            "id": self.id,
            "hash_sha256": self.hash_sha256,
            "mime_type": self.mime_type,
            "size_bytes": self.size_bytes,
            "width": self.width,
            "height": self.height,
            "stored_path": self.stored_path,
            "original_filename": self.original_filename,
            "project": self.project,
            "caption": self.caption,
            "source_refs": self.source_refs,
            "status": self.status,
            "created_at": self.created_at,
            "updated_at": self.updated_at,
        }


def _assets_root() -> Path:
    root = _config.settings.resolved_assets_dir
    root.mkdir(parents=True, exist_ok=True)
    return root


def _blob_path(hash_sha256: str, ext: str) -> Path:
    root = _assets_root()
    return root / hash_sha256[:2] / f"{hash_sha256}.{ext}"


def _thumbnails_root() -> Path:
    return _assets_root() / ".thumbnails"


def _thumbnail_path(hash_sha256: str, size: int) -> Path:
    return _thumbnails_root() / str(size) / f"{hash_sha256}.jpg"


def _image_dimensions(data: bytes, mime_type: str) -> tuple[int | None, int | None]:
    if not mime_type.startswith("image/"):
        return None, None
    try:
        from PIL import Image
    except Exception:
        return None, None
    try:
        with Image.open(BytesIO(data)) as img:
            return img.width, img.height
    except Exception as e:
        log.warning("asset_dimension_probe_failed", error=str(e))
        return None, None


def store_asset(
    data: bytes,
    mime_type: str,
    original_filename: str = "",
    project: str = "",
    caption: str = "",
    source_refs: list[str] | None = None,
) -> Asset:
    """Persist a binary blob and return the catalog row.

    Idempotent on SHA-256 — a re-upload returns the existing asset row
    without rewriting the blob or creating a duplicate catalog entry.
    Caption / project / source_refs on re-upload are ignored; update
    those via the owning entity's properties instead.
    """
    max_bytes = _config.settings.assets_max_upload_bytes
    if len(data) > max_bytes:
        raise AssetTooLarge(
            f"Upload is {len(data)} bytes; limit is {max_bytes} bytes"
        )
    if mime_type not in ALLOWED_MIME_TYPES:
        raise AssetTypeNotAllowed(
            f"mime_type {mime_type!r} not in allowlist. "
            f"Allowed: {sorted(ALLOWED_MIME_TYPES)}"
        )

    hash_sha256 = hashlib.sha256(data).hexdigest()
    ext = ALLOWED_MIME_TYPES[mime_type]

    # Idempotency — if we already have this hash, return the existing row.
    existing = _fetch_by_hash(hash_sha256)
    if existing is not None:
        log.info("asset_dedup_hit", asset_id=existing.id, hash=hash_sha256[:12])
        return existing

    width, height = _image_dimensions(data, mime_type)

    blob_path = _blob_path(hash_sha256, ext)
    blob_path.parent.mkdir(parents=True, exist_ok=True)
    blob_path.write_bytes(data)

    asset_id = str(uuid.uuid4())
    now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    refs = source_refs or []

    with get_connection() as conn:
        conn.execute(
            """INSERT INTO assets
               (id, hash_sha256, mime_type, size_bytes, width, height,
                stored_path, original_filename, project, caption,
                source_refs, status, created_at, updated_at)
               VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, 'active', ?, ?)""",
            (
                asset_id, hash_sha256, mime_type, len(data), width, height,
                str(blob_path), original_filename, project, caption,
                json.dumps(refs), now, now,
            ),
        )

    log.info(
        "asset_stored", asset_id=asset_id, hash=hash_sha256[:12],
        mime_type=mime_type, size_bytes=len(data),
    )
    return Asset(
        id=asset_id, hash_sha256=hash_sha256, mime_type=mime_type,
        size_bytes=len(data), width=width, height=height,
        stored_path=str(blob_path), original_filename=original_filename,
        project=project, caption=caption, source_refs=refs,
        status="active", created_at=now, updated_at=now,
    )


def _fetch_by_hash(hash_sha256: str) -> Asset | None:
    with get_connection() as conn:
        row = conn.execute(
            "SELECT * FROM assets WHERE hash_sha256 = ? AND status != 'invalid'",
            (hash_sha256,),
        ).fetchone()
    return _row_to_asset(row) if row else None


def get_asset(asset_id: str) -> Asset | None:
    with get_connection() as conn:
        row = conn.execute(
            "SELECT * FROM assets WHERE id = ?", (asset_id,)
        ).fetchone()
    return _row_to_asset(row) if row else None


def get_asset_binary(asset_id: str) -> tuple[Asset, bytes]:
    """Return (metadata, raw bytes). Raises AssetNotFound."""
    asset = get_asset(asset_id)
    if asset is None or asset.status == "invalid":
        raise AssetNotFound(f"Asset not found: {asset_id}")
    path = Path(asset.stored_path)
    if not path.exists():
        raise AssetNotFound(
            f"Asset {asset_id} row exists but blob is missing at {path}"
        )
    return asset, path.read_bytes()


def get_thumbnail(asset_id: str, size: int = 240) -> tuple[Asset, bytes]:
    """Return (metadata, thumbnail JPEG bytes).

    Thumbnails are only generated for image mime types. For non-images
    the caller should render a placeholder instead. Generated thumbs
    are cached on disk at ``<assets_dir>/.thumbnails/<size>/<hash>.jpg``.
    """
    asset = get_asset(asset_id)
    if asset is None or asset.status == "invalid":
        raise AssetNotFound(f"Asset not found: {asset_id}")
    if not asset.mime_type.startswith("image/"):
        raise AssetError(
            f"Thumbnails are only supported for images; "
            f"{asset.mime_type!r} is not an image"
        )

    size = max(16, min(int(size), 2048))
    thumb_path = _thumbnail_path(asset.hash_sha256, size)
    if thumb_path.exists():
        return asset, thumb_path.read_bytes()

    try:
        from PIL import Image
    except Exception as e:
        raise AssetError(f"Pillow not available for thumbnailing: {e}")

    src_path = Path(asset.stored_path)
    if not src_path.exists():
        raise AssetNotFound(
            f"Asset {asset_id} row exists but blob is missing at {src_path}"
        )

    thumb_path.parent.mkdir(parents=True, exist_ok=True)
    with Image.open(src_path) as img:
        img = img.convert("RGB") if img.mode not in ("RGB", "L") else img
        img.thumbnail((size, size))
        buf = BytesIO()
        img.save(buf, format="JPEG", quality=85, optimize=True)
    jpeg_bytes = buf.getvalue()
    thumb_path.write_bytes(jpeg_bytes)
    return asset, jpeg_bytes


def list_orphan_assets(limit: int = 200) -> list[Asset]:
    """Assets not referenced by any active entity or memory.

    "Referenced" means: an active entity has ``properties.asset_id``
    pointing at this asset, OR any active entity / memory's
    source_refs contains ``asset:<id>``.
    """
    with get_connection() as conn:
        asset_rows = conn.execute(
            "SELECT * FROM assets WHERE status = 'active' "
            "ORDER BY created_at DESC LIMIT ?",
            (min(limit, 1000),),
        ).fetchall()

        entities_with_asset = set()
        rows = conn.execute(
            "SELECT properties, source_refs FROM entities "
            "WHERE status = 'active'"
        ).fetchall()
        for r in rows:
            try:
                props = json.loads(r["properties"] or "{}")
                aid = props.get("asset_id")
                if aid:
                    entities_with_asset.add(aid)
            except Exception:
                pass
            try:
                refs = json.loads(r["source_refs"] or "[]")
                for ref in refs:
                    if isinstance(ref, str) and ref.startswith("asset:"):
                        entities_with_asset.add(ref.split(":", 1)[1])
            except Exception:
                pass

    # Memories don't have a properties dict, but source_refs may carry
    # asset:<id> after Issue F lands for memory-level evidence.
    # The memories table has no source_refs column today — skip here
    # and extend once that lands.

    return [
        _row_to_asset(r)
        for r in asset_rows
        if r["id"] not in entities_with_asset
    ]


def invalidate_asset(asset_id: str, actor: str = "api", note: str = "") -> bool:
    """Tombstone an asset. No-op if still referenced.

    Returns True on success, False if the asset is missing or still
    referenced by an active entity (caller should get a 409 in that
    case). The blob file stays on disk until a future gc pass sweeps
    orphaned blobs — this function only flips the catalog status.
    """
    asset = get_asset(asset_id)
    if asset is None:
        return False
    orphans = list_orphan_assets(limit=1000)
    if asset.id not in {o.id for o in orphans} and asset.status == "active":
        log.info("asset_invalidate_blocked_referenced", asset_id=asset_id)
        return False

    now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    with get_connection() as conn:
        conn.execute(
            "UPDATE assets SET status = 'invalid', updated_at = ? WHERE id = ?",
            (now, asset_id),
        )
    log.info("asset_invalidated", asset_id=asset_id, actor=actor, note=note[:80])
    return True


def _row_to_asset(row) -> Asset:
    try:
        refs = json.loads(row["source_refs"] or "[]")
    except Exception:
        refs = []
    return Asset(
        id=row["id"],
        hash_sha256=row["hash_sha256"],
        mime_type=row["mime_type"],
        size_bytes=row["size_bytes"],
        width=row["width"],
        height=row["height"],
        stored_path=row["stored_path"],
        original_filename=row["original_filename"] or "",
        project=row["project"] or "",
        caption=row["caption"] or "",
        source_refs=refs,
        status=row["status"],
        created_at=row["created_at"] or "",
        updated_at=row["updated_at"] or "",
    )
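The ``<hash[:2]>/<hash>.<ext>`` layout used by service.py is a standard content-addressed fan-out: the SHA-256 of the bytes names the blob, and the first two hex characters shard it across up to 256 directories. A minimal standalone sketch of just the path rule (illustrative; the module's real `_blob_path` additionally anchors at the configured assets root):

```python
import hashlib


def blob_rel_path(data: bytes, ext: str) -> str:
    """Content-addressed relative path for a binary blob.

    The digest names the file, so identical uploads map to the same
    path (this is what makes re-uploads idempotent), and the two-char
    prefix directory keeps any single folder from growing unbounded.
    """
    digest = hashlib.sha256(data).hexdigest()
    return f"{digest[:2]}/{digest}.{ext}"
```

Because the path is a pure function of content, dedup needs no locking: a second writer of the same bytes simply rewrites an identical file.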
@@ -22,6 +22,8 @@ class Settings(BaseSettings):
    backup_dir: Path = Path("./backups")
    run_dir: Path = Path("./run")
    project_registry_path: Path = Path("./config/project-registry.json")
    assets_dir: Path | None = None
    assets_max_upload_bytes: int = 20 * 1024 * 1024  # 20 MB per upload
    host: str = "127.0.0.1"
    port: int = 8100
    db_busy_timeout_ms: int = 5000
@@ -76,6 +78,10 @@ class Settings(BaseSettings):
    def resolved_data_dir(self) -> Path:
        return self._resolve_path(self.data_dir)

    @property
    def resolved_assets_dir(self) -> Path:
        return self._resolve_path(self.assets_dir or (self.resolved_data_dir / "assets"))

    @property
    def resolved_db_dir(self) -> Path:
        return self._resolve_path(self.db_dir or (self.resolved_data_dir / "db"))
@@ -132,6 +138,7 @@ class Settings(BaseSettings):
            self.resolved_backup_dir,
            self.resolved_run_dir,
            self.resolved_project_registry_path.parent,
            self.resolved_assets_dir,
        ]

    @property
```python
@@ -508,6 +508,23 @@ def _build_engineering_context(
                f" {direction} {rel.relationship_type} [{other.entity_type}] {other.name}"
            )

    # Phase 5H: append a compact gaps summary so the LLM always sees
    # "what we're currently missing" alongside the entity neighborhood.
    # This is the director's most-used insight — orphan requirements,
    # risky decisions, unsupported claims — surfaced in every context pack
    # for project-scoped queries.
    try:
        from atocore.engineering.queries import all_gaps as _all_gaps

        gaps = _all_gaps(project)
        orphan_n = gaps["orphan_requirements"]["count"]
        risky_n = gaps["risky_decisions"]["count"]
        unsup_n = gaps["unsupported_claims"]["count"]
        if orphan_n or risky_n or unsup_n:
            lines.append("")
            lines.append(f"Gaps: {orphan_n} orphan reqs, {risky_n} risky decisions, {unsup_n} unsupported claims")
    except Exception:
        pass

    lines.append("--- End Engineering Context ---")
    text = "\n".join(lines)
```
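The hunk above appends a single compact line to the context pack. With assumed gap counts (hypothetical values standing in for `all_gaps()` output), the emitted line looks like:

```python
# Hypothetical counts standing in for all_gaps() output:
orphan_n, risky_n, unsup_n = 2, 1, 3
summary = ""
if orphan_n or risky_n or unsup_n:
    # Same f-string as the Phase 5H hunk appends to the context pack.
    summary = f"Gaps: {orphan_n} orphan reqs, {risky_n} risky decisions, {unsup_n} unsupported claims"
print(summary)  # Gaps: 2 orphan reqs, 1 risky decisions, 3 unsupported claims
```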
**`src/atocore/engineering/_graduation_prompt.py`** — new file, 194 lines

```python
@@ -0,0 +1,194 @@
"""Shared LLM prompt for memory → entity graduation (Phase 5F).

Mirrors the pattern of ``atocore.memory._llm_prompt``: stdlib-only so both
the container extractor path and the host-side graduate_memories.py script
use the same system prompt and parser, eliminating drift.

Graduation asks: "does this active memory describe a TYPED engineering entity
that belongs in the knowledge graph?" If yes, produce an entity candidate
with type + name + description + zero-or-more relationship hints. If no,
return null so the memory stays as-is.

Design note: we DON'T ask the LLM to resolve targets of relationships (e.g.,
"connect to Subsystem 'Optics'"). That's done in a second pass after human
review — partly to keep this prompt cheap, partly because name-matching
targets across projects is a hard problem worth its own pass.
"""

from __future__ import annotations

import json
from typing import Any

GRADUATION_PROMPT_VERSION = "graduate-0.1.0"
MAX_CONTENT_CHARS = 1500

ENTITY_TYPES = {
    "project",
    "system",
    "subsystem",
    "component",
    "interface",
    "requirement",
    "constraint",
    "decision",
    "material",
    "parameter",
    "analysis_model",
    "result",
    "validation_claim",
    "vendor",
    "process",
}

SYSTEM_PROMPT = """You are a knowledge-graph curator for an engineering firm's context system (AtoCore).

Your job: given one active MEMORY (a curated fact about an engineering project), decide whether it describes a TYPED engineering entity that belongs in the structured graph. If yes, emit the entity candidate. If no, return null.

A memory gets graduated when its content names a specific thing that has lifecycle, relationships, or cross-references in engineering work. A memory stays as-is when it's a general observation, preference, or loose context.

ENTITY TYPES (choose the best fit):

- project — a named project (usually already registered; rare to emit)
- subsystem — a named chunk of a system with defined boundaries (e.g., "Primary Optics", "Cable Tensioning", "Motion Control")
- component — a discrete physical or logical part (e.g., "Primary Mirror", "Pivot Pin", "Z-axis Servo Drive")
- interface — a named boundary between two subsystems/components (e.g., "Mirror-to-Cell mounting interface")
- requirement — a "must" or "shall" statement (e.g., "Surface figure < 25nm RMS")
- constraint — a non-negotiable limit (e.g., "Thermal operating range 0-40°C")
- decision — a committed design direction (e.g., "Selected Zerodur over ULE for primary blank")
- material — a named material used in a component (e.g., "Zerodur", "Invar 36")
- parameter — a specific named value or assumption (e.g., "Ambient temperature 22°C", "Lead time 6 weeks")
- analysis_model — a named FEA / optical / thermal model (e.g., "Preston wear model v2")
- result — a named measurement or simulation output (e.g., "FEA thermal sweep 2026-03")
- validation_claim — an asserted claim to be backed by evidence (e.g., "Margin is adequate for full envelope")
- vendor — a supplier / partner entity (e.g., "Schott AG", "ABB Space", "Nabeel")
- process — a named workflow step (e.g., "Ion beam figuring pass", "Incoming inspection")
- system — whole project's system envelope (rare; usually project handles this)

WHEN TO GRADUATE:

GRADUATE if the memory clearly names one of these entities with enough detail to be useful. Examples:
- "Selected Zerodur for the p04 primary mirror blank" → 2 entities: decision(name="Select Zerodur for primary blank") + material(name="Zerodur")
- "ABB Space (INO) is the polishing vendor for p04" → vendor(name="ABB Space")
- "Surface figure target is < 25nm RMS after IBF" → requirement(name="Surface figure < 25nm RMS after IBF")
- "The Preston model assumes 5N min contact pressure" → parameter(name="Preston min contact pressure = 5N")

DON'T GRADUATE if the memory is:
- A preference or work-style note (those stay as memories)
- A session observation ("we tested X today") — no durable typed thing
- A general insight / rule of thumb ("Always calibrate before measuring")
- An OpenClaw MEMORY.md import of conversational history
- Something where you can't pick a clear entity type with confidence

OUTPUT FORMAT — exactly one JSON object:

If graduating, emit:
{
  "graduate": true,
  "entity_type": "component|requirement|decision|...",
  "name": "short noun phrase, <60 chars",
  "description": "one-sentence description that adds context beyond the name",
  "confidence": 0.0-1.0,
  "relationships": [
    {"rel_type": "part_of|satisfies|uses_material|based_on_assumption|constrained_by|affected_by_decision|supports|evidenced_by|described_by", "target_hint": "name of the target entity (human will resolve)"}
  ]
}

If not graduating, emit:
{"graduate": false, "reason": "one-sentence reason"}

Rules:
- Output ONLY the JSON object, no markdown, no prose
- name MUST be <60 chars and specific; reject vague names like "the system"
- confidence: 0.6-0.7 is typical. Raise to 0.8+ only if the memory is very specific and unambiguous.
- relationships array can be empty
- target_hint is a free-text name; the human-review stage will resolve it to an actual entity id (or reject if the target doesn't exist yet)
- If the memory describes MULTIPLE entities, pick the single most important one; a second pass can catch the others
"""


def build_user_message(memory_content: str, memory_project: str, memory_type: str) -> str:
    return (
        f"MEMORY PROJECT: {memory_project or '(unscoped)'}\n"
        f"MEMORY TYPE: {memory_type}\n\n"
        f"MEMORY CONTENT:\n{memory_content[:MAX_CONTENT_CHARS]}\n\n"
        "Return the JSON decision now."
    )


def parse_graduation_output(raw: str) -> dict[str, Any] | None:
    """Parse the LLM's graduation decision. Return None on any parse error.

    On success returns the normalized decision dict with keys:
    graduate (bool), entity_type (str), name (str), description (str),
    confidence (float), relationships (list of {rel_type, target_hint})
    OR {"graduate": false, "reason": "..."}
    """
    text = (raw or "").strip()
    if not text:
        return None
    if text.startswith("```"):
        text = text.strip("`")
        nl = text.find("\n")
        if nl >= 0:
            text = text[nl + 1:]
        if text.endswith("```"):
            text = text[:-3]
        text = text.strip()

    # Tolerate leading prose
    if not text.lstrip().startswith("{"):
        start = text.find("{")
        end = text.rfind("}")
        if start >= 0 and end > start:
            text = text[start:end + 1]

    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return None

    if not isinstance(parsed, dict):
        return None

    graduate = bool(parsed.get("graduate", False))
    if not graduate:
        return {"graduate": False, "reason": str(parsed.get("reason", ""))[:200]}

    entity_type = str(parsed.get("entity_type") or "").strip().lower()
    if entity_type not in ENTITY_TYPES:
        return None

    name = str(parsed.get("name") or "").strip()
    if not name or len(name) > 120:
        return None

    description = str(parsed.get("description") or "").strip()[:500]

    try:
        confidence = float(parsed.get("confidence", 0.6))
    except (TypeError, ValueError):
        confidence = 0.6
    confidence = max(0.0, min(1.0, confidence))

    raw_rels = parsed.get("relationships") or []
    if not isinstance(raw_rels, list):
        raw_rels = []
    relationships: list[dict] = []
    for r in raw_rels[:10]:
        if not isinstance(r, dict):
            continue
        rtype = str(r.get("rel_type") or "").strip().lower()
        target = str(r.get("target_hint") or "").strip()
        if not rtype or not target:
            continue
        relationships.append({"rel_type": rtype, "target_hint": target[:120]})

    return {
        "graduate": True,
        "entity_type": entity_type,
        "name": name,
        "description": description,
        "confidence": confidence,
        "relationships": relationships,
    }
```
**`src/atocore/engineering/conflicts.py`** — new file, 291 lines

```python
@@ -0,0 +1,291 @@
"""Phase 5G — Conflict detection on entity promote.

When a candidate entity is promoted to active, we check whether another
active entity is already claiming the "same slot" with an incompatible
value. If so, we emit a conflicts row + conflict_members rows so the
human can resolve.

Slot keys are per-entity-type (from ``conflict-model.md``). V1 starts
narrow with 3 slot kinds to avoid false positives:

1. **component.material** — a component should normally have ONE
   dominant material (via USES_MATERIAL edge). Two active USES_MATERIAL
   edges from the same component pointing at different materials =
   conflict.
2. **component.part_of** — a component should belong to AT MOST one
   subsystem (via PART_OF). Two active PART_OF edges = conflict.
3. **requirement.value** — two active Requirements with the same name in
   the same project but different descriptions = conflict.

Rule: **flag, never block**. The promote succeeds; the conflict row is
just a flag for the human. Users see conflicts in the dashboard and on
wiki entity pages with a "⚠️ Disputed" badge.
"""

from __future__ import annotations

import uuid
from datetime import datetime, timezone

from atocore.models.database import get_connection
from atocore.observability.logger import get_logger

log = get_logger("conflicts")


def detect_conflicts_for_entity(entity_id: str) -> list[str]:
    """Run conflict detection for a newly-promoted active entity.

    Returns a list of conflict_ids created. Fail-open: any detection error
    is logged and returns an empty list; the promote itself is not affected.
    """
    try:
        with get_connection() as conn:
            row = conn.execute(
                "SELECT * FROM entities WHERE id = ? AND status = 'active'",
                (entity_id,),
            ).fetchone()
        if row is None:
            return []

        created: list[str] = []
        etype = row["entity_type"]
        project = row["project"] or ""

        if etype == "component":
            created.extend(_check_component_conflicts(entity_id, project))
        elif etype == "requirement":
            created.extend(_check_requirement_conflicts(entity_id, row["name"], project))

        return created
    except Exception as e:
        log.warning("conflict_detection_failed", entity_id=entity_id, error=str(e))
        return []


def _check_component_conflicts(component_id: str, project: str) -> list[str]:
    """Check material + part_of slot uniqueness for a component."""
    created: list[str] = []
    with get_connection() as conn:
        # component.material conflicts
        mat_edges = conn.execute(
            "SELECT r.id AS rel_id, r.target_entity_id, e.name "
            "FROM relationships r "
            "JOIN entities e ON e.id = r.target_entity_id "
            "WHERE r.source_entity_id = ? AND r.relationship_type = 'uses_material' "
            "AND e.status = 'active'",
            (component_id,),
        ).fetchall()
        # component.part_of conflicts
        pof_edges = conn.execute(
            "SELECT r.id AS rel_id, r.target_entity_id, e.name "
            "FROM relationships r "
            "JOIN entities e ON e.id = r.target_entity_id "
            "WHERE r.source_entity_id = ? AND r.relationship_type = 'part_of' "
            "AND e.status = 'active'",
            (component_id,),
        ).fetchall()

    if len(mat_edges) > 1:
        cid = _record_conflict(
            slot_kind="component.material",
            slot_key=component_id,
            project=project,
            note=f"component has {len(mat_edges)} active material edges",
            members=[
                {
                    "kind": "entity",
                    "id": m["target_entity_id"],
                    "snapshot": m["name"],
                }
                for m in mat_edges
            ],
        )
        if cid:
            created.append(cid)

    if len(pof_edges) > 1:
        cid = _record_conflict(
            slot_kind="component.part_of",
            slot_key=component_id,
            project=project,
            note=f"component is part_of {len(pof_edges)} subsystems",
            members=[
                {
                    "kind": "entity",
                    "id": p["target_entity_id"],
                    "snapshot": p["name"],
                }
                for p in pof_edges
            ],
        )
        if cid:
            created.append(cid)

    return created


def _check_requirement_conflicts(requirement_id: str, name: str, project: str) -> list[str]:
    """Two active Requirements with the same name in the same project."""
    with get_connection() as conn:
        peers = conn.execute(
            "SELECT id, description FROM entities "
            "WHERE entity_type = 'requirement' AND status = 'active' "
            "AND project = ? AND LOWER(name) = LOWER(?) AND id != ?",
            (project, name, requirement_id),
        ).fetchall()
    if not peers:
        return []

    members = [{"kind": "entity", "id": requirement_id, "snapshot": name}]
    for p in peers:
        members.append({"kind": "entity", "id": p["id"],
                        "snapshot": (p["description"] or "")[:200]})

    cid = _record_conflict(
        slot_kind="requirement.name",
        slot_key=f"{project}|{name.lower()}",
        project=project,
        note=f"{len(peers)+1} active requirements share the name '{name}'",
        members=members,
    )
    return [cid] if cid else []


def _record_conflict(
    slot_kind: str,
    slot_key: str,
    project: str,
    note: str,
    members: list[dict],
) -> str | None:
    """Persist a conflict + its members; skip if an open conflict already
    exists for the same (slot_kind, slot_key)."""
    try:
        with get_connection() as conn:
            existing = conn.execute(
                "SELECT id FROM conflicts WHERE slot_kind = ? AND slot_key = ? "
                "AND status = 'open'",
                (slot_kind, slot_key),
            ).fetchone()
            if existing:
                return None  # don't dup

            conflict_id = str(uuid.uuid4())
            conn.execute(
                "INSERT INTO conflicts (id, slot_kind, slot_key, project, "
                "status, note) VALUES (?, ?, ?, ?, 'open', ?)",
                (conflict_id, slot_kind, slot_key, project, note[:500]),
            )
            for m in members:
                conn.execute(
                    "INSERT INTO conflict_members (id, conflict_id, member_kind, "
                    "member_id, value_snapshot) VALUES (?, ?, ?, ?, ?)",
                    (str(uuid.uuid4()), conflict_id,
                     m.get("kind", "entity"), m.get("id", ""),
                     (m.get("snapshot") or "")[:500]),
                )

        log.info("conflict_detected", conflict_id=conflict_id,
                 slot_kind=slot_kind, project=project)

        # Emit a warning alert so the operator sees it
        try:
            from atocore.observability.alerts import emit_alert
            emit_alert(
                severity="warning",
                title=f"Entity conflict: {slot_kind}",
                message=note,
                context={"project": project, "slot_key": slot_key,
                         "member_count": len(members)},
            )
        except Exception:
            pass

        return conflict_id
    except Exception as e:
        log.warning("conflict_record_failed", error=str(e))
        return None


def list_open_conflicts(project: str | None = None) -> list[dict]:
    """Return open conflicts with their members."""
    with get_connection() as conn:
        query = "SELECT * FROM conflicts WHERE status = 'open'"
        params: list = []
        if project:
            query += " AND project = ?"
            params.append(project)
        query += " ORDER BY detected_at DESC"
        rows = conn.execute(query, params).fetchall()

        conflicts = []
        for r in rows:
            member_rows = conn.execute(
                "SELECT * FROM conflict_members WHERE conflict_id = ?",
                (r["id"],),
            ).fetchall()
            conflicts.append({
                "id": r["id"],
                "slot_kind": r["slot_kind"],
                "slot_key": r["slot_key"],
                "project": r["project"] or "",
                "status": r["status"],
                "note": r["note"] or "",
                "detected_at": r["detected_at"],
                "members": [
                    {
                        "id": m["id"],
                        "member_kind": m["member_kind"],
                        "member_id": m["member_id"],
                        "snapshot": m["value_snapshot"] or "",
                    }
                    for m in member_rows
                ],
            })
    return conflicts


def resolve_conflict(
    conflict_id: str,
    action: str,  # "dismiss", "supersede_others", "no_action"
    winner_id: str | None = None,
    actor: str = "api",
) -> bool:
    """Resolve a conflict. Optionally marks non-winner members as superseded."""
    if action not in ("dismiss", "supersede_others", "no_action"):
        raise ValueError(f"Invalid action: {action}")

    now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")

    with get_connection() as conn:
        row = conn.execute(
            "SELECT * FROM conflicts WHERE id = ?", (conflict_id,)
        ).fetchone()
        if row is None or row["status"] != "open":
            return False

        if action == "supersede_others":
            if not winner_id:
                raise ValueError("winner_id required for supersede_others")
            # Mark non-winner member entities as superseded
            member_rows = conn.execute(
                "SELECT member_id FROM conflict_members WHERE conflict_id = ?",
                (conflict_id,),
            ).fetchall()
            for m in member_rows:
                if m["member_id"] != winner_id:
                    conn.execute(
                        "UPDATE entities SET status = 'superseded', updated_at = ? "
                        "WHERE id = ? AND status = 'active'",
                        (now, m["member_id"]),
                    )

        conn.execute(
            "UPDATE conflicts SET status = 'resolved', resolution = ?, "
            "resolved_at = ? WHERE id = ?",
            (action, now, conflict_id),
        )

    log.info("conflict_resolved", conflict_id=conflict_id,
             action=action, actor=actor)
    return True
```
```python
@@ -29,6 +29,7 @@ def generate_project_overview(project: str) -> str:
    sections = [
        _header(project),
        _synthesis_section(project),
        _gaps_section(project),  # Phase 5: killer queries surface here
        _state_section(project),
        _system_architecture(project),
        _decisions_section(project),
@@ -41,6 +42,66 @@ def generate_project_overview(project: str) -> str:
    return "\n\n".join(s for s in sections if s)


def _gaps_section(project: str) -> str:
    """Phase 5: surface the 3 killer-query gaps on every project page.

    If any gap is non-empty, it appears near the top so the director
    sees "what am I forgetting?" before the rest of the report.
    """
    try:
        from atocore.engineering.queries import all_gaps

        result = all_gaps(project)
    except Exception:
        return ""

    orphan = result["orphan_requirements"]["count"]
    risky = result["risky_decisions"]["count"]
    unsup = result["unsupported_claims"]["count"]

    if orphan == 0 and risky == 0 and unsup == 0:
        return (
            "## Coverage Gaps\n\n"
            "> ✅ No gaps detected: every requirement is satisfied, "
            "no decisions rest on flagged assumptions, every claim has evidence.\n"
        )

    lines = ["## Coverage Gaps", ""]
    lines.append(
        "> ⚠️ Items below need attention — gaps in the engineering graph.\n"
    )

    if orphan:
        lines.append(f"### {orphan} Orphan Requirement(s)")
        lines.append("*Requirements with no component claiming to satisfy them:*")
        lines.append("")
        for r in result["orphan_requirements"]["gaps"][:10]:
            lines.append(f"- **{r['name']}** — {(r['description'] or '')[:120]}")
        if orphan > 10:
            lines.append(f"- _...and {orphan - 10} more_")
        lines.append("")

    if risky:
        lines.append(f"### {risky} Risky Decision(s)")
        lines.append("*Decisions based on assumptions that are flagged, superseded, or invalid:*")
        lines.append("")
        for d in result["risky_decisions"]["gaps"][:10]:
            lines.append(
                f"- **{d['decision_name']}** — based on flagged assumption "
                f"_{d['assumption_name']}_ ({d['assumption_status']})"
            )
        lines.append("")

    if unsup:
        lines.append(f"### {unsup} Unsupported Claim(s)")
        lines.append("*Validation claims with no supporting Result entity:*")
        lines.append("")
        for c in result["unsupported_claims"]["gaps"][:10]:
            lines.append(f"- **{c['name']}** — {(c['description'] or '')[:120]}")
        lines.append("")

    return "\n".join(lines)


def _synthesis_section(project: str) -> str:
    """Generate a short LLM synthesis of the current project state.
```
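`_gaps_section` renders its counts into a small markdown report. A simplified sketch of that output shape (orphan requirements only, with a hard-coded sample dict standing in for `all_gaps()` output):

```python
def render_gaps(result: dict) -> str:
    # Simplified sketch of _gaps_section's markdown shape (orphan reqs only).
    orphan = result["orphan_requirements"]["count"]
    if orphan == 0:
        return "## Coverage Gaps\n\n> ✅ No gaps detected."
    lines = ["## Coverage Gaps", "", f"### {orphan} Orphan Requirement(s)"]
    for r in result["orphan_requirements"]["gaps"][:10]:
        # Same bullet format the real section emits.
        lines.append(f"- **{r['name']}** — {(r['description'] or '')[:120]}")
    return "\n".join(lines)

sample = {"orphan_requirements": {"count": 1, "gaps": [
    {"name": "Surface figure < 25nm RMS", "description": "Post-IBF figure requirement"},
]}}
print(render_gaps(sample))
```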
467
src/atocore/engineering/queries.py
Normal file
467
src/atocore/engineering/queries.py
Normal file
@@ -0,0 +1,467 @@
|
||||
"""Phase 5 Engineering V1 — The 10 canonical queries.
|
||||
|
||||
Each function maps to one or more catalog IDs in
|
||||
``docs/architecture/engineering-query-catalog.md``. Return values are plain
|
||||
dicts so API and wiki renderers can consume them without importing dataclasses.
|
||||
|
||||
Design principles:
|
||||
- All queries filter to status='active' unless the caller asks otherwise
|
||||
- All project filters go through ``resolve_project_name`` (canonicalization)
|
||||
- Graph traversals are bounded (depth <= 3 for impact, limit 200 for lists)
|
||||
- The 3 "killer" queries (gaps) accept project as required — gaps are always
|
||||
scoped to one project in V1
|
||||
|
||||
These queries are the *useful surface* of the entity graph. Before this module,
|
||||
the graph was data with no narrative; after this module, the director can ask
|
||||
real questions about coverage, risk, and evidence.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from datetime import datetime, timezone
|
||||
|
||||
from atocore.engineering.service import (
|
||||
Entity,
|
||||
_row_to_entity,
|
||||
get_entity,
|
||||
get_relationships,
|
||||
)
|
||||
from atocore.models.database import get_connection
|
||||
from atocore.projects.registry import resolve_project_name
|
||||
|
||||
|
||||
# ============================================================
|
||||
# Structure queries (Q-001, Q-004, Q-005, Q-008)
|
||||
# ============================================================
|
||||
|
||||
|
||||
def system_map(project: str) -> dict:
|
||||
"""Q-001 + Q-004: return the full subsystem/component tree for a project.
|
||||
|
||||
Shape:
|
||||
{
|
||||
"project": "p05-interferometer",
|
||||
"subsystems": [
|
||||
{
|
||||
"id": ..., "name": ..., "description": ...,
|
||||
"components": [{id, name, description, materials: [...]}],
|
||||
},
|
||||
...
|
||||
],
|
||||
"orphan_components": [...], # components with no PART_OF edge
|
||||
}
|
||||
"""
|
||||
project = resolve_project_name(project) if project else ""
|
||||
out: dict = {"project": project, "subsystems": [], "orphan_components": []}
|
||||
|
||||
with get_connection() as conn:
|
||||
# All subsystems in project
|
||||
subsys_rows = conn.execute(
|
||||
"SELECT * FROM entities WHERE status = 'active' "
|
||||
"AND project = ? AND entity_type = 'subsystem' "
|
||||
"ORDER BY name",
|
||||
(project,),
|
||||
).fetchall()
|
||||
|
||||
# All components in project
|
||||
comp_rows = conn.execute(
|
||||
"SELECT * FROM entities WHERE status = 'active' "
|
||||
"AND project = ? AND entity_type = 'component'",
|
||||
(project,),
|
||||
).fetchall()
|
||||
|
||||
# PART_OF edges: component → subsystem
|
||||
part_of_rows = conn.execute(
|
||||
"SELECT source_entity_id, target_entity_id FROM relationships "
|
||||
"WHERE relationship_type = 'part_of'"
|
||||
).fetchall()
|
||||
part_of_map: dict[str, str] = {
|
||||
r["source_entity_id"]: r["target_entity_id"] for r in part_of_rows
|
||||
}
|
||||
|
||||
# uses_material edges for components
|
||||
mat_rows = conn.execute(
|
||||
"SELECT r.source_entity_id, e.name FROM relationships r "
|
||||
"JOIN entities e ON e.id = r.target_entity_id "
|
||||
"WHERE r.relationship_type = 'uses_material' AND e.status = 'active'"
|
||||
).fetchall()
|
||||
materials_by_comp: dict[str, list[str]] = {}
|
||||
for r in mat_rows:
|
||||
materials_by_comp.setdefault(r["source_entity_id"], []).append(r["name"])
|
||||
|
||||
# Build: subsystems → their components
|
||||
subsys_comps: dict[str, list[dict]] = {s["id"]: [] for s in subsys_rows}
|
||||
orphans: list[dict] = []
|
||||
for c in comp_rows:
|
||||
parent = part_of_map.get(c["id"])
|
||||
comp_dict = {
|
||||
"id": c["id"],
|
||||
"name": c["name"],
|
||||
"description": c["description"] or "",
|
||||
"materials": materials_by_comp.get(c["id"], []),
|
||||
}
|
||||
if parent and parent in subsys_comps:
|
||||
subsys_comps[parent].append(comp_dict)
|
||||
else:
|
||||
orphans.append(comp_dict)
|
||||
|
||||
out["subsystems"] = [
|
||||
{
|
||||
"id": s["id"],
|
||||
"name": s["name"],
|
||||
"description": s["description"] or "",
|
||||
"components": subsys_comps.get(s["id"], []),
|
||||
}
|
||||
for s in subsys_rows
|
||||
]
|
||||
out["orphan_components"] = orphans
|
||||
return out
|
||||
|
||||
|
||||
def decisions_affecting(project: str, subsystem_id: str | None = None) -> dict:
|
||||
"""Q-008: decisions that affect a subsystem (or whole project).
|
||||
|
||||
Walks AFFECTED_BY_DECISION edges. If subsystem_id is given, returns
|
||||
decisions linked to that subsystem or any of its components. Otherwise,
|
||||
all decisions in the project.
|
||||
"""
|
||||
project = resolve_project_name(project) if project else ""
|
||||
|
||||
target_ids: set[str] = set()
|
||||
if subsystem_id:
|
||||
target_ids.add(subsystem_id)
|
||||
# Include components PART_OF the subsystem
|
||||
with get_connection() as conn:
|
||||
rows = conn.execute(
|
||||
"SELECT source_entity_id FROM relationships "
|
||||
"WHERE relationship_type = 'part_of' AND target_entity_id = ?",
|
||||
(subsystem_id,),
|
||||
).fetchall()
|
||||
for r in rows:
|
||||
target_ids.add(r["source_entity_id"])
|
||||
|
||||
with get_connection() as conn:
|
||||
if target_ids:
|
||||
placeholders = ",".join("?" * len(target_ids))
|
||||
rows = conn.execute(
|
||||
f"SELECT DISTINCT e.* FROM entities e "
|
||||
f"JOIN relationships r ON r.source_entity_id = e.id "
|
||||
f"WHERE e.status = 'active' AND e.entity_type = 'decision' "
|
||||
f"AND e.project = ? AND r.relationship_type = 'affected_by_decision' "
|
||||
f"AND r.target_entity_id IN ({placeholders}) "
|
||||
f"ORDER BY e.updated_at DESC",
|
||||
(project, *target_ids),
|
||||
).fetchall()
|
||||
else:
|
||||
rows = conn.execute(
|
||||
"SELECT * FROM entities WHERE status = 'active' "
|
||||
"AND entity_type = 'decision' AND project = ? "
|
||||
"ORDER BY updated_at DESC LIMIT 200",
|
||||
(project,),
|
||||
).fetchall()
|
||||
|
||||
decisions = [_entity_dict(_row_to_entity(r)) for r in rows]
|
||||
return {
|
||||
"project": project,
|
||||
"subsystem_id": subsystem_id or "",
|
||||
"decisions": decisions,
|
||||
"count": len(decisions),
|
||||
}
|
||||
|
||||
|
||||
def requirements_for(component_id: str) -> dict:
|
||||
"""Q-005: requirements that a component satisfies."""
|
||||
with get_connection() as conn:
|
||||
# Component → SATISFIES → Requirement
|
||||
rows = conn.execute(
|
||||
"SELECT e.* FROM entities e "
|
||||
"JOIN relationships r ON r.target_entity_id = e.id "
|
||||
"WHERE r.source_entity_id = ? AND r.relationship_type = 'satisfies' "
|
||||
"AND e.entity_type = 'requirement' AND e.status = 'active' "
|
||||
"ORDER BY e.name",
|
||||
(component_id,),
|
||||
).fetchall()
|
||||
requirements = [_entity_dict(_row_to_entity(r)) for r in rows]
|
||||
return {
|
||||
"component_id": component_id,
|
||||
"requirements": requirements,
|
||||
"count": len(requirements),
|
||||
}
|
||||
|
||||
|
||||
def recent_changes(project: str, since: str | None = None, limit: int = 50) -> dict:
|
||||
"""Q-013: what changed recently in the project (entity audit log).
|
||||
|
||||
Uses the shared memory_audit table filtered by entity_kind='entity' and
|
||||
joins back to entities for the project scope.
|
||||
"""
|
||||
project = resolve_project_name(project) if project else ""
|
||||
since = since or "2020-01-01"
|
||||
|
||||
with get_connection() as conn:
|
||||
rows = conn.execute(
|
||||
"SELECT a.id, a.memory_id AS entity_id, a.action, a.actor, "
|
||||
"a.timestamp, a.note, e.entity_type, e.name, e.project "
|
||||
"FROM memory_audit a "
|
||||
"LEFT JOIN entities e ON e.id = a.memory_id "
|
||||
"WHERE a.entity_kind = 'entity' AND a.timestamp >= ? "
|
||||
"AND (e.project = ? OR e.project IS NULL) "
|
||||
"ORDER BY a.timestamp DESC LIMIT ?",
|
||||
(since, project, limit),
|
||||
).fetchall()
|
||||
|
||||
changes = []
|
||||
for r in rows:
|
||||
changes.append({
|
||||
"audit_id": r["id"],
|
||||
"entity_id": r["entity_id"],
|
||||
"entity_type": r["entity_type"] or "?",
|
||||
"entity_name": r["name"] or "(deleted)",
|
||||
"action": r["action"],
|
||||
"actor": r["actor"] or "api",
|
||||
"note": r["note"] or "",
|
||||
"timestamp": r["timestamp"],
|
||||
})
|
||||
return {"project": project, "since": since, "changes": changes, "count": len(changes)}
|
||||
|
||||
|
||||
# ============================================================
# Killer queries (Q-006, Q-009, Q-011) — the "what am I forgetting?" queries
# ============================================================


def orphan_requirements(project: str) -> dict:
    """Q-006: requirements in project with NO inbound SATISFIES edge.

    These are "something we said must be true" with nothing actually
    satisfying them. The single highest-value query for an engineering
    director: shows what's unclaimed by design.
    """
    project = resolve_project_name(project) if project else ""

    with get_connection() as conn:
        rows = conn.execute(
            "SELECT * FROM entities WHERE status = 'active' "
            "AND project = ? AND entity_type = 'requirement' "
            "AND NOT EXISTS ("
            " SELECT 1 FROM relationships r "
            " WHERE r.relationship_type = 'satisfies' "
            " AND r.target_entity_id = entities.id"
            ") "
            "ORDER BY updated_at DESC",
            (project,),
        ).fetchall()

    orphans = [_entity_dict(_row_to_entity(r)) for r in rows]
    return {
        "project": project,
        "query": "Q-006 orphan requirements",
        "description": "Requirements with no SATISFIES relationship — nothing claims to meet them.",
        "gaps": orphans,
        "count": len(orphans),
    }

def risky_decisions(project: str) -> dict:
    """Q-009: decisions linked to assumptions flagged as unresolved.

    Walks BASED_ON_ASSUMPTION edges. An assumption is "flagged" if its
    properties.flagged=True OR status='superseded' OR status='invalid'.
    """
    project = resolve_project_name(project) if project else ""

    with get_connection() as conn:
        rows = conn.execute(
            "SELECT DISTINCT d.*, a.name AS assumption_name, a.id AS assumption_id, "
            "a.status AS assumption_status, a.properties AS assumption_props "
            "FROM entities d "
            "JOIN relationships r ON r.source_entity_id = d.id "
            "JOIN entities a ON a.id = r.target_entity_id "
            "WHERE d.status = 'active' AND d.entity_type = 'decision' "
            "AND d.project = ? "
            "AND r.relationship_type = 'based_on_assumption' "
            "AND ("
            " a.status IN ('superseded', 'invalid') OR "
            " a.properties LIKE '%\"flagged\": true%' OR "
            " a.properties LIKE '%\"flagged\":true%'"
            ") "
            "ORDER BY d.updated_at DESC",
            (project,),
        ).fetchall()

    risky = []
    for r in rows:
        risky.append({
            "decision_id": r["id"],
            "decision_name": r["name"],
            "decision_description": r["description"] or "",
            "assumption_id": r["assumption_id"],
            "assumption_name": r["assumption_name"],
            "assumption_status": r["assumption_status"],
        })
    return {
        "project": project,
        "query": "Q-009 risky decisions",
        "description": "Decisions based on assumptions that are flagged, superseded, or invalid.",
        "gaps": risky,
        "count": len(risky),
    }

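The double `LIKE` pattern above only matches two exact serializations of the flag. A minimal sketch, using an illustrative schema (not the real AtoCore tables), of how SQLite's built-in `json_extract` catches spacing variants the `LIKE` pair misses:

```python
import json
import sqlite3

# Illustrative two-column table standing in for entities.properties (JSON TEXT).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (id TEXT, properties TEXT)")
conn.executemany(
    "INSERT INTO entities VALUES (?, ?)",
    [
        ("a1", json.dumps({"flagged": True})),   # serializes as {"flagged": true}
        ("a2", '{ "flagged" : true }'),           # odd spacing defeats LIKE
        ("a3", json.dumps({"flagged": False})),
    ],
)

# The two-variant LIKE approach from risky_decisions.
like_hits = conn.execute(
    "SELECT id FROM entities WHERE properties LIKE '%\"flagged\": true%' "
    "OR properties LIKE '%\"flagged\":true%'"
).fetchall()

# json_extract returns 1 for JSON true regardless of whitespace.
json_hits = conn.execute(
    "SELECT id FROM entities WHERE json_extract(properties, '$.flagged') = 1"
).fetchall()

print([r[0] for r in like_hits])  # → ['a1']
print([r[0] for r in json_hits])  # → ['a1', 'a2']
```

Whether to switch depends on the minimum SQLite version the project targets; JSON1 ships with all recent builds.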
def unsupported_claims(project: str) -> dict:
    """Q-011: validation claims with NO inbound SUPPORTS edge.

    These are asserted claims (e.g., "margin is adequate") with no
    Result entity actually supporting them. High-risk: the engineer
    believes it, but there's no evidence on file.
    """
    project = resolve_project_name(project) if project else ""

    with get_connection() as conn:
        rows = conn.execute(
            "SELECT * FROM entities WHERE status = 'active' "
            "AND project = ? AND entity_type = 'validation_claim' "
            "AND NOT EXISTS ("
            " SELECT 1 FROM relationships r "
            " WHERE r.relationship_type = 'supports' "
            " AND r.target_entity_id = entities.id"
            ") "
            "ORDER BY updated_at DESC",
            (project,),
        ).fetchall()

    claims = [_entity_dict(_row_to_entity(r)) for r in rows]
    return {
        "project": project,
        "query": "Q-011 unsupported claims",
        "description": "Validation claims with no supporting Result — asserted but not evidenced.",
        "gaps": claims,
        "count": len(claims),
    }

def all_gaps(project: str) -> dict:
    """Combined: run Q-006, Q-009, Q-011 for a project in one go."""
    return {
        "project": resolve_project_name(project) if project else "",
        "generated_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "orphan_requirements": orphan_requirements(project),
        "risky_decisions": risky_decisions(project),
        "unsupported_claims": unsupported_claims(project),
    }

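Q-006 and Q-011 share one shape: rows in a table with no inbound edge of a given type. A self-contained sketch of that `NOT EXISTS` anti-join on a toy in-memory schema (illustrative, not the real AtoCore tables):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entities (id TEXT, entity_type TEXT);
    CREATE TABLE relationships (relationship_type TEXT, target_entity_id TEXT);
    INSERT INTO entities VALUES ('r1', 'requirement'), ('r2', 'requirement');
    INSERT INTO relationships VALUES ('satisfies', 'r1');  -- r2 is unclaimed
""")

# Requirements with no inbound SATISFIES edge: the Q-006 "orphan" shape.
orphans = conn.execute(
    "SELECT id FROM entities "
    "WHERE entity_type = 'requirement' "
    "AND NOT EXISTS ("
    "  SELECT 1 FROM relationships r "
    "  WHERE r.relationship_type = 'satisfies' "
    "  AND r.target_entity_id = entities.id"
    ")"
).fetchall()
print([r[0] for r in orphans])  # → ['r2']
```

Swapping `'satisfies'` for `'supports'` and the entity type for `'validation_claim'` gives Q-011.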
# ============================================================
# History + impact (Q-016, Q-017)
# ============================================================


def impact_analysis(entity_id: str, max_depth: int = 3) -> dict:
    """Q-016: transitive outbound reach of an entity.

    Walks outbound edges breadth-first to max_depth. Answers "what would
    be affected if I changed component X?" by finding everything downstream.
    """
    visited: set[str] = {entity_id}
    impacted: list[dict] = []
    frontier = [(entity_id, 0)]

    while frontier:
        current_id, depth = frontier.pop(0)
        if depth >= max_depth:
            continue
        with get_connection() as conn:
            rows = conn.execute(
                "SELECT r.relationship_type, r.target_entity_id, "
                "e.entity_type, e.name, e.status "
                "FROM relationships r "
                "JOIN entities e ON e.id = r.target_entity_id "
                "WHERE r.source_entity_id = ? AND e.status = 'active'",
                (current_id,),
            ).fetchall()
        for r in rows:
            tid = r["target_entity_id"]
            if tid in visited:
                continue
            visited.add(tid)
            impacted.append({
                "entity_id": tid,
                "entity_type": r["entity_type"],
                "name": r["name"],
                "relationship": r["relationship_type"],
                "depth": depth + 1,
            })
            frontier.append((tid, depth + 1))

    root = get_entity(entity_id)
    return {
        "root": _entity_dict(root) if root else None,
        "impacted_count": len(impacted),
        "impacted": impacted,
        "max_depth": max_depth,
    }

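Stripped of the database plumbing, the traversal above is a plain breadth-first walk with a visited set and a depth cap. A standalone sketch, with an adjacency dict standing in for the relationships table:

```python
# Toy outbound-edge adjacency: X feeds A and B, A feeds C, C feeds D.
edges = {"X": ["A", "B"], "A": ["C"], "C": ["D"]}

def reach(root: str, max_depth: int) -> list[tuple[str, int]]:
    """Everything downstream of root within max_depth hops, with its depth."""
    visited = {root}
    out: list[tuple[str, int]] = []
    frontier = [(root, 0)]
    while frontier:
        node, depth = frontier.pop(0)
        if depth >= max_depth:
            continue  # do not expand nodes already at the cap
        for nxt in edges.get(node, []):
            if nxt in visited:
                continue
            visited.add(nxt)
            out.append((nxt, depth + 1))
            frontier.append((nxt, depth + 1))
    return out

print(reach("X", 2))  # → [('A', 1), ('B', 1), ('C', 2)]  (D is 3 hops away)
```

Note the cap test runs when a node is popped, so entities at exactly `max_depth` are reported but not expanded, matching `impact_analysis`.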
def evidence_chain(entity_id: str) -> dict:
    """Q-017: what evidence supports this entity?

    Walks inbound provenance edges (SUPPORTS, EVIDENCED_BY, DESCRIBED_BY,
    VALIDATED_BY, ANALYZED_BY) to surface the provenance chain: "this
    claim is supported by that result, which was produced by that
    analysis model, which was described by that doc."
    """
    provenance_edges = ("supports", "evidenced_by", "described_by",
                        "validated_by", "analyzed_by")
    placeholders = ",".join("?" * len(provenance_edges))

    with get_connection() as conn:
        # Inbound edges of the provenance family
        inbound_rows = conn.execute(
            "SELECT r.relationship_type, r.source_entity_id, "
            "e.entity_type, e.name, e.description, e.status "
            "FROM relationships r "
            "JOIN entities e ON e.id = r.source_entity_id "
            "WHERE r.target_entity_id = ? AND e.status = 'active' "
            f"AND r.relationship_type IN ({placeholders})",
            (entity_id, *provenance_edges),
        ).fetchall()

    # Also look at source_refs on the entity itself
    root = get_entity(entity_id)

    chain = []
    for r in inbound_rows:
        chain.append({
            "via": r["relationship_type"],
            "source_id": r["source_entity_id"],
            "source_type": r["entity_type"],
            "source_name": r["name"],
            "source_description": (r["description"] or "")[:200],
        })

    return {
        "root": _entity_dict(root) if root else None,
        "direct_source_refs": root.source_refs if root else [],
        "evidence_chain": chain,
        "count": len(chain),
    }

# ============================================================
# Helpers
# ============================================================


def _entity_dict(e: Entity) -> dict:
    """Flatten an Entity to a public-API dict."""
    return {
        "id": e.id,
        "entity_type": e.entity_type,
        "name": e.name,
        "project": e.project,
        "description": e.description,
        "properties": e.properties,
        "status": e.status,
        "confidence": e.confidence,
        "source_refs": e.source_refs,
        "updated_at": e.updated_at,
    }
@@ -9,6 +9,7 @@ from datetime import datetime, timezone

from atocore.models.database import get_connection
from atocore.observability.logger import get_logger
from atocore.projects.registry import resolve_project_name

log = get_logger("engineering")

@@ -28,21 +29,36 @@ ENTITY_TYPES = [
    "validation_claim",
    "vendor",
    "process",
    # Issue F (visual evidence): images, PDFs, CAD exports attached to
    # other entities via EVIDENCED_BY. properties carries kind +
    # asset_id + caption + capture_context.
    "artifact",
]

RELATIONSHIP_TYPES = [
    # Structural family
    "contains",
    "part_of",
    "interfaces_with",
    # Intent family
    "satisfies",
    "constrained_by",
    "affected_by_decision",
    "based_on_assumption",  # Phase 5 — Q-009 killer query
    "supersedes",
    # Validation family
    "analyzed_by",
    "validated_by",
    "supports",  # Phase 5 — Q-011 killer query
    "conflicts_with",  # Phase 5 — Q-012 future
    "depends_on",
    # Provenance family
    "described_by",
    "updated_by_session",  # Phase 5 — session→entity provenance
    "evidenced_by",  # Phase 5 — Q-017 evidence trace
    "summarized_in",  # Phase 5 — mirror caches
    # Domain-specific (pre-existing, retained)
    "uses_material",
]

ENTITY_STATUSES = ["candidate", "active", "superseded", "invalid"]
@@ -132,6 +148,7 @@ def create_entity(
    status: str = "active",
    confidence: float = 1.0,
    source_refs: list[str] | None = None,
    actor: str = "api",
) -> Entity:
    if entity_type not in ENTITY_TYPES:
        raise ValueError(f"Invalid entity type: {entity_type}. Must be one of {ENTITY_TYPES}")
@@ -140,6 +157,11 @@ def create_entity(
    if not name or not name.strip():
        raise ValueError("Entity name must be non-empty")

    # Phase 5: enforce project canonicalization contract at the write seam.
    # Aliases like "p04" become "p04-gigabit" so downstream reads stay
    # consistent with the registry.
    project = resolve_project_name(project) if project else ""

    entity_id = str(uuid.uuid4())
    now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    props = properties or {}
@@ -159,6 +181,22 @@
    )

    log.info("entity_created", entity_id=entity_id, entity_type=entity_type, name=name)

    # Phase 5: entity audit rows share the memory_audit table via
    # entity_kind="entity" discriminator. Same infrastructure, unified history.
    _audit_entity(
        entity_id=entity_id,
        action="created",
        actor=actor,
        after={
            "entity_type": entity_type,
            "name": name.strip(),
            "project": project,
            "status": status,
            "confidence": confidence,
        },
    )

    return Entity(
        id=entity_id, entity_type=entity_type, name=name.strip(),
        project=project, description=description, properties=props,
@@ -167,6 +205,35 @@
    )

def _audit_entity(
    entity_id: str,
    action: str,
    actor: str = "api",
    before: dict | None = None,
    after: dict | None = None,
    note: str = "",
) -> None:
    """Append an entity mutation row to the shared memory_audit table."""
    try:
        with get_connection() as conn:
            conn.execute(
                "INSERT INTO memory_audit (id, memory_id, action, actor, "
                "before_json, after_json, note, entity_kind) "
                "VALUES (?, ?, ?, ?, ?, ?, ?, 'entity')",
                (
                    str(uuid.uuid4()),
                    entity_id,
                    action,
                    actor or "api",
                    json.dumps(before or {}),
                    json.dumps(after or {}),
                    (note or "")[:500],
                ),
            )
    except Exception as e:
        log.warning("entity_audit_failed", entity_id=entity_id, action=action, error=str(e))

def create_relationship(
    source_entity_id: str,
    target_entity_id: str,
@@ -198,6 +265,17 @@
        target=target_entity_id,
        rel_type=relationship_type,
    )
    # Phase 5: relationship audit as an entity action on the source
    _audit_entity(
        entity_id=source_entity_id,
        action="relationship_added",
        actor="api",
        after={
            "rel_id": rel_id,
            "rel_type": relationship_type,
            "target": target_entity_id,
        },
    )
    return Relationship(
        id=rel_id, source_entity_id=source_entity_id,
        target_entity_id=target_entity_id,
@@ -206,13 +284,413 @@
    )

# --- Phase 5: Entity promote/reject lifecycle ---


def _set_entity_status(
    entity_id: str,
    new_status: str,
    actor: str = "api",
    note: str = "",
) -> bool:
    """Transition an entity's status with audit."""
    if new_status not in ENTITY_STATUSES:
        raise ValueError(f"Invalid status: {new_status}")

    with get_connection() as conn:
        row = conn.execute(
            "SELECT status FROM entities WHERE id = ?", (entity_id,)
        ).fetchone()
        if row is None:
            return False
        old_status = row["status"]
        if old_status == new_status:
            return False
        now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
        conn.execute(
            "UPDATE entities SET status = ?, updated_at = ? WHERE id = ?",
            (new_status, now, entity_id),
        )

    # Action verb mirrors memory pattern
    if new_status == "active" and old_status == "candidate":
        action = "promoted"
    elif new_status == "invalid" and old_status == "candidate":
        action = "rejected"
    elif new_status == "invalid":
        action = "invalidated"
    elif new_status == "superseded":
        action = "superseded"
    else:
        action = "status_changed"

    _audit_entity(
        entity_id=entity_id,
        action=action,
        actor=actor,
        before={"status": old_status},
        after={"status": new_status},
        note=note,
    )
    log.info("entity_status_changed", entity_id=entity_id,
             old=old_status, new=new_status, action=action)
    return True

def promote_entity(
    entity_id: str,
    actor: str = "api",
    note: str = "",
    target_project: str | None = None,
) -> bool:
    """Promote a candidate entity to active.

    When ``target_project`` is provided (Issue C), also retarget the
    entity's project before flipping the status. Use this to graduate an
    inbox/global lead into a real project (e.g. when a vendor quote
    becomes a contract). ``target_project`` is canonicalized through the
    registry; reserved ids (``inbox``) and ``""`` are accepted verbatim.

    Phase 5F graduation hook: if this entity has source_refs pointing at
    memories (format "memory:<uuid>"), mark those source memories as
    ``status=graduated`` and set their ``graduated_to_entity_id`` forward
    pointer. This preserves the memory as an immutable historical record
    while signalling that it's been absorbed into the typed graph.
    """
    entity = get_entity(entity_id)
    if entity is None or entity.status != "candidate":
        return False

    if target_project is not None:
        new_project = (
            resolve_project_name(target_project) if target_project else ""
        )
        if new_project != entity.project:
            now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
            with get_connection() as conn:
                conn.execute(
                    "UPDATE entities SET project = ?, updated_at = ? "
                    "WHERE id = ?",
                    (new_project, now, entity_id),
                )
            _audit_entity(
                entity_id=entity_id,
                action="retargeted",
                actor=actor,
                before={"project": entity.project},
                after={"project": new_project},
                note=note,
            )

    ok = _set_entity_status(entity_id, "active", actor=actor, note=note)
    if not ok:
        return False

    # Phase 5F: mark source memories as graduated
    memory_ids = [
        ref.split(":", 1)[1]
        for ref in (entity.source_refs or [])
        if isinstance(ref, str) and ref.startswith("memory:")
    ]
    if memory_ids:
        _graduate_source_memories(memory_ids, entity_id, actor=actor)

    # Phase 5G: sync conflict detection on promote. Fail-open — detection
    # errors log but never undo the successful promote.
    try:
        from atocore.engineering.conflicts import detect_conflicts_for_entity
        detect_conflicts_for_entity(entity_id)
    except Exception as e:
        log.warning("conflict_detection_failed", entity_id=entity_id, error=str(e))

    return True

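The graduation hook hinges on one comprehension: pull memory ids out of a mixed `source_refs` list while ignoring anything that is not a `"memory:<id>"` string. An isolated sketch with hypothetical refs:

```python
# Hypothetical mixed refs: memory pointers, a doc ref, and a stray non-string.
source_refs = ["memory:123e4567", "doc:spec.pdf", "memory:deadbeef", 42]

# Same filter promote_entity uses: split on the first ':' only, so ids
# containing ':' would survive intact; non-strings are skipped safely.
memory_ids = [
    ref.split(":", 1)[1]
    for ref in source_refs
    if isinstance(ref, str) and ref.startswith("memory:")
]
print(memory_ids)  # → ['123e4567', 'deadbeef']
```

The `isinstance` guard matters because `source_refs` round-trips through JSON and is not otherwise validated.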
def _graduate_source_memories(memory_ids: list[str], entity_id: str, actor: str) -> None:
    """Mark source memories as graduated and set forward pointer."""
    if not memory_ids:
        return
    now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    with get_connection() as conn:
        for mid in memory_ids:
            try:
                row = conn.execute(
                    "SELECT status FROM memories WHERE id = ?", (mid,)
                ).fetchone()
                if row is None:
                    continue
                old_status = row["status"]
                if old_status == "graduated":
                    continue  # already graduated — maybe by a different entity
                conn.execute(
                    "UPDATE memories SET status = 'graduated', "
                    "graduated_to_entity_id = ?, updated_at = ? WHERE id = ?",
                    (entity_id, now, mid),
                )
                # Write a memory_audit row for the graduation
                conn.execute(
                    "INSERT INTO memory_audit (id, memory_id, action, actor, "
                    "before_json, after_json, note, entity_kind) "
                    "VALUES (?, ?, 'graduated', ?, ?, ?, ?, 'memory')",
                    (
                        str(uuid.uuid4()),
                        mid,
                        actor or "api",
                        json.dumps({"status": old_status}),
                        json.dumps({
                            "status": "graduated",
                            "graduated_to_entity_id": entity_id,
                        }),
                        f"graduated to entity {entity_id[:8]}",
                    ),
                )
                log.info("memory_graduated", memory_id=mid,
                         entity_id=entity_id, old_status=old_status)
            except Exception as e:
                log.warning("memory_graduation_failed",
                            memory_id=mid, entity_id=entity_id, error=str(e))

def reject_entity_candidate(entity_id: str, actor: str = "api", note: str = "") -> bool:
    """Reject a candidate entity (status → invalid)."""
    with get_connection() as conn:
        row = conn.execute(
            "SELECT status FROM entities WHERE id = ?", (entity_id,)
        ).fetchone()
    if row is None or row["status"] != "candidate":
        return False
    return _set_entity_status(entity_id, "invalid", actor=actor, note=note)

def supersede_entity(
    entity_id: str,
    actor: str = "api",
    note: str = "",
    superseded_by: str | None = None,
) -> bool:
    """Mark an active entity as superseded.

    When ``superseded_by`` names a real entity, also create a
    ``supersedes`` relationship from the new entity to the old one
    (semantics: ``new SUPERSEDES old``). This keeps the graph
    navigable without the caller remembering to make that edge.
    """
    if superseded_by:
        new_entity = get_entity(superseded_by)
        if new_entity is None:
            raise ValueError(
                f"superseded_by entity not found: {superseded_by}"
            )
        if new_entity.id == entity_id:
            raise ValueError("entity cannot supersede itself")

    ok = _set_entity_status(entity_id, "superseded", actor=actor, note=note)
    if not ok:
        return False

    if superseded_by:
        try:
            create_relationship(
                source_entity_id=superseded_by,
                target_entity_id=entity_id,
                relationship_type="supersedes",
                source_refs=[f"supersede-api:{actor}"],
            )
        except Exception as e:
            log.warning(
                "supersede_relationship_create_failed",
                entity_id=entity_id,
                superseded_by=superseded_by,
                error=str(e),
            )
    return True

def invalidate_active_entity(
    entity_id: str,
    actor: str = "api",
    reason: str = "",
) -> tuple[bool, str]:
    """Mark an active entity as invalid (Issue E — retraction path).

    Returns (success, status_code) where status_code is one of:
    - "invalidated" — happy path
    - "not_found" — no such entity
    - "already_invalid" — already invalid (idempotent)
    - "not_active" — entity is candidate/superseded; use the
      appropriate other endpoint

    This is the public retraction API distinct from
    ``reject_entity_candidate`` (which only handles candidate→invalid).
    """
    entity = get_entity(entity_id)
    if entity is None:
        return False, "not_found"
    if entity.status == "invalid":
        return True, "already_invalid"
    if entity.status != "active":
        return False, "not_active"
    ok = _set_entity_status(entity_id, "invalid", actor=actor, note=reason)
    return ok, ("invalidated" if ok else "not_active")

def update_entity(
    entity_id: str,
    *,
    description: str | None = None,
    properties_patch: dict | None = None,
    confidence: float | None = None,
    append_source_refs: list[str] | None = None,
    actor: str = "api",
    note: str = "",
) -> Entity | None:
    """Update mutable fields on an existing entity (Issue E follow-up).

    Field rules (kept narrow on purpose):

    - ``description``: replaces the current value when provided.
    - ``properties_patch``: merged into the existing ``properties`` dict,
      shallow. Pass ``None`` as a value to delete a key; pass a new
      value to overwrite it.
    - ``confidence``: replaces when provided. Must be in [0, 1].
    - ``append_source_refs``: appended verbatim to the existing list
      (duplicates are filtered out, order preserved).

    What you cannot change via this path:

    - ``entity_type`` — requires supersede+create (a new type is a new
      thing).
    - ``project`` — use ``promote_entity`` with ``target_project`` for
      inbox→project graduation, or supersede+create for anything else.
    - ``name`` — renames are destructive to cross-references;
      supersede+create.
    - ``status`` — use the dedicated promote/reject/invalidate/supersede
      endpoints.

    Returns the updated entity, or None if no such entity exists.
    """
    entity = get_entity(entity_id)
    if entity is None:
        return None
    if confidence is not None and not (0.0 <= confidence <= 1.0):
        raise ValueError("confidence must be in [0, 1]")

    before = {
        "description": entity.description,
        "properties": dict(entity.properties or {}),
        "confidence": entity.confidence,
        "source_refs": list(entity.source_refs or []),
    }

    new_description = entity.description if description is None else description
    new_confidence = entity.confidence if confidence is None else confidence
    new_properties = dict(entity.properties or {})
    if properties_patch:
        for key, value in properties_patch.items():
            if value is None:
                new_properties.pop(key, None)
            else:
                new_properties[key] = value
    new_refs = list(entity.source_refs or [])
    if append_source_refs:
        existing = set(new_refs)
        for ref in append_source_refs:
            if ref and ref not in existing:
                new_refs.append(ref)
                existing.add(ref)

    now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    with get_connection() as conn:
        conn.execute(
            """UPDATE entities
               SET description = ?, properties = ?, confidence = ?,
                   source_refs = ?, updated_at = ?
               WHERE id = ?""",
            (
                new_description,
                json.dumps(new_properties),
                new_confidence,
                json.dumps(new_refs),
                now,
                entity_id,
            ),
        )

    after = {
        "description": new_description,
        "properties": new_properties,
        "confidence": new_confidence,
        "source_refs": new_refs,
    }
    _audit_entity(
        entity_id=entity_id,
        action="updated",
        actor=actor,
        before=before,
        after=after,
        note=note,
    )
    log.info("entity_updated", entity_id=entity_id, actor=actor)
    return get_entity(entity_id)

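The `properties_patch` rules (None deletes a key, any other value overwrites, untouched keys survive) can be exercised in isolation. A minimal sketch of that shallow merge, with hypothetical property names:

```python
def apply_patch(properties: dict, patch: dict) -> dict:
    """Shallow merge mirroring update_entity's properties_patch semantics."""
    merged = dict(properties)  # never mutate the caller's dict
    for key, value in patch.items():
        if value is None:
            merged.pop(key, None)  # None is the delete sentinel
        else:
            merged[key] = value    # anything else overwrites
    return merged

result = apply_patch(
    {"mass_kg": 1.2, "vendor": "Acme"},           # existing properties
    {"vendor": None, "material": "Zerodur"},       # delete one, add one
)
print(result)  # → {'mass_kg': 1.2, 'material': 'Zerodur'}
```

One consequence of using `None` as the delete sentinel: a property can never legitimately hold the JSON value `null` through this path, which is worth documenting at the API layer.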
def get_entity_audit(entity_id: str, limit: int = 100) -> list[dict]:
    """Fetch audit entries for an entity from the shared audit table."""
    with get_connection() as conn:
        rows = conn.execute(
            "SELECT id, memory_id AS entity_id, action, actor, before_json, "
            "after_json, note, timestamp FROM memory_audit "
            "WHERE entity_kind = 'entity' AND memory_id = ? "
            "ORDER BY timestamp DESC LIMIT ?",
            (entity_id, limit),
        ).fetchall()
    out = []
    for r in rows:
        try:
            before = json.loads(r["before_json"] or "{}")
        except Exception:
            before = {}
        try:
            after = json.loads(r["after_json"] or "{}")
        except Exception:
            after = {}
        out.append({
            "id": r["id"],
            "entity_id": r["entity_id"],
            "action": r["action"],
            "actor": r["actor"] or "api",
            "before": before,
            "after": after,
            "note": r["note"] or "",
            "timestamp": r["timestamp"],
        })
    return out

def get_entities(
    entity_type: str | None = None,
    project: str | None = None,
    status: str = "active",
    name_contains: str | None = None,
    limit: int = 100,
    scope_only: bool = False,
) -> list[Entity]:
    """List entities with optional filters.

    Project scoping rules (Issue C — inbox + cross-project):

    - ``project=None``: no project filter, return everything matching status.
    - ``project=""``: return only cross-project (global) entities.
    - ``project="inbox"``: return only inbox entities.
    - ``project="<real>"`` and ``scope_only=False`` (default): return entities
      scoped to that project PLUS cross-project (``project=""``) entities.
    - ``project="<real>"`` and ``scope_only=True``: return only that project,
      without the cross-project bleed.
    """
    from atocore.projects.registry import (
        INBOX_PROJECT, GLOBAL_PROJECT, is_reserved_project,
    )

    query = "SELECT * FROM entities WHERE status = ?"
    params: list = [status]

@@ -220,8 +698,14 @@ def get_entities(
        query += " AND entity_type = ?"
        params.append(entity_type)
    if project is not None:
        p = (project or "").strip()
        if p == GLOBAL_PROJECT or is_reserved_project(p) or scope_only:
            query += " AND project = ?"
            params.append(p)
        else:
            # Real project — include cross-project entities by default.
            query += " AND (project = ? OR project = ?)"
            params.extend([p, GLOBAL_PROJECT])
    if name_contains:
        query += " AND name LIKE ?"
        params.append(f"%{name_contains}%")

747 src/atocore/engineering/triage_ui.py (new file)
@@ -0,0 +1,747 @@
"""Human triage UI for AtoCore candidate memories.
|
||||
|
||||
Renders a lightweight HTML page at /admin/triage with all pending
|
||||
candidate memories, each with inline Promote / Reject / Edit buttons.
|
||||
No framework, no JS build, no database — reads candidates from the
|
||||
AtoCore DB and posts back to the existing REST endpoints.
|
||||
|
||||
Design principle: the user should be able to triage 20 candidates in
|
||||
60 seconds from any browser. Keyboard shortcuts (y/n/e/s) make it
|
||||
feel like email triage (archive/delete).
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import html as _html
|
||||
|
||||
from atocore.engineering.wiki import render_html
|
||||
from atocore.memory.service import get_memories
|
||||
|
||||
|
||||
VALID_TYPES = ["identity", "preference", "project", "episodic", "knowledge", "adaptation"]
|
||||
|
||||
|
||||
def _escape(s: str | None) -> str:
|
||||
return _html.escape(s or "", quote=True)
|
||||
|
||||
|
||||
def _render_candidate_card(cand) -> str:
    """One candidate row with inline forms for promote/reject/edit."""
    mid = _escape(cand.id)
    content = _escape(cand.content)
    memory_type = _escape(cand.memory_type)
    project = _escape(cand.project or "")
    project_display = project or "(global)"
    confidence = f"{cand.confidence:.2f}"
    refs = cand.reference_count or 0
    created = _escape(str(cand.created_at or ""))
    tags = cand.domain_tags or []
    tags_str = _escape(", ".join(tags))
    valid_until = _escape(cand.valid_until or "")
    # Strip time portion for HTML date input
    valid_until_date = valid_until[:10] if valid_until else ""

    type_options = "".join(
        f'<option value="{t}"{" selected" if t == cand.memory_type else ""}>{t}</option>'
        for t in VALID_TYPES
    )

    # Tag badges rendered from current tags
    badges_html = ""
    if tags:
        badges_html = '<div class="cand-tags-display">' + "".join(
            f'<span class="tag-badge">{_escape(t)}</span>' for t in tags
        ) + '</div>'

    return f"""
    <div class="cand" id="cand-{mid}" data-id="{mid}">
      <div class="cand-head">
        <span class="cand-type">[{memory_type}]</span>
        <span class="cand-project">{project_display}</span>
        <span class="cand-meta">conf {confidence} · refs {refs} · {created[:16]}</span>
      </div>
      <div class="cand-body">
        <textarea class="cand-content" id="content-{mid}">{content}</textarea>
      </div>
      {badges_html}
      <div class="cand-meta-row">
        <label class="cand-field-label">Tags:
          <input type="text" class="cand-tags-input" id="tags-{mid}"
                 value="{tags_str}" placeholder="optics, thermal, p04" />
        </label>
        <label class="cand-field-label">Valid until:
          <input type="date" class="cand-valid-until" id="valid-until-{mid}"
                 value="{valid_until_date}" />
        </label>
      </div>
      <div class="cand-actions">
        <button class="btn-promote" data-id="{mid}" title="Promote (Y)">✅ Promote</button>
        <button class="btn-reject" data-id="{mid}" title="Reject (N)">❌ Reject</button>
        <button class="btn-save-promote" data-id="{mid}" title="Save edits + promote (E)">✏️ Save&amp;Promote</button>
        <label class="cand-type-label">Type:
          <select class="cand-type-select" id="type-{mid}">{type_options}</select>
        </label>
      </div>
      <div class="cand-status" id="status-{mid}"></div>
    </div>
    """

_TRIAGE_SCRIPT = """
|
||||
<script>
|
||||
async function apiCall(url, method, body) {
|
||||
try {
|
||||
const opts = { method };
|
||||
if (body) {
|
||||
opts.headers = { 'Content-Type': 'application/json' };
|
||||
opts.body = JSON.stringify(body);
|
||||
}
|
||||
const res = await fetch(url, opts);
|
||||
return { ok: res.ok, status: res.status, json: res.ok ? await res.json().catch(()=>null) : null };
|
||||
} catch (e) { return { ok: false, status: 0, error: String(e) }; }
|
||||
}
|
||||
|
||||
async function requestAutoTriage() {
|
||||
const btn = document.getElementById('auto-triage-btn');
|
||||
const status = document.getElementById('auto-triage-status');
|
||||
if (!btn) return;
|
||||
btn.disabled = true;
|
||||
btn.textContent = '⏳ Requesting...';
|
||||
const r = await apiCall('/admin/triage/request-drain', 'POST');
|
||||
if (r.ok) {
|
||||
status.textContent = '✓ Requested. Host watcher runs every 2 min. Refresh this page in a minute to check progress.';
|
||||
status.className = 'auto-triage-msg ok';
|
||||
btn.textContent = '✓ Requested';
|
||||
pollDrainStatus();
|
||||
} else {
|
||||
status.textContent = '❌ Request failed: ' + r.status;
|
||||
status.className = 'auto-triage-msg err';
|
||||
btn.disabled = false;
|
||||
btn.textContent = '🤖 Auto-process queue';
|
||||
}
|
||||
}
|
||||
|
||||
async function pollDrainStatus() {
|
||||
const status = document.getElementById('auto-triage-status');
|
||||
const btn = document.getElementById('auto-triage-btn');
|
||||
let polls = 0;
|
||||
const timer = setInterval(async () => {
|
||||
polls++;
|
||||
const r = await apiCall('/admin/triage/drain-status', 'GET');
|
||||
if (!r.ok || !r.json) return;
|
||||
const s = r.json;
|
||||
if (s.is_running) {
|
||||
status.textContent = '⚙️ Auto-triage running on host... (started ' + (s.last_started_at || '?') + ')';
|
||||
status.className = 'auto-triage-msg ok';
|
||||
} else if (s.last_finished_at && !s.requested_at) {
|
||||
status.textContent = '✅ Last run finished: ' + s.last_finished_at + ' → ' + (s.last_result || 'complete');
|
||||
status.className = 'auto-triage-msg ok';
|
||||
if (btn) { btn.disabled = false; btn.textContent = '🤖 Auto-process queue'; }
|
||||
clearInterval(timer);
|
||||
// Reload page to pick up new queue state
|
||||
setTimeout(() => window.location.reload(), 3000);
|
||||
}
|
||||
if (polls > 60) { clearInterval(timer); } // stop after ~10 min of polling
|
||||
}, 10000); // poll every 10s
|
||||
}
|
||||
|
||||
function setStatus(id, msg, ok) {
|
||||
const el = document.getElementById('status-' + id);
|
||||
if (!el) return;
|
||||
el.textContent = msg;
|
||||
el.className = 'cand-status ' + (ok ? 'ok' : 'err');
|
||||
}
|
||||
|
||||
function removeCard(id) {
|
||||
setTimeout(() => {
|
||||
const card = document.getElementById('cand-' + id);
|
||||
if (card) {
|
||||
card.style.opacity = '0';
|
||||
setTimeout(() => card.remove(), 300);
|
||||
}
|
||||
updateCount();
|
||||
}, 400);
|
||||
}
|
||||
|
||||
function updateCount() {
|
||||
const n = document.querySelectorAll('.cand').length;
|
||||
const el = document.getElementById('cand-count');
|
||||
if (el) el.textContent = n;
|
||||
const next = document.querySelector('.cand');
|
||||
if (next) next.scrollIntoView({ behavior: 'smooth', block: 'start' });
|
||||
}
|
||||
|
||||
async function promote(id) {
|
||||
setStatus(id, 'Promoting…', true);
|
||||
const r = await apiCall('/memory/' + encodeURIComponent(id) + '/promote', 'POST');
|
||||
if (r.ok) { setStatus(id, '✅ Promoted', true); removeCard(id); }
|
||||
else setStatus(id, '❌ Failed: ' + r.status, false);
|
||||
}
|
||||
|
||||
async function reject(id) {
|
||||
setStatus(id, 'Rejecting…', true);
|
||||
const r = await apiCall('/memory/' + encodeURIComponent(id) + '/reject', 'POST');
|
||||
if (r.ok) { setStatus(id, '❌ Rejected', true); removeCard(id); }
|
||||
else setStatus(id, '❌ Failed: ' + r.status, false);
|
||||
}
|
||||
|
||||
function parseTags(str) {
|
||||
return (str || '').split(/[,;]/).map(s => s.trim().toLowerCase()).filter(Boolean);
|
||||
}
|
||||
|
||||
async function savePromote(id) {
|
||||
const content = document.getElementById('content-' + id).value.trim();
|
||||
const mtype = document.getElementById('type-' + id).value;
|
||||
const tagsStr = document.getElementById('tags-' + id)?.value || '';
|
||||
const validUntil = document.getElementById('valid-until-' + id)?.value || '';
|
||||
if (!content) { setStatus(id, 'Content is empty', false); return; }
|
||||
setStatus(id, 'Saving…', true);
|
||||
const body = {
|
||||
content: content,
|
||||
memory_type: mtype,
|
||||
domain_tags: parseTags(tagsStr),
|
||||
valid_until: validUntil,
|
||||
};
|
||||
const r1 = await apiCall('/memory/' + encodeURIComponent(id), 'PUT', body);
|
||||
if (!r1.ok) { setStatus(id, '❌ Save failed: ' + r1.status, false); return; }
|
||||
const r2 = await apiCall('/memory/' + encodeURIComponent(id) + '/promote', 'POST');
|
||||
if (r2.ok) { setStatus(id, '✅ Saved & Promoted', true); removeCard(id); }
|
||||
else setStatus(id, '❌ Promote failed: ' + r2.status, false);
|
||||
}
|
||||
|
||||
// Also save tag/expiry edits when plain "Promote" is clicked if fields changed
|
||||
async function promoteWithMeta(id) {
|
||||
const tagsStr = document.getElementById('tags-' + id)?.value || '';
|
||||
const validUntil = document.getElementById('valid-until-' + id)?.value || '';
|
||||
if (tagsStr.trim() || validUntil) {
|
||||
await apiCall('/memory/' + encodeURIComponent(id), 'PUT', {
|
||||
domain_tags: parseTags(tagsStr),
|
||||
valid_until: validUntil,
|
||||
});
|
||||
}
|
||||
return promote(id);
|
||||
}
|
||||
|
||||
document.addEventListener('click', (e) => {
|
||||
const id = e.target.dataset?.id;
|
||||
if (!id) return;
|
||||
if (e.target.classList.contains('btn-promote')) promoteWithMeta(id);
|
||||
else if (e.target.classList.contains('btn-reject')) reject(id);
|
||||
else if (e.target.classList.contains('btn-save-promote')) savePromote(id);
|
||||
});
|
||||
|
||||
// Keyboard shortcuts on the currently-focused card
|
||||
document.addEventListener('keydown', (e) => {
|
||||
// Don't intercept if user is typing in textarea/select/input
|
||||
const t = e.target.tagName;
|
||||
if (t === 'TEXTAREA' || t === 'INPUT' || t === 'SELECT') return;
|
||||
const first = document.querySelector('.cand');
|
||||
if (!first) return;
|
||||
const id = first.dataset.id;
|
||||
if (e.key === 'y' || e.key === 'Y') { e.preventDefault(); promoteWithMeta(id); }
|
||||
else if (e.key === 'n' || e.key === 'N') { e.preventDefault(); reject(id); }
|
||||
else if (e.key === 'e' || e.key === 'E') {
|
||||
e.preventDefault();
|
||||
document.getElementById('content-' + id)?.focus();
|
||||
}
|
||||
else if (e.key === 's' || e.key === 'S') { e.preventDefault(); first.scrollIntoView({behavior:'smooth'}); }
|
||||
});
|
||||
</script>
|
||||
"""
|
||||
|
||||
|
||||
_TRIAGE_CSS = """
|
||||
<style>
|
||||
.triage-header { display:flex; justify-content:space-between; align-items:baseline; margin-bottom:1rem; }
|
||||
.triage-header .count { font-size:1.4rem; font-weight:600; color:var(--accent); }
|
||||
.triage-help { background:var(--card); border-left:4px solid var(--accent); padding:0.8rem 1rem; margin-bottom:1.5rem; border-radius:4px; font-size:0.9rem; }
|
||||
.triage-help kbd { background:var(--hover); padding:2px 6px; border-radius:3px; font-family:monospace; font-size:0.85em; border:1px solid var(--border); }
|
||||
.cand { background:var(--card); border:1px solid var(--border); border-radius:6px; padding:1rem; margin-bottom:1rem; transition:opacity 0.3s; }
|
||||
.cand-head { display:flex; gap:0.8rem; align-items:center; margin-bottom:0.6rem; font-size:0.9rem; }
|
||||
.cand-type { font-weight:600; color:var(--accent); font-family:monospace; }
|
||||
.cand-project { color:var(--text); opacity:0.8; font-family:monospace; }
|
||||
.cand-meta { color:var(--text); opacity:0.55; font-size:0.8rem; margin-left:auto; }
|
||||
.cand-content { width:100%; min-height:80px; font-family:inherit; font-size:0.95rem; padding:0.5rem; background:var(--bg); color:var(--text); border:1px solid var(--border); border-radius:4px; resize:vertical; box-sizing:border-box; }
|
||||
.cand-content:focus { outline:none; border-color:var(--accent); }
|
||||
.cand-actions { display:flex; gap:0.5rem; margin-top:0.8rem; align-items:center; flex-wrap:wrap; }
|
||||
.cand-actions button { padding:0.4rem 0.9rem; border:1px solid var(--border); background:var(--card); color:var(--text); border-radius:4px; cursor:pointer; font-size:0.88rem; }
|
||||
.cand-actions button:hover { background:var(--hover); }
|
||||
.btn-promote:hover { background:#059669; color:white; border-color:#059669; }
|
||||
.btn-reject:hover { background:#dc2626; color:white; border-color:#dc2626; }
|
||||
.btn-save-promote:hover { background:var(--accent); color:white; border-color:var(--accent); }
|
||||
.cand-type-label { font-size:0.85rem; margin-left:auto; opacity:0.7; }
|
||||
.cand-type-select { padding:0.25rem; background:var(--bg); color:var(--text); border:1px solid var(--border); border-radius:3px; font-family:monospace; }
|
||||
.cand-status { margin-top:0.5rem; font-size:0.85rem; min-height:1.2em; }
|
||||
.cand-status.ok { color:#059669; }
|
||||
.cand-status.err { color:#dc2626; }
|
||||
.empty { text-align:center; padding:3rem; opacity:0.6; }
|
||||
.auto-triage-bar { display:flex; gap:0.8rem; align-items:center; background:var(--card); border:1px solid var(--border); border-radius:6px; padding:0.7rem 1rem; margin-bottom:1.2rem; flex-wrap:wrap; }
|
||||
.auto-triage-bar button { padding:0.55rem 1.1rem; border:1px solid var(--accent); background:var(--accent); color:white; border-radius:4px; cursor:pointer; font-weight:600; font-size:0.95rem; }
|
||||
.auto-triage-bar button:hover:not(:disabled) { opacity:0.9; }
|
||||
.auto-triage-bar button:disabled { opacity:0.5; cursor:not-allowed; }
|
||||
.auto-triage-msg { flex:1; min-width:200px; font-size:0.85rem; opacity:0.75; }
|
||||
.auto-triage-msg.ok { color:var(--accent); opacity:1; font-weight:500; }
|
||||
.auto-triage-msg.err { color:#dc2626; opacity:1; font-weight:500; }
|
||||
.cand-tags-display { margin-top:0.5rem; display:flex; gap:0.35rem; flex-wrap:wrap; }
|
||||
.tag-badge { background:var(--accent); color:white; padding:0.15rem 0.55rem; border-radius:10px; font-size:0.72rem; font-family:monospace; font-weight:500; }
|
||||
.cand-meta-row { display:flex; gap:0.8rem; margin-top:0.6rem; align-items:center; flex-wrap:wrap; }
|
||||
.cand-field-label { display:flex; gap:0.3rem; align-items:center; font-size:0.85rem; opacity:0.75; }
|
||||
.cand-tags-input { flex:1; min-width:200px; padding:0.3rem 0.5rem; background:var(--bg); color:var(--text); border:1px solid var(--border); border-radius:3px; font-family:monospace; font-size:0.85rem; }
|
||||
.cand-tags-input:focus { outline:none; border-color:var(--accent); }
|
||||
.cand-valid-until { padding:0.3rem; background:var(--bg); color:var(--text); border:1px solid var(--border); border-radius:3px; font-family:inherit; font-size:0.85rem; }
|
||||
</style>
|
||||
"""
|
||||
|
||||
|
||||
def _render_entity_card(entity) -> str:
    """Phase 5: entity candidate card with promote/reject."""
    eid = _escape(entity.id)
    name = _escape(entity.name)
    etype = _escape(entity.entity_type)
    project = _escape(entity.project or "(global)")
    desc = _escape(entity.description or "")
    conf = f"{entity.confidence:.2f}"
    src_refs = entity.source_refs or []
    source_display = _escape(", ".join(src_refs[:3])) if src_refs else "(no provenance)"

    return f"""
    <div class="cand cand-entity" id="ecand-{eid}" data-entity-id="{eid}">
        <div class="cand-head">
            <span class="cand-type entity-type">[entity · {etype}]</span>
            <span class="cand-project">{project}</span>
            <span class="cand-meta">conf {conf} · src: {source_display}</span>
        </div>
        <div class="cand-body">
            <div class="entity-name">{name}</div>
            <div class="entity-desc">{desc}</div>
        </div>
        <div class="cand-actions">
            <button class="btn-entity-promote" data-entity-id="{eid}" title="Promote entity (Y)">✅ Promote Entity</button>
            <button class="btn-entity-reject" data-entity-id="{eid}" title="Reject entity (N)">❌ Reject</button>
            <a class="btn-link" href="/wiki/entities/{eid}">View in wiki →</a>
        </div>
        <div class="cand-status" id="estatus-{eid}"></div>
    </div>
    """

_ENTITY_TRIAGE_SCRIPT = """
|
||||
<script>
|
||||
async function entityPromote(id) {
|
||||
const st = document.getElementById('estatus-' + id);
|
||||
st.textContent = 'Promoting…';
|
||||
st.className = 'cand-status ok';
|
||||
const r = await fetch('/entities/' + encodeURIComponent(id) + '/promote', {method:'POST'});
|
||||
if (r.ok) {
|
||||
st.textContent = '✅ Entity promoted';
|
||||
setTimeout(() => {
|
||||
const card = document.getElementById('ecand-' + id);
|
||||
if (card) { card.style.opacity = '0'; setTimeout(() => card.remove(), 300); }
|
||||
}, 400);
|
||||
} else st.textContent = '❌ ' + r.status;
|
||||
}
|
||||
async function entityReject(id) {
|
||||
const st = document.getElementById('estatus-' + id);
|
||||
st.textContent = 'Rejecting…';
|
||||
st.className = 'cand-status ok';
|
||||
const r = await fetch('/entities/' + encodeURIComponent(id) + '/reject', {method:'POST'});
|
||||
if (r.ok) {
|
||||
st.textContent = '❌ Entity rejected';
|
||||
setTimeout(() => {
|
||||
const card = document.getElementById('ecand-' + id);
|
||||
if (card) { card.style.opacity = '0'; setTimeout(() => card.remove(), 300); }
|
||||
}, 400);
|
||||
} else st.textContent = '❌ ' + r.status;
|
||||
}
|
||||
document.addEventListener('click', (e) => {
|
||||
const eid = e.target.dataset?.entityId;
|
||||
if (!eid) return;
|
||||
if (e.target.classList.contains('btn-entity-promote')) entityPromote(eid);
|
||||
else if (e.target.classList.contains('btn-entity-reject')) entityReject(eid);
|
||||
});
|
||||
</script>
|
||||
"""
|
||||
|
||||
_ENTITY_TRIAGE_CSS = """
|
||||
<style>
|
||||
.cand-entity { border-left: 3px solid #059669; }
|
||||
.entity-type { background: #059669; color: white; padding: 0.1rem 0.5rem; border-radius: 3px; font-size: 0.75rem; }
|
||||
.entity-name { font-size: 1.15rem; font-weight: 600; margin-bottom: 0.3rem; }
|
||||
.entity-desc { opacity: 0.85; font-size: 0.95rem; }
|
||||
.btn-entity-promote { background: #059669; color: white; border-color: #059669; }
|
||||
.btn-entity-reject:hover { background: #dc2626; color: white; border-color: #dc2626; }
|
||||
.btn-link { padding: 0.4rem 0.9rem; text-decoration: none; color: var(--accent); border: 1px solid var(--border); border-radius: 4px; font-size: 0.88rem; }
|
||||
.btn-link:hover { background: var(--hover); }
|
||||
.section-break { border-top: 2px solid var(--border); margin: 2rem 0 1rem 0; padding-top: 1rem; }
|
||||
</style>
|
||||
"""
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------
# Phase 7A — Merge candidates (semantic dedup)
# ---------------------------------------------------------------------

_MERGE_TRIAGE_CSS = """
<style>
.cand-merge { border-left: 3px solid #8b5cf6; }
.merge-type { background: #8b5cf6; color: white; padding: 0.1rem 0.5rem; border-radius: 3px; font-size: 0.75rem; }
.merge-sources { margin: 0.5rem 0 0.8rem 0; display: flex; flex-direction: column; gap: 0.35rem; }
.merge-source { background: var(--bg); border: 1px dashed var(--border); border-radius: 4px; padding: 0.4rem 0.6rem; font-size: 0.85rem; }
.merge-source-meta { font-family: monospace; font-size: 0.72rem; opacity: 0.7; margin-bottom: 0.2rem; }
.merge-arrow { text-align: center; font-size: 1.1rem; opacity: 0.5; margin: 0.3rem 0; }
.merge-proposed { background: var(--card); border: 1px solid #8b5cf6; border-radius: 4px; padding: 0.5rem; }
.btn-merge-approve { background: #8b5cf6; color: white; border-color: #8b5cf6; }
.btn-merge-approve:hover { background: #7c3aed; }
</style>
"""

def _render_merge_card(cand: dict) -> str:
    cid = _escape(cand.get("id", ""))
    sim = cand.get("similarity") or 0.0
    sources = cand.get("sources") or []
    proposed_content = cand.get("proposed_content") or ""
    proposed_tags = cand.get("proposed_tags") or []
    proposed_project = cand.get("proposed_project") or ""
    reason = cand.get("reason") or ""

    src_html = "".join(
        f"""
        <div class="merge-source">
            <div class="merge-source-meta">
                {_escape(s.get('id', '')[:8])} · [{_escape(s.get('memory_type', ''))}]
                · {_escape(s.get('project', '') or '(global)')}
                · conf {float(s.get('confidence', 0)):.2f}
                · refs {int(s.get('reference_count', 0))}
            </div>
            <div>{_escape((s.get('content') or '')[:300])}</div>
        </div>
        """
        for s in sources
    )
    tags_str = ", ".join(proposed_tags)
    return f"""
    <div class="cand cand-merge" id="mcand-{cid}" data-merge-id="{cid}">
        <div class="cand-head">
            <span class="cand-type merge-type">[merge · {len(sources)} sources]</span>
            <span class="cand-project">{_escape(proposed_project or '(global)')}</span>
            <span class="cand-meta">sim ≥ {sim:.2f}</span>
        </div>
        <div class="merge-sources">{src_html}</div>
        <div class="merge-arrow">↓ merged into ↓</div>
        <div class="merge-proposed">
            <textarea class="cand-content" id="mcontent-{cid}">{_escape(proposed_content)}</textarea>
            <div class="cand-meta-row">
                <label class="cand-field-label">Tags:
                    <input type="text" class="cand-tags-input" id="mtags-{cid}" value="{_escape(tags_str)}" placeholder="tag1, tag2">
                </label>
            </div>
            {f'<div class="auto-triage-msg" style="margin-top:0.4rem;">💡 {_escape(reason)}</div>' if reason else ''}
        </div>
        <div class="cand-actions">
            <button class="btn-merge-approve" data-merge-id="{cid}" title="Approve merge">✅ Approve Merge</button>
            <button class="btn-reject" data-merge-id="{cid}" data-merge-reject="1" title="Keep separate">❌ Keep Separate</button>
        </div>
        <div class="cand-status" id="mstatus-{cid}"></div>
    </div>
    """

_MERGE_TRIAGE_SCRIPT = """
|
||||
<script>
|
||||
async function mergeApprove(id) {
|
||||
const st = document.getElementById('mstatus-' + id);
|
||||
st.textContent = 'Merging…';
|
||||
st.className = 'cand-status ok';
|
||||
const content = document.getElementById('mcontent-' + id).value;
|
||||
const tagsRaw = document.getElementById('mtags-' + id).value;
|
||||
const tags = tagsRaw.split(',').map(t => t.trim()).filter(Boolean);
|
||||
const r = await fetch('/admin/memory/merge-candidates/' + encodeURIComponent(id) + '/approve', {
|
||||
method: 'POST',
|
||||
headers: {'Content-Type': 'application/json'},
|
||||
body: JSON.stringify({actor: 'human-triage', content: content, domain_tags: tags}),
|
||||
});
|
||||
if (r.ok) {
|
||||
const data = await r.json();
|
||||
st.textContent = '✅ Merged → ' + (data.result_memory_id || '').slice(0, 8);
|
||||
setTimeout(() => {
|
||||
const card = document.getElementById('mcand-' + id);
|
||||
if (card) { card.style.opacity = '0'; setTimeout(() => card.remove(), 300); }
|
||||
}, 600);
|
||||
} else {
|
||||
const err = await r.text();
|
||||
st.textContent = '❌ ' + r.status + ': ' + err.slice(0, 120);
|
||||
st.className = 'cand-status err';
|
||||
}
|
||||
}
|
||||
|
||||
async function mergeReject(id) {
|
||||
const st = document.getElementById('mstatus-' + id);
|
||||
st.textContent = 'Rejecting…';
|
||||
st.className = 'cand-status ok';
|
||||
const r = await fetch('/admin/memory/merge-candidates/' + encodeURIComponent(id) + '/reject', {
|
||||
method: 'POST',
|
||||
headers: {'Content-Type': 'application/json'},
|
||||
body: JSON.stringify({actor: 'human-triage'}),
|
||||
});
|
||||
if (r.ok) {
|
||||
st.textContent = '❌ Kept separate';
|
||||
setTimeout(() => {
|
||||
const card = document.getElementById('mcand-' + id);
|
||||
if (card) { card.style.opacity = '0'; setTimeout(() => card.remove(), 300); }
|
||||
}, 400);
|
||||
} else st.textContent = '❌ ' + r.status;
|
||||
}
|
||||
|
||||
document.addEventListener('click', (e) => {
|
||||
const mid = e.target.dataset?.mergeId;
|
||||
if (!mid) return;
|
||||
if (e.target.classList.contains('btn-merge-approve')) mergeApprove(mid);
|
||||
else if (e.target.dataset?.mergeReject) mergeReject(mid);
|
||||
});
|
||||
|
||||
async function requestDedupScan() {
|
||||
const btn = document.getElementById('dedup-btn');
|
||||
const status = document.getElementById('dedup-status');
|
||||
btn.disabled = true;
|
||||
btn.textContent = 'Queuing…';
|
||||
status.textContent = '';
|
||||
status.className = 'auto-triage-msg';
|
||||
const threshold = parseFloat(document.getElementById('dedup-threshold').value || '0.88');
|
||||
const r = await fetch('/admin/memory/dedup-scan', {
|
||||
method: 'POST',
|
||||
headers: {'Content-Type': 'application/json'},
|
||||
body: JSON.stringify({project: '', similarity_threshold: threshold, max_batch: 50}),
|
||||
});
|
||||
if (r.ok) {
|
||||
status.textContent = `✓ Queued dedup scan at threshold ${threshold}. Host watcher runs every 2 min; refresh in ~3 min to see merge candidates.`;
|
||||
status.className = 'auto-triage-msg ok';
|
||||
} else {
|
||||
status.textContent = '✗ ' + r.status;
|
||||
status.className = 'auto-triage-msg err';
|
||||
}
|
||||
setTimeout(() => {
|
||||
btn.disabled = false;
|
||||
btn.textContent = '🔗 Scan for duplicates';
|
||||
}, 2000);
|
||||
}
|
||||
</script>
|
||||
"""
|
||||
|
||||
|
||||
def _render_dedup_bar() -> str:
    return """
    <div class="auto-triage-bar">
        <button id="dedup-btn" onclick="requestDedupScan()" title="Run semantic dedup scan on Dalidou host">
            🔗 Scan for duplicates
        </button>
        <label class="cand-field-label" style="margin:0 0.5rem;">
            Threshold:
            <input id="dedup-threshold" type="number" min="0.70" max="0.99" step="0.01" value="0.88"
                   style="width:70px; padding:0.25rem; background:var(--bg); color:var(--text); border:1px solid var(--border); border-radius:3px;">
        </label>
        <span id="dedup-status" class="auto-triage-msg">
            Finds semantically near-duplicate active memories and proposes LLM-drafted merges for review. Source memories become <code>superseded</code> on approve; nothing is deleted.
        </span>
    </div>
    """

def _render_graduation_bar() -> str:
    """The 'Graduate memories → entity candidates' control bar."""
    from atocore.projects.registry import load_project_registry

    try:
        projects = load_project_registry()
        options = '<option value="">(all projects)</option>' + "".join(
            f'<option value="{_escape(p.project_id)}">{_escape(p.project_id)}</option>'
            for p in projects
        )
    except Exception:
        options = '<option value="">(all projects)</option>'

    return f"""
    <div class="auto-triage-bar graduation-bar">
        <button id="grad-btn" onclick="requestGraduation()" title="Run memory→entity graduation on Dalidou host">
            🎓 Graduate memories
        </button>
        <label class="cand-field-label">Project:
            <select id="grad-project" class="cand-type-select">{options}</select>
        </label>
        <label class="cand-field-label">Limit:
            <input id="grad-limit" type="number" class="cand-tags-input" style="max-width:80px"
                   value="30" min="1" max="200" />
        </label>
        <span id="grad-status" class="auto-triage-msg">
            Scans active memories, asks the LLM "does this describe a typed entity?",
            and creates entity candidates. Review them in the Entity section below.
        </span>
    </div>
    """

_GRADUATION_SCRIPT = """
|
||||
<script>
|
||||
async function requestGraduation() {
|
||||
const btn = document.getElementById('grad-btn');
|
||||
const status = document.getElementById('grad-status');
|
||||
const project = document.getElementById('grad-project').value;
|
||||
const limit = parseInt(document.getElementById('grad-limit').value || '30', 10);
|
||||
btn.disabled = true;
|
||||
btn.textContent = '⏳ Requesting...';
|
||||
const r = await fetch('/admin/graduation/request', {
|
||||
method: 'POST',
|
||||
headers: {'Content-Type': 'application/json'},
|
||||
body: JSON.stringify({project, limit}),
|
||||
});
|
||||
if (r.ok) {
|
||||
const scope = project || 'all projects';
|
||||
status.textContent = `✓ Queued graduation for ${scope} (limit ${limit}). Host watcher runs every 2 min; refresh this page in ~3 min to see candidates.`;
|
||||
status.className = 'auto-triage-msg ok';
|
||||
btn.textContent = '✓ Requested';
|
||||
pollGraduationStatus();
|
||||
} else {
|
||||
status.textContent = '❌ Request failed: ' + r.status;
|
||||
status.className = 'auto-triage-msg err';
|
||||
btn.disabled = false;
|
||||
btn.textContent = '🎓 Graduate memories';
|
||||
}
|
||||
}
|
||||
|
||||
async function pollGraduationStatus() {
|
||||
const status = document.getElementById('grad-status');
|
||||
const btn = document.getElementById('grad-btn');
|
||||
let polls = 0;
|
||||
const timer = setInterval(async () => {
|
||||
polls++;
|
||||
const r = await fetch('/admin/graduation/status');
|
||||
if (!r.ok) return;
|
||||
const s = await r.json();
|
||||
if (s.is_running) {
|
||||
status.textContent = '⚙️ Graduation running... (started ' + (s.last_started_at || '?') + ')';
|
||||
status.className = 'auto-triage-msg ok';
|
||||
} else if (s.last_finished_at && !s.requested) {
|
||||
status.textContent = '✅ Finished: ' + s.last_finished_at + ' → ' + (s.last_result || 'complete');
|
||||
status.className = 'auto-triage-msg ok';
|
||||
if (btn) { btn.disabled = false; btn.textContent = '🎓 Graduate memories'; }
|
||||
clearInterval(timer);
|
||||
setTimeout(() => window.location.reload(), 3000);
|
||||
}
|
||||
if (polls > 120) { clearInterval(timer); } // ~20 min cap
|
||||
}, 10000);
|
||||
}
|
||||
</script>
|
||||
"""
|
||||
|
||||
|
||||
def render_triage_page(limit: int = 100) -> str:
    """Render the full triage page with pending memory + entity candidates."""
    from atocore.engineering.service import get_entities

    try:
        mem_candidates = get_memories(status="candidate", limit=limit)
    except Exception as e:
        body = f"<p>Error loading memory candidates: {_escape(str(e))}</p>"
        return render_html("Triage — AtoCore", body, breadcrumbs=[("Wiki", "/wiki"), ("Triage", "")])

    try:
        entity_candidates = get_entities(status="candidate", limit=limit)
    except Exception:
        entity_candidates = []

    try:
        from atocore.memory.service import get_merge_candidates
        merge_candidates = get_merge_candidates(status="pending", limit=limit)
    except Exception:
        merge_candidates = []

    total = len(mem_candidates) + len(entity_candidates) + len(merge_candidates)
    graduation_bar = _render_graduation_bar()
    dedup_bar = _render_dedup_bar()

    if total == 0:
        body = _TRIAGE_CSS + _ENTITY_TRIAGE_CSS + _MERGE_TRIAGE_CSS + f"""
        <div class="triage-header">
            <h1>Triage Queue</h1>
        </div>
        {graduation_bar}
        {dedup_bar}
        <div class="empty">
            <p>🎉 No candidates to review.</p>
            <p>The auto-triage pipeline keeps this queue empty unless something needs your judgment.</p>
            <p>Use 🎓 Graduate memories to propose entity candidates, or 🔗 Scan for duplicates to find near-duplicate memories to merge.</p>
        </div>
        """ + _GRADUATION_SCRIPT + _MERGE_TRIAGE_SCRIPT
        return render_html("Triage — AtoCore", body, breadcrumbs=[("Wiki", "/wiki"), ("Triage", "")])

    # Memory cards
    mem_cards = "".join(_render_candidate_card(c) for c in mem_candidates)

    # Merge cards (Phase 7A)
    merge_cards_html = ""
    if merge_candidates:
        merge_cards = "".join(_render_merge_card(c) for c in merge_candidates)
        merge_cards_html = f"""
        <div class="section-break">
            <h2>🔗 Merge Candidates ({len(merge_candidates)})</h2>
            <p class="auto-triage-msg">
                Semantically near-duplicate active memories. Approving merges the sources
                into the proposed unified memory; sources become <code>superseded</code>
                (not deleted — still queryable). You can edit the draft content and tags
                before approving.
            </p>
        </div>
        {merge_cards}
        """

    # Entity cards
    ent_cards_html = ""
    if entity_candidates:
        ent_cards = "".join(_render_entity_card(e) for e in entity_candidates)
        ent_cards_html = f"""
        <div class="section-break">
            <h2>🔧 Entity Candidates ({len(entity_candidates)})</h2>
            <p class="auto-triage-msg">
                Typed graph entries awaiting review. Promoting an entity connects it to
                the engineering knowledge graph (subsystems, requirements, decisions, etc.).
            </p>
        </div>
        {ent_cards}
        """

    body = _TRIAGE_CSS + _ENTITY_TRIAGE_CSS + _MERGE_TRIAGE_CSS + f"""
    <div class="triage-header">
        <h1>Triage Queue</h1>
        <span class="count">
            <span id="cand-count">{len(mem_candidates)}</span> memory ·
            {len(merge_candidates)} merge ·
            {len(entity_candidates)} entity
        </span>
    </div>
    <div class="triage-help">
        Review candidates the auto-triage wasn't sure about. Edit the content
        if needed, then promote or reject. Shortcuts: <kbd>Y</kbd> promote · <kbd>N</kbd>
        reject · <kbd>E</kbd> edit · <kbd>S</kbd> scroll to next.
    </div>
    <div class="auto-triage-bar">
        <button id="auto-triage-btn" onclick="requestAutoTriage()" title="Run auto_triage on Dalidou host">
            🤖 Auto-process queue
        </button>
        <span id="auto-triage-status" class="auto-triage-msg">
            Sends the full memory queue through 3-tier LLM triage on the host.
            Sonnet → Opus → auto-discard. Only genuinely ambiguous items land here.
        </span>
    </div>
    {graduation_bar}
    {dedup_bar}
    <h2>📝 Memory Candidates ({len(mem_candidates)})</h2>
    {mem_cards}
    {merge_cards_html}
    {ent_cards_html}
    """ + _TRIAGE_SCRIPT + _ENTITY_TRIAGE_SCRIPT + _GRADUATION_SCRIPT + _MERGE_TRIAGE_SCRIPT

    return render_html(
        "Triage — AtoCore",
        body,
        breadcrumbs=[("Wiki", "/wiki"), ("Triage", "")],
    )
File diff suppressed because it is too large

@@ -2,7 +2,8 @@

 from contextlib import asynccontextmanager

-from fastapi import FastAPI
+from fastapi import APIRouter, FastAPI
+from fastapi.routing import APIRoute

 from atocore import __version__
 from atocore.api.routes import router
@@ -53,6 +54,79 @@ app = FastAPI(
|
||||
app.include_router(router)
|
||||
|
||||
|
||||
# Public API v1 — stable contract for external clients (AKC, OpenClaw, etc.).
|
||||
# Paths listed here are re-mounted under /v1 as aliases of the existing
|
||||
# unversioned handlers. Unversioned paths continue to work; new endpoints
|
||||
# land at the latest version; breaking schema changes bump the prefix.
|
||||
_V1_PUBLIC_PATHS = {
|
||||
"/entities",
|
||||
"/entities/{entity_id}",
|
||||
"/entities/{entity_id}/promote",
|
||||
"/entities/{entity_id}/reject",
|
||||
"/entities/{entity_id}/invalidate",
|
||||
"/entities/{entity_id}/supersede",
|
||||
"/entities/{entity_id}/audit",
|
||||
"/relationships",
|
||||
"/ingest",
|
||||
"/ingest/sources",
|
||||
"/context/build",
|
||||
"/query",
|
||||
"/projects",
|
||||
"/projects/{project_name}",
|
||||
"/projects/{project_name}/refresh",
|
||||
"/projects/{project_name}/mirror",
|
||||
"/projects/{project_name}/mirror.html",
|
||||
"/memory",
|
||||
"/memory/{memory_id}",
|
||||
"/memory/{memory_id}/audit",
|
||||
"/memory/{memory_id}/promote",
|
||||
"/memory/{memory_id}/reject",
|
||||
"/memory/{memory_id}/invalidate",
|
||||
"/memory/{memory_id}/supersede",
|
||||
"/project/state",
|
||||
"/project/state/{project_name}",
|
||||
"/interactions",
|
||||
"/interactions/{interaction_id}",
|
||||
"/interactions/{interaction_id}/reinforce",
|
||||
"/interactions/{interaction_id}/extract",
|
||||
"/health",
|
||||
"/sources",
|
||||
"/stats",
|
||||
# Issue F: asset store + evidence query
|
||||
"/assets",
|
||||
"/assets/{asset_id}",
|
||||
"/assets/{asset_id}/thumbnail",
|
||||
"/assets/{asset_id}/meta",
|
||||
"/entities/{entity_id}/evidence",
|
||||
# Issue D: engineering query surface (decisions, systems, components,
|
||||
# gaps, evidence, impact, changes)
|
||||
"/engineering/projects/{project_name}/systems",
|
||||
"/engineering/decisions",
|
||||
"/engineering/components/{component_id}/requirements",
|
||||
"/engineering/changes",
|
||||
"/engineering/gaps",
|
||||
"/engineering/gaps/orphan-requirements",
|
||||
"/engineering/gaps/risky-decisions",
|
||||
"/engineering/gaps/unsupported-claims",
|
||||
"/engineering/impact",
|
||||
"/engineering/evidence",
|
||||
}
|
||||
|
||||
_v1_router = APIRouter(prefix="/v1", tags=["v1"])
|
||||
for _route in list(router.routes):
|
||||
if isinstance(_route, APIRoute) and _route.path in _V1_PUBLIC_PATHS:
|
||||
_v1_router.add_api_route(
|
||||
_route.path,
|
||||
_route.endpoint,
|
||||
methods=list(_route.methods),
|
||||
response_model=_route.response_model,
|
||||
response_class=_route.response_class,
|
||||
name=f"v1_{_route.name}",
|
||||
include_in_schema=True,
|
||||
)
|
||||
app.include_router(_v1_router)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
import uvicorn
|
||||
|
||||
|
||||
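The re-mounting loop above can be illustrated without FastAPI. The sketch below is a hypothetical minimal router (the `Router` class and `mount_versioned` helper are illustrative, not part of AtoCore): each public handler object gets registered a second time under the version prefix, so `/health` and `/v1/health` dispatch to the same callable.

```python
# Hypothetical minimal router: maps (method, path) -> handler.
class Router:
    def __init__(self) -> None:
        self.routes: dict[tuple[str, str], object] = {}

    def add(self, method: str, path: str, handler) -> None:
        self.routes[(method, path)] = handler


def mount_versioned(router: Router, public_paths: set[str], prefix: str = "/v1") -> None:
    # Re-register each public handler under the prefix; the same handler
    # object serves both the unversioned and the versioned path.
    for (method, path), handler in list(router.routes.items()):
        if path in public_paths:
            router.add(method, prefix + path, handler)


r = Router()
r.add("GET", "/health", lambda: {"ok": True})
r.add("GET", "/internal/debug", lambda: {})
mount_versioned(r, {"/health"})
print(sorted(p for _, p in r.routes))  # → ['/health', '/internal/debug', '/v1/health']
```

Non-public paths (here `/internal/debug`) stay unversioned, mirroring the allowlist approach of `_V1_PUBLIC_PATHS`.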
200  src/atocore/memory/_dedup_prompt.py  Normal file
@@ -0,0 +1,200 @@
"""Shared LLM prompt + parser for memory dedup (Phase 7A).

Stdlib-only — must be importable from both the in-container service
layer (when a user clicks "scan for duplicates" in the UI) and the
host-side batch script (``scripts/memory_dedup.py``), which runs on
Dalidou where the container's Python deps are not available.

The prompt instructs the model to draft a UNIFIED memory that
preserves every specific detail from the sources. We never want a
merge to lose information — if two memories disagree on a number, the
merged content should surface both with context.
"""

from __future__ import annotations

import json
from typing import Any

DEDUP_PROMPT_VERSION = "dedup-0.1.0"
MAX_CONTENT_CHARS = 1000
MAX_SOURCES = 8  # cluster size cap — bigger clusters are suspicious

SYSTEM_PROMPT = """You consolidate near-duplicate memories for AtoCore, a personal context engine.

Given 2-8 memories that a semantic-similarity scan flagged as likely duplicates, draft a UNIFIED replacement that preserves every specific detail from every source.

CORE PRINCIPLE: information never gets lost. If the sources disagree on a number, date, vendor, or spec, surface BOTH with attribution (e.g., "quoted at $3.2k on 2026-03-01, revised to $3.8k on 2026-04-10"). If one source is more specific than another, keep the specificity. If they say the same thing differently, pick the clearer wording.

YOU MUST:
- Produce content under 500 characters that reads as a single coherent statement
- Keep all project/vendor/person/part names that appear in any source
- Keep all numbers, dates, and identifiers
- Keep the strongest claim wording ("ratified", "decided", "committed") if any source has it
- Propose domain_tags as a UNION of the sources' tags (lowercase, deduped, cap 6)
- Return valid_until = latest non-null valid_until across sources, or null if any source has null (permanent beats transient)

REFUSE TO MERGE (return action="reject") if:
- The memories are actually about DIFFERENT subjects that just share vocabulary (e.g., two different mirror components within the same project — same bucket, same wording, different subjects)
- One memory CONTRADICTS another and you cannot reconcile them — flag for contradiction review instead
- The sources span different time snapshots of a changing state that should stay as a timeline, not be collapsed

OUTPUT — raw JSON, no prose, no markdown fences:
{
  "action": "merge" | "reject",
  "content": "the unified memory content",
  "memory_type": "knowledge|project|preference|adaptation|episodic|identity",
  "project": "project-slug or empty",
  "domain_tags": ["tag1", "tag2"],
  "confidence": 0.5,
  "reason": "one sentence explaining the merge (or the rejection)"
}

On action=reject, still fill content with a short explanation and set confidence=0."""


TIER2_SYSTEM_PROMPT = """You are the second-opinion reviewer for AtoCore's memory-consolidation pipeline.

A tier-1 model (cheaper, faster) already drafted a unified memory from N near-duplicate source memories. Your job is to either CONFIRM the merge (refining the content if you see a clearer phrasing) or OVERRIDE with action="reject" if the tier-1 missed something important.

You must be STRICTER than tier-1. Specifically, REJECT if:
- The sources are about different subjects that share vocabulary (e.g., different components within the same project)
- The tier-1 draft dropped specifics that existed in the sources (numbers, dates, vendors, people, part IDs)
- One source contradicts another and the draft glossed over it
- The sources span a timeline of a changing state (should be preserved as a sequence, not collapsed)

If you CONFIRM, you may polish the content — but preserve every specific from every source.

Same output schema as tier-1:
{
  "action": "merge" | "reject",
  "content": "the unified memory content",
  "memory_type": "knowledge|project|preference|adaptation|episodic|identity",
  "project": "project-slug or empty",
  "domain_tags": ["tag1", "tag2"],
  "confidence": 0.5,
  "reason": "one sentence — what you confirmed or why you overrode"
}

Raw JSON only, no prose, no markdown fences."""


def build_tier2_user_message(sources: list[dict[str, Any]], tier1_verdict: dict[str, Any]) -> str:
    """Format tier-2 review payload: same sources + tier-1's draft."""
    base = build_user_message(sources)
    draft_summary = (
        f"\n\n--- TIER-1 DRAFT (for your review) ---\n"
        f"action: {tier1_verdict.get('action')}\n"
        f"confidence: {tier1_verdict.get('confidence', 0):.2f}\n"
        f"proposed content: {(tier1_verdict.get('content') or '')[:600]}\n"
        f"proposed memory_type: {tier1_verdict.get('memory_type', '')}\n"
        f"proposed project: {tier1_verdict.get('project', '')}\n"
        f"proposed tags: {tier1_verdict.get('domain_tags', [])}\n"
        f"tier-1 reason: {tier1_verdict.get('reason', '')[:300]}\n"
        f"---\n\n"
        f"Return your JSON verdict now. Confirm or override."
    )
    return base.replace("Return the JSON object now.", "").rstrip() + draft_summary


def build_user_message(sources: list[dict[str, Any]]) -> str:
    """Format N source memories for the model to consolidate.

    Each source dict should carry id, content, project, memory_type,
    domain_tags, confidence, valid_until, reference_count.
    """
    lines = [f"You have {len(sources)} source memories in the same (project, memory_type) bucket:\n"]
    for i, src in enumerate(sources[:MAX_SOURCES], start=1):
        tags = src.get("domain_tags") or []
        if isinstance(tags, str):
            try:
                tags = json.loads(tags)
            except Exception:
                tags = []
        lines.append(
            f"--- Source {i} (id={src.get('id','?')[:8]}, "
            f"refs={src.get('reference_count',0)}, "
            f"conf={src.get('confidence',0):.2f}, "
            f"valid_until={src.get('valid_until') or 'permanent'}) ---"
        )
        lines.append(f"project: {src.get('project','')}")
        lines.append(f"type: {src.get('memory_type','')}")
        lines.append(f"tags: {tags}")
        lines.append(f"content: {(src.get('content') or '')[:MAX_CONTENT_CHARS]}")
        lines.append("")
    lines.append("Return the JSON object now.")
    return "\n".join(lines)


def parse_merge_verdict(raw_output: str) -> dict[str, Any] | None:
    """Strip markdown fences / leading prose and return the parsed JSON
    object. Returns None on parse failure."""
    text = (raw_output or "").strip()
    if text.startswith("```"):
        text = text.strip("`")
        nl = text.find("\n")
        if nl >= 0:
            text = text[nl + 1:]
        if text.endswith("```"):
            text = text[:-3]
        text = text.strip()

    if not text.lstrip().startswith("{"):
        start = text.find("{")
        end = text.rfind("}")
        if start >= 0 and end > start:
            text = text[start:end + 1]

    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(parsed, dict):
        return None
    return parsed


def normalize_merge_verdict(verdict: dict[str, Any]) -> dict[str, Any] | None:
    """Validate + normalize a raw merge verdict. Returns None if the
    verdict is unusable (no content, unknown action)."""
    action = str(verdict.get("action") or "").strip().lower()
    if action not in ("merge", "reject"):
        return None

    content = str(verdict.get("content") or "").strip()
    if not content:
        return None

    memory_type = str(verdict.get("memory_type") or "knowledge").strip().lower()
    project = str(verdict.get("project") or "").strip()

    raw_tags = verdict.get("domain_tags") or []
    if isinstance(raw_tags, str):
        raw_tags = [t.strip() for t in raw_tags.split(",") if t.strip()]
    if not isinstance(raw_tags, list):
        raw_tags = []
    tags: list[str] = []
    for t in raw_tags[:6]:
        if not isinstance(t, str):
            continue
        tt = t.strip().lower()
        if tt and tt not in tags:
            tags.append(tt)

    try:
        confidence = float(verdict.get("confidence", 0.5))
    except (TypeError, ValueError):
        confidence = 0.5
    confidence = max(0.0, min(1.0, confidence))

    reason = str(verdict.get("reason") or "").strip()[:500]

    return {
        "action": action,
        "content": content[:1000],
        "memory_type": memory_type,
        "project": project,
        "domain_tags": tags,
        "confidence": confidence,
        "reason": reason,
    }
@@ -21,7 +21,7 @@ from __future__ import annotations

import json
from typing import Any

-LLM_EXTRACTOR_VERSION = "llm-0.4.0"
+LLM_EXTRACTOR_VERSION = "llm-0.6.0"  # bolder unknown-project tagging
MAX_RESPONSE_CHARS = 8000
MAX_PROMPT_CHARS = 2000
MEMORY_TYPES = {"identity", "preference", "project", "episodic", "knowledge", "adaptation"}
@@ -30,7 +30,24 @@ SYSTEM_PROMPT = """You extract memory candidates from LLM conversation turns for

AtoCore is the brain for Atomaste's engineering work. Known projects:
p04-gigabit, p05-interferometer, p06-polisher, atomizer-v2, atocore,
-abb-space. Unknown project names — still tag them, the system auto-detects.
+abb-space.

UNKNOWN PROJECT/TOOL DETECTION (important): when a memory is clearly
about a named tool, product, project, or system that is NOT in the
known list above, use a slugified version of that name as the project
tag (e.g., "apm" for "Atomaste Part Manager", "foo-bar" for "Foo Bar
System"). DO NOT default to a nearest registered match just because
APM isn't listed — that's misattribution. The system's Living
Taxonomy detector scans for these unregistered tags and surfaces them
for one-click registration once they appear in ≥3 memories. Your job
is to be honest about scope, not to squeeze everything into existing
buckets.

Exception: if the memory is about a registered project that merely
uses or integrates with an unknown tool (e.g., "p04 parts are missing
materials in APM"), tag with the registered project (p04-gigabit) and
mention the tool in content. Only use an unknown tool as the project
tag when the tool itself is the primary subject.

Your job is to emit SIGNALS that matter for future context. Be aggressive:
err on the side of capturing useful signal. Triage filters noise downstream.
@@ -84,6 +101,36 @@ DOMAINS for knowledge candidates (required when type=knowledge and project is empty
physics, materials, optics, mechanics, manufacturing, metrology,
controls, software, math, finance, business

DOMAIN TAGS (Phase 3):
Every candidate gets domain_tags — a lowercase list of topical keywords
that describe the SUBJECT matter regardless of project. This is how
cross-project retrieval works: a query about "optics" surfaces matches
from p04 + p05 + p06 without naming each project.

Good tags: single lowercase words or hyphenated terms.
Examples:
- "ABB quote received for P04" → ["abb", "p04", "procurement", "optics"]
- "USB SSD mandatory on polisher" → ["p06", "firmware", "storage"]
- "CTE dominates WFE at F/1.2" → ["optics", "materials", "thermal"]
- "Antoine prefers OAuth over API keys" → ["security", "auth", "preference"]

Tag 2-5 items. Use domain keywords (optics, thermal, firmware), project
tokens when relevant (p04, abb), and lifecycle words (procurement, design,
validation) as appropriate.

VALID_UNTIL (Phase 3):
A memory can have an expiry date if it describes time-bounded truth.
Use valid_until for:
- Status snapshots: "current blocker is X" → valid_until = ~2 weeks out
- Scheduled events: "meeting with vendor Friday" → valid_until = meeting date
- Quotes with expiry: "quote valid until May 31"
- Interim decisions pending ratification
Leave empty (null) for:
- Durable design decisions ("Option B selected")
- Engineering insights ("CTE dominates at F/1.2")
- Ratified requirements, architectural commitments
Default = null (permanent). Format: ISO date "YYYY-MM-DD" or empty.

TRUST HIERARCHY:

- project-specific: set project to the project id, leave domain empty
@@ -99,7 +146,7 @@ OUTPUT RULES:
- Empty array [] is fine when the conversation has no durable signal

Each element:
-{"type": "project|knowledge|preference|adaptation|episodic", "content": "...", "project": "...", "domain": "", "confidence": 0.5}"""
+{"type": "project|knowledge|preference|adaptation|episodic", "content": "...", "project": "...", "domain": "", "confidence": 0.5, "domain_tags": ["tag1","tag2"], "valid_until": null}"""


def build_user_message(prompt: str, response: str, project_hint: str) -> str:
@@ -174,10 +221,36 @@ def normalize_candidate_item(item: dict[str, Any]) -> dict[str, Any] | None:
    if domain and not model_project:
        content = f"[{domain}] {content}"

    # Phase 3: domain_tags + valid_until
    raw_tags = item.get("domain_tags") or []
    if isinstance(raw_tags, str):
        # Tolerate comma-separated string fallback
        raw_tags = [t.strip() for t in raw_tags.split(",") if t.strip()]
    if not isinstance(raw_tags, list):
        raw_tags = []
    domain_tags = []
    for t in raw_tags[:10]:  # cap at 10
        if not isinstance(t, str):
            continue
        tag = t.strip().lower()
        if tag and tag not in domain_tags:
            domain_tags.append(tag)

    valid_until = item.get("valid_until")
    if valid_until is not None:
        valid_until = str(valid_until).strip()
        # Accept ISO date "YYYY-MM-DD" or full timestamp; empty/"null" → none
        if valid_until.lower() in ("", "null", "none", "permanent"):
            valid_until = ""
    else:
        valid_until = ""

    return {
        "type": mem_type,
        "content": content[:1000],
        "project": model_project,
        "domain": domain,
        "confidence": confidence,
        "domain_tags": domain_tags,
        "valid_until": valid_until,
    }
158  src/atocore/memory/_tag_canon_prompt.py  Normal file
@@ -0,0 +1,158 @@
"""Shared LLM prompt + parser for tag canonicalization (Phase 7C).

Stdlib-only, importable from both the in-container service layer and the
host-side batch script that shells out to ``claude -p``.

The prompt instructs the model to propose a map of domain_tag aliases
to their canonical form. Confidence is key here — we AUTO-APPLY high-
confidence aliases; low-confidence go to human review. Over-merging
distinct concepts ("optics" vs "optical" — sometimes equivalent,
sometimes not) destroys cross-cutting retrieval, so the model is
instructed to err conservative.
"""

from __future__ import annotations

import json
from typing import Any

TAG_CANON_PROMPT_VERSION = "tagcanon-0.1.0"
MAX_TAGS_IN_PROMPT = 100

SYSTEM_PROMPT = """You canonicalize domain tags for AtoCore's memory layer.

Input: a distribution of lowercase domain tags (keyword → usage count across active memories). Examples: "firmware: 23", "fw: 5", "firmware-control: 3", "optics: 18", "optical: 2".

Your job: identify aliases — distinct strings that refer to the SAME concept — and map them to a single canonical form. The canonical should be the clearest / most-used / most-descriptive variant.

STRICT RULES:

1. ONLY propose aliases that are UNAMBIGUOUSLY equivalent. Examples:
   - "fw" → "firmware" (abbreviation)
   - "firmware-control" → "firmware" (compound narrowing — only if usage context makes it clear the narrower one is never used to DISTINGUISH from firmware-in-general)
   - "py" → "python"
   - "ml" → "machine-learning"
   Do NOT merge:
   - "optics" vs "optical" — these CAN diverge ("optics" = subsystem/product domain; "optical" = adjective used in non-optics contexts)
   - "p04" vs "p04-gigabit" — project ids are their own namespace, never canonicalize
   - "thermal" vs "temperature" — related but distinct
   - Anything where you're not sure — skip it, human review will catch real aliases next week

2. Confidence scale:
   0.9+    obvious abbreviation, very high usage disparity, no plausible alternative meaning
   0.7-0.9 likely alias, one-word-diff or standard contraction
   0.5-0.7 plausible but requires context — low count on alias side
   <0.5    DO NOT PROPOSE — if you're under 0.5, skip the pair entirely
   AtoCore auto-applies aliases at confidence >= 0.8; anything below goes to human review.

3. The CANONICAL must actually appear in the input list (don't invent a new term).

4. Never propose `alias == canonical`. Never propose circular mappings.

5. Project tags (p04, p05, p06, abb-space, atomizer-v2, atocore, apm) are OFF LIMITS — they are project identifiers, not concepts. Leave them alone entirely.

OUTPUT — raw JSON, no prose, no markdown fences:
{
  "aliases": [
    {"alias": "fw", "canonical": "firmware", "confidence": 0.95, "reason": "fw is a standard abbreviation of firmware; 5 uses vs 23"},
    {"alias": "ml", "canonical": "machine-learning", "confidence": 0.90, "reason": "ml is the universal abbreviation"}
  ]
}

Empty aliases list is fine if nothing in the distribution is a clear alias. Err conservative — one false merge can pollute retrieval for hundreds of memories."""


def build_user_message(tag_distribution: dict[str, int]) -> str:
    """Format the tag distribution for the model.

    Limited to MAX_TAGS_IN_PROMPT entries, sorted by count descending
    so high-usage tags appear first (the LLM uses them as anchor points
    for canonical selection).
    """
    if not tag_distribution:
        return "Empty tag distribution — return {\"aliases\": []}."

    sorted_tags = sorted(tag_distribution.items(), key=lambda x: x[1], reverse=True)
    top = sorted_tags[:MAX_TAGS_IN_PROMPT]
    lines = [f"{tag}: {count}" for tag, count in top]
    return (
        f"Tag distribution across {sum(tag_distribution.values())} total tag references "
        f"(showing top {len(top)} of {len(tag_distribution)} unique tags):\n\n"
        + "\n".join(lines)
        + "\n\nReturn the JSON aliases map now. Only propose UNAMBIGUOUS equivalents."
    )


def parse_canon_output(raw_output: str) -> list[dict[str, Any]]:
    """Strip markdown fences / prose and return the parsed aliases list."""
    text = (raw_output or "").strip()
    if text.startswith("```"):
        text = text.strip("`")
        nl = text.find("\n")
        if nl >= 0:
            text = text[nl + 1:]
        if text.endswith("```"):
            text = text[:-3]
        text = text.strip()

    if not text.lstrip().startswith("{"):
        start = text.find("{")
        end = text.rfind("}")
        if start >= 0 and end > start:
            text = text[start:end + 1]

    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return []

    if not isinstance(parsed, dict):
        return []
    aliases = parsed.get("aliases") or []
    if not isinstance(aliases, list):
        return []
    return [a for a in aliases if isinstance(a, dict)]


# Project tokens that must never be canonicalized — they're project ids,
# not concepts. Keep this list in sync with the registered projects.
# Safe to be over-inclusive; extra entries just skip canonicalization.
PROTECTED_PROJECT_TOKENS = frozenset({
    "p04", "p04-gigabit",
    "p05", "p05-interferometer",
    "p06", "p06-polisher",
    "p08", "abb-space",
    "atomizer", "atomizer-v2",
    "atocore", "apm",
})


def normalize_alias_item(item: dict[str, Any]) -> dict[str, Any] | None:
    """Validate one raw alias proposal. Returns None if unusable.

    Filters: non-strings, empty strings, identity mappings, protected
    project tokens on either side.
    """
    alias = str(item.get("alias") or "").strip().lower()
    canonical = str(item.get("canonical") or "").strip().lower()
    if not alias or not canonical:
        return None
    if alias == canonical:
        return None
    if alias in PROTECTED_PROJECT_TOKENS or canonical in PROTECTED_PROJECT_TOKENS:
        return None

    try:
        confidence = float(item.get("confidence", 0.0))
    except (TypeError, ValueError):
        confidence = 0.0
    confidence = max(0.0, min(1.0, confidence))

    reason = str(item.get("reason") or "").strip()[:300]

    return {
        "alias": alias,
        "canonical": canonical,
        "confidence": confidence,
        "reason": reason,
    }
(File diff suppressed because it is too large.)
88  src/atocore/memory/similarity.py  Normal file
@@ -0,0 +1,88 @@
"""Phase 7A (Memory Consolidation): semantic similarity helpers.

Thin wrapper over ``atocore.retrieval.embeddings`` that exposes
pairwise + batch cosine similarity on normalized embeddings. Used by
the dedup detector to cluster near-duplicate active memories.

Embeddings from ``embed_texts()`` are already L2-normalized, so cosine
similarity reduces to a dot product — no extra normalization needed.
"""

from __future__ import annotations

from atocore.retrieval.embeddings import embed_texts


def _dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity on already-normalized vectors. Clamped to [0,1]
    (embeddings use paraphrase-multilingual-MiniLM which is unit-norm,
    and we never want negative values leaking into thresholds)."""
    return max(0.0, min(1.0, _dot(a, b)))


def compute_memory_similarity(text_a: str, text_b: str) -> float:
    """Return cosine similarity of two memory contents in [0,1].

    Convenience helper for one-off checks + tests. For batch work (the
    dedup detector), use ``embed_texts()`` directly and compute the
    similarity matrix yourself to avoid re-embedding shared texts.
    """
    if not text_a or not text_b:
        return 0.0
    vecs = embed_texts([text_a, text_b])
    return cosine(vecs[0], vecs[1])


def similarity_matrix(texts: list[str]) -> list[list[float]]:
    """N×N cosine similarity matrix. Diagonal is 1.0, symmetric."""
    if not texts:
        return []
    vecs = embed_texts(texts)
    n = len(vecs)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = 1.0
        for j in range(i + 1, n):
            s = cosine(vecs[i], vecs[j])
            matrix[i][j] = s
            matrix[j][i] = s
    return matrix


def cluster_by_threshold(texts: list[str], threshold: float) -> list[list[int]]:
    """Greedy transitive clustering: if sim(i,j) >= threshold, merge.

    Returns a list of clusters, each a list of indices into ``texts``.
    Singletons are included. Used by the dedup detector to collapse
    A~B~C into one merge proposal rather than three pair proposals.
    """
    if not texts:
        return []
    matrix = similarity_matrix(texts)
    n = len(texts)
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x: int, y: int) -> None:
        rx, ry = find(x), find(y)
        if rx != ry:
            parent[rx] = ry

    for i in range(n):
        for j in range(i + 1, n):
            if matrix[i][j] >= threshold:
                union(i, j)

    groups: dict[int, list[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
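The union-find clustering in `cluster_by_threshold` is easiest to verify with a hand-written similarity matrix in place of real embeddings. The sketch below re-implements the clustering step standalone (same union-find technique, matrix passed in directly) and shows the transitive behavior: 0~1 and 1~2 chain into one cluster even though sim(0, 2) is below threshold.

```python
def cluster_by_threshold(matrix: list[list[float]], threshold: float) -> list[list[int]]:
    """Union-find clustering over a precomputed similarity matrix (sketch)."""
    n = len(matrix)
    parent = list(range(n))

    def find(x: int) -> int:
        # Path halving keeps the trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if matrix[i][j] >= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj

    groups: dict[int, list[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


# 0~1 and 1~2 exceed the threshold; sim(0, 2) = 0.50 does not. Index 3 is a singleton.
sim = [
    [1.0, 0.95, 0.50, 0.10],
    [0.95, 1.0, 0.92, 0.12],
    [0.50, 0.92, 1.0, 0.08],
    [0.10, 0.12, 0.08, 1.0],
]
print(cluster_by_threshold(sim, 0.9))  # → [[0, 1, 2], [3]]
```

This transitivity is exactly why the module warns that large clusters are suspicious: one borderline pair can bridge two otherwise unrelated groups.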
@@ -119,6 +119,111 @@ def _apply_migrations(conn: sqlite3.Connection) -> None:
        "CREATE INDEX IF NOT EXISTS idx_memories_last_referenced ON memories(last_referenced_at)"
    )

    # Phase 3 (Auto-Organization V1): domain tags + expiry.
    # domain_tags is a JSON array of lowercase strings (optics, mechanics,
    # firmware, business, etc.) inferred by the LLM during triage. Used for
    # cross-project retrieval: a query about "optics" can surface matches from
    # p04 + p05 + p06 without knowing all the project names.
    # valid_until is an ISO UTC timestamp beyond which the memory is
    # considered stale. get_memories_for_context filters these out of context
    # packs automatically so ephemeral facts (status snapshots, weekly counts)
    # don't pollute grounding once they've aged out.
    if not _column_exists(conn, "memories", "domain_tags"):
        conn.execute("ALTER TABLE memories ADD COLUMN domain_tags TEXT DEFAULT '[]'")
    if not _column_exists(conn, "memories", "valid_until"):
        conn.execute("ALTER TABLE memories ADD COLUMN valid_until DATETIME")
        conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_memories_valid_until ON memories(valid_until)"
        )

    # Phase 5 (Engineering V1): when a memory graduates to an entity, we
    # keep the memory row as an immutable historical pointer. The forward
    # pointer lets downstream code follow "what did this memory become?"
    # without having to join through source_refs.
    if not _column_exists(conn, "memories", "graduated_to_entity_id"):
        conn.execute("ALTER TABLE memories ADD COLUMN graduated_to_entity_id TEXT")
        conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_memories_graduated ON memories(graduated_to_entity_id)"
        )

    # Phase 4 (Robustness V1): append-only audit log for memory mutations.
    # Every create/update/promote/reject/supersede/invalidate/reinforce/expire/
    # auto_promote writes one row here. before/after are JSON snapshots of the
    # relevant fields. actor lets us distinguish auto-triage vs human-triage vs
    # api vs cron. This is the "how did this memory get to its current state"
    # trail — essential once the brain starts auto-organizing itself.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS memory_audit (
            id TEXT PRIMARY KEY,
            memory_id TEXT NOT NULL,
            action TEXT NOT NULL,
            actor TEXT DEFAULT 'api',
            before_json TEXT DEFAULT '{}',
            after_json TEXT DEFAULT '{}',
            note TEXT DEFAULT '',
            timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_memory_audit_memory ON memory_audit(memory_id)")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_memory_audit_timestamp ON memory_audit(timestamp)")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_memory_audit_action ON memory_audit(action)")

    # Phase 5 (Engineering V1): entity_kind discriminator lets one audit
    # table serve both memories AND entities. Default "memory" keeps existing
    # rows correct; entity mutations write entity_kind="entity".
    if not _column_exists(conn, "memory_audit", "entity_kind"):
        conn.execute("ALTER TABLE memory_audit ADD COLUMN entity_kind TEXT DEFAULT 'memory'")
        conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_memory_audit_entity_kind ON memory_audit(entity_kind)"
        )

    # Phase 5: conflicts + conflict_members tables per conflict-model.md.
    # A conflict is "two or more active rows claiming the same slot with
    # incompatible values". slot_kind + slot_key identify the logical slot
    # (e.g., "component.material" for some component id). Members point
    # back to the conflicting rows (memory or entity) with layer trust so
    # resolution can pick the highest-trust winner.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS conflicts (
            id TEXT PRIMARY KEY,
            slot_kind TEXT NOT NULL,
            slot_key TEXT NOT NULL,
            project TEXT DEFAULT '',
            status TEXT DEFAULT 'open',
            resolution TEXT DEFAULT '',
            resolved_at DATETIME,
            detected_at DATETIME DEFAULT CURRENT_TIMESTAMP,
            note TEXT DEFAULT ''
        )
        """
    )
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS conflict_members (
            id TEXT PRIMARY KEY,
            conflict_id TEXT NOT NULL REFERENCES conflicts(id) ON DELETE CASCADE,
            member_kind TEXT NOT NULL,
            member_id TEXT NOT NULL,
            member_layer_trust INTEGER DEFAULT 0,
            value_snapshot TEXT DEFAULT ''
        )
        """
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_conflicts_status ON conflicts(status)")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_conflicts_project ON conflicts(project)")
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_conflicts_slot ON conflicts(slot_kind, slot_key)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_conflict_members_conflict ON conflict_members(conflict_id)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_conflict_members_member ON conflict_members(member_kind, member_id)"
    )

    # Phase 9 Commit A: capture loop columns on the interactions table.
    # The original schema only carried prompt + project_id + a context_pack
    # JSON blob. To make interactions a real audit trail of what AtoCore fed
@@ -146,6 +251,101 @@ def _apply_migrations(conn: sqlite3.Connection) -> None:
|
||||
"CREATE INDEX IF NOT EXISTS idx_interactions_created_at ON interactions(created_at)"
|
||||
)
|
||||
|
||||
# Phase 7A (Memory Consolidation — "sleep cycle"): merge candidates.
|
||||
# When the dedup detector finds a cluster of semantically similar active
|
||||
# memories within the same (project, memory_type) bucket, it drafts a
|
||||
# unified content via LLM and writes a proposal here. The triage UI
|
||||
# surfaces these for human approval. On approve, source memories become
|
||||
# status=superseded and a new merged memory is created.
|
||||
# memory_ids is a JSON array (length >= 2) of the source memory ids.
|
||||
# proposed_* hold the LLM's draft; a human can edit before approve.
|
||||
# result_memory_id is filled on approve with the new merged memory's id.
|
||||
conn.execute(
|
||||
"""
|
||||
CREATE TABLE IF NOT EXISTS memory_merge_candidates (
|
||||
id TEXT PRIMARY KEY,
|
||||
status TEXT DEFAULT 'pending',
|
||||
memory_ids TEXT NOT NULL,
|
||||
similarity REAL,
|
||||
proposed_content TEXT,
|
||||
proposed_memory_type TEXT,
|
||||
proposed_project TEXT,
|
||||
proposed_tags TEXT DEFAULT '[]',
|
||||
proposed_confidence REAL,
|
||||
reason TEXT DEFAULT '',
|
||||
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
|
||||
resolved_at DATETIME,
|
||||
resolved_by TEXT,
|
||||
result_memory_id TEXT
|
||||
)
|
||||
"""
|
||||
)
|
||||
conn.execute(
|
||||
"CREATE INDEX IF NOT EXISTS idx_mmc_status ON memory_merge_candidates(status)"
|
||||
)
|
||||
conn.execute(
|
||||
"CREATE INDEX IF NOT EXISTS idx_mmc_created_at ON memory_merge_candidates(created_at)"
|
||||
)
|
||||
|
||||
# Phase 7C (Memory Consolidation — tag canonicalization): alias → canonical
|
||||
# map for domain_tags. A weekly LLM pass proposes rows here; high-confidence
|
||||
# ones auto-apply (rewrite domain_tags across all memories), low-confidence
|
||||
# ones stay pending for human approval. Immutable history: resolved rows
|
||||
# keep status=approved/rejected; the same alias can re-appear with a new
|
||||
# id if the tag reaches a different canonical later.
|
||||
conn.execute(
|
||||
"""
|
||||
CREATE TABLE IF NOT EXISTS tag_aliases (
|
||||
id TEXT PRIMARY KEY,
|
||||
alias TEXT NOT NULL,
|
||||
canonical TEXT NOT NULL,
|
||||
status TEXT DEFAULT 'pending',
|
||||
confidence REAL DEFAULT 0.0,
|
||||
alias_count INTEGER DEFAULT 0,
|
||||
canonical_count INTEGER DEFAULT 0,
|
||||
reason TEXT DEFAULT '',
|
||||
applied_to_memories INTEGER DEFAULT 0,
|
||||
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
|
||||
resolved_at DATETIME,
|
||||
resolved_by TEXT
|
||||
)
|
||||
"""
|
||||
)
|
||||
conn.execute("CREATE INDEX IF NOT EXISTS idx_tag_aliases_status ON tag_aliases(status)")
|
||||
conn.execute("CREATE INDEX IF NOT EXISTS idx_tag_aliases_alias ON tag_aliases(alias)")
|
||||
|
||||
# Issue F (visual evidence): binary asset store. One row per unique
|
||||
# content hash — re-uploading the same file is idempotent. The blob
|
||||
# itself lives on disk under stored_path; this table is the catalog.
|
||||
# width/height are populated for image mime types (NULL otherwise).
|
||||
# source_refs is a JSON array of free-form provenance pointers
|
||||
# (e.g. "session:<id>", "interaction:<id>") that survive independent
|
||||
# of the EVIDENCED_BY graph. status=invalid tombstones an asset
|
||||
# without dropping the row so audit trails stay intact.
|
||||
conn.execute(
|
||||
"""
|
||||
CREATE TABLE IF NOT EXISTS assets (
|
||||
id TEXT PRIMARY KEY,
|
||||
hash_sha256 TEXT UNIQUE NOT NULL,
|
||||
mime_type TEXT NOT NULL,
|
||||
size_bytes INTEGER NOT NULL,
|
||||
width INTEGER,
|
||||
height INTEGER,
|
||||
stored_path TEXT NOT NULL,
|
||||
original_filename TEXT DEFAULT '',
|
||||
project TEXT DEFAULT '',
|
||||
caption TEXT DEFAULT '',
|
||||
source_refs TEXT DEFAULT '[]',
|
||||
status TEXT DEFAULT 'active',
|
||||
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
|
||||
updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
|
||||
)
|
||||
"""
|
||||
)
|
||||
conn.execute("CREATE INDEX IF NOT EXISTS idx_assets_hash ON assets(hash_sha256)")
|
||||
conn.execute("CREATE INDEX IF NOT EXISTS idx_assets_project ON assets(project)")
|
||||
conn.execute("CREATE INDEX IF NOT EXISTS idx_assets_status ON assets(status)")
|
||||
|
||||
|
||||
def _column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
|
||||
rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
|
||||
|
||||
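The conflicts schema above stores members with a layer trust but leaves the resolution policy to code. A minimal standalone sketch of the policy the comment describes ("pick the highest-trust winner"), using an in-memory SQLite database and a trimmed-down version of the two tables — the resolver query is an assumption, not AtoCore's actual implementation:

```python
import sqlite3

# Trimmed copies of the conflicts / conflict_members tables from the migration
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE conflicts (
    id TEXT PRIMARY KEY,
    slot_kind TEXT NOT NULL,
    slot_key TEXT NOT NULL,
    status TEXT DEFAULT 'open'
);
CREATE TABLE conflict_members (
    id TEXT PRIMARY KEY,
    conflict_id TEXT NOT NULL REFERENCES conflicts(id) ON DELETE CASCADE,
    member_kind TEXT NOT NULL,
    member_id TEXT NOT NULL,
    member_layer_trust INTEGER DEFAULT 0,
    value_snapshot TEXT DEFAULT ''
);
""")

# Two rows claim the same slot (component.material) with incompatible values
conn.execute("INSERT INTO conflicts VALUES ('c1', 'component.material', 'tower-01', 'open')")
conn.execute("INSERT INTO conflict_members VALUES ('m1', 'c1', 'memory', 'mem-a', 1, 'aluminium')")
conn.execute("INSERT INTO conflict_members VALUES ('m2', 'c1', 'entity', 'ent-b', 3, 'Zerodur')")

# Hypothetical resolver: the member with the highest layer trust wins
winner = conn.execute(
    "SELECT value_snapshot FROM conflict_members "
    "WHERE conflict_id = 'c1' ORDER BY member_layer_trust DESC LIMIT 1"
).fetchone()[0]
print(winner)  # Zerodur
```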
170 src/atocore/observability/alerts.py Normal file
@@ -0,0 +1,170 @@
"""Alert emission framework (Phase 4 Robustness V1).

One-stop helper to raise operational alerts from any AtoCore code
path. An alert is a structured message about something the operator
should see — harness regression, queue pileup, integrity drift,
pipeline skipped, etc.

Emission fans out to multiple sinks so a single call touches every
observability channel:

1. structlog logger (always)
2. Append to ``$ATOCORE_ALERT_LOG`` (default ~/atocore-logs/alerts.log)
3. Write the last alert of each severity to AtoCore project state
   (atocore/alert/last_{severity}) so the dashboard can surface it
4. POST to ``$ATOCORE_ALERT_WEBHOOK`` if set (Discord/Slack/generic)

All sinks are fail-open — if one fails the others still fire.

Severity levels (inspired by syslog but simpler):
- ``info``     operational event worth noting
- ``warning``  degraded state, service still works
- ``critical`` something is broken and needs attention

Environment variables:
    ATOCORE_ALERT_LOG      override the alerts log file path
    ATOCORE_ALERT_WEBHOOK  POST JSON alerts here (Discord webhook, etc.)
    ATOCORE_BASE_URL       AtoCore API for project-state write (default localhost:8100)
"""

from __future__ import annotations

import json
import os
import threading
import urllib.error
import urllib.request
from datetime import datetime, timezone
from pathlib import Path

from atocore.observability.logger import get_logger

log = get_logger("alerts")

SEVERITIES = {"info", "warning", "critical"}


def _default_alert_log() -> Path:
    explicit = os.environ.get("ATOCORE_ALERT_LOG")
    if explicit:
        return Path(explicit)
    return Path.home() / "atocore-logs" / "alerts.log"


def _append_log(severity: str, title: str, message: str, context: dict | None) -> None:
    path = _default_alert_log()
    try:
        path.parent.mkdir(parents=True, exist_ok=True)
        ts = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        line = f"[{ts}] [{severity.upper()}] {title}: {message}"
        if context:
            line += f" {json.dumps(context, ensure_ascii=True)[:500]}"
        line += "\n"
        with open(path, "a", encoding="utf-8") as f:
            f.write(line)
    except Exception as e:
        log.warning("alert_log_write_failed", error=str(e))


def _write_state(severity: str, title: str, message: str, ts: str) -> None:
    """Record the most-recent alert per severity into project_state.

    Uses the internal ``set_state`` helper directly so we work even
    when the HTTP API isn't available (e.g. called from cron scripts
    that import atocore as a library).
    """
    try:
        from atocore.context.project_state import set_state

        set_state(
            project_name="atocore",
            category="alert",
            key=f"last_{severity}",
            value=json.dumps({"title": title, "message": message[:400], "timestamp": ts}),
            source="alert framework",
        )
    except Exception as e:
        log.warning("alert_state_write_failed", error=str(e))


def _post_webhook(severity: str, title: str, message: str, context: dict | None, ts: str) -> None:
    url = os.environ.get("ATOCORE_ALERT_WEBHOOK")
    if not url:
        return

    # Auto-detect Discord webhook shape for nicer formatting
    if "discord.com/api/webhooks" in url or "discordapp.com/api/webhooks" in url:
        emoji = {"info": ":information_source:", "warning": ":warning:", "critical": ":rotating_light:"}.get(severity, "")
        body = {
            "content": f"{emoji} **AtoCore {severity}**: {title}",
            "embeds": [{
                "description": message[:1800],
                "timestamp": ts,
                "fields": [
                    {"name": k, "value": str(v)[:200], "inline": True}
                    for k, v in (context or {}).items()
                ][:10],
            }],
        }
    else:
        body = {
            "severity": severity,
            "title": title,
            "message": message,
            "context": context or {},
            "timestamp": ts,
        }

    def _fire():
        try:
            req = urllib.request.Request(
                url,
                data=json.dumps(body).encode("utf-8"),
                method="POST",
                headers={"Content-Type": "application/json"},
            )
            urllib.request.urlopen(req, timeout=8)
        except Exception as e:
            log.warning("alert_webhook_failed", error=str(e))

    threading.Thread(target=_fire, daemon=True).start()


def emit_alert(
    severity: str,
    title: str,
    message: str,
    context: dict | None = None,
) -> None:
    """Emit an alert to all configured sinks.

    Fail-open: any single sink failure is logged but does not prevent
    other sinks from firing.
    """
    severity = (severity or "info").lower()
    if severity not in SEVERITIES:
        severity = "info"

    ts = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    # Sink 1: structlog — always
    logger_fn = {
        "info": log.info,
        "warning": log.warning,
        "critical": log.error,
    }[severity]
    logger_fn("alert", title=title, message=message[:500], **(context or {}))

    # Sinks 2-4: fail-open, each wrapped
    try:
        _append_log(severity, title, message, context)
    except Exception:
        pass
    try:
        _write_state(severity, title, message, ts)
    except Exception:
        pass
    try:
        _post_webhook(severity, title, message, context, ts)
    except Exception:
        pass
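The fail-open fan-out that ``emit_alert`` implements can be sketched standalone — every sink is attempted, a failing sink is swallowed, and the emit itself never raises. The sink names here are illustrative stand-ins, not AtoCore's real sinks:

```python
results = []

def log_sink(msg):
    results.append(("log", msg))

def broken_sink(msg):
    # Stands in for an unwritable log file or unreachable webhook
    raise IOError("disk full")

def webhook_sink(msg):
    results.append(("webhook", msg))

def emit(msg, sinks):
    for sink in sinks:
        try:
            sink(msg)
        except Exception:
            pass  # fail-open: one broken sink never blocks the rest

emit("harness regression", [log_sink, broken_sink, webhook_sink])
print(results)  # both healthy sinks fired despite the broken one
```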
@@ -11,6 +11,20 @@ import atocore.config as _config
from atocore.ingestion.pipeline import ingest_folder


# Reserved pseudo-projects. `inbox` holds pre-project / lead / quote
# entities that don't yet belong to a real project. `""` (empty) is the
# cross-project bucket for facts that apply to every project (material
# properties, vendor capabilities). Neither may be registered, renamed,
# or deleted via the normal registry CRUD.
INBOX_PROJECT = "inbox"
GLOBAL_PROJECT = ""
_RESERVED_PROJECT_IDS = {INBOX_PROJECT}


def is_reserved_project(name: str) -> bool:
    return (name or "").strip().lower() in _RESERVED_PROJECT_IDS


@dataclass(frozen=True)
class ProjectSourceRef:
    source: str
@@ -56,8 +70,17 @@ def build_project_registration_proposal(
    normalized_id = project_id.strip()
    if not normalized_id:
        raise ValueError("Project id must be non-empty")
    if is_reserved_project(normalized_id):
        raise ValueError(
            f"Project id {normalized_id!r} is reserved and cannot be registered"
        )

    normalized_aliases = _normalize_aliases(aliases or [])
    for alias in normalized_aliases:
        if is_reserved_project(alias):
            raise ValueError(
                f"Alias {alias!r} is reserved and cannot be used"
            )
    normalized_roots = _normalize_ingest_roots(ingest_roots or [])
    if not normalized_roots:
        raise ValueError("At least one ingest root is required")
@@ -129,6 +152,10 @@ def update_project(
    ingest_roots: list[dict] | tuple[dict, ...] | None = None,
) -> dict:
    """Update an existing project registration in the registry file."""
    if is_reserved_project(project_name):
        raise ValueError(
            f"Project {project_name!r} is reserved and cannot be modified"
        )
    existing = get_registered_project(project_name)
    if existing is None:
        raise ValueError(f"Unknown project: {project_name}")
@@ -272,6 +299,8 @@ def resolve_project_name(name: str | None) -> str:
    """
    if not name:
        return name or ""
    if is_reserved_project(name):
        return name.strip().lower()
    project = get_registered_project(name)
    if project is not None:
        return project.project_id
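The reserved-project check above normalizes before comparing, which makes it case- and whitespace-insensitive. A minimal standalone mirror of that logic (copied from the diff, with demo calls added) also shows a subtlety worth noting: the empty-string global bucket is not in ``_RESERVED_PROJECT_IDS``, so it passes this particular check even though the comment treats it as reserved:

```python
INBOX_PROJECT = "inbox"
_RESERVED_PROJECT_IDS = {INBOX_PROJECT}

def is_reserved_project(name: str) -> bool:
    # Strip + lowercase so "  Inbox " and "inbox" are the same id
    return (name or "").strip().lower() in _RESERVED_PROJECT_IDS

print(is_reserved_project("  Inbox "))     # True
print(is_reserved_project("p04-gigabit"))  # False
print(is_reserved_project(""))             # False — the empty global bucket is not in the set
```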
58 tests/test_alerts.py Normal file
@@ -0,0 +1,58 @@
"""Tests for the Phase 4 alerts framework."""

from __future__ import annotations

import os
import tempfile
from pathlib import Path

import pytest

import atocore.config as _config


@pytest.fixture(autouse=True)
def isolated_env(monkeypatch):
    """Isolate alerts sinks per test."""
    tmpdir = tempfile.mkdtemp()
    log_file = Path(tmpdir) / "alerts.log"
    monkeypatch.setenv("ATOCORE_ALERT_LOG", str(log_file))
    monkeypatch.delenv("ATOCORE_ALERT_WEBHOOK", raising=False)

    # Data dir for any state writes
    monkeypatch.setenv("ATOCORE_DATA_DIR", tmpdir)
    _config.settings = _config.Settings()

    from atocore.models.database import init_db
    init_db()

    yield {"tmpdir": tmpdir, "log_file": log_file}


def test_emit_alert_writes_log_file(isolated_env):
    from atocore.observability.alerts import emit_alert

    emit_alert("warning", "test title", "test message body", context={"count": 5})

    content = isolated_env["log_file"].read_text(encoding="utf-8")
    assert "test title" in content
    assert "test message body" in content
    assert "WARNING" in content
    assert '"count": 5' in content


def test_emit_alert_invalid_severity_falls_back_to_info(isolated_env):
    from atocore.observability.alerts import emit_alert

    emit_alert("made-up-severity", "t", "m")
    content = isolated_env["log_file"].read_text(encoding="utf-8")
    assert "INFO" in content


def test_emit_alert_fails_open_on_log_write_error(monkeypatch, isolated_env):
    """An unwritable log path should not crash the emit."""
    from atocore.observability.alerts import emit_alert

    monkeypatch.setenv("ATOCORE_ALERT_LOG", "/nonexistent/path/that/definitely/is/not/writable/alerts.log")
    # Must not raise
    emit_alert("info", "t", "m")
257 tests/test_assets.py Normal file
@@ -0,0 +1,257 @@
"""Issue F — binary asset store + artifact entity + wiki rendering."""

from io import BytesIO

import pytest
from fastapi.testclient import TestClient
from PIL import Image

from atocore.assets import (
    AssetTooLarge,
    AssetTypeNotAllowed,
    get_asset,
    get_asset_binary,
    get_thumbnail,
    invalidate_asset,
    list_orphan_assets,
    store_asset,
)
from atocore.engineering.service import (
    ENTITY_TYPES,
    create_entity,
    create_relationship,
    init_engineering_schema,
)
from atocore.main import app
from atocore.models.database import init_db


def _png_bytes(color=(255, 0, 0), size=(64, 48)) -> bytes:
    buf = BytesIO()
    Image.new("RGB", size, color).save(buf, format="PNG")
    return buf.getvalue()


@pytest.fixture
def assets_env(tmp_data_dir, tmp_path, monkeypatch):
    registry_path = tmp_path / "test-registry.json"
    registry_path.write_text('{"projects": []}', encoding="utf-8")
    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
    from atocore import config
    config.settings = config.Settings()

    init_db()
    init_engineering_schema()
    yield tmp_data_dir


def test_artifact_is_in_entity_types():
    assert "artifact" in ENTITY_TYPES


def test_store_asset_happy_path(assets_env):
    data = _png_bytes()
    asset = store_asset(data=data, mime_type="image/png", caption="red square")
    assert asset.hash_sha256
    assert asset.size_bytes == len(data)
    assert asset.width == 64
    assert asset.height == 48
    assert asset.mime_type == "image/png"
    from pathlib import Path
    assert Path(asset.stored_path).exists()


def test_store_asset_is_idempotent_on_hash(assets_env):
    data = _png_bytes()
    a = store_asset(data=data, mime_type="image/png")
    b = store_asset(data=data, mime_type="image/png", caption="different caption")
    assert a.id == b.id, "same content should dedup to the same asset id"


def test_store_asset_rejects_unknown_mime(assets_env):
    with pytest.raises(AssetTypeNotAllowed):
        store_asset(data=b"hello", mime_type="text/plain")


def test_store_asset_rejects_oversize(assets_env, monkeypatch):
    monkeypatch.setattr(
        "atocore.config.settings.assets_max_upload_bytes",
        10,
        raising=False,
    )
    with pytest.raises(AssetTooLarge):
        store_asset(data=_png_bytes(), mime_type="image/png")


def test_get_asset_binary_roundtrip(assets_env):
    data = _png_bytes(color=(0, 255, 0))
    asset = store_asset(data=data, mime_type="image/png")
    _, roundtrip = get_asset_binary(asset.id)
    assert roundtrip == data


def test_thumbnail_generates_and_caches(assets_env):
    data = _png_bytes(size=(800, 600))
    asset = store_asset(data=data, mime_type="image/png")
    _, thumb1 = get_thumbnail(asset.id, size=120)
    _, thumb2 = get_thumbnail(asset.id, size=120)
    assert thumb1 == thumb2
    # Must be a valid JPEG and smaller than the source
    assert thumb1[:3] == b"\xff\xd8\xff"
    assert len(thumb1) < len(data)


def test_orphan_list_excludes_referenced(assets_env):
    referenced = store_asset(data=_png_bytes((1, 1, 1)), mime_type="image/png")
    lonely = store_asset(data=_png_bytes((2, 2, 2)), mime_type="image/png")
    create_entity(
        entity_type="artifact",
        name="ref-test",
        properties={"kind": "image", "asset_id": referenced.id},
    )
    orphan_ids = {o.id for o in list_orphan_assets()}
    assert lonely.id in orphan_ids
    assert referenced.id not in orphan_ids


def test_invalidate_refuses_referenced_asset(assets_env):
    asset = store_asset(data=_png_bytes((3, 3, 3)), mime_type="image/png")
    create_entity(
        entity_type="artifact",
        name="pinned",
        properties={"kind": "image", "asset_id": asset.id},
    )
    assert invalidate_asset(asset.id) is False
    assert get_asset(asset.id).status == "active"


def test_invalidate_orphan_succeeds(assets_env):
    asset = store_asset(data=_png_bytes((4, 4, 4)), mime_type="image/png")
    assert invalidate_asset(asset.id) is True
    assert get_asset(asset.id).status == "invalid"


def test_api_upload_and_fetch(assets_env):
    client = TestClient(app)
    png = _png_bytes((7, 7, 7))
    r = client.post(
        "/assets",
        files={"file": ("red.png", png, "image/png")},
        data={"project": "p05", "caption": "unit test upload"},
    )
    assert r.status_code == 200, r.text
    body = r.json()
    assert body["mime_type"] == "image/png"
    assert body["caption"] == "unit test upload"
    asset_id = body["id"]

    r2 = client.get(f"/assets/{asset_id}")
    assert r2.status_code == 200
    assert r2.headers["content-type"].startswith("image/png")
    assert r2.content == png

    r3 = client.get(f"/assets/{asset_id}/thumbnail?size=100")
    assert r3.status_code == 200
    assert r3.headers["content-type"].startswith("image/jpeg")

    r4 = client.get(f"/assets/{asset_id}/meta")
    assert r4.status_code == 200
    assert r4.json()["id"] == asset_id


def test_api_upload_rejects_bad_mime(assets_env):
    client = TestClient(app)
    r = client.post(
        "/assets",
        files={"file": ("notes.txt", b"hello", "text/plain")},
    )
    assert r.status_code == 415


def test_api_get_entity_evidence_returns_artifacts(assets_env):
    asset = store_asset(data=_png_bytes((9, 9, 9)), mime_type="image/png")
    artifact = create_entity(
        entity_type="artifact",
        name="cap-001",
        properties={
            "kind": "image",
            "asset_id": asset.id,
            "caption": "tower base",
        },
    )
    tower = create_entity(entity_type="component", name="tower")
    create_relationship(
        source_entity_id=tower.id,
        target_entity_id=artifact.id,
        relationship_type="evidenced_by",
    )

    client = TestClient(app)
    r = client.get(f"/entities/{tower.id}/evidence")
    assert r.status_code == 200
    body = r.json()
    assert body["count"] == 1
    ev = body["evidence"][0]
    assert ev["kind"] == "image"
    assert ev["caption"] == "tower base"
    assert ev["asset"]["id"] == asset.id


def test_v1_assets_aliases_present(assets_env):
    client = TestClient(app)
    spec = client.get("/openapi.json").json()
    paths = spec["paths"]
    for p in (
        "/v1/assets",
        "/v1/assets/{asset_id}",
        "/v1/assets/{asset_id}/thumbnail",
        "/v1/assets/{asset_id}/meta",
        "/v1/entities/{entity_id}/evidence",
    ):
        assert p in paths, f"{p} missing from /v1 alias set"


def test_wiki_renders_evidence_strip(assets_env):
    from atocore.engineering.wiki import render_entity

    asset = store_asset(data=_png_bytes((10, 10, 10)), mime_type="image/png")
    artifact = create_entity(
        entity_type="artifact",
        name="cap-ev-01",
        properties={
            "kind": "image",
            "asset_id": asset.id,
            "caption": "viewport",
        },
    )
    tower = create_entity(entity_type="component", name="tower-wiki")
    create_relationship(
        source_entity_id=tower.id,
        target_entity_id=artifact.id,
        relationship_type="evidenced_by",
    )

    html = render_entity(tower.id)
    assert "Visual evidence" in html
    assert f"/assets/{asset.id}/thumbnail" in html
    assert "viewport" in html


def test_wiki_renders_artifact_full_image(assets_env):
    from atocore.engineering.wiki import render_entity

    asset = store_asset(data=_png_bytes((11, 11, 11)), mime_type="image/png")
    artifact = create_entity(
        entity_type="artifact",
        name="cap-full-01",
        properties={
            "kind": "image",
            "asset_id": asset.id,
            "caption": "detail shot",
            "capture_context": "narrator: here's the base plate close-up",
        },
    )
    html = render_entity(artifact.id)
    assert f"/assets/{asset.id}/thumbnail?size=1024" in html
    assert "Capture context" in html
    assert "narrator" in html
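The idempotency that ``test_store_asset_is_idempotent_on_hash`` exercises rests on content addressing: the SHA-256 of the bytes is the asset's identity, so re-uploading identical bytes cannot create a second row. A minimal sketch of that mechanism, with a dict standing in for the ``assets`` table (names here are illustrative, not AtoCore's API):

```python
import hashlib

catalog: dict[str, dict] = {}

def store(data: bytes, caption: str = "") -> str:
    """Content-addressed store: same bytes always map to the same key."""
    key = hashlib.sha256(data).hexdigest()
    if key not in catalog:  # first upload wins; a re-upload is a no-op
        catalog[key] = {"size_bytes": len(data), "caption": caption}
    return key

a = store(b"png-bytes", caption="red square")
b = store(b"png-bytes", caption="different caption")
print(a == b, len(catalog))  # True 1 — dedup to one catalog row
```

Note that, as in the real test, the second caption is silently ignored: the row created by the first upload is kept as-is.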
251 tests/test_confidence_decay.py Normal file
@@ -0,0 +1,251 @@
|
||||
"""Phase 7D — confidence decay tests.
|
||||
|
||||
Covers:
|
||||
- idle unreferenced memories decay at the expected rate
|
||||
- fresh / reinforced memories are untouched
|
||||
- below floor → auto-supersede with audit
|
||||
- graduated memories exempt
|
||||
- reinforcement reverses decay (integration with Phase 9 Commit B)
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from datetime import datetime, timedelta, timezone
|
||||
|
||||
import pytest
|
||||
|
||||
from atocore.memory.service import (
|
||||
create_memory,
|
||||
decay_unreferenced_memories,
|
||||
get_memory_audit,
|
||||
reinforce_memory,
|
||||
)
|
||||
from atocore.models.database import get_connection, init_db
|
||||
|
||||
|
||||
def _force_old(mem_id: str, days_ago: int) -> None:
|
||||
"""Force last_referenced_at and created_at to N days in the past."""
|
||||
ts = (datetime.now(timezone.utc) - timedelta(days=days_ago)).strftime("%Y-%m-%d %H:%M:%S")
|
||||
with get_connection() as conn:
|
||||
conn.execute(
|
||||
"UPDATE memories SET last_referenced_at = ?, created_at = ? WHERE id = ?",
|
||||
(ts, ts, mem_id),
|
||||
)
|
||||
|
||||
|
||||
def _set_confidence(mem_id: str, c: float) -> None:
|
||||
with get_connection() as conn:
|
||||
conn.execute("UPDATE memories SET confidence = ? WHERE id = ?", (c, mem_id))
|
||||
|
||||
|
||||
def _set_reference_count(mem_id: str, n: int) -> None:
|
||||
with get_connection() as conn:
|
||||
conn.execute("UPDATE memories SET reference_count = ? WHERE id = ?", (n, mem_id))
|
||||
|
||||
|
||||
def _get(mem_id: str) -> dict:
|
||||
with get_connection() as conn:
|
||||
row = conn.execute("SELECT * FROM memories WHERE id = ?", (mem_id,)).fetchone()
|
||||
return dict(row) if row else {}
|
||||
|
||||
|
||||
def _set_status(mem_id: str, status: str) -> None:
|
||||
with get_connection() as conn:
|
||||
conn.execute("UPDATE memories SET status = ? WHERE id = ?", (status, mem_id))
|
||||
|
||||
|
||||
# --- Basic decay mechanics ---
|
||||
|
||||
|
||||
def test_decay_applies_to_idle_unreferenced(tmp_data_dir):
|
||||
init_db()
|
||||
m = create_memory("knowledge", "cold fact", confidence=0.8)
|
||||
_force_old(m.id, days_ago=60)
|
||||
_set_reference_count(m.id, 0)
|
||||
|
||||
result = decay_unreferenced_memories()
|
||||
assert len(result["decayed"]) == 1
|
||||
assert result["decayed"][0]["memory_id"] == m.id
|
||||
|
||||
row = _get(m.id)
|
||||
# 0.8 * 0.97 = 0.776
|
||||
assert row["confidence"] == pytest.approx(0.776)
|
||||
assert row["status"] == "active" # still above floor
|
||||
|
||||
|
||||
def test_decay_skips_fresh_memory(tmp_data_dir):
|
||||
"""A memory created today shouldn't decay even if reference_count=0."""
|
||||
init_db()
|
||||
m = create_memory("knowledge", "just-created fact", confidence=0.8)
|
||||
# Don't force old — it's fresh
|
||||
result = decay_unreferenced_memories()
|
||||
assert not any(e["memory_id"] == m.id for e in result["decayed"])
|
||||
assert not any(e["memory_id"] == m.id for e in result["superseded"])
|
||||
|
||||
row = _get(m.id)
|
||||
assert row["confidence"] == pytest.approx(0.8)
|
||||
|
||||
|
||||
def test_decay_skips_reinforced_memory(tmp_data_dir):
|
||||
"""Any reinforcement protects the memory from decay."""
|
||||
init_db()
|
||||
m = create_memory("knowledge", "referenced fact", confidence=0.8)
|
||||
_force_old(m.id, days_ago=90)
|
||||
_set_reference_count(m.id, 1) # just one reference is enough
|
||||
|
||||
result = decay_unreferenced_memories()
|
||||
assert not any(e["memory_id"] == m.id for e in result["decayed"])
|
||||
|
||||
row = _get(m.id)
|
||||
assert row["confidence"] == pytest.approx(0.8)
|
||||
|
||||
|
||||
# --- Auto-supersede at floor ---
|
||||
|
||||
|
||||
def test_decay_supersedes_below_floor(tmp_data_dir):
|
||||
init_db()
|
||||
m = create_memory("knowledge", "very cold fact", confidence=0.31)
|
||||
_force_old(m.id, days_ago=60)
|
||||
_set_reference_count(m.id, 0)
|
||||
|
||||
# 0.31 * 0.97 = 0.3007 which is still above the default floor 0.30.
|
||||
# Drop it a hair lower to cross the floor in one step.
|
||||
_set_confidence(m.id, 0.305)
|
||||
|
||||
result = decay_unreferenced_memories(supersede_confidence_floor=0.30)
|
||||
# 0.305 * 0.97 = 0.29585 → below 0.30, supersede
|
||||
assert len(result["superseded"]) == 1
|
||||
assert result["superseded"][0]["memory_id"] == m.id
|
||||
|
||||
row = _get(m.id)
|
||||
assert row["status"] == "superseded"
|
||||
assert row["confidence"] < 0.30
|
||||
|
||||
|
||||
def test_supersede_writes_audit_row(tmp_data_dir):
|
||||
init_db()
|
||||
m = create_memory("knowledge", "will decay out", confidence=0.305)
|
||||
_force_old(m.id, days_ago=60)
|
||||
_set_reference_count(m.id, 0)
|
||||
|
||||
decay_unreferenced_memories(supersede_confidence_floor=0.30)
|
||||
|
||||
audit = get_memory_audit(m.id)
|
||||
actions = [a["action"] for a in audit]
|
||||
assert "superseded" in actions
|
||||
entry = next(a for a in audit if a["action"] == "superseded")
|
||||
assert entry["actor"] == "confidence-decay"
|
||||
assert "decayed below floor" in entry["note"]
|
||||
|
||||
|
||||
# --- Exemptions ---
|
||||
|
||||
|
||||
def test_decay_skips_graduated_memory(tmp_data_dir):
|
||||
"""Graduated memories are frozen pointers to entities — never decay."""
|
||||
init_db()
|
||||
m = create_memory("knowledge", "graduated fact", confidence=0.8)
|
||||
_force_old(m.id, days_ago=90)
|
||||
_set_reference_count(m.id, 0)
|
||||
_set_status(m.id, "graduated")
|
||||
|
||||
result = decay_unreferenced_memories()
|
||||
assert not any(e["memory_id"] == m.id for e in result["decayed"])
|
||||
|
||||
row = _get(m.id)
|
||||
assert row["confidence"] == pytest.approx(0.8) # unchanged
|
||||
|
||||
|
||||
def test_decay_skips_superseded_memory(tmp_data_dir):
|
||||
"""Already superseded memories don't decay further."""
|
||||
init_db()
|
||||
m = create_memory("knowledge", "old news", confidence=0.5)
|
||||
_force_old(m.id, days_ago=90)
|
||||
_set_reference_count(m.id, 0)
|
||||
_set_status(m.id, "superseded")
|
||||
|
||||
result = decay_unreferenced_memories()
|
||||
assert not any(e["memory_id"] == m.id for e in result["decayed"])
|
||||
|
||||
|
||||
# --- Reversibility ---
|
||||
|
||||
|
||||
def test_reinforcement_reverses_decay(tmp_data_dir):
|
||||
"""A memory that decayed then got reinforced comes back up."""
|
||||
init_db()
|
||||
m = create_memory("knowledge", "will come back", confidence=0.8)
|
||||
_force_old(m.id, days_ago=60)
|
||||
_set_reference_count(m.id, 0)
|
||||
|
||||
decay_unreferenced_memories()
|
||||
# Now at 0.776
|
||||
reinforce_memory(m.id, confidence_delta=0.05)
|
||||
row = _get(m.id)
|
||||
assert row["confidence"] == pytest.approx(0.826)
|
||||
assert row["reference_count"] >= 1
|
||||
|
||||
|
||||
def test_reinforced_memory_no_longer_decays(tmp_data_dir):
|
||||
"""Once reinforce_memory bumps reference_count, decay skips it."""
|
||||
init_db()
|
||||
m = create_memory("knowledge", "protected", confidence=0.8)
|
||||
_force_old(m.id, days_ago=90)
|
||||
# Simulate reinforcement
|
||||
reinforce_memory(m.id)
|
||||
|
||||
result = decay_unreferenced_memories()
|
||||
assert not any(e["memory_id"] == m.id for e in result["decayed"])
|
||||
|
||||
|
||||
# --- Parameter validation ---
|
||||
|
||||
|
||||
def test_decay_rejects_invalid_factor(tmp_data_dir):
|
||||
init_db()
|
||||
with pytest.raises(ValueError):
|
||||
decay_unreferenced_memories(daily_decay_factor=1.0)
|
||||
with pytest.raises(ValueError):
|
||||
decay_unreferenced_memories(daily_decay_factor=0.0)
|
||||
with pytest.raises(ValueError):
|
||||
decay_unreferenced_memories(daily_decay_factor=-0.5)
|
||||
|
||||
|
||||
def test_decay_rejects_invalid_floor(tmp_data_dir):
|
||||
init_db()
|
||||
with pytest.raises(ValueError):
|
||||
decay_unreferenced_memories(supersede_confidence_floor=1.5)
|
||||
with pytest.raises(ValueError):
|
||||
decay_unreferenced_memories(supersede_confidence_floor=-0.1)
|
||||
|
||||
|
||||
# --- Threshold tuning ---
|
||||
|
||||
|
||||
def test_decay_threshold_tight_excludes_newer(tmp_data_dir):
|
||||
"""With idle_days_threshold=90, a 60-day-old memory should NOT decay."""
|
||||
init_db()
|
||||
m = create_memory("knowledge", "60-day-old", confidence=0.8)
|
||||
_force_old(m.id, days_ago=60)
|
||||
_set_reference_count(m.id, 0)
|
||||
|
||||
result = decay_unreferenced_memories(idle_days_threshold=90)
|
||||
assert not any(e["memory_id"] == m.id for e in result["decayed"])
|
||||
|
||||
|
||||
# --- Idempotency-ish (multiple runs apply additional decay) ---
|
||||
|
||||
|
||||
def test_decay_stacks_across_runs(tmp_data_dir):
|
||||
"""Running decay twice (simulating two days) compounds the factor."""
|
||||
init_db()
|
||||
m = create_memory("knowledge", "aging fact", confidence=0.8)
|
||||
_force_old(m.id, days_ago=60)
|
||||
_set_reference_count(m.id, 0)
|
||||
|
||||
decay_unreferenced_memories()
|
||||
decay_unreferenced_memories()
|
||||
row = _get(m.id)
|
||||
# 0.8 * 0.97 * 0.97 = 0.75272
|
||||
assert row["confidence"] == pytest.approx(0.75272, rel=1e-4)
|
||||
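The compounding the last test asserts is plain repeated multiplication; a standalone sketch (the 0.97 daily factor and 0.8 starting confidence mirror the tests above):

```python
def apply_decay(confidence: float, runs: int, daily_factor: float = 0.97) -> float:
    """Compound a multiplicative decay factor over N runs (one run ~ one day)."""
    for _ in range(runs):
        confidence *= daily_factor
    return confidence

print(round(apply_decay(0.8, 1), 5))  # one run:  0.776
print(round(apply_decay(0.8, 2), 5))  # two runs: 0.75272
```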
@@ -116,3 +116,108 @@ def test_entity_name_search(tmp_data_dir):

    results = get_entities(name_contains="Support")
    assert len(results) == 2


# --- Phase 5: Entity promote/reject lifecycle + audit + canonicalization ---


def test_entity_project_canonicalization(tmp_data_dir):
    """Aliases resolve to canonical project_id on write (Phase 5)."""
    init_db()
    init_engineering_schema()
    # "p04" is a registered alias for p04-gigabit
    e = create_entity("component", "Test Component", project="p04")
    assert e.project == "p04-gigabit"


def test_promote_entity_candidate_to_active(tmp_data_dir):
    from atocore.engineering.service import promote_entity, get_entity

    init_db()
    init_engineering_schema()
    e = create_entity("requirement", "CTE tolerance", status="candidate")
    assert e.status == "candidate"

    assert promote_entity(e.id, actor="test-triage")
    e2 = get_entity(e.id)
    assert e2.status == "active"


def test_reject_entity_candidate(tmp_data_dir):
    from atocore.engineering.service import reject_entity_candidate, get_entity

    init_db()
    init_engineering_schema()
    e = create_entity("decision", "pick vendor Y", status="candidate")

    assert reject_entity_candidate(e.id, actor="test-triage", note="duplicate")
    e2 = get_entity(e.id)
    assert e2.status == "invalid"


def test_promote_active_entity_noop(tmp_data_dir):
    from atocore.engineering.service import promote_entity

    init_db()
    init_engineering_schema()
    e = create_entity("component", "Already Active")  # default status=active
    assert not promote_entity(e.id)  # only candidates can promote


def test_entity_audit_log_captures_lifecycle(tmp_data_dir):
    from atocore.engineering.service import (
        promote_entity,
        get_entity_audit,
    )

    init_db()
    init_engineering_schema()
    e = create_entity("requirement", "test req", status="candidate", actor="test")
    promote_entity(e.id, actor="test-triage", note="looks good")

    audit = get_entity_audit(e.id)
    actions = [a["action"] for a in audit]
    assert "created" in actions
    assert "promoted" in actions

    promote_entry = next(a for a in audit if a["action"] == "promoted")
    assert promote_entry["actor"] == "test-triage"
    assert promote_entry["note"] == "looks good"
    assert promote_entry["before"]["status"] == "candidate"
    assert promote_entry["after"]["status"] == "active"


def test_new_relationship_types_available(tmp_data_dir):
    """Phase 5 added 6 missing relationship types."""
    for rel in ["based_on_assumption", "supports", "conflicts_with",
                "updated_by_session", "evidenced_by", "summarized_in"]:
        assert rel in RELATIONSHIP_TYPES, f"{rel} missing from RELATIONSHIP_TYPES"


def test_conflicts_tables_exist(tmp_data_dir):
    """Phase 5 conflict-model tables."""
    from atocore.models.database import get_connection

    init_db()
    with get_connection() as conn:
        tables = {r[0] for r in conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table'"
        ).fetchall()}
    assert "conflicts" in tables
    assert "conflict_members" in tables


def test_memory_audit_has_entity_kind(tmp_data_dir):
    """Phase 5 added entity_kind discriminator."""
    from atocore.models.database import get_connection

    init_db()
    with get_connection() as conn:
        cols = {r["name"] for r in conn.execute("PRAGMA table_info(memory_audit)").fetchall()}
    assert "entity_kind" in cols


def test_graduated_status_accepted(tmp_data_dir):
    """Phase 5 added 'graduated' memory status for memory→entity transitions."""
    from atocore.memory.service import MEMORY_STATUSES
    assert "graduated" in MEMORY_STATUSES
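The lifecycle these tests pin down is a tiny state machine: only candidates can be promoted (to active) or rejected (to invalid); anything else is a no-op. A toy sketch of just the transitions (hypothetical stand-in, not atocore's service layer):

```python
def promote(status: str) -> tuple[bool, str]:
    """Candidate → active; anything else is a no-op (returns False)."""
    return (True, "active") if status == "candidate" else (False, status)

def reject(status: str) -> tuple[bool, str]:
    """Candidate → invalid; anything else is a no-op (returns False)."""
    return (True, "invalid") if status == "candidate" else (False, status)

print(promote("candidate"))  # (True, 'active')
print(promote("active"))     # (False, 'active') — matches test_promote_active_entity_noop
print(reject("candidate"))   # (True, 'invalid')
```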
212
tests/test_engineering_queries.py
Normal file
@@ -0,0 +1,212 @@
"""Phase 5 tests — the 10 canonical engineering queries.

Test fixtures seed a small p-test graph and exercise each query. The 3 killer
queries (Q-006/009/011) get dedicated tests that verify they surface real gaps
and DON'T false-positive on well-formed data.
"""

from __future__ import annotations

import pytest

from atocore.engineering.queries import (
    all_gaps,
    decisions_affecting,
    evidence_chain,
    impact_analysis,
    orphan_requirements,
    recent_changes,
    requirements_for,
    risky_decisions,
    system_map,
    unsupported_claims,
)
from atocore.engineering.service import (
    create_entity,
    create_relationship,
    init_engineering_schema,
)
from atocore.models.database import init_db


@pytest.fixture
def seeded_graph(tmp_data_dir):
    """Build a small engineering graph for query tests."""
    init_db()
    init_engineering_schema()

    # Subsystem + components
    ss = create_entity("subsystem", "Optics", project="p-test")
    c1 = create_entity("component", "Primary Mirror", project="p-test")
    c2 = create_entity("component", "Diverger Lens", project="p-test")
    c_orphan = create_entity("component", "Unparented", project="p-test")
    create_relationship(c1.id, ss.id, "part_of")
    create_relationship(c2.id, ss.id, "part_of")

    # Requirements — one satisfied, one orphan
    r_ok = create_entity("requirement", "Surface figure < 25nm RMS", project="p-test")
    r_orphan = create_entity("requirement", "Measurement lambda/20", project="p-test")
    create_relationship(c1.id, r_ok.id, "satisfies")

    # Decisions
    d_ok = create_entity("decision", "Use Zerodur blank", project="p-test")
    d_risky = create_entity("decision", "Use external CGH", project="p-test")
    create_relationship(d_ok.id, ss.id, "affected_by_decision")

    # Assumption (flagged) — d_risky depends on it
    a_flagged = create_entity(
        "parameter", "Vendor lead time 6 weeks",
        project="p-test",
        properties={"flagged": True},
    )
    create_relationship(d_risky.id, a_flagged.id, "based_on_assumption")

    # Validation claims — one supported, one not
    v_ok = create_entity("validation_claim", "Margin is adequate", project="p-test")
    v_orphan = create_entity("validation_claim", "Thermal stability OK", project="p-test")
    result = create_entity("result", "FEA thermal sweep 2026-03", project="p-test")
    create_relationship(result.id, v_ok.id, "supports")

    # Material
    mat = create_entity("material", "Zerodur", project="p-test")
    create_relationship(c1.id, mat.id, "uses_material")

    return {
        "subsystem": ss, "component_1": c1, "component_2": c2,
        "orphan_component": c_orphan,
        "req_ok": r_ok, "req_orphan": r_orphan,
        "decision_ok": d_ok, "decision_risky": d_risky,
        "assumption_flagged": a_flagged,
        "claim_supported": v_ok, "claim_orphan": v_orphan,
        "result": result, "material": mat,
    }


# --- Structure queries ---


def test_system_map_returns_subsystem_with_components(seeded_graph):
    result = system_map("p-test")
    assert result["project"] == "p-test"
    assert len(result["subsystems"]) == 1
    optics = result["subsystems"][0]
    assert optics["name"] == "Optics"
    comp_names = {c["name"] for c in optics["components"]}
    assert "Primary Mirror" in comp_names
    assert "Diverger Lens" in comp_names


def test_system_map_reports_orphan_components(seeded_graph):
    result = system_map("p-test")
    names = {c["name"] for c in result["orphan_components"]}
    assert "Unparented" in names


def test_system_map_includes_materials(seeded_graph):
    result = system_map("p-test")
    primary = next(
        c for s in result["subsystems"] for c in s["components"] if c["name"] == "Primary Mirror"
    )
    assert "Zerodur" in primary["materials"]


def test_decisions_affecting_whole_project(seeded_graph):
    result = decisions_affecting("p-test")
    names = {d["name"] for d in result["decisions"]}
    assert "Use Zerodur blank" in names
    assert "Use external CGH" in names


def test_decisions_affecting_specific_subsystem(seeded_graph):
    ss_id = seeded_graph["subsystem"].id
    result = decisions_affecting("p-test", subsystem_id=ss_id)
    names = {d["name"] for d in result["decisions"]}
    # d_ok has an edge to the subsystem directly
    assert "Use Zerodur blank" in names


def test_requirements_for_component(seeded_graph):
    c_id = seeded_graph["component_1"].id
    result = requirements_for(c_id)
    assert result["count"] == 1
    assert result["requirements"][0]["name"] == "Surface figure < 25nm RMS"


def test_recent_changes_includes_created_entities(seeded_graph):
    result = recent_changes("p-test", limit=100)
    actions = [c["action"] for c in result["changes"]]
    assert "created" in actions
    assert result["count"] > 0


# --- Killer queries ---


def test_orphan_requirements_finds_unsatisfied(seeded_graph):
    result = orphan_requirements("p-test")
    names = {r["name"] for r in result["gaps"]}
    assert "Measurement lambda/20" in names  # orphan
    assert "Surface figure < 25nm RMS" not in names  # has SATISFIES edge


def test_orphan_requirements_empty_when_all_satisfied(tmp_data_dir):
    init_db()
    init_engineering_schema()
    c = create_entity("component", "C", project="p-clean")
    r = create_entity("requirement", "R", project="p-clean")
    create_relationship(c.id, r.id, "satisfies")
    result = orphan_requirements("p-clean")
    assert result["count"] == 0


def test_risky_decisions_finds_flagged_assumptions(seeded_graph):
    result = risky_decisions("p-test")
    names = {d["decision_name"] for d in result["gaps"]}
    assert "Use external CGH" in names
    assert "Use Zerodur blank" not in names  # has no flagged assumption


def test_unsupported_claims_finds_orphan_claims(seeded_graph):
    result = unsupported_claims("p-test")
    names = {c["name"] for c in result["gaps"]}
    assert "Thermal stability OK" in names
    assert "Margin is adequate" not in names  # has SUPPORTS edge


def test_all_gaps_combines_the_three_killers(seeded_graph):
    result = all_gaps("p-test")
    assert result["orphan_requirements"]["count"] == 1
    assert result["risky_decisions"]["count"] == 1
    assert result["unsupported_claims"]["count"] == 1


def test_all_gaps_clean_project_reports_zero(tmp_data_dir):
    init_db()
    init_engineering_schema()
    create_entity("component", "alone", project="p-empty")
    result = all_gaps("p-empty")
    assert result["orphan_requirements"]["count"] == 0
    assert result["risky_decisions"]["count"] == 0
    assert result["unsupported_claims"]["count"] == 0


# --- Impact + evidence ---


def test_impact_analysis_walks_outbound_edges(seeded_graph):
    c_id = seeded_graph["component_1"].id
    result = impact_analysis(c_id, max_depth=2)
    # Primary Mirror → SATISFIES → Requirement, → USES_MATERIAL → Material
    rel_types = {i["relationship"] for i in result["impacted"]}
    assert "satisfies" in rel_types
    assert "uses_material" in rel_types


def test_evidence_chain_walks_inbound_provenance(seeded_graph):
    v_ok_id = seeded_graph["claim_supported"].id
    result = evidence_chain(v_ok_id)
    # The Result entity supports the claim
    via_types = {e["via"] for e in result["evidence_chain"]}
    assert "supports" in via_types
    source_names = {e["source_name"] for e in result["evidence_chain"]}
    assert "FEA thermal sweep 2026-03" in source_names
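Under the hood, a query like `orphan_requirements()` is an anti-join: requirements with no inbound `satisfies` edge. A sketch of that SQL pattern against a hypothetical two-table schema (table and column names here are illustrative, not atocore's actual schema):

```python
import sqlite3

# Minimal graph: one satisfied requirement, one orphan.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entities (id INTEGER PRIMARY KEY, type TEXT, name TEXT);
CREATE TABLE relationships (source_id INT, target_id INT, rel_type TEXT);
INSERT INTO entities VALUES (1, 'component', 'Primary Mirror'),
                            (2, 'requirement', 'Surface figure < 25nm RMS'),
                            (3, 'requirement', 'Measurement lambda/20');
INSERT INTO relationships VALUES (1, 2, 'satisfies');
""")

# Anti-join: requirements lacking any inbound "satisfies" relationship.
orphans = conn.execute("""
    SELECT e.name FROM entities e
    WHERE e.type = 'requirement'
      AND NOT EXISTS (SELECT 1 FROM relationships r
                      WHERE r.target_id = e.id AND r.rel_type = 'satisfies')
""").fetchall()
print(orphans)  # [('Measurement lambda/20',)]
```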
246
tests/test_engineering_v1_phase5.py
Normal file
@@ -0,0 +1,246 @@
"""Phase 5F + 5G + 5H tests — graduation, conflicts, MCP tools."""

from __future__ import annotations

import pytest

from atocore.engineering.conflicts import (
    detect_conflicts_for_entity,
    list_open_conflicts,
    resolve_conflict,
)
from atocore.engineering._graduation_prompt import (
    build_user_message,
    parse_graduation_output,
)
from atocore.engineering.service import (
    create_entity,
    create_relationship,
    get_entity,
    init_engineering_schema,
    promote_entity,
)
from atocore.memory.service import create_memory
from atocore.models.database import get_connection, init_db


# --- 5F Memory graduation ---


def test_graduation_prompt_parses_positive_decision():
    raw = """
    {"graduate": true, "entity_type": "component", "name": "Primary Mirror",
     "description": "The 1.2m primary mirror for p04", "confidence": 0.85,
     "relationships": [{"rel_type": "part_of", "target_hint": "Optics Subsystem"}]}
    """
    decision = parse_graduation_output(raw)
    assert decision is not None
    assert decision["graduate"] is True
    assert decision["entity_type"] == "component"
    assert decision["name"] == "Primary Mirror"
    assert decision["confidence"] == 0.85
    assert decision["relationships"] == [
        {"rel_type": "part_of", "target_hint": "Optics Subsystem"}
    ]


def test_graduation_prompt_parses_negative_decision():
    raw = '{"graduate": false, "reason": "conversational filler, no typed entity"}'
    decision = parse_graduation_output(raw)
    assert decision is not None
    assert decision["graduate"] is False
    assert "filler" in decision["reason"]


def test_graduation_prompt_rejects_unknown_entity_type():
    raw = '{"graduate": true, "entity_type": "quantum_thing", "name": "x"}'
    assert parse_graduation_output(raw) is None


def test_graduation_prompt_tolerates_markdown_fences():
    raw = '```json\n{"graduate": false, "reason": "ok"}\n```'
    d = parse_graduation_output(raw)
    assert d is not None
    assert d["graduate"] is False


def test_promote_entity_marks_source_memory_graduated(tmp_data_dir):
    init_db()
    init_engineering_schema()
    mem = create_memory("knowledge", "The Primary Mirror is 1.2m Zerodur",
                        project="p-test", status="active")
    # Create entity candidate pointing back to the memory
    ent = create_entity(
        "component",
        "Primary Mirror",
        project="p-test",
        status="candidate",
        source_refs=[f"memory:{mem.id}"],
    )
    # Promote
    assert promote_entity(ent.id, actor="test-triage")

    # Memory should now be graduated with a forward pointer
    with get_connection() as conn:
        row = conn.execute(
            "SELECT status, graduated_to_entity_id FROM memories WHERE id = ?",
            (mem.id,),
        ).fetchone()
    assert row["status"] == "graduated"
    assert row["graduated_to_entity_id"] == ent.id


def test_promote_entity_without_memory_refs_no_graduation(tmp_data_dir):
    """Entity not backed by any memory — promote still works, no graduation."""
    init_db()
    init_engineering_schema()
    ent = create_entity("component", "Orphan", project="p-test", status="candidate")
    assert promote_entity(ent.id)
    assert get_entity(ent.id).status == "active"


# --- 5G Conflict detection ---


def test_component_material_conflict_detected(tmp_data_dir):
    init_db()
    init_engineering_schema()
    c = create_entity("component", "Mirror", project="p-test")
    m1 = create_entity("material", "Zerodur", project="p-test")
    m2 = create_entity("material", "ULE", project="p-test")
    create_relationship(c.id, m1.id, "uses_material")
    create_relationship(c.id, m2.id, "uses_material")

    detected = detect_conflicts_for_entity(c.id)
    assert len(detected) == 1

    conflicts = list_open_conflicts(project="p-test")
    assert any(c["slot_kind"] == "component.material" for c in conflicts)
    conflict = next(c for c in conflicts if c["slot_kind"] == "component.material")
    assert len(conflict["members"]) == 2


def test_component_part_of_conflict_detected(tmp_data_dir):
    init_db()
    init_engineering_schema()
    c = create_entity("component", "MultiPart", project="p-test")
    s1 = create_entity("subsystem", "Mechanical", project="p-test")
    s2 = create_entity("subsystem", "Optical", project="p-test")
    create_relationship(c.id, s1.id, "part_of")
    create_relationship(c.id, s2.id, "part_of")

    detected = detect_conflicts_for_entity(c.id)
    assert len(detected) == 1
    conflicts = list_open_conflicts(project="p-test")
    assert any(c["slot_kind"] == "component.part_of" for c in conflicts)


def test_requirement_name_conflict_detected(tmp_data_dir):
    init_db()
    init_engineering_schema()
    r1 = create_entity("requirement", "Surface figure < 25nm",
                       project="p-test", description="Primary mirror spec")
    r2 = create_entity("requirement", "Surface figure < 25nm",
                       project="p-test", description="Different interpretation")

    detected = detect_conflicts_for_entity(r2.id)
    assert len(detected) == 1
    conflicts = list_open_conflicts(project="p-test")
    assert any(c["slot_kind"] == "requirement.name" for c in conflicts)


def test_conflict_not_detected_for_clean_component(tmp_data_dir):
    init_db()
    init_engineering_schema()
    c = create_entity("component", "Clean", project="p-test")
    m = create_entity("material", "Zerodur", project="p-test")
    create_relationship(c.id, m.id, "uses_material")

    detected = detect_conflicts_for_entity(c.id)
    assert detected == []


def test_conflict_resolution_supersedes_losers(tmp_data_dir):
    init_db()
    init_engineering_schema()
    c = create_entity("component", "Mirror2", project="p-test")
    m1 = create_entity("material", "Zerodur2", project="p-test")
    m2 = create_entity("material", "ULE2", project="p-test")
    create_relationship(c.id, m1.id, "uses_material")
    create_relationship(c.id, m2.id, "uses_material")

    detected = detect_conflicts_for_entity(c.id)
    conflict_id = detected[0]

    # Resolve by picking m1 as the winner
    assert resolve_conflict(conflict_id, "supersede_others", winner_id=m1.id)

    # m2 should now be superseded; m1 stays active
    assert get_entity(m1.id).status == "active"
    assert get_entity(m2.id).status == "superseded"

    # Conflict should be marked resolved
    open_conflicts = list_open_conflicts(project="p-test")
    assert not any(c["id"] == conflict_id for c in open_conflicts)


def test_conflict_resolution_dismiss_leaves_entities_alone(tmp_data_dir):
    init_db()
    init_engineering_schema()
    r1 = create_entity("requirement", "Dup req", project="p-test",
                       description="first meaning")
    r2 = create_entity("requirement", "Dup req", project="p-test",
                       description="second meaning")
    detected = detect_conflicts_for_entity(r2.id)
    conflict_id = detected[0]

    assert resolve_conflict(conflict_id, "dismiss")
    # Both still active — dismiss just clears the conflict marker
    assert get_entity(r1.id).status == "active"
    assert get_entity(r2.id).status == "active"


def test_deduplicate_conflicts_for_same_slot(tmp_data_dir):
    """Running detection twice on the same entity shouldn't dup the conflict row."""
    init_db()
    init_engineering_schema()
    c = create_entity("component", "Dup", project="p-test")
    m1 = create_entity("material", "A", project="p-test")
    m2 = create_entity("material", "B", project="p-test")
    create_relationship(c.id, m1.id, "uses_material")
    create_relationship(c.id, m2.id, "uses_material")

    detect_conflicts_for_entity(c.id)
    detect_conflicts_for_entity(c.id)  # should be a no-op

    conflicts = list_open_conflicts(project="p-test")
    mat_conflicts = [c for c in conflicts if c["slot_kind"] == "component.material"]
    assert len(mat_conflicts) == 1


def test_promote_triggers_conflict_detection(tmp_data_dir):
    """End-to-end: promoting a candidate component with 2 active material edges
    triggers conflict detection."""
    init_db()
    init_engineering_schema()

    c = create_entity("component", "AutoFlag", project="p-test", status="candidate")
    m1 = create_entity("material", "X1", project="p-test")
    m2 = create_entity("material", "X2", project="p-test")
    create_relationship(c.id, m1.id, "uses_material")
    create_relationship(c.id, m2.id, "uses_material")

    promote_entity(c.id, actor="test")

    conflicts = list_open_conflicts(project="p-test")
    assert any(c["slot_kind"] == "component.material" for c in conflicts)


# --- 5H MCP tool shape checks (via build_user_message) ---


def test_graduation_user_message_includes_project_and_type():
    msg = build_user_message("some content", "p04-gigabit", "project")
    assert "p04-gigabit" in msg
    assert "project" in msg
    assert "some content" in msg
201
tests/test_inbox_crossproject.py
Normal file
@@ -0,0 +1,201 @@
"""Issue C — inbox pseudo-project + cross-project (project="") entities."""

import pytest
from fastapi.testclient import TestClient

from atocore.engineering.service import (
    create_entity,
    get_entities,
    init_engineering_schema,
    promote_entity,
)
from atocore.main import app
from atocore.projects.registry import (
    GLOBAL_PROJECT,
    INBOX_PROJECT,
    is_reserved_project,
    register_project,
    resolve_project_name,
    update_project,
)


@pytest.fixture
def seeded_db(tmp_data_dir, tmp_path, monkeypatch):
    # Isolate the project registry so "p05" etc. don't canonicalize
    # to aliases inherited from the host registry.
    registry_path = tmp_path / "test-registry.json"
    registry_path.write_text('{"projects": []}', encoding="utf-8")
    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
    from atocore import config
    config.settings = config.Settings()

    init_engineering_schema()
    # Audit table lives in the memory schema — bring it up so audit rows
    # don't spam warnings during retargeting tests.
    from atocore.models.database import init_db
    init_db()
    yield tmp_data_dir


def test_inbox_is_reserved():
    assert is_reserved_project("inbox") is True
    assert is_reserved_project("INBOX") is True
    assert is_reserved_project("p05-interferometer") is False
    assert is_reserved_project("") is False


def test_resolve_project_name_preserves_inbox():
    assert resolve_project_name("inbox") == "inbox"
    assert resolve_project_name("INBOX") == "inbox"
    assert resolve_project_name("") == ""


def test_cannot_register_inbox(tmp_path, monkeypatch):
    monkeypatch.setenv(
        "ATOCORE_PROJECT_REGISTRY_PATH",
        str(tmp_path / "registry.json"),
    )
    from atocore import config
    config.settings = config.Settings()

    with pytest.raises(ValueError, match="reserved"):
        register_project(
            project_id="inbox",
            ingest_roots=[{"source": "vault", "subpath": "incoming/inbox"}],
        )


def test_cannot_update_inbox(tmp_path, monkeypatch):
    monkeypatch.setenv(
        "ATOCORE_PROJECT_REGISTRY_PATH",
        str(tmp_path / "registry.json"),
    )
    from atocore import config
    config.settings = config.Settings()

    with pytest.raises(ValueError, match="reserved"):
        update_project(project_name="inbox", description="hijack attempt")


def test_create_entity_with_empty_project_is_global(seeded_db):
    e = create_entity(entity_type="material", name="Invar", project="")
    assert e.project == ""


def test_create_entity_in_inbox(seeded_db):
    e = create_entity(entity_type="vendor", name="Zygo", project="inbox")
    assert e.project == "inbox"


def test_get_entities_inbox_scope(seeded_db):
    create_entity(entity_type="vendor", name="Zygo", project="inbox")
    create_entity(entity_type="material", name="Invar", project="")
    create_entity(entity_type="component", name="Mirror", project="p05")

    inbox = get_entities(project=INBOX_PROJECT, scope_only=True)
    assert {e.name for e in inbox} == {"Zygo"}


def test_get_entities_global_scope(seeded_db):
    create_entity(entity_type="vendor", name="Zygo", project="inbox")
    create_entity(entity_type="material", name="Invar", project="")
    create_entity(entity_type="component", name="Mirror", project="p05")

    globals_ = get_entities(project=GLOBAL_PROJECT, scope_only=True)
    assert {e.name for e in globals_} == {"Invar"}


def test_real_project_includes_global_by_default(seeded_db):
    create_entity(entity_type="material", name="Invar", project="")
    create_entity(entity_type="component", name="Mirror", project="p05")
    create_entity(entity_type="component", name="Other", project="p06")

    p05 = get_entities(project="p05")
    names = {e.name for e in p05}
    assert "Mirror" in names
    assert "Invar" in names, "cross-project material should bleed in by default"
    assert "Other" not in names


def test_real_project_scope_only_excludes_global(seeded_db):
    create_entity(entity_type="material", name="Invar", project="")
    create_entity(entity_type="component", name="Mirror", project="p05")

    p05 = get_entities(project="p05", scope_only=True)
    assert {e.name for e in p05} == {"Mirror"}


def test_api_post_entity_with_null_project_stores_global(seeded_db):
    client = TestClient(app)
    r = client.post("/entities", json={
        "entity_type": "material",
        "name": "Titanium",
        "project": None,
    })
    assert r.status_code == 200

    globals_ = get_entities(project=GLOBAL_PROJECT, scope_only=True)
    assert any(e.name == "Titanium" for e in globals_)


def test_api_get_entities_scope_only(seeded_db):
    create_entity(entity_type="material", name="Invar", project="")
    create_entity(entity_type="component", name="Mirror", project="p05")

    client = TestClient(app)
    mixed = client.get("/entities?project=p05").json()
    scoped = client.get("/entities?project=p05&scope_only=true").json()

    assert mixed["count"] == 2
    assert scoped["count"] == 1


def test_promote_with_target_project_retargets(seeded_db):
    e = create_entity(
        entity_type="vendor",
        name="ZygoLead",
        project="inbox",
        status="candidate",
    )
    ok = promote_entity(e.id, target_project="p05")
    assert ok is True

    from atocore.engineering.service import get_entity
    promoted = get_entity(e.id)
    assert promoted.status == "active"
    assert promoted.project == "p05"


def test_promote_without_target_project_keeps_project(seeded_db):
    e = create_entity(
        entity_type="vendor",
        name="ZygoStay",
        project="inbox",
        status="candidate",
    )
    ok = promote_entity(e.id)
    assert ok is True

    from atocore.engineering.service import get_entity
    promoted = get_entity(e.id)
    assert promoted.status == "active"
    assert promoted.project == "inbox"


def test_api_promote_with_target_project(seeded_db):
    e = create_entity(
        entity_type="vendor",
        name="ZygoApi",
        project="inbox",
        status="candidate",
    )
    client = TestClient(app)
    r = client.post(
        f"/entities/{e.id}/promote",
        json={"target_project": "p05"},
    )
    assert r.status_code == 200
    body = r.json()
    assert body["status"] == "promoted"
    assert body["target_project"] == "p05"
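The scope semantics these tests pin down reduce to one filter rule: a real project query includes global (`""`) entities by default, `scope_only` restricts to exact matches, and `inbox` and global are always exact scopes. A simplified in-memory stand-in of that rule (not atocore's query layer):

```python
def filter_scope(entities, project, scope_only=False):
    """Toy version of the project-scoping rule exercised above."""
    if scope_only or project in ("", "inbox"):
        return [e for e in entities if e["project"] == project]
    # Real projects see their own entities plus cross-project ("") ones.
    return [e for e in entities if e["project"] in (project, "")]

rows = [
    {"name": "Zygo", "project": "inbox"},
    {"name": "Invar", "project": ""},
    {"name": "Mirror", "project": "p05"},
]
print([e["name"] for e in filter_scope(rows, "p05")])                   # ['Invar', 'Mirror']
print([e["name"] for e in filter_scope(rows, "p05", scope_only=True)])  # ['Mirror']
print([e["name"] for e in filter_scope(rows, "inbox")])                 # ['Zygo']
```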
198
tests/test_inject_context_hook.py
Normal file
@@ -0,0 +1,198 @@
|
||||
"""Tests for deploy/hooks/inject_context.py — Claude Code UserPromptSubmit hook.

These are process-level tests: we run the actual script with subprocess,
feed it stdin, and check the exit code + stdout shape. The hook must:
- always exit 0 (never block a user prompt)
- emit valid hookSpecificOutput JSON on success
- fail open (empty output) on network errors, bad stdin, kill-switch
- respect the short-prompt filter
"""

from __future__ import annotations

import json
import os
import subprocess
import sys
from pathlib import Path

import pytest

HOOK = Path(__file__).resolve().parent.parent / "deploy" / "hooks" / "inject_context.py"


def _run_hook(stdin_json: dict | str, env_overrides: dict | None = None, timeout: float = 10) -> tuple[int, str, str]:
    env = os.environ.copy()
    # Force kill switch off unless the test overrides
    env.pop("ATOCORE_CONTEXT_DISABLED", None)
    if env_overrides:
        env.update(env_overrides)
    stdin = stdin_json if isinstance(stdin_json, str) else json.dumps(stdin_json)
    proc = subprocess.run(
        [sys.executable, str(HOOK)],
        input=stdin, text=True,
        capture_output=True, timeout=timeout,
        env=env,
    )
    return proc.returncode, proc.stdout, proc.stderr
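The contract these tests pin down can be sketched as a minimal hook skeleton. This is an illustrative reconstruction, not the shipped deploy/hooks/inject_context.py: the context fetch is passed in as a callable, the short-prompt threshold is an arbitrary placeholder, and only the fail-open and JSON-shape behavior asserted by the tests is modeled.

```python
import json
import sys


def emit_context(fetch):
    """Fail-open skeleton: read the prompt event from stdin, try to fetch
    context, and either print hookSpecificOutput JSON or print nothing.
    Always returns 0 so the user prompt is never blocked."""
    try:
        event = json.loads(sys.stdin.read())
        prompt = event.get("prompt", "")
        # Short or meta prompts: skip without any network call
        # (length threshold here is illustrative, not the real filter)
        if len(prompt) < 10 or prompt.startswith("<"):
            return 0
        pack = fetch(prompt)  # may raise on network errors -> fail open
        print(json.dumps({
            "hookSpecificOutput": {
                "hookEventName": "UserPromptSubmit",
                "additionalContext": pack,
            }
        }))
    except Exception as exc:
        # Errors go to stderr only; stdout stays empty (fail-open)
        print(f"hook failed: {exc}", file=sys.stderr)
    return 0
```

Whatever the real script does internally, the tests above only observe this surface: exit code, stdout shape, and stderr hints.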
def test_hook_exit_0_on_success_or_failure():
    """Canonical contract: the hook never blocks a prompt. Even with a
    bogus URL we must exit 0 with empty stdout (fail-open)."""
    code, stdout, stderr = _run_hook(
        {
            "prompt": "What's the p04-gigabit current status?",
            "cwd": "/tmp",
            "session_id": "t",
            "hook_event_name": "UserPromptSubmit",
        },
        env_overrides={"ATOCORE_URL": "http://127.0.0.1:1",  # unreachable
                       "ATOCORE_CONTEXT_TIMEOUT": "1"},
    )
    assert code == 0
    # stdout is empty (fail-open) — no hookSpecificOutput emitted
    assert stdout.strip() == ""
    assert "atocore unreachable" in stderr or "request failed" in stderr


def test_hook_kill_switch():
    code, stdout, stderr = _run_hook(
        {"prompt": "hello world is this a thing", "cwd": "", "session_id": "t"},
        env_overrides={"ATOCORE_CONTEXT_DISABLED": "1"},
    )
    assert code == 0
    assert stdout.strip() == ""


def test_hook_ignores_short_prompt():
    code, stdout, _ = _run_hook(
        {"prompt": "ok", "cwd": "", "session_id": "t"},
        env_overrides={"ATOCORE_URL": "http://127.0.0.1:1"},
    )
    assert code == 0
    # No network call attempted; empty output
    assert stdout.strip() == ""


def test_hook_ignores_xml_prompt():
    """System/meta prompts starting with '<' should be skipped."""
    code, stdout, _ = _run_hook(
        {"prompt": "<system>do something</system>", "cwd": "", "session_id": "t"},
        env_overrides={"ATOCORE_URL": "http://127.0.0.1:1"},
    )
    assert code == 0
    assert stdout.strip() == ""


def test_hook_handles_bad_stdin():
    code, stdout, stderr = _run_hook("not-json-at-all")
    assert code == 0
    assert stdout.strip() == ""
    assert "bad stdin" in stderr


def test_hook_handles_empty_stdin():
    code, stdout, _ = _run_hook("")
    assert code == 0
    assert stdout.strip() == ""


def test_hook_success_shape_with_mock_server(monkeypatch, tmp_path):
    """When the API returns a pack, the hook emits valid
    hookSpecificOutput JSON wrapping it."""
    # Start a tiny HTTP server on localhost that returns a fake pack
    import http.server
    import json as _json
    import threading

    pack = "Trusted State: foo=bar"

    class Handler(http.server.BaseHTTPRequestHandler):
        def do_POST(self):  # noqa: N802
            self.rfile.read(int(self.headers.get("Content-Length", 0)))
            body = _json.dumps({"formatted_context": pack}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *a, **kw):
            pass

    server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
    port = server.server_address[1]
    t = threading.Thread(target=server.serve_forever, daemon=True)
    t.start()
    try:
        code, stdout, stderr = _run_hook(
            {
                "prompt": "What do we know about p04?",
                "cwd": "",
                "session_id": "t",
                "hook_event_name": "UserPromptSubmit",
            },
            env_overrides={
                "ATOCORE_URL": f"http://127.0.0.1:{port}",
                "ATOCORE_CONTEXT_TIMEOUT": "5",
            },
            timeout=15,
        )
    finally:
        server.shutdown()

    assert code == 0, stderr
    assert stdout.strip(), "expected JSON output with context"
    out = json.loads(stdout)
    hso = out.get("hookSpecificOutput", {})
    assert hso.get("hookEventName") == "UserPromptSubmit"
    assert pack in hso.get("additionalContext", "")
    assert "AtoCore-injected context" in hso.get("additionalContext", "")


def test_hook_project_inference_from_cwd(monkeypatch):
    """The hook should map a known cwd to a project slug and send it in
    the /context/build payload."""
    import http.server
    import json as _json
    import threading

    captured_body: dict = {}

    class Handler(http.server.BaseHTTPRequestHandler):
        def do_POST(self):  # noqa: N802
            n = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(n)
            captured_body.update(_json.loads(body.decode()))
            out = _json.dumps({"formatted_context": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Length", str(len(out)))
            self.end_headers()
            self.wfile.write(out)

        def log_message(self, *a, **kw):
            pass

    server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
    port = server.server_address[1]
    t = threading.Thread(target=server.serve_forever, daemon=True)
    t.start()
    try:
        _run_hook(
            {
                "prompt": "Is this being tested properly",
                "cwd": "C:\\Users\\antoi\\ATOCore",
                "session_id": "t",
            },
            env_overrides={
                "ATOCORE_URL": f"http://127.0.0.1:{port}",
                "ATOCORE_CONTEXT_TIMEOUT": "5",
            },
        )
    finally:
        server.shutdown()

    # Hook should have inferred project="atocore" from the ATOCore cwd
    assert captured_body.get("project") == "atocore"
    assert captured_body.get("prompt", "").startswith("Is this being tested")
194 tests/test_invalidate_supersede.py Normal file
@@ -0,0 +1,194 @@
"""Issue E — /invalidate + /supersede for active entities and memories."""

import pytest
from fastapi.testclient import TestClient

from atocore.engineering.service import (
    create_entity,
    get_entity,
    get_relationships,
    init_engineering_schema,
    invalidate_active_entity,
    supersede_entity,
)
from atocore.main import app
from atocore.memory.service import create_memory, get_memories
from atocore.models.database import init_db


def _get_memory(memory_id):
    for status in ("active", "candidate", "invalid", "superseded"):
        for m in get_memories(status=status, active_only=False, limit=5000):
            if m.id == memory_id:
                return m
    return None


@pytest.fixture
def env(tmp_data_dir, tmp_path, monkeypatch):
    registry_path = tmp_path / "test-registry.json"
    registry_path.write_text('{"projects": []}', encoding="utf-8")
    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
    from atocore import config
    config.settings = config.Settings()
    init_db()
    init_engineering_schema()
    yield tmp_data_dir


def test_invalidate_active_entity_transitions_to_invalid(env):
    e = create_entity(entity_type="component", name="tower-to-kill")
    ok, code = invalidate_active_entity(e.id, reason="duplicate")
    assert ok is True
    assert code == "invalidated"
    assert get_entity(e.id).status == "invalid"


def test_invalidate_on_candidate_is_409(env):
    e = create_entity(entity_type="component", name="still-candidate", status="candidate")
    ok, code = invalidate_active_entity(e.id)
    assert ok is False
    assert code == "not_active"


def test_invalidate_is_idempotent_on_invalid(env):
    e = create_entity(entity_type="component", name="already-gone")
    invalidate_active_entity(e.id)
    ok, code = invalidate_active_entity(e.id)
    assert ok is True
    assert code == "already_invalid"


def test_supersede_creates_relationship(env):
    old = create_entity(entity_type="component", name="old-tower")
    new = create_entity(entity_type="component", name="new-tower")
    ok = supersede_entity(old.id, superseded_by=new.id, note="replaced")
    assert ok is True
    assert get_entity(old.id).status == "superseded"

    rels = get_relationships(new.id, direction="outgoing")
    assert any(
        r.relationship_type == "supersedes" and r.target_entity_id == old.id
        for r in rels
    ), "supersedes relationship must be auto-created"


def test_supersede_rejects_self():
    # No db needed — validation is pre-write; the self-supersede case is
    # covered below via the API test.
    pass


def test_api_invalidate_entity(env):
    e = create_entity(entity_type="component", name="api-kill")
    client = TestClient(app)
    r = client.post(
        f"/entities/{e.id}/invalidate",
        json={"reason": "test cleanup"},
    )
    assert r.status_code == 200
    assert r.json()["status"] == "invalidated"


def test_api_invalidate_entity_idempotent(env):
    e = create_entity(entity_type="component", name="api-kill-2")
    client = TestClient(app)
    client.post(f"/entities/{e.id}/invalidate", json={"reason": "first"})
    r = client.post(f"/entities/{e.id}/invalidate", json={"reason": "second"})
    assert r.status_code == 200
    assert r.json()["status"] == "already_invalid"


def test_api_invalidate_unknown_entity_is_404(env):
    client = TestClient(app)
    r = client.post(
        "/entities/nonexistent-id/invalidate",
        json={"reason": "missing"},
    )
    assert r.status_code == 404


def test_api_invalidate_candidate_entity_is_409(env):
    e = create_entity(entity_type="component", name="cand", status="candidate")
    client = TestClient(app)
    r = client.post(f"/entities/{e.id}/invalidate", json={"reason": "x"})
    assert r.status_code == 409


def test_api_supersede_entity(env):
    old = create_entity(entity_type="component", name="api-old-tower")
    new = create_entity(entity_type="component", name="api-new-tower")
    client = TestClient(app)
    r = client.post(
        f"/entities/{old.id}/supersede",
        json={"superseded_by": new.id, "reason": "dedup"},
    )
    assert r.status_code == 200
    body = r.json()
    assert body["status"] == "superseded"
    assert body["superseded_by"] == new.id
    assert get_entity(old.id).status == "superseded"


def test_api_supersede_self_is_400(env):
    e = create_entity(entity_type="component", name="self-sup")
    client = TestClient(app)
    r = client.post(
        f"/entities/{e.id}/supersede",
        json={"superseded_by": e.id, "reason": "oops"},
    )
    assert r.status_code == 400


def test_api_supersede_missing_replacement_is_400(env):
    old = create_entity(entity_type="component", name="orphan-old")
    client = TestClient(app)
    r = client.post(
        f"/entities/{old.id}/supersede",
        json={"superseded_by": "does-not-exist", "reason": "missing"},
    )
    assert r.status_code == 400


def test_api_invalidate_memory(env):
    m = create_memory(
        memory_type="project",
        content="memory to retract",
        project="p05",
    )
    client = TestClient(app)
    r = client.post(
        f"/memory/{m.id}/invalidate",
        json={"reason": "outdated"},
    )
    assert r.status_code == 200
    assert r.json()["status"] == "invalidated"
    assert _get_memory(m.id).status == "invalid"


def test_api_supersede_memory(env):
    m = create_memory(
        memory_type="project",
        content="memory to supersede",
        project="p05",
    )
    client = TestClient(app)
    r = client.post(
        f"/memory/{m.id}/supersede",
        json={"reason": "replaced by newer fact"},
    )
    assert r.status_code == 200
    assert r.json()["status"] == "superseded"
    assert _get_memory(m.id).status == "superseded"


def test_v1_aliases_present(env):
    client = TestClient(app)
    spec = client.get("/openapi.json").json()
    paths = spec["paths"]
    for p in (
        "/v1/entities/{entity_id}/invalidate",
        "/v1/entities/{entity_id}/supersede",
        "/v1/memory/{memory_id}/invalidate",
        "/v1/memory/{memory_id}/supersede",
    ):
        assert p in paths, f"{p} missing"
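The invalidate transitions these tests assert (active becomes invalid, repeat calls are idempotent, candidates are refused with a 409-mapped code) can be summarized as a tiny state machine. The helper below is a hypothetical stand-in for the decision logic inside atocore.engineering.service.invalidate_active_entity, written only from the (ok, code) pairs the tests check:

```python
def invalidate_transition(status: str) -> tuple[bool, str]:
    """Decision table the invalidate tests pin down:
    active  -> (True, "invalidated")      state changes to invalid
    invalid -> (True, "already_invalid")  idempotent no-op
    other (e.g. candidate) -> (False, "not_active"), surfaced as HTTP 409
    """
    if status == "active":
        return True, "invalidated"
    if status == "invalid":
        return True, "already_invalid"
    return False, "not_active"
```

The idempotency case matters for retrying clients: a second invalidate must succeed with a distinct code rather than error.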
@@ -264,6 +264,170 @@ def test_expire_stale_candidates(isolated_db):
    assert mem["status"] == "invalid"

# --- Phase 4: memory_audit log ---


def test_audit_create_logs_entry(isolated_db):
    from atocore.memory.service import create_memory, get_memory_audit

    mem = create_memory("knowledge", "test content for audit", actor="test-harness")
    audit = get_memory_audit(mem.id)
    assert len(audit) >= 1
    latest = audit[0]
    assert latest["action"] == "created"
    assert latest["actor"] == "test-harness"
    assert latest["after"]["content"] == "test content for audit"


def test_audit_promote_logs_entry(isolated_db):
    from atocore.memory.service import create_memory, get_memory_audit, promote_memory

    mem = create_memory("knowledge", "candidate for promote", status="candidate")
    promote_memory(mem.id, actor="test-triage")
    audit = get_memory_audit(mem.id)
    actions = [a["action"] for a in audit]
    assert "promoted" in actions
    promote_entry = next(a for a in audit if a["action"] == "promoted")
    assert promote_entry["actor"] == "test-triage"
    assert promote_entry["before"]["status"] == "candidate"
    assert promote_entry["after"]["status"] == "active"


def test_audit_reject_logs_entry(isolated_db):
    from atocore.memory.service import create_memory, get_memory_audit, reject_candidate_memory

    mem = create_memory("knowledge", "candidate for reject", status="candidate")
    reject_candidate_memory(mem.id, actor="test-triage", note="stale")
    audit = get_memory_audit(mem.id)
    actions = [a["action"] for a in audit]
    assert "rejected" in actions
    reject_entry = next(a for a in audit if a["action"] == "rejected")
    assert reject_entry["note"] == "stale"


def test_audit_update_captures_before_after(isolated_db):
    from atocore.memory.service import create_memory, get_memory_audit, update_memory

    mem = create_memory("knowledge", "original content", confidence=0.5)
    update_memory(mem.id, content="updated content", confidence=0.9, actor="human-edit")
    audit = get_memory_audit(mem.id)
    update_entries = [a for a in audit if a["action"] == "updated"]
    assert len(update_entries) >= 1
    u = update_entries[0]
    assert u["before"]["content"] == "original content"
    assert u["after"]["content"] == "updated content"
    assert u["before"]["confidence"] == 0.5
    assert u["after"]["confidence"] == 0.9


def test_audit_reinforce_logs_entry(isolated_db):
    from atocore.memory.service import create_memory, get_memory_audit, reinforce_memory

    mem = create_memory("knowledge", "reinforced mem", confidence=0.5)
    reinforce_memory(mem.id, confidence_delta=0.02)
    audit = get_memory_audit(mem.id)
    actions = [a["action"] for a in audit]
    assert "reinforced" in actions


def test_recent_audit_returns_cross_memory_entries(isolated_db):
    from atocore.memory.service import create_memory, get_recent_audit

    m1 = create_memory("knowledge", "mem one content", actor="harness")
    m2 = create_memory("knowledge", "mem two content", actor="harness")
    recent = get_recent_audit(limit=10)
    ids = {e["memory_id"] for e in recent}
    assert m1.id in ids and m2.id in ids


# --- Phase 3: domain_tags + valid_until ---


def test_create_memory_with_tags_and_valid_until(isolated_db):
    from atocore.memory.service import create_memory

    mem = create_memory(
        "knowledge",
        "CTE gradient dominates WFE at F/1.2",
        domain_tags=["optics", "thermal", "materials"],
        valid_until="2027-01-01",
    )
    assert mem.domain_tags == ["optics", "thermal", "materials"]
    assert mem.valid_until == "2027-01-01"


def test_create_memory_normalizes_tags(isolated_db):
    from atocore.memory.service import create_memory

    mem = create_memory(
        "knowledge",
        "some content here",
        domain_tags=[" Optics ", "OPTICS", "Thermal", ""],
    )
    # Duplicates and empty removed; lowercased; stripped
    assert mem.domain_tags == ["optics", "thermal"]
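The normalization the test above relies on (strip whitespace, lowercase, drop empties, dedupe while preserving first-seen order) can be sketched without the service layer. `normalize_tags` is a hypothetical stand-in for whatever create_memory does internally; only the observable behavior asserted by the test is reproduced:

```python
def normalize_tags(tags: list[str]) -> list[str]:
    """Strip, lowercase, drop empty strings, and dedupe tags,
    keeping the first occurrence's position."""
    seen: list[str] = []
    for t in tags:
        t = t.strip().lower()
        if t and t not in seen:
            seen.append(t)
    return seen
```

Order preservation matters here: a set-based dedupe would pass the membership check but scramble tag order between runs.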


def test_update_memory_sets_tags_and_valid_until(isolated_db):
    from atocore.memory.service import create_memory, update_memory
    from atocore.models.database import get_connection

    mem = create_memory("knowledge", "some content for update test")
    assert update_memory(
        mem.id,
        domain_tags=["controls", "firmware"],
        valid_until="2026-12-31",
    )
    with get_connection() as conn:
        row = conn.execute("SELECT domain_tags, valid_until FROM memories WHERE id = ?", (mem.id,)).fetchone()
    import json as _json
    assert _json.loads(row["domain_tags"]) == ["controls", "firmware"]
    assert row["valid_until"] == "2026-12-31"


def test_get_memories_for_context_excludes_expired(isolated_db):
    """Expired active memories must not land in context packs."""
    from atocore.memory.service import create_memory, get_memories_for_context

    # Active but expired
    create_memory(
        "knowledge",
        "stale snapshot from long ago period",
        valid_until="2020-01-01",
        confidence=1.0,
    )
    # Active and valid
    create_memory(
        "knowledge",
        "durable engineering insight stays valid forever",
        confidence=1.0,
    )

    text, _ = get_memories_for_context(memory_types=["knowledge"], budget=600)
    assert "durable engineering" in text
    assert "stale snapshot" not in text


def test_context_builder_tag_boost_orders_results(isolated_db):
    """Memories with tags matching the query should rank higher."""
    from atocore.memory.service import create_memory, get_memories_for_context

    create_memory("knowledge", "generic content has no obvious overlap with topic", confidence=0.8, domain_tags=[])
    create_memory("knowledge", "generic content has no obvious overlap topic here", confidence=0.8, domain_tags=["optics"])

    text, _ = get_memories_for_context(
        memory_types=["knowledge"],
        budget=2000,
        query="tell me about optics",
    )
    # Tagged memory should appear before the untagged one
    idx_tagged = text.find("overlap topic here")
    idx_untagged = text.find("overlap with topic")
    assert idx_tagged != -1
    assert idx_untagged != -1
    assert idx_tagged < idx_untagged


def test_expire_stale_candidates_keeps_reinforced(isolated_db):
    from atocore.memory.service import create_memory, expire_stale_candidates
    from atocore.models.database import get_connection
501 tests/test_memory_dedup.py Normal file
@@ -0,0 +1,501 @@
"""Phase 7A — memory consolidation tests.

Covers:
- similarity helpers (cosine bounds, matrix symmetry, clustering)
- _dedup_prompt parser / normalizer robustness
- create_merge_candidate idempotency
- get_merge_candidates inlines source memories
- merge_memories end-to-end happy path (sources → superseded,
  new merged memory active, audit rows, result_memory_id)
- reject_merge_candidate leaves sources untouched
"""

from __future__ import annotations

import pytest

from atocore.memory._dedup_prompt import (
    TIER2_SYSTEM_PROMPT,
    build_tier2_user_message,
    normalize_merge_verdict,
    parse_merge_verdict,
)
from atocore.memory.service import (
    create_memory,
    create_merge_candidate,
    get_memory_audit,
    get_merge_candidates,
    merge_memories,
    reject_merge_candidate,
)
from atocore.memory.similarity import (
    cluster_by_threshold,
    compute_memory_similarity,
    cosine,
    similarity_matrix,
)
from atocore.models.database import get_connection, init_db


# --- Similarity helpers ---


def test_cosine_bounds():
    assert cosine([1.0, 0.0], [1.0, 0.0]) == pytest.approx(1.0)
    assert cosine([1.0, 0.0], [0.0, 1.0]) == pytest.approx(0.0)
    # Negative dot product clamped to 0
    assert cosine([1.0, 0.0], [-1.0, 0.0]) == 0.0


def test_compute_memory_similarity_identical_high():
    s = compute_memory_similarity("the sky is blue", "the sky is blue")
    assert 0.99 <= s <= 1.0


def test_compute_memory_similarity_unrelated_low():
    s = compute_memory_similarity(
        "APM integrates with NX via a Python bridge",
        "the polisher firmware must use USB SSD not SD card",
    )
    assert 0.0 <= s < 0.7


def test_similarity_matrix_symmetric():
    texts = ["alpha beta gamma", "alpha beta gamma", "completely unrelated text"]
    m = similarity_matrix(texts)
    assert len(m) == 3 and all(len(r) == 3 for r in m)
    for i in range(3):
        assert m[i][i] == pytest.approx(1.0)
    for i in range(3):
        for j in range(3):
            assert m[i][j] == pytest.approx(m[j][i])


def test_cluster_by_threshold_transitive():
    # The two near-paraphrases should land in one cluster; the unrelated
    # text should stay out of it
    texts = [
        "Antoine prefers OAuth over API keys",
        "Antoine's preference is OAuth, not API keys",
        "the polisher firmware uses USB SSD storage",
    ]
    clusters = cluster_by_threshold(texts, threshold=0.7)
    # At least one cluster of size 2+ containing the paraphrases
    big = [c for c in clusters if len(c) >= 2]
    assert big, f"expected at least one multi-member cluster, got {clusters}"
    assert 0 in big[0] and 1 in big[0]
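A minimal, stdlib-only version of the cosine helper these bounds tests describe. The real implementation in atocore.memory.similarity may differ; the clamp-at-zero behavior and the [0, 1] range are taken directly from the assertions above, and the zero-vector guard is an assumption:

```python
import math


def cosine_clamped(a: list[float], b: list[float]) -> float:
    """Cosine similarity clamped to [0, 1]: anti-parallel vectors
    (negative dot product) map to 0 rather than -1."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0  # assumed convention for degenerate inputs
    return max(0.0, dot / (na * nb))
```

Clamping keeps the downstream threshold comparison (`>= 0.7` for clustering) a one-sided check instead of needing an absolute value.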
# --- Prompt parser robustness ---


def test_parse_merge_verdict_strips_fences():
    raw = "```json\n{\"action\":\"merge\",\"content\":\"x\"}\n```"
    parsed = parse_merge_verdict(raw)
    assert parsed == {"action": "merge", "content": "x"}


def test_parse_merge_verdict_handles_prose_prefix():
    raw = "Sure! Here's the result:\n{\"action\":\"reject\",\"content\":\"no\"}"
    parsed = parse_merge_verdict(raw)
    assert parsed is not None
    assert parsed["action"] == "reject"
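The robustness the two tests above demand, tolerating markdown code fences and conversational prose around the JSON, can be approximated with a stdlib-only extractor. This is a sketch of the general technique, not the actual parse_merge_verdict; it simply removes fence markers and parses the outermost brace-delimited span:

```python
import json
import re


def extract_json_object(raw: str):
    """Best-effort extraction of one JSON object from LLM output:
    strip code-fence markers, then parse from the first '{' to the
    last '}'. Returns None when nothing parseable is found."""
    raw = re.sub(r"`{3}(?:json)?", "", raw)  # drop fence markers
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        return json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return None
```

The first-`{`/last-`}` heuristic handles prose prefixes and suffixes but assumes a single top-level object per response, which matches the one-verdict-per-call contract these tests imply.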
def test_normalize_merge_verdict_fills_defaults():
    v = normalize_merge_verdict({
        "action": "merge",
        "content": "unified text",
    })
    assert v is not None
    assert v["memory_type"] == "knowledge"
    assert v["project"] == ""
    assert v["domain_tags"] == []
    assert v["confidence"] == 0.5


def test_normalize_merge_verdict_rejects_empty_content():
    assert normalize_merge_verdict({"action": "merge", "content": ""}) is None


def test_normalize_merge_verdict_rejects_unknown_action():
    assert normalize_merge_verdict({"action": "?", "content": "x"}) is None


# --- Tier-2 (Phase 7A.1) ---


def test_tier2_prompt_is_stricter():
    # The tier-2 system prompt must explicitly instruct the model to be
    # stricter than tier-1 — that's the whole point of escalation.
    assert "STRICTER" in TIER2_SYSTEM_PROMPT
    assert "REJECT" in TIER2_SYSTEM_PROMPT


def test_build_tier2_user_message_includes_tier1_draft():
    sources = [{
        "id": "abc12345", "content": "source text A",
        "memory_type": "knowledge", "project": "p04",
        "domain_tags": ["optics"], "confidence": 0.6,
        "valid_until": "", "reference_count": 2,
    }, {
        "id": "def67890", "content": "source text B",
        "memory_type": "knowledge", "project": "p04",
        "domain_tags": ["optics"], "confidence": 0.7,
        "valid_until": "", "reference_count": 1,
    }]
    tier1 = {
        "action": "merge",
        "content": "unified draft by tier1",
        "memory_type": "knowledge",
        "project": "p04",
        "domain_tags": ["optics"],
        "confidence": 0.65,
        "reason": "near-paraphrase",
    }
    msg = build_tier2_user_message(sources, tier1)
    assert "source text A" in msg
    assert "source text B" in msg
    assert "TIER-1 DRAFT" in msg
    assert "unified draft by tier1" in msg
    assert "near-paraphrase" in msg
    # Should end asking for a verdict
    assert "verdict" in msg.lower()


# --- Host script is stdlib-only (Phase 7A architecture rule) ---


def test_memory_dedup_script_is_stdlib_only():
    """The host-side scripts/memory_dedup.py must NOT import anything
    that pulls pydantic_settings, sentence-transformers, torch, etc.
    into the host Python. The only atocore-land module allowed is the
    stdlib-only prompt helper at atocore.memory._dedup_prompt.

    This regression test prevents re-introducing the bug where the
    dedup-watcher on the Dalidou host crashed with ModuleNotFoundError
    because someone imported atocore.memory.similarity (which pulls
    in atocore.retrieval.embeddings → sentence_transformers)."""
    import importlib.util
    import sys as _sys

    before = set(_sys.modules.keys())
    spec = importlib.util.spec_from_file_location(
        "memory_dedup_for_test", "scripts/memory_dedup.py",
    )
    mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mod)
    after = set(_sys.modules.keys())

    new_atocore = sorted(m for m in (after - before) if m.startswith("atocore"))
    # Only the stdlib-only shared prompt module is allowed to load
    allowed = {"atocore", "atocore.memory", "atocore.memory._dedup_prompt"}
    disallowed = [m for m in new_atocore if m not in allowed]
    assert not disallowed, (
        f"scripts/memory_dedup.py pulled non-stdlib atocore modules "
        f"(will break host Python without ML deps): {disallowed}"
    )


# --- Server-side clustering (still in atocore.memory.similarity) ---


def test_similarity_module_server_side():
    """similarity.py stays server-side for ML deps. These helpers are
    only invoked via the /admin/memory/dedup-cluster endpoint."""
    from atocore.memory.similarity import cluster_by_threshold
    clusters = cluster_by_threshold(
        ["duplicate fact A", "duplicate fact A slightly reworded",
         "totally unrelated fact about firmware"],
        threshold=0.7,
    )
    multi = [c for c in clusters if len(c) >= 2]
    assert multi, "expected at least one multi-member cluster"


def test_cluster_endpoint_returns_groups(tmp_data_dir):
    """POST /admin/memory/dedup-cluster shape test — we just verify the
    service layer produces the expected output. Full HTTP is
    integration-tested by the live scan."""
    from atocore.models.database import init_db
    init_db()
    from atocore.memory.service import create_memory, get_memories
    create_memory("knowledge", "APM uses NX bridge for DXF to STL conversion",
                  project="apm")
    create_memory("knowledge", "APM uses the NX Python bridge for DXF-to-STL",
                  project="apm")
    create_memory("knowledge", "The polisher firmware requires USB SSD storage",
                  project="p06-polisher")

    # Mirror the server code path
    from atocore.memory.similarity import cluster_by_threshold
    mems = get_memories(project="apm", active_only=True, limit=100)
    texts = [m.content for m in mems]
    clusters = cluster_by_threshold(texts, threshold=0.7)
    multi = [c for c in clusters if len(c) >= 2]
    assert multi, "expected the two APM memories to cluster together"
    # Unrelated p06 memory should NOT be in that cluster
    apm_ids = {mems[i].id for i in multi[0]}
    assert len(apm_ids) == 2
    all_ids = {m.id for m in mems}
    assert apm_ids.issubset(all_ids)


# --- create_merge_candidate idempotency ---


def test_create_merge_candidate_inserts_row(tmp_data_dir):
    init_db()
    m1 = create_memory("knowledge", "APM uses NX for DXF conversion")
    m2 = create_memory("knowledge", "APM uses NX for DXF-to-STL")

    cid = create_merge_candidate(
        memory_ids=[m1.id, m2.id],
        similarity=0.92,
        proposed_content="APM uses NX for DXF→STL conversion",
        proposed_memory_type="knowledge",
        proposed_project="",
        proposed_tags=["apm", "nx"],
        proposed_confidence=0.6,
        reason="near-paraphrase",
    )
    assert cid is not None

    pending = get_merge_candidates(status="pending")
    assert len(pending) == 1
    assert pending[0]["id"] == cid
    assert pending[0]["similarity"] == pytest.approx(0.92)
    assert len(pending[0]["sources"]) == 2


def test_create_merge_candidate_idempotent(tmp_data_dir):
    init_db()
    m1 = create_memory("knowledge", "Fact A")
    m2 = create_memory("knowledge", "Fact A slightly reworded")

    first = create_merge_candidate(
        memory_ids=[m1.id, m2.id],
        similarity=0.9,
        proposed_content="merged",
        proposed_memory_type="knowledge",
        proposed_project="",
    )
    # Same id set, different order → dedupe skips
    second = create_merge_candidate(
        memory_ids=[m2.id, m1.id],
        similarity=0.9,
        proposed_content="merged (again)",
        proposed_memory_type="knowledge",
        proposed_project="",
    )
    assert first is not None
    assert second is None


def test_create_merge_candidate_requires_two_ids(tmp_data_dir):
    init_db()
    m1 = create_memory("knowledge", "lonely")
    with pytest.raises(ValueError):
        create_merge_candidate(
            memory_ids=[m1.id],
            similarity=1.0,
            proposed_content="x",
            proposed_memory_type="knowledge",
            proposed_project="",
        )


# --- merge_memories end-to-end ---


def test_merge_memories_happy_path(tmp_data_dir):
    init_db()
    m1 = create_memory(
        "knowledge", "APM uses NX for DXF conversion",
        project="apm", confidence=0.6, domain_tags=["apm", "nx"],
    )
    m2 = create_memory(
        "knowledge", "APM does DXF to STL via NX bridge",
        project="apm", confidence=0.8, domain_tags=["apm", "bridge"],
    )
    # Bump reference counts so the sum is meaningful
    with get_connection() as conn:
        conn.execute("UPDATE memories SET reference_count = 3 WHERE id = ?", (m1.id,))
        conn.execute("UPDATE memories SET reference_count = 5 WHERE id = ?", (m2.id,))

    cid = create_merge_candidate(
        memory_ids=[m1.id, m2.id],
        similarity=0.92,
        proposed_content="APM uses NX bridge for DXF→STL conversion",
        proposed_memory_type="knowledge",
        proposed_project="apm",
        proposed_tags=["apm", "nx", "bridge"],
        proposed_confidence=0.7,
        reason="duplicates",
    )
    new_id = merge_memories(candidate_id=cid, actor="human-triage")
    assert new_id is not None

    # Sources superseded
    with get_connection() as conn:
        s1 = conn.execute("SELECT status FROM memories WHERE id = ?", (m1.id,)).fetchone()
        s2 = conn.execute("SELECT status FROM memories WHERE id = ?", (m2.id,)).fetchone()
        merged = conn.execute(
            "SELECT content, status, confidence, reference_count, project "
            "FROM memories WHERE id = ?", (new_id,)
        ).fetchone()
        cand = conn.execute(
            "SELECT status, result_memory_id FROM memory_merge_candidates WHERE id = ?",
            (cid,),
        ).fetchone()
    assert s1["status"] == "superseded"
    assert s2["status"] == "superseded"
    assert merged["status"] == "active"
    assert merged["project"] == "apm"
    # confidence = max of the actual source confidences (0.8), not the
    # proposed 0.7 — the proposal is only a hint
    assert merged["confidence"] == pytest.approx(0.8)
    # reference_count = sum (3 + 5 = 8)
    assert int(merged["reference_count"]) == 8
    assert cand["status"] == "approved"
    assert cand["result_memory_id"] == new_id


def test_merge_memories_content_override(tmp_data_dir):
|
||||
init_db()
|
||||
m1 = create_memory("knowledge", "draft A", project="p05-interferometer")
|
||||
m2 = create_memory("knowledge", "draft B", project="p05-interferometer")
|
||||
|
||||
cid = create_merge_candidate(
|
||||
memory_ids=[m1.id, m2.id],
|
||||
similarity=0.9,
|
||||
proposed_content="AI draft",
|
||||
proposed_memory_type="knowledge",
|
||||
proposed_project="p05-interferometer",
|
||||
)
|
||||
new_id = merge_memories(
|
||||
candidate_id=cid,
|
||||
actor="human-triage",
|
||||
override_content="human-edited final text",
|
||||
override_tags=["optics", "custom"],
|
||||
)
|
||||
assert new_id is not None
|
||||
with get_connection() as conn:
|
||||
row = conn.execute(
|
||||
"SELECT content, domain_tags FROM memories WHERE id = ?", (new_id,)
|
||||
).fetchone()
|
||||
assert row["content"] == "human-edited final text"
|
||||
# domain_tags JSON should contain the override
|
||||
assert "optics" in row["domain_tags"]
|
||||
assert "custom" in row["domain_tags"]
|
||||
|
||||
|
||||
def test_merge_memories_writes_audit(tmp_data_dir):
|
||||
init_db()
|
||||
m1 = create_memory("knowledge", "alpha")
|
||||
m2 = create_memory("knowledge", "alpha variant")
|
||||
cid = create_merge_candidate(
|
||||
memory_ids=[m1.id, m2.id], similarity=0.9,
|
||||
proposed_content="alpha merged",
|
||||
proposed_memory_type="knowledge", proposed_project="",
|
||||
)
|
||||
new_id = merge_memories(candidate_id=cid)
|
||||
assert new_id
|
||||
|
||||
audit_new = get_memory_audit(new_id)
|
||||
actions_new = {a["action"] for a in audit_new}
|
||||
assert "created_via_merge" in actions_new
|
||||
|
||||
audit_m1 = get_memory_audit(m1.id)
|
||||
actions_m1 = {a["action"] for a in audit_m1}
|
||||
assert "superseded" in actions_m1
|
||||
|
||||
|
||||
def test_merge_memories_aborts_if_source_not_active(tmp_data_dir):
|
||||
init_db()
|
||||
m1 = create_memory("knowledge", "one")
|
||||
m2 = create_memory("knowledge", "two")
|
||||
cid = create_merge_candidate(
|
||||
memory_ids=[m1.id, m2.id], similarity=0.9,
|
||||
proposed_content="merged",
|
||||
proposed_memory_type="knowledge", proposed_project="",
|
||||
)
|
||||
# Tamper: supersede one source before the merge runs
|
||||
with get_connection() as conn:
|
||||
conn.execute("UPDATE memories SET status = 'superseded' WHERE id = ?", (m1.id,))
|
||||
result = merge_memories(candidate_id=cid)
|
||||
assert result is None
|
||||
|
||||
# Candidate still pending
|
||||
pending = get_merge_candidates(status="pending")
|
||||
assert any(c["id"] == cid for c in pending)
|
||||
|
||||
|
||||
def test_merge_memories_rejects_already_resolved(tmp_data_dir):
|
||||
init_db()
|
||||
m1 = create_memory("knowledge", "x")
|
||||
m2 = create_memory("knowledge", "y")
|
||||
cid = create_merge_candidate(
|
||||
memory_ids=[m1.id, m2.id], similarity=0.9,
|
||||
proposed_content="xy",
|
||||
proposed_memory_type="knowledge", proposed_project="",
|
||||
)
|
||||
first = merge_memories(candidate_id=cid)
|
||||
assert first is not None
|
||||
# second call — already approved, should return None
|
||||
second = merge_memories(candidate_id=cid)
|
||||
assert second is None
|
||||
|
||||
|
||||
# --- reject_merge_candidate ---
|
||||
|
||||
|
||||
def test_reject_merge_candidate_leaves_sources_untouched(tmp_data_dir):
|
||||
init_db()
|
||||
m1 = create_memory("knowledge", "a")
|
||||
m2 = create_memory("knowledge", "b")
|
||||
cid = create_merge_candidate(
|
||||
memory_ids=[m1.id, m2.id], similarity=0.9,
|
||||
proposed_content="a+b",
|
||||
proposed_memory_type="knowledge", proposed_project="",
|
||||
)
|
||||
ok = reject_merge_candidate(cid, actor="human-triage", note="false positive")
|
||||
assert ok
|
||||
|
||||
# Sources still active
|
||||
with get_connection() as conn:
|
||||
s1 = conn.execute("SELECT status FROM memories WHERE id = ?", (m1.id,)).fetchone()
|
||||
s2 = conn.execute("SELECT status FROM memories WHERE id = ?", (m2.id,)).fetchone()
|
||||
cand = conn.execute(
|
||||
"SELECT status FROM memory_merge_candidates WHERE id = ?", (cid,)
|
||||
).fetchone()
|
||||
assert s1["status"] == "active"
|
||||
assert s2["status"] == "active"
|
||||
assert cand["status"] == "rejected"
|
||||
|
||||
|
||||
def test_reject_merge_candidate_idempotent(tmp_data_dir):
|
||||
init_db()
|
||||
m1 = create_memory("knowledge", "p")
|
||||
m2 = create_memory("knowledge", "q")
|
||||
cid = create_merge_candidate(
|
||||
memory_ids=[m1.id, m2.id], similarity=0.9,
|
||||
proposed_content="pq",
|
||||
proposed_memory_type="knowledge", proposed_project="",
|
||||
)
|
||||
assert reject_merge_candidate(cid) is True
|
||||
# second reject — already rejected, returns False
|
||||
assert reject_merge_candidate(cid) is False
|
||||
|
||||
|
||||
# --- Schema sanity ---
|
||||
|
||||
|
||||
def test_merge_candidates_table_exists(tmp_data_dir):
|
||||
init_db()
|
||||
with get_connection() as conn:
|
||||
cols = [r["name"] for r in conn.execute("PRAGMA table_info(memory_merge_candidates)").fetchall()]
|
||||
expected = {"id", "status", "memory_ids", "similarity", "proposed_content",
|
||||
"proposed_memory_type", "proposed_project", "proposed_tags",
|
||||
"proposed_confidence", "reason", "created_at", "resolved_at",
|
||||
"resolved_by", "result_memory_id"}
|
||||
assert expected.issubset(set(cols))
|
||||
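The idempotency test above relies on the candidate's id set being compared order-insensitively, so `[m2, m1]` collides with `[m1, m2]`. A minimal sketch of such a dedupe key; the helper name `merge_candidate_key` is hypothetical (the real check lives inside `create_merge_candidate`):

```python
import json


def merge_candidate_key(memory_ids):
    """Order-insensitive dedupe key for a merge candidate.

    Sorting the ids before serializing means [m2, m1] and [m1, m2]
    produce the same key, which is why the second create_merge_candidate
    call with reversed ids is skipped as a duplicate.
    """
    if len(memory_ids) < 2:
        raise ValueError("a merge candidate needs at least two memory ids")
    return json.dumps(sorted(memory_ids))
```

Storing this key in a UNIQUE column would make the dedupe a plain constraint check rather than an application-level lookup.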
160	tests/test_patch_entity.py	Normal file
@@ -0,0 +1,160 @@
"""PATCH /entities/{id} — edit mutable fields without cloning (sprint P1)."""

import pytest
from fastapi.testclient import TestClient

from atocore.engineering.service import (
    create_entity,
    get_entity,
    init_engineering_schema,
    update_entity,
)
from atocore.main import app
from atocore.models.database import init_db


@pytest.fixture
def env(tmp_data_dir, tmp_path, monkeypatch):
    registry_path = tmp_path / "test-registry.json"
    registry_path.write_text('{"projects": []}', encoding="utf-8")
    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
    from atocore import config
    config.settings = config.Settings()
    init_db()
    init_engineering_schema()
    yield tmp_data_dir


def test_update_entity_description(env):
    e = create_entity(entity_type="component", name="t", description="old desc")
    updated = update_entity(e.id, description="new desc")
    assert updated.description == "new desc"
    assert get_entity(e.id).description == "new desc"


def test_update_entity_properties_merge(env):
    e = create_entity(
        entity_type="component",
        name="t2",
        properties={"color": "red", "kg": 5},
    )
    updated = update_entity(
        e.id, properties_patch={"color": "blue", "material": "invar"},
    )
    assert updated.properties == {"color": "blue", "kg": 5, "material": "invar"}


def test_update_entity_properties_null_deletes_key(env):
    e = create_entity(
        entity_type="component",
        name="t3",
        properties={"color": "red", "kg": 5},
    )
    updated = update_entity(e.id, properties_patch={"color": None})
    assert "color" not in updated.properties
    assert updated.properties.get("kg") == 5


def test_update_entity_confidence_bounds(env):
    e = create_entity(entity_type="component", name="t4", confidence=0.5)
    with pytest.raises(ValueError):
        update_entity(e.id, confidence=1.5)
    with pytest.raises(ValueError):
        update_entity(e.id, confidence=-0.1)


def test_update_entity_source_refs_append_dedup(env):
    e = create_entity(
        entity_type="component",
        name="t5",
        source_refs=["session:a", "session:b"],
    )
    updated = update_entity(
        e.id, append_source_refs=["session:b", "session:c"],
    )
    assert updated.source_refs == ["session:a", "session:b", "session:c"]


def test_update_entity_returns_none_for_unknown(env):
    assert update_entity("nonexistent", description="x") is None


def test_api_patch_happy_path(env):
    e = create_entity(
        entity_type="component",
        name="tower",
        description="old",
        properties={"material": "steel"},
        confidence=0.6,
    )
    client = TestClient(app)
    r = client.patch(
        f"/entities/{e.id}",
        json={
            "description": "three-stage tower",
            "properties": {"material": "invar", "height_mm": 1200},
            "confidence": 0.9,
            "source_refs": ["session:s1"],
            "note": "from voice session",
        },
    )
    assert r.status_code == 200, r.text
    body = r.json()
    assert body["description"] == "three-stage tower"
    assert body["properties"]["material"] == "invar"
    assert body["properties"]["height_mm"] == 1200
    assert body["confidence"] == 0.9
    assert "session:s1" in body["source_refs"]


def test_api_patch_omitted_fields_unchanged(env):
    e = create_entity(
        entity_type="component",
        name="keep-desc",
        description="keep me",
    )
    client = TestClient(app)
    r = client.patch(
        f"/entities/{e.id}",
        json={"confidence": 0.7},
    )
    assert r.status_code == 200
    assert r.json()["description"] == "keep me"


def test_api_patch_404_on_missing(env):
    client = TestClient(app)
    r = client.patch("/entities/does-not-exist", json={"description": "x"})
    assert r.status_code == 404


def test_api_patch_rejects_bad_confidence(env):
    e = create_entity(entity_type="component", name="bad-conf")
    client = TestClient(app)
    r = client.patch(f"/entities/{e.id}", json={"confidence": 2.0})
    assert r.status_code == 400


def test_api_patch_aliased_under_v1(env):
    e = create_entity(entity_type="component", name="v1-patch")
    client = TestClient(app)
    r = client.patch(
        f"/v1/entities/{e.id}",
        json={"description": "via v1"},
    )
    assert r.status_code == 200
    assert get_entity(e.id).description == "via v1"


def test_api_patch_audit_row_written(env):
    from atocore.engineering.service import get_entity_audit

    e = create_entity(entity_type="component", name="audit-check")
    client = TestClient(app)
    client.patch(
        f"/entities/{e.id}",
        json={"description": "new", "note": "manual edit"},
    )
    audit = get_entity_audit(e.id)
    actions = [a["action"] for a in audit]
    assert "updated" in actions
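The three `properties` tests above pin JSON-merge-patch-style semantics: keys in the patch overwrite, a JSON `null` deletes the key, and omitted keys survive. A sketch of that merge rule; the helper name `merge_properties` is hypothetical, not the service's actual code:

```python
def merge_properties(base: dict, patch: dict) -> dict:
    """JSON-merge-patch-style update (in the spirit of RFC 7386):
    patch keys overwrite, None (JSON null) deletes, untouched keys survive."""
    merged = dict(base)  # never mutate the caller's dict
    for key, value in patch.items():
        if value is None:
            merged.pop(key, None)  # null → delete, tolerate already-missing keys
        else:
            merged[key] = value
    return merged
```

For example, `merge_properties({"color": "red", "kg": 5}, {"color": None, "material": "invar"})` returns `{"kg": 5, "material": "invar"}`, matching the null-deletes-key test above.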
148	tests/test_phase6_living_taxonomy.py	Normal file
@@ -0,0 +1,148 @@
"""Phase 6 tests — Living Taxonomy: detector + transient-to-durable extension."""

from __future__ import annotations

from datetime import datetime, timedelta, timezone

import pytest

from atocore.memory.service import (
    create_memory,
    extend_reinforced_valid_until,
)
from atocore.models.database import get_connection, init_db


def _set_memory_fields(mem_id, reference_count=None, valid_until=None):
    """Helper to force memory state for tests."""
    with get_connection() as conn:
        fields, params = [], []
        if reference_count is not None:
            fields.append("reference_count = ?")
            params.append(reference_count)
        if valid_until is not None:
            fields.append("valid_until = ?")
            params.append(valid_until)
        params.append(mem_id)
        conn.execute(
            f"UPDATE memories SET {', '.join(fields)} WHERE id = ?",
            params,
        )


# --- Transient-to-durable extension (C.3) ---


def test_extend_extends_imminent_valid_until(tmp_data_dir):
    init_db()
    mem = create_memory("knowledge", "Reinforced content for extension")
    soon = (datetime.now(timezone.utc) + timedelta(days=7)).strftime("%Y-%m-%d")
    _set_memory_fields(mem.id, reference_count=6, valid_until=soon)

    result = extend_reinforced_valid_until()
    assert len(result) == 1
    assert result[0]["memory_id"] == mem.id
    assert result[0]["action"] == "extended"
    # New expiry should be ~90 days out
    new_date = datetime.strptime(result[0]["new_valid_until"], "%Y-%m-%d")
    days_out = (new_date - datetime.now(timezone.utc).replace(tzinfo=None)).days
    assert 85 <= days_out <= 92  # ~90 days, with some slop for test timing


def test_extend_makes_permanent_at_high_reference_count(tmp_data_dir):
    init_db()
    mem = create_memory("knowledge", "Heavy-referenced content")
    soon = (datetime.now(timezone.utc) + timedelta(days=7)).strftime("%Y-%m-%d")
    _set_memory_fields(mem.id, reference_count=15, valid_until=soon)

    result = extend_reinforced_valid_until()
    assert len(result) == 1
    assert result[0]["action"] == "made_permanent"
    assert result[0]["new_valid_until"] is None

    # Verify the DB reflects the cleared expiry
    with get_connection() as conn:
        row = conn.execute(
            "SELECT valid_until FROM memories WHERE id = ?", (mem.id,)
        ).fetchone()
    assert row["valid_until"] is None


def test_extend_skips_not_expiring_soon(tmp_data_dir):
    init_db()
    mem = create_memory("knowledge", "Far-future expiry")
    far = (datetime.now(timezone.utc) + timedelta(days=365)).strftime("%Y-%m-%d")
    _set_memory_fields(mem.id, reference_count=6, valid_until=far)

    result = extend_reinforced_valid_until(imminent_expiry_days=30)
    assert result == []


def test_extend_skips_low_reference_count(tmp_data_dir):
    init_db()
    mem = create_memory("knowledge", "Not reinforced enough")
    soon = (datetime.now(timezone.utc) + timedelta(days=7)).strftime("%Y-%m-%d")
    _set_memory_fields(mem.id, reference_count=2, valid_until=soon)

    result = extend_reinforced_valid_until(min_reference_count=5)
    assert result == []


def test_extend_skips_permanent_memory(tmp_data_dir):
    """Memory with no valid_until is already permanent — shouldn't be touched."""
    init_db()
    mem = create_memory("knowledge", "Already permanent")
    _set_memory_fields(mem.id, reference_count=20)  # no valid_until

    result = extend_reinforced_valid_until()
    assert result == []


def test_extend_writes_audit_row(tmp_data_dir):
    init_db()
    mem = create_memory("knowledge", "Audited extension")
    soon = (datetime.now(timezone.utc) + timedelta(days=7)).strftime("%Y-%m-%d")
    _set_memory_fields(mem.id, reference_count=6, valid_until=soon)

    extend_reinforced_valid_until()

    from atocore.memory.service import get_memory_audit
    audit = get_memory_audit(mem.id)
    actions = [a["action"] for a in audit]
    assert "valid_until_extended" in actions
    entry = next(a for a in audit if a["action"] == "valid_until_extended")
    assert entry["actor"] == "transient-to-durable"


# --- Emerging detector (smoke tests — the detector runs against live DB state,
# so we test the shape of results rather than full integration here) ---


def test_detector_imports_cleanly():
    """Detector module must import without errors (it's called from nightly cron)."""
    import importlib.util
    from pathlib import Path

    # Load the detector script as a module
    script = Path(__file__).resolve().parent.parent / "scripts" / "detect_emerging.py"
    assert script.exists()
    spec = importlib.util.spec_from_file_location("detect_emerging", script)
    mod = importlib.util.module_from_spec(spec)
    # Don't actually run main() — just verify it parses and defines expected names
    spec.loader.exec_module(mod)
    assert hasattr(mod, "main")
    assert hasattr(mod, "PROJECT_MIN_MEMORIES")
    assert hasattr(mod, "PROJECT_ALERT_THRESHOLD")


def test_detector_handles_empty_db(tmp_data_dir):
    """Detector should handle zero memories without crashing."""
    init_db()
    # Don't create any memories. Just verify the queries work via the service layer.
    from atocore.memory.service import get_memories
    active = get_memories(active_only=True, limit=500)
    candidates = get_memories(status="candidate", limit=500)
    assert active == []
    assert candidates == []
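The extension tests above pin the transient-to-durable rule only at the points they probe: 6 references extends by ~90 days, 15 makes the memory permanent, and far-future, low-count, or already-permanent memories are skipped. A sketch of a decision function consistent with those points; every threshold constant here is an assumption for illustration, not the service's actual value:

```python
from datetime import date, timedelta

# Assumed thresholds — the tests only pin that 6 refs extends and 15 makes
# permanent, so the real cutoffs in atocore.memory.service may differ.
MIN_REFERENCE_COUNT = 5
PERMANENT_REFERENCE_COUNT = 10
IMMINENT_EXPIRY_DAYS = 30
EXTENSION_DAYS = 90


def decide_extension(reference_count, valid_until, today=None):
    """Return ('skip' | 'extended' | 'made_permanent', new_valid_until)."""
    today = today or date.today()
    if valid_until is None:
        return "skip", None  # already permanent, nothing to do
    if reference_count < MIN_REFERENCE_COUNT:
        return "skip", valid_until  # not reinforced enough
    expiry = date.fromisoformat(valid_until)
    if (expiry - today).days > IMMINENT_EXPIRY_DAYS:
        return "skip", valid_until  # not expiring soon
    if reference_count >= PERMANENT_REFERENCE_COUNT:
        return "made_permanent", None  # clear the expiry entirely
    return "extended", (today + timedelta(days=EXTENSION_DAYS)).isoformat()
```

Keeping the decision pure like this (dates in, action out) would let the nightly job stay a thin loop over candidate rows.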
296	tests/test_tag_canon.py	Normal file
@@ -0,0 +1:296 @@
"""Phase 7C — tag canonicalization tests.

Covers:
- prompt parser (fences, prose, empty)
- normalizer (identity, protected tokens, empty)
- get_tag_distribution counts across active memories
- apply_tag_alias rewrites + dedupes + audits
- create / approve / reject lifecycle
- idempotency (duplicate proposals skipped)
"""

from __future__ import annotations

import json

import pytest

from atocore.memory._tag_canon_prompt import (
    PROTECTED_PROJECT_TOKENS,
    build_user_message,
    normalize_alias_item,
    parse_canon_output,
)
from atocore.memory.service import (
    apply_tag_alias,
    approve_tag_alias,
    create_memory,
    create_tag_alias_proposal,
    get_memory_audit,
    get_tag_alias_proposals,
    get_tag_distribution,
    reject_tag_alias,
)
from atocore.models.database import get_connection, init_db


# --- Prompt parser ---


def test_parse_canon_output_handles_fences():
    raw = "```json\n{\"aliases\": [{\"alias\": \"fw\", \"canonical\": \"firmware\", \"confidence\": 0.9}]}\n```"
    items = parse_canon_output(raw)
    assert len(items) == 1
    assert items[0]["alias"] == "fw"


def test_parse_canon_output_handles_prose_prefix():
    raw = "Here you go:\n{\"aliases\": [{\"alias\": \"ml\", \"canonical\": \"machine-learning\", \"confidence\": 0.9}]}"
    items = parse_canon_output(raw)
    assert len(items) == 1


def test_parse_canon_output_empty_list():
    assert parse_canon_output("{\"aliases\": []}") == []


def test_parse_canon_output_malformed():
    assert parse_canon_output("not json at all") == []
    assert parse_canon_output("") == []


# --- Normalizer ---


def test_normalize_alias_strips_and_lowercases():
    n = normalize_alias_item({"alias": " FW ", "canonical": "Firmware", "confidence": 0.95, "reason": "abbrev"})
    assert n == {"alias": "fw", "canonical": "firmware", "confidence": 0.95, "reason": "abbrev"}


def test_normalize_rejects_identity():
    assert normalize_alias_item({"alias": "foo", "canonical": "foo", "confidence": 0.9}) is None


def test_normalize_rejects_empty():
    assert normalize_alias_item({"alias": "", "canonical": "foo", "confidence": 0.9}) is None
    assert normalize_alias_item({"alias": "foo", "canonical": "", "confidence": 0.9}) is None


def test_normalize_protects_project_tokens():
    # Project ids must not be canonicalized — they're their own namespace
    assert "p04" in PROTECTED_PROJECT_TOKENS
    assert normalize_alias_item({"alias": "p04", "canonical": "p04-gigabit", "confidence": 1.0}) is None
    assert normalize_alias_item({"alias": "p04-gigabit", "canonical": "p04", "confidence": 1.0}) is None
    assert normalize_alias_item({"alias": "apm", "canonical": "part-manager", "confidence": 1.0}) is None


def test_normalize_clamps_confidence():
    hi = normalize_alias_item({"alias": "a", "canonical": "b", "confidence": 2.5})
    assert hi["confidence"] == 1.0
    lo = normalize_alias_item({"alias": "a", "canonical": "b", "confidence": -0.5})
    assert lo["confidence"] == 0.0


def test_normalize_handles_non_numeric_confidence():
    n = normalize_alias_item({"alias": "a", "canonical": "b", "confidence": "not a number"})
    assert n is not None and n["confidence"] == 0.0


# --- build_user_message ---


def test_build_user_message_includes_top_tags():
    dist = {"firmware": 23, "fw": 5, "optics": 18, "optical": 2}
    msg = build_user_message(dist)
    assert "firmware: 23" in msg
    assert "optics: 18" in msg
    assert "aliases" in msg.lower() or "JSON" in msg


def test_build_user_message_empty():
    msg = build_user_message({})
    assert "Empty" in msg or "empty" in msg


# --- get_tag_distribution ---


def test_tag_distribution_counts_active_only(tmp_data_dir):
    init_db()
    create_memory("knowledge", "a", domain_tags=["firmware", "p06"])
    create_memory("knowledge", "b", domain_tags=["firmware"])
    create_memory("knowledge", "c", domain_tags=["optics"])

    # Add an invalid memory — it should NOT be counted
    m_invalid = create_memory("knowledge", "d", domain_tags=["firmware", "ignored"])
    with get_connection() as conn:
        conn.execute("UPDATE memories SET status = 'invalid' WHERE id = ?", (m_invalid.id,))

    dist = get_tag_distribution()
    assert dist.get("firmware") == 2  # two active memories
    assert dist.get("optics") == 1
    assert dist.get("p06") == 1
    assert "ignored" not in dist


def test_tag_distribution_min_count_filter(tmp_data_dir):
    init_db()
    create_memory("knowledge", "a", domain_tags=["firmware"])
    create_memory("knowledge", "b", domain_tags=["firmware"])
    create_memory("knowledge", "c", domain_tags=["once"])

    dist = get_tag_distribution(min_count=2)
    assert "firmware" in dist
    assert "once" not in dist


# --- apply_tag_alias ---


def test_apply_tag_alias_rewrites_across_memories(tmp_data_dir):
    init_db()
    m1 = create_memory("knowledge", "a", domain_tags=["fw", "p06"])
    m2 = create_memory("knowledge", "b", domain_tags=["fw"])
    m3 = create_memory("knowledge", "c", domain_tags=["optics"])  # untouched

    result = apply_tag_alias("fw", "firmware")
    assert result["memories_touched"] == 2

    with get_connection() as conn:
        r1 = conn.execute("SELECT domain_tags FROM memories WHERE id = ?", (m1.id,)).fetchone()
        r2 = conn.execute("SELECT domain_tags FROM memories WHERE id = ?", (m2.id,)).fetchone()
        r3 = conn.execute("SELECT domain_tags FROM memories WHERE id = ?", (m3.id,)).fetchone()
    assert "firmware" in json.loads(r1["domain_tags"])
    assert "fw" not in json.loads(r1["domain_tags"])
    assert "firmware" in json.loads(r2["domain_tags"])
    assert json.loads(r3["domain_tags"]) == ["optics"]  # untouched


def test_apply_tag_alias_dedupes_when_both_present(tmp_data_dir):
    """Memory has both fw AND firmware → rewrite collapses to just firmware."""
    init_db()
    m = create_memory("knowledge", "dual-tagged", domain_tags=["fw", "firmware", "p06"])

    result = apply_tag_alias("fw", "firmware")
    assert result["memories_touched"] == 1

    with get_connection() as conn:
        r = conn.execute("SELECT domain_tags FROM memories WHERE id = ?", (m.id,)).fetchone()
    tags = json.loads(r["domain_tags"])
    assert tags.count("firmware") == 1
    assert "fw" not in tags
    assert "p06" in tags


def test_apply_tag_alias_skips_memories_without_alias(tmp_data_dir):
    init_db()
    create_memory("knowledge", "no match", domain_tags=["optics", "p04"])
    result = apply_tag_alias("fw", "firmware")
    assert result["memories_touched"] == 0


def test_apply_tag_alias_writes_audit(tmp_data_dir):
    init_db()
    m = create_memory("knowledge", "audited", domain_tags=["fw"])
    apply_tag_alias("fw", "firmware", actor="auto-tag-canon")

    audit = get_memory_audit(m.id)
    actions = [a["action"] for a in audit]
    assert "tag_canonicalized" in actions
    entry = next(a for a in audit if a["action"] == "tag_canonicalized")
    assert entry["actor"] == "auto-tag-canon"
    assert "fw → firmware" in entry["note"]
    assert "fw" in entry["before"]["domain_tags"]
    assert "firmware" in entry["after"]["domain_tags"]


def test_apply_tag_alias_rejects_identity(tmp_data_dir):
    init_db()
    with pytest.raises(ValueError):
        apply_tag_alias("foo", "foo")


def test_apply_tag_alias_rejects_empty(tmp_data_dir):
    init_db()
    with pytest.raises(ValueError):
        apply_tag_alias("", "firmware")


# --- Proposal lifecycle ---


def test_create_proposal_inserts_pending(tmp_data_dir):
    init_db()
    pid = create_tag_alias_proposal("fw", "firmware", confidence=0.65,
                                    alias_count=5, canonical_count=23,
                                    reason="standard abbreviation")
    assert pid is not None

    rows = get_tag_alias_proposals(status="pending")
    assert len(rows) == 1
    assert rows[0]["alias"] == "fw"
    assert rows[0]["confidence"] == pytest.approx(0.65)


def test_create_proposal_idempotent(tmp_data_dir):
    init_db()
    first = create_tag_alias_proposal("fw", "firmware", confidence=0.6)
    second = create_tag_alias_proposal("fw", "firmware", confidence=0.7)
    assert first is not None
    assert second is None


def test_approve_applies_rewrite(tmp_data_dir):
    init_db()
    m = create_memory("knowledge", "x", domain_tags=["fw"])
    pid = create_tag_alias_proposal("fw", "firmware", confidence=0.7)
    result = approve_tag_alias(pid, actor="human-triage")
    assert result is not None
    assert result["memories_touched"] == 1

    # Proposal now approved, with applied_to_memories recorded
    rows = get_tag_alias_proposals(status="approved")
    assert len(rows) == 1
    assert rows[0]["applied_to_memories"] == 1

    # Memory actually rewritten
    with get_connection() as conn:
        r = conn.execute("SELECT domain_tags FROM memories WHERE id = ?", (m.id,)).fetchone()
    assert "firmware" in json.loads(r["domain_tags"])


def test_approve_already_resolved_returns_none(tmp_data_dir):
    init_db()
    pid = create_tag_alias_proposal("a", "b", confidence=0.6)
    approve_tag_alias(pid)
    assert approve_tag_alias(pid) is None  # second approve — no-op


def test_reject_leaves_memories_untouched(tmp_data_dir):
    init_db()
    m = create_memory("knowledge", "x", domain_tags=["fw"])
    pid = create_tag_alias_proposal("fw", "firmware", confidence=0.6)
    assert reject_tag_alias(pid)

    rows = get_tag_alias_proposals(status="rejected")
    assert len(rows) == 1

    # Memory still has the original tag
    with get_connection() as conn:
        r = conn.execute("SELECT domain_tags FROM memories WHERE id = ?", (m.id,)).fetchone()
    assert "fw" in json.loads(r["domain_tags"])


# --- Schema sanity ---


def test_tag_aliases_table_exists(tmp_data_dir):
    init_db()
    with get_connection() as conn:
        cols = [r["name"] for r in conn.execute("PRAGMA table_info(tag_aliases)").fetchall()]
    expected = {"id", "alias", "canonical", "status", "confidence",
                "alias_count", "canonical_count", "reason",
                "applied_to_memories", "created_at", "resolved_at", "resolved_by"}
    assert expected.issubset(set(cols))
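The normalizer tests above pin `normalize_alias_item`'s contract in full: strip and lowercase, reject empty or identity pairs, refuse to touch project tokens, clamp confidence to [0, 1], and treat non-numeric confidence as 0.0. A sketch consistent with those assertions; the protected-token set shown is an illustrative subset of the real `PROTECTED_PROJECT_TOKENS`:

```python
# Illustrative subset only — the real module defines the full token set.
PROTECTED_PROJECT_TOKENS = {"p04", "p04-gigabit", "p05-interferometer", "p06", "apm"}


def normalize_alias_item(item):
    """Return a cleaned {alias, canonical, confidence[, reason]} dict,
    or None when the proposal is unusable (empty, identity, or a project token)."""
    alias = str(item.get("alias", "")).strip().lower()
    canonical = str(item.get("canonical", "")).strip().lower()
    if not alias or not canonical or alias == canonical:
        return None
    # Project ids are their own namespace — never rewrite them, either direction
    if alias in PROTECTED_PROJECT_TOKENS or canonical in PROTECTED_PROJECT_TOKENS:
        return None
    try:
        confidence = float(item.get("confidence", 0.0))
    except (TypeError, ValueError):
        confidence = 0.0  # non-numeric confidence degrades to zero
    confidence = max(0.0, min(1.0, confidence))  # clamp into [0, 1]
    out = {"alias": alias, "canonical": canonical, "confidence": confidence}
    if "reason" in item:
        out["reason"] = item["reason"]
    return out
```

Returning `None` for every bad case keeps the caller to a single `if normalized is not None` filter over the LLM's proposals.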
219	tests/test_triage_escalation.py	Normal file
@@ -0,0 +1,219 @@
"""Tests for 3-tier triage escalation logic (Phase Triage Quality).

The actual LLM calls are gated by ``shutil.which('claude')`` and can't be
exercised in CI without the CLI, so we mock the tier functions directly
and verify the control flow (escalation routing, discard vs. human, project
misattribution, metadata update).
"""

from __future__ import annotations

import sys
from pathlib import Path
from unittest import mock

import pytest

# Import the script as a module for unit testing
_SCRIPTS = str(Path(__file__).resolve().parent.parent / "scripts")
if _SCRIPTS not in sys.path:
    sys.path.insert(0, _SCRIPTS)

import auto_triage  # noqa: E402


@pytest.fixture(autouse=True)
def reset_thresholds(monkeypatch):
    """Make sure env-var overrides don't leak between tests."""
    monkeypatch.setattr(auto_triage, "AUTO_PROMOTE_MIN_CONFIDENCE", 0.8)
    monkeypatch.setattr(auto_triage, "ESCALATION_CONFIDENCE_THRESHOLD", 0.75)
    monkeypatch.setattr(auto_triage, "TIER3_ACTION", "discard")
    monkeypatch.setattr(auto_triage, "TIER1_MODEL", "sonnet")
    monkeypatch.setattr(auto_triage, "TIER2_MODEL", "opus")


def test_parse_verdict_captures_suggested_project():
    raw = '{"verdict": "promote", "confidence": 0.9, "reason": "clear", "suggested_project": "p04-gigabit"}'
    v = auto_triage.parse_verdict(raw)
    assert v["verdict"] == "promote"
    assert v["suggested_project"] == "p04-gigabit"


def test_parse_verdict_defaults_suggested_project_to_empty():
    raw = '{"verdict": "reject", "confidence": 0.9, "reason": "dup"}'
    v = auto_triage.parse_verdict(raw)
    assert v["suggested_project"] == ""


def test_high_confidence_tier1_promote_no_escalation():
    """Tier 1 confident promote → no tier 2 call."""
    cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p-test"}

    with mock.patch("auto_triage.triage_one") as t1, \
         mock.patch("auto_triage.triage_escalation") as t2, \
         mock.patch("auto_triage.api_post"), \
         mock.patch("auto_triage._apply_metadata_update"):
        t1.return_value = {
            "verdict": "promote", "confidence": 0.95, "reason": "clear",
            "domain_tags": [], "valid_until": "", "suggested_project": "",
        }
        action, _ = auto_triage.process_candidate(
            cand, "http://fake", {"p-test": []}, {"p-test": []},
            {"p-test": []}, dry_run=False,
        )
    assert action == "promote"
    t2.assert_not_called()


def test_high_confidence_tier1_reject_no_escalation():
    cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p-test"}

    with mock.patch("auto_triage.triage_one") as t1, \
         mock.patch("auto_triage.triage_escalation") as t2, \
         mock.patch("auto_triage.api_post"):
        t1.return_value = {
            "verdict": "reject", "confidence": 0.9, "reason": "duplicate",
            "domain_tags": [], "valid_until": "", "suggested_project": "",
        }
        action, _ = auto_triage.process_candidate(
            cand, "http://fake", {"p-test": []}, {"p-test": []},
            {"p-test": []}, dry_run=False,
)
|
||||
assert action == "reject"
|
||||
t2.assert_not_called()
|
||||
|
||||
|
||||
def test_low_confidence_escalates_to_tier2():
|
||||
"""Tier 1 low confidence → tier 2 is consulted."""
|
||||
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p-test"}
|
||||
|
||||
with mock.patch("auto_triage.triage_one") as t1, \
|
||||
mock.patch("auto_triage.triage_escalation") as t2, \
|
||||
mock.patch("auto_triage.api_post"), \
|
||||
mock.patch("auto_triage._apply_metadata_update"):
|
||||
t1.return_value = {
|
||||
"verdict": "promote", "confidence": 0.6, "reason": "maybe",
|
||||
"domain_tags": [], "valid_until": "", "suggested_project": "",
|
||||
}
|
||||
t2.return_value = {
|
||||
"verdict": "promote", "confidence": 0.9, "reason": "opus agrees",
|
||||
"domain_tags": [], "valid_until": "", "suggested_project": "",
|
||||
}
|
||||
action, note = auto_triage.process_candidate(
|
||||
cand, "http://fake", {"p-test": []}, {"p-test": []},
|
||||
{"p-test": []}, dry_run=False,
|
||||
)
|
||||
assert action == "promote"
|
||||
assert "opus" in note
|
||||
t2.assert_called_once()
|
||||
|
||||
|
||||
def test_needs_human_tier1_always_escalates():
|
||||
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p-test"}
|
||||
|
||||
with mock.patch("auto_triage.triage_one") as t1, \
|
||||
mock.patch("auto_triage.triage_escalation") as t2, \
|
||||
mock.patch("auto_triage.api_post"):
|
||||
t1.return_value = {
|
||||
"verdict": "needs_human", "confidence": 0.5, "reason": "uncertain",
|
||||
"domain_tags": [], "valid_until": "", "suggested_project": "",
|
||||
}
|
||||
t2.return_value = {
|
||||
"verdict": "reject", "confidence": 0.88, "reason": "opus decided",
|
||||
"domain_tags": [], "valid_until": "", "suggested_project": "",
|
||||
}
|
||||
action, _ = auto_triage.process_candidate(
|
||||
cand, "http://fake", {"p-test": []}, {"p-test": []},
|
||||
{"p-test": []}, dry_run=False,
|
||||
)
|
||||
assert action == "reject"
|
||||
t2.assert_called_once()
|
||||
|
||||
|
||||
def test_tier2_uncertain_leads_to_discard_by_default(monkeypatch):
|
||||
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p-test"}
|
||||
monkeypatch.setattr(auto_triage, "TIER3_ACTION", "discard")
|
||||
|
||||
with mock.patch("auto_triage.triage_one") as t1, \
|
||||
mock.patch("auto_triage.triage_escalation") as t2, \
|
||||
mock.patch("auto_triage.api_post") as api_post:
|
||||
t1.return_value = {
|
||||
"verdict": "needs_human", "confidence": 0.4, "reason": "unclear",
|
||||
"domain_tags": [], "valid_until": "", "suggested_project": "",
|
||||
}
|
||||
t2.return_value = {
|
||||
"verdict": "needs_human", "confidence": 0.5, "reason": "still unclear",
|
||||
"domain_tags": [], "valid_until": "", "suggested_project": "",
|
||||
}
|
||||
action, _ = auto_triage.process_candidate(
|
||||
cand, "http://fake", {"p-test": []}, {"p-test": []},
|
||||
{"p-test": []}, dry_run=False,
|
||||
)
|
||||
assert action == "discard"
|
||||
# Should have called reject on the API
|
||||
api_post.assert_called_once()
|
||||
assert "reject" in api_post.call_args.args[1]
|
||||
|
||||
|
||||
def test_tier2_uncertain_goes_to_human_when_configured(monkeypatch):
|
||||
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p-test"}
|
||||
monkeypatch.setattr(auto_triage, "TIER3_ACTION", "human")
|
||||
|
||||
with mock.patch("auto_triage.triage_one") as t1, \
|
||||
mock.patch("auto_triage.triage_escalation") as t2, \
|
||||
mock.patch("auto_triage.api_post") as api_post:
|
||||
t1.return_value = {
|
||||
"verdict": "needs_human", "confidence": 0.4, "reason": "unclear",
|
||||
"domain_tags": [], "valid_until": "", "suggested_project": "",
|
||||
}
|
||||
t2.return_value = {
|
||||
"verdict": "needs_human", "confidence": 0.5, "reason": "still unclear",
|
||||
"domain_tags": [], "valid_until": "", "suggested_project": "",
|
||||
}
|
||||
action, _ = auto_triage.process_candidate(
|
||||
cand, "http://fake", {"p-test": []}, {"p-test": []},
|
||||
{"p-test": []}, dry_run=False,
|
||||
)
|
||||
assert action == "human"
|
||||
# Should NOT have touched the API — leave candidate in queue
|
||||
api_post.assert_not_called()
|
||||
|
||||
|
||||
def test_dry_run_does_not_call_api():
|
||||
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p-test"}
|
||||
|
||||
with mock.patch("auto_triage.triage_one") as t1, \
|
||||
mock.patch("auto_triage.api_post") as api_post:
|
||||
t1.return_value = {
|
||||
"verdict": "promote", "confidence": 0.9, "reason": "clear",
|
||||
"domain_tags": [], "valid_until": "", "suggested_project": "",
|
||||
}
|
||||
action, _ = auto_triage.process_candidate(
|
||||
cand, "http://fake", {"p-test": []}, {"p-test": []},
|
||||
{"p-test": []}, dry_run=True,
|
||||
)
|
||||
assert action == "promote"
|
||||
api_post.assert_not_called()
|
||||
|
||||
|
||||
def test_misattribution_flagged_when_suggestion_differs(capsys):
|
||||
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p04-gigabit"}
|
||||
|
||||
with mock.patch("auto_triage.triage_one") as t1, \
|
||||
mock.patch("auto_triage.api_post"), \
|
||||
mock.patch("auto_triage._apply_metadata_update"):
|
||||
t1.return_value = {
|
||||
"verdict": "promote", "confidence": 0.9, "reason": "clear",
|
||||
"domain_tags": [], "valid_until": "",
|
||||
"suggested_project": "p05-interferometer",
|
||||
}
|
||||
auto_triage.process_candidate(
|
||||
cand, "http://fake",
|
||||
{"p04-gigabit": [], "p05-interferometer": []},
|
||||
{"p04-gigabit": [], "p05-interferometer": []},
|
||||
{"p04-gigabit": [], "p05-interferometer": []},
|
||||
dry_run=True,
|
||||
)
|
||||
out = capsys.readouterr().out
|
||||
assert "misattribution" in out
|
||||
assert "p05-interferometer" in out
|
||||
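The control flow these tests pin down can be sketched in isolation. This is a minimal, hypothetical re-statement of the routing — `tier1` and `tier2` stand in for the mocked `triage_one`/`triage_escalation` calls, and the threshold/action constants mirror the fixture values; the real `auto_triage.process_candidate` takes more arguments and also handles API posts and metadata updates:

```python
# Assumed values, matching the reset_thresholds fixture above.
ESCALATION_CONFIDENCE_THRESHOLD = 0.75
TIER3_ACTION = "discard"  # or "human": leave the candidate in the queue


def route_candidate(cand, tier1, tier2):
    """Sketch of 3-tier routing: escalate on low confidence or needs_human."""
    v = tier1(cand)
    needs_tier2 = (
        v["verdict"] == "needs_human"
        or v["confidence"] < ESCALATION_CONFIDENCE_THRESHOLD
    )
    if needs_tier2:
        v = tier2(cand)
        if v["verdict"] == "needs_human":
            # Tier 2 is still unsure — the tier 3 policy decides the outcome.
            return TIER3_ACTION
    return v["verdict"]
```

A confident tier-1 verdict short-circuits (tier 2 is never called), which is exactly what `t2.assert_not_called()` checks above.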
61
tests/test_v1_aliases.py
Normal file
@@ -0,0 +1,61 @@
"""Tests for /v1 API aliases — stable contract for external clients."""

from fastapi.testclient import TestClient

from atocore.main import app


def test_v1_health_alias_matches_unversioned():
    client = TestClient(app)
    unversioned = client.get("/health")
    versioned = client.get("/v1/health")
    assert unversioned.status_code == 200
    assert versioned.status_code == 200
    assert versioned.json()["build_sha"] == unversioned.json()["build_sha"]


def test_v1_projects_alias_returns_same_shape():
    client = TestClient(app)
    unversioned = client.get("/projects")
    versioned = client.get("/v1/projects")
    assert unversioned.status_code == 200
    assert versioned.status_code == 200
    assert versioned.json() == unversioned.json()


def test_v1_entities_get_alias_reachable(tmp_data_dir):
    from atocore.engineering.service import init_engineering_schema

    init_engineering_schema()
    client = TestClient(app)
    response = client.get("/v1/entities")
    assert response.status_code == 200
    body = response.json()
    assert "entities" in body


def test_v1_paths_appear_in_openapi():
    client = TestClient(app)
    spec = client.get("/openapi.json").json()
    paths = spec["paths"]
    expected = [
        "/v1/entities",
        "/v1/relationships",
        "/v1/ingest",
        "/v1/context/build",
        "/v1/projects",
        "/v1/memory",
        "/v1/projects/{project_name}/mirror",
        "/v1/health",
    ]
    for path in expected:
        assert path in paths, f"{path} missing from OpenAPI spec"


def test_unversioned_paths_still_present_in_openapi():
    """Regression: /v1 aliases must not displace the original paths."""
    client = TestClient(app)
    spec = client.get("/openapi.json").json()
    paths = spec["paths"]
    for path in ("/entities", "/projects", "/health", "/context/build"):
        assert path in paths, f"unversioned {path} missing — aliases must coexist"
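The coexistence property the regression test locks down (aliases added alongside, never replacing, the unversioned paths) can be sketched abstractly over a path set. This is a hypothetical helper, not the app's actual registration code, which presumably re-registers each route on the FastAPI app:

```python
def with_v1_aliases(paths):
    """Return the route-path set extended with /v1-prefixed aliases.

    Originals are kept; already-versioned paths are left alone, so the
    operation is idempotent.
    """
    aliased = set(paths)
    for p in paths:
        if not p.startswith("/v1/"):
            aliased.add("/v1" + p)
    return aliased
```

Registering aliases as separate routes (rather than rewriting paths) is what keeps both `/health` and `/v1/health` in the OpenAPI spec at once.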
263
tests/test_wiki_pages.py
Normal file
@@ -0,0 +1,263 @@
"""Tests for the new wiki pages shipped in the UI refresh:

- /wiki/capture (7I follow-up)
- /wiki/memories/{id} (7E)
- /wiki/domains/{tag} (7F)
- /wiki/activity (activity feed)
- home refresh (topnav + activity snippet)
"""

from __future__ import annotations

import pytest

from atocore.engineering.wiki import (
    render_activity,
    render_capture,
    render_domain,
    render_homepage,
    render_memory_detail,
)
from atocore.engineering.service import init_engineering_schema
from atocore.memory.service import create_memory
from atocore.models.database import init_db


def _init_all():
    """Wiki pages read from both the memory and engineering schemas, so
    tests need both initialized (the engineering schema is a separate
    init_engineering_schema() call)."""
    init_db()
    init_engineering_schema()


def test_capture_page_renders_as_fallback(tmp_data_dir):
    _init_all()
    html = render_capture()
    # Page is reachable but now labeled as a fallback, not promoted
    assert "fallback only" in html
    assert "sanctioned capture surfaces are Claude Code" in html
    # Form inputs still exist for emergency use
    assert "cap-prompt" in html
    assert "cap-response" in html


def test_capture_not_in_topnav(tmp_data_dir):
    """The paste form should NOT appear in topnav — it's not the sanctioned path."""
    _init_all()
    html = render_homepage()
    assert "/wiki/capture" not in html
    assert "📥 Capture" not in html


def test_memory_detail_renders(tmp_data_dir):
    _init_all()
    m = create_memory(
        "knowledge", "APM uses NX bridge for DXF → STL",
        project="apm", confidence=0.7, domain_tags=["apm", "nx", "cad"],
    )
    html = render_memory_detail(m.id)
    assert html is not None
    assert "APM uses NX" in html
    assert "Audit trail" in html
    # Tag links go to domain pages
    assert '/wiki/domains/apm' in html
    assert '/wiki/domains/nx' in html
    # Project link present
    assert '/wiki/projects/apm' in html


def test_memory_detail_404(tmp_data_dir):
    _init_all()
    assert render_memory_detail("nonexistent-id") is None


def test_domain_page_lists_memories(tmp_data_dir):
    _init_all()
    create_memory("knowledge", "optics fact 1", project="p04-gigabit",
                  domain_tags=["optics"])
    create_memory("knowledge", "optics fact 2", project="p05-interferometer",
                  domain_tags=["optics", "metrology"])
    create_memory("knowledge", "other", project="p06-polisher",
                  domain_tags=["firmware"])

    html = render_domain("optics")
    assert "Domain: <code>optics</code>" in html
    assert "p04-gigabit" in html
    assert "p05-interferometer" in html
    assert "optics fact 1" in html
    assert "optics fact 2" in html
    # Unrelated memory should NOT appear
    assert "other" not in html or "firmware" not in html


def test_domain_page_empty(tmp_data_dir):
    _init_all()
    html = render_domain("definitely-not-a-tag")
    assert "No memories currently carry" in html


def test_domain_page_normalizes_tag(tmp_data_dir):
    _init_all()
    create_memory("knowledge", "x", domain_tags=["firmware"])
    # Case-insensitive
    assert "firmware" in render_domain("FIRMWARE")
    # Whitespace tolerant
    assert "firmware" in render_domain(" firmware ")


def test_activity_feed_renders(tmp_data_dir):
    _init_all()
    m = create_memory("knowledge", "activity test")
    html = render_activity()
    assert "Activity Feed" in html
    # The newly-created memory should appear as a "created" event
    assert "created" in html
    # Short timestamp format
    assert m.id[:8] in html


def test_activity_feed_groups_by_action_and_actor(tmp_data_dir):
    _init_all()
    for i in range(3):
        create_memory("knowledge", f"m{i}", actor="test-actor")

    html = render_activity()
    # Summary row should show "created: 3" or similar
    assert "created" in html
    assert "test-actor" in html


def test_homepage_has_topnav_and_activity(tmp_data_dir):
    _init_all()
    create_memory("knowledge", "homepage test")
    html = render_homepage()
    # Topnav with expected items (Capture removed — it's not sanctioned capture)
    assert "🏠 Home" in html
    assert "📡 Activity" in html
    assert "/wiki/activity" in html
    assert "/wiki/capture" not in html
    # Activity snippet
    assert "What the brain is doing" in html


def test_memory_detail_shows_superseded_sources(tmp_data_dir):
    """After a merge, sources go to status=superseded. Detail page should
    still render them."""
    from atocore.memory.service import (
        create_merge_candidate, merge_memories,
    )
    _init_all()
    m1 = create_memory("knowledge", "alpha variant 1", project="test")
    m2 = create_memory("knowledge", "alpha variant 2", project="test")
    cid = create_merge_candidate(
        memory_ids=[m1.id, m2.id], similarity=0.9,
        proposed_content="alpha merged",
        proposed_memory_type="knowledge", proposed_project="test",
    )
    merge_memories(cid, actor="auto-dedup-tier1")

    # Source detail page should render and show the superseded status
    html1 = render_memory_detail(m1.id)
    assert html1 is not None
    assert "superseded" in html1
    assert "auto-dedup-tier1" in html1  # audit trail shows who merged


# -------------------------------------------------- low-signal wiki filters
# Ambient AKC session memories and test pollution shouldn't dominate domain
# pages / homepage counts. These tests lock the partitioning behaviour.

def test_domain_page_hides_empty_transcript_sessions(tmp_data_dir):
    """Silent-mic AKC sessions (content has '(no transcript)') are ambient
    noise — they go into the hidden count, not the main list."""
    _init_all()
    # One real knowledge memory with tag "optics"
    create_memory(
        "knowledge",
        "CGH null corrector supports F/1.2 asphere testing",
        project="p05", confidence=0.9, domain_tags=["optics", "cgh"],
    )
    # One silent AKC session with the same tag — should NOT appear
    create_memory(
        "episodic",
        "AKC voice session abc (gen-002)\nDuration: 60s, 2 captures\n"
        "\n## Transcript\n(no transcript)\n",
        project="p05", confidence=0.7,
        domain_tags=["optics", "session", "akc", "voice"],
    )
    html = render_domain("optics")
    assert "CGH null corrector" in html
    # The hidden-count banner should be present
    assert "low-signal" in html or "Ambient provenance" in html
    # And the empty-transcript content itself is not rendered inline
    assert "(no transcript)" not in html


def test_domain_page_collapses_akc_session_snapshots(tmp_data_dir):
    """AKC voice-session memories are provenance records — count them as
    a single collapsed link, don't inline every one."""
    _init_all()
    for i in range(5):
        create_memory(
            "episodic",
            f"AKC voice session session-{i} (gen-00{i})\nDuration: 120s, 3 captures\n"
            f"\n## Transcript\nReal transcript number {i}",
            project="p05", confidence=0.7,
            domain_tags=["optics", "session", "akc", "voice"],
        )
    html = render_domain("optics")
    # Inline count should mention AKC session snapshots
    assert "AKC voice session snapshots" in html
    # None of the session transcripts should be pasted inline on the domain
    # page (they're provenance, linked via /wiki/activity)
    assert "Real transcript number 0" not in html


def test_homepage_stats_exclude_ambient_memory(tmp_data_dir):
    """Homepage system-stats line shows real memory count, pushes ambient
    counts into a dimmed sub-segment."""
    _init_all()
    # 2 real memories + 3 ambient sessions + 1 silent junk
    create_memory("knowledge", "Real fact 1", project="p05", confidence=0.8)
    create_memory("knowledge", "Real fact 2", project="p05", confidence=0.8)
    for i in range(3):
        create_memory(
            "episodic",
            f"AKC voice session s{i} (gen-00{i})\nReal transcript x",
            project="p05", confidence=0.7,
            domain_tags=["session", "akc", "voice"],
        )
    create_memory(
        "episodic",
        "AKC voice session silent (gen-099)\nDuration: 30s, 0 captures\n"
        "\n## Transcript\n(no transcript)\n",
        project="p05", confidence=0.7,
        domain_tags=["session", "akc", "voice"],
    )
    html = render_homepage()
    assert "3 AKC session snapshots" in html
    assert "low-signal hidden" in html
    # Main count reflects only real knowledge
    assert "2 memories" in html


def test_low_signal_predicate_catches_known_patterns():
    from atocore.engineering.wiki import _is_low_signal_memory, _is_akc_session_memory
    from dataclasses import dataclass

    @dataclass
    class M:
        content: str = ""
        domain_tags: list = None

    # Explicit empty-transcript — low signal
    assert _is_low_signal_memory(M(content="AKC voice session x\n## Transcript\n(no transcript)\n"))
    # E2E test pollution — low signal
    assert _is_low_signal_memory(M(content="IMG integration test — synthetic session"))
    assert _is_low_signal_memory(M(content="synthetic AKC integration session"))
    # Real knowledge — NOT low signal
    assert not _is_low_signal_memory(M(content="The CGH is mounted to the fold mirror via…"))
    # AKC session tag predicate
    assert _is_akc_session_memory(M(content="anything", domain_tags=["session", "akc", "voice"]))
    assert _is_akc_session_memory(M(content="AKC voice session abc"))
    assert not _is_akc_session_memory(M(content="Real fact", domain_tags=["optics"]))
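The partitioning predicates exercised above can be approximated with simple content/tag heuristics. This is a hypothetical sketch consistent with the test cases, not the actual `_is_low_signal_memory`/`_is_akc_session_memory` implementations in `atocore.engineering.wiki`, which may use different markers:

```python
# Assumed markers, inferred from the test fixtures above.
_EMPTY_TRANSCRIPT = "(no transcript)"
_SYNTHETIC_MARKERS = ("integration test", "synthetic")
_AKC_TAGS = {"session", "akc", "voice"}


def is_low_signal(content: str) -> bool:
    """Silent-mic sessions and E2E test pollution count as low signal."""
    text = content.lower()
    if _EMPTY_TRANSCRIPT in text:
        return True
    return any(marker in text for marker in _SYNTHETIC_MARKERS)


def is_akc_session(content: str, domain_tags=None) -> bool:
    """Provenance snapshots: either fully AKC-tagged or AKC-titled."""
    if domain_tags and _AKC_TAGS.issubset(set(domain_tags)):
        return True
    return content.startswith("AKC voice session")
```

Keeping the predicates pure functions of (content, tags) is what lets both the domain page and the homepage stats reuse the same partition.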
132
tests/test_wikilinks.py
Normal file
@@ -0,0 +1,132 @@
"""Issue B — wikilinks with redlinks + cross-project resolution."""

import pytest
from fastapi.testclient import TestClient

from atocore.engineering.service import (
    create_entity,
    init_engineering_schema,
)
from atocore.engineering.wiki import (
    _resolve_wikilink,
    _wikilink_transform,
    render_entity,
    render_new_entity_form,
    render_project,
)
from atocore.main import app
from atocore.models.database import init_db


@pytest.fixture
def env(tmp_data_dir, tmp_path, monkeypatch):
    registry_path = tmp_path / "test-registry.json"
    registry_path.write_text('{"projects": []}', encoding="utf-8")
    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
    from atocore import config
    config.settings = config.Settings()
    init_db()
    init_engineering_schema()
    yield tmp_data_dir


def test_resolve_wikilink_same_project_is_live(env):
    tower = create_entity(entity_type="component", name="Tower", project="p05")
    href, cls, _ = _resolve_wikilink("Tower", current_project="p05")
    assert href == f"/wiki/entities/{tower.id}"
    assert cls == "wikilink"


def test_resolve_wikilink_missing_is_redlink(env):
    href, cls, suffix = _resolve_wikilink("DoesNotExist", current_project="p05")
    assert "/wiki/new" in href
    assert "name=DoesNotExist" in href
    assert cls == "redlink"


def test_resolve_wikilink_cross_project_indicator(env):
    other = create_entity(entity_type="material", name="Invar", project="p06")
    href, cls, suffix = _resolve_wikilink("Invar", current_project="p05")
    assert href == f"/wiki/entities/{other.id}"
    assert "wikilink-cross" in cls
    assert "in p06" in suffix


def test_resolve_wikilink_case_insensitive(env):
    tower = create_entity(entity_type="component", name="Tower", project="p05")
    href, cls, _ = _resolve_wikilink("tower", current_project="p05")
    assert href == f"/wiki/entities/{tower.id}"
    assert cls == "wikilink"


def test_transform_replaces_brackets_with_anchor(env):
    create_entity(entity_type="component", name="Base Plate", project="p05")
    out = _wikilink_transform("See [[Base Plate]] for details.", current_project="p05")
    assert '<a href="/wiki/entities/' in out
    assert 'class="wikilink"' in out
    assert "[[Base Plate]]" not in out


def test_transform_redlink_for_missing(env):
    out = _wikilink_transform("Mentions [[Ghost]] nowhere.", current_project="p05")
    assert 'class="redlink"' in out
    assert "/wiki/new?name=Ghost" in out


def test_transform_alias_syntax(env):
    tower = create_entity(entity_type="component", name="Tower", project="p05")
    out = _wikilink_transform("The [[Tower|big tower]] is tall.", current_project="p05")
    assert f'href="/wiki/entities/{tower.id}"' in out
    assert ">big tower<" in out


def test_render_entity_description_has_redlink(env):
    a = create_entity(
        entity_type="component",
        name="EntityA",
        project="p05",
        description="This depends on [[MissingPart]] which does not exist.",
    )
    html = render_entity(a.id)
    assert 'class="redlink"' in html
    assert "/wiki/new?name=MissingPart" in html


def test_regression_redlink_becomes_live_once_target_created(env):
    a = create_entity(
        entity_type="component",
        name="EntityA",
        project="p05",
        description="Connected to [[EntityB]].",
    )
    # Pre-create: redlink.
    html_before = render_entity(a.id)
    assert 'class="redlink"' in html_before

    b = create_entity(entity_type="component", name="EntityB", project="p05")
    html_after = render_entity(a.id)
    assert 'class="redlink"' not in html_after
    assert f"/wiki/entities/{b.id}" in html_after


def test_new_entity_form_prefills_name():
    html = render_new_entity_form(name="FreshEntity", project="p05")
    assert 'value="FreshEntity"' in html
    assert 'value="p05"' in html
    assert "entity_type" in html
    assert 'method="post"' not in html  # JS-driven


def test_wiki_new_route_renders(env):
    client = TestClient(app)
    r = client.get("/wiki/new?name=NewThing&project=p05")
    assert r.status_code == 200
    assert "NewThing" in r.text
    assert "Create entity" in r.text


def test_wiki_new_url_escapes_special_chars(env):
    # "steel (likely)" is the kind of awkward name AKC produces
    href, cls, _ = _resolve_wikilink("steel (likely)", current_project="p05")
    assert cls == "redlink"
    assert "name=steel%20%28likely%29" in href
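The `[[Target]]` / `[[Target|label]]` rewriting these tests cover can be sketched with one regex plus a resolver callback. This is a hypothetical simplification of `_wikilink_transform`: here `resolve(name)` returns `(href, css_class)` or `None` for a redlink, whereas the real function also handles case-insensitive lookup and cross-project suffixes:

```python
import re
from urllib.parse import quote

# Matches [[Target]] and [[Target|display label]].
_WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")


def wikilink_transform(text, resolve):
    """Replace wikilink syntax with <a> anchors; unknown targets become redlinks."""
    def repl(m):
        target = m.group(1).strip()
        label = m.group(2) or target
        hit = resolve(target)
        if hit is None:
            # Redlink: point at the creation form, URL-escaping the name
            # (so "steel (likely)" becomes name=steel%20%28likely%29).
            href = "/wiki/new?name=" + quote(target, safe="")
            return f'<a href="{href}" class="redlink">{label}</a>'
        href, cls = hit
        return f'<a href="{href}" class="{cls}">{label}</a>'

    return _WIKILINK.sub(repl, text)
```

Because resolution happens at render time, a redlink "heals" into a live link as soon as the target entity exists, which is the regression the last rendering test locks in.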