feat: post-backup validation + retention cleanup (Tasks B & C)

- create_runtime_backup() now auto-validates its output and includes
  validated/validation_errors fields in returned metadata
- New cleanup_old_backups() with retention policy: 7 daily, 4 weekly
  (Sundays), 6 monthly (1st of month), dry-run by default
- CLI `cleanup` subcommand added to backup module
- 9 new tests (2 validation + 7 retention), 259 total passing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: token-overlap matcher for reinforcement (Phase 9B)
2026-04-11 09:46:46 -04:00 · 2026-04-11 09:40:05 -04:00 · 2026-04-11 09:17:21 -04:00 · 2026-04-11 09:00:42 -04:00 · 2026-04-09 09:13:21 -04:00 · 2026-04-08 21:17:48 -04:00
41 changed files with 8130 additions and 79 deletions
--- a/.claude/commands/atocore-context.md
+++ b/.claude/commands/atocore-context.md
@@ -0,0 +1,159 @@
+---
+description: Pull a context pack from the live AtoCore service for the current prompt
+argument-hint: <prompt text> [project-id]
+---
+
+You are about to enrich a user prompt with context from the live
+AtoCore service. This is the daily-use entry point for AtoCore from
+inside Claude Code.
+
+The work happens via the **shared AtoCore operator client** at
+`scripts/atocore_client.py`. That client is the canonical Python
+backbone for stable AtoCore operations and is meant to be reused by
+every LLM client (OpenClaw helper, future Codex skill, etc.) — see
+`docs/architecture/llm-client-integration.md` for the layering. This
+slash command is a thin Claude Code-specific frontend on top of it.
+
+## Step 1 — parse the arguments
+
+The user invoked `/atocore-context` with:
+
+```
+$ARGUMENTS
+```
+
+You need to figure out two things:
+
+1. The **prompt text** — what AtoCore will retrieve context for
+2. An **optional project hint** — used to scope retrieval to a
+   specific project's trusted state and corpus
+
+The user may have passed a project id or alias as the **last
+whitespace-separated token**. Don't maintain a hardcoded list of
+known aliases — let the shared client decide. Use this rule:
+
+- Take the last token of `$ARGUMENTS`. Call it `MAYBE_HINT`.
+- Run `python scripts/atocore_client.py detect-project "$MAYBE_HINT"`
+  to ask the registry whether it's a known project id or alias.
+  This call is cheap (it just hits `/projects` and does a regex
+  match) and inherits the client's fail-open behavior.
+- If the response has a non-null `matched_project`, the last
+  token was an explicit project hint. `PROMPT_TEXT` is everything
+  except the last token; `PROJECT_HINT` is the matched canonical
+  project id.
+- Otherwise the last token is just part of the prompt.
+  `PROMPT_TEXT` is the full `$ARGUMENTS`; `PROJECT_HINT` is empty.
+
+This delegates the alias-knowledge to the registry instead of
+embedding a stale list in this markdown file. When you add a new
+project to the registry, the slash command picks it up
+automatically with no edits here.
+
+## Step 2 — call the shared client for the context pack
+
+The server resolves project hints through the registry before
+looking up trusted state, so you can pass either the canonical id
+or any alias to `context-build` and the trusted state lookup will
+work either way. (Regression test:
+`tests/test_context_builder.py::test_alias_hint_resolves_through_registry`.)
+
+**If `PROJECT_HINT` is non-empty**, call `context-build` directly
+with that hint:
+
+```bash
+python scripts/atocore_client.py context-build \
+  "$PROMPT_TEXT" \
+  "$PROJECT_HINT"
+```
+
+**If `PROJECT_HINT` is empty**, do the 2-step fallback dance so the
+user always gets a context pack regardless of whether the prompt
+implies a project:
+
+```bash
+# Try project auto-detection first.
+RESULT=$(python scripts/atocore_client.py auto-context "$PROMPT_TEXT")
+
+# If auto-context could not detect a project it returns a small
+# {"status": "no_project_match", ...} envelope. In that case fall
+# back to a corpus-wide context build with no project hint, which
+# is the right behaviour for cross-project or generic prompts like
+# "what changed in AtoCore backup policy this week?"
+if echo "$RESULT" | grep -q '"no_project_match"'; then
+  RESULT=$(python scripts/atocore_client.py context-build "$PROMPT_TEXT")
+fi
+
+echo "$RESULT"
+```
+
+This is the fix for the P2 finding from codex's review: previously
+the slash command sent every no-hint prompt through `auto-context`
+and returned `no_project_match` to the user with no context, even
+though the underlying client's `context-build` subcommand has
+always supported corpus-wide context builds.
+
+In both branches the response is the JSON payload from
+`/context/build` (or, in the rare case where even the corpus-wide
+build fails, a `{"status": "unavailable"}` envelope from the
+client's fail-open layer).
+
+## Step 3 — present the context pack to the user
+
+The successful response contains at least:
+
+- `formatted_context` — the assembled context block AtoCore would
+  feed an LLM
+- `chunks_used`, `total_chars`, `budget`, `budget_remaining`,
+  `duration_ms`
+- `chunks` — array of source documents that contributed, each with
+  `source_file`, `heading_path`, `score`
+
+Render in this order:
+
+1. A one-line stats banner: `chunks=N, chars=X/budget, duration=Yms`
+2. The `formatted_context` block verbatim inside a fenced text code
+   block so the user can read what AtoCore would feed an LLM
+3. The `chunks` array as a small bullet list with `source_file`,
+   `heading_path`, and `score` per chunk
+
+Two special cases:
+
+- **`{"status": "unavailable"}`** (fail-open from the client)
+  → Tell the user: "AtoCore is unreachable at `$ATOCORE_BASE_URL`.
+  Check `python scripts/atocore_client.py health` for diagnostics."
+- **`chunks_used: 0` with no project state and no memories**
+  → Tell the user: "AtoCore returned no context for this prompt —
+  either the corpus does not have relevant information or the
+  project hint is wrong. Try a different hint or a longer prompt."
+
+## Step 4 — what about capturing the interaction
+
+Capture (Phase 9 Commit A) and the rest of the reflection loop
+(reinforcement, extraction, review queue) are intentionally NOT
+exposed by the shared client yet. The contracts are stable but the
+workflow ergonomics are not, so the daily-use slash command stays
+focused on context retrieval until those review flows have been
+exercised in real use. See `docs/architecture/llm-client-integration.md`
+for the deferral rationale.
+
+When capture is added to the shared client, this slash command will
+gain a follow-up `/atocore-record-response` companion command that
+posts the LLM's response back to the same interaction. That work is
+queued.
+
+## Notes for the assistant
+
+- DO NOT bypass the shared client by calling curl yourself. The
+  client is the contract between AtoCore and every LLM frontend; if
+  you find a missing capability, the right fix is to extend the
+  client, not to work around it.
+- DO NOT maintain a hardcoded list of project aliases in this
+  file. Use `detect-project` to ask the registry — that's the
+  whole point of having a registry.
+- DO NOT silently change `ATOCORE_BASE_URL`. If the env var points
+  at the wrong instance, surface the error so the user can fix it.
+- DO NOT hide the formatted context pack from the user. Showing
+  what AtoCore would feed an LLM is the whole point.
+- The output goes into the user's working context as background;
+  they may follow up with their actual question, and the AtoCore
+  context pack acts as informal injected knowledge.
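The last-token rule from Step 1 of the slash command can be sketched in Python as follows. This is only an illustration, not code from the commit: `split_prompt_and_hint`, `fake_detect`, and the in-memory registry are hypothetical stand-ins, and the real lookup is done by `scripts/atocore_client.py detect-project`. Note that this sketch rejoins the prompt with single spaces when a hint is stripped.

```python
from typing import Callable, Optional, Tuple


def split_prompt_and_hint(
    arguments: str,
    detect_project: Callable[[str], Optional[str]],
) -> Tuple[str, str]:
    """Return (prompt_text, project_hint) using the last-token rule."""
    tokens = arguments.split()
    if not tokens:
        return "", ""
    # Ask the (assumed) registry lookup about the last token only.
    matched = detect_project(tokens[-1])
    if matched:
        # The last token was an explicit hint: drop it from the prompt
        # and keep the canonical project id the lookup returned.
        return " ".join(tokens[:-1]), matched
    # No match: the last token is ordinary prompt text.
    return arguments.strip(), ""


# Hypothetical registry, used only for this illustration.
_FAKE_REGISTRY = {"p04": "p04-gigabit", "p04-gigabit": "p04-gigabit"}


def fake_detect(token: str) -> Optional[str]:
    """Stand-in for the detect-project call; returns a canonical id or None."""
    return _FAKE_REGISTRY.get(token.lower())


if __name__ == "__main__":
    print(split_prompt_and_hint("what is the mirror spec p04", fake_detect))
    print(split_prompt_and_hint("what changed this week?", fake_detect))
```

Passing the lookup in as a callable keeps the alias knowledge outside the parsing logic, which matches the intent of delegating to the registry.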
--- a/.gitignore
+++ b/.gitignore
@@ -10,4 +10,6 @@ htmlcov/
 .coverage
 venv/
 .venv/
-.claude/
+.claude/*
+!.claude/commands/
+!.claude/commands/**
--- a/deploy/dalidou/deploy.sh
+++ b/deploy/dalidou/deploy.sh
@@ -0,0 +1,349 @@
+#!/usr/bin/env bash
+#
+# deploy/dalidou/deploy.sh
+# -------------------------
+# One-shot deploy script for updating the running AtoCore container
+# on Dalidou from the current Gitea main branch.
+#
+# The script is idempotent and safe to re-run. It handles both the
+# first-time deploy (where /srv/storage/atocore/app may not yet be
+# a git checkout) and the ongoing update case (where it is).
+#
+# Usage
+# -----
+#
+#   # Normal update from main (most common)
+#   bash deploy/dalidou/deploy.sh
+#
+#   # Deploy a specific branch or tag
+#   ATOCORE_BRANCH=codex/some-feature bash deploy/dalidou/deploy.sh
+#
+#   # Dry-run: show what would happen without touching anything
+#   ATOCORE_DEPLOY_DRY_RUN=1 bash deploy/dalidou/deploy.sh
+#
+# Environment variables
+# ---------------------
+#
+#   ATOCORE_APP_DIR      default /srv/storage/atocore/app
+#   ATOCORE_GIT_REMOTE   default http://127.0.0.1:3000/Antoine/ATOCore.git
+#                        This is the local Dalidou gitea, reached
+#                        via loopback. Override only when running
+#                        the deploy from a remote host. The default
+#                        is loopback (not the hostname "dalidou")
+#                        because the hostname doesn't reliably
+#                        resolve on the host itself — Dalidou
+#                        Claude's first deploy had to work around
+#                        exactly this.
+#   ATOCORE_BRANCH       default main
+#   ATOCORE_DEPLOY_DRY_RUN  if set to 1, report only, no mutations
+#   ATOCORE_HEALTH_URL   default http://127.0.0.1:8100/health
+#
+# Safety rails
+# ------------
+#
+# - If the app dir exists but is NOT a git repo, the script renames
+#   it to <dir>.pre-git-<timestamp> before re-cloning, so you never
+#   lose the pre-existing snapshot to a git clobber.
+# - If the health check fails after restart, the script exits
+#   non-zero and prints the container logs tail for diagnosis.
+# - Dry-run mode is the default recommendation for the first deploy
+#   on a new environment: it shows the planned git operations and
+#   the compose command without actually running them.
+#
+# What this script does NOT do
+# ----------------------------
+#
+# - Does not manage secrets / .env files. The caller is responsible
+#   for placing deploy/dalidou/.env before running.
+# - Does not run a backup before deploying. Run the backup endpoint
+#   first if you want a pre-deploy snapshot.
+# - Does not roll back on health-check failure. If deploy fails,
+#   the previous container is already stopped; you need to redeploy
+#   a known-good commit to recover.
+# - Does not touch the database. The Phase 9 schema migrations in
+#   src/atocore/models/database.py::_apply_migrations are idempotent
+#   ALTER TABLE ADD COLUMN calls that run at service startup via the
+#   lifespan handler. Stale pre-Phase-9 schema is upgraded in place.
+
+set -euo pipefail
+
+APP_DIR="${ATOCORE_APP_DIR:-/srv/storage/atocore/app}"
+GIT_REMOTE="${ATOCORE_GIT_REMOTE:-http://127.0.0.1:3000/Antoine/ATOCore.git}"
+BRANCH="${ATOCORE_BRANCH:-main}"
+HEALTH_URL="${ATOCORE_HEALTH_URL:-http://127.0.0.1:8100/health}"
+DRY_RUN="${ATOCORE_DEPLOY_DRY_RUN:-0}"
+COMPOSE_DIR="$APP_DIR/deploy/dalidou"
+
+log() { printf '==> %s\n' "$*"; }
+run() {
+    if [ "$DRY_RUN" = "1" ]; then
+        printf '    [dry-run] %s\n' "$*"
+    else
+        eval "$@"
+    fi
+}
+
+log "AtoCore deploy starting"
+log "  app dir:    $APP_DIR"
+log "  git remote: $GIT_REMOTE"
+log "  branch:     $BRANCH"
+log "  health url: $HEALTH_URL"
+log "  dry run:    $DRY_RUN"
+
+# ---------------------------------------------------------------------
+# Step 0: pre-flight permission check
+# ---------------------------------------------------------------------
+#
+# If $APP_DIR exists but the current user cannot write to it (because
+# a previous manual deploy left it root-owned, for example), the git
+# fetch / reset in step 1 will fail with cryptic errors. Detect this
+# up front and give the operator a clean remediation command instead
+# of letting git produce half-state on partial failure. This was the
+# exact workaround the 2026-04-08 Dalidou redeploy needed — pre-
+# existing root ownership from the pre-phase9 manual schema fix.
+
+if [ -d "$APP_DIR" ] && [ "$DRY_RUN" != "1" ]; then
+    if [ ! -w "$APP_DIR" ] || [ ! -r "$APP_DIR/.git" ] 2>/dev/null; then
+        log "WARNING: app dir exists but may not be writable by current user"
+    fi
+    current_owner="$(stat -c '%U:%G' "$APP_DIR" 2>/dev/null || echo unknown)"
+    current_user="$(id -un 2>/dev/null || echo unknown)"
+    current_uid_gid="$(id -u 2>/dev/null):$(id -g 2>/dev/null)"
+    log "Step 0: permission check"
+    log "  app dir owner: $current_owner"
+    log "  current user:  $current_user ($current_uid_gid)"
+    # Try to write a tiny marker file. If it fails, surface a clean
+    # remediation message and exit before git produces confusing
+    # half-state.
+    marker="$APP_DIR/.deploy-permission-check"
+    if ! ( : > "$marker" ) 2>/dev/null; then
+        log "FATAL: cannot write to $APP_DIR as $current_user"
+        log ""
+        log "The app dir is owned by $current_owner and the current user"
+        log "doesn't have write permission. This usually happens after a"
+        log "manual workaround deploy that ran as root."
+        log ""
+        log "Remediation (pick the one that matches your setup):"
+        log ""
+        log "  # If you have passwordless sudo and gitea runs as UID 1000:"
+        log "  sudo chown -R 1000:1000 $APP_DIR"
+        log ""
+        log "  # Or re-run deploy.sh itself as root:"
+        log "  sudo bash $0"
+        log ""
+        log "  # If neither works, do it via a throwaway container:"
+        log "  docker run --rm -v $APP_DIR:/app alpine \\"
+        log "      chown -R 1000:1000 /app"
+        log ""
+        log "Then re-run deploy.sh."
+        exit 5
+    fi
+    rm -f "$marker" 2>/dev/null || true
+fi
+
+# ---------------------------------------------------------------------
+# Step 1: make sure $APP_DIR is a proper git checkout of the branch
+# ---------------------------------------------------------------------
+
+if [ -d "$APP_DIR/.git" ]; then
+    log "Step 1: app dir is already a git checkout; fetching latest"
+    run "cd '$APP_DIR' && git fetch origin '$BRANCH'"
+    run "cd '$APP_DIR' && git reset --hard 'origin/$BRANCH'"
+else
+    log "Step 1: app dir is NOT a git checkout; converting"
+    if [ -d "$APP_DIR" ]; then
+        BACKUP="${APP_DIR}.pre-git-$(date -u +%Y%m%dT%H%M%SZ)"
+        log "       backing up existing snapshot to $BACKUP"
+        run "mv '$APP_DIR' '$BACKUP'"
+    fi
+    log "       cloning $GIT_REMOTE -> $APP_DIR (branch: $BRANCH)"
+    run "git clone --branch '$BRANCH' '$GIT_REMOTE' '$APP_DIR'"
+fi
+
+# ---------------------------------------------------------------------
+# Step 1.5: self-update re-exec guard
+# ---------------------------------------------------------------------
+#
+# When deploy.sh itself changes in the commit we just pulled, the bash
+# process running this script is still executing the OLD deploy.sh
+# from memory — git reset --hard updated the file on disk but our
+# in-memory instructions are stale. That's exactly how the first
+# 2026-04-09 Dalidou deploy silently wrote "unknown" build_sha: old
+# Step 2 logic ran against fresh source. Detect the mismatch and
+# re-exec into the fresh copy so every post-update run exercises the
+# new script.
+#
+# Guard rails:
+# - Only runs when $APP_DIR exists, holds a git checkout, and a
+#   deploy.sh exists there (i.e. after Step 1 succeeded).
+# - Uses a sentinel env var ATOCORE_DEPLOY_REEXECED=1 to make sure
+#   we only re-exec once, never recurse.
+# - Skipped in dry-run mode (no mutation).
+# - Skipped if $0 isn't a readable file (bash -c pipe inputs, etc.).
+
+if [ "$DRY_RUN" != "1" ] \
+    && [ -z "${ATOCORE_DEPLOY_REEXECED:-}" ] \
+    && [ -r "$0" ] \
+    && [ -f "$APP_DIR/deploy/dalidou/deploy.sh" ]; then
+    ON_DISK_HASH="$(sha1sum "$APP_DIR/deploy/dalidou/deploy.sh" 2>/dev/null | awk '{print $1}')"
+    RUNNING_HASH="$(sha1sum "$0" 2>/dev/null | awk '{print $1}')"
+    if [ -n "$ON_DISK_HASH" ] \
+        && [ -n "$RUNNING_HASH" ] \
+        && [ "$ON_DISK_HASH" != "$RUNNING_HASH" ]; then
+        log "Step 1.5: deploy.sh changed in the pulled commit; re-exec'ing"
+        log "  running script hash: $RUNNING_HASH"
+        log "  on-disk script hash: $ON_DISK_HASH"
+        log "  re-exec -> $APP_DIR/deploy/dalidou/deploy.sh"
+        export ATOCORE_DEPLOY_REEXECED=1
+        exec bash "$APP_DIR/deploy/dalidou/deploy.sh" "$@"
+    fi
+fi
+
+# ---------------------------------------------------------------------
+# Step 2: capture build provenance to pass to the container
+# ---------------------------------------------------------------------
+#
+# We compute the full SHA, the short SHA, the UTC build timestamp,
+# and the source branch. These get exported as env vars before
+# `docker compose up -d --build` so the running container can read
+# them at startup and report them via /health. The post-deploy
+# verification step (Step 6) reads /health and compares the
+# reported SHA against this value to detect any silent drift.
+
+log "Step 2: capturing build provenance"
+if [ "$DRY_RUN" != "1" ] && [ -d "$APP_DIR/.git" ]; then
+    DEPLOYING_SHA_FULL="$(cd "$APP_DIR" && git rev-parse HEAD)"
+    DEPLOYING_SHA="$(echo "$DEPLOYING_SHA_FULL" | cut -c1-7)"
+    DEPLOYING_TIME="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
+    DEPLOYING_BRANCH="$BRANCH"
+    log "  commit:    $DEPLOYING_SHA ($DEPLOYING_SHA_FULL)"
+    log "  built at:  $DEPLOYING_TIME"
+    log "  branch:    $DEPLOYING_BRANCH"
+    ( cd "$APP_DIR" && git log --oneline -1 ) | sed 's/^/  /'
+    export ATOCORE_BUILD_SHA="$DEPLOYING_SHA_FULL"
+    export ATOCORE_BUILD_TIME="$DEPLOYING_TIME"
+    export ATOCORE_BUILD_BRANCH="$DEPLOYING_BRANCH"
+else
+    log "  [dry-run] would read git log from $APP_DIR"
+    DEPLOYING_SHA="dry-run"
+    DEPLOYING_SHA_FULL="dry-run"
+fi
+
+# ---------------------------------------------------------------------
+# Step 3: preserve the .env file (it's not in git)
+# ---------------------------------------------------------------------
+
+ENV_FILE="$COMPOSE_DIR/.env"
+if [ "$DRY_RUN" != "1" ] && [ ! -f "$ENV_FILE" ]; then
+    log "Step 3: WARNING — $ENV_FILE does not exist"
+    log "       the compose workflow needs this file to map mount points"
+    log "       copy deploy/dalidou/.env.example to $ENV_FILE and edit it"
+    log "       before re-running this script"
+    exit 2
+fi
+
+# ---------------------------------------------------------------------
+# Step 4: rebuild and restart the container
+# ---------------------------------------------------------------------
+
+log "Step 4: rebuilding and restarting the atocore container"
+run "cd '$COMPOSE_DIR' && docker compose up -d --build"
+
+if [ "$DRY_RUN" = "1" ]; then
+    log "dry-run complete — no mutations performed"
+    exit 0
+fi
+
+# ---------------------------------------------------------------------
+# Step 5: wait for the service to come up and pass the health check
+# ---------------------------------------------------------------------
+
+log "Step 5: waiting for /health to respond"
+for i in 1 2 3 4 5 6 7 8 9 10; do
+    if curl -fsS "$HEALTH_URL" > /tmp/atocore-health.json 2>/dev/null; then
+        log "       service is responding"
+        break
+    fi
+    log "       not ready yet ($i/10); waiting 3s"
+    sleep 3
+done
+
+if ! curl -fsS "$HEALTH_URL" > /tmp/atocore-health.json 2>/dev/null; then
+    log "FATAL: service did not come up within 30 seconds"
+    log "       container logs (last 50 lines):"
+    cd "$COMPOSE_DIR" && docker compose logs --tail=50 atocore || true
+    exit 3
+fi
+
+# ---------------------------------------------------------------------
+# Step 6: verify the deployed build matches what we just shipped
+# ---------------------------------------------------------------------
+#
+# Two layers of comparison:
+#
+# - code_version: matches src/atocore/__init__.py::__version__.
+#   Coarse: any commit between version bumps reports the same value.
+# - build_sha: full git SHA the container was built from. Set as
+#   an env var by Step 2 above and read by /health from
+#   ATOCORE_BUILD_SHA. This is the precise drift signal — if the
+#   live build_sha doesn't match $DEPLOYING_SHA_FULL, the build
+#   didn't pick up the new source.
+
+log "Step 6: verifying deployed build"
+log "  /health response:"
+if command -v jq >/dev/null 2>&1; then
+    jq . < /tmp/atocore-health.json | sed 's/^/    /'
+    REPORTED_VERSION="$(jq -r '.code_version // .version' < /tmp/atocore-health.json)"
+    REPORTED_SHA="$(jq -r '.build_sha // "unknown"' < /tmp/atocore-health.json)"
+    REPORTED_BUILD_TIME="$(jq -r '.build_time // "unknown"' < /tmp/atocore-health.json)"
+else
+    cat /tmp/atocore-health.json | sed 's/^/    /'
+    echo
+    REPORTED_VERSION="$(grep -o '"code_version":"[^"]*"' /tmp/atocore-health.json | head -1 | cut -d'"' -f4)"
+    if [ -z "$REPORTED_VERSION" ]; then
+        REPORTED_VERSION="$(grep -o '"version":"[^"]*"' /tmp/atocore-health.json | head -1 | cut -d'"' -f4)"
+    fi
+    REPORTED_SHA="$(grep -o '"build_sha":"[^"]*"' /tmp/atocore-health.json | head -1 | cut -d'"' -f4)"
+    REPORTED_SHA="${REPORTED_SHA:-unknown}"
+    REPORTED_BUILD_TIME="$(grep -o '"build_time":"[^"]*"' /tmp/atocore-health.json | head -1 | cut -d'"' -f4)"
+    REPORTED_BUILD_TIME="${REPORTED_BUILD_TIME:-unknown}"
+fi
+
+EXPECTED_VERSION="$(grep -oE "__version__ = \"[^\"]+\"" "$APP_DIR/src/atocore/__init__.py" | head -1 | cut -d'"' -f2)"
+
+log "  Layer 1 — coarse version:"
+log "    expected code_version: $EXPECTED_VERSION (from src/atocore/__init__.py)"
+log "    reported code_version: $REPORTED_VERSION (from live /health)"
+
+if [ "$REPORTED_VERSION" != "$EXPECTED_VERSION" ]; then
+    log "FATAL: code_version mismatch"
+    log "       the container may not have picked up the new image"
+    log "       try: docker compose down && docker compose up -d --build"
+    exit 4
+fi
+
+log "  Layer 2 — precise build SHA:"
+log "    expected build_sha: $DEPLOYING_SHA_FULL (from this deploy.sh run)"
+log "    reported build_sha: $REPORTED_SHA (from live /health)"
+log "    reported build_time: $REPORTED_BUILD_TIME"
+
+if [ "$REPORTED_SHA" != "$DEPLOYING_SHA_FULL" ]; then
+    log "FATAL: build_sha mismatch"
+    log "       the live container is reporting a different commit than"
+    log "       the one this deploy.sh run just shipped. Possible causes:"
+    log "       - the container is using a cached image instead of the"
+    log "         freshly-built one (try: docker compose build --no-cache)"
+    log "       - the env vars didn't propagate (check that"
+    log "         deploy/dalidou/docker-compose.yml has the environment"
+    log "         section with ATOCORE_BUILD_SHA)"
+    log "       - another process restarted the container between the"
+    log "         build and the health check"
+    exit 6
+fi
+
+log "Deploy complete."
+log "  commit:       $DEPLOYING_SHA ($DEPLOYING_SHA_FULL)"
+log "  code_version: $REPORTED_VERSION"
+log "  build_sha:    $REPORTED_SHA"
+log "  build_time:   $REPORTED_BUILD_TIME"
+log "  health:       ok"
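The two-layer comparison in Step 6 of deploy.sh can be restated in Python as a rough sketch. It is not part of the commit: `verify_deploy` is a hypothetical helper, the version and SHA values in the usage lines are placeholders, and the `or` fallback only approximates jq's `//` operator (it also treats an empty string as missing).

```python
import json
from typing import Optional


def verify_deploy(
    health_json: str,
    expected_version: str,
    expected_sha: str,
) -> Optional[str]:
    """Return None when both layers match, else a short failure reason."""
    health = json.loads(health_json)

    # Layer 1: coarse version. Fall back to "version" when
    # "code_version" is absent, as the shell script does.
    reported_version = health.get("code_version") or health.get("version")
    if reported_version != expected_version:
        return "code_version mismatch"

    # Layer 2: precise build SHA. A missing field counts as "unknown",
    # which can never equal a real commit SHA.
    reported_sha = health.get("build_sha") or "unknown"
    if reported_sha != expected_sha:
        return "build_sha mismatch"

    return None


if __name__ == "__main__":
    # Placeholder payload; the real one comes from the live /health endpoint.
    payload = json.dumps({"code_version": "0.2.0", "build_sha": "abc123"})
    print(verify_deploy(payload, "0.2.0", "abc123"))
    print(verify_deploy(payload, "0.2.0", "def456"))
```

Checking the coarse version first mirrors the script's exit-code ordering (4 for a version mismatch, 6 for a SHA mismatch).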
--- a/deploy/dalidou/docker-compose.yml
+++ b/deploy/dalidou/docker-compose.yml
@@ -9,6 +9,15 @@ services:
      - "${ATOCORE_PORT:-8100}:8100"
    env_file:
      - .env
+    environment:
+      # Build provenance — set by deploy/dalidou/deploy.sh on each
+      # rebuild so /health can report exactly which commit is live.
+      # Defaults to 'unknown' for direct `docker compose up` runs that
+      # bypass deploy.sh; in that case the operator should run
+      # deploy.sh instead so the deployed SHA is recorded.
+      ATOCORE_BUILD_SHA: "${ATOCORE_BUILD_SHA:-unknown}"
+      ATOCORE_BUILD_TIME: "${ATOCORE_BUILD_TIME:-unknown}"
+      ATOCORE_BUILD_BRANCH: "${ATOCORE_BUILD_BRANCH:-unknown}"
    volumes:
      - ${ATOCORE_DB_DIR}:${ATOCORE_DB_DIR}
      - ${ATOCORE_CHROMA_DIR}:${ATOCORE_CHROMA_DIR}
--- a/deploy/hooks/capture_stop.py
+++ b/deploy/hooks/capture_stop.py
@@ -0,0 +1,188 @@
+#!/usr/bin/env python3
+"""Claude Code Stop hook: capture interaction to AtoCore.
+
+Reads the Stop hook JSON from stdin, extracts the last user prompt
+from the transcript JSONL, and POSTs to the AtoCore /interactions
+endpoint in conservative mode (reinforce=false, no extraction).
+
+Fail-open: always exits 0, logs errors to stderr only.
+
+Environment variables:
+    ATOCORE_URL     Base URL of the AtoCore instance (default: http://dalidou:8100)
+    ATOCORE_CAPTURE_DISABLED  Set to "1" to disable capture (kill switch)
+
+Usage in ~/.claude/settings.json:
+    "Stop": [{
+        "matcher": "",
+        "hooks": [{
+            "type": "command",
+            "command": "python /path/to/capture_stop.py",
+            "timeout": 15
+        }]
+    }]
+"""
+
+from __future__ import annotations
+
+import json
+import os
+import sys
+import urllib.error
+import urllib.request
+
+ATOCORE_URL = os.environ.get("ATOCORE_URL", "http://dalidou:8100")
+TIMEOUT_SECONDS = 10
+
+# Minimum prompt length to bother capturing. Single-word acks,
+# slash commands, and empty lines aren't useful interactions.
+MIN_PROMPT_LENGTH = 15
+
+# Maximum response length to capture. Truncate very long assistant
+# responses to keep the interactions table manageable.
+MAX_RESPONSE_LENGTH = 50_000
+
+
+def main() -> None:
+    """Entry point. Always exits 0."""
+    try:
+        _capture()
+    except Exception as exc:
+        print(f"capture_stop: {exc}", file=sys.stderr)
+
+
+def _capture() -> None:
+    if os.environ.get("ATOCORE_CAPTURE_DISABLED") == "1":
+        return
+
+    raw = sys.stdin.read()
+    if not raw.strip():
+        return
+
+    hook_data = json.loads(raw)
+
+    session_id = hook_data.get("session_id", "")
+    assistant_message = hook_data.get("last_assistant_message", "")
+    transcript_path = hook_data.get("transcript_path", "")
+    cwd = hook_data.get("cwd", "")
+
+    prompt = _extract_last_user_prompt(transcript_path)
+    if not prompt or len(prompt.strip()) < MIN_PROMPT_LENGTH:
+        return
+
+    response = assistant_message or ""
+    if len(response) > MAX_RESPONSE_LENGTH:
+        response = response[:MAX_RESPONSE_LENGTH] + "\n\n[truncated]"
+
+    project = _infer_project(cwd)
+
+    payload = {
+        "prompt": prompt,
+        "response": response,
+        "client": "claude-code",
+        "session_id": session_id,
+        "project": project,
+        "reinforce": False,
+    }
+
+    body = json.dumps(payload, ensure_ascii=True).encode("utf-8")
+    req = urllib.request.Request(
+        f"{ATOCORE_URL}/interactions",
+        data=body,
+        headers={"Content-Type": "application/json"},
+        method="POST",
+    )
+    with urllib.request.urlopen(req, timeout=TIMEOUT_SECONDS) as resp:
+        result = json.loads(resp.read().decode("utf-8"))
+    print(
+        f"capture_stop: recorded interaction {result.get('id', '?')} "
+        f"(project={project or 'none'}, prompt_chars={len(prompt)}, "
+        f"response_chars={len(response)})",
+        file=sys.stderr,
+    )
+
+
+def _extract_last_user_prompt(transcript_path: str) -> str:
+    """Read the JSONL transcript and return the last real user prompt.
+
+    Skips meta messages (isMeta=True) and system/command messages
+    (content starting with '<').
+    """
+    if not transcript_path:
+        return ""
+
+    # Normalize path for the current OS
+    path = os.path.normpath(transcript_path)
+    if not os.path.isfile(path):
+        return ""
+
+    last_prompt = ""
+    try:
+        with open(path, encoding="utf-8", errors="replace") as f:
+            for line in f:
+                line = line.strip()
+                if not line:
+                    continue
+                try:
+                    entry = json.loads(line)
+                except json.JSONDecodeError:
+                    continue
+
+                if entry.get("type") != "user":
+                    continue
+                if entry.get("isMeta", False):
+                    continue
+
+                msg = entry.get("message", {})
+                if not isinstance(msg, dict):
+                    continue
+
+                content = msg.get("content", "")
+
+                if isinstance(content, str):
+                    text = content.strip()
+                elif isinstance(content, list):
+                    # Content blocks: extract text blocks
+                    parts = []
+                    for block in content:
+                        if isinstance(block, str):
+                            parts.append(block)
+                        elif isinstance(block, dict) and block.get("type") == "text":
+                            parts.append(block.get("text", ""))
+                    text = "\n".join(parts).strip()
+                else:
+                    continue
+
+                # Skip system/command XML and very short messages
+                if text.startswith("<") or len(text) < MIN_PROMPT_LENGTH:
+                    continue
+
+                last_prompt = text
+    except OSError:
+        pass
+
+    return last_prompt
+
+
+# Project inference from working directory.
+# Maps known repo paths to AtoCore project IDs. The user can extend
+# this table or replace it with a registry lookup later.
+_PROJECT_PATH_MAP: dict[str, str] = {
+    # Add mappings as needed, e.g.:
+    # "C:\\Users\\antoi\\gigabit": "p04-gigabit",
+    # "C:\\Users\\antoi\\interferometer": "p05-interferometer",
+}
+
+
+def _infer_project(cwd: str) -> str:
+    """Try to map the working directory to an AtoCore project."""
+    if not cwd:
+        return ""
+    norm = os.path.normpath(cwd).lower()
+    for path_prefix, project_id in _PROJECT_PATH_MAP.items():
+        if norm.startswith(os.path.normpath(path_prefix).lower()):
+            return project_id
+    return ""
+
+
+if __name__ == "__main__":
+    main()
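To make the prompt-selection rules in `_extract_last_user_prompt` concrete, here is a small self-contained sketch that applies the same filters to an in-memory transcript. It is an illustration under assumptions: `content_to_text`, `last_user_prompt`, and the `SAMPLE` lines are made up for this example, and only the field names (`type`, `isMeta`, `message`, `content`) come from the hook above.

```python
import json
from typing import Any, Optional


def content_to_text(content: Any) -> Optional[str]:
    """Flatten message content to plain text, or None if unsupported."""
    if isinstance(content, str):
        return content.strip()
    if isinstance(content, list):
        parts = []
        for block in content:
            if isinstance(block, str):
                parts.append(block)
            elif isinstance(block, dict) and block.get("type") == "text":
                parts.append(block.get("text", ""))
        return "\n".join(parts).strip()
    return None


def last_user_prompt(jsonl: str, min_length: int = 15) -> str:
    """Return the last user message that survives the hook's filters."""
    last = ""
    for line in jsonl.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if entry.get("type") != "user" or entry.get("isMeta", False):
            continue
        msg = entry.get("message", {})
        if not isinstance(msg, dict):
            continue
        text = content_to_text(msg.get("content", ""))
        # Skip command XML and short acknowledgements.
        if text is None or text.startswith("<") or len(text) < min_length:
            continue
        last = text
    return last


# Made-up transcript lines for illustration only.
SAMPLE = "\n".join([
    json.dumps({"type": "user", "message": {"content": "How does the retention policy pick weekly backups?"}}),
    json.dumps({"type": "assistant", "message": {"content": "It keeps Sundays."}}),
    json.dumps({"type": "user", "message": {"content": [{"type": "text", "text": "Show me the cleanup dry-run output please"}]}}),
    json.dumps({"type": "user", "message": {"content": "<command-name>/clear</command-name>"}}),
    json.dumps({"type": "user", "message": {"content": "ok thanks"}}),
    "not json at all",
])

if __name__ == "__main__":
    print(last_user_prompt(SAMPLE))
```

In this sample the command message and the short "ok thanks" are both skipped, so the captured prompt is the earlier content-block message rather than the literal last user line.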
--- a/docs/architecture/engineering-v1-acceptance.md
+++ b/docs/architecture/engineering-v1-acceptance.md
@@ -0,0 +1,434 @@
+# Engineering Layer V1 Acceptance Criteria
+
+## Why this document exists
+
+The engineering layer planning sprint produced 7 architecture
+docs. None of them on their own says "you're done with V1, ship
+it". This document does. It translates the planning into
+measurable, falsifiable acceptance criteria so the implementation
+sprint can know unambiguously when V1 is complete.
+
+The acceptance criteria are organized into four categories:
+
+1. **Functional** — what the system must be able to do
+2. **Quality** — how well it must do it
+3. **Operational** — what running it must look like
+4. **Documentation** — what must be written down
+
+V1 is "done" only when **every criterion in this document is met
+against at least one of the three active projects** (`p04-gigabit`,
+`p05-interferometer`, `p06-polisher`). The choice of which
+project is the test bed is up to the implementer, but the same
+project must satisfy all functional criteria.
+
+## The single-sentence definition
+
+> AtoCore Engineering Layer V1 is done when, against one chosen
+> active project, every v1-required query in
+> `engineering-query-catalog.md` returns a correct result, the
+> Human Mirror renders a coherent project overview, and a real
+> KB-CAD or KB-FEM export round-trips through the ingest →
+> review queue → active entity flow without violating any
+> conflict or trust invariant.
+
+Everything below is the operational form of that sentence.
+
+## Category 1 — Functional acceptance
+
+### F-1: Entity store implemented per the V1 ontology
+
+- The 12 V1 entity types from `engineering-ontology-v1.md` exist
+  in the database with the schema described there
+- The 4 relationship families (Structural, Intent, Validation,
+  Provenance) are implemented as edges with the relationship
+  types listed in the catalog
+- Every entity has the shared header fields:
+  `id, type, name, project_id, status, confidence, source_refs,
+   created_at, updated_at, extractor_version, canonical_home`
+- The status lifecycle matches the memory layer:
+  `candidate → active → superseded | invalid`
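
As a rough illustration of the lifecycle line above, the allowed moves could be captured in a small lookup table. This is only a sketch: `ALLOWED_TRANSITIONS` and `can_transition` are hypothetical names, and letting an active entity move directly to `invalid` is an assumption read off the `superseded | invalid` notation rather than something the ontology doc confirms here.

```python
# Transition table read off the lifecycle line above. Treating "active"
# as able to reach either terminal state is an assumption of this sketch.
ALLOWED_TRANSITIONS: dict[str, frozenset[str]] = {
    "candidate": frozenset({"active", "invalid"}),
    "active": frozenset({"superseded", "invalid"}),
    "superseded": frozenset(),
    "invalid": frozenset(),
}


def can_transition(current: str, target: str) -> bool:
    """Return True if the lifecycle allows moving from current to target."""
    return target in ALLOWED_TRANSITIONS.get(current, frozenset())
```

A promote handler could call `can_transition(entity_status, "active")` before writing, so an illegal move is refused instead of silently stored.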
+
+### F-2: All v1-required queries return correct results
+
+For the chosen test project, every v1-required query among Q-001
+through Q-020 in `engineering-query-catalog.md` must:
+
+- be implemented as an API endpoint with the shape specified in
+  the catalog
+- return the expected result shape against real data
+- include the provenance chain when the catalog requires it
+- handle the empty case (no matches) gracefully — empty array,
+  not 500
+
+The "killer correctness queries" — Q-006 (orphan requirements),
+Q-009 (decisions on flagged assumptions), Q-011 (unsupported
+validation claims) — are non-negotiable. If any of those three
+returns wrong results, V1 is not done.
+
+### F-3: Tool ingest endpoints are live
+
+Both endpoints from `tool-handoff-boundaries.md` are implemented:
+
+- `POST /ingest/kb-cad/export` accepts the documented JSON
+  shape, validates it, and produces entity candidates
+- `POST /ingest/kb-fem/export` does the same for KB-FEM exports
+- Both refuse exports with invalid schemas (4xx with a clear
+  error)
+- Both return a summary of created/dropped/failed counts
+- Both never auto-promote anything; everything lands as
+  `status="candidate"`
+- Both carry source identifiers (exporter name, exporter version,
+  source artifact id) into the candidate's provenance fields
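
The ingest behaviour listed above could look roughly like the sketch below. The payload keys (`exporter`, `exporter_version`, `artifact_id`, `items`), the meaning given to "dropped" (duplicate items) and "failed" (items missing `type` or `name`), and the function name `ingest_export` are all assumptions made for illustration; the real shape is whatever `tool-handoff-boundaries.md` documents.

```python
from typing import Any

# Hypothetical top-level fields; the documented export shape may differ.
REQUIRED_TOP_LEVEL = ("exporter", "exporter_version", "artifact_id", "items")


def ingest_export(payload: dict[str, Any]) -> dict[str, Any]:
    """Validate an export and turn its items into entity candidates."""
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in payload]
    if missing:
        # An API layer would map this to a 4xx response with a clear error.
        raise ValueError(f"invalid export, missing fields: {missing}")

    summary = {"created": 0, "dropped": 0, "failed": 0}
    candidates: list[dict[str, Any]] = []
    seen: set[tuple[str, str]] = set()

    for item in payload["items"]:
        if not isinstance(item, dict) or "type" not in item or "name" not in item:
            summary["failed"] += 1
            continue
        key = (item["type"], item["name"])
        if key in seen:  # assumed meaning of "dropped": duplicate in one export
            summary["dropped"] += 1
            continue
        seen.add(key)
        candidates.append({
            "type": item["type"],
            "name": item["name"],
            "status": "candidate",  # never auto-promoted
            "source_refs": [{
                "exporter": payload["exporter"],
                "exporter_version": payload["exporter_version"],
                "source_artifact_id": payload["artifact_id"],
            }],
        })
        summary["created"] += 1

    return {"summary": summary, "candidates": candidates}
```

Nothing in this sketch writes an `active` row; promotion stays a separate, human-driven step.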
+
+A real KB-CAD export — even a hand-crafted one if the actual
+exporter doesn't exist yet — must round-trip through the endpoint
+and produce reviewable candidates for the test project.
+
+### F-4: Candidate review queue works end to end
+
+Per `promotion-rules.md`:
+
+- `GET /entities?status=candidate` lists the queue
+- `POST /entities/{id}/promote` moves candidate → active
+- `POST /entities/{id}/reject` moves candidate → invalid
+- The same shapes work for memories (already shipped in Phase 9 C)
+- The reviewer can edit a candidate's content via
+  `PUT /entities/{id}` before promoting
+- Every promote/reject is logged with timestamp and reason
+
+### F-5: Conflict detection fires
+
+Per `conflict-model.md`:
+
+- The synchronous detector runs at every active write
+  (create, promote, project_state set, KB import)
+- A test must demonstrate that pushing a contradictory KB-CAD
+  export creates a `conflicts` row with both members linked
+- The reviewer can resolve the conflict via
+  `POST /conflicts/{id}/resolve` with one of the supported
+  actions (supersede_others, no_action, dismiss)
+- Resolution updates the underlying entities according to the
+  chosen action
+
+### F-6: Human Mirror renders for the test project
+
+Per `human-mirror-rules.md`:
+
+- `GET /mirror/{project}/overview` returns rendered markdown
+- `GET /mirror/{project}/decisions` returns rendered markdown
+- `GET /mirror/{project}/subsystems/{subsystem}` returns
+  rendered markdown for at least one subsystem
+- `POST /mirror/{project}/regenerate` triggers regeneration on
+  demand
+- Generated files appear under `/srv/storage/atocore/data/mirror/`
+  with the "do not edit" header banner
+- Disputed markers appear inline when conflicts exist
+- Project-state overrides display with the `(curated)` annotation
+- Output is deterministic (the same inputs produce the same
+  bytes, suitable for diffing)
+
+### F-7: Memory-to-entity graduation works for at least one type
+
+Per `memory-vs-entities.md`:
+
+- `POST /memory/{id}/graduate` exists
+- Graduating a memory of type `adaptation` produces a Decision
+  entity candidate with the memory's content as a starting point
+- The original memory row stays at `status="graduated"` (a new
+  status added by the engineering layer migration)
+- The graduated memory has a forward pointer to the entity
+  candidate's id
+- Promoting the entity candidate does NOT delete the original
+  memory
+- The same graduation flow works for `project` → Requirement
+  and `knowledge` → Fact entity types (test the path; doesn't
+  have to be exhaustive)
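
One way to picture the graduation rules above is a pure function that returns both the updated memory row and the new entity candidate. This is a sketch under stated assumptions: the field name `graduated_to` for the forward pointer, the dict-based rows, and the function name `graduate_memory` are hypothetical; only the type mapping and the `graduated` status come from the criteria above.

```python
import uuid

# Memory type -> entity type, as listed in the criteria above.
GRADUATION_TARGETS = {
    "adaptation": "Decision",
    "project": "Requirement",
    "knowledge": "Fact",
}


def graduate_memory(memory: dict) -> tuple[dict, dict]:
    """Return (graduated_memory, entity_candidate) without deleting anything."""
    target_type = GRADUATION_TARGETS.get(memory["type"])
    if target_type is None:
        raise ValueError(f"memory type {memory['type']!r} cannot graduate")

    candidate = {
        "id": str(uuid.uuid4()),
        "type": target_type,
        "status": "candidate",
        "content": memory["content"],  # starting point for the reviewer
        "source_refs": list(memory.get("source_refs", [])),
    }
    # "graduated_to" is a hypothetical name for the forward pointer.
    graduated = {**memory, "status": "graduated", "graduated_to": candidate["id"]}
    return graduated, candidate
```

Because the original memory is copied rather than removed, promoting the candidate later leaves the memory row in place.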
+
+### F-8: Provenance chain is complete
+
+For every active entity in the test project, the following must
+be true:
+
+- It links back to at least one source via `source_refs` (which
+  is one or more of: source_chunk_id, source_interaction_id,
+  source_artifact_id from KB import)
+- The provenance chain can be walked from the entity to the
+  underlying raw text (source_chunks) or external artifact
+- Q-017 (the evidence query) returns at least one row for every
+  active entity
+
+If any active entity has no provenance, it's a bug — provenance
+is mandatory at write time per the promotion rules.
+
+## Category 2 — Quality acceptance
+
+### Q-1: All existing tests still pass
+
+The full pre-V1 test suite (currently 160 tests) must still
+pass. The V1 implementation may add new tests but cannot regress
+any existing test.
+
+### Q-2: V1 has its own test coverage
+
+For each of F-1 through F-8 above, at least one automated test
+exists that:
+
+- exercises the happy path
+- covers at least one error path
+- runs in CI in under 10 seconds (no real network, no real LLM)
+
+The full V1 test suite should be under 30 seconds total runtime
+to keep the development loop fast.
+
+### Q-3: Conflict invariants are enforced by tests
+
+Specific tests must demonstrate:
+
+- Two contradictory KB exports produce a conflict (not silent
+  overwrite)
+- A reviewer can't accidentally promote both members of an open
+  conflict to active without resolving the conflict first
+- The "flag, never block" rule holds — writes still succeed
+  even when they create a conflict
+
+### Q-4: Trust hierarchy is enforced by tests
+
+Specific tests must demonstrate:
+
+- Entity candidates can never appear in context packs
+- Reinforcement only touches active memories (already covered
+  by Phase 9 Commit B tests, but the same property must hold
+  for entities once they exist)
+- Nothing ever writes to project_state automatically
+- Candidates can never satisfy Q-005 (only active entities count)
+
+### Q-5: The Human Mirror is reproducible
+
+A golden-file test exists for at least one Mirror page. Updating
+the golden file is a normal part of template work (single
+command, well-documented). The test fails if the renderer
+produces different bytes for the same input, catching
+non-determinism.
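
The determinism property the golden-file test guards can be shown with a toy renderer. This is not the Mirror renderer; `render_decisions` is a hypothetical stand-in that only demonstrates why sorted iteration yields the same bytes regardless of how the input was assembled.

```python
def render_decisions(decisions: dict[str, str]) -> str:
    """Render a decision list with sorted iteration so output is stable."""
    lines = ["# Decisions", ""]
    for decision_id in sorted(decisions):
        lines.append(f"- {decision_id}: {decisions[decision_id]}")
    return "\n".join(lines) + "\n"
```

A golden-file test would then compare `render_decisions(fixture)` with the stored file contents and fail on any byte difference.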
+
+### Q-6: Killer correctness queries pass against real-ish data
+
+The test bed for Q-006, Q-009, Q-011 is not synthetic. The
+implementation must seed the test project with at least:
+
+- One Requirement that has a satisfying Component (Q-006 should
+  not flag it)
+- One Requirement with no satisfying Component (Q-006 must flag it)
+- One Decision based on an Assumption flagged as `needs_review`
+  (Q-009 must flag the Decision)
+- One ValidationClaim with at least one supporting Result
+  (Q-011 should not flag it)
+- One ValidationClaim with no supporting Result (Q-011 must flag it)
+
+These five seed cases run as a single integration test that
+exercises the killer correctness queries against actual
+representative data.
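
For the Q-006 seed cases in particular, the check could be sketched over in-memory rows as below. The row shape (`id`, `type`, `status` on entities; `source`, `target`, `type` on relationships) and the relationship name `satisfies` are assumptions for illustration; the catalog defines the real query.

```python
def orphan_requirements(entities: list[dict], relationships: list[dict]) -> list[str]:
    """Return ids of active Requirements with no active satisfying Component."""
    active = {e["id"]: e for e in entities if e["status"] == "active"}
    satisfied = {
        rel["target"]
        for rel in relationships
        if rel["type"] == "satisfies"
        and rel["source"] in active  # candidates must not count
        and active[rel["source"]]["type"] == "Component"
    }
    return sorted(
        entity_id
        for entity_id, entity in active.items()
        if entity["type"] == "Requirement" and entity_id not in satisfied
    )
```

The same seed data can cover the trust rule from Q-4: a Requirement satisfied only by a candidate Component is still reported as an orphan.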
+
+## Category 3 — Operational acceptance
+
+### O-1: Migration is safe and reversible
+
+The V1 schema migration (adding the `entities`, `relationships`,
+`conflicts`, `conflict_members` tables, plus `mirror_regeneration_failures`)
+must:
+
+- run cleanly against a production-shape database
+- be implemented via the same `_apply_migrations` pattern as
+  Phase 9 (additive only, idempotent, safe to run twice)
+- be tested by spinning up a fresh DB AND running against a
+  copy of the live Dalidou DB taken from a backup
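
An additive, idempotent migration of this kind might be sketched as follows. SQLite, the column lists, and the name `apply_v1_migration` are assumptions chosen to keep the example self-contained; only the table names come from the text above, and the real `_apply_migrations` helper may look quite different.

```python
import sqlite3

# Table names come from the criteria above; the columns are placeholders.
V1_TABLES = {
    "entities": "id TEXT PRIMARY KEY, type TEXT NOT NULL, status TEXT NOT NULL",
    "relationships": "id TEXT PRIMARY KEY, source_id TEXT, target_id TEXT, type TEXT",
    "conflicts": "id TEXT PRIMARY KEY, status TEXT NOT NULL",
    "conflict_members": "conflict_id TEXT NOT NULL, entity_id TEXT NOT NULL",
    "mirror_regeneration_failures": "project_id TEXT, failed_at TEXT, error TEXT",
}


def apply_v1_migration(conn: sqlite3.Connection) -> None:
    """Create the V1 tables if missing; running it again changes nothing."""
    for name, columns in V1_TABLES.items():
        conn.execute(f"CREATE TABLE IF NOT EXISTS {name} ({columns})")
    conn.commit()
```

Running the function twice against the same connection is the cheapest form of the "safe to run twice" check.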
+
+### O-2: Backup and restore still work
+
+The backup endpoint must include the new tables. A restore drill
+on the test project must:
+
+- successfully back up the V1 entity state via
+  `POST /admin/backup`
+- successfully validate the snapshot
+- successfully restore from the snapshot per
+  `docs/backup-restore-procedure.md`
+- pass post-restore verification including a Q-001 query against
+  the test project
+
+The drill must be performed once before V1 is declared done.
+
+### O-3: Performance bounds
+
+These are starting bounds; tune later if real usage shows
+problems:
+
+- Single-entity write (`POST /entities/...`): under 100ms p99
+  on the production Dalidou hardware
+- Single Q-001 / Q-005 / Q-008 query: under 500ms p99 against
+  a project with up to 1000 entities
+- Mirror regeneration of one project overview: under 5 seconds
+  for a project with up to 1000 entities
+- Conflict detector at write time: adds no more than 50ms p99
+  to a write that doesn't actually produce a conflict
+
+These bounds are not tested by automated benchmarks in V1 (that
+would be over-engineering). They are sanity-checked by the
+developer running the operations against the test project.
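
For the manual sanity check, a developer could time repeated calls and read off a p99 with a few lines like these. This is a sketch only: `time_calls` and `p99` are hypothetical helpers, the percentile uses the nearest-rank method, and `operation` stands in for whatever call is being measured.

```python
import time
from typing import Callable


def time_calls(operation: Callable[[], object], runs: int = 200) -> list[float]:
    """Call operation repeatedly and return per-call latencies in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def p99(samples_ms: list[float]) -> float:
    """Nearest-rank 99th percentile of the given samples."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = (99 * len(ordered) + 99) // 100  # integer ceil(0.99 * n)
    return ordered[rank - 1]
```

With only a few hundred runs the p99 is a coarse estimate, which fits the "sanity check, not benchmark" intent stated above.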
+
+### O-4: No new manual ops burden
+
+V1 should not introduce any new "you have to remember to run X
+every day" requirement. Specifically:
+
+- Mirror regeneration is automatic (debounced async + daily
+  refresh), no manual cron entry needed
+- Conflict detection is automatic at write time, no manual sweep
+  needed in V1 (the nightly sweep is V2)
+- Backup retention cleanup is **still** an open follow-up from
+  the operational baseline; V1 does not block on it
+
+### O-5: No regressions in Phase 9 reflection loop
+
+The capture, reinforcement, and extraction loop from Phase 9
+A/B/C must continue to work end to end with the engineering
+layer in place. Specifically:
+
+- Memories whose types are NOT in the engineering layer
+  (identity, preference, episodic) keep working exactly as
+  before
+- Memories whose types ARE in the engineering layer (project,
+  knowledge, adaptation) can still be created by hand or by
+  extraction; the deprecation rule from `memory-vs-entities.md`
+  ("no new writes after V1 ships") is implemented as a
+  configurable warning, not a hard block, so existing
+  workflows aren't disrupted
+
+## Category 4 — Documentation acceptance
+
+### D-1: Per-entity-type spec docs
+
+Each of the 12 V1 entity types has a short spec doc under
+`docs/architecture/entities/` covering:
+
+- the entity's purpose
+- its required and optional fields
+- its lifecycle quirks (if any beyond the standard
+  candidate/active/superseded/invalid)
+- which queries it appears in (cross-reference to the catalog)
+- which relationship types reference it
+
+These docs can be terse — a page each, mostly bullet lists.
+Their purpose is to make the entity model legible to a future
+maintainer, not to be reference manuals.
+
+### D-2: KB-CAD and KB-FEM export schema docs
+
+`docs/architecture/kb-cad-export-schema.md` and
+`docs/architecture/kb-fem-export-schema.md` are written and
+match the implemented validators.
+
+### D-3: V1 release notes
+
+A `docs/v1-release-notes.md` summarizes:
+
+- What V1 added (entities, relationships, conflicts, mirror,
+  ingest endpoints)
+- What V1 deferred (auto-promotion, BOM/cost/manufacturing
+  entities, NX direct integration, cross-project rollups)
+- The migration story for existing memories (graduation flow)
+- Known limitations and the V2 roadmap pointers
+
+### D-4: master-plan-status.md and current-state.md updated
+
+Both top-level status docs reflect V1's completion:
+
+- Phase 6 (AtoDrive) and the engineering layer are explicitly
+  marked as separate tracks
+- The engineering planning sprint section is marked complete
+- Phase 9 stays at "baseline complete" (V1 doesn't change Phase 9)
+- The engineering layer V1 is added as its own line item
+
+## What V1 explicitly does NOT need to do
+
+To prevent scope creep, here is the negative list. None of the
+following are V1 acceptance criteria:
+
+- **No LLM extractor.** The Phase 9 C rule-based extractor is
+  the entity extractor for V1 too, just with new rules added for
+  entity types.
+- **No auto-promotion of candidates.** Per `promotion-rules.md`.
+- **No write-back to KB-CAD or KB-FEM.** Per
+  `tool-handoff-boundaries.md`.
+- **No multi-user / per-reviewer auth.** Single-user assumed.
+- **No real-time UI.** API + Mirror markdown is the V1 surface.
+  A web UI is V2+.
+- **No cross-project rollups.** Per `human-mirror-rules.md`.
+- **No time-travel queries** (Q-015 stays v1-stretch).
+- **No nightly conflict sweep.** Synchronous detection only in V1.
+- **No incremental Chroma snapshots.** The current full-copy
+  approach in `backup-restore-procedure.md` is fine for V1.
+- **No retention cleanup script.** Still an open follow-up.
+- **No backup encryption.** Still an open follow-up.
+- **No off-Dalidou backup target.** Still an open follow-up.
+
+## How to use this document during implementation
+
+When the implementation sprint begins:
+
+1. Read this doc once, top to bottom
+2. Pick the test project (probably p05-interferometer because
+   the optical/structural domain has the cleanest entity model)
+3. For each section, write the test or the implementation, in
+   roughly the order: F-1 → F-2 → F-3 → F-4 → F-5 → F-6 → F-7 → F-8
+4. Each acceptance criterion's test should be written **before
+   or alongside** the implementation, not after
+5. Run the full test suite at every commit
+6. When every box is checked, write D-3 (release notes), update
+   D-4 (status docs), and call V1 done
+
+The implementation sprint should not touch anything outside the
+scope listed here. If a desire arises to add something not in
+this doc, that's a V2 conversation, not a V1 expansion.
+
+## Anticipated friction points
+
+These are the things I expect will be hard during implementation:
+
+1. **The graduation flow (F-7)** is the most cross-cutting
+   change because it touches the existing memory module.
+   Worth doing it last so the memory module is stable for
+   all the V1 entity work first.
+2. **The Mirror's deterministic-output requirement (Q-5)** will
+   bite if the implementer iterates over Python dicts without
+   sorting. Plan to use `sorted()` literally everywhere.
+3. **Conflict detection (F-5)** has subtle correctness traps:
+   the slot key extraction must be stable, the dedup-of-existing-conflicts
+   logic must be right, and the synchronous detector must not
+   slow writes meaningfully (Q-3 / O-3 cover this, but watch).
+4. **Provenance backfill** for entities that come from the
+   existing memory layer via graduation (F-7) is the trickiest
+   part: the original memory may not have had a strict
+   `source_chunk_id`, in which case the graduated entity also
+   doesn't have one. The implementation needs an "orphan
+   provenance" allowance for graduated entities, with a
+   warning surfaced in the Mirror.
+
+These aren't blockers, just the parts of the V1 spec I'd
+attack with extra care.
+
+## TL;DR
+
+- Engineering V1 is done when every box in this doc is checked
+  against one chosen active project
+- Functional: 8 criteria covering entities, queries, ingest,
+  review queue, conflicts, mirror, graduation, provenance
+- Quality: 6 criteria covering tests, golden files, killer
+  correctness, trust enforcement
+- Operational: 5 criteria covering migration safety, backup
+  drill, performance bounds, no new manual ops, Phase 9 not
+  regressed
+- Documentation: 4 criteria covering entity specs, KB schema
+  docs, release notes, top-level status updates
+- Negative list: a clear set of things V1 deliberately does
+  NOT need to do, to prevent scope creep
+- The implementation sprint follows this doc as a checklist
--- a/docs/architecture/human-mirror-rules.md
+++ b/docs/architecture/human-mirror-rules.md
@@ -0,0 +1,384 @@
+# Human Mirror Rules (Layer 3 → derived markdown views)
+
+## Why this document exists
+
+The engineering layer V1 stores facts as typed entities and
+relationships in a SQL database. That representation is excellent
+for queries, conflict detection, and automated reasoning, but
+it's terrible for the human reading experience. People want to
+read prose, not crawl JSON.
+
+The Human Mirror is the layer that turns the typed entity store
+into human-readable markdown pages. It's strictly a derived view —
+nothing in the Human Mirror is canonical, every page is regenerated
+from current entity state on demand.
+
+This document defines:
+
+- what the Human Mirror generates
+- when it regenerates
+- how the human edits things they see in the Mirror
+- how the canonical-vs-derived rule is enforced (so editing the
+  derived markdown can't silently corrupt the entity store)
+
+## The non-negotiable rule
+
+> **The Human Mirror is read-only from the human's perspective.**
+>
+> If the human wants to change a fact they see in the Mirror, they
+> change it in the canonical home (per `representation-authority.md`),
+> NOT in the Mirror page. The next regeneration picks up the change.
+
+This rule is what makes the whole derived-view approach safe. If
+the human is allowed to edit Mirror pages directly, the
+canonical-vs-derived split breaks and the Mirror becomes a second
+source of truth that disagrees with the entity store.
+
+The technical enforcement is that every Mirror page carries a
+header banner that says "this file is generated from AtoCore
+entity state, do not edit", and the file is regenerated from the
+entity store on every change to its underlying entities. Manual
+edits will be silently overwritten on the next regeneration.
+
+## What the Mirror generates in V1
+
+Three template families, each producing one or more pages per
+project:
+
+### 1. Project Overview
+
+One page per registered project. Renders:
+
+- Project header (id, aliases, description)
+- Subsystem tree (from Q-001 / Q-004 in the query catalog)
+- Active Decisions affecting this project (Q-008, ordered by date)
+- Open Requirements with coverage status (Q-005, Q-006)
+- Open ValidationClaims with support status (Q-010, Q-011)
+- Currently flagged conflicts (from the conflict model)
+- Recent changes (Q-013) — last 14 days
+
+This is the most important Mirror page. It's the page someone
+opens when they want to know "what's the state of this project
+right now". It deliberately mirrors what `current-state.md` does
+for AtoCore itself but generated entirely from typed state.
+
+### 2. Decision Log
+
+One page per project. Renders:
+
+- All active Decisions in reverse chronological order (newest first)
+- Each Decision shows: id, what was decided, when, the affected
+  Subsystem/Component, the supporting evidence (Q-014, Q-017)
+- Superseded Decisions appear as collapsed "history" entries
+  with a forward link to whatever superseded them
+- Conflicting Decisions get a "⚠ disputed" marker
+
+This is the human-readable form of the engineering query catalog's
+Q-014 query.
+
+### 3. Subsystem Detail
+
+One page per Subsystem (so a few per project). Renders:
+
+- Subsystem header
+- Components contained in this subsystem (Q-001)
+- Interfaces this subsystem has (Q-003)
+- Constraints applying to it (Q-007)
+- Decisions affecting it (Q-008)
+- Validation status: which Requirements are satisfied,
+  which are open (Q-005, Q-006)
+- Change history within this subsystem (Q-013 scoped)
+
+Subsystem detail pages are what someone reads when they're
+working on a specific part of the system and want everything
+relevant in one place.
+
+## What the Mirror does NOT generate in V1
+
+Intentionally excluded so the V1 implementation stays scoped:
+
+- **Per-component detail pages.** Components are listed in
+  Subsystem pages but don't get their own pages. Reduces page
+  count from hundreds to dozens.
+- **Per-Decision detail pages.** Decisions appear inline in
+  Project Overview and Decision Log; their full text plus
+  evidence chain is shown there, not on a separate page.
+- **Cross-project rollup pages.** No "all projects at a glance"
+  page in V1. Each project is its own report.
+- **Time-series / historical pages.** The Mirror is always
+  "current state". History is accessible via Decision Log and
+  superseded chains, but no "what was true on date X" page exists
+  in V1 (Q-015 is v1-stretch in the query catalog for the same
+  reason).
+- **Diff pages between two timestamps.** Same reasoning.
+- **Render of the conflict queue itself.** Conflicts appear
+  inline in the relevant Mirror pages with the "⚠ disputed"
+  marker and a link to `/conflicts/{id}`, but there's no
+  Mirror page that lists all conflicts. Use `GET /conflicts`.
+- **Per-memory pages.** Memories are not engineering entities;
+  they appear in context packs and the review queue, not in the
+  Human Mirror.
+
+## Where Mirror pages live
+
+Two options were considered. The chosen V1 path is option B:
+
+**Option A — write Mirror pages back into the source vault.**
+Generate `/srv/storage/atocore/sources/vault/mirror/p05/overview.md`
+so the human reads them in their normal Obsidian / markdown
+viewer. **Rejected** because writing into the source vault
+violates the "sources are read-only" rule from
+`tool-handoff-boundaries.md` and the operating model.
+
+**Option B (chosen) — write Mirror pages into a dedicated AtoCore
+output dir, served via the API.** Generate under
+`/srv/storage/atocore/data/mirror/p05/overview.md`. The human
+reads them via:
+
+- the API endpoints `GET /mirror/{project}/overview`,
+  `GET /mirror/{project}/decisions`,
+  `GET /mirror/{project}/subsystems/{subsystem}` (all return
+  rendered markdown as text/markdown)
+- a future "Mirror viewer" in the Claude Code slash command
+  `/atocore-mirror <project>` that fetches the rendered markdown
+  and displays it inline
+- direct file access on Dalidou for power users:
+  `cat /srv/storage/atocore/data/mirror/p05/overview.md`
+
+The dedicated dir keeps the Mirror clearly separated from the
+canonical sources and makes regeneration safe (it's just a
+directory wipe + write).
+
+## When the Mirror regenerates
+
+Three triggers, in order from cheapest to most expensive:
+
+### 1. On explicit human request
+
+```
+POST /mirror/{project}/regenerate
+```
+
+Returns the timestamp of the regeneration and the list of files
+written. This is the path the human takes when they've just
+curated something into project_state and want to see the Mirror
+reflect it immediately.
+
+### 2. On entity write (debounced, async, per project)
+
+When any entity in a project changes status (candidate → active,
+active → superseded), a regeneration of that project's Mirror is
+queued. The queue is debounced — multiple writes within a 30-second
+window only trigger one regeneration. This keeps the Mirror
+"close to current" without generating a Mirror update on every
+single API call.
+
+The implementation is a simple dict of "next regeneration time"
+per project, checked by a background task. No cron, no message
+queue, no Celery. Just a `dict[str, datetime]` and a thread.
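
A minimal sketch of that dict-plus-thread idea, assuming a 30-second window and a background task that polls `pop_due`. The class name `MirrorDebouncer` and its methods are hypothetical; the lock is there because the dict would be shared between request handlers and the background thread.

```python
import threading
from datetime import datetime, timedelta

DEBOUNCE = timedelta(seconds=30)  # starting value from the text above


class MirrorDebouncer:
    """Track the next regeneration time per project."""

    def __init__(self) -> None:
        self._due: dict[str, datetime] = {}
        self._lock = threading.Lock()

    def note_write(self, project: str, now: datetime) -> None:
        # The first write opens the window; later writes inside it are absorbed.
        with self._lock:
            self._due.setdefault(project, now + DEBOUNCE)

    def pop_due(self, now: datetime) -> list[str]:
        """Return projects whose window has elapsed and forget them."""
        with self._lock:
            due = sorted(p for p, when in self._due.items() if when <= now)
            for project in due:
                del self._due[project]
            return due
```

The background task would call `pop_due(datetime.now(...))` every second or so and regenerate each returned project once.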
+
+### 3. On scheduled refresh (daily)
+
+Once per day at a quiet hour, every project's Mirror regenerates
+unconditionally. This catches any state drift from manual
+project_state edits that bypassed the entity write hooks, and
+provides a baseline guarantee that the Mirror is at most 24
+hours stale.
+
+The schedule runs from the same machinery as the future backup
+retention job, so we get one cron-equivalent system to maintain
+instead of two.
+
+## What if regeneration fails
+
+The Mirror has to be resilient. If regeneration fails for a
+project (e.g. a query catalog query crashes or a template fails
+to render), the existing Mirror files are **not** deleted. They
+stay in place, showing the last successful state, and the
+regeneration error is recorded in:
+
+- the API response if the trigger was explicit
+- a log entry at warning level for the async path
+- a `mirror_regeneration_failures` table for the daily refresh
+
+This means the human can always read the Mirror, even if the
+last 5 minutes of changes haven't made it in yet. Stale is
+better than blank.
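
One way to get the "stale is better than blank" behaviour is to render first and only swap the file in once rendering has succeeded. The sketch below assumes a per-page write via a temporary file and `os.replace`; `write_mirror_page` and its `render` callback are hypothetical names, and error recording is left out.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def write_mirror_page(path: Path, render: Callable[[], str]) -> bool:
    """Replace path with freshly rendered content; keep the old file on failure."""
    try:
        content = render()
    except Exception:
        # Rendering failed: leave the last successful page untouched.
        return False

    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(content)
        os.replace(tmp_name, path)  # swap in one step on the same filesystem
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return True
```

A caller would log or record the failure when the function returns `False`, matching the three reporting paths listed above.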
+
+## How the human curates "around" the Mirror
+
+The Mirror reflects the current entity state. If the human
+doesn't like what they see, the right edits go into one of:
+
+| What you want to change | Where you change it |
+|---|---|
+| A Decision's text | `PUT /entities/Decision/{id}` (or `PUT /memory/{id}` if it's still memory-layer) |
+| A Decision's status (active → superseded) | `POST /entities/Decision/{id}/supersede` (V1 entity API) |
+| Whether a Component "satisfies" a Requirement | edit the relationship directly via the entity API (V1) |
+| The current trusted next focus shown on the Project Overview | `POST /project/state` with `category=status, key=next_focus` |
+| A typo in a generated heading or label | edit the **template**, not the rendered file. Templates live in `templates/mirror/` (V1 implementation) |
+| Source of a fact ("this came from KB-CAD on day X") | not editable by hand — it's automatically populated from provenance |
+
+The rule is consistent: edit the canonical home, regenerate (or
+let the auto-trigger fire), see the change reflected in the
+Mirror.
+
+## Templates
+
+The Mirror uses Jinja2-style templates checked into the repo
+under `templates/mirror/`. Each template is a markdown file with
+placeholders that the renderer fills from query catalog results.
+
+Template list for V1:
+
+- `templates/mirror/project-overview.md.j2`
+- `templates/mirror/decision-log.md.j2`
+- `templates/mirror/subsystem-detail.md.j2`
+
+Editing a template is a code change, reviewed via normal git PRs.
+The templates are deliberately small and readable so the human
+can tweak the output format without touching renderer code.
+
+The renderer is a thin module:
+
+```python
+# src/atocore/mirror/renderer.py (V1, not yet implemented)
+
+def render_project_overview(project: str) -> str:
+    """Generate the project overview markdown for one project."""
+    facts = collect_project_overview_facts(project)
+    template = load_template("project-overview.md.j2")
+    return template.render(**facts)
+```
+
+## The "do not edit" header
+
+Every generated Mirror file starts with a fixed banner:
+
+```markdown
+<!--
+  This file is generated by AtoCore from current entity state.
+  DO NOT EDIT — manual changes will be silently overwritten on
+  the next regeneration.
+  Edit the canonical home instead. See:
+    https://docs.atocore.../representation-authority.md
+  Regenerated: 2026-04-07T12:34:56Z
+  Source entities: <commit-like checksum of input data>
+-->
+```
+
+The checksum at the end lets the renderer skip work when nothing
+relevant has changed since the last regeneration. If the inputs
+match the previous run's checksum, the existing file is left
+untouched.
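
The skip-if-unchanged check could be sketched with a hash over a canonical JSON form of the renderer's inputs. SHA-256, the JSON canonicalisation, and the names `input_checksum` and `needs_regeneration` are assumptions; the text above only says the checksum is "commit-like".

```python
import hashlib
import json
from typing import Optional


def input_checksum(facts: dict) -> str:
    """Hash the renderer inputs in a key-order-independent way."""
    canonical = json.dumps(
        facts, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_regeneration(facts: dict, previous_checksum: Optional[str]) -> bool:
    """True when the inputs differ from the ones behind the existing file."""
    return input_checksum(facts) != previous_checksum
```

The previous checksum would be read back from the banner of the existing file before deciding whether to render.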
+
+## Conflicts in the Mirror
+
+Per the conflict model, any open conflict on a fact that appears
+in the Mirror gets a visible disputed marker:
+
+```markdown
+- Lateral support material: **GF-PTFE** ⚠ disputed
+  - The KB-CAD import on 2026-04-07 reported PEEK; conflict #c-039.
+```
+
+The disputed marker is rendered as a relative markdown link to
+the conflict detail page in the API (or to the conflict id, for
+direct lookup). The reviewer follows the link, resolves the
+conflict via `POST /conflicts/{id}/resolve`, and on the next
+regeneration the marker disappears.
+
+## Project-state overrides in the Mirror
+
+When a Mirror page would show a value derived from entities, but
+project_state has an override on the same key, **the Mirror shows
+the project_state value** with a small annotation noting the
+override:
+
+```markdown
+- Next focus: **Wave 2 trusted-operational ingestion** (curated)
+```
+
+The `(curated)` annotation tells the reader "this is from the
+trusted-state Layer 3, not from extracted entities". This makes
+the trust hierarchy visible in the human reading experience.
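
The override rule can be sketched as a tiny line formatter. `render_fact_line` and its parameters are hypothetical; the only behaviour taken from the text is that a project_state value wins over the entity-derived one and is tagged `(curated)`.

```python
from typing import Optional


def render_fact_line(label: str, entity_value: str, override: Optional[str] = None) -> str:
    """Render one Mirror bullet, preferring a project_state override if present."""
    if override is not None:
        return f"- {label}: **{override}** (curated)"
    return f"- {label}: **{entity_value}**"
```

Keeping the annotation in one place makes it easy for a golden-file test to assert that every overridden value is marked.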
+
+## The "Mirror diff" workflow (post-V1, but designed for)
+
+A common workflow after V1 ships will be:
+
+1. Reviewer has curated some new entities
+2. They want to see "what changed in the Mirror as a result"
+3. They want to share that diff with someone else as evidence
+
+To support this, the Mirror generator writes its output
+deterministically (sorted iteration, stable timestamp formatting)
+so a `git diff` between two regenerated states is meaningful.
+
+V1 doesn't add an explicit "diff between two Mirror snapshots"
+endpoint; that's deferred. But the deterministic-output
+property is a V1 requirement so that future diffing works
+without redesigning the renderer.
+
+## What the Mirror enables
+
+With the Mirror in place:
+
+- **OpenClaw can read project state in human form.** The
+  read-only AtoCore helper skill on the T420 already calls
+  `/context/build`; in V1 it gains the option to call
+  `/mirror/{project}/overview` to get a fully-rendered markdown
+  page instead of just retrieved chunks. This is much faster
+  than crawling individual entities for general questions.
+- **The human gets a daily-readable artifact.** Every morning,
+  Antoine can `cat /srv/storage/atocore/data/mirror/p05/overview.md`
+  and see the current state of p05 in his preferred reading
+  format. No API calls, no JSON parsing.
+- **Cross-collaborator sharing.** If you ever want to send
+  someone a project overview without giving them AtoCore access,
+  the Mirror file is a self-contained markdown document they can
+  read in any markdown viewer.
+- **Claude Code integration.** A future
+  `/atocore-mirror <project>` slash command renders the Mirror
+  inline, complementing the existing `/atocore-context` command
+  with a human-readable view of "what does AtoCore think about
+  this project right now".
+
+## Open questions for V1 implementation
+
+1. **What's the regeneration debounce window?** 30 seconds is the
+   starting value but should be tuned with real usage.
+2. **Does the daily refresh need a separate trigger mechanism, or
+   is it just a long-period entry in the same in-process scheduler
+   that handles the debounced async refreshes?** Probably the
+   latter — keep it simple.
+3. **How are templates tested?** Likely a small set of fixture
+   project states + golden output files, with a single test that
+   asserts `render(fixture) == golden`. Updating golden files is
+   a normal part of template work.
+4. **Are Mirror pages discoverable via a directory listing
+   endpoint?** `GET /mirror/{project}` returns the list of
+   available pages for that project. Probably yes; cheap to add.
+5. **How does the Mirror handle a project that has zero entities
+   yet?** Render an empty-state page that says "no curated facts
+   yet — add some via /memory or /entities/Decision". Better than
+   a blank file.
+
+## TL;DR
+
+- The Human Mirror generates 3 template families per project
+  (Overview, Decision Log, Subsystem Detail) from current entity
+  state
+- It's strictly read-only from the human's perspective; edits go
+  to the canonical home and the Mirror picks them up on
+  regeneration
+- Three regeneration triggers: explicit POST, debounced
+  async-on-write, daily scheduled refresh
+- Mirror files live in `/srv/storage/atocore/data/mirror/`
+  (NOT in the source vault — sources stay read-only)
+- Conflicts and project_state overrides are visible inline in
+  the rendered markdown so the trust hierarchy shows through
+- Templates are checked into the repo and edited via PR; the
+  rendered files are derived and never canonical
+- Deterministic output is a V1 requirement so future diffing
+  works without rework
--- a/docs/architecture/llm-client-integration.md
+++ b/docs/architecture/llm-client-integration.md
@@ -0,0 +1,333 @@
+# LLM Client Integration (the layering)
+
+## Why this document exists
+
+AtoCore must be reachable from many different LLM client contexts:
+
+- **OpenClaw** on the T420 (already integrated via the read-only
+  helper skill at `/home/papa/clawd/skills/atocore-context/`)
+- **Claude Code** on the laptop (via the slash command shipped in
+  this repo at `.claude/commands/atocore-context.md`)
+- **Codex** sessions (future)
+- **Direct API consumers** — scripts, Python code, ad-hoc curl
+- **The eventual MCP server** when it's worth building
+
+Without an explicit layering rule, every new client tends to
+reimplement the same routing logic (project detection, context
+build, retrieval audit, project-state inspection) in slightly
+different ways. That is exactly what almost happened in the first
+draft of the Claude Code slash command, which started as a curl +
+jq script that duplicated capabilities the existing operator client
+already had.
+
+This document defines the layering so future clients don't repeat
+that mistake.
+
+## The layering
+
+Three layers, top to bottom:
+
+```
+        +----------------------------------------------------+
+        |  Per-agent thin frontends                          |
+        |                                                    |
+        |  - Claude Code slash command                       |
+        |    (.claude/commands/atocore-context.md)           |
+        |  - OpenClaw helper skill                           |
+        |    (/home/papa/clawd/skills/atocore-context/)      |
+        |  - Codex skill (future)                            |
+        |  - MCP server (future)                             |
+        +----------------------------------------------------+
+                              |
+                              | shells out to / imports
+                              v
+        +----------------------------------------------------+
+        |  Shared operator client                            |
+        |  scripts/atocore_client.py                         |
+        |                                                    |
+        |  - subcommands for stable AtoCore operations       |
+        |  - fail-open on network errors                     |
+        |  - consistent JSON output across all subcommands   |
+        |  - environment-driven configuration                |
+        |    (ATOCORE_BASE_URL, ATOCORE_TIMEOUT_SECONDS,     |
+        |     ATOCORE_REFRESH_TIMEOUT_SECONDS,               |
+        |     ATOCORE_FAIL_OPEN)                             |
+        +----------------------------------------------------+
+                              |
+                              | HTTP
+                              v
+        +----------------------------------------------------+
+        |  AtoCore HTTP API                                  |
+        |  src/atocore/api/routes.py                         |
+        |                                                    |
+        |  - the universal interface to AtoCore              |
+        |  - everything else above is glue                   |
+        +----------------------------------------------------+
+```
+
+## The non-negotiable rules
+
+These rules are what make the layering work.
+
+### Rule 1 — every per-agent frontend is a thin wrapper
+
+A per-agent frontend exists to do exactly two things:
+
+1. **Translate the agent platform's command/skill format** into an
+   invocation of the shared client (or a small sequence of them)
+2. **Render the JSON response** into whatever shape the agent
+   platform wants (markdown for Claude Code, plaintext for
+   OpenClaw, MCP tool result for an MCP server, etc.)
+
+Everything else — talking to AtoCore, project detection, retrieval
+audit, fail-open behavior, configuration — is the **shared
+client's** job.
+
+If a per-agent frontend grows logic beyond the two responsibilities
+above, that logic is in the wrong place. It belongs in the shared
+client where every other frontend gets to use it.
+
+### Rule 2 — the shared client never duplicates the API
+
+The shared client is allowed to **compose** API calls (e.g.
+`auto-context` calls `detect-project` then `context-build`), but
+it never reimplements API logic. If a useful operation can't be
+expressed via the existing API endpoints, the right fix is to
+extend the API, not to embed the logic in the client.
+
+This rule keeps the API as the single source of truth for what
+AtoCore can do.
+
+### Rule 3 — the shared client only exposes stable operations
+
+A subcommand only makes it into the shared client when:
+
+- the API endpoint behind it has been exercised by at least one
+  real workflow
+- the request and response shapes are unlikely to change
+- the operation is one that more than one frontend will plausibly
+  want
+
+This rule keeps the client surface stable so frontends don't have
+to chase changes. New endpoints land in the API first, get
+exercised in real use, and only then get a client subcommand.
+
+## What's in scope for the shared client today
+
+The currently shipped scope (per `scripts/atocore_client.py`):
+
+### Stable operations (shipped since the client was introduced)
+
+| Subcommand | Purpose | API endpoint(s) |
+|---|---|---|
+| `health` | service status, mount + source readiness | `GET /health` |
+| `sources` | enabled source roots and their existence | `GET /sources` |
+| `stats` | document/chunk/vector counts | `GET /stats` |
+| `projects` | registered projects | `GET /projects` |
+| `project-template` | starter shape for a new project | `GET /projects/template` |
+| `propose-project` | preview a registration | `POST /projects/proposal` |
+| `register-project` | persist a registration | `POST /projects/register` |
+| `update-project` | update an existing registration | `PUT /projects/{name}` |
+| `refresh-project` | re-ingest a project's roots | `POST /projects/{name}/refresh` |
+| `project-state` | list trusted state for a project | `GET /project/state/{name}` |
+| `project-state-set` | curate trusted state | `POST /project/state` |
+| `project-state-invalidate` | supersede trusted state | `DELETE /project/state` |
+| `query` | raw retrieval | `POST /query` |
+| `context-build` | full context pack | `POST /context/build` |
+| `auto-context` | detect-project then context-build | composes `/projects` + `/context/build` |
+| `detect-project` | match a prompt to a registered project | composes `/projects` + local regex |
+| `audit-query` | retrieval-quality audit with classification | composes `/query` + local labelling |
+| `debug-context` | last context pack inspection | `GET /debug/context` |
+| `ingest-sources` | ingest configured source dirs | `POST /ingest/sources` |
+
+### Phase 9 reflection loop (shipped after migration safety work)
+
+These were explicitly deferred in earlier versions of this doc
+pending "exercised workflow". The constraint was real — premature
+API freeze would have made it harder to iterate on the ergonomics —
+but the deferral ran into a bootstrap problem: you can't exercise
+the workflow in real Claude Code sessions without a usable client
+surface to drive it from. The fix is to ship a minimal Phase 9
+surface now and treat it as stable-but-refinable: adding new
+optional parameters is fine, renaming subcommands is not.
+
+| Subcommand | Purpose | API endpoint(s) |
+|---|---|---|
+| `capture` | record one interaction round-trip | `POST /interactions` |
+| `extract` | run the rule-based extractor (preview or persist) | `POST /interactions/{id}/extract` |
+| `reinforce-interaction` | backfill reinforcement on an existing interaction | `POST /interactions/{id}/reinforce` |
+| `list-interactions` | paginated list with filters | `GET /interactions` |
+| `get-interaction` | fetch one interaction by id | `GET /interactions/{id}` |
+| `queue` | list the candidate review queue | `GET /memory?status=candidate` |
+| `promote` | move a candidate memory to active | `POST /memory/{id}/promote` |
+| `reject` | mark a candidate memory invalid | `POST /memory/{id}/reject` |
+
+All 8 Phase 9 subcommands have test coverage in
+`tests/test_atocore_client.py` via mocked `request()`, including
+an end-to-end test that drives the full capture → extract → queue
+→ promote/reject cycle through the client.
+
+### Coverage summary
+
+That covers everything in the "stable operations" set AND the
+full Phase 9 reflection loop: project lifecycle, ingestion,
+project-state curation, retrieval, context build,
+retrieval-quality audit, health and stats inspection, interaction
+capture, candidate extraction, candidate review queue.
+
+## What's intentionally NOT in scope today
+
+Two families of operations remain deferred:
+
+### 1. Backup and restore admin operations
+
+Phase 9 Commit B shipped these endpoints:
+
+- `POST /admin/backup` (with `include_chroma`)
+- `GET /admin/backup` (list)
+- `GET /admin/backup/{stamp}/validate`
+
+The backup endpoints are stable, but the documented operational
+procedure (`docs/backup-restore-procedure.md`) intentionally uses
+direct curl rather than the shared client. The reason is that
+backup operations are *administrative* and benefit from being
+explicit about which instance they're targeting, with no
+fail-open behavior. The shared client's fail-open default would
+hide a real backup failure.
+
+If we later decide to add backup commands to the shared client,
+they would set `ATOCORE_FAIL_OPEN=false` for the duration of the
+call so the operator gets a real error on failure rather than a
+silent fail-open envelope.
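+
+One way to do that is to override the variable only in the child
+process's environment. A minimal sketch, assuming the shared client
+reads `ATOCORE_FAIL_OPEN` from its environment as listed in the
+layering diagram; `admin_env` is a hypothetical helper, and no
+backup subcommand exists in the shared client today:
+
+```python
+import os
+
+
+def admin_env(base_env=None):
+    """Return a copy of the environment with fail-open disabled.
+
+    Hypothetical helper: the caller's own environment is not mutated.
+    """
+    env = dict(os.environ if base_env is None else base_env)
+    env["ATOCORE_FAIL_OPEN"] = "false"
+    return env
+```
+
+A frontend would pass the result as `env=` to `subprocess.run`,
+leaving the operator's shell environment untouched.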
+
+### 2. Engineering layer entity operations
+
+The engineering layer is in planning, not implementation. When
+V1 ships per `engineering-v1-acceptance.md`, the shared client
+will gain entity, relationship, conflict, and Mirror commands.
+None of those exist as stable contracts yet, so they are not in
+the shared client today.
+
+## How a new agent platform integrates
+
+When a new LLM client needs AtoCore (e.g. Codex, ChatGPT custom
+GPT, a Cursor extension), the integration recipe is:
+
+1. **Don't reimplement.** Don't write a new HTTP client. Use the
+   shared client.
+2. **Write a thin frontend** that translates the platform's
+   command/skill format into a shell call to
+   `python scripts/atocore_client.py <subcommand> <args...>`.
+3. **Render the JSON response** in the platform's preferred shape.
+4. **Inherit fail-open and env-var behavior** from the shared
+   client. Don't override unless the platform explicitly needs
+   to (e.g. an admin tool that wants to see real errors).
+5. **If a needed capability is missing**, propose adding it to
+   the shared client. If the underlying API endpoint also
+   doesn't exist, propose adding it to the API first. Don't
+   add the logic to your frontend.
+
+The Claude Code slash command in this repo is a worked example:
+~50 lines of markdown that does argument parsing, calls the
+shared client, and renders the result. It contains zero AtoCore
+business logic of its own.
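+
+For a platform whose frontend is code rather than markdown, the
+same recipe might look like the sketch below. It assumes the shared
+client prints a single JSON object on stdout; `run_client`,
+`render_markdown`, and the injectable `runner` parameter are
+hypothetical names, not part of this repo.
+
+```python
+import json
+import subprocess
+import sys
+
+CLIENT = "scripts/atocore_client.py"  # path as documented; adjust per checkout
+
+
+def run_client(subcommand, *args, runner=subprocess.run):
+    """Shell out to the shared client and parse its JSON output.
+
+    `runner` is injectable only so the sketch can be exercised
+    without a live AtoCore instance (hypothetical convenience).
+    """
+    proc = runner(
+        [sys.executable, CLIENT, subcommand, *args],
+        capture_output=True,
+        text=True,
+        check=False,
+    )
+    return json.loads(proc.stdout)
+
+
+def render_markdown(payload):
+    """Render a flat JSON object as a markdown bullet list."""
+    return "\n".join(
+        f"- **{key}**: {value}" for key, value in sorted(payload.items())
+    )
+```
+
+Everything AtoCore-specific stays behind `run_client`; the frontend
+only chooses a subcommand and a rendering.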
+
+## How OpenClaw fits
+
+OpenClaw's helper skill at `/home/papa/clawd/skills/atocore-context/`
+on the T420 currently has its own implementation of `auto-context`,
+`detect-project`, and the project lifecycle commands. It predates
+this layering doc.
+
+The right long-term shape is to **refactor the OpenClaw helper to
+shell out to the shared client** instead of duplicating the
+routing logic. This isn't urgent because:
+
+- OpenClaw's helper works today and is in active use
+- The duplication is on the OpenClaw side; AtoCore itself is not
+  affected
+- The shared client and the OpenClaw helper are in different
+  repos (AtoCore vs OpenClaw clawd), so the refactor is a
+  cross-repo coordination
+
+The refactor is queued as a follow-up. Until then, **the OpenClaw
+helper and the Claude Code slash command are parallel
+implementations** of the same idea. The shared client is the
+canonical backbone going forward; new clients should follow the
+new pattern even though the existing OpenClaw helper still has
+its own.
+
+## How this connects to the master plan
+
+| Layer | Phase home | Status |
+|---|---|---|
+| AtoCore HTTP API | Phases 0/0.5/1/2/3/5/7/9 | shipped |
+| Shared operator client (`scripts/atocore_client.py`) | implicitly Phase 8 (OpenClaw integration) infrastructure | shipped via codex/port-atocore-ops-client merge |
+| OpenClaw helper skill (T420) | Phase 8 — partial | shipped (own implementation, refactor queued) |
+| Claude Code slash command (this repo) | precursor to Phase 11 (multi-model) | shipped (refactored to use the shared client) |
+| Codex skill | Phase 11 | future |
+| MCP server | Phase 11 | future |
+| Web UI / dashboard | Phase 11+ | future |
+
+The shared client is the **substrate Phase 11 will build on**.
+Every new client added in Phase 11 should be a thin frontend on
+the shared client, not a fresh reimplementation.
+
+## Versioning and stability
+
+The shared client's subcommand surface is **stable**. Adding new
+subcommands is non-breaking. Changing or removing existing
+subcommands is breaking and would require a coordinated update
+of every frontend that depends on them.
+
+The current shared client has no explicit version constant; the
+implicit contract is "the subcommands and JSON shapes documented
+in this file". When the client surface meaningfully changes,
+add a `CLIENT_VERSION = "x.y.z"` constant to
+`scripts/atocore_client.py` and bump it per semver:
+
+- patch: bug fixes, no surface change
+- minor: new subcommands or new optional fields
+- major: removed subcommands, renamed fields, changed defaults
+
+## Open follow-ups
+
+1. **Refactor the OpenClaw helper** to shell out to the shared
+   client. Cross-repo coordination, not blocking anything in
+   AtoCore itself. With the Phase 9 subcommands now in the shared
+   client, the OpenClaw refactor can reuse all the reflection-loop
+   work instead of duplicating it.
+2. **Real-usage validation of the Phase 9 loop**, now that the
+   client surface exists. First capture → extract → review cycle
+   against the live Dalidou instance, likely via the Claude Code
+   slash command flow. Findings feed back into subcommand
+   refinement (new optional flags are fine, renames require a
+   semver bump).
+3. **Add backup admin subcommands** if and when we decide the
+   shared client should be the canonical backup operator
+   interface (with fail-open disabled for admin commands).
+4. **Add engineering-layer entity subcommands** as part of the
+   engineering V1 implementation sprint, per
+   `engineering-v1-acceptance.md`.
+5. **Tag a `CLIENT_VERSION` constant** the next time the shared
+   client surface meaningfully changes. Today's surface with the
+   Phase 9 loop added is the v0.2.0 baseline (v0.1.0 was the
+   stable-ops-only version).
+
+## TL;DR
+
+- AtoCore HTTP API is the universal interface
+- `scripts/atocore_client.py` is the canonical shared Python
+  backbone for stable AtoCore operations
+- Per-agent frontends (Claude Code slash command, OpenClaw
+  helper, future Codex skill, future MCP server) are thin
+  wrappers that shell out to the shared client
+- The shared client today covers project lifecycle, ingestion,
+  retrieval, context build, project-state, retrieval audit, AND
+  the full Phase 9 reflection loop (capture / extract /
+  reinforce / list / get / queue / promote / reject)
+- Backup admin and engineering-entity commands remain deferred
+- The OpenClaw helper is currently a parallel implementation and
+  the refactor to the shared client is a queued follow-up
+- New LLM clients should never reimplement HTTP calls — they
+  follow the shell-out pattern documented here
--- a/docs/architecture/project-identity-canonicalization.md
+++ b/docs/architecture/project-identity-canonicalization.md
@@ -0,0 +1,462 @@
+# Project Identity Canonicalization
+
+## Why this document exists
+
+AtoCore identifies projects by name in many places: trusted state
+rows, memories, captured interactions, query/context API parameters,
+extractor candidates, future engineering entities. Without an
+explicit rule, every callsite would have to remember to canonicalize
+project names through the registry — and the recent codex review
+caught exactly the bug class that follows when one of them forgets.
+
+The fix landed in `fb6298a` and works correctly today. This document
+exists to make the rule **explicit and discoverable** so the
+engineering layer V1 implementation, future entity write paths, and
+any new agent integration don't reintroduce the same fragmentation
+when nobody is looking.
+
+## The contract
+
+> **Every read/write that takes a project name MUST canonicalize it
+> through `resolve_project_name()` before the value crosses a service
+> boundary.**
+
+The boundary is wherever a project name becomes a database row, a
+query filter, an attribute on a stored object, or a key for any
+lookup. The canonicalization happens **once**, at that boundary,
+before the underlying storage primitive is called.
+
+Symbolically:
+
+```
+HTTP layer (raw user input)
+    ↓
+   service entry point
+    ↓
+   project_name = resolve_project_name(project_name)   ← ONLY canonical from this point
+    ↓
+   storage / queries / further service calls
+```
+
+The rule is intentionally simple. There's no per-call exception,
+no "trust me, the caller already canonicalized it" shortcut, no
+opt-out flag. Every service-layer entry point applies the helper
+the moment it receives a project name from outside the service.
+
+## The helper
+
+```python
+# src/atocore/projects/registry.py
+
+def resolve_project_name(name: str | None) -> str:
+    """Canonicalize a project name through the registry.
+
+    Returns the canonical project_id if the input matches any
+    registered project's id or alias. Returns the input unchanged
+    when it's empty or not in the registry — the second case keeps
+    backwards compatibility with hand-curated state, memories, and
+    interactions that predate the registry, or for projects that
+    are intentionally not registered.
+    """
+    if not name:
+        return name or ""
+    project = get_registered_project(name)
+    if project is not None:
+        return project.project_id
+    return name
+```
+
+Three behaviors worth keeping in mind:
+
+1. **Empty / None input → empty string output.** Callers don't have
+   to pre-check; passing `""` or `None` to a query filter still
+   works as "no project scope".
+2. **Registered alias → canonical project_id.** The helper does the
+   case-insensitive lookup and returns the project's `id` field
+   (e.g. `"p05" → "p05-interferometer"`).
+3. **Unregistered name → input unchanged.** This is the
+   backwards-compatibility path. Hand-curated state, memories, or
+   interactions created under a name that isn't in the registry
+   keep working. The retrieval is then "best effort" — the raw
+   string is used as the SQL key, which still finds the row that
+   was stored under the same raw string. This path exists so the
+   engineering layer V1 doesn't have to also be a data migration.
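+
+The three behaviors can be checked with a self-contained sketch.
+The in-memory `_REGISTRY` list, the `Project` dataclass, and the stub
+`get_registered_project` are stand-ins for the real registry
+loader, which this document does not show; only
+`resolve_project_name` mirrors the helper above.
+
+```python
+from dataclasses import dataclass
+
+
+@dataclass
+class Project:
+    project_id: str
+    aliases: tuple
+
+
+# Hypothetical in-memory registry; the real one is loaded from a JSON file.
+_REGISTRY = [Project("p05-interferometer", ("p05", "interferometer"))]
+
+
+def get_registered_project(name):
+    """Stub: case-insensitive match on the canonical id or any alias."""
+    needle = name.lower()
+    for project in _REGISTRY:
+        known = [project.project_id, *project.aliases]
+        if needle in (value.lower() for value in known):
+            return project
+    return None
+
+
+def resolve_project_name(name):
+    if not name:
+        return name or ""
+    project = get_registered_project(name)
+    if project is not None:
+        return project.project_id
+    return name
+
+
+print(repr(resolve_project_name(None)))        # ''
+print(resolve_project_name("P05"))             # p05-interferometer
+print(resolve_project_name("orphan-project"))  # orphan-project
+```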
+
+## Where the helper is currently called
+
+As of `fb6298a`, the helper is invoked at exactly these eight
+service-layer entry points:
+
+| Module | Function | What gets canonicalized |
+|---|---|---|
+| `src/atocore/context/builder.py` | `build_context` | the `project_hint` parameter, before the trusted state lookup |
+| `src/atocore/context/project_state.py` | `set_state` | `project_name`, before `ensure_project()` |
+| `src/atocore/context/project_state.py` | `get_state` | `project_name`, before the SQL lookup |
+| `src/atocore/context/project_state.py` | `invalidate_state` | `project_name`, before the SQL lookup |
+| `src/atocore/interactions/service.py` | `record_interaction` | `project`, before insert |
+| `src/atocore/interactions/service.py` | `list_interactions` | `project` filter parameter, before WHERE clause |
+| `src/atocore/memory/service.py` | `create_memory` | `project`, before insert |
+| `src/atocore/memory/service.py` | `get_memories` | `project` filter parameter, before WHERE clause |
+
+Every one of those is the **first** thing the function does after
+input validation. There is no path through any of those eight
+functions where a project name reaches storage without passing
+through `resolve_project_name`.
+
+## Where the helper is NOT called (and why that's correct)
+
+These places intentionally do not canonicalize:
+
+1. **`update_memory`'s project field.** The API does not allow
+   changing a memory's project after creation, so there's no
+   project to canonicalize. The function only updates `content`,
+   `confidence`, and `status`.
+2. **The retriever's `_project_match_boost` substring matcher.** It
+   already calls `get_registered_project` internally to expand the
+   hint into the candidate set (canonical id + all aliases + last
+   path segments). It accepts the raw hint by design.
+3. **`_rank_chunks`'s secondary substring boost in
+   `builder.py`.** Still uses the raw hint. This is a multiplicative
+   factor on top of correct retrieval, not a filter, so it cannot
+   drop relevant chunks. Tracked as a future cleanup but not
+   critical.
+4. **Direct SQL queries for the projects table itself** (e.g.
+   `ensure_project`'s lookup). These are intentional case-insensitive
+   raw lookups against the column the canonical id is stored in.
+   `set_state` already canonicalized before reaching `ensure_project`,
+   so the value passed is the canonical id by definition.
+5. **Hand-authored project names that aren't in the registry.**
+   The helper returns those unchanged. This is the backwards-compat
+   path mentioned above; it is *not* a violation of the rule, it's
+   the rule applied to a name with no registry record.
+
+## Why this is the trust hierarchy in action
+
+The whole point of AtoCore is the trust hierarchy from the operating
+model:
+
+1. Trusted Project State (Layer 3) is the most authoritative layer
+2. Memories (active) are second
+3. Source chunks (raw retrieved content) are last
+
+If a caller passes the alias `p05` and Layer 3 was written under
+`p05-interferometer`, and the lookup fails to find the canonical
+row, **the trust hierarchy collapses**. The most-authoritative
+layer is silently invisible to the caller. The system would still
+return *something* — namely, lower-trust retrieved chunks — and the
+human would never know they got a degraded answer.
+
+The canonicalization helper is what makes the trust hierarchy
+**dependable**. Layer 3 is supposed to win every time. To win it
+has to be findable. To be findable, the lookup key has to match
+how the row was stored. And the only way to guarantee that match
+across every entry point is to canonicalize at every boundary.
+
+## Compatibility gap: legacy alias-keyed rows
+
+The canonicalization rule fixes new writes going forward, but it
+does NOT fix rows that were already written under a registered
+alias before `fb6298a` landed. Those rows have a real, concrete
+gap that must be closed by a one-time migration before the
+engineering layer V1 ships.
+
+The exact failure mode:
+
+```
+        time T0 (before fb6298a):
+            POST /project/state {project: "p05", ...}
+            -> set_state("p05", ...)        # no canonicalization
+            -> ensure_project("p05")        # creates a "p05" row
+            -> writes state with project_id pointing at the "p05" row
+
+        time T1 (after fb6298a):
+            POST /project/state {project: "p05", ...}     (or any read)
+            -> set_state("p05", ...)
+            -> resolve_project_name("p05") -> "p05-interferometer"
+            -> ensure_project("p05-interferometer")        # creates a SECOND row
+            -> writes new state under the canonical row
+            -> the T0 state is still in the "p05" row, INVISIBLE to every
+               canonicalized read
+```
+
+The unregistered-name fallback path saves you when the project was
+never in the registry: a row stored under `"orphan-project"` is read
+back via `"orphan-project"`, both pass through `resolve_project_name`
+unchanged, and the strings line up. **It does not save you when the
+name is a registered alias** — the helper rewrites the read key but
+not the storage key, and the legacy row becomes invisible.
+
+What is at risk on the live Dalidou DB:
+
+1. **`projects` table**: any rows whose `name` column matches a
+   registered alias (one row for each alias that data was actually
+   written under before the fix landed). These shadow the canonical
+   project row and silently fragment the projects namespace.
+2. **`project_state` table**: any rows whose `project_id` points
+   at one of those shadow project rows. **This is the highest-risk
+   case** because it directly defeats the trust hierarchy: Layer 3
+   trusted state becomes invisible to every canonicalized lookup.
+3. **`memories` table**: any rows whose `project` column is a
+   registered alias. Reinforcement and extraction queries will
+   miss them.
+4. **`interactions` table**: any rows whose `project` column is a
+   registered alias. Listing and downstream reflection will miss
+   them.
+
+How to find out the actual blast radius on the live Dalidou DB:
+
+```sql
+-- inspect the projects table for alias-shadow rows
+SELECT id, name FROM projects;
+
+-- count alias-keyed memories per known alias
+SELECT project, COUNT(*) FROM memories
+  WHERE project IN ('p04','p05','p06','gigabit','interferometer','polisher','ato core')
+  GROUP BY project;
+
+-- count alias-keyed interactions
+SELECT project, COUNT(*) FROM interactions
+  WHERE project IN ('p04','p05','p06','gigabit','interferometer','polisher','ato core')
+  GROUP BY project;
+
+-- count alias-shadowed project_state rows by project name
+SELECT p.name, COUNT(*) FROM project_state ps
+  JOIN projects p ON ps.project_id = p.id
+  WHERE p.name IN ('p04','p05','p06','gigabit','interferometer','polisher','ato core')
+  GROUP BY p.name;
+```
+
+The migration that closes the gap has to:
+
+1. For each registered project, find all `projects` rows whose
+   name matches one of the project's aliases AND is not the
+   canonical id itself. These are the "shadow" rows.
+2. For each shadow row, MERGE its dependent state into the
+   canonical project's row:
+   - rekey `project_state.project_id` from shadow → canonical
+   - if the merge would create a `(project_id, category, key)`
+     collision (a state row already exists under the canonical
+     id with the same category+key), the migration must surface
+     the conflict via the existing conflict model and pause
+     until the human resolves it
+   - delete the now-empty shadow `projects` row
+3. For `memories` and `interactions`, the fix is simpler because
+   the alias appears as a string column (not a foreign key):
+   `UPDATE memories SET project = canonical WHERE project = alias`,
+   then same for interactions.
+4. The migration must run in dry-run mode first, printing the
+   exact rows it would touch and the canonical destinations they
+   would be merged into.
+5. The migration must be idempotent — running it twice produces
+   the same final state as running it once.
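+
+Steps 3 to 5 (the string-column case) can be sketched as follows.
+This is an illustration, not the migration script, which does not
+exist yet: `rekey_aliases` and the shape of `alias_map` are
+hypothetical, the comparison is an exact string match (the real
+registry lookup is case-insensitive), and the `project_state` merge
+with conflict handling from step 2 is not covered.
+
+```python
+import sqlite3
+
+
+def rekey_aliases(conn, table, alias_map, dry_run=True):
+    """Rewrite alias-keyed `project` values to the canonical id.
+
+    `table` is interpolated into the SQL, so it is restricted to the
+    two known string-keyed tables. Returns {alias: rows_matched}.
+    """
+    if table not in ("memories", "interactions"):
+        raise ValueError(f"unexpected table: {table!r}")
+    touched = {}
+    for alias, canonical in alias_map.items():
+        count = conn.execute(
+            f"SELECT COUNT(*) FROM {table} WHERE project = ?", (alias,)
+        ).fetchone()[0]
+        if count:
+            touched[alias] = count
+            if not dry_run:
+                conn.execute(
+                    f"UPDATE {table} SET project = ? WHERE project = ?",
+                    (canonical, alias),
+                )
+    if not dry_run:
+        conn.commit()
+    return touched
+```
+
+With `dry_run=True` (the default) the function only reports what it
+would touch. A second real run finds no alias-keyed rows and returns
+an empty dict, which is the idempotence property step 5 asks for.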
+
+This work is **required before the engineering layer V1 ships**
+because V1 will add new `entities`, `relationships`, `conflicts`,
+and `mirror_regeneration_failures` tables that all key on the
+canonical project id. Any leaked alias-keyed rows in the existing
+tables would show up in V1 reads as silently missing data, and
+the killer-correctness queries from `engineering-query-catalog.md`
+(orphan requirements, decisions on flagged assumptions,
+unsupported claims) would report wrong results against any project
+that has shadow rows.
+
+The migration script does NOT exist yet. The open follow-ups
+section below tracks it as the next concrete step.
+
+## The rule for new entry points
+
+When you add a new service-layer function that takes a project name,
+follow this checklist:
+
+1. **Does the function read or write a row keyed by project?** If
+   yes, you must call `resolve_project_name`. If no (e.g. it only
+   takes `project` as a label for logging), you may skip the
+   canonicalization but you should add a comment explaining why.
+2. **Where does the canonicalization go?** As the first statement
+   after input validation. Not later, not "before storage", not
+   "in the helper that does the actual write". As the first
+   statement, so any subsequent service call inside the function
+   sees the canonical value.
+3. **Add a regression test that uses an alias.** Use the
+   `project_registry` fixture from `tests/conftest.py` to set up
+   a temp registry with at least one project + aliases, then
+   verify the new function works when called with the alias and
+   when called with the canonical id.
+4. **If the function can be called with `None` or empty string,
+   verify that path too.** The helper handles it correctly but
+   the function-under-test might not.
+
+## How the `project_registry` test fixture works
+
+`tests/conftest.py::project_registry` returns a callable that
+takes one or more `(project_id, [aliases])` tuples (or just a bare
+`project_id` string), writes them into a temp registry file,
+points `ATOCORE_PROJECT_REGISTRY_PATH` at it, and reloads
+`config.settings`. Use it like:
+
+```python
+def test_my_new_thing_canonicalizes(project_registry):
+    project_registry(("p05-interferometer", ["p05", "interferometer"]))
+
+    # ... call your service function with "p05" ...
+    # ... assert it works the same as if you'd passed "p05-interferometer" ...
+```
+
+The fixture is reused by all 12 alias-canonicalization regression
+tests added in `fb6298a`. Following the same pattern for new
+features is the cheapest way to keep the contract intact.
+
+## What this rule does NOT cover
+
+1. **Alias creation / management.** This document is about reading
+   and writing project-keyed data. Adding new projects or new
+   aliases is the registry's own write path
+   (`POST /projects/register`, `PUT /projects/{name}`), which
+   already enforces collision detection and atomic file writes.
+2. **Registry hot-reloading.** The helper calls
+   `load_project_registry()` on every invocation, which reads the
+   JSON file each time. There is no in-process cache. If the
+   registry file changes, the next call sees the new contents.
+   Performance is fine for the current registry size but if it
+   becomes a bottleneck, add a versioned cache here, not at every
+   call site.
+3. **Cross-project deduplication.** If two different projects in
+   the registry happen to share an alias, the registry's collision
+   detection blocks the second one at registration time, so this
+   case can't arise in practice. The helper does not handle it
+   defensively.
+4. **Time-bounded canonicalization.** A project's canonical id is
+   stable. Aliases can be added or removed via
+   `PUT /projects/{name}`, but the canonical `id` field never
+   changes after registration. So a row written today under the
+   canonical id will always remain findable under that id, even
+   if the alias set evolves.
+5. **Migration of legacy data.** If the live Dalidou DB has rows
+   that were written under aliases before the canonicalization
+   landed (e.g. a `memories` row with `project = "p05"` from
+   before `fb6298a`), those rows are **NOT** automatically
+   reachable from the canonicalized read path. The unregistered-
+   name fallback only helps for project names that were never
+   registered at all; it does **NOT** help for names that are
+   registered as aliases. See the "Compatibility gap" section
+   above for the exact failure mode and the migration path that
+   has to run before the engineering layer V1 ships.
+
+## What this enables for the engineering layer V1
+
+When the engineering layer ships per `engineering-v1-acceptance.md`,
+it adds at least these new project-keyed surfaces:
+
+- `entities` table with a `project_id` column
+- `relationships` table that joins entities, indirectly project-keyed
+- `conflicts` table with a `project` column
+- `mirror_regeneration_failures` table with a `project` column
+- new endpoints: `POST /entities/...`, `POST /ingest/kb-cad/export`,
+  `POST /ingest/kb-fem/export`, `GET /mirror/{project}/...`,
+  `GET /conflicts?project=...`
+
+**Every one of those write/read paths needs to call
+`resolve_project_name` at its service-layer entry point**, following
+the same pattern as the eight existing call sites listed above. The
+implementation sprint should:
+
+1. Apply the helper at each new service entry point as the first
+   statement after input validation
+2. Add a regression test using the `project_registry` fixture that
+   exercises an alias against each new entry point
+3. Treat any new service function that takes a project name without
+   calling `resolve_project_name` as a code review failure
+
+The pattern is simple enough to follow without thinking, which is
+exactly the property we want for a contract that has to hold
+across many independent additions.
+
+## Open follow-ups
+
+These are things the canonicalization story still has open. Only
+the first is a blocker (it must land before engineering V1 ships);
+the rest are rough edges to be aware of.
+
+1. **Legacy alias data migration — REQUIRED before engineering V1
+   ships, NOT optional.** If the live Dalidou DB has any rows
+   written under aliases before `fb6298a` landed, they are
+   silently invisible to the canonicalized read path (see the
+   "Compatibility gap" section above for the exact failure mode).
+   This is a real correctness issue, not a theoretical one: any
+   trusted state, memory, or interaction stored under `p05`,
+   `gigabit`, `polisher`, etc. before the fix landed is currently
+   unreachable from any service-layer query. The migration script
+   has to walk `projects`, `project_state`, `memories`, and
+   `interactions`, merge shadow rows into their canonical
+   counterparts (with conflict-model handling for any collisions),
+   and run in dry-run mode first. Estimated cost: ~150 LOC for
+   the migration script + ~50 LOC of tests + a one-time supervised
+   run on the live Dalidou DB. **This migration is the next
+   concrete pre-V1 step.**
+2. **Registry file caching.** `load_project_registry()` reads the
+   JSON file on every `resolve_project_name` call. With ~5
+   projects this is fine; with 50+ it would warrant a versioned
+   cache (cache key = file mtime + size). Defer until measured.
+3. **Case sensitivity audit.** The helper uses
+   `get_registered_project` which lowercases for comparison. The
+   stored canonical id keeps its original casing. No known bug
+   today (every test passes), but worth re-confirming when the
+   engineering layer adds entity-side storage.
+4. **`_rank_chunks`'s secondary substring boost.** Mentioned
+   earlier; still uses the raw hint. Replace it with the same
+   helper-driven approach the retriever uses, OR delete it as
+   redundant once we confirm the retriever's primary boost is
+   sufficient.
+5. **Documentation discoverability.** This doc lives under
+   `docs/architecture/`. The contract is also restated in the
+   docstring of `resolve_project_name` and referenced from each
+   call site's comment. That redundancy is intentional — the
+   contract is too easy to forget to live in only one place.
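
As a rough illustration of the dry-run-first migration described in follow-up 1, the sketch below counts and then rewrites alias-keyed rows in a single table. It is not the real migration script: the `migrate_alias_rows` name, the `MIGRATABLE_TABLES` allowlist, the `alias_to_canonical` mapping, and the assumption that the table has no uniqueness constraint on `project` are all hypothetical. It deliberately skips the collision handling that `project_state` would need.

```python
import sqlite3

# Hypothetical allowlist: only tables known to key rows by a plain
# `project` column. The table name is interpolated into SQL below,
# so it must never come from untrusted input.
MIGRATABLE_TABLES = {"memories"}


def migrate_alias_rows(conn, table, alias_to_canonical, dry_run=True):
    """Report (and optionally rewrite) rows stored under an alias.

    Returns {alias: row_count} for each alias that still has rows.
    Assumes no uniqueness constraint involves `project`, so no
    collision handling is attempted here.
    """
    if table not in MIGRATABLE_TABLES:
        raise ValueError(f"unsupported table: {table!r}")

    report = {}
    for alias, canonical in alias_to_canonical.items():
        (count,) = conn.execute(
            f"SELECT COUNT(*) FROM {table} WHERE project = ?", (alias,)
        ).fetchone()
        if count == 0:
            continue
        report[alias] = count
        if not dry_run:
            conn.execute(
                f"UPDATE {table} SET project = ? WHERE project = ?",
                (canonical, alias),
            )
    if not dry_run:
        conn.commit()
    return report
```

Calling it with the default `dry_run=True` only produces the report, which can be checked by hand before the supervised run with `dry_run=False`.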
+
+## Quick reference card
+
+Copy-pasteable for new service functions:
+
+```python
+from atocore.projects.registry import resolve_project_name
+
+
+def my_new_service_entry_point(
+    project_name: str,
+    other_args: ...,
+) -> ...:
+    # Validate inputs first
+    if not project_name:
+        raise ValueError("project_name is required")
+
+    # Canonicalize through the registry as the first thing after
+    # validation. Every subsequent operation in this function uses
+    # the canonical id, so storage and queries are guaranteed
+    # consistent across alias and canonical-id callers.
+    project_name = resolve_project_name(project_name)
+
+    # ... rest of the function ...
+```
+
+## TL;DR
+
+- One helper, one rule: `resolve_project_name` at every service-layer
+  entry point that takes a project name
+- Currently called in 8 places across builder, project_state,
+  interactions, and memory; all 8 listed in this doc
+- Backwards-compat path returns **unregistered** names unchanged
+  (e.g. `"orphan-project"`); this does NOT cover **registered
+  alias** names that were used as storage keys before `fb6298a`
+- **Real compatibility gap**: any row whose `project` column is a
+  registered alias from before the canonicalization landed is
+  silently invisible to the new read path. A one-time migration
+  is required before engineering V1 ships. See the "Compatibility
+  gap" section.
+- The trust hierarchy depends on this helper being applied
+  everywhere — Layer 3 trusted state has to be findable for it to
+  win the trust battle
+- Use the `project_registry` test fixture to add regression tests
+  for any new service function that takes a project name
+- The engineering layer V1 implementation must follow the same
+  pattern at every new service entry point
+- Open follow-ups (in priority order): **legacy alias data
+  migration (required pre-V1)**, redundant substring boost
+  cleanup, registry caching when projects scale
--- a/docs/architecture/representation-authority.md
+++ b/docs/architecture/representation-authority.md
@@ -0,0 +1,273 @@
+# Representation Authority (canonical home matrix)
+
+## Why this document exists
+
+The same fact about an engineering project can show up in many
+places: a markdown note in the PKM, a structured field in KB-CAD,
+a commit message in a Gitea repo, an active memory in AtoCore, an
+entity in the engineering layer, a row in trusted project state.
+**Without an explicit rule about which representation is
+authoritative for which kind of fact, the system will accumulate
+contradictions and the human will lose trust in all of them.**
+
+This document is the canonical-home matrix. Every kind of fact
+that AtoCore handles has exactly one authoritative representation,
+and every other place that holds a copy of that fact is, by
+definition, a derived view that may be stale.
+
+## The representations in scope
+
+Six places where facts can live in this ecosystem:
+
+| Layer | What it is | Who edits it | How it's structured |
+|---|---|---|---|
+| **PKM** | Antoine's Obsidian-style markdown vault under `/srv/storage/atocore/sources/vault/` | Antoine, by hand | unstructured markdown with optional frontmatter |
+| **KB project** | the engineering Knowledge Base (KB-CAD / KB-FEM repos and any companion docs) | Antoine, semi-structured | per-tool typed records |
+| **Gitea repos** | source code repos under `dalidou:3000/Antoine/*` (Fullum-Interferometer, polisher-sim, ATOCore itself, ...) | Antoine via git commits | code, READMEs, repo-specific markdown |
+| **AtoCore memories** | rows in the `memories` table | hand-authored or extracted from interactions | typed (identity / preference / project / episodic / knowledge / adaptation) |
+| **AtoCore entities** | rows in the `entities` table (V1, not yet built) | imported from KB exports or extracted from interactions | typed entities + relationships per the V1 ontology |
+| **AtoCore project state** | rows in the `project_state` table (Layer 3, trusted) | hand-curated only, never automatic | category + key + value |
+
+## The canonical home rule
+
+> For each kind of fact, exactly one of the six representations is
+> the authoritative source. The other five may hold derived
+> copies, but they are not allowed to disagree with the
+> authoritative one. When they disagree, the disagreement is a
+> conflict and surfaces via the conflict model.
+
+The matrix below assigns the authoritative representation per fact
+kind. It is the practical answer to the question "where does this
+fact actually live?" for daily decisions.
+
+## The canonical-home matrix
+
+| Fact kind | Canonical home | Why | How it gets into AtoCore |
+|---|---|---|---|
+| **CAD geometry** (the actual model) | NX (or successor CAD tool) | the only place that can render and validate it | not in AtoCore at all in V1 |
+| **CAD-side structure** (subsystem tree, component list, materials, parameters) | KB-CAD | KB-CAD is the structured wrapper around NX | KB-CAD export → `/ingest/kb-cad/export` → entities |
+| **FEM mesh & solver settings** | KB-FEM (wrapping the FEM tool) | only the solver representation can run | not in AtoCore at all in V1 |
+| **FEM results & validation outcomes** | KB-FEM | KB-FEM owns the outcome records | KB-FEM export → `/ingest/kb-fem/export` → entities |
+| **Source code** | Gitea repos | repos are version-controlled and reviewable | indirectly via repo markdown ingestion (Phase 1) |
+| **Repo-level documentation** (READMEs, design docs in the repo) | Gitea repos | lives next to the code it documents | ingested as source chunks; never hand-edited in AtoCore |
+| **Project-level prose notes** (decisions in long-form, journal-style entries, working notes) | PKM | the place Antoine actually writes when thinking | ingested as source chunks; the extractor proposes candidates from these for the review queue |
+| **Identity** ("the user is a mechanical engineer running AtoCore") | AtoCore memories (`identity` type) | nowhere else holds personal identity | hand-authored via `POST /memory` or extracted from interactions |
+| **Preference** ("prefers small reviewable diffs", "uses SI units") | AtoCore memories (`preference` type) | nowhere else holds personal preferences | hand-authored or extracted |
+| **Episodic** ("on April 6 we debugged the EXDEV bug") | AtoCore memories (`episodic` type) | nowhere else has time-bound personal recall | extracted from captured interactions |
+| **Decision** (a structured engineering decision) | AtoCore **entities** (Decision) once the engineering layer ships; AtoCore memories (`adaptation`) until then | needs structured supersession, audit trail, and link to affected components | extracted from PKM or interactions; promoted via review queue |
+| **Requirement** | AtoCore **entities** (Requirement) | needs structured satisfaction tracking | extracted from PKM, KB-CAD, or interactions |
+| **Constraint** | AtoCore **entities** (Constraint) | needs structured link to the entity it constrains | extracted from PKM, KB-CAD, or interactions |
+| **Validation claim** | AtoCore **entities** (ValidationClaim) | needs structured link to supporting Result | extracted from KB-FEM exports or interactions |
+| **Material** | KB-CAD if the material is on a real component; AtoCore entity (Material) if it's a project-wide material decision not yet attached to geometry | structured properties live in KB-CAD's material database | KB-CAD export, or hand-authored as a Material entity |
+| **Parameter** | KB-CAD or KB-FEM depending on whether it's a geometry or solver parameter; AtoCore entity (Parameter) if it's a higher-level project parameter not in either tool | structured numeric values with units live in their tool of origin | KB export, or hand-authored |
+| **Project status / current focus / next milestone** | AtoCore **project_state** (Layer 3) | the trust hierarchy says trusted state is the highest authority for "what is the current state of the project" | hand-curated via `POST /project/state` |
+| **Architectural decision records (ADRs)** | depends on form: long-form ADR markdown lives in the repo; the structured fact about which ADR was selected lives in the AtoCore Decision entity | both representations are useful for different audiences | repo ingestion provides the prose; the entity is created by extraction or hand-authored |
+| **Operational runbooks** | repo (next to the code they describe) | lives with the system it operates | not promoted into AtoCore entities — runbooks are reference material, not facts |
+| **Backup metadata** (snapshot timestamps, integrity status) | the backup-metadata.json files under `/srv/storage/atocore/backups/` | each snapshot is its own self-describing record | not in AtoCore's database; queried via the `/admin/backup` endpoints |
+| **Conversation history with AtoCore (interactions)** | AtoCore `interactions` table | nowhere else has the prompt + context pack + response triple | written by capture (Phase 9 Commit A) |
+
+## The supremacy rule for cross-layer facts
+
+When the same fact has copies in multiple representations and they
+disagree, the trust hierarchy applies in this order:
+
+1. **AtoCore project_state** (Layer 3) is highest authority for any
+   "current state of the project" question. This is why it requires
+   manual curation and never gets touched by automatic processes.
+2. **The tool-of-origin canonical home** is highest authority for
+   facts that are tool-managed: KB-CAD wins over AtoCore entities
+   for CAD-side structure facts; KB-FEM wins for FEM result facts.
+3. **AtoCore entities** are highest authority for facts that are
+   AtoCore-managed: Decisions, Requirements, Constraints,
+   ValidationClaims (when the supporting Results are still loose).
+4. **Active AtoCore memories** are highest authority for personal
+   facts (identity, preference, episodic).
+5. **Source chunks (PKM, repos, ingested docs)** are lowest
+   authority — they are the raw substrate from which higher layers
+   are extracted, but they may be stale or contradict one
+   another.
+
+This is the same hierarchy enforced by `conflict-model.md`. This
+document just makes it explicit per fact kind.
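
The ordering above can be read as a simple ranking. The following is a minimal sketch, assuming each copy of a fact is tagged with the layer it came from. The layer labels and the function names `pick_authoritative` and `disagreeing_copies` are illustrative, not existing AtoCore code. It also flattens the hierarchy: in the list above, rules 2 to 4 are scoped by fact kind, which this sketch does not model.

```python
# Layer labels are illustrative; the order mirrors the list above.
TRUST_ORDER = (
    "project_state",   # 1. Layer 3, hand-curated
    "tool_of_origin",  # 2. KB-CAD / KB-FEM for tool-managed facts
    "entity",          # 3. AtoCore entities
    "memory",          # 4. active AtoCore memories
    "source_chunk",    # 5. PKM, repos, ingested docs
)
_RANK = {layer: rank for rank, layer in enumerate(TRUST_ORDER)}


def pick_authoritative(copies):
    """Return the (layer, value) pair from the most trusted layer."""
    if not copies:
        raise ValueError("no copies to compare")
    return min(copies, key=lambda copy: _RANK[copy[0]])


def disagreeing_copies(copies):
    """Return copies whose value differs from the authoritative one."""
    _, winning_value = pick_authoritative(copies)
    return [copy for copy in copies if copy[1] != winning_value]
```

Anything returned by `disagreeing_copies` corresponds to what the conflict model would surface for review rather than silently overwrite.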
+
+## Examples
+
+### Example 1 — "what material does the lateral support pad use?"
+
+Possible representations:
+
+- KB-CAD has the field `component.lateral-support-pad.material = "GF-PTFE"`
+- A PKM note from last month says "considering PEEK for the
+  lateral support, GF-PTFE was the previous choice"
+- An AtoCore Material entity says `GF-PTFE`
+- An AtoCore project_state entry says `p05 / decision /
+  lateral_support_material = GF-PTFE`
+
+Which one wins for the question "what's the current material"?
+
+- **project_state wins** if the query is "what is the current
+  trusted answer for p05's lateral support material" (Layer 3)
+- **KB-CAD wins** if project_state has not been curated for this
+  field yet, because KB-CAD is the canonical home for CAD-side
+  structure
+- **The Material entity** is a derived view from KB-CAD; if it
+  disagrees with KB-CAD, the entity is wrong and a conflict is
+  surfaced
+- **The PKM note** is historical context, not authoritative for
+  "current"
+
+### Example 2 — "did we decide to merge the bind mounts?"
+
+Possible representations:
+
+- A working session interaction is captured in the `interactions`
+  table with the response containing `## Decision: merge the two
+  bind mounts into one`
+- The Phase 9 Commit C extractor produced a candidate adaptation
+  memory from that decision
+- A reviewer promoted the candidate to active
+- The AtoCore source repo has the actual code change in commit
+  `d0ff8b5` and the docker-compose.yml is in its post-merge form
+
+Which one wins for "is this decision real and current"?
+
+- **The Gitea repo** wins for "is this decision implemented" —
+  the docker-compose.yml is the canonical home for the actual
+  bind mount configuration
+- **The active adaptation memory** wins for "did we decide this"
+  — that's exactly what the Commit C lifecycle is for
+- **The interaction record** is the audit trail — it's
+  authoritative for "when did this conversation happen and what
+  did the LLM say", but not for "is this decision current"
+- **The source chunks** from PKM are not relevant here because no
+  PKM note about this decision exists yet (and that's fine —
+  decisions don't have to live in PKM if they live in the repo
+  and the AtoCore memory)
+
+### Example 3 — "what's p05's current next focus?"
+
+Possible representations:
+
+- The PKM has a `current-status.md` note updated last week
+- AtoCore project_state has `p05 / status / next_focus = "wave 2 ingestion"`
+- A captured interaction from yesterday discussed the next focus
+  at length
+
+Which one wins?
+
+- **project_state wins**, full stop. The trust hierarchy says
+  Layer 3 is canonical for current state. This is exactly the
+  reason project_state exists.
+- The PKM note is historical context.
+- The interaction is conversation history.
+- If project_state and the PKM disagree, the human updates one or
+  the other to bring them in line — usually by re-curating
+  project_state if the conversation revealed a real change.
+
+## What this means for the engineering layer V1 implementation
+
+Several concrete consequences fall out of the matrix:
+
+1. **The Material and Parameter entity types are mostly KB-CAD
+   shadows in V1.** They exist in AtoCore so other entities
+   (Decisions, Requirements) can reference them with structured
+   links, but their authoritative values come from KB-CAD imports.
+   If KB-CAD doesn't know about a material, the AtoCore entity is
+   the canonical home only because nothing else is.
+2. **Decisions / Requirements / Constraints / ValidationClaims
+   are AtoCore-canonical.** These don't have a natural home in
+   KB-CAD or KB-FEM. They live in AtoCore as first-class entities
+   with full lifecycle and supersession.
+3. **The PKM is never authoritative.** It is the substrate for
+   extraction. The reviewer promotes things out of it; they don't
+   point at PKM notes as the "current truth".
+4. **project_state is the override layer.** Whenever the human
+   wants to declare "the current truth is X regardless of what
+   the entities and memories and KB exports say", they curate
+   into project_state. Layer 3 is intentionally small and
+   intentionally manual.
+5. **The conflict model is the enforcement mechanism.** When two
+   representations disagree on a fact whose canonical home rule
+   should pick a winner, the conflict surfaces via the
+   `/conflicts` endpoint and the reviewer resolves it. The
+   matrix in this document tells the reviewer who is supposed
+   to win in each scenario; they're not making the decision blind.
+
+## What the matrix does NOT define
+
+1. **Facts about people other than the user.** No "team member"
+   entity, no per-collaborator preferences. AtoCore is
+   single-user in V1.
+2. **Facts about AtoCore itself as a project.** Those are project
+   memories and project_state entries under `project=atocore`,
+   same lifecycle as any other project's facts.
+3. **Vendor / supplier / cost facts.** Out of V1 scope.
+4. **Time-bounded facts** (a value that was true between two
+   dates and may not be true now). The current matrix treats all
+   active facts as currently-true and uses supersession to
+   represent change. Temporal facts are a V2 concern.
+5. **Cross-project shared facts** (a Material that is reused across
+   p04, p05, and p06). Currently each project has its own copy.
+   Cross-project deduplication is also a V2 concern.
+
+## The "single canonical home" invariant in practice
+
+The hard rule that every fact has exactly one canonical home is
+the load-bearing invariant of this matrix. To enforce it
+operationally:
+
+- **Extraction never duplicates.** When the extractor scans an
+  interaction or a source chunk and proposes a candidate, the
+  candidate is dropped if it duplicates an already-active record
+  in the canonical home (the existing extractor implementation
+  already does this for memories; the entity extractor will
+  follow the same pattern).
+- **Imports never duplicate.** When KB-CAD pushes the same
+  Component twice with the same value, the second push is
+  recognized as identical and updates the `last_imported_at`
+  timestamp without creating a new entity.
+- **Imports surface drift as conflict.** When KB-CAD pushes the
+  same Component with a different value, that's a conflict per
+  the conflict model — never a silent overwrite.
+- **Hand-curation into project_state always wins.** A
+  project_state entry can disagree with an entity or a KB
+  export; the project_state entry is correct by fiat (Layer 3
+  trust), and the reviewer is responsible for bringing the lower
+  layers in line if appropriate.
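
The two import rules above (identical pushes only refresh a timestamp, differing pushes become conflicts) could look roughly like this. It is a sketch under stated assumptions: entities are modelled as plain dicts keyed by id, and `apply_import` with its three outcome buckets is a hypothetical helper, not the planned implementation.

```python
from datetime import datetime, timezone


def apply_import(active, incoming, now=None):
    """Classify each imported record against the active entities.

    active:   {entity_id: {"value": ..., "last_imported_at": ...}}
    incoming: {entity_id: value}
    Active values are never overwritten; only the timestamp of an
    identical record is refreshed.
    """
    now = now or datetime.now(timezone.utc)
    outcome = {"candidates": [], "refreshed": [], "conflicts": []}
    for entity_id, value in incoming.items():
        current = active.get(entity_id)
        if current is None:
            # Unknown id: becomes a new candidate for review.
            outcome["candidates"].append((entity_id, value))
        elif current["value"] == value:
            # Identical push: no new entity, just a fresher timestamp.
            current["last_imported_at"] = now
            outcome["refreshed"].append(entity_id)
        else:
            # Drift: record both values, leave the active one untouched.
            outcome["conflicts"].append((entity_id, current["value"], value))
    return outcome
```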
+
+## Open questions for V1 implementation
+
+1. **How does the reviewer see the canonical home for a fact in
+   the UI?** Probably by including the fact's authoritative
+   layer in the entity / memory detail view: "this Material is
+   currently mirrored from KB-CAD; the canonical home is KB-CAD".
+2. **Who owns running the KB-CAD / KB-FEM exporter?** The
+   `tool-handoff-boundaries.md` doc lists this as an open
+   question; same answer applies here.
+3. **Do we need an explicit `canonical_home` field on entity
+   rows?** A field that records "this entity is canonical here"
+   vs "this entity is a mirror of <external system>". Probably
+   yes; deferred to the entity schema spec.
+4. **How are project_state overrides surfaced in the engineering
+   layer query results?** When a query (e.g. Q-001 "what does
+   this subsystem contain?") would return entity rows, the result
+   should also flag any project_state entries that contradict the
+   entities — letting the reviewer see the override at query
+   time, not just in the conflict queue.
+
+## TL;DR
+
+- Six representation layers: PKM, KB project, repos, AtoCore
+  memories, AtoCore entities, AtoCore project_state
+- Every fact kind has exactly one canonical home
+- The trust hierarchy resolves cross-layer conflicts:
+  project_state > tool-of-origin (KB-CAD/KB-FEM) > entities >
+  active memories > source chunks
+- Decisions / Requirements / Constraints / ValidationClaims are
+  AtoCore-canonical (no other system has a natural home for them)
+- Materials / geometry Parameters / CAD-side structure are
+  KB-CAD-canonical; solver Parameters are KB-FEM-canonical
+- FEM results / validation outcomes are KB-FEM-canonical
+- project_state is the human override layer, top of the
+  hierarchy, manually curated only
+- Conflicts surface via `/conflicts` and the reviewer applies the
+  matrix to pick a winner
--- a/docs/architecture/tool-handoff-boundaries.md
+++ b/docs/architecture/tool-handoff-boundaries.md
@@ -0,0 +1,339 @@
+# Tool Hand-off Boundaries (KB-CAD / KB-FEM and friends)
+
+## Why this document exists
+
+The engineering layer V1 will accumulate typed entities about
+projects, subsystems, components, materials, requirements,
+constraints, decisions, parameters, analysis models, results, and
+validation claims. Many of those concepts also live in real
+external tools — CAD systems, FEM solvers, BOM managers, PLM
+databases, vendor portals.
+
+The first big design decision before writing any entity-layer code
+is: **what is AtoCore's read/write relationship with each of those
+external tools?**
+
+The wrong answer in either direction is expensive:
+
+- Too read-only: AtoCore becomes a stale shadow of the tools and
+  loses the trust battle the moment a value drifts.
+- Too bidirectional: AtoCore takes on responsibilities it can't
+  reliably honor (live sync, conflict resolution against external
+  schemas, write-back validation), and the project never ships.
+
+This document picks a position for V1.
+
+## The position
+
+> **AtoCore is a one-way mirror in V1.** External tools push
+> structured exports into AtoCore. AtoCore never pushes back.
+
+That position has three corollaries:
+
+1. **External tools remain the source of truth for everything they
+   already manage.** A CAD model is canonical for geometry; a FEM
+   project is canonical for meshes and solver settings; KB-CAD is
+   canonical for whatever KB-CAD already calls canonical.
+2. **AtoCore is the source of truth for the *AtoCore-shaped*
+   record** of those facts: the Decision that selected the geometry,
+   the Requirement the geometry satisfies, the ValidationClaim the
+   FEM result supports. AtoCore does not duplicate the external
+   tool's primary representation; it stores the structured *facts
+   about* it.
+3. **The boundary is enforced by absence.** No write endpoint in
+   AtoCore ever generates a `.prt`, a `.fem`, an export to a PLM
+   schema, or a vendor purchase order. If we find ourselves wanting
+   to add such an endpoint in V1, we should stop and reconsider
+   the V1 scope.
+
+## Why one-way and not bidirectional
+
+Bidirectional sync between independent systems is one of the
+hardest problems in engineering software. The honest reasons we
+are not attempting it in V1:
+
+1. **Schema drift.** External tools evolve their schemas
+   independently. A bidirectional sync would have to track every
+   schema version of every external tool we touch. That is a
+   permanent maintenance tax.
+2. **Conflict semantics.** When AtoCore and an external tool
+   disagree on the same field, "who wins" is a per-tool, per-field
+   decision. There is no general rule. Bidirectional sync would
+   require us to specify that decision exhaustively.
+3. **Trust hierarchy.** AtoCore's whole point is the trust
+   hierarchy: trusted project state > entities > memories. If we
+   let entities push values back into the external tools, we
+   silently elevate AtoCore's confidence to "high enough to write
+   to a CAD model", which it almost never deserves.
+4. **Velocity.** A bidirectional engineering layer is a
+   multi-year project. A one-way mirror takes months. The
+   value-to-effort ratio favors one-way for V1 by an enormous
+   margin.
+5. **Reversibility.** We can always add bidirectional sync later
+   on a per-tool basis once V1 has shown itself to be useful. We
+   cannot easily walk back a half-finished bidirectional sync that
+   has already corrupted data in someone's CAD model.
+
+## Per-tool stance for V1
+
+| External tool | V1 stance | What AtoCore reads in | What AtoCore writes back |
+|---|---|---|---|
+| **KB-CAD** (Antoine's CAD knowledge base) | one-way mirror | structured exports of subsystems, components, materials, parameters via a documented JSON or CSV shape | nothing |
+| **KB-FEM** (Antoine's FEM knowledge base) | one-way mirror | structured exports of analysis models, results, validation claims | nothing |
+| **NX / Siemens NX** (the CAD tool itself) | not connected in V1 | nothing direct — only what KB-CAD exports about NX projects | nothing |
+| **PKM (Obsidian / markdown vault)** | already connected via the ingestion pipeline (Phase 1) | full markdown/text corpus per the ingestion-waves doc | nothing |
+| **Gitea repos** | already connected via the ingestion pipeline | repo markdown/text per project | nothing |
+| **OpenClaw** (the LLM agent) | already connected via the read-only helper skill on the T420 | nothing — OpenClaw reads from AtoCore | nothing — OpenClaw does not write into AtoCore |
+| **AtoDrive** (operational truth layer, future) | future: bidirectional with AtoDrive itself, but AtoDrive is internal to AtoCore so this isn't an external tool boundary | n/a in V1 | n/a in V1 |
+| **PLM / vendor portals / cost systems** | not in V1 scope | nothing | nothing |
+
+## What "one-way mirror" actually looks like in code
+
+AtoCore exposes an ingestion endpoint per external tool that
+accepts a structured export and turns it into entity candidates.
+The endpoint is read-side from AtoCore's perspective (it reads
+from a file or HTTP body), even though the external tool is the
+one initiating the call.
+
+Proposed V1 ingestion endpoints:
+
+```
+POST /ingest/kb-cad/export       body: KB-CAD export JSON
+POST /ingest/kb-fem/export       body: KB-FEM export JSON
+```
+
+Each endpoint:
+
+1. Validates the export against the documented schema
+2. Maps each export record to an entity candidate (status="candidate")
+3. Carries the export's source identifier into the candidate's
+   provenance fields (source_artifact_id, exporter_version, etc.)
+4. Returns a summary: how many candidates were created, how many
+   were dropped as duplicates, how many failed schema validation
+5. Does NOT auto-promote anything
+
+The KB-CAD and KB-FEM teams (which is to say, future-you) own the
+exporter scripts that produce these JSON bodies. Those scripts
+live in the KB-CAD / KB-FEM repos respectively, not in AtoCore.
+
+## The export schemas (sketch, not final)
+
+These are starting shapes, intentionally minimal. The schemas
+will be refined in `kb-cad-export-schema.md` and
+`kb-fem-export-schema.md` once the V1 ontology lands.
+
+### KB-CAD export shape (starting sketch)
+
+```json
+{
+  "exporter": "kb-cad",
+  "exporter_version": "1.0.0",
+  "exported_at": "2026-04-07T12:00:00Z",
+  "project": "p05-interferometer",
+  "subsystems": [
+    {
+      "id": "subsystem.optical-frame",
+      "name": "Optical frame",
+      "parent": null,
+      "components": [
+        {
+          "id": "component.lateral-support-pad",
+          "name": "Lateral support pad",
+          "material": "GF-PTFE",
+          "parameters": {
+            "thickness_mm": 3.0,
+            "preload_n": 12.0
+          },
+          "source_artifact": "kb-cad://p05/subsystems/optical-frame#lateral-support"
+        }
+      ]
+    }
+  ]
+}
+```
+
+### KB-FEM export shape (starting sketch)
+
+```json
+{
+  "exporter": "kb-fem",
+  "exporter_version": "1.0.0",
+  "exported_at": "2026-04-07T12:00:00Z",
+  "project": "p05-interferometer",
+  "analysis_models": [
+    {
+      "id": "model.optical-frame-modal",
+      "name": "Optical frame modal analysis v3",
+      "subsystem": "subsystem.optical-frame",
+      "results": [
+        {
+          "id": "result.first-mode-frequency",
+          "name": "First-mode frequency",
+          "value": 187.4,
+          "unit": "Hz",
+          "supports_validation_claim": "claim.frame-rigidity-min-150hz",
+          "source_artifact": "kb-fem://p05/models/optical-frame-modal#first-mode"
+        }
+      ]
+    }
+  ]
+}
+```
+
+These shapes will evolve. The point of including them now is to
+make the one-way mirror concrete: it is a small, well-defined
+JSON shape, not "AtoCore reaches into KB-CAD's database".
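
To make the endpoint steps concrete against the KB-CAD starting shape, here is a minimal sketch of the mapping from an export to entity candidates. Everything in it is an assumption layered on the sketch schema: the function name `kb_cad_export_to_candidates`, the candidate dict layout, and the choice of required keys are hypothetical, and duplicate detection is left out entirely.

```python
REQUIRED_TOP_LEVEL = (
    "exporter", "exporter_version", "exported_at", "project", "subsystems",
)


def kb_cad_export_to_candidates(export):
    """Map a KB-CAD export dict to Component candidates plus a summary."""
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in export]
    if missing:
        raise ValueError(f"export is missing keys: {missing}")

    candidates = []
    failed = 0
    for subsystem in export["subsystems"]:
        for component in subsystem.get("components", []):
            # A record without an id or a source artifact cannot carry
            # provenance, so it is counted as a validation failure.
            if "id" not in component or "source_artifact" not in component:
                failed += 1
                continue
            candidates.append({
                "entity_type": "Component",
                "entity_id": component["id"],
                "project": export["project"],
                "status": "candidate",  # never auto-promoted
                "fields": {
                    "name": component.get("name"),
                    "material": component.get("material"),
                    "parameters": component.get("parameters", {}),
                },
                "provenance": {
                    "source_artifact_id": component["source_artifact"],
                    "exporter": export["exporter"],
                    "exporter_version": export["exporter_version"],
                    "exported_at": export["exported_at"],
                },
            })
    return {
        "candidates": candidates,
        "created": len(candidates),
        "failed_validation": failed,
    }
```

Only the fields named in `fields` survive the import boundary in this sketch; any other key on a component record is dropped.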
+
+## What AtoCore is allowed to do with the imported records
+
+After ingestion, the imported records become entity candidates
+in AtoCore's own table. From that point forward they follow the
+exact same lifecycle as any other candidate:
+
+- they sit at status="candidate" until a human reviews them
+- the reviewer promotes them to status="active" or rejects them
+- the active entities are queryable via the engineering query
+  catalog (Q-001 through Q-020)
+- the active entities can be referenced from Decisions, Requirements,
+  ValidationClaims, etc. via the V1 relationship types
+
+The imported records are never automatically pushed into trusted
+project state, never modified in place after import (they are
+superseded by re-imports, not edited), and never written back to
+the external tool.
+
+## What happens when KB-CAD changes a value AtoCore already has
+
+This is the canonical "drift" scenario. The flow:
+
+1. KB-CAD exports a fresh JSON. Component `component.lateral-support-pad`
+   now has `material: "PEEK"` instead of `material: "GF-PTFE"`.
+2. AtoCore's ingestion endpoint sees the same `id` and a different
+   value.
+3. The ingestion endpoint creates a new entity candidate with the
+   new value, **does NOT delete or modify the existing active
+   entity**, and creates a `conflicts` row linking the two members
+   (per the conflict model doc).
+4. The reviewer sees an open conflict on the next visit to
+   `/conflicts`.
+5. The reviewer either:
+   - **promotes the new value** (the active is superseded, the
+     candidate becomes the new active, the audit trail keeps both)
+   - **rejects the new value** (the candidate is invalidated, the
+     active stays — useful when the export was wrong)
+   - **dismisses the conflict** (declares them not actually about
+     the same thing, both stay active)
+
+The reviewer never touches KB-CAD from AtoCore. If the resolution
+implies a change in KB-CAD itself, the reviewer makes that change
+in KB-CAD, then re-exports.
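
The three reviewer outcomes in step 5 amount to a small transition table. The sketch below is one possible encoding, assuming entities carry a `status` field. The status strings `"superseded"` and `"invalid"`, the action names, and `resolve_conflict` itself are placeholders, since the real vocabulary belongs to the conflict model and entity schema docs.

```python
# action -> (new status of the active entity, new status of the candidate)
TRANSITIONS = {
    "promote": ("superseded", "active"),  # new value wins, both kept for audit
    "reject": ("active", "invalid"),      # export was wrong, active stays
    "dismiss": ("active", "active"),      # not the same fact, both stay
}


def resolve_conflict(active, candidate, action):
    """Return updated copies of both entities; inputs are not mutated."""
    if action not in TRANSITIONS:
        raise ValueError(f"unknown reviewer action: {action!r}")
    active_status, candidate_status = TRANSITIONS[action]
    return (
        {**active, "status": active_status},
        {**candidate, "status": candidate_status},
    )
```

None of the three actions touches KB-CAD, which keeps the sketch consistent with the one-way mirror position.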
+
+## What about NX directly?
+
+NX (Siemens NX) is the underlying CAD tool that KB-CAD wraps.
+**NX is not connected to AtoCore in V1.** Any facts about NX
+projects flow through KB-CAD as the structured intermediate. This
+gives us:
+
+- **One schema to maintain.** AtoCore only has to understand the
+  KB-CAD export shape, not the NX API.
+- **One ownership boundary.** KB-CAD owns the question of "what's
+  in NX". AtoCore owns the question of "what's in the typed
+  knowledge base".
+- **Future flexibility.** When NX is replaced or upgraded, only
+  KB-CAD has to adapt; AtoCore doesn't notice.
+
+The same logic applies to FEM solvers (Nastran, Abaqus, ANSYS):
+KB-FEM is the structured intermediate, AtoCore never talks to the
+solver directly.
+
+## The hard-line invariants
+
+These are the things V1 will not do, regardless of how convenient
+they might seem:
+
+1. **No write to external tools.** No POST/PUT/PATCH to any
+   external API, no file generation that gets written into a
+   CAD/FEM project tree, no email/chat sends.
+2. **No live polling.** AtoCore does not poll KB-CAD or KB-FEM on
+   a schedule. Imports are explicit pushes from the external tool
+   into AtoCore's ingestion endpoint.
+3. **No silent merging.** Every value drift surfaces as a
+   conflict for the reviewer (per the conflict model doc).
+4. **No schema fan-out.** AtoCore does not store every field that
+   KB-CAD knows about. Only fields that map to one of the V1
+   entity types make it into AtoCore. Everything else is dropped
+   at the import boundary.
+5. **No external-tool-specific logic in entity types.** A
+   `Component` in AtoCore is the same shape regardless of whether
+   it came from KB-CAD, KB-FEM, the PKM, or a hand-curated
+   project state entry. The source is recorded in provenance,
+   not in the entity shape.
+
+## What this enables
+
+With the one-way mirror locked in, V1 implementation can focus on:
+
+- The entity table and its lifecycle
+- The two `/ingest/kb-cad/export` and `/ingest/kb-fem/export`
+  endpoints with their JSON validators
+- The candidate review queue extension (already designed in
+  `promotion-rules.md`)
+- The conflict model (already designed in `conflict-model.md`)
+- The query catalog implementation (already designed in
+  `engineering-query-catalog.md`)
+
+None of those are unbounded. Each is a finite, well-defined
+implementation task. The one-way mirror is the choice that makes
+V1 finishable.
+
+## What V2 might consider (deferred)
+
+After V1 has been live and demonstrably useful for a quarter or
+two, the questions that become reasonable to revisit:
+
+1. **Selective write-back to KB-CAD for low-risk fields.** For
+   example, AtoCore could push back a "Decision id linked to this
+   component" annotation that KB-CAD then displays without it
+   being canonical there. Read-only annotations from AtoCore's
+   perspective, advisory metadata from KB-CAD's perspective.
+2. **Live polling for very small payloads.** A daily poll of
+   "what subsystem ids exist in KB-CAD now" so AtoCore can flag
+   subsystems that disappeared from KB-CAD without an explicit
+   AtoCore invalidation.
+3. **Direct NX integration** if the KB-CAD layer becomes a
+   bottleneck — but only if the friction is real, not theoretical.
+4. **Cost / vendor / PLM connections** for projects where the
+   procurement cycle is part of the active engineering work.
+
+None of these are V1 work and they are listed only so the V1
+design intentionally leaves room for them later.
+
+## Open questions for the V1 implementation sprint
+
+1. **Where do the export schemas live?** Probably in
+   `docs/architecture/kb-cad-export-schema.md` and
+   `docs/architecture/kb-fem-export-schema.md`, drafted during
+   the implementation sprint.
+2. **Who runs the exporter?** A scheduled job on the KB-CAD /
+   KB-FEM hosts, triggered by the human after a meaningful
+   change, or both?
+3. **Is the export incremental or full?** Full is simpler but
+   more expensive. Incremental needs delta semantics. V1 starts
+   with full and revisits when full becomes too slow.
+4. **How is the exporter authenticated to AtoCore?** Probably
+   the existing PAT model (one PAT per exporter, scoped to
+   `write:engineering-import` once that scope exists). Worth a
+   quick auth design pass before the endpoints exist.
+
+## TL;DR
+
+- AtoCore is a one-way mirror in V1: external tools push,
+  AtoCore reads, AtoCore never writes back
+- Two import endpoints for V1: KB-CAD and KB-FEM, each with a
+  documented JSON export shape
+- Drift surfaces as conflicts in the existing conflict model
+- No NX, no FEM solvers, no PLM, no vendor portals, no
+  cost/BOM systems in V1
+- Bidirectional sync is reserved for V2+ on a per-tool basis,
+  only after V1 demonstrates value
--- a/docs/backup-restore-procedure.md
+++ b/docs/backup-restore-procedure.md
@@ -0,0 +1,442 @@
+# AtoCore Backup and Restore Procedure
+
+## Scope
+
+This document defines the operational procedure for backing up and
+restoring AtoCore's machine state on the Dalidou deployment. It is
+the practical companion to `docs/backup-strategy.md` (which defines
+the strategy) and `src/atocore/ops/backup.py` (which implements the
+mechanics).
+
+The intent is that this procedure can be followed by anyone with
+SSH access to Dalidou and the AtoCore admin endpoints.
+
+## What gets backed up
+
+A `create_runtime_backup` snapshot contains, in order of importance:
+
+| Artifact | Source path on Dalidou | Backup destination | Always included |
+|---|---|---|---|
+| SQLite database | `/srv/storage/atocore/data/db/atocore.db` | `<backup_root>/db/atocore.db` | yes |
+| Project registry JSON | `/srv/storage/atocore/config/project-registry.json` | `<backup_root>/config/project-registry.json` | yes (if file exists) |
+| Backup metadata | (generated) | `<backup_root>/backup-metadata.json` | yes |
+| Chroma vector store | `/srv/storage/atocore/data/chroma/` | `<backup_root>/chroma/` | only when `include_chroma=true` |
+
+The SQLite snapshot uses the online `conn.backup()` API and is safe
+to take while the database is in use. The Chroma snapshot is a cold
+directory copy and is **only safe when no ingestion is running**;
+the API endpoint enforces this by acquiring the ingestion lock for
+the duration of the copy.
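+
+For readers unfamiliar with the online backup API, the sketch below
+shows the general technique using Python's standard
+`sqlite3.Connection.backup()`. It is an illustration only: the
+`snapshot_sqlite` helper and its arguments are hypothetical, and the
+real logic lives in `src/atocore/ops/backup.py`.
+
+```python
+import sqlite3
+from pathlib import Path
+
+
+def snapshot_sqlite(src_path: Path, dst_path: Path) -> None:
+    """Copy a SQLite database with the online backup API."""
+    dst_path.parent.mkdir(parents=True, exist_ok=True)
+    src = sqlite3.connect(src_path)
+    dst = sqlite3.connect(dst_path)
+    try:
+        # Copies the database page by page; other connections to the
+        # source can keep using it while the copy runs.
+        src.backup(dst)
+    finally:
+        dst.close()
+        src.close()
+```
+
+A plain file copy of a database that is being written can capture a
+torn state, which is why the Chroma directory copy needs the
+ingestion lock while the SQLite snapshot does not.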
+
+What is **not** in the backup:
+
+- Source documents under `/srv/storage/atocore/sources/vault/` and
+  `/srv/storage/atocore/sources/drive/`. These are read-only
+  inputs and live in the user's PKM/Drive, which is backed up
+  separately by their own systems.
+- Application code. The container image is the source of truth for
+  code; recovery means rebuilding the image, not restoring code from
+  a backup.
+- Logs under `/srv/storage/atocore/logs/`.
+- Embeddings cache under `/srv/storage/atocore/data/cache/`.
+- Temp files under `/srv/storage/atocore/data/tmp/`.
+
+## Backup root layout
+
+Each backup snapshot lives in its own timestamped directory:
+
+```
+/srv/storage/atocore/backups/snapshots/
+  ├── 20260407T060000Z/
+  │   ├── backup-metadata.json
+  │   ├── db/
+  │   │   └── atocore.db
+  │   ├── config/
+  │   │   └── project-registry.json
+  │   └── chroma/                    # only if include_chroma=true
+  │       └── ...
+  ├── 20260408T060000Z/
+  │   └── ...
+  └── ...
+```
+
+The timestamp is UTC, format `YYYYMMDDTHHMMSSZ`.
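+
+As a small sketch of that format, the stamp can be produced and
+parsed with `strftime` / `strptime`. The helper names `make_stamp`
+and `parse_stamp` are hypothetical and not taken from the backup
+module.
+
+```python
+from datetime import datetime, timezone
+
+STAMP_FORMAT = "%Y%m%dT%H%M%SZ"
+
+
+def make_stamp(now: datetime) -> str:
+    """Format a timezone-aware datetime as a UTC snapshot stamp."""
+    return now.astimezone(timezone.utc).strftime(STAMP_FORMAT)
+
+
+def parse_stamp(stamp: str) -> datetime:
+    """Parse a snapshot stamp back into a UTC datetime."""
+    return datetime.strptime(stamp, STAMP_FORMAT).replace(tzinfo=timezone.utc)
+```
+
+Because the fields run from most to least significant and have a
+fixed width, sorting the directory names as strings also sorts them
+chronologically.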
+
+## Triggering a backup
+
+### Option A — via the admin endpoint (preferred)
+
+```bash
+# DB + registry only (fast, safe at any time)
+curl -fsS -X POST http://dalidou:8100/admin/backup \
+  -H "Content-Type: application/json" \
+  -d '{"include_chroma": false}'
+
+# DB + registry + Chroma (acquires ingestion lock)
+curl -fsS -X POST http://dalidou:8100/admin/backup \
+  -H "Content-Type: application/json" \
+  -d '{"include_chroma": true}'
+```
+
+The response is the backup metadata JSON. Save the `backup_root`
+field — that's the directory the snapshot was written to.
+
+### Option B — via the standalone script (when the API is down)
+
+```bash
+docker exec atocore python -m atocore.ops.backup
+```
+
+This runs `create_runtime_backup()` directly, without going through
+the API or the ingestion lock. Use it only when the AtoCore service
+itself is unhealthy and you can't hit the admin endpoint.
+
+### Option C — manual file copy (last resort)
+
+If both the API and the standalone script are unusable:
+
+```bash
+sudo systemctl stop atocore   # or: docker compose stop atocore
+TS=$(date -u +%Y%m%dT%H%M%SZ)   # capture once so both copies share a stamp
+sudo cp /srv/storage/atocore/data/db/atocore.db \
+        "/srv/storage/atocore/backups/manual-$TS.db"
+sudo cp /srv/storage/atocore/config/project-registry.json \
+        "/srv/storage/atocore/backups/manual-$TS.registry.json"
+sudo systemctl start atocore
+```
+
+This is a cold backup and requires brief downtime.
+
+## Listing backups
+
+```bash
+curl -fsS http://dalidou:8100/admin/backup
+```
+
+Returns the configured `backup_dir` and a list of all snapshots
+under it, with their full metadata if available.
+
+Or, on the host directly:
+
+```bash
+ls -la /srv/storage/atocore/backups/snapshots/
+```
+
+## Validating a backup
+
+Before relying on a backup for restore, validate it:
+
+```bash
+curl -fsS http://dalidou:8100/admin/backup/20260407T060000Z/validate
+```
+
+The validator:
+- confirms the snapshot directory exists
+- opens the SQLite snapshot and runs `PRAGMA integrity_check`
+- parses the registry JSON
+- confirms the Chroma directory exists (if it was included)
+
+A valid backup returns `"valid": true` and an empty `errors` array.
+A failing validation returns `"valid": false` with one or more
+specific error strings (e.g. `db_integrity_check_failed`,
+`registry_invalid_json`, `chroma_snapshot_missing`).
+
+**Validate every backup at creation time.** A backup that has never
+been validated is not actually a backup — it's just a hopeful copy
+of bytes.
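+
+The checks listed above can be sketched as follows. This is a rough
+approximation under stated assumptions, not the project's
+`validate_backup`: the `validate_snapshot` name, the `expect_chroma`
+argument, and the `snapshot_missing` error string are hypothetical;
+the other error strings are the ones quoted above.
+
+```python
+import json
+import sqlite3
+from pathlib import Path
+
+
+def validate_snapshot(root: Path, expect_chroma: bool) -> dict:
+    if not root.is_dir():
+        return {"valid": False, "errors": ["snapshot_missing"]}
+    errors = []
+
+    # 1. The SQLite snapshot must open and pass integrity_check.
+    db = root / "db" / "atocore.db"
+    db_ok = False
+    if db.is_file():
+        try:
+            con = sqlite3.connect(db)
+            try:
+                db_ok = con.execute("PRAGMA integrity_check").fetchone() == ("ok",)
+            finally:
+                con.close()
+        except sqlite3.DatabaseError:
+            db_ok = False
+    if not db_ok:
+        errors.append("db_integrity_check_failed")
+
+    # 2. The registry is optional, but must parse if present.
+    registry = root / "config" / "project-registry.json"
+    if registry.exists():
+        try:
+            json.loads(registry.read_text(encoding="utf-8"))
+        except ValueError:
+            errors.append("registry_invalid_json")
+
+    # 3. The Chroma tree must exist when the snapshot claims to have one.
+    if expect_chroma and not (root / "chroma").is_dir():
+        errors.append("chroma_snapshot_missing")
+
+    return {"valid": not errors, "errors": errors}
+```
+
+Collecting every error instead of stopping at the first one lets a
+single validation run report everything wrong with a snapshot.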
+
+## Restore procedure
+
+Since 2026-04-09 the restore is implemented as a proper module
+function plus CLI entry point: `restore_runtime_backup()` in
+`src/atocore/ops/backup.py`, invoked as
+`python -m atocore.ops.backup restore <STAMP> --confirm-service-stopped`.
+It automatically takes a pre-restore safety snapshot (your rollback
+anchor), handles SQLite WAL/SHM cleanly, restores the registry, and
+runs `PRAGMA integrity_check` on the restored db. This replaces the
+earlier manual `sudo cp` sequence.
+
+The function refuses to run without `--confirm-service-stopped`.
+This is deliberate: hot-restoring into a running service corrupts
+SQLite state.
+
+### Pre-flight (always)
+
+1. Identify which snapshot you want to restore. List available
+   snapshots and pick by timestamp:
+   ```bash
+   curl -fsS http://127.0.0.1:8100/admin/backup | jq '.backups[].stamp'
+   ```
+2. Validate it. Refuse to restore an invalid backup:
+   ```bash
+   STAMP=20260409T060000Z
+   curl -fsS http://127.0.0.1:8100/admin/backup/$STAMP/validate | jq .
+   ```
+3. **Stop AtoCore.** SQLite cannot be hot-restored under a running
+   process and Chroma will not pick up new files until the process
+   restarts.
+   ```bash
+   cd /srv/storage/atocore/app/deploy/dalidou
+   docker compose down
+   docker compose ps   # atocore should be Exited/gone
+   ```
+
+### Run the restore
+
+Use a one-shot container that reuses the live service's volume
+mounts so every path (`db_path`, `chroma_path`, backup dir) resolves
+to the same place the main service would see:
+
+```bash
+cd /srv/storage/atocore/app/deploy/dalidou
+docker compose run --rm --entrypoint python atocore \
+    -m atocore.ops.backup restore \
+        $STAMP \
+        --confirm-service-stopped
+```
+
+Output is a JSON document. The critical fields:
+
+- `pre_restore_snapshot`: stamp of the safety snapshot of live
+  state taken right before the restore. **Write this down.** If
+  the restore was the wrong call, this is how you roll it back.
+- `db_restored`: should be `true`
+- `registry_restored`: `true` if the backup captured a registry
+- `chroma_restored`: `true` if the backup captured a chroma tree
+  and include_chroma resolved to true (default)
+- `restored_integrity_ok`: **must be `true`** — if this is false,
+  STOP and do not start the service; investigate the integrity
+  error first. The restored file is still on disk but untrusted.
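+
+A wrapper script could gate the restart on these fields. The sketch
+below is hypothetical (`restore_problems` is not part of AtoCore) and
+relies only on the field names listed above.
+
+```python
+def restore_problems(result: dict) -> list:
+    """Return reasons NOT to start the service after a restore."""
+    problems = []
+    if result.get("db_restored") is not True:
+        problems.append("db_restored is not true")
+    if result.get("restored_integrity_ok") is not True:
+        problems.append("restored_integrity_ok is not true")
+    return problems
+```
+
+An empty list means the two must-have fields are both `true`; anything
+else means stop and investigate before `docker compose up -d`.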
+
+### Controlling the restore
+
+The CLI supports a few flags for finer control:
+
+- `--no-pre-snapshot` skips the pre-restore safety snapshot. Use
+  this only when you know you have another rollback path.
+- `--no-chroma` restores only SQLite + registry, leaving the
+  current Chroma dir alone. Useful if Chroma is consistent but
+  SQLite needs a rollback.
+- `--chroma` forces Chroma restoration even if the metadata
+  doesn't clearly indicate the snapshot has it (rare).
+
+### Chroma restore and bind-mounted volumes
+
+The Chroma dir on Dalidou is a bind-mounted Docker volume. The
+restore cannot `rmtree` the destination (you can't unlink a mount
+point — it raises `OSError [Errno 16] Device or resource busy`),
+so the function clears the dir's CONTENTS and uses
+`copytree(dirs_exist_ok=True)` to copy the snapshot back in. The
+regression test `test_restore_chroma_does_not_unlink_destination_directory`
+in `tests/test_backup.py` captures the destination inode before
+and after restore and asserts it's stable — the same invariant
+that protects the bind mount.
+
+This was discovered during the first real Dalidou restore drill
+on 2026-04-09. If you see a new restore failure with
+`Device or resource busy`, something has regressed this fix.
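+
+The clear-contents-then-copy approach described above can be sketched
+like this. It is a simplified illustration, not the code in
+`restore_runtime_backup()`; the `restore_dir_in_place` name is
+hypothetical.
+
+```python
+import shutil
+from pathlib import Path
+
+
+def restore_dir_in_place(snapshot: Path, dest: Path) -> None:
+    """Replace the contents of dest without unlinking dest itself."""
+    dest.mkdir(parents=True, exist_ok=True)
+    # Remove children one by one; the directory entry for dest (which
+    # may be a mount point) is never touched.
+    for child in dest.iterdir():
+        if child.is_dir() and not child.is_symlink():
+            shutil.rmtree(child)
+        else:
+            child.unlink()
+    # dirs_exist_ok=True lets copytree write into the existing directory.
+    shutil.copytree(snapshot, dest, dirs_exist_ok=True)
+```
+
+Comparing `os.stat(dest).st_ino` before and after the call is the
+same inode-stability check the regression test relies on.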
+
+### Restart AtoCore
+
+```bash
+cd /srv/storage/atocore/app/deploy/dalidou
+docker compose up -d
+# Wait for /health to come up
+for i in 1 2 3 4 5 6 7 8 9 10; do
+    curl -fsS http://127.0.0.1:8100/health \
+        && break || { echo "not ready ($i/10)"; sleep 3; }
+done
+```
+
+**Note on build_sha after restore:** The one-shot `docker compose run`
+container does not carry the build provenance env vars that `deploy.sh`
+exports at deploy time. After a restore, `/health` will report
+`build_sha: "unknown"` until you re-run `deploy.sh` or manually
+re-deploy. This is cosmetic — the data is correctly restored — but if
+you need `build_sha` to be accurate, run a redeploy after the restore:
+
+```bash
+cd /srv/storage/atocore/app
+bash deploy/dalidou/deploy.sh
+```
+
+### Post-restore verification
+
+```bash
+# 1. Service is healthy
+curl -fsS http://127.0.0.1:8100/health | jq .
+
+# 2. Stats look right
+curl -fsS http://127.0.0.1:8100/stats | jq .
+
+# 3. Project registry loads
+curl -fsS http://127.0.0.1:8100/projects | jq '.projects | length'
+
+# 4. A known-good context query returns non-empty results
+curl -fsS -X POST http://127.0.0.1:8100/context/build \
+  -H "Content-Type: application/json" \
+  -d '{"prompt": "what is p05 about", "project": "p05-interferometer"}' | jq '.chunks_used'
+```
+
+If any of these are wrong, the restore is bad. Roll back using the
+pre-restore safety snapshot whose stamp you recorded from the
+restore output. The rollback is the same procedure — stop the
+service and restore that stamp:
+
+```bash
+docker compose down
+docker compose run --rm --entrypoint python atocore \
+    -m atocore.ops.backup restore \
+        $PRE_RESTORE_SNAPSHOT_STAMP \
+        --confirm-service-stopped \
+        --no-pre-snapshot
+docker compose up -d
+```
+
+(`--no-pre-snapshot` because the rollback itself doesn't need one;
+you already have the original snapshot as a fallback if everything
+goes sideways.)
+
+### Restore drill
+
+The restore is exercised at three levels:
+
+1. **Unit tests.** `tests/test_backup.py` has seven restore tests
+   (refuse-without-confirm, invalid backup, full round-trip,
+   Chroma round-trip, inode-stability regression, WAL sidecar
+   cleanup, skip-pre-snapshot). These run in CI on every commit.
+2. **Module-level round-trip.**
+   `test_restore_round_trip_reverses_post_backup_mutations` is
+   the canonical drill in code form: seed baseline, snapshot,
+   mutate, restore, assert mutation reversed + baseline survived
+   + pre-restore snapshot captured the mutation.
+3. **Live drill on Dalidou.** Periodically run the full procedure
+   against the real service with a disposable drill-marker
+   memory (created via `POST /memory` with `memory_type=episodic`
+   and `project=drill`), following the sequence above and then
+   verifying the marker is gone afterward via
+   `GET /memory?project=drill`. The first such drill on
+   2026-04-09 surfaced the bind-mount bug; future runs
+   primarily exist to verify the fix stays fixed.
+
+Run the live drill:
+
+- **Before** enabling any new write-path automation (auto-capture,
+  automated ingestion, reinforcement sweeps).
+- **After** any change to `src/atocore/ops/backup.py` or to
+  schema migrations in `src/atocore/models/database.py`.
+- **After** a Dalidou OS upgrade or docker version bump.
+- **At least once per quarter** as a standing operational check.
+- **After any incident** that touched the storage layer.
+
+Record each drill run (stamp, pre-restore snapshot stamp, pass/fail,
+any surprises) somewhere durable — a line in the project journal
+or a git commit message is enough. A drill you ran once and never
+again is barely more than a drill you never ran.
+
+## Retention policy
+
+- **Last 7 daily backups**: kept verbatim
+- **Last 4 weekly backups** (Sunday): kept verbatim
+- **Last 6 monthly backups** (1st of month): kept verbatim
+- **Anything older**: deleted
+
+The retention job is **not yet implemented** and is tracked as a
+follow-up. Until then, the snapshots directory grows monotonically.
+A simple cron-based cleanup script is the next step:
+
+```cron
+0 4 * * * /srv/storage/atocore/scripts/cleanup-old-backups.sh
+```
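+
+One possible reading of the retention rules at the top of this
+section is sketched below. It is an assumption-laden illustration,
+not the planned cleanup script: the `stamps_to_keep` name is
+hypothetical, and "last N" is interpreted as the N most recent
+calendar days (or Sundays, or 1st-of-month days) that have a snapshot,
+keeping the newest snapshot on each.
+
+```python
+from datetime import datetime
+
+STAMP_FORMAT = "%Y%m%dT%H%M%SZ"
+
+
+def stamps_to_keep(stamps, daily=7, weekly=4, monthly=6):
+    """Return the set of snapshot stamps the policy would retain."""
+    # Newest snapshot per calendar day (string order is time order).
+    by_day = {}
+    for stamp in sorted(stamps):
+        by_day[datetime.strptime(stamp, STAMP_FORMAT).date()] = stamp
+    days = sorted(by_day, reverse=True)  # newest day first
+
+    keep = {by_day[d] for d in days[:daily]}
+    sundays = [d for d in days if d.weekday() == 6]  # Monday is 0
+    keep |= {by_day[d] for d in sundays[:weekly]}
+    firsts = [d for d in days if d.day == 1]
+    keep |= {by_day[d] for d in firsts[:monthly]}
+    return keep
+```
+
+Everything under the snapshots directory that is not in the returned
+set would be a deletion candidate. Printing that list before removing
+anything is a cheap safeguard.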
+
+## Common failure modes and what to do about them
+
+| Symptom | Likely cause | Action |
+|---|---|---|
+| `db_integrity_check_failed` on validation | SQLite snapshot copied while a write was in progress, or disk corruption | Take a fresh backup and validate again. If it fails twice, suspect the underlying disk. |
+| `registry_invalid_json` | Registry was being edited at backup time | Take a fresh backup. The registry is small so this is cheap. |
+| Restore: `restored_integrity_ok: false` | Source snapshot was itself corrupt (validation should have caught it — file a bug) or copy was interrupted mid-write | Do NOT start the service. Validate the snapshot directly with `python -m atocore.ops.backup validate <STAMP>`, try a different older snapshot, or roll back to the pre-restore safety snapshot. |
+| Restore: `OSError [Errno 16] Device or resource busy` on Chroma | Old code tried to `rmtree` the Chroma mount point. Fixed on 2026-04-09 and guarded by the regression test `test_restore_chroma_does_not_unlink_destination_directory` | Ensure you're running a build from 2026-04-09 or later; if you need to work around an older build, use `--no-chroma` and restore Chroma contents manually. |
+| `chroma_snapshot_missing` after a restore | Snapshot was DB-only | Either rebuild via fresh ingestion or restore an older snapshot that includes Chroma. |
+| Service won't start after restore | Permissions wrong on the restored files | Re-run `chown 1000:1000` (or whatever the gitea/atocore container user is) on the data dir. |
+| `/stats` returns 0 documents after restore | The SQL store was restored but the source paths in `source_documents` don't match the current Dalidou paths | This means the backup came from a different deployment. Don't trust this restore — it's pulling from the wrong layout. |
+| Drill marker still present after restore | Wrong stamp, service still writing during `docker compose down`, or the restore JSON didn't report `db_restored: true` | Roll back via the pre-restore safety snapshot and retry with the correct source snapshot. |
+
+## Open follow-ups (not yet implemented)
+
+Tracked separately in `docs/next-steps.md` — the list below is the
+backup-specific subset.
+
+1. **Retention cleanup script**: see the cron entry above. The
+   snapshots directory grows monotonically until this exists.
+2. **Off-Dalidou backup target**: currently snapshots live on the
+   same disk as the live data. A real disaster-recovery story
+   needs at least one snapshot on a different physical machine.
+   The simplest first step is a periodic `rsync` to the user's
+   laptop or to another server.
+3. **Backup encryption**: snapshots contain raw SQLite and JSON.
+   Consider age/gpg encryption if backups will be shipped off-site.
+4. **Automatic post-backup validation**: today the validator must
+   be invoked manually. The `create_runtime_backup` function
+   should call `validate_backup` on its own output and refuse to
+   declare success if validation fails.
+5. **Chroma backup is currently full directory copy** every time.
+   For large vector stores this gets expensive. A future
+   improvement would be incremental snapshots via filesystem-level
+   snapshotting (LVM, btrfs, ZFS).
+
+**Done** (kept for historical reference):
+
+- ~~Implement `restore_runtime_backup()` as a proper module
+  function so the restore isn't a manual `sudo cp` dance~~ —
+  landed 2026-04-09 in commit 3362080, followed by the
+  Chroma bind-mount fix from the first real drill.
+
+## Quickstart cheat sheet
+
+```bash
+# Daily backup (DB + registry only — fast)
+curl -fsS -X POST http://127.0.0.1:8100/admin/backup \
+  -H "Content-Type: application/json" -d '{}'
+
+# Weekly backup (DB + registry + Chroma — slower, holds ingestion lock)
+curl -fsS -X POST http://127.0.0.1:8100/admin/backup \
+  -H "Content-Type: application/json" -d '{"include_chroma": true}'
+
+# List backups
+curl -fsS http://127.0.0.1:8100/admin/backup | jq '.backups[].stamp'
+
+# Validate the most recent backup
+LATEST=$(curl -fsS http://127.0.0.1:8100/admin/backup | jq -r '.backups[-1].stamp')
+curl -fsS http://127.0.0.1:8100/admin/backup/$LATEST/validate | jq .
+
+# Full restore (service must be stopped first)
+cd /srv/storage/atocore/app/deploy/dalidou
+docker compose down
+docker compose run --rm --entrypoint python atocore \
+    -m atocore.ops.backup restore $STAMP --confirm-service-stopped
+docker compose up -d
+
+# Live drill: exercise the full create -> mutate -> restore flow
+# against the running service. The marker memory uses
+# memory_type=episodic (valid types: identity, preference, project,
+# episodic, knowledge, adaptation) and project=drill so it's easy
+# to find via GET /memory?project=drill before and after.
+#
+# See the "Restore drill" section above for the full sequence.
+STAMP=$(curl -fsS -X POST http://127.0.0.1:8100/admin/backup \
+    -H 'Content-Type: application/json' \
+    -d '{"include_chroma": true}' | jq -r '.backup_root' | awk -F/ '{print $NF}')
+
+curl -fsS -X POST http://127.0.0.1:8100/memory \
+    -H 'Content-Type: application/json' \
+    -d '{"memory_type":"episodic","content":"DRILL-MARKER","project":"drill","confidence":1.0}'
+
+cd /srv/storage/atocore/app/deploy/dalidou
+docker compose down
+docker compose run --rm --entrypoint python atocore \
+    -m atocore.ops.backup restore $STAMP --confirm-service-stopped
+docker compose up -d
+
+# Marker should be gone:
+curl -fsS 'http://127.0.0.1:8100/memory?project=drill' | jq .
+```
--- a/docs/current-state.md
+++ b/docs/current-state.md
@@ -200,10 +200,30 @@ The runtime has now been hardened in a few practical ways:
 - SQLite connections use a configurable busy timeout
 - SQLite uses WAL mode to reduce transient lock pain under normal concurrent use
 - project registry writes are atomic file replacements rather than in-place rewrites
- a first runtime backup path now exists for:
-  - SQLite
-  - project registry
+- a full runtime backup and restore path now exists and has been exercised on
+  live Dalidou:
+  - SQLite (hot online backup via `conn.backup()`)
+  - project registry (file copy)
+  - Chroma vector store (cold directory copy under `exclusive_ingestion()`)
  - backup metadata
+  - `restore_runtime_backup()` with CLI entry point
+    (`python -m atocore.ops.backup restore <STAMP>
+    --confirm-service-stopped`), pre-restore safety snapshot for
+    rollback, WAL/SHM sidecar cleanup, `PRAGMA integrity_check`
+    on the restored file
+  - the first live drill on 2026-04-09 surfaced and fixed a Chroma
+    restore bug on Docker bind-mounted volumes (`shutil.rmtree`
+    on a mount point); a regression test now asserts the
+    destination inode is stable across restore
+- deploy provenance is visible end-to-end:
+  - `/health` reports `build_sha`, `build_time`, `build_branch`
+    from env vars wired by `deploy.sh`
+  - `deploy.sh` Step 6 verifies the live `build_sha` matches the
+    just-built commit (exit code 6 on drift) so "live is current?"
+    can be answered precisely, not just by `__version__`
+  - `deploy.sh` Step 1.5 detects that the script itself changed
+    in the pulled commit and re-execs into the fresh copy, so
+    the deploy never silently runs the old script against new source

 This does not eliminate every concurrency edge, but it materially improves the
 current operational baseline.
@@ -224,15 +244,23 @@ This separation is healthy:

 ## Immediate Next Focus

-1. Use the new T420-side organic routing layer in real OpenClaw workflows
-2. Tighten retrieval quality for the now fully ingested active project corpora
-3. Move to Wave 2 trusted-operational ingestion instead of blindly widening raw corpus further
-4. Keep the new engineering-knowledge architecture docs as implementation guidance while avoiding premature schema work
-5. Expand the boring operations baseline:
-   - restore validation
-   - Chroma rebuild / backup policy
-   - retention
-6. Only later consider write-back, reflection, or deeper autonomous behaviors
+1. ~~Re-run the full backup/restore drill~~ — DONE 2026-04-11,
+   full pass (db, registry, chroma, integrity all true)
+2. ~~Turn on auto-capture of Claude Code sessions in conservative
+   mode~~ — DONE 2026-04-11, Stop hook wired via
+   `deploy/hooks/capture_stop.py` → `POST /interactions`
+   with `reinforce=false`; kill switch via
+   `ATOCORE_CAPTURE_DISABLED=1`
+3. Run a short real-use pilot with auto-capture on, verify
+   interactions are landing in Dalidou, review quality
+4. Use the new T420-side organic routing layer in real OpenClaw workflows
+5. Tighten retrieval quality for the now fully ingested active project corpora
+6. Move to Wave 2 trusted-operational ingestion instead of blindly widening raw corpus further
+7. Keep the new engineering-knowledge architecture docs as implementation guidance while avoiding premature schema work
+8. Expand the remaining boring operations baseline:
+   - retention policy cleanup script
+   - off-Dalidou backup target (rsync or similar)
+9. Only later consider write-back, reflection, or deeper autonomous behaviors

 See also:

--- a/docs/dalidou-deployment.md
+++ b/docs/dalidou-deployment.md
@@ -50,26 +50,205 @@ starting from:
 deploy/dalidou/.env.example
 ```

-## Deployment steps
+## First-time deployment steps
+
+1. Place the repository under `/srv/storage/atocore/app` — ideally as a
+   proper git clone so future updates can be pulled, not as a static
+   snapshot:
+
+   ```bash
+   sudo git clone http://dalidou:3000/Antoine/ATOCore.git \
+       /srv/storage/atocore/app
+   ```

-1. Place the repository under `/srv/storage/atocore/app`.
 2. Create the canonical directories listed above.
 3. Copy `deploy/dalidou/.env.example` to `deploy/dalidou/.env`.
 4. Adjust the source paths if your AtoVault/AtoDrive mirrors live elsewhere.
 5. Run:

-```bash
-cd /srv/storage/atocore/app/deploy/dalidou
-docker compose up -d --build
-```
+   ```bash
+   cd /srv/storage/atocore/app/deploy/dalidou
+   docker compose up -d --build
+   ```

 6. Validate:

+   ```bash
+   curl http://127.0.0.1:8100/health
+   curl http://127.0.0.1:8100/sources
+   ```
+
+## Updating a running deployment
+
+**Use `deploy/dalidou/deploy.sh` for every code update.** It is the
+one-shot sync script that:
+
+- fetches latest main from Gitea into `/srv/storage/atocore/app`
+- (if the app dir is not a git checkout) backs it up as
+  `<dir>.pre-git-<timestamp>` and re-clones
+- rebuilds the container image
+- restarts the container
+- waits for `/health` to respond
+- compares the reported `code_version` against the
+  `__version__` in the freshly-pulled source, and exits non-zero
+  if they don't match (deployment drift detection)
+
 ```bash
-curl http://127.0.0.1:8100/health
-curl http://127.0.0.1:8100/sources
+# Normal update from main
+bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh
+
+# Deploy a specific branch or tag
+ATOCORE_BRANCH=codex/some-feature \
+    bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh
+
+# Dry-run: show what would happen without touching anything
+ATOCORE_DEPLOY_DRY_RUN=1 \
+    bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh
+
+# Deploy from a remote host (e.g. the laptop) using the Tailscale
+# or LAN address instead of loopback
+ATOCORE_GIT_REMOTE=http://192.168.86.50:3000/Antoine/ATOCore.git \
+    bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh
 ```

+The script is idempotent and safe to re-run. It never touches the
+database directly — schema migrations are applied automatically at
+service startup by the lifespan handler in `src/atocore/main.py`
+which calls `init_db()` (which in turn runs the ALTER TABLE
+statements in `_apply_migrations`).
+
+### Troubleshooting hostname resolution
+
+`deploy.sh` defaults `ATOCORE_GIT_REMOTE` to
+`http://127.0.0.1:3000/Antoine/ATOCore.git` (loopback) because the
+hostname "dalidou" doesn't reliably resolve on the host itself —
+the first real Dalidou deploy hit exactly this on 2026-04-08. If
+you need to override (e.g. running deploy.sh from a laptop against
+the Dalidou LAN), set `ATOCORE_GIT_REMOTE` explicitly.
+
+The same applies to `scripts/atocore_client.py`: its default
+`ATOCORE_BASE_URL` is `http://dalidou:8100` for remote callers, but
+when running the client on Dalidou itself (or inside the container
+via `docker exec`), override to loopback:
+
+```bash
+ATOCORE_BASE_URL=http://127.0.0.1:8100 \
+    python scripts/atocore_client.py health
+```
+
+If you see `{"status": "unavailable", "fail_open": true}` from the
+client, the first thing to check is whether the base URL resolves
+from where you're running the client.
+
+### The deploy.sh self-update race
+
+When `deploy.sh` itself changes in the commit being pulled, the
+first run after the update is still executing the *old* script from
+the bash process's in-memory copy. `git reset --hard` updates the
+file on disk, but the running bash has already loaded the
+instructions. On 2026-04-09 this silently shipped an "unknown"
+`build_sha` because the old Step 2 (which predated env-var export)
+ran against fresh source.
+
+`deploy.sh` now detects this: Step 1.5 compares the sha1 of `$0`
+(the running script) against the sha1 of
+`$APP_DIR/deploy/dalidou/deploy.sh` (the on-disk copy) after the
+git reset. If they differ, it sets `ATOCORE_DEPLOY_REEXECED=1` and
+`exec`s the fresh copy so the rest of the deploy runs under the new
+script. The sentinel env var prevents infinite recursion.
+
+You'll see this in the logs as:
+
+```text
+==> Step 1.5: deploy.sh changed in the pulled commit; re-exec'ing
+==>   running script hash: <old>
+==>   on-disk script hash: <new>
+==>   re-exec -> /srv/storage/atocore/app/deploy/dalidou/deploy.sh
+```
+
+To opt out (debugging, for example), pre-set
+`ATOCORE_DEPLOY_REEXECED=1` before invoking `deploy.sh` and the
+self-update guard will be skipped.
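+
+The decision logic of the guard can be illustrated in Python. The
+real check is shell code inside `deploy.sh`; the `sha1_of` and
+`should_reexec` helpers below are hypothetical and only mirror the
+behaviour described in this section.
+
+```python
+import hashlib
+from pathlib import Path
+
+
+def sha1_of(path: Path) -> str:
+    """Hex sha1 of a file's bytes."""
+    return hashlib.sha1(path.read_bytes()).hexdigest()
+
+
+def should_reexec(running_hash: str, on_disk_hash: str, env: dict) -> bool:
+    """Decide whether the deploy should re-exec into the on-disk script."""
+    if env.get("ATOCORE_DEPLOY_REEXECED") == "1":
+        # Sentinel set: we already re-exec'd once, or the operator
+        # opted out, so never loop.
+        return False
+    return running_hash != on_disk_hash
+```
+
+The sentinel check comes first on purpose: without it, a script whose
+hash keeps differing would re-exec forever.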
+
+### Deployment drift detection
+
+`/health` reports drift signals at three increasing levels of
+precision:
+
+| Field | Source | Precision | When to use |
+|---|---|---|---|
+| `version` / `code_version` | `atocore.__version__` (manual bump) | coarse — same value across many commits | quick smoke check that the right *release* is running |
+| `build_sha` | `ATOCORE_BUILD_SHA` env var, set by `deploy.sh` per build | precise — changes per commit | the canonical drift signal |
+| `build_time` / `build_branch` | same env var path | per-build | forensics when multiple branches in flight |
+
+The **precise** check (run on the laptop or any host that can curl
+the live service AND has the source repo at hand):
+
+```bash
+# What's actually running on Dalidou
+LIVE_SHA=$(curl -fsS http://dalidou:8100/health | grep -o '"build_sha":"[^"]*"' | cut -d'"' -f4)
+
+# What the deployed branch tip should be
+EXPECTED_SHA=$(cd /srv/storage/atocore/app && git rev-parse HEAD)
+
+# Compare
+if [ "$LIVE_SHA" = "$EXPECTED_SHA" ]; then
+    echo "live is current at $LIVE_SHA"
+else
+    echo "DRIFT: live $LIVE_SHA vs expected $EXPECTED_SHA"
+    echo "run deploy.sh to sync"
+fi
+```
+
+The `deploy.sh` script does exactly this comparison automatically
+in its post-deploy verification step (Step 6) and exits non-zero
+on mismatch. So the **simplest drift check** is just to run
+`deploy.sh` — if there's nothing to deploy, it succeeds quickly;
+if the live service is stale, it deploys and verifies.
+
+If `/health` reports `build_sha: "unknown"`, the running container
+was started without `deploy.sh` (probably via `docker compose up`
+directly), and the build provenance was never recorded. Re-run
+via `deploy.sh` to fix.
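
The three outcomes described in this section (current, drifted, unknown provenance) can also be told apart by parsing the `/health` body as JSON rather than with `grep`, which depends on the exact spacing of the serialized output. A minimal sketch, assuming the body has already been fetched as a string; `classify_drift` is a hypothetical helper, not part of the repo.

```python
import json


def classify_drift(health_body: str, expected_sha: str) -> str:
    """Classify a /health JSON body against the expected commit SHA.

    Returns "current", "drift", or "unknown" (no build provenance).
    """
    health = json.loads(health_body)
    live_sha = health.get("build_sha", "unknown")
    if live_sha == "unknown":
        # Container was probably started without deploy.sh.
        return "unknown"
    if live_sha == expected_sha:
        return "current"
    return "drift"
```

The `expected_sha` argument would come from `git rev-parse HEAD` in the source checkout, as in the shell snippet above.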
+
+The coarse `code_version` check is still useful as a quick visual
+sanity check — bumping `__version__` from `0.2.0` to `0.3.0`
+signals a meaningful release boundary even if the precise
+`build_sha` is what tools should compare against:
+
+```bash
+# Quick sanity check (coarse)
+curl -s http://127.0.0.1:8100/health | grep -o '"code_version":"[^"]*"'
+grep '__version__' /srv/storage/atocore/app/src/atocore/__init__.py
+```
+
+### Schema migrations on redeploy
+
+When updating from an older `__version__`, the first startup after
+the redeploy runs the idempotent ALTER TABLE migrations in
+`_apply_migrations`. For a pre-0.2.0 → 0.2.0 upgrade the migrations
+add these columns to existing tables (all with safe defaults so no
+data is touched):
+
+- `memories.project TEXT DEFAULT ''`
+- `memories.last_referenced_at DATETIME`
+- `memories.reference_count INTEGER DEFAULT 0`
+- `interactions.response TEXT DEFAULT ''`
+- `interactions.memories_used TEXT DEFAULT '[]'`
+- `interactions.chunks_used TEXT DEFAULT '[]'`
+- `interactions.client TEXT DEFAULT ''`
+- `interactions.session_id TEXT DEFAULT ''`
+- `interactions.project TEXT DEFAULT ''`
+
+Plus new indexes on the new columns. No row data is modified. The
+migration is safe to run against a database that already has the
+columns — the `_column_exists` check makes each ALTER a no-op in
+that case.
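
The idempotent pattern described above can be sketched with the standard `sqlite3` module. This is an illustration under stated assumptions, not the body of `_apply_migrations`: the helper names and signatures below are hypothetical, and the real `_column_exists` may look different.

```python
import sqlite3


def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """Return True if `table` already has a column named `column`."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return any(row[1] == column for row in rows)


def add_column_if_missing(
    conn: sqlite3.Connection, table: str, column: str, ddl: str
) -> bool:
    """Add the column unless it exists; return True if it was added."""
    if column_exists(conn, table, column):
        return False  # no-op on an already-migrated database
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl}")
    return True
```

Because the added column carries a default, existing rows pick up that value without any row being rewritten by the migration code itself.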
+
+Back up the database before any redeploy (via `POST /admin/backup`)
+if you want a pre-upgrade snapshot. The migration is additive and
+reversible by restoring the snapshot.
+
 ## Deferred

 - backup automation
--- a/docs/master-plan-status.md
+++ b/docs/master-plan-status.md
@@ -44,8 +44,9 @@ read-only additive mode.

 ### Engineering Layer Planning Sprint

-The engineering layer is intentionally in planning, not implementation.
-The architecture docs below are the current state of that planning:
+**Status: complete.** All 8 architecture docs are drafted. The
+engineering layer is now ready for V1 implementation against the
+active project set.

 - [engineering-query-catalog.md](architecture/engineering-query-catalog.md) —
  the 20 v1-required queries the engineering layer must answer
@@ -55,17 +56,44 @@ The architecture docs below are the current state of that planning:
  Layer 0 → Layer 2 pipeline, triggers, review queue mechanics
 - [conflict-model.md](architecture/conflict-model.md) —
  detection, representation, and resolution of contradictory facts
+- [tool-handoff-boundaries.md](architecture/tool-handoff-boundaries.md) —
+  KB-CAD / KB-FEM one-way mirror stance, ingest endpoints, drift handling
+- [representation-authority.md](architecture/representation-authority.md) —
+  canonical home matrix across PKM / KB / repos / AtoCore for 22 fact kinds
+- [human-mirror-rules.md](architecture/human-mirror-rules.md) —
+  templates, regeneration triggers, edit flow, "do not edit" enforcement
+- [engineering-v1-acceptance.md](architecture/engineering-v1-acceptance.md) —
+  measurable done definition with 23 acceptance criteria
 - [engineering-knowledge-hybrid-architecture.md](architecture/engineering-knowledge-hybrid-architecture.md) —
  the 5-layer model (from the previous planning wave)
 - [engineering-ontology-v1.md](architecture/engineering-ontology-v1.md) —
  the initial V1 object and relationship inventory (previous wave)
+- [project-identity-canonicalization.md](architecture/project-identity-canonicalization.md) —
+  the helper-at-every-service-boundary contract that keeps the
+  trust hierarchy dependable across alias and canonical-id callers;
+  required reading before adding new project-keyed entity surfaces
+  in the V1 implementation sprint

-Still to draft before engineering-layer implementation begins:
+The concrete next step is the V1 implementation sprint, which
+should follow engineering-v1-acceptance.md as its checklist, and
+must apply the project-identity-canonicalization contract at every
+new service-layer entry point.

- tool-handoff-boundaries.md (KB-CAD / KB-FEM read vs write)
- human-mirror-rules.md (templates, triggers, edit flow)
- representation-authority.md (PKM / KB / repo / AtoCore canonical home matrix)
- engineering-v1-acceptance.md (done definition)
+### LLM Client Integration
+
+A separate but related architectural concern: how AtoCore is reachable
+from many different LLM client contexts (OpenClaw, Claude Code, future
+Codex skills, future MCP server). The layering rule is documented in:
+
+- [llm-client-integration.md](architecture/llm-client-integration.md) —
+  three-layer shape: HTTP API → shared operator client
+  (`scripts/atocore_client.py`) → per-agent thin frontends; the
+  shared client is the canonical backbone every new client should
+  shell out to instead of reimplementing HTTP calls
+
+This sits implicitly between Phase 8 (OpenClaw) and Phase 11
+(multi-model). Memory-review and engineering-entity commands are
+deferred from the shared client until their workflows are exercised.

 ## What Is Real Today

--- a/docs/next-steps.md
+++ b/docs/next-steps.md
@@ -20,45 +20,65 @@ This working list should be read alongside:

 ## Immediate Next Steps

-1. Use the T420 `atocore-context` skill and the new organic routing layer in
+1. ~~Re-run the backup/restore drill~~ — DONE 2026-04-11, full pass
+2. ~~Turn on auto-capture of Claude Code sessions~~ — DONE 2026-04-11,
+   Stop hook via `deploy/hooks/capture_stop.py` → `POST /interactions`
+   with `reinforce=false`; kill switch: `ATOCORE_CAPTURE_DISABLED=1`
+2a. Run a short real-use pilot with auto-capture on
+   - verify interactions are landing in Dalidou
+   - check prompt/response quality and truncation
+   - confirm fail-open: no user-visible impact when Dalidou is down
+3. Use the T420 `atocore-context` skill and the new organic routing layer in
   real OpenClaw workflows
   - confirm `auto-context` feels natural
   - confirm project inference is good enough in practice
   - confirm the fail-open behavior remains acceptable in practice
-2. Review retrieval quality after the first real project ingestion batch
+4. Review retrieval quality after the first real project ingestion batch
   - check whether the top hits are useful
   - check whether trusted project state remains dominant
   - reduce cross-project competition and prompt ambiguity where needed
   - use `debug-context` to inspect the exact last AtoCore supplement
-3. Treat the active-project full markdown/text wave as complete
+5. Treat the active-project full markdown/text wave as complete
   - `p04-gigabit`
   - `p05-interferometer`
   - `p06-polisher`
-4. Define a cleaner source refresh model
+6. Define a cleaner source refresh model
   - make the difference between source truth, staged inputs, and machine store
     explicit
   - move toward a project source registry and refresh workflow
   - foundation now exists via project registry + per-project refresh API
   - registration policy + template + proposal + approved registration are now
     the normal path for new projects
-5. Move to Wave 2 trusted-operational ingestion
+7. Move to Wave 2 trusted-operational ingestion
   - curated dashboards
   - decision logs
   - milestone/current-status views
   - operational truth, not just raw project notes
-6. Integrate the new engineering architecture docs into active planning, not immediate schema code
+8. Integrate the new engineering architecture docs into active planning, not immediate schema code
   - keep `docs/architecture/engineering-knowledge-hybrid-architecture.md` as the target layer model
   - keep `docs/architecture/engineering-ontology-v1.md` as the V1 structured-domain target
   - do not start entity/relationship persistence until the ingestion, retrieval, registry, and backup baseline feels boring and stable
-7. Define backup and export procedures for Dalidou
-   - exercise the new SQLite + registry snapshot path on Dalidou
-   - Chroma backup or rebuild policy
-   - retention and restore validation
-   - admin backup endpoint now supports `include_chroma` cold snapshot
-     under the ingestion lock and `validate` confirms each snapshot is
-     openable; remaining work is the operational retention policy
-8. Keep deeper automatic runtime integration modest until the organic read-only
-   model has proven value
+9. Finish the boring operations baseline around backup
+   - retention policy cleanup script (snapshots dir grows
+     monotonically today)
+   - off-Dalidou backup target (at minimum an rsync to laptop or
+     another host so a single-disk failure isn't terminal)
+   - automatic post-backup validation (have `create_runtime_backup`
+     call `validate_backup` on its own output and refuse to
+     declare success if validation fails)
+   - DONE in commits be40994 / 0382238 / 3362080 / this one:
+     - `create_runtime_backup` + `list_runtime_backups` +
+       `validate_backup` + `restore_runtime_backup` with CLI
+     - `POST /admin/backup` with `include_chroma=true` under
+       the ingestion lock
+     - `/health` build_sha / build_time / build_branch provenance
+     - `deploy.sh` self-update re-exec guard + build_sha drift
+       verification
+     - live drill procedure in `docs/backup-restore-procedure.md`
+       with failure-mode table and the memory_type=episodic
+       marker pattern from the 2026-04-09 drill
+10. Keep deeper automatic runtime integration modest until the organic read-only
+    model has proven value

 ## Trusted State Status

--- a/pyproject.toml
+++ b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

 [project]
 name = "atocore"
-version = "0.1.0"
+version = "0.2.0"
 description = "Personal context engine for LLM interactions"
 requires-python = ">=3.11"
 dependencies = [
--- a/scripts/atocore_client.py
+++ b/scripts/atocore_client.py
@@ -1,8 +1,43 @@
 """Operator-facing API client for live AtoCore instances.

-This script is intentionally external to the app runtime. It is for admins and
-operators who want a convenient way to inspect live project state, refresh
-projects, audit retrieval quality, and manage trusted project-state entries.
+This script is intentionally external to the app runtime. It is for admins
+and operators who want a convenient way to inspect live project state,
+refresh projects, audit retrieval quality, manage trusted project-state
+entries, and drive the Phase 9 reflection loop (capture, extract, queue,
+promote, reject).
+
+Environment variables
+---------------------
+
+ATOCORE_BASE_URL
+    Base URL of the AtoCore service (default: ``http://dalidou:8100``).
+
+    When running ON the Dalidou host itself or INSIDE the Dalidou
+    container, override this with loopback or the real IP::
+
+        ATOCORE_BASE_URL=http://127.0.0.1:8100 \\
+            python scripts/atocore_client.py health
+
+    The default hostname "dalidou" is meant for cases where the
+    caller is a remote machine (laptop, T420/OpenClaw, etc.) with
+    "dalidou" in its /etc/hosts or resolvable via Tailscale. It does
+    NOT reliably resolve on the host itself or inside the container,
+    and when it fails the client returns
+    ``{"status": "unavailable", "fail_open": true}`` — the right
+    diagnosis when that happens is to set ATOCORE_BASE_URL explicitly
+    to 127.0.0.1:8100 and retry.
+
+ATOCORE_TIMEOUT_SECONDS
+    Request timeout for most operations (default: 30).
+
+ATOCORE_REFRESH_TIMEOUT_SECONDS
+    Longer timeout for project refresh operations which can be slow
+    (default: 1800).
+
+ATOCORE_FAIL_OPEN
+    When "true" (default), network errors return a small fail-open
+    envelope instead of raising. Set to "false" for admin operations
+    where you need the real error.
 """

 from __future__ import annotations
@@ -23,6 +58,15 @@ TIMEOUT = int(os.environ.get("ATOCORE_TIMEOUT_SECONDS", "30"))
 REFRESH_TIMEOUT = int(os.environ.get("ATOCORE_REFRESH_TIMEOUT_SECONDS", "1800"))
 FAIL_OPEN = os.environ.get("ATOCORE_FAIL_OPEN", "true").lower() == "true"

+# Bumped when the subcommand surface or JSON output shapes meaningfully
+# change. See docs/architecture/llm-client-integration.md for the
+# semver rules. History:
+#   0.1.0  initial stable-ops-only client
+#   0.2.0  Phase 9 reflection loop added: capture, extract,
+#          reinforce-interaction, list-interactions, get-interaction,
+#          queue, promote, reject
+CLIENT_VERSION = "0.2.0"
+

 def print_json(payload: Any) -> None:
    print(json.dumps(payload, ensure_ascii=True, indent=2))
@@ -243,6 +287,59 @@ def build_parser() -> argparse.ArgumentParser:
    p.add_argument("top_k", nargs="?", type=int, default=5)
    p.add_argument("project", nargs="?", default="")

+    # --- Phase 9 reflection loop surface --------------------------------
+    #
+    # capture: record one interaction (prompt + response + context used).
+    #   Mirrors POST /interactions. response is positional so shell
+    #   callers can pass it via $(cat file.txt) or heredoc. project,
+    #   client, and session_id are optional positionals with empty
+    #   defaults, matching the existing script's style.
+    p = sub.add_parser("capture")
+    p.add_argument("prompt")
+    p.add_argument("response", nargs="?", default="")
+    p.add_argument("project", nargs="?", default="")
+    p.add_argument("client", nargs="?", default="")
+    p.add_argument("session_id", nargs="?", default="")
+    p.add_argument("reinforce", nargs="?", default="true")
+
+    # extract: run the Phase 9 C rule-based extractor against an
+    #   already-captured interaction. persist='true' writes the
+    #   candidates as status='candidate' memories; default is
+    #   preview-only.
+    p = sub.add_parser("extract")
+    p.add_argument("interaction_id")
+    p.add_argument("persist", nargs="?", default="false")
+
+    # reinforce: backfill reinforcement on an already-captured interaction.
+    p = sub.add_parser("reinforce-interaction")
+    p.add_argument("interaction_id")
+
+    # list-interactions: paginated listing with filters.
+    p = sub.add_parser("list-interactions")
+    p.add_argument("project", nargs="?", default="")
+    p.add_argument("session_id", nargs="?", default="")
+    p.add_argument("client", nargs="?", default="")
+    p.add_argument("since", nargs="?", default="")
+    p.add_argument("limit", nargs="?", type=int, default=50)
+
+    # get-interaction: fetch one by id
+    p = sub.add_parser("get-interaction")
+    p.add_argument("interaction_id")
+
+    # queue: list the candidate review queue
+    p = sub.add_parser("queue")
+    p.add_argument("memory_type", nargs="?", default="")
+    p.add_argument("project", nargs="?", default="")
+    p.add_argument("limit", nargs="?", type=int, default=50)
+
+    # promote: candidate -> active
+    p = sub.add_parser("promote")
+    p.add_argument("memory_id")
+
+    # reject: candidate -> invalid
+    p = sub.add_parser("reject")
+    p.add_argument("memory_id")
+
    return parser


@@ -304,6 +401,79 @@ def main() -> int:
        print_json(request("POST", "/context/build", {"prompt": args.prompt, "project": args.project or None, "budget": args.budget}))
    elif cmd == "audit-query":
        print_json(audit_query(args.prompt, args.top_k, args.project or None))
+    # --- Phase 9 reflection loop surface ------------------------------
+    elif cmd == "capture":
+        body: dict[str, Any] = {
+            "prompt": args.prompt,
+            "response": args.response,
+            "project": args.project,
+            "client": args.client or "atocore-client",
+            "session_id": args.session_id,
+            "reinforce": args.reinforce.lower() in {"1", "true", "yes", "y"},
+        }
+        print_json(request("POST", "/interactions", body))
+    elif cmd == "extract":
+        persist = args.persist.lower() in {"1", "true", "yes", "y"}
+        print_json(
+            request(
+                "POST",
+                f"/interactions/{urllib.parse.quote(args.interaction_id, safe='')}/extract",
+                {"persist": persist},
+            )
+        )
+    elif cmd == "reinforce-interaction":
+        print_json(
+            request(
+                "POST",
+                f"/interactions/{urllib.parse.quote(args.interaction_id, safe='')}/reinforce",
+                {},
+            )
+        )
+    elif cmd == "list-interactions":
+        query_parts: list[str] = []
+        if args.project:
+            query_parts.append(f"project={urllib.parse.quote(args.project)}")
+        if args.session_id:
+            query_parts.append(f"session_id={urllib.parse.quote(args.session_id)}")
+        if args.client:
+            query_parts.append(f"client={urllib.parse.quote(args.client)}")
+        if args.since:
+            query_parts.append(f"since={urllib.parse.quote(args.since)}")
+        query_parts.append(f"limit={int(args.limit)}")
+        query = "?" + "&".join(query_parts)
+        print_json(request("GET", f"/interactions{query}"))
+    elif cmd == "get-interaction":
+        print_json(
+            request(
+                "GET",
+                f"/interactions/{urllib.parse.quote(args.interaction_id, safe='')}",
+            )
+        )
+    elif cmd == "queue":
+        query_parts = ["status=candidate"]
+        if args.memory_type:
+            query_parts.append(f"memory_type={urllib.parse.quote(args.memory_type)}")
+        if args.project:
+            query_parts.append(f"project={urllib.parse.quote(args.project)}")
+        query_parts.append(f"limit={int(args.limit)}")
+        query = "?" + "&".join(query_parts)
+        print_json(request("GET", f"/memory{query}"))
+    elif cmd == "promote":
+        print_json(
+            request(
+                "POST",
+                f"/memory/{urllib.parse.quote(args.memory_id, safe='')}/promote",
+                {},
+            )
+        )
+    elif cmd == "reject":
+        print_json(
+            request(
+                "POST",
+                f"/memory/{urllib.parse.quote(args.memory_id, safe='')}/reject",
+                {},
+            )
+        )
    else:
        return 1
    return 0
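
The `list-interactions` and `queue` branches above both assemble a query string by hand from optional filters. One possible alternative, shown here only as a sketch, is to let `urllib.parse.urlencode` do the quoting and joining. `build_query` is a hypothetical helper that does not exist in the client.

```python
from urllib.parse import urlencode


def build_query(filters: dict[str, str], limit: int) -> str:
    """Build '?k=v&...' from the non-empty filters plus a limit."""
    # Drop empty filters so they are not sent as blank parameters.
    params = {key: value for key, value in filters.items() if value}
    params["limit"] = str(int(limit))
    return "?" + urlencode(params)
```

Note that `urlencode` quotes with `quote_plus` by default, so a space becomes `+` rather than `%20` as `urllib.parse.quote` would produce. Both are valid in a query string, but the output is not byte-identical to what the hand-built version sends.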
--- a/scripts/migrate_legacy_aliases.py
+++ b/scripts/migrate_legacy_aliases.py
--- a/src/atocore/__init__.py
+++ b/src/atocore/__init__.py
@@ -1,3 +1,15 @@
 """AtoCore — Personal Context Engine."""

-__version__ = "0.1.0"
+# Bumped when a commit meaningfully changes the API surface, schema, or
+# user-visible behavior. The /health endpoint reports this value so
+# deployment drift is immediately visible: if the running service's
+# /health reports an older version than the main branch's __version__,
+# the deployment is stale and needs a redeploy (see
+# docs/dalidou-deployment.md and deploy/dalidou/deploy.sh).
+#
+# History:
+#   0.1.0  Phase 0/0.5/1/2/3/5/7 baseline
+#   0.2.0  Phase 9 reflection loop (capture/reinforce/extract + review
+#          queue), shared client v0.2.0, project identity
+#          canonicalization at every service-layer entry point
+__version__ = "0.2.0"
--- a/src/atocore/api/routes.py
+++ b/src/atocore/api/routes.py
@@ -742,12 +742,45 @@ def api_validate_backup(stamp: str) -> dict:

 @router.get("/health")
 def api_health() -> dict:
-    """Health check."""
+    """Health check.
+
+    Three layers of version reporting, in increasing precision:
+
+    - ``version`` / ``code_version``: ``atocore.__version__`` (e.g.
+      "0.2.0"). Bumped manually on commits that change the API
+      surface, schema, or user-visible behavior. Coarse — any
+      number of commits can land between bumps without changing
+      this value.
+    - ``build_sha``: full git SHA of the commit the running
+      container was built from. Set by ``deploy/dalidou/deploy.sh``
+      via the ``ATOCORE_BUILD_SHA`` env var on every rebuild.
+      Reports ``"unknown"`` for builds that bypass deploy.sh
+      (direct ``docker compose up`` etc.). This is the precise
+      drift signal: if the live ``build_sha`` doesn't match the
+      tip of the deployed branch on Gitea, the service is stale
+      regardless of what ``code_version`` says.
+    - ``build_time`` / ``build_branch``: when and from which branch
+      the live container was built. Useful for forensics when
+      multiple branches are in flight or when build_sha is
+      ambiguous (e.g. a force-push to the same SHA).
+
+    The deploy.sh post-deploy verification step compares the live
+    ``build_sha`` to the SHA it just set, and exits non-zero on
+    mismatch.
+    """
+    import os
+
+    from atocore import __version__
+
    store = get_vector_store()
    source_status = get_source_status()
    return {
        "status": "ok",
-        "version": "0.1.0",
+        "version": __version__,
+        "code_version": __version__,
+        "build_sha": os.environ.get("ATOCORE_BUILD_SHA", "unknown"),
+        "build_time": os.environ.get("ATOCORE_BUILD_TIME", "unknown"),
+        "build_branch": os.environ.get("ATOCORE_BUILD_BRANCH", "unknown"),
        "vectors_count": store.count,
        "env": _config.settings.env,
        "machine_paths": {
--- a/src/atocore/context/builder.py
+++ b/src/atocore/context/builder.py
@@ -14,6 +14,7 @@ import atocore.config as _config
 from atocore.context.project_state import format_project_state, get_state
 from atocore.memory.service import get_memories_for_context
 from atocore.observability.logger import get_logger
+from atocore.projects.registry import resolve_project_name
 from atocore.retrieval.retriever import ChunkResult, retrieve

 log = get_logger("context_builder")
@@ -84,8 +85,16 @@ def build_context(
        max(0, int(budget * PROJECT_STATE_BUDGET_RATIO)),
    )

-    if project_hint:
-        state_entries = get_state(project_hint)
+    # Canonicalize the project hint through the registry so callers
+    # can pass an alias (`p05`, `gigabit`) and still find trusted
+    # state stored under the canonical project id. The same helper
+    # is used everywhere a project name crosses a trust boundary
+    # (project_state, memories, interactions). When the registry has
+    # no entry the helper returns the input unchanged so hand-curated
+    # state that predates the registry still works.
+    canonical_project = resolve_project_name(project_hint) if project_hint else ""
+    if canonical_project:
+        state_entries = get_state(canonical_project)
        if state_entries:
            project_state_text = format_project_state(state_entries)
            project_state_text, project_state_chars = _truncate_text_block(
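
The canonicalization contract described in the comment above (alias in, canonical id out, unknown names passed through unchanged) can be sketched as follows. The registry contents and the function name are hypothetical stand-ins; the real helper is `resolve_project_name` in `atocore.projects.registry`, whose implementation is not shown in this diff.

```python
# Hypothetical in-memory registry: canonical id -> known aliases.
# The "interferometer" alias is an assumption made for this example.
_REGISTRY: dict[str, list[str]] = {
    "p04-gigabit": ["p04", "gigabit"],
    "p05-interferometer": ["p05", "interferometer"],
}


def resolve_project_name_sketch(name: str) -> str:
    """Map an alias or canonical id to the canonical id.

    Names with no registry entry are returned unchanged so that
    hand-curated state predating the registry still resolves.
    """
    if not name:
        return name
    wanted = name.strip().lower()
    for canonical, aliases in _REGISTRY.items():
        if wanted == canonical.lower():
            return canonical
        if wanted in (alias.lower() for alias in aliases):
            return canonical
    return name
```

Applying the same helper on both the write path and the read path is what keeps an alias caller and a canonical-id caller pointed at the same rows.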
--- a/src/atocore/context/project_state.py
+++ b/src/atocore/context/project_state.py
@@ -18,6 +18,7 @@ from datetime import datetime, timezone

 from atocore.models.database import get_connection
 from atocore.observability.logger import get_logger
+from atocore.projects.registry import resolve_project_name

 log = get_logger("project_state")

@@ -101,11 +102,19 @@ def set_state(
    source: str = "",
    confidence: float = 1.0,
 ) -> ProjectStateEntry:
-    """Set or update a project state entry. Upsert semantics."""
+    """Set or update a project state entry. Upsert semantics.
+
+    The ``project_name`` is canonicalized through the registry so a
+    caller passing an alias (``p05``) ends up writing into the same
+    row as the canonical id (``p05-interferometer``). Without this
+    step, alias and canonical names would create two parallel
+    project rows and fragmented state.
+    """
    if category not in CATEGORIES:
        raise ValueError(f"Invalid category '{category}'. Must be one of: {CATEGORIES}")
    _validate_confidence(confidence)

+    project_name = resolve_project_name(project_name)
    project_id = ensure_project(project_name)
    entry_id = str(uuid.uuid4())
    now = datetime.now(timezone.utc).isoformat()
@@ -153,7 +162,12 @@ def get_state(
    category: str | None = None,
    active_only: bool = True,
 ) -> list[ProjectStateEntry]:
-    """Get project state entries, optionally filtered by category."""
+    """Get project state entries, optionally filtered by category.
+
+    The lookup is canonicalized through the registry so an alias hint
+    finds the same rows as the canonical id.
+    """
+    project_name = resolve_project_name(project_name)
    with get_connection() as conn:
        project = conn.execute(
            "SELECT id FROM projects WHERE lower(name) = lower(?)", (project_name,)
@@ -191,7 +205,12 @@ def get_state(


 def invalidate_state(project_name: str, category: str, key: str) -> bool:
-    """Mark a project state entry as superseded."""
+    """Mark a project state entry as superseded.
+
+    The lookup is canonicalized through the registry so an alias is
+    treated as the canonical project for the invalidation lookup.
+    """
+    project_name = resolve_project_name(project_name)
    with get_connection() as conn:
        project = conn.execute(
            "SELECT id FROM projects WHERE lower(name) = lower(?)", (project_name,)
--- a/src/atocore/interactions/service.py
+++ b/src/atocore/interactions/service.py
@@ -18,15 +18,24 @@ violating the AtoCore trust hierarchy.
 from __future__ import annotations

 import json
+import re
 import uuid
 from dataclasses import dataclass, field
 from datetime import datetime, timezone

 from atocore.models.database import get_connection
 from atocore.observability.logger import get_logger
+from atocore.projects.registry import resolve_project_name

 log = get_logger("interactions")

+# Stored timestamps use 'YYYY-MM-DD HH:MM:SS' (no timezone offset, UTC by
+# convention) so they sort lexically and compare cleanly with the SQLite
+# CURRENT_TIMESTAMP default. The since filter accepts ISO 8601 strings
+# (with 'T', optional 'Z' or +offset, optional fractional seconds) and
+# normalizes them to the storage format before the SQL comparison.
+_STORAGE_TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"
+

 @dataclass
 class Interaction:
@@ -72,6 +81,13 @@ def record_interaction(
    if not prompt or not prompt.strip():
        raise ValueError("Interaction prompt must be non-empty")

+    # Canonicalize the project through the registry so an alias and
+    # the canonical id store under the same bucket. Without this,
+    # reinforcement and extraction (which both query by raw
+    # interaction.project) would silently miss memories and create
+    # candidates in the wrong project.
+    project = resolve_project_name(project)
+
    interaction_id = str(uuid.uuid4())
    # Store created_at explicitly so the same string lives in both the DB
    # column and the returned dataclass. SQLite's CURRENT_TIMESTAMP uses
@@ -159,9 +175,14 @@ def list_interactions(
 ) -> list[Interaction]:
    """List captured interactions, optionally filtered.

-    ``since`` is an ISO timestamp string; only interactions created at or
-    after that time are returned. ``limit`` is hard-capped at 500 to keep
-    casual API listings cheap.
+    ``since`` accepts an ISO 8601 timestamp string (with ``T``, an
+    optional ``Z`` or numeric offset, optional fractional seconds).
+    The value is normalized to the storage format (UTC,
+    ``YYYY-MM-DD HH:MM:SS``) before the SQL comparison so external
+    callers can pass any of the common ISO shapes without filter
+    drift. ``project`` is canonicalized through the registry so an
+    alias finds rows stored under the canonical project id.
+    ``limit`` is hard-capped at 500 to keep casual API listings cheap.
    """
    if limit <= 0:
        return []
@@ -172,7 +193,7 @@ def list_interactions(

    if project:
        query += " AND project = ?"
-        params.append(project)
+        params.append(resolve_project_name(project))
    if session_id:
        query += " AND session_id = ?"
        params.append(session_id)
@@ -181,7 +202,7 @@ def list_interactions(
        params.append(client)
    if since:
        query += " AND created_at >= ?"
-        params.append(since)
+        params.append(_normalize_since(since))

    query += " ORDER BY created_at DESC LIMIT ?"
    params.append(limit)
@@ -243,3 +264,41 @@ def _safe_json_dict(raw: str | None) -> dict:
    if not isinstance(value, dict):
        return {}
    return value
+
+
+def _normalize_since(since: str) -> str:
+    """Normalize an ISO 8601 ``since`` filter to the storage format.
+
+    Stored ``created_at`` values are ``YYYY-MM-DD HH:MM:SS`` (no
+    timezone, UTC by convention). External callers naturally pass
+    ISO 8601 with ``T`` separator, optional ``Z`` suffix, optional
+    fractional seconds, and optional ``+HH:MM`` offsets. A naive
+    string comparison between the two formats fails on the same
+    day because the lexically-greater ``T`` makes any ISO value
+    sort after any space-separated value.
+
+    This helper accepts the common ISO shapes plus the bare
+    storage format and returns the storage format. On a parse
+    failure it returns the input unchanged so the SQL comparison
+    fails open (no rows match) instead of raising and breaking
+    the listing endpoint.
+    """
+    if not since:
+        return since
+    candidate = since.strip()
+    # Python's fromisoformat accepts a trailing 'Z' from 3.11 onward, but
+    # we replace it explicitly so older interpreters parse it too.
+    if candidate.endswith("Z"):
+        candidate = candidate[:-1] + "+00:00"
+    try:
+        dt = datetime.fromisoformat(candidate)
+    except ValueError:
+        # Already in storage format, or unparseable: best-effort
+        # match the storage format with a regex; if that fails too,
+        # return the raw input.
+        if re.fullmatch(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}", since):
+            return since
+        return since
+    if dt.tzinfo is not None:
+        dt = dt.astimezone(timezone.utc).replace(tzinfo=None)
+    return dt.strftime(_STORAGE_TIMESTAMP_FORMAT)
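
The same-day comparison failure that motivates `_normalize_since` is easy to reproduce. The sketch below uses made-up timestamps and a simplified stand-in called `to_storage`, which is hypothetical and omits the parse-failure fallback of the real helper.

```python
from datetime import datetime, timezone

stored = "2026-04-11 09:00:00"       # storage format, UTC by convention
since_iso = "2026-04-11T08:00:00Z"   # one hour earlier, ISO 8601

# Naive string comparison: ' ' (0x20) sorts before 'T' (0x54), so the
# stored row looks older than the filter even though it is newer.
naive_match = stored >= since_iso


def to_storage(since: str) -> str:
    """Convert an ISO 8601 string to 'YYYY-MM-DD HH:MM:SS' in UTC."""
    candidate = since.strip()
    if candidate.endswith("Z"):
        candidate = candidate[:-1] + "+00:00"
    dt = datetime.fromisoformat(candidate)
    if dt.tzinfo is not None:
        dt = dt.astimezone(timezone.utc).replace(tzinfo=None)
    return dt.strftime("%Y-%m-%d %H:%M:%S")


# After normalization both sides share one format and compare correctly.
normalized_match = stored >= to_storage(since_iso)
```

Here `naive_match` is `False` and `normalized_match` is `True`, which is the filter drift the docstring describes.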
--- a/src/atocore/main.py
+++ b/src/atocore/main.py
@@ -4,6 +4,7 @@ from contextlib import asynccontextmanager

 from fastapi import FastAPI

+from atocore import __version__
 from atocore.api.routes import router
 import atocore.config as _config
 from atocore.context.project_state import init_project_state_schema
@@ -43,7 +44,7 @@ async def lifespan(app: FastAPI):
 app = FastAPI(
    title="AtoCore",
    description="Personal Context Engine for LLM interactions",
-    version="0.1.0",
+    version=__version__,
    lifespan=lifespan,
 )

--- a/src/atocore/memory/reinforcement.py
+++ b/src/atocore/memory/reinforcement.py
@@ -8,10 +8,11 @@ given memory, without ever promoting anything new into trusted state.

 Design notes
 ------------
- Matching is intentionally simple and explainable:
-    * normalize both sides (lowercase, collapse whitespace)
-    * require the normalized memory content (or its first 80 chars) to
-      appear as a substring in the normalized response
+- Matching uses token-overlap: tokenize both sides (lowercase, stem,
+  drop stop words), then check whether >= 70 % of the memory's content
+  tokens appear in the response token set. This handles natural
+  paraphrases (e.g. "prefers" vs "prefer", "because history" vs
+  "because the history") that substring matching missed.
 - Candidates and invalidated memories are NEVER considered — reinforcement
  must not revive history.
 - Reinforcement is capped at 1.0 and monotonically non-decreasing.
@@ -43,9 +44,12 @@ log = get_logger("reinforcement")
 # memories like "prefers Python".
 _MIN_MEMORY_CONTENT_LENGTH = 12

-# When a memory's content is very long, match on its leading window only
-# to avoid punishing small paraphrases further into the body.
-_MATCH_WINDOW_CHARS = 80
+# Token-overlap matching constants.
+_STOP_WORDS: frozenset[str] = frozenset({
+    "the", "a", "an", "and", "or", "of", "to", "is", "was",
+    "that", "this", "with", "for", "from", "into",
+})
+_MATCH_THRESHOLD = 0.70

 DEFAULT_CONFIDENCE_DELTA = 0.02

@@ -144,12 +148,58 @@ def _normalize(text: str) -> str:
    return collapsed.strip()


+def _stem(word: str) -> str:
+    """Aggressive suffix-folding so inflected forms collapse.
+
+    Handles trailing ``ing``, ``ed``, and ``s`` — good enough for
+    reinforcement matching without pulling in nltk/snowball.
+    """
+    # Order matters: try longest suffix first.
+    if word.endswith("ing") and len(word) >= 6:
+        return word[:-3]
+    if word.endswith("ed") and len(word) > 4:
+        stem = word[:-2]
+        # "preferred" → "preferr" → "prefer" (doubled consonant before -ed)
+        if len(stem) >= 3 and stem[-1] == stem[-2]:
+            stem = stem[:-1]
+        return stem
+    if word.endswith("s") and len(word) > 3:
+        return word[:-1]
+    return word
+
+
+def _tokenize(text: str) -> set[str]:
+    """Split normalized text into a stemmed token set.
+
+    Strips punctuation, drops words shorter than 3 chars and stop words.
+    """
+    tokens: set[str] = set()
+    for raw in text.split():
+        # Strip leading/trailing punctuation (commas, periods, quotes, etc.)
+        word = raw.strip(".,;:!?\"'()[]{}-/")
+        if len(word) < 3:
+            continue
+        if word in _STOP_WORDS:
+            continue
+        tokens.add(_stem(word))
+    return tokens
+
+
 def _memory_matches(memory_content: str, normalized_response: str) -> bool:
-    """Return True if the memory content appears in the response."""
+    """Return True if enough of the memory's tokens appear in the response.
+
+    Uses token-overlap: tokenize both sides (lowercase, stem, drop stop
+    words), then check whether >= 70 % of the memory's content tokens
+    appear in the response token set.
+    """
    if not memory_content:
        return False
    normalized_memory = _normalize(memory_content)
    if len(normalized_memory) < _MIN_MEMORY_CONTENT_LENGTH:
        return False
-    window = normalized_memory[:_MATCH_WINDOW_CHARS]
-    return window in normalized_response
+    memory_tokens = _tokenize(normalized_memory)
+    if not memory_tokens:
+        return False
+    response_tokens = _tokenize(normalized_response)
+    overlap = memory_tokens & response_tokens
+    return len(overlap) / len(memory_tokens) >= _MATCH_THRESHOLD
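To make the token-overlap behavior concrete, here is a self-contained sketch that mirrors the `_stem`, `_tokenize`, and `_memory_matches` logic from this diff under public names. The inline normalization (lowercase plus whitespace collapse) is an assumption about what `_normalize` does, and the minimum-length guard is omitted for brevity.

```python
STOP_WORDS = frozenset({
    "the", "a", "an", "and", "or", "of", "to", "is", "was",
    "that", "this", "with", "for", "from", "into",
})
MATCH_THRESHOLD = 0.70


def stem(word: str) -> str:
    """Fold trailing -ing, -ed, and -s, as in the diff's _stem."""
    if word.endswith("ing") and len(word) >= 6:
        return word[:-3]
    if word.endswith("ed") and len(word) > 4:
        base = word[:-2]
        # Undo a doubled consonant: "preferred" -> "preferr" -> "prefer".
        if len(base) >= 3 and base[-1] == base[-2]:
            base = base[:-1]
        return base
    if word.endswith("s") and len(word) > 3:
        return word[:-1]
    return word


def tokenize(text: str) -> set[str]:
    """Build a stemmed token set, dropping short words and stop words."""
    tokens: set[str] = set()
    for raw in text.split():
        word = raw.strip(".,;:!?\"'()[]{}-/")
        if len(word) < 3 or word in STOP_WORDS:
            continue
        tokens.add(stem(word))
    return tokens


def memory_matches(memory: str, response: str) -> bool:
    """True when at least 70% of the memory's tokens occur in the response."""
    # Assumed normalization: lowercase and collapse whitespace.
    memory_tokens = tokenize(" ".join(memory.lower().split()))
    if not memory_tokens:
        return False
    response_tokens = tokenize(" ".join(response.lower().split()))
    return len(memory_tokens & response_tokens) / len(memory_tokens) >= MATCH_THRESHOLD


memory = "User prefers Python because history matters"
paraphrase = "The user prefer Python, because the history matters."
unrelated = "Python is a programming language."

paraphrase_hit = memory_matches(memory, paraphrase)
unrelated_hit = memory_matches(memory, unrelated)
```

Note that the `-ing` branch does not fold doubled consonants, so a word like `running` stems to `runn` rather than `run`; both sides are stemmed the same way, so this only matters when the memory and the response use different inflections of such a word.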
--- a/src/atocore/memory/service.py
+++ b/src/atocore/memory/service.py
@@ -29,6 +29,7 @@ from datetime import datetime, timezone

 from atocore.models.database import get_connection
 from atocore.observability.logger import get_logger
+from atocore.projects.registry import resolve_project_name

 log = get_logger("memory")

@@ -84,6 +85,13 @@ def create_memory(
        raise ValueError(f"Invalid status '{status}'. Must be one of: {MEMORY_STATUSES}")
    _validate_confidence(confidence)

+    # Canonicalize the project through the registry so an alias and
+    # the canonical id store under the same bucket. This keeps
+    # reinforcement queries (which use the interaction's project) and
+    # context retrieval (which uses the registry-canonicalized hint)
+    # consistent with how memories are created.
+    project = resolve_project_name(project)
+
    memory_id = str(uuid.uuid4())
    now = datetime.now(timezone.utc).isoformat()

@@ -162,8 +170,13 @@ def get_memories(
        query += " AND memory_type = ?"
        params.append(memory_type)
    if project is not None:
+        # Canonicalize on the read side so a caller passing an alias
+        # finds rows that were stored under the canonical id (and
+        # vice versa). resolve_project_name returns the input
+        # unchanged for unregistered names so empty-string queries
+        # for "no project scope" still work.
        query += " AND project = ?"
-        params.append(project)
+        params.append(resolve_project_name(project))
    if status is not None:
        query += " AND status = ?"
        params.append(status)
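The effect of canonicalizing on both the write and the read side can be illustrated with a toy in-memory store. This is a sketch under stated assumptions: the `REGISTRY` list, the `RegisteredProject` fields, the exact-match lookup, and the `store`/`fetch` helpers are hypothetical stand-ins, not the real registry or database layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisteredProject:
    project_id: str
    aliases: tuple[str, ...] = ()


# Hypothetical registry contents, for illustration only.
REGISTRY = [RegisteredProject("p05-interferometer", ("p05", "interferometer"))]


def get_registered_project(name: str):
    """Exact-match lookup by id or alias (matching rules are an assumption)."""
    for project in REGISTRY:
        if name == project.project_id or name in project.aliases:
            return project
    return None


def resolve_project_name(name):
    """Return the canonical id, "" for empty input, or the input unchanged."""
    if not name:
        return name or ""
    project = get_registered_project(name)
    return project.project_id if project is not None else name


# Toy store keyed by canonical project name.
_rows: dict[str, list[str]] = {}


def store(project: str, content: str) -> None:
    _rows.setdefault(resolve_project_name(project), []).append(content)


def fetch(project: str) -> list[str]:
    return _rows.get(resolve_project_name(project), [])


store("p05", "focus is wave 2 ingestion")          # written via an alias
store("p05-interferometer", "uses trusted state")  # written via the canonical id
via_alias = fetch("interferometer")                # read via a different alias
```

Because every entry point resolves the name before touching the store, all three spellings land in the same bucket, which is the property the comments in this hunk describe.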
--- a/src/atocore/models/database.py
+++ b/src/atocore/models/database.py
@@ -71,14 +71,18 @@ CREATE TABLE IF NOT EXISTS interactions (
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
 );

+-- Indexes that reference columns guaranteed to exist since the first
+-- release ship here. Indexes that reference columns added by later
+-- migrations (memories.project, interactions.project,
+-- interactions.session_id) are created inside _apply_migrations AFTER
+-- the corresponding ALTER TABLE, NOT here. Creating them here would
+-- fail on upgrade from a pre-migration schema because CREATE TABLE
+-- IF NOT EXISTS is a no-op on an existing table, so the new columns
+-- wouldn't be added before the CREATE INDEX runs.
 CREATE INDEX IF NOT EXISTS idx_chunks_document ON source_chunks(document_id);
 CREATE INDEX IF NOT EXISTS idx_memories_type ON memories(memory_type);
-CREATE INDEX IF NOT EXISTS idx_memories_project ON memories(project);
 CREATE INDEX IF NOT EXISTS idx_memories_status ON memories(status);
 CREATE INDEX IF NOT EXISTS idx_interactions_project ON interactions(project_id);
-CREATE INDEX IF NOT EXISTS idx_interactions_project_name ON interactions(project);
-CREATE INDEX IF NOT EXISTS idx_interactions_session ON interactions(session_id);
-CREATE INDEX IF NOT EXISTS idx_interactions_created_at ON interactions(created_at);
 """
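The ordering problem described in the SQL comment can be reproduced with an in-memory database. The sketch below is illustrative only: `apply_migrations` and `column_exists` are hypothetical stand-ins for the project's `_apply_migrations`, and the table is reduced to two columns.

```python
import sqlite3


def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """Check PRAGMA table_info for a column name (row[1] is the name)."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def apply_migrations(conn: sqlite3.Connection) -> None:
    """Hypothetical migration step: add the column first, then index it."""
    if not column_exists(conn, "memories", "project"):
        conn.execute("ALTER TABLE memories ADD COLUMN project TEXT DEFAULT ''")
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_memories_project ON memories(project)"
    )


conn = sqlite3.connect(":memory:")

# Simulate a database created by an older release, without the project column.
conn.execute("CREATE TABLE memories (id TEXT PRIMARY KEY, memory_type TEXT)")

# The current schema script is a no-op on the existing table, so the new
# column is NOT added by this statement.
conn.execute(
    "CREATE TABLE IF NOT EXISTS memories "
    "(id TEXT PRIMARY KEY, memory_type TEXT, project TEXT)"
)

# Creating the index in the schema script would fail at this point.
try:
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_memories_project ON memories(project)"
    )
    index_failed_before_migration = False
except sqlite3.OperationalError:
    index_failed_before_migration = True

# Running the migration first makes the same CREATE INDEX succeed.
apply_migrations(conn)
```

Keeping the index next to the `ALTER TABLE` that introduces its column means a fresh install and an upgrade follow the same path.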


--- a/src/atocore/ops/backup.py
+++ b/src/atocore/ops/backup.py
@@ -103,12 +103,27 @@ def create_runtime_backup(
        encoding="utf-8",
    )

+    # Automatic post-backup validation. Failures log a warning but do
+    # not raise — the backup files are still on disk and may be useful.
+    validation = validate_backup(stamp)
+    validated = validation.get("valid", False)
+    validation_errors = validation.get("errors", [])
+    if not validated:
+        log.warning(
+            "post_backup_validation_failed",
+            backup_root=str(backup_root),
+            errors=validation_errors,
+        )
+    metadata["validated"] = validated
+    metadata["validation_errors"] = validation_errors
+
    log.info(
        "runtime_backup_created",
        backup_root=str(backup_root),
        db_snapshot=str(db_snapshot_path),
        chroma_included=include_chroma,
        chroma_bytes=chroma_bytes_copied,
+        validated=validated,
    )
    return metadata

@@ -216,6 +231,286 @@ def validate_backup(stamp: str) -> dict:
    return result


+def restore_runtime_backup(
+    stamp: str,
+    *,
+    include_chroma: bool | None = None,
+    pre_restore_snapshot: bool = True,
+    confirm_service_stopped: bool = False,
+) -> dict:
+    """Restore a previously captured runtime backup.
+
+    CRITICAL: the AtoCore service MUST be stopped before calling this.
+    Overwriting a live SQLite database corrupts state and can break
+    the running container's open connections. The caller must pass
+    ``confirm_service_stopped=True`` as an explicit acknowledgment —
+    otherwise this function refuses to run.
+
+    The restore procedure:
+
+    1. Validate the backup via ``validate_backup``; refuse on any error.
+    2. (default) Create a pre-restore safety snapshot of the CURRENT
+       state so the restore itself is reversible. The snapshot stamp
+       is returned in the result for the operator to record.
+    3. Remove stale SQLite WAL/SHM sidecar files next to the target db
+       before copying — the snapshot is a self-contained main-file
+       image from ``conn.backup()``, and leftover WAL/SHM from the old
+       live db would desync against the restored main file.
+    4. Copy the snapshot db over the target db path.
+    5. Restore the project registry file if the snapshot captured one.
+    6. Restore the Chroma directory if ``include_chroma`` resolves to
+       true. When ``include_chroma is None`` the function defers to
+       whether the snapshot captured Chroma (the common case).
+    7. Run ``PRAGMA integrity_check`` on the restored db and report
+       the result.
+
+    Returns a dict describing what was restored. On refused restore
+    (service still running, validation failed) raises ``RuntimeError``.
+    """
+    if not confirm_service_stopped:
+        raise RuntimeError(
+            "restore_runtime_backup refuses to run without "
+            "confirm_service_stopped=True — stop the AtoCore container "
+            "first (e.g. `docker compose down` from deploy/dalidou) "
+            "before calling this function"
+        )
+
+    validation = validate_backup(stamp)
+    if not validation.get("valid"):
+        raise RuntimeError(
+            f"backup {stamp} failed validation: {validation.get('errors')}"
+        )
+    metadata = validation.get("metadata") or {}
+
+    pre_snapshot_stamp: str | None = None
+    if pre_restore_snapshot:
+        pre = create_runtime_backup(include_chroma=False)
+        pre_snapshot_stamp = Path(pre["backup_root"]).name
+
+    target_db = _config.settings.db_path
+    source_db = Path(metadata.get("db_snapshot_path", ""))
+    if not source_db.exists():
+        raise RuntimeError(
+            f"db snapshot not found at {source_db} — backup "
+            f"metadata may be stale"
+        )
+
+    # Force sqlite to flush any lingering WAL into the main file and
+    # release OS-level file handles on -wal/-shm before we swap the
+    # main file. Passing through conn.backup() in the pre-restore
+    # snapshot can leave sidecars momentarily locked on Windows;
+    # an explicit checkpoint(TRUNCATE) is the reliable way to flush
+    # and release. Best-effort: if the target db can't be opened
+    # (missing, corrupt), fall through and trust the copy step.
+    if target_db.exists():
+        try:
+            with sqlite3.connect(str(target_db)) as checkpoint_conn:
+                checkpoint_conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
+        except sqlite3.DatabaseError as exc:
+            log.warning(
+                "restore_pre_checkpoint_failed",
+                target_db=str(target_db),
+                error=str(exc),
+            )
+
+    # Remove stale WAL/SHM sidecars from the old live db so SQLite
+    # can't read inconsistent state on next open. Tolerant to
+    # Windows file-lock races — the subsequent copy replaces the
+    # main file anyway, and the integrity check afterward is the
+    # actual correctness signal.
+    wal_path = target_db.with_name(target_db.name + "-wal")
+    shm_path = target_db.with_name(target_db.name + "-shm")
+    for stale in (wal_path, shm_path):
+        if stale.exists():
+            try:
+                stale.unlink()
+            except OSError as exc:
+                log.warning(
+                    "restore_sidecar_unlink_failed",
+                    path=str(stale),
+                    error=str(exc),
+                )
+
+    target_db.parent.mkdir(parents=True, exist_ok=True)
+    shutil.copy2(source_db, target_db)
+
+    registry_restored = False
+    registry_snapshot_path = metadata.get("registry_snapshot_path", "")
+    if registry_snapshot_path:
+        src_reg = Path(registry_snapshot_path)
+        if src_reg.exists():
+            dst_reg = _config.settings.resolved_project_registry_path
+            dst_reg.parent.mkdir(parents=True, exist_ok=True)
+            shutil.copy2(src_reg, dst_reg)
+            registry_restored = True
+
+    chroma_snapshot_path = metadata.get("chroma_snapshot_path", "")
+    if include_chroma is None:
+        include_chroma = bool(chroma_snapshot_path)
+    chroma_restored = False
+    if include_chroma and chroma_snapshot_path:
+        src_chroma = Path(chroma_snapshot_path)
+        if src_chroma.exists() and src_chroma.is_dir():
+            dst_chroma = _config.settings.chroma_path
+            # Do NOT rmtree the destination itself: in a Dockerized
+            # deployment the chroma dir is a bind-mounted volume, and
+            # unlinking a mount point raises
+            #   OSError [Errno 16] Device or resource busy.
+            # Instead, clear the directory's CONTENTS and copytree into
+            # it with dirs_exist_ok=True. This is equivalent to an
+            # rmtree+copytree for restore purposes but stays inside the
+            # mount boundary. Discovered during the first real restore
+            # drill on Dalidou (2026-04-09).
+            dst_chroma.mkdir(parents=True, exist_ok=True)
+            for item in dst_chroma.iterdir():
+                if item.is_dir() and not item.is_symlink():
+                    shutil.rmtree(item)
+                else:
+                    item.unlink()
+            shutil.copytree(src_chroma, dst_chroma, dirs_exist_ok=True)
+            chroma_restored = True
+
+    restored_integrity_ok = False
+    integrity_error: str | None = None
+    try:
+        with sqlite3.connect(str(target_db)) as conn:
+            row = conn.execute("PRAGMA integrity_check").fetchone()
+            restored_integrity_ok = bool(row and row[0] == "ok")
+            if not restored_integrity_ok:
+                integrity_error = row[0] if row else "no_row"
+    except sqlite3.DatabaseError as exc:
+        integrity_error = f"db_open_failed: {exc}"
+
+    result: dict = {
+        "stamp": stamp,
+        "pre_restore_snapshot": pre_snapshot_stamp,
+        "target_db": str(target_db),
+        "db_restored": True,
+        "registry_restored": registry_restored,
+        "chroma_restored": chroma_restored,
+        "restored_integrity_ok": restored_integrity_ok,
+    }
+    if integrity_error:
+        result["integrity_error"] = integrity_error
+
+    log.info(
+        "runtime_backup_restored",
+        stamp=stamp,
+        pre_restore_snapshot=pre_snapshot_stamp,
+        registry_restored=registry_restored,
+        chroma_restored=chroma_restored,
+        integrity_ok=restored_integrity_ok,
+    )
+    return result
+
+
+def cleanup_old_backups(*, confirm: bool = False) -> dict:
+    """Apply retention policy and remove old snapshots.
+
+    Retention keeps:
+    - Last 7 daily snapshots (most recent per calendar day)
+    - Last 4 weekly snapshots (most recent on each Sunday)
+    - Last 6 monthly snapshots (most recent on the 1st of each month)
+
+    All other snapshots are candidates for deletion. Runs as dry-run by
+    default; pass ``confirm=True`` to actually delete.
+
+    Returns a dict with kept, deleted (would_delete on a dry run), and errors.
+    """
+    snapshots_root = _config.settings.resolved_backup_dir / "snapshots"
+    if not snapshots_root.exists() or not snapshots_root.is_dir():
+        return {"kept": 0, "deleted": 0, "would_delete": 0, "dry_run": not confirm, "errors": []}
+
+    # Parse all stamp directories into (datetime, dir_path) pairs.
+    stamps: list[tuple[datetime, Path]] = []
+    unparseable: list[str] = []
+    for entry in sorted(snapshots_root.iterdir()):
+        if not entry.is_dir():
+            continue
+        try:
+            dt = datetime.strptime(entry.name, "%Y%m%dT%H%M%SZ").replace(tzinfo=UTC)
+            stamps.append((dt, entry))
+        except ValueError:
+            unparseable.append(entry.name)
+
+    if not stamps:
+        return {
+            "kept": 0, "deleted": 0, "would_delete": 0,
+            "dry_run": not confirm, "errors": [],
+            "unparseable": unparseable,
+        }
+
+    # Sort newest first so "most recent per bucket" is a simple first-seen.
+    stamps.sort(key=lambda t: t[0], reverse=True)
+
+    keep_set: set[Path] = set()
+
+    # Last 7 daily: most recent snapshot per calendar day.
+    seen_days: set[str] = set()
+    for dt, path in stamps:
+        day_key = dt.strftime("%Y-%m-%d")
+        if day_key not in seen_days:
+            seen_days.add(day_key)
+            keep_set.add(path)
+            if len(seen_days) >= 7:
+                break
+
+    # Last 4 weekly: most recent snapshot that falls on a Sunday.
+    seen_weeks: set[str] = set()
+    for dt, path in stamps:
+        if dt.weekday() == 6:  # Sunday
+            week_key = dt.strftime("%Y-W%W")
+            if week_key not in seen_weeks:
+                seen_weeks.add(week_key)
+                keep_set.add(path)
+                if len(seen_weeks) >= 4:
+                    break
+
+    # Last 6 monthly: most recent snapshot on the 1st of a month.
+    seen_months: set[str] = set()
+    for dt, path in stamps:
+        if dt.day == 1:
+            month_key = dt.strftime("%Y-%m")
+            if month_key not in seen_months:
+                seen_months.add(month_key)
+                keep_set.add(path)
+                if len(seen_months) >= 6:
+                    break
+
+    to_delete = [path for _, path in stamps if path not in keep_set]
+
+    errors: list[str] = []
+    deleted_count = 0
+    if confirm:
+        for path in to_delete:
+            try:
+                shutil.rmtree(path)
+                deleted_count += 1
+            except OSError as exc:
+                errors.append(f"{path.name}: {exc}")
+
+    result: dict = {
+        "kept": len(keep_set),
+        "dry_run": not confirm,
+        "errors": errors,
+    }
+    if confirm:
+        result["deleted"] = deleted_count
+    else:
+        result["would_delete"] = len(to_delete)
+    if unparseable:
+        result["unparseable"] = unparseable
+
+    log.info(
+        "cleanup_old_backups",
+        kept=len(keep_set),
+        deleted=deleted_count if confirm else 0,
+        would_delete=len(to_delete) if not confirm else 0,
+        dry_run=not confirm,
+    )
+    return result
+
+
 def _backup_sqlite_db(source_path: Path, dest_path: Path) -> None:
    source_conn = sqlite3.connect(str(source_path))
    dest_conn = sqlite3.connect(str(dest_path))
@@ -242,7 +537,98 @@ def _copy_directory_tree(source: Path, dest: Path) -> tuple[int, int]:


 def main() -> None:
-    result = create_runtime_backup()
+    """CLI entry point for the backup module.
+
+    Supports five subcommands:
+
+    - ``create``   run ``create_runtime_backup`` (default if none given)
+    - ``list``     list all runtime backup snapshots
+    - ``validate`` validate a specific snapshot by stamp
+    - ``cleanup``  apply the retention policy (dry-run unless ``--confirm``)
+    - ``restore``  restore a specific snapshot by stamp
+
+    ``restore`` is used by the backup/restore drill and MUST be run only
+    when the service is stopped; it requires ``--confirm-service-stopped``.
+    """
+    import argparse
+
+    parser = argparse.ArgumentParser(
+        prog="python -m atocore.ops.backup",
+        description="AtoCore runtime backup create/list/validate/cleanup/restore",
+    )
+    sub = parser.add_subparsers(dest="command")
+
+    p_create = sub.add_parser("create", help="create a new runtime backup")
+    p_create.add_argument(
+        "--chroma",
+        action="store_true",
+        help="also snapshot the Chroma vector store (cold copy)",
+    )
+
+    sub.add_parser("list", help="list runtime backup snapshots")
+
+    p_validate = sub.add_parser("validate", help="validate a snapshot by stamp")
+    p_validate.add_argument("stamp", help="snapshot stamp (e.g. 20260409T010203Z)")
+
+    p_cleanup = sub.add_parser("cleanup", help="remove old snapshots per retention policy")
+    p_cleanup.add_argument(
+        "--confirm",
+        action="store_true",
+        help="actually delete (default is dry-run)",
+    )
+
+    p_restore = sub.add_parser(
+        "restore",
+        help="restore a snapshot by stamp (service must be stopped)",
+    )
+    p_restore.add_argument("stamp", help="snapshot stamp to restore")
+    p_restore.add_argument(
+        "--confirm-service-stopped",
+        action="store_true",
+        help="explicit acknowledgment that the AtoCore container is stopped",
+    )
+    p_restore.add_argument(
+        "--no-pre-snapshot",
+        action="store_true",
+        help="skip the pre-restore safety snapshot of current state",
+    )
+    chroma_group = p_restore.add_mutually_exclusive_group()
+    chroma_group.add_argument(
+        "--chroma",
+        dest="include_chroma",
+        action="store_true",
+        default=None,
+        help="force-restore the Chroma snapshot",
+    )
+    chroma_group.add_argument(
+        "--no-chroma",
+        dest="include_chroma",
+        action="store_false",
+        help="skip the Chroma snapshot even if it was captured",
+    )
+
+    args = parser.parse_args()
+    command = args.command or "create"
+
+    if command == "create":
+        include_chroma = getattr(args, "chroma", False)
+        result = create_runtime_backup(include_chroma=include_chroma)
+    elif command == "list":
+        result = {"backups": list_runtime_backups()}
+    elif command == "validate":
+        result = validate_backup(args.stamp)
+    elif command == "cleanup":
+        result = cleanup_old_backups(confirm=getattr(args, "confirm", False))
+    elif command == "restore":
+        result = restore_runtime_backup(
+            args.stamp,
+            include_chroma=args.include_chroma,
+            pre_restore_snapshot=not args.no_pre_snapshot,
+            confirm_service_stopped=args.confirm_service_stopped,
+        )
+    else:  # pragma: no cover — argparse guards this
+        parser.error(f"unknown command: {command}")
+
    print(json.dumps(result, indent=2, ensure_ascii=True))
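The retention selection in `cleanup_old_backups` can be expressed as a pure function over timestamps, which makes the policy easy to reason about without touching the filesystem. The sketch below is an assumption-laden restatement, not the module's code: `select_keep` and its `take` helper are hypothetical names, and the sample data is synthetic.

```python
from datetime import datetime, timedelta, timezone


def select_keep(stamps, daily=7, weekly=4, monthly=6):
    """Return the set of snapshot datetimes the retention policy keeps."""
    ordered = sorted(stamps, reverse=True)  # newest first
    keep = set()

    def take(limit, key_fn, eligible=lambda dt: True):
        # Keep the newest eligible snapshot per bucket, up to `limit` buckets.
        seen = set()
        for dt in ordered:
            if not eligible(dt):
                continue
            key = key_fn(dt)
            if key in seen:
                continue
            seen.add(key)
            keep.add(dt)
            if len(seen) >= limit:
                break

    take(daily, lambda dt: dt.date())
    take(weekly, lambda dt: dt.strftime("%Y-W%W"), lambda dt: dt.weekday() == 6)
    take(monthly, lambda dt: (dt.year, dt.month), lambda dt: dt.day == 1)
    return keep


# Synthetic data: one snapshot per day at 01:00 UTC for the 40 days
# ending 2026-04-11.
newest = datetime(2026, 4, 11, 1, 0, tzinfo=timezone.utc)
stamps = [newest - timedelta(days=offset) for offset in range(40)]

keep = select_keep(stamps)
to_delete = [dt for dt in stamps if dt not in keep]
```

One consequence worth noting: the weekly and monthly tiers only consider snapshots taken on a Sunday or on the 1st, so a deployment that skips those days keeps nothing for that tier beyond the daily window.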


--- a/src/atocore/projects/registry.py
+++ b/src/atocore/projects/registry.py
@@ -254,6 +254,30 @@ def get_registered_project(project_name: str) -> RegisteredProject | None:
    return None


+def resolve_project_name(name: str | None) -> str:
+    """Canonicalize a project name through the registry.
+
+    Returns the canonical ``project_id`` if the input matches any
+    registered project's id or alias. Returns ``""`` for ``None`` or
+    an empty string, and returns the input unchanged when it is not
+    in the registry. The unregistered case keeps backwards
+    compatibility with hand-curated state, memories, and interactions
+    that predate the registry, and with intentionally unregistered projects.
+
+    This helper is the single canonicalization boundary for project
+    names across the trust hierarchy. Every read/write that takes a
+    project name should pass it through ``resolve_project_name``
+    before storing or querying. The contract is documented in
+    ``docs/architecture/representation-authority.md``.
+    """
+    if not name:
+        return name or ""
+    project = get_registered_project(name)
+    if project is not None:
+        return project.project_id
+    return name
+
+
 def refresh_registered_project(project_name: str, purge_deleted: bool = False) -> dict:
    """Ingest all configured source roots for a registered project.

--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -1,5 +1,6 @@
 """pytest configuration and shared fixtures."""

+import json
 import os
 import sys
 import tempfile
@@ -29,6 +30,45 @@ def tmp_data_dir(tmp_path):
    return tmp_path


+@pytest.fixture
+def project_registry(tmp_path, monkeypatch):
+    """Stand up an isolated project registry pointing at a temp file.
+
+    Returns a callable that takes one or more (project_id, [aliases])
+    tuples and writes them into the registry, then forces the in-process
+    settings singleton to re-resolve. Use this when a test needs the
+    canonicalization helpers (resolve_project_name, get_registered_project)
+    to recognize aliases.
+    """
+    registry_path = tmp_path / "test-project-registry.json"
+
+    def _set(*projects):
+        payload = {"projects": []}
+        for entry in projects:
+            if isinstance(entry, str):
+                project_id, aliases = entry, []
+            else:
+                project_id, aliases = entry
+            payload["projects"].append(
+                {
+                    "id": project_id,
+                    "aliases": list(aliases),
+                    "description": f"test project {project_id}",
+                    "ingest_roots": [
+                        {"source": "vault", "subpath": f"incoming/projects/{project_id}"}
+                    ],
+                }
+            )
+        registry_path.write_text(json.dumps(payload), encoding="utf-8")
+        monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
+        from atocore import config
+
+        config.settings = config.Settings()
+        return registry_path
+
+    return _set
+
+
 @pytest.fixture
 def sample_markdown(tmp_path) -> Path:
    """Create a sample markdown file for testing."""
--- a/tests/test_api_storage.py
+++ b/tests/test_api_storage.py
@@ -50,6 +50,65 @@ def test_health_endpoint_exposes_machine_paths_and_source_readiness(tmp_data_dir
    assert "run_dir" in body["machine_paths"]


+def test_health_endpoint_reports_code_version_from_module(tmp_data_dir):
+    """The /health response must include code_version reflecting
+    atocore.__version__, so deployment drift detection works."""
+    from atocore import __version__
+
+    client = TestClient(app)
+    response = client.get("/health")
+
+    assert response.status_code == 200
+    body = response.json()
+    assert body["version"] == __version__
+    assert body["code_version"] == __version__
+
+
+def test_health_endpoint_reports_build_metadata_from_env(tmp_data_dir, monkeypatch):
+    """The /health response must include build_sha, build_time, and
+    build_branch from the ATOCORE_BUILD_* env vars, so deploy.sh can
+    detect precise drift via SHA comparison instead of relying on
+    the coarse code_version field.
+
+    Regression test for the codex finding from 2026-04-08:
+    code_version 0.2.0 is too coarse to trust as a 'live is current'
+    signal because it only changes on manual bumps. The build_sha
+    field changes per commit and is set by deploy.sh.
+    """
+    monkeypatch.setenv("ATOCORE_BUILD_SHA", "abc1234567890fedcba0987654321")
+    monkeypatch.setenv("ATOCORE_BUILD_TIME", "2026-04-09T01:23:45Z")
+    monkeypatch.setenv("ATOCORE_BUILD_BRANCH", "main")
+
+    client = TestClient(app)
+    response = client.get("/health")
+
+    assert response.status_code == 200
+    body = response.json()
+    assert body["build_sha"] == "abc1234567890fedcba0987654321"
+    assert body["build_time"] == "2026-04-09T01:23:45Z"
+    assert body["build_branch"] == "main"
+
+
+def test_health_endpoint_reports_unknown_when_build_env_unset(tmp_data_dir, monkeypatch):
+    """When deploy.sh hasn't set the build env vars (e.g. someone
+    ran `docker compose up` directly), /health reports 'unknown'
+    for all three build fields. This is a clear signal to the
+    operator that the deploy provenance is missing and they should
+    re-run via deploy.sh."""
+    monkeypatch.delenv("ATOCORE_BUILD_SHA", raising=False)
+    monkeypatch.delenv("ATOCORE_BUILD_TIME", raising=False)
+    monkeypatch.delenv("ATOCORE_BUILD_BRANCH", raising=False)
+
+    client = TestClient(app)
+    response = client.get("/health")
+
+    assert response.status_code == 200
+    body = response.json()
+    assert body["build_sha"] == "unknown"
+    assert body["build_time"] == "unknown"
+    assert body["build_branch"] == "unknown"
+
+
 def test_projects_endpoint_reports_registered_projects(tmp_data_dir, monkeypatch):
    vault_dir = tmp_data_dir / "vault-source"
    drive_dir = tmp_data_dir / "drive-source"
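The build-metadata tests above pin down a simple contract: three `ATOCORE_BUILD_*` environment variables map to three `/health` fields, with `"unknown"` as the fallback. A minimal sketch of a helper satisfying that contract follows; the function name `build_metadata` and its injectable `env` parameter are hypothetical, and the real route may be structured differently.

```python
import os
from typing import Mapping, Optional

# Response field -> environment variable, as exercised by the tests.
_BUILD_FIELDS = {
    "build_sha": "ATOCORE_BUILD_SHA",
    "build_time": "ATOCORE_BUILD_TIME",
    "build_branch": "ATOCORE_BUILD_BRANCH",
}


def build_metadata(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read build provenance from the environment, defaulting to 'unknown'."""
    source = os.environ if env is None else env
    return {
        field: source.get(variable, "unknown")
        for field, variable in _BUILD_FIELDS.items()
    }


deployed = build_metadata({
    "ATOCORE_BUILD_SHA": "abc1234567890fedcba0987654321",
    "ATOCORE_BUILD_TIME": "2026-04-09T01:23:45Z",
    "ATOCORE_BUILD_BRANCH": "main",
})
bare = build_metadata({})
```

Reading the variables at request time rather than at import time is what lets the tests flip them with `monkeypatch.setenv` and `monkeypatch.delenv`.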
--- a/tests/test_atocore_client.py
+++ b/tests/test_atocore_client.py
@@ -0,0 +1,313 @@
+"""Tests for scripts/atocore_client.py — the shared operator CLI.
+
+Specifically covers the Phase 9 reflection-loop subcommands added
+after codex's sequence-step-3 review: ``capture``, ``extract``,
+``reinforce-interaction``, ``list-interactions``, ``get-interaction``,
+``queue``, ``promote``, ``reject``.
+
+The tests mock the client's ``request()`` helper and verify each
+subcommand:
+
+- calls the correct HTTP method and path
+- builds the correct JSON body (or the correct query string)
+- passes the right subset of CLI arguments through
+
+This is the same "wiring test" shape used by tests/test_api_storage.py:
+we don't exercise the live HTTP stack; we verify the client builds
+the request correctly. The server side is already covered by its
+own route tests.
+"""
+
+from __future__ import annotations
+
+import json
+import sys
+from pathlib import Path
+
+import pytest
+
+# Make scripts/ importable
+_REPO_ROOT = Path(__file__).resolve().parent.parent
+sys.path.insert(0, str(_REPO_ROOT / "scripts"))
+
+import atocore_client as client  # noqa: E402
+
+
+# ---------------------------------------------------------------------------
+# Request capture helper
+# ---------------------------------------------------------------------------
+
+
+class _RequestCapture:
+    """Drop-in replacement for client.request() that records calls."""
+
+    def __init__(self, response: dict | None = None):
+        self.calls: list[dict] = []
+        self._response = response if response is not None else {"ok": True}
+
+    def __call__(self, method, path, data=None, timeout=None):
+        self.calls.append(
+            {"method": method, "path": path, "data": data, "timeout": timeout}
+        )
+        return self._response
+
+
+@pytest.fixture
+def capture_requests(monkeypatch):
+    """Replace client.request with a recording stub and return it."""
+    stub = _RequestCapture()
+    monkeypatch.setattr(client, "request", stub)
+    return stub
+
+
+def _run_client(monkeypatch, argv: list[str]) -> int:
+    """Simulate a CLI invocation with the given argv."""
+    monkeypatch.setattr(sys, "argv", ["atocore_client.py", *argv])
+    return client.main()
+
+
+# ---------------------------------------------------------------------------
+# capture
+# ---------------------------------------------------------------------------
+
+
+def test_capture_posts_to_interactions_endpoint(capture_requests, monkeypatch):
+    _run_client(
+        monkeypatch,
+        [
+            "capture",
+            "what is p05's current focus",
+            "The current focus is wave 2 operational ingestion.",
+            "p05-interferometer",
+            "claude-code-test",
+            "session-abc",
+        ],
+    )
+    assert len(capture_requests.calls) == 1
+    call = capture_requests.calls[0]
+    assert call["method"] == "POST"
+    assert call["path"] == "/interactions"
+    body = call["data"]
+    assert body["prompt"] == "what is p05's current focus"
+    assert body["response"].startswith("The current focus")
+    assert body["project"] == "p05-interferometer"
+    assert body["client"] == "claude-code-test"
+    assert body["session_id"] == "session-abc"
+    assert body["reinforce"] is True  # default
+
+
+def test_capture_sets_default_client_when_omitted(capture_requests, monkeypatch):
+    _run_client(
+        monkeypatch,
+        ["capture", "hi", "hello"],
+    )
+    call = capture_requests.calls[0]
+    assert call["data"]["client"] == "atocore-client"
+    assert call["data"]["project"] == ""
+    assert call["data"]["reinforce"] is True
+
+
+def test_capture_accepts_reinforce_false(capture_requests, monkeypatch):
+    _run_client(
+        monkeypatch,
+        ["capture", "prompt", "response", "p05", "claude", "sess", "false"],
+    )
+    call = capture_requests.calls[0]
+    assert call["data"]["reinforce"] is False
+
+
+# ---------------------------------------------------------------------------
+# extract
+# ---------------------------------------------------------------------------
+
+
+def test_extract_default_is_preview(capture_requests, monkeypatch):
+    _run_client(monkeypatch, ["extract", "abc-123"])
+    call = capture_requests.calls[0]
+    assert call["method"] == "POST"
+    assert call["path"] == "/interactions/abc-123/extract"
+    assert call["data"] == {"persist": False}
+
+
+def test_extract_persist_true(capture_requests, monkeypatch):
+    _run_client(monkeypatch, ["extract", "abc-123", "true"])
+    call = capture_requests.calls[0]
+    assert call["data"] == {"persist": True}
+
+
+def test_extract_url_encodes_interaction_id(capture_requests, monkeypatch):
+    _run_client(monkeypatch, ["extract", "abc/def"])
+    call = capture_requests.calls[0]
+    assert call["path"] == "/interactions/abc%2Fdef/extract"
+
+
+# ---------------------------------------------------------------------------
+# reinforce-interaction
+# ---------------------------------------------------------------------------
+
+
+def test_reinforce_interaction_posts_to_correct_path(capture_requests, monkeypatch):
+    _run_client(monkeypatch, ["reinforce-interaction", "int-xyz"])
+    call = capture_requests.calls[0]
+    assert call["method"] == "POST"
+    assert call["path"] == "/interactions/int-xyz/reinforce"
+    assert call["data"] == {}
+
+
+# ---------------------------------------------------------------------------
+# list-interactions
+# ---------------------------------------------------------------------------
+
+
+def test_list_interactions_no_filters(capture_requests, monkeypatch):
+    _run_client(monkeypatch, ["list-interactions"])
+    call = capture_requests.calls[0]
+    assert call["method"] == "GET"
+    assert call["path"] == "/interactions?limit=50"
+
+
+def test_list_interactions_with_project_filter(capture_requests, monkeypatch):
+    _run_client(monkeypatch, ["list-interactions", "p05-interferometer"])
+    call = capture_requests.calls[0]
+    assert "project=p05-interferometer" in call["path"]
+    assert "limit=50" in call["path"]
+
+
+def test_list_interactions_full_filter_set(capture_requests, monkeypatch):
+    _run_client(
+        monkeypatch,
+        [
+            "list-interactions",
+            "p05",
+            "sess-1",
+            "claude-code",
+            "2026-04-07T00:00:00Z",
+            "20",
+        ],
+    )
+    call = capture_requests.calls[0]
+    path = call["path"]
+    assert "project=p05" in path
+    assert "session_id=sess-1" in path
+    assert "client=claude-code" in path
+    # "since" is URL-encoded, so its ":" chars are escaped; match only the date prefix
+    assert "since=2026-04-07" in path
+    assert "limit=20" in path
+
+
+# ---------------------------------------------------------------------------
+# get-interaction
+# ---------------------------------------------------------------------------
+
+
+def test_get_interaction_fetches_by_id(capture_requests, monkeypatch):
+    _run_client(monkeypatch, ["get-interaction", "int-42"])
+    call = capture_requests.calls[0]
+    assert call["method"] == "GET"
+    assert call["path"] == "/interactions/int-42"
+
+
+# ---------------------------------------------------------------------------
+# queue
+# ---------------------------------------------------------------------------
+
+
+def test_queue_always_filters_by_candidate_status(capture_requests, monkeypatch):
+    _run_client(monkeypatch, ["queue"])
+    call = capture_requests.calls[0]
+    assert call["method"] == "GET"
+    assert call["path"].startswith("/memory?")
+    assert "status=candidate" in call["path"]
+    assert "limit=50" in call["path"]
+
+
+def test_queue_with_memory_type_and_project(capture_requests, monkeypatch):
+    _run_client(monkeypatch, ["queue", "adaptation", "p05-interferometer", "10"])
+    call = capture_requests.calls[0]
+    path = call["path"]
+    assert "status=candidate" in path
+    assert "memory_type=adaptation" in path
+    assert "project=p05-interferometer" in path
+    assert "limit=10" in path
+
+
+def test_queue_limit_coercion(capture_requests, monkeypatch):
+    """limit is typed as int by argparse so string '25' becomes 25."""
+    _run_client(monkeypatch, ["queue", "", "", "25"])
+    call = capture_requests.calls[0]
+    assert "limit=25" in call["path"]
+
+
+# ---------------------------------------------------------------------------
+# promote / reject
+# ---------------------------------------------------------------------------
+
+
+def test_promote_posts_to_memory_promote_path(capture_requests, monkeypatch):
+    _run_client(monkeypatch, ["promote", "mem-abc"])
+    call = capture_requests.calls[0]
+    assert call["method"] == "POST"
+    assert call["path"] == "/memory/mem-abc/promote"
+    assert call["data"] == {}
+
+
+def test_reject_posts_to_memory_reject_path(capture_requests, monkeypatch):
+    _run_client(monkeypatch, ["reject", "mem-xyz"])
+    call = capture_requests.calls[0]
+    assert call["method"] == "POST"
+    assert call["path"] == "/memory/mem-xyz/reject"
+    assert call["data"] == {}
+
+
+def test_promote_url_encodes_memory_id(capture_requests, monkeypatch):
+    _run_client(monkeypatch, ["promote", "mem/with/slashes"])
+    call = capture_requests.calls[0]
+    assert "mem%2Fwith%2Fslashes" in call["path"]
+
+
+# ---------------------------------------------------------------------------
+# end-to-end: ensure the Phase 9 loop can be driven entirely through
+# the client
+# ---------------------------------------------------------------------------
+
+
+def test_phase9_full_loop_via_client_shape(capture_requests, monkeypatch):
+    """Simulate the full capture -> extract -> queue -> promote cycle.
+
+    This doesn't exercise real HTTP — each call is intercepted by
+    the mock request. But it proves every step of the Phase 9 loop
+    is reachable through the shared client, which is the whole point
+    of the codex-step-3 work.
+    """
+    # Step 1: capture
+    _run_client(
+        monkeypatch,
+        [
+            "capture",
+            "what about GF-PTFE for lateral support",
+            "## Decision: use GF-PTFE pads for thermal stability",
+            "p05-interferometer",
+        ],
+    )
+    # Step 2: extract candidates (preview)
+    _run_client(monkeypatch, ["extract", "fake-interaction-id"])
+    # Step 3: extract and persist
+    _run_client(monkeypatch, ["extract", "fake-interaction-id", "true"])
+    # Step 4: list the review queue
+    _run_client(monkeypatch, ["queue"])
+    # Step 5: promote a candidate
+    _run_client(monkeypatch, ["promote", "fake-memory-id"])
+    # Step 6: reject another
+    _run_client(monkeypatch, ["reject", "fake-memory-id-2"])
+
+    methods_and_paths = [
+        (c["method"], c["path"]) for c in capture_requests.calls
+    ]
+    assert methods_and_paths == [
+        ("POST", "/interactions"),
+        ("POST", "/interactions/fake-interaction-id/extract"),
+        ("POST", "/interactions/fake-interaction-id/extract"),
+        ("GET", "/memory?status=candidate&limit=50"),
+        ("POST", "/memory/fake-memory-id/promote"),
+        ("POST", "/memory/fake-memory-id-2/reject"),
+    ]
--- a/tests/test_backup.py
+++ b/tests/test_backup.py
@@ -1,14 +1,18 @@
-"""Tests for runtime backup creation."""
+"""Tests for runtime backup creation, restore, and retention cleanup."""

 import json
 import sqlite3
-from datetime import UTC, datetime
+from datetime import UTC, datetime, timedelta
+
+import pytest

 import atocore.config as config
 from atocore.models.database import init_db
 from atocore.ops.backup import (
+    cleanup_old_backups,
    create_runtime_backup,
    list_runtime_backups,
+    restore_runtime_backup,
    validate_backup,
 )

@@ -156,3 +160,531 @@ def test_create_runtime_backup_handles_missing_registry(tmp_path, monkeypatch):
        config.settings = original_settings

    assert result["registry_snapshot_path"] == ""
+
+
+def test_restore_refuses_without_confirm_service_stopped(tmp_path, monkeypatch):
+    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
+    monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
+    monkeypatch.setenv(
+        "ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
+    )
+
+    original_settings = config.settings
+    try:
+        config.settings = config.Settings()
+        init_db()
+        create_runtime_backup(datetime(2026, 4, 9, 10, 0, 0, tzinfo=UTC))
+
+        with pytest.raises(RuntimeError, match="confirm_service_stopped"):
+            restore_runtime_backup("20260409T100000Z")
+    finally:
+        config.settings = original_settings
+
+
+def test_restore_raises_on_invalid_backup(tmp_path, monkeypatch):
+    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
+    monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
+    monkeypatch.setenv(
+        "ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
+    )
+
+    original_settings = config.settings
+    try:
+        config.settings = config.Settings()
+        init_db()
+        with pytest.raises(RuntimeError, match="failed validation"):
+            restore_runtime_backup(
+                "20250101T000000Z", confirm_service_stopped=True
+            )
+    finally:
+        config.settings = original_settings
+
+
+def test_restore_round_trip_reverses_post_backup_mutations(tmp_path, monkeypatch):
+    """Canonical drill: snapshot -> mutate -> restore -> mutation gone."""
+    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
+    monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
+    monkeypatch.setenv(
+        "ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
+    )
+
+    registry_path = tmp_path / "config" / "project-registry.json"
+    registry_path.parent.mkdir(parents=True)
+    registry_path.write_text(
+        '{"projects":[{"id":"p01-example","aliases":[],'
+        '"ingest_roots":[{"source":"vault","subpath":"incoming/projects/p01-example"}]}]}\n',
+        encoding="utf-8",
+    )
+
+    original_settings = config.settings
+    try:
+        config.settings = config.Settings()
+        init_db()
+
+        # 1. Seed baseline state that should SURVIVE the restore.
+        with sqlite3.connect(str(config.settings.db_path)) as conn:
+            conn.execute(
+                "INSERT INTO projects (id, name) VALUES (?, ?)",
+                ("p01", "Baseline Project"),
+            )
+            conn.commit()
+
+        # 2. Create the backup we're going to restore to.
+        create_runtime_backup(datetime(2026, 4, 9, 11, 0, 0, tzinfo=UTC))
+        stamp = "20260409T110000Z"
+
+        # 3. Mutate live state AFTER the backup — this is what the
+        #    restore should reverse.
+        with sqlite3.connect(str(config.settings.db_path)) as conn:
+            conn.execute(
+                "INSERT INTO projects (id, name) VALUES (?, ?)",
+                ("p99", "Post Backup Mutation"),
+            )
+            conn.commit()
+
+        # Confirm the mutation is present before restore.
+        with sqlite3.connect(str(config.settings.db_path)) as conn:
+            row = conn.execute(
+                "SELECT name FROM projects WHERE id = ?", ("p99",)
+            ).fetchone()
+            assert row is not None and row[0] == "Post Backup Mutation"
+
+        # 4. Restore — the drill procedure. Explicit confirm_service_stopped.
+        result = restore_runtime_backup(
+            stamp, confirm_service_stopped=True
+        )
+
+        # 5. Verify restore report
+        assert result["stamp"] == stamp
+        assert result["db_restored"] is True
+        assert result["registry_restored"] is True
+        assert result["restored_integrity_ok"] is True
+        assert result["pre_restore_snapshot"] is not None
+
+        # 6. Verify live state reflects the restore: baseline survived,
+        #    post-backup mutation is gone.
+        with sqlite3.connect(str(config.settings.db_path)) as conn:
+            baseline = conn.execute(
+                "SELECT name FROM projects WHERE id = ?", ("p01",)
+            ).fetchone()
+            mutation = conn.execute(
+                "SELECT name FROM projects WHERE id = ?", ("p99",)
+            ).fetchone()
+        assert baseline is not None and baseline[0] == "Baseline Project"
+        assert mutation is None
+
+        # 7. Pre-restore safety snapshot DOES contain the mutation —
+        #    it captured current state before overwriting. This is the
+        #    reversibility guarantee: the operator can restore back to
+        #    it if the restore itself was a mistake.
+        pre_stamp = result["pre_restore_snapshot"]
+        pre_validation = validate_backup(pre_stamp)
+        assert pre_validation["valid"] is True
+        pre_db_path = pre_validation["metadata"]["db_snapshot_path"]
+        with sqlite3.connect(pre_db_path) as conn:
+            pre_mutation = conn.execute(
+                "SELECT name FROM projects WHERE id = ?", ("p99",)
+            ).fetchone()
+        assert pre_mutation is not None and pre_mutation[0] == "Post Backup Mutation"
+    finally:
+        config.settings = original_settings
+
+
+def test_restore_round_trip_with_chroma(tmp_path, monkeypatch):
+    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
+    monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
+    monkeypatch.setenv(
+        "ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
+    )
+
+    original_settings = config.settings
+    try:
+        config.settings = config.Settings()
+        init_db()
+
+        # Seed baseline chroma state that should survive restore.
+        chroma_dir = config.settings.chroma_path
+        (chroma_dir / "coll-a").mkdir(parents=True, exist_ok=True)
+        (chroma_dir / "coll-a" / "baseline.bin").write_bytes(b"baseline")
+
+        create_runtime_backup(
+            datetime(2026, 4, 9, 12, 0, 0, tzinfo=UTC), include_chroma=True
+        )
+        stamp = "20260409T120000Z"
+
+        # Mutate chroma after backup: add a file + remove baseline.
+        (chroma_dir / "coll-a" / "post_backup.bin").write_bytes(b"post")
+        (chroma_dir / "coll-a" / "baseline.bin").unlink()
+
+        result = restore_runtime_backup(
+            stamp, confirm_service_stopped=True
+        )
+
+        assert result["chroma_restored"] is True
+        assert (chroma_dir / "coll-a" / "baseline.bin").exists()
+        assert not (chroma_dir / "coll-a" / "post_backup.bin").exists()
+    finally:
+        config.settings = original_settings
+
+
+def test_restore_chroma_does_not_unlink_destination_directory(tmp_path, monkeypatch):
+    """Regression: restore must not rmtree the chroma dir itself.
+
+    In a Dockerized deployment the chroma dir is a bind-mounted
+    volume. Calling shutil.rmtree on a mount point raises
+    ``OSError [Errno 16] Device or resource busy``, which broke the
+    first real Dalidou drill on 2026-04-09. The fix clears the
+    directory's CONTENTS and copytree(dirs_exist_ok=True) into it,
+    keeping the directory inode (and any bind mount) intact.
+
+    This test captures the inode of the destination directory before
+    and after restore and asserts they match — that's what a
+    bind-mounted chroma dir would also see.
+    """
+    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
+    monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
+    monkeypatch.setenv(
+        "ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
+    )
+
+    original_settings = config.settings
+    try:
+        config.settings = config.Settings()
+        init_db()
+
+        chroma_dir = config.settings.chroma_path
+        (chroma_dir / "coll-a").mkdir(parents=True, exist_ok=True)
+        (chroma_dir / "coll-a" / "baseline.bin").write_bytes(b"baseline")
+
+        create_runtime_backup(
+            datetime(2026, 4, 9, 15, 0, 0, tzinfo=UTC), include_chroma=True
+        )
+
+        # Capture the destination directory's stat signature before restore.
+        chroma_stat_before = chroma_dir.stat()
+
+        # Add a file post-backup so restore has work to do.
+        (chroma_dir / "coll-a" / "post_backup.bin").write_bytes(b"post")
+
+        restore_runtime_backup(
+            "20260409T150000Z", confirm_service_stopped=True
+        )
+
+        # Directory still exists (would have failed on mount point) and
+        # its st_ino matches — the mount itself wasn't unlinked.
+        assert chroma_dir.exists()
+        chroma_stat_after = chroma_dir.stat()
+        assert chroma_stat_before.st_ino == chroma_stat_after.st_ino, (
+            "chroma directory inode changed — restore recreated the "
+            "directory instead of clearing its contents; this would "
+            "fail on a Docker bind-mounted volume"
+        )
+        # And the contents did actually get restored.
+        assert (chroma_dir / "coll-a" / "baseline.bin").exists()
+        assert not (chroma_dir / "coll-a" / "post_backup.bin").exists()
+    finally:
+        config.settings = original_settings
+
+
+def test_restore_skips_pre_snapshot_when_requested(tmp_path, monkeypatch):
+    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
+    monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
+    monkeypatch.setenv(
+        "ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
+    )
+
+    original_settings = config.settings
+    try:
+        config.settings = config.Settings()
+        init_db()
+        create_runtime_backup(datetime(2026, 4, 9, 13, 0, 0, tzinfo=UTC))
+
+        before_count = len(list_runtime_backups())
+
+        result = restore_runtime_backup(
+            "20260409T130000Z",
+            confirm_service_stopped=True,
+            pre_restore_snapshot=False,
+        )
+
+        after_count = len(list_runtime_backups())
+        assert result["pre_restore_snapshot"] is None
+        assert after_count == before_count
+    finally:
+        config.settings = original_settings
+
+
+def test_create_backup_includes_validation_fields(tmp_path, monkeypatch):
+    """Task B: create_runtime_backup auto-validates and reports result."""
+    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
+    monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
+    monkeypatch.setenv(
+        "ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
+    )
+
+    original_settings = config.settings
+    try:
+        config.settings = config.Settings()
+        init_db()
+        result = create_runtime_backup(datetime(2026, 4, 11, 10, 0, 0, tzinfo=UTC))
+    finally:
+        config.settings = original_settings
+
+    assert "validated" in result
+    assert "validation_errors" in result
+    assert result["validated"] is True
+    assert result["validation_errors"] == []
+
+
+def test_create_backup_validation_failure_does_not_raise(tmp_path, monkeypatch):
+    """Task B: if post-backup validation fails, backup still returns metadata."""
+    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
+    monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
+    monkeypatch.setenv(
+        "ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
+    )
+
+    def _broken_validate(stamp):
+        return {"valid": False, "errors": ["db_missing", "metadata_missing"]}
+
+    original_settings = config.settings
+    try:
+        config.settings = config.Settings()
+        init_db()
+        monkeypatch.setattr("atocore.ops.backup.validate_backup", _broken_validate)
+        result = create_runtime_backup(datetime(2026, 4, 11, 11, 0, 0, tzinfo=UTC))
+    finally:
+        config.settings = original_settings
+
+    # Should NOT have raised — backup still returned metadata
+    assert result["validated"] is False
+    assert result["validation_errors"] == ["db_missing", "metadata_missing"]
+    # Core backup fields still present
+    assert "db_snapshot_path" in result
+    assert "created_at" in result
+
+
+def test_restore_cleans_stale_wal_sidecars(tmp_path, monkeypatch):
+    """Stale WAL/SHM sidecars must not carry bytes past the restore.
+
+    Note: after restore runs, PRAGMA integrity_check reopens the
+    restored db which may legitimately recreate a fresh -wal. So we
+    assert that the STALE byte marker no longer appears in either
+    sidecar, not that the files are absent.
+    """
+    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
+    monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
+    monkeypatch.setenv(
+        "ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
+    )
+
+    original_settings = config.settings
+    try:
+        config.settings = config.Settings()
+        init_db()
+        create_runtime_backup(datetime(2026, 4, 9, 14, 0, 0, tzinfo=UTC))
+
+        # Write fake stale WAL/SHM next to the live db with an
+        # unmistakable marker.
+        target_db = config.settings.db_path
+        wal = target_db.with_name(target_db.name + "-wal")
+        shm = target_db.with_name(target_db.name + "-shm")
+        stale_marker = b"STALE-SIDECAR-MARKER-DO-NOT-SURVIVE"
+        wal.write_bytes(stale_marker)
+        shm.write_bytes(stale_marker)
+        assert wal.exists() and shm.exists()
+
+        restore_runtime_backup(
+            "20260409T140000Z", confirm_service_stopped=True
+        )
+
+        # The restored db must pass integrity check (tested elsewhere);
+        # here we just confirm that no file next to it still contains
+        # the stale marker from the old live process.
+        for sidecar in (wal, shm):
+            if sidecar.exists():
+                assert stale_marker not in sidecar.read_bytes(), (
+                    f"{sidecar.name} still carries stale marker"
+                )
+    finally:
+        config.settings = original_settings
+
+
+# ---------------------------------------------------------------------------
+# Task C: Backup retention cleanup
+# ---------------------------------------------------------------------------
+
+
+def _setup_cleanup_env(tmp_path, monkeypatch):
+    """Helper: configure env, init db, return (original_settings, snapshots_root)."""
+    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
+    monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
+    monkeypatch.setenv(
+        "ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
+    )
+    original = config.settings
+    config.settings = config.Settings()
+    init_db()
+    snapshots_root = config.settings.resolved_backup_dir / "snapshots"
+    snapshots_root.mkdir(parents=True, exist_ok=True)
+    return original, snapshots_root
+
+
+def _seed_snapshots(snapshots_root, dates):
+    """Create minimal valid snapshot dirs for the given datetimes."""
+    for dt in dates:
+        stamp = dt.strftime("%Y%m%dT%H%M%SZ")
+        snap_dir = snapshots_root / stamp
+        db_dir = snap_dir / "db"
+        db_dir.mkdir(parents=True, exist_ok=True)
+        db_path = db_dir / "atocore.db"
+        conn = sqlite3.connect(str(db_path))
+        conn.execute("CREATE TABLE IF NOT EXISTS _marker (id INTEGER)")
+        conn.close()
+        metadata = {
+            "created_at": dt.isoformat(),
+            "backup_root": str(snap_dir),
+            "db_snapshot_path": str(db_path),
+            "db_size_bytes": db_path.stat().st_size,
+            "registry_snapshot_path": "",
+            "chroma_snapshot_path": "",
+            "chroma_snapshot_bytes": 0,
+            "chroma_snapshot_files": 0,
+            "chroma_snapshot_included": False,
+            "vector_store_note": "",
+        }
+        (snap_dir / "backup-metadata.json").write_text(
+            json.dumps(metadata, indent=2) + "\n", encoding="utf-8"
+        )
+
+
+def test_cleanup_empty_dir(tmp_path, monkeypatch):
+    original, _ = _setup_cleanup_env(tmp_path, monkeypatch)
+    try:
+        result = cleanup_old_backups()
+        assert result["kept"] == 0
+        assert result["would_delete"] == 0
+        assert result["dry_run"] is True
+    finally:
+        config.settings = original
+
+
+def test_cleanup_dry_run_identifies_old_snapshots(tmp_path, monkeypatch):
+    original, snapshots_root = _setup_cleanup_env(tmp_path, monkeypatch)
+    try:
+        # 10 daily snapshots Apr 2-11 (avoiding Apr 1 which is monthly).
+        base = datetime(2026, 4, 2, 12, 0, 0, tzinfo=UTC)
+        dates = [base + timedelta(days=i) for i in range(10)]
+        _seed_snapshots(snapshots_root, dates)
+
+        result = cleanup_old_backups()
+        assert result["dry_run"] is True
+        # Retention keeps the 7 newest daily snapshots: Apr 5-11.
+        # Apr 5 is a Sunday, so it also qualifies as a weekly
+        # snapshot, but it is already counted in the daily set and
+        # adds nothing to the kept total.
+        # The remaining snapshots (Apr 2, 3, 4) are neither a Sunday
+        # nor the 1st of a month, so all 3 are marked for deletion.
+        # Dry run: all 10 snapshot directories must remain on disk.
+        assert result["kept"] == 7
+        assert result["would_delete"] == 3
+        assert len(list(snapshots_root.iterdir())) == 10
+    finally:
+        config.settings = original
+
+
+def test_cleanup_confirm_deletes(tmp_path, monkeypatch):
+    original, snapshots_root = _setup_cleanup_env(tmp_path, monkeypatch)
+    try:
+        base = datetime(2026, 4, 2, 12, 0, 0, tzinfo=UTC)
+        dates = [base + timedelta(days=i) for i in range(10)]
+        _seed_snapshots(snapshots_root, dates)
+
+        result = cleanup_old_backups(confirm=True)
+        assert result["dry_run"] is False
+        assert result["deleted"] == 3
+        assert result["kept"] == 7
+        assert len(list(snapshots_root.iterdir())) == 7
+    finally:
+        config.settings = original
+
+
+def test_cleanup_keeps_last_7_daily(tmp_path, monkeypatch):
+    """Exactly 7 snapshots on different days → all kept."""
+    original, snapshots_root = _setup_cleanup_env(tmp_path, monkeypatch)
+    try:
+        base = datetime(2026, 4, 5, 12, 0, 0, tzinfo=UTC)
+        dates = [base + timedelta(days=i) for i in range(7)]
+        _seed_snapshots(snapshots_root, dates)
+
+        result = cleanup_old_backups()
+        assert result["kept"] == 7
+        assert result["would_delete"] == 0
+    finally:
+        config.settings = original
+
+
+def test_cleanup_keeps_sunday_weekly(tmp_path, monkeypatch):
+    """Snapshots on Sundays outside the 7-day window are kept as weekly."""
+    original, snapshots_root = _setup_cleanup_env(tmp_path, monkeypatch)
+    try:
+        # 7 daily snapshots covering Apr 5-11
+        base = datetime(2026, 4, 5, 12, 0, 0, tzinfo=UTC)
+        daily = [base + timedelta(days=i) for i in range(7)]
+
+        # 2 older Sunday snapshots
+        sun1 = datetime(2026, 3, 29, 12, 0, 0, tzinfo=UTC)  # Sunday
+        sun2 = datetime(2026, 3, 22, 12, 0, 0, tzinfo=UTC)  # Sunday
+        # A non-Sunday old snapshot that should be deleted
+        wed = datetime(2026, 3, 25, 12, 0, 0, tzinfo=UTC)   # Wednesday
+
+        _seed_snapshots(snapshots_root, daily + [sun1, sun2, wed])
+
+        result = cleanup_old_backups()
+        # 7 daily + 2 Sunday weekly = 9 kept, 1 Wednesday deleted
+        assert result["kept"] == 9
+        assert result["would_delete"] == 1
+    finally:
+        config.settings = original
+
+
+def test_cleanup_keeps_monthly_first(tmp_path, monkeypatch):
+    """Snapshots on the 1st of a month outside daily+weekly are kept as monthly."""
+    original, snapshots_root = _setup_cleanup_env(tmp_path, monkeypatch)
+    try:
+        # 7 daily in April 2026
+        base = datetime(2026, 4, 5, 12, 0, 0, tzinfo=UTC)
+        daily = [base + timedelta(days=i) for i in range(7)]
+
+        # Old monthly 1st snapshots
+        m1 = datetime(2026, 1, 1, 12, 0, 0, tzinfo=UTC)
+        m2 = datetime(2025, 12, 1, 12, 0, 0, tzinfo=UTC)
+        # Old non-1st, non-Sunday snapshot — should be deleted
+        old = datetime(2026, 1, 15, 12, 0, 0, tzinfo=UTC)
+
+        _seed_snapshots(snapshots_root, daily + [m1, m2, old])
+
+        result = cleanup_old_backups()
+        # 7 daily + 2 monthly = 9 kept, 1 deleted
+        assert result["kept"] == 9
+        assert result["would_delete"] == 1
+    finally:
+        config.settings = original
+
+
+def test_cleanup_unparseable_stamp_skipped(tmp_path, monkeypatch):
+    """Directories with unparseable names are ignored, not deleted."""
+    original, snapshots_root = _setup_cleanup_env(tmp_path, monkeypatch)
+    try:
+        base = datetime(2026, 4, 5, 12, 0, 0, tzinfo=UTC)
+        _seed_snapshots(snapshots_root, [base])
+
+        bad_dir = snapshots_root / "not-a-timestamp"
+        bad_dir.mkdir()
+
+        result = cleanup_old_backups(confirm=True)
+        assert result.get("unparseable") == ["not-a-timestamp"]
+        assert bad_dir.exists()
+        assert result["kept"] == 1
+    finally:
+        config.settings = original
--- a/tests/test_capture_stop.py
+++ b/tests/test_capture_stop.py
@@ -0,0 +1,249 @@
+"""Tests for deploy/hooks/capture_stop.py — Claude Code Stop hook."""
+
+from __future__ import annotations
+
+import json
+import os
+import sys
+import tempfile
+import textwrap
+from io import StringIO
+from pathlib import Path
+from unittest import mock
+
+import pytest
+
+# The hook script lives outside of the normal package tree, so import
+# it by manipulating sys.path.
+_HOOK_DIR = str(Path(__file__).resolve().parent.parent / "deploy" / "hooks")
+if _HOOK_DIR not in sys.path:
+    sys.path.insert(0, _HOOK_DIR)
+
+import capture_stop  # noqa: E402
+
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+def _write_transcript(tmp: Path, entries: list[dict]) -> str:
+    """Write a JSONL transcript and return the path."""
+    path = tmp / "transcript.jsonl"
+    with open(path, "w", encoding="utf-8") as f:
+        for entry in entries:
+            f.write(json.dumps(entry, ensure_ascii=False) + "\n")
+    return str(path)
+
+
+def _user_entry(content: str, *, is_meta: bool = False) -> dict:
+    return {
+        "type": "user",
+        "isMeta": is_meta,
+        "message": {"role": "user", "content": content},
+    }
+
+
+def _assistant_entry() -> dict:
+    return {
+        "type": "assistant",
+        "message": {
+            "role": "assistant",
+            "content": [{"type": "text", "text": "Sure, here's the answer."}],
+        },
+    }
+
+
+def _system_entry() -> dict:
+    return {"type": "system", "message": {"role": "system", "content": "system init"}}
+
+
+# ---------------------------------------------------------------------------
+# _extract_last_user_prompt
+# ---------------------------------------------------------------------------
+
+class TestExtractLastUserPrompt:
+    def test_returns_last_real_prompt(self, tmp_path):
+        path = _write_transcript(tmp_path, [
+            _user_entry("First prompt that is long enough to capture"),
+            _assistant_entry(),
+            _user_entry("Second prompt that should be the one we capture"),
+            _assistant_entry(),
+        ])
+        result = capture_stop._extract_last_user_prompt(path)
+        assert result == "Second prompt that should be the one we capture"
+
+    def test_skips_meta_messages(self, tmp_path):
+        path = _write_transcript(tmp_path, [
+            _user_entry("Real prompt that is definitely long enough"),
+            _user_entry("<local-command>some system stuff</local-command>"),
+            _user_entry("Meta message that looks real enough", is_meta=True),
+        ])
+        result = capture_stop._extract_last_user_prompt(path)
+        assert result == "Real prompt that is definitely long enough"
+
+    def test_skips_xml_content(self, tmp_path):
+        path = _write_transcript(tmp_path, [
+            _user_entry("Actual prompt from a real human user"),
+            _user_entry("<command-name>/help</command-name>"),
+        ])
+        result = capture_stop._extract_last_user_prompt(path)
+        assert result == "Actual prompt from a real human user"
+
+    def test_skips_short_messages(self, tmp_path):
+        path = _write_transcript(tmp_path, [
+            _user_entry("This prompt is long enough to be captured"),
+            _user_entry("yes"),  # too short
+        ])
+        result = capture_stop._extract_last_user_prompt(path)
+        assert result == "This prompt is long enough to be captured"
+
+    def test_handles_content_blocks(self, tmp_path):
+        entry = {
+            "type": "user",
+            "message": {
+                "role": "user",
+                "content": [
+                    {"type": "text", "text": "First paragraph of the prompt."},
+                    {"type": "text", "text": "Second paragraph continues here."},
+                ],
+            },
+        }
+        path = _write_transcript(tmp_path, [entry])
+        result = capture_stop._extract_last_user_prompt(path)
+        assert "First paragraph" in result
+        assert "Second paragraph" in result
+
+    def test_empty_transcript(self, tmp_path):
+        path = _write_transcript(tmp_path, [])
+        result = capture_stop._extract_last_user_prompt(path)
+        assert result == ""
+
+    def test_missing_file(self):
+        result = capture_stop._extract_last_user_prompt("/nonexistent/path.jsonl")
+        assert result == ""
+
+    def test_empty_path(self):
+        result = capture_stop._extract_last_user_prompt("")
+        assert result == ""
+
+
+# ---------------------------------------------------------------------------
+# _infer_project
+# ---------------------------------------------------------------------------
+
+class TestInferProject:
+    def test_empty_cwd(self):
+        assert capture_stop._infer_project("") == ""
+
+    def test_unknown_path(self):
+        assert capture_stop._infer_project("C:\\Users\\antoi\\random") == ""
+
+    def test_mapped_path(self):
+        with mock.patch.dict(capture_stop._PROJECT_PATH_MAP, {
+            "C:\\Users\\antoi\\gigabit": "p04-gigabit",
+        }):
+            result = capture_stop._infer_project("C:\\Users\\antoi\\gigabit\\src")
+            assert result == "p04-gigabit"
+
+
+# ---------------------------------------------------------------------------
+# _capture (integration-style, mocking HTTP)
+# ---------------------------------------------------------------------------
+
+class TestCapture:
+    def _hook_input(self, *, transcript_path: str = "", **overrides) -> str:
+        data = {
+            "session_id": "test-session-123",
+            "transcript_path": transcript_path,
+            "cwd": "C:\\Users\\antoi\\ATOCore",
+            "permission_mode": "default",
+            "hook_event_name": "Stop",
+            "last_assistant_message": "Here is the answer to your question about the code.",
+            "turn_number": 3,
+        }
+        data.update(overrides)
+        return json.dumps(data)
+
+    @mock.patch("capture_stop.urllib.request.urlopen")
+    def test_posts_to_atocore(self, mock_urlopen, tmp_path):
+        transcript = _write_transcript(tmp_path, [
+            _user_entry("Please explain how the backup system works in detail"),
+            _assistant_entry(),
+        ])
+        mock_resp = mock.MagicMock()
+        mock_resp.read.return_value = json.dumps({"id": "int-001", "status": "recorded"}).encode()
+        mock_urlopen.return_value = mock_resp
+
+        with mock.patch("sys.stdin", StringIO(self._hook_input(transcript_path=transcript))):
+            capture_stop._capture()
+
+        mock_urlopen.assert_called_once()
+        req = mock_urlopen.call_args[0][0]
+        body = json.loads(req.data.decode())
+        assert body["prompt"] == "Please explain how the backup system works in detail"
+        assert body["client"] == "claude-code"
+        assert body["session_id"] == "test-session-123"
+        assert body["reinforce"] is False
+
+    @mock.patch("capture_stop.urllib.request.urlopen")
+    def test_skips_when_disabled(self, mock_urlopen, tmp_path):
+        transcript = _write_transcript(tmp_path, [
+            _user_entry("A prompt that would normally be captured"),
+        ])
+        with mock.patch.dict(os.environ, {"ATOCORE_CAPTURE_DISABLED": "1"}):
+            with mock.patch("sys.stdin", StringIO(self._hook_input(transcript_path=transcript))):
+                capture_stop._capture()
+        mock_urlopen.assert_not_called()
+
+    @mock.patch("capture_stop.urllib.request.urlopen")
+    def test_skips_short_prompt(self, mock_urlopen, tmp_path):
+        transcript = _write_transcript(tmp_path, [
+            _user_entry("yes"),
+        ])
+        with mock.patch("sys.stdin", StringIO(self._hook_input(transcript_path=transcript))):
+            capture_stop._capture()
+        mock_urlopen.assert_not_called()
+
+    @mock.patch("capture_stop.urllib.request.urlopen")
+    def test_truncates_long_response(self, mock_urlopen, tmp_path):
+        transcript = _write_transcript(tmp_path, [
+            _user_entry("Tell me everything about the entire codebase architecture"),
+        ])
+        long_response = "x" * 60_000
+        mock_resp = mock.MagicMock()
+        mock_resp.read.return_value = json.dumps({"id": "int-002"}).encode()
+        mock_urlopen.return_value = mock_resp
+
+        with mock.patch("sys.stdin", StringIO(
+            self._hook_input(transcript_path=transcript, last_assistant_message=long_response)
+        )):
+            capture_stop._capture()
+
+        req = mock_urlopen.call_args[0][0]
+        body = json.loads(req.data.decode())
+        assert len(body["response"]) <= capture_stop.MAX_RESPONSE_LENGTH + 20
+        assert body["response"].endswith("[truncated]")
+
+    def test_main_never_raises(self):
+        """main() must always exit 0, even on garbage input."""
+        with mock.patch("sys.stdin", StringIO("not json at all")):
+            # Should not raise
+            capture_stop.main()
+
+    @mock.patch("capture_stop.urllib.request.urlopen")
+    def test_uses_atocore_url_env(self, mock_urlopen, tmp_path):
+        transcript = _write_transcript(tmp_path, [
+            _user_entry("Please help me with this particular problem in the code"),
+        ])
+        mock_resp = mock.MagicMock()
+        mock_resp.read.return_value = json.dumps({"id": "int-003"}).encode()
+        mock_urlopen.return_value = mock_resp
+
+        with mock.patch.dict(os.environ, {"ATOCORE_URL": "http://localhost:9999"}):
+            # Also patch the module-level ATOCORE_URL, in case it was read at import time
+            with mock.patch.object(capture_stop, "ATOCORE_URL", "http://localhost:9999"):
+                with mock.patch("sys.stdin", StringIO(self._hook_input(transcript_path=transcript))):
+                    capture_stop._capture()
+
+        req = mock_urlopen.call_args[0][0]
+        assert req.full_url == "http://localhost:9999/interactions"
--- a/tests/test_context_builder.py
+++ b/tests/test_context_builder.py
@@ -1,5 +1,8 @@
 """Tests for the context builder."""

+import json
+
+import atocore.config as config
 from atocore.context.builder import build_context, get_last_context_pack
 from atocore.context.project_state import init_project_state_schema, set_state
 from atocore.ingestion.pipeline import ingest_file
@@ -162,3 +165,89 @@ def test_no_project_state_without_hint(tmp_data_dir, sample_markdown):
    pack = build_context("What is AtoCore?")
    assert pack.project_state_chars == 0
    assert "--- Trusted Project State ---" not in pack.formatted_context
+
+
+def test_alias_hint_resolves_through_registry(tmp_data_dir, sample_markdown, monkeypatch):
+    """An alias hint like 'p05' should find project state stored under 'p05-interferometer'.
+
+    This is the regression test for the P1 finding from codex's review:
+    /context/build was previously doing an exact-name lookup that
+    silently dropped trusted project state when the caller passed an
+    alias instead of the canonical project id.
+    """
+    init_db()
+    init_project_state_schema()
+    ingest_file(sample_markdown)
+
+    # Stand up a minimal project registry that knows the aliases.
+    # The registry lives in a JSON file pointed to by
+    # ATOCORE_PROJECT_REGISTRY_PATH; the dataclass-driven loader picks
+    # it up on every call (no in-process cache to invalidate).
+    registry_path = tmp_data_dir / "project-registry.json"
+    registry_path.write_text(
+        json.dumps(
+            {
+                "projects": [
+                    {
+                        "id": "p05-interferometer",
+                        "aliases": ["p05", "interferometer"],
+                        "description": "P05 alias-resolution regression test",
+                        "ingest_roots": [
+                            {"source": "vault", "subpath": "incoming/projects/p05"}
+                        ],
+                    }
+                ]
+            }
+        ),
+        encoding="utf-8",
+    )
+    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
+    config.settings = config.Settings()
+
+    # Trusted state is stored under the canonical id (the way the
+    # /project/state endpoint always writes it).
+    set_state(
+        "p05-interferometer",
+        "status",
+        "next_focus",
+        "Wave 2 trusted-operational ingestion",
+    )
+
+    # The bug: pack with alias hint used to silently miss the state.
+    pack_with_alias = build_context("status?", project_hint="p05", budget=2000)
+    assert "Wave 2 trusted-operational ingestion" in pack_with_alias.formatted_context
+    assert pack_with_alias.project_state_chars > 0
+
+    # The canonical id should still work the same way.
+    pack_with_canonical = build_context(
+        "status?", project_hint="p05-interferometer", budget=2000
+    )
+    assert "Wave 2 trusted-operational ingestion" in pack_with_canonical.formatted_context
+
+    # A second alias should also resolve.
+    pack_with_other_alias = build_context(
+        "status?", project_hint="interferometer", budget=2000
+    )
+    assert "Wave 2 trusted-operational ingestion" in pack_with_other_alias.formatted_context
+
+
+def test_unknown_hint_falls_back_to_raw_lookup(tmp_data_dir, sample_markdown, monkeypatch):
+    """A hint that isn't in the registry should still try the raw name.
+
+    This preserves backwards compatibility with hand-curated
+    project_state entries that predate the project registry.
+    """
+    init_db()
+    init_project_state_schema()
+    ingest_file(sample_markdown)
+
+    # Empty registry — the hint won't resolve through it.
+    registry_path = tmp_data_dir / "project-registry.json"
+    registry_path.write_text('{"projects": []}', encoding="utf-8")
+    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
+    config.settings = config.Settings()
+
+    set_state("orphan-project", "status", "phase", "Solo run")
+
+    pack = build_context("status?", project_hint="orphan-project", budget=2000)
+    assert "Solo run" in pack.formatted_context
--- a/tests/test_database.py
+++ b/tests/test_database.py
@@ -47,3 +47,138 @@ def test_get_connection_uses_configured_timeout_value(tmp_path, monkeypatch):

    assert calls
    assert calls[0] == 2.5
+
+
+def test_init_db_upgrades_pre_phase9_schema_without_failing(tmp_path, monkeypatch):
+    """Regression test for the schema init ordering bug caught during
+    the first real Dalidou deploy (report from 2026-04-08).
+
+    Before the fix, SCHEMA_SQL contained CREATE INDEX statements that
+    referenced columns (memories.project, interactions.project,
+    interactions.session_id) added by _apply_migrations later in
+    init_db. On a fresh install this worked because CREATE TABLE
+    created the tables with the new columns before the CREATE INDEX
+    ran, but on UPGRADE from a pre-Phase-9 schema the CREATE TABLE
+    IF NOT EXISTS was a no-op and the CREATE INDEX hit
+    OperationalError: no such column.
+
+    This test seeds the tables with the OLD pre-Phase-9 shape then
+    calls init_db() and verifies that:
+
+    - init_db does not raise
+    - The new columns were added via _apply_migrations
+    - The new indexes exist
+
+    If the bug is reintroduced by moving a CREATE INDEX for a
+    migration column back into SCHEMA_SQL, this test will fail
+    with OperationalError before reaching the assertions.
+    """
+    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
+    original_settings = config.settings
+    try:
+        config.settings = config.Settings()
+
+        # Step 1: create the data dir and resolve the DB path
+        config.ensure_runtime_dirs()
+        db_path = config.settings.db_path
+
+        # Step 2: seed the DB with the old pre-Phase-9 shape. No
+        # project/last_referenced_at/reference_count on memories; no
+        # project/client/session_id/response/memories_used/chunks_used
+        # on interactions. We also need the prerequisite tables
+        # (projects, source_documents, source_chunks) because the
+        # memories table has an FK to source_chunks.
+        with sqlite3.connect(str(db_path)) as conn:
+            conn.executescript(
+                """
+                CREATE TABLE source_documents (
+                    id TEXT PRIMARY KEY,
+                    file_path TEXT UNIQUE NOT NULL,
+                    file_hash TEXT NOT NULL,
+                    title TEXT,
+                    doc_type TEXT DEFAULT 'markdown',
+                    tags TEXT DEFAULT '[]',
+                    ingested_at DATETIME DEFAULT CURRENT_TIMESTAMP,
+                    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
+                );
+
+                CREATE TABLE source_chunks (
+                    id TEXT PRIMARY KEY,
+                    document_id TEXT NOT NULL REFERENCES source_documents(id) ON DELETE CASCADE,
+                    chunk_index INTEGER NOT NULL,
+                    content TEXT NOT NULL,
+                    heading_path TEXT DEFAULT '',
+                    char_count INTEGER NOT NULL,
+                    metadata TEXT DEFAULT '{}',
+                    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
+                );
+
+                CREATE TABLE memories (
+                    id TEXT PRIMARY KEY,
+                    memory_type TEXT NOT NULL,
+                    content TEXT NOT NULL,
+                    source_chunk_id TEXT REFERENCES source_chunks(id),
+                    confidence REAL DEFAULT 1.0,
+                    status TEXT DEFAULT 'active',
+                    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
+                    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
+                );
+
+                CREATE TABLE projects (
+                    id TEXT PRIMARY KEY,
+                    name TEXT UNIQUE NOT NULL,
+                    description TEXT DEFAULT '',
+                    status TEXT DEFAULT 'active',
+                    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
+                    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
+                );
+
+                CREATE TABLE interactions (
+                    id TEXT PRIMARY KEY,
+                    prompt TEXT NOT NULL,
+                    context_pack TEXT DEFAULT '{}',
+                    response_summary TEXT DEFAULT '',
+                    project_id TEXT REFERENCES projects(id),
+                    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
+                );
+                """
+            )
+            conn.commit()
+
+        # Step 3: call init_db — this used to raise on the upgrade
+        # path. After the fix it should succeed.
+        init_db()
+
+        # Step 4: verify the migrations ran — Phase 9 columns present
+        with sqlite3.connect(str(db_path)) as conn:
+            conn.row_factory = sqlite3.Row
+            memories_cols = {
+                row["name"] for row in conn.execute("PRAGMA table_info(memories)")
+            }
+            interactions_cols = {
+                row["name"]
+                for row in conn.execute("PRAGMA table_info(interactions)")
+            }
+
+            assert "project" in memories_cols
+            assert "last_referenced_at" in memories_cols
+            assert "reference_count" in memories_cols
+
+            assert "project" in interactions_cols
+            assert "client" in interactions_cols
+            assert "session_id" in interactions_cols
+            assert "response" in interactions_cols
+            assert "memories_used" in interactions_cols
+            assert "chunks_used" in interactions_cols
+
+            # Step 5: verify the indexes on migration columns exist
+            index_rows = conn.execute(
+                "SELECT name FROM sqlite_master WHERE type='index' AND tbl_name IN ('memories','interactions')"
+            ).fetchall()
+            index_names = {row["name"] for row in index_rows}
+
+            assert "idx_memories_project" in index_names
+            assert "idx_interactions_project_name" in index_names
+            assert "idx_interactions_session" in index_names
+    finally:
+        config.settings = original_settings
--- a/tests/test_interactions.py
+++ b/tests/test_interactions.py
@@ -209,3 +209,96 @@ def test_list_interactions_endpoint_returns_summaries(tmp_data_dir):
    assert body["interactions"][0]["response_chars"] == 50
    # The list endpoint never includes the full response body
    assert "response" not in body["interactions"][0]
+
+
+# --- alias canonicalization on interaction capture/list -------------------
+
+
+def test_record_interaction_canonicalizes_project(project_registry):
+    """Capturing under an alias should store the canonical project id.
+
+    Regression for codex's P2 finding: reinforcement and extraction
+    query memories by interaction.project; if the captured project is
+    a raw alias they would silently miss memories stored under the
+    canonical id.
+    """
+    init_db()
+    project_registry(("p05-interferometer", ["p05", "interferometer"]))
+
+    interaction = record_interaction(
+        prompt="quick capture", response="response body", project="p05", reinforce=False
+    )
+    assert interaction.project == "p05-interferometer"
+
+    fetched = get_interaction(interaction.id)
+    assert fetched.project == "p05-interferometer"
+
+
+def test_list_interactions_canonicalizes_project_filter(project_registry):
+    init_db()
+    project_registry(("p06-polisher", ["p06", "polisher"]))
+
+    record_interaction(prompt="a", response="ra", project="p06-polisher", reinforce=False)
+    record_interaction(prompt="b", response="rb", project="polisher", reinforce=False)
+    record_interaction(prompt="c", response="rc", project="atocore", reinforce=False)
+
+    # Querying by an alias should still find both p06 captures
+    via_alias = list_interactions(project="p06")
+    via_canonical = list_interactions(project="p06-polisher")
+    assert len(via_alias) == 2
+    assert len(via_canonical) == 2
+    assert {i.prompt for i in via_alias} == {"a", "b"}
+
+
+# --- since filter format normalization ------------------------------------
+
+
+def test_list_interactions_since_accepts_iso_with_t_separator(tmp_data_dir):
+    init_db()
+    record_interaction(prompt="early", response="r", reinforce=False)
+    time.sleep(1.05)
+    pivot = record_interaction(prompt="late", response="r", reinforce=False)
+
+    # pivot.created_at is in storage format 'YYYY-MM-DD HH:MM:SS'.
+    # Build the equivalent ISO 8601 with 'T' that an external client
+    # would naturally send.
+    iso_with_t = pivot.created_at.replace(" ", "T")
+    items = list_interactions(since=iso_with_t)
+    assert any(i.id == pivot.id for i in items)
+    # The early row was recorded more than one second before the pivot,
+    # so it must be excluded; `since` is inclusive on the cutoff itself.
+    early_ids = {i.id for i in items if i.prompt == "early"}
+    assert early_ids == set()
+
+
+def test_list_interactions_since_accepts_z_suffix(tmp_data_dir):
+    init_db()
+    pivot = record_interaction(prompt="pivot", response="r", reinforce=False)
+    time.sleep(1.05)
+    after = record_interaction(prompt="after", response="r", reinforce=False)
+
+    iso_with_z = pivot.created_at.replace(" ", "T") + "Z"
+    items = list_interactions(since=iso_with_z)
+    ids = {i.id for i in items}
+    assert pivot.id in ids
+    assert after.id in ids
+
+
+def test_list_interactions_since_accepts_offset(tmp_data_dir):
+    init_db()
+    pivot = record_interaction(prompt="pivot", response="r", reinforce=False)
+    time.sleep(1.05)
+    after = record_interaction(prompt="after", response="r", reinforce=False)
+
+    iso_with_offset = pivot.created_at.replace(" ", "T") + "+00:00"
+    items = list_interactions(since=iso_with_offset)
+    assert any(i.id == after.id for i in items)
+
+
+def test_list_interactions_since_storage_format_still_works(tmp_data_dir):
+    """The bare storage format must still work for backwards compatibility."""
+    init_db()
+    pivot = record_interaction(prompt="pivot", response="r", reinforce=False)
+
+    items = list_interactions(since=pivot.created_at)
+    assert any(i.id == pivot.id for i in items)
--- a/tests/test_migrate_legacy_aliases.py
+++ b/tests/test_migrate_legacy_aliases.py
@@ -0,0 +1,802 @@
+"""Tests for scripts/migrate_legacy_aliases.py.
+
+The migration script closes the compatibility gap documented in
+docs/architecture/project-identity-canonicalization.md. These tests
+cover:
+
+- empty/clean database behavior
+- shadow projects detection
+- state rekey without collisions
+- state collision detection + apply refusal
+- memory rekey + supersession of duplicates
+- interaction rekey
+- end-to-end apply on a realistic shadow
+- idempotency (running twice produces the same final state)
+- report artifact is written
+- the pre-fix regression gap is actually closed after migration
+"""
+
+from __future__ import annotations
+
+import json
+import sqlite3
+import sys
+import uuid
+from pathlib import Path
+
+import pytest
+
+from atocore.context.project_state import (
+    get_state,
+    init_project_state_schema,
+)
+from atocore.models.database import init_db
+
+# Make scripts/ importable
+_REPO_ROOT = Path(__file__).resolve().parent.parent
+sys.path.insert(0, str(_REPO_ROOT / "scripts"))
+
+import migrate_legacy_aliases as mig  # noqa: E402
+
+
+# ---------------------------------------------------------------------------
+# Helpers that seed "legacy" rows the way they would have looked before fb6298a
+# ---------------------------------------------------------------------------
+
+
+def _open_db_connection():
+    """Open a direct SQLite connection to the test data dir's DB."""
+    import atocore.config as config
+
+    conn = sqlite3.connect(str(config.settings.db_path))
+    conn.row_factory = sqlite3.Row
+    conn.execute("PRAGMA foreign_keys = ON")
+    return conn
+
+
+def _seed_shadow_project(
+    conn: sqlite3.Connection, shadow_name: str
+) -> str:
+    """Insert a projects row keyed under an alias, like the old set_state would have."""
+    project_id = str(uuid.uuid4())
+    conn.execute(
+        "INSERT INTO projects (id, name, description) VALUES (?, ?, ?)",
+        (project_id, shadow_name, f"shadow row for {shadow_name}"),
+    )
+    conn.commit()
+    return project_id
+
+
+def _seed_state_row(
+    conn: sqlite3.Connection,
+    project_id: str,
+    category: str,
+    key: str,
+    value: str,
+    status: str = "active",
+) -> str:
+    row_id = str(uuid.uuid4())
+    conn.execute(
+        "INSERT INTO project_state "
+        "(id, project_id, category, key, value, source, confidence, status) "
+        "VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
+        (row_id, project_id, category, key, value, "legacy-test", 1.0, status),
+    )
+    conn.commit()
+    return row_id
+
+
+def _seed_memory_row(
+    conn: sqlite3.Connection,
+    memory_type: str,
+    content: str,
+    project: str,
+    status: str = "active",
+) -> str:
+    row_id = str(uuid.uuid4())
+    conn.execute(
+        "INSERT INTO memories "
+        "(id, memory_type, content, project, source_chunk_id, confidence, status) "
+        "VALUES (?, ?, ?, ?, ?, ?, ?)",
+        (row_id, memory_type, content, project, None, 1.0, status),
+    )
+    conn.commit()
+    return row_id
+
+
+def _seed_interaction_row(
+    conn: sqlite3.Connection, prompt: str, project: str
+) -> str:
+    row_id = str(uuid.uuid4())
+    conn.execute(
+        "INSERT INTO interactions "
+        "(id, prompt, context_pack, response_summary, response, "
+        " memories_used, chunks_used, client, session_id, project, created_at) "
+        "VALUES (?, ?, '{}', '', '', '[]', '[]', 'legacy-test', '', ?, '2026-04-01 12:00:00')",
+        (row_id, prompt, project),
+    )
+    conn.commit()
+    return row_id
+
+
+# ---------------------------------------------------------------------------
+# plan-building tests
+# ---------------------------------------------------------------------------
+
+
+@pytest.fixture(autouse=True)
+def _setup(tmp_data_dir):
+    init_db()
+    init_project_state_schema()
+
+
+def test_dry_run_on_empty_registry_reports_empty_plan(tmp_data_dir):
+    """Empty registry -> empty alias map -> empty plan."""
+    registry_path = tmp_data_dir / "empty-registry.json"
+    registry_path.write_text('{"projects": []}', encoding="utf-8")
+
+    conn = _open_db_connection()
+    try:
+        plan = mig.build_plan(conn, registry_path)
+    finally:
+        conn.close()
+
+    assert plan.alias_map == {}
+    assert plan.is_empty
+    assert not plan.has_collisions
+    assert plan.counts() == {
+        "shadow_projects": 0,
+        "state_rekey_rows": 0,
+        "state_collisions": 0,
+        "state_historical_drops": 0,
+        "memory_rekey_rows": 0,
+        "memory_supersede_rows": 0,
+        "interaction_rekey_rows": 0,
+    }
+
+
+def test_dry_run_on_clean_registered_db_reports_empty_plan(project_registry):
+    """A registry with projects but no legacy rows -> empty plan."""
+    registry_path = project_registry(
+        ("p05-interferometer", ["p05", "interferometer"])
+    )
+
+    conn = _open_db_connection()
+    try:
+        plan = mig.build_plan(conn, registry_path)
+    finally:
+        conn.close()
+
+    assert plan.alias_map != {}
+    assert plan.is_empty
+
+
+def test_dry_run_finds_shadow_project(project_registry):
+    registry_path = project_registry(
+        ("p05-interferometer", ["p05", "interferometer"])
+    )
+
+    conn = _open_db_connection()
+    try:
+        _seed_shadow_project(conn, "p05")
+        plan = mig.build_plan(conn, registry_path)
+    finally:
+        conn.close()
+
+    assert len(plan.shadow_projects) == 1
+    assert plan.shadow_projects[0].shadow_name == "p05"
+    assert plan.shadow_projects[0].canonical_project_id == "p05-interferometer"
+
+
+def test_dry_run_plans_state_rekey_without_collisions(project_registry):
+    registry_path = project_registry(
+        ("p05-interferometer", ["p05", "interferometer"])
+    )
+
+    conn = _open_db_connection()
+    try:
+        shadow_id = _seed_shadow_project(conn, "p05")
+        _seed_state_row(conn, shadow_id, "status", "next_focus", "Wave 1 ingestion")
+        _seed_state_row(conn, shadow_id, "decision", "lateral_support", "GF-PTFE")
+        plan = mig.build_plan(conn, registry_path)
+    finally:
+        conn.close()
+
+    assert len(plan.state_plans) == 1
+    sp = plan.state_plans[0]
+    assert len(sp.rows_to_rekey) == 2
+    assert sp.collisions == []
+    assert not plan.has_collisions
+
+
+def test_dry_run_detects_state_collision(project_registry):
+    """Shadow and canonical both have state under the same (category, key) with different values."""
+    registry_path = project_registry(
+        ("p05-interferometer", ["p05", "interferometer"])
+    )
+
+    conn = _open_db_connection()
+    try:
+        shadow_id = _seed_shadow_project(conn, "p05")
+        canonical_id = _seed_shadow_project(conn, "p05-interferometer")
+        _seed_state_row(conn, shadow_id, "status", "next_focus", "Wave 1")
+        _seed_state_row(
+            conn, canonical_id, "status", "next_focus", "Wave 2"
+        )
+        plan = mig.build_plan(conn, registry_path)
+    finally:
+        conn.close()
+
+    assert plan.has_collisions
+    collision = plan.state_plans[0].collisions[0]
+    assert collision["shadow"]["value"] == "Wave 1"
+    assert collision["canonical"]["value"] == "Wave 2"
+
+
+def test_dry_run_plans_memory_rekey_and_supersession(project_registry):
+    registry_path = project_registry(
+        ("p04-gigabit", ["p04", "gigabit"])
+    )
+
+    conn = _open_db_connection()
+    try:
+        # A clean memory under the alias that will just be rekeyed
+        _seed_memory_row(conn, "project", "clean rekey memory", "p04")
+        # A memory that collides with an existing canonical memory
+        _seed_memory_row(conn, "project", "duplicate content", "p04")
+        _seed_memory_row(conn, "project", "duplicate content", "p04-gigabit")
+        plan = mig.build_plan(conn, registry_path)
+    finally:
+        conn.close()
+
+    # There's exactly one memory plan (one alias matched)
+    assert len(plan.memory_plans) == 1
+    mp = plan.memory_plans[0]
+    # Two rows are candidates for rekey or supersession — one clean,
+    # one duplicate. The duplicate is handled via to_supersede; the
+    # other via rows_to_rekey.
+    total_affected = len(mp.rows_to_rekey) + len(mp.to_supersede)
+    assert total_affected == 2
+
+
+def test_dry_run_plans_interaction_rekey(project_registry):
+    registry_path = project_registry(
+        ("p06-polisher", ["p06", "polisher"])
+    )
+
+    conn = _open_db_connection()
+    try:
+        _seed_interaction_row(conn, "quick capture under alias", "polisher")
+        _seed_interaction_row(conn, "another alias-keyed row", "p06")
+        plan = mig.build_plan(conn, registry_path)
+    finally:
+        conn.close()
+
+    total = sum(len(p.rows_to_rekey) for p in plan.interaction_plans)
+    assert total == 2
+
+
+# ---------------------------------------------------------------------------
+# apply tests
+# ---------------------------------------------------------------------------
+
+
+def test_apply_refuses_on_state_collision(project_registry):
+    registry_path = project_registry(
+        ("p05-interferometer", ["p05", "interferometer"])
+    )
+
+    conn = _open_db_connection()
+    try:
+        shadow_id = _seed_shadow_project(conn, "p05")
+        canonical_id = _seed_shadow_project(conn, "p05-interferometer")
+        _seed_state_row(conn, shadow_id, "status", "next_focus", "Wave 1")
+        _seed_state_row(conn, canonical_id, "status", "next_focus", "Wave 2")
+
+        plan = mig.build_plan(conn, registry_path)
+        assert plan.has_collisions
+
+        with pytest.raises(mig.MigrationRefused):
+            mig.apply_plan(conn, plan)
+    finally:
+        conn.close()
+
+
+def test_apply_migrates_clean_shadow_end_to_end(project_registry):
+    """The happy path: one shadow project with clean state rows, rekey into a freshly-created canonical row, verify reachability via get_state."""
+    registry_path = project_registry(
+        ("p05-interferometer", ["p05", "interferometer"])
+    )
+
+    conn = _open_db_connection()
+    try:
+        shadow_id = _seed_shadow_project(conn, "p05")
+        _seed_state_row(
+            conn, shadow_id, "status", "next_focus", "Wave 1 ingestion"
+        )
+        _seed_state_row(
+            conn, shadow_id, "decision", "lateral_support", "GF-PTFE"
+        )
+
+        plan = mig.build_plan(conn, registry_path)
+        assert not plan.has_collisions
+        summary = mig.apply_plan(conn, plan)
+    finally:
+        conn.close()
+
+    assert summary["state_rows_rekeyed"] == 2
+    assert summary["shadow_projects_deleted"] == 1
+    assert summary["canonical_rows_created"] == 1
+
+    # The regression gap is now closed: the service layer can reach
+    # the state stored under the canonical id through either the
+    # alias or the canonical id itself.
+    via_alias = get_state("p05")
+    via_canonical = get_state("p05-interferometer")
+    assert len(via_alias) == 2
+    assert len(via_canonical) == 2
+    values = {entry.value for entry in via_canonical}
+    assert values == {"Wave 1 ingestion", "GF-PTFE"}
+
+
+def test_apply_drops_shadow_state_duplicate_without_collision(project_registry):
+    """Shadow and canonical both have the same (category, key, value) — shadow gets marked superseded rather than hitting the UNIQUE constraint."""
+    registry_path = project_registry(
+        ("p05-interferometer", ["p05", "interferometer"])
+    )
+
+    conn = _open_db_connection()
+    try:
+        shadow_id = _seed_shadow_project(conn, "p05")
+        canonical_id = _seed_shadow_project(conn, "p05-interferometer")
+        _seed_state_row(
+            conn, shadow_id, "status", "next_focus", "Wave 1 ingestion"
+        )
+        _seed_state_row(
+            conn, canonical_id, "status", "next_focus", "Wave 1 ingestion"
+        )
+
+        plan = mig.build_plan(conn, registry_path)
+        assert not plan.has_collisions
+        summary = mig.apply_plan(conn, plan)
+    finally:
+        conn.close()
+
+    assert summary["state_rows_merged_as_duplicate"] == 1
+
+    via_canonical = get_state("p05-interferometer")
+    # Exactly one active row survives
+    assert len(via_canonical) == 1
+    assert via_canonical[0].value == "Wave 1 ingestion"
+
+
+def test_apply_preserves_superseded_shadow_state_when_no_collision(project_registry):
+    """Regression test for the codex-flagged data-loss bug.
+
+    Before the fix, plan_state_migration only selected status='active'
+    rows. Any superseded or invalid row on the shadow project was
+    invisible to the plan and got silently cascade-deleted when the
+    shadow projects row was dropped at the end of apply. That's
+    exactly the kind of audit loss a cleanup migration must not cause.
+
+    This test seeds a shadow project with a superseded state row on
+    a triple the canonical project doesn't have, runs the migration,
+    and verifies the row survived and is now attached to the
+    canonical project (still with status='superseded').
+    """
+    registry_path = project_registry(
+        ("p05-interferometer", ["p05", "interferometer"])
+    )
+
+    conn = _open_db_connection()
+    try:
+        shadow_id = _seed_shadow_project(conn, "p05")
+        # Superseded row on a triple the canonical won't have
+        _seed_state_row(
+            conn,
+            shadow_id,
+            "status",
+            "historical_phase",
+            "Phase 0 legacy",
+            status="superseded",
+        )
+
+        plan = mig.build_plan(conn, registry_path)
+        assert not plan.has_collisions
+        summary = mig.apply_plan(conn, plan)
+    finally:
+        conn.close()
+
+    # The superseded row should have been rekeyed, not dropped
+    assert summary["state_rows_rekeyed"] == 1
+    assert summary["state_rows_historical_dropped"] == 0
+
+    # Verify via raw SQL that the row is now attached to the canonical
+    # projects row and still has status='superseded'
+    conn = _open_db_connection()
+    try:
+        row = conn.execute(
+            "SELECT ps.status, ps.value, p.name "
+            "FROM project_state ps JOIN projects p ON ps.project_id = p.id "
+            "WHERE ps.category = ? AND ps.key = ?",
+            ("status", "historical_phase"),
+        ).fetchone()
+    finally:
+        conn.close()
+
+    assert row is not None, "superseded shadow row was lost during migration"
+    assert row["status"] == "superseded"
+    assert row["value"] == "Phase 0 legacy"
+    assert row["name"] == "p05-interferometer"
+
+
+def test_apply_drops_shadow_inactive_row_when_canonical_holds_same_triple(project_registry):
+    """Shadow is inactive (superseded) and collides with an active canonical row.
+
+    The canonical wins by definition of the UPSERT schema. The shadow
+    row is recorded as a historical_drop in the plan so the operator
+    sees the audit loss, and the apply cascade-deletes it via the
+    shadow projects row. This is the unavoidable data-loss case
+    documented in the migration module docstring.
+    """
+    registry_path = project_registry(
+        ("p05-interferometer", ["p05", "interferometer"])
+    )
+
+    conn = _open_db_connection()
+    try:
+        shadow_id = _seed_shadow_project(conn, "p05")
+        canonical_id = _seed_shadow_project(conn, "p05-interferometer")
+
+        # Shadow has a superseded value on a triple where the canonical
+        # has a different active value. Can't preserve both: UNIQUE
+        # allows only one row per triple.
+        _seed_state_row(
+            conn,
+            shadow_id,
+            "status",
+            "next_focus",
+            "Old wave 1",
+            status="superseded",
+        )
+        _seed_state_row(
+            conn,
+            canonical_id,
+            "status",
+            "next_focus",
+            "Wave 2 trusted-operational",
+            status="active",
+        )
+
+        plan = mig.build_plan(conn, registry_path)
+        assert not plan.has_collisions  # not an active-vs-active collision
+        assert plan.counts()["state_historical_drops"] == 1
+
+        summary = mig.apply_plan(conn, plan)
+    finally:
+        conn.close()
+
+    assert summary["state_rows_historical_dropped"] == 1
+
+    # The canonical's active row survives unchanged
+    via_canonical = get_state("p05-interferometer")
+    active_next_focus = [
+        e
+        for e in via_canonical
+        if e.category == "status" and e.key == "next_focus"
+    ]
+    assert len(active_next_focus) == 1
+    assert active_next_focus[0].value == "Wave 2 trusted-operational"
+
+
+def test_apply_replaces_inactive_canonical_with_active_shadow(project_registry):
+    """Shadow is active, canonical has an inactive row at the same triple.
+
+    The shadow wins: canonical inactive row is deleted, shadow is
+    rekeyed into canonical's project_id. This covers the
+    cross-contamination case where the old alias path was used for
+    the live value while the canonical path had a stale row.
+    """
+    registry_path = project_registry(
+        ("p06-polisher", ["p06", "polisher"])
+    )
+
+    conn = _open_db_connection()
+    try:
+        shadow_id = _seed_shadow_project(conn, "p06")
+        canonical_id = _seed_shadow_project(conn, "p06-polisher")
+
+        # Canonical has a stale invalid row; shadow has the live value.
+        _seed_state_row(
+            conn,
+            canonical_id,
+            "decision",
+            "frame",
+            "Old frame (no longer current)",
+            status="invalid",
+        )
+        _seed_state_row(
+            conn,
+            shadow_id,
+            "decision",
+            "frame",
+            "kinematic mount frame",
+            status="active",
+        )
+
+        plan = mig.build_plan(conn, registry_path)
+        assert not plan.has_collisions
+        assert plan.counts()["state_historical_drops"] == 0
+
+        summary = mig.apply_plan(conn, plan)
+    finally:
+        conn.close()
+
+    assert summary["state_rows_replaced_inactive_canonical"] == 1
+
+    # The active shadow value now lives on the canonical row
+    via_canonical = get_state("p06-polisher")
+    frame_entries = [
+        e for e in via_canonical if e.category == "decision" and e.key == "frame"
+    ]
+    assert len(frame_entries) == 1
+    assert frame_entries[0].value == "kinematic mount frame"
+
+    # Confirm via raw SQL that the previously-inactive canonical row
+    # no longer exists
+    conn = _open_db_connection()
+    try:
+        stale = conn.execute(
+            "SELECT COUNT(*) AS c FROM project_state WHERE value = ?",
+            ("Old frame (no longer current)",),
+        ).fetchone()
+    finally:
+        conn.close()
+    assert stale["c"] == 0
+
+
+def test_apply_migrates_memories(project_registry):
+    registry_path = project_registry(
+        ("p04-gigabit", ["p04", "gigabit"])
+    )
+
+    conn = _open_db_connection()
+    try:
+        _seed_memory_row(conn, "project", "lateral support uses GF-PTFE", "p04")
+        _seed_memory_row(conn, "preference", "I prefer descriptive commits", "gigabit")
+        plan = mig.build_plan(conn, registry_path)
+        summary = mig.apply_plan(conn, plan)
+    finally:
+        conn.close()
+
+    assert summary["memory_rows_rekeyed"] == 2
+
+    # Both memories should now read as living under the canonical id
+    from atocore.memory.service import get_memories
+
+    rows = get_memories(project="p04-gigabit", limit=50)
+    contents = {m.content for m in rows}
+    assert "lateral support uses GF-PTFE" in contents
+    assert "I prefer descriptive commits" in contents
+
+
+def test_apply_migrates_interactions(project_registry):
+    registry_path = project_registry(
+        ("p06-polisher", ["p06", "polisher"])
+    )
+
+    conn = _open_db_connection()
+    try:
+        _seed_interaction_row(conn, "alias-keyed 1", "polisher")
+        _seed_interaction_row(conn, "alias-keyed 2", "p06")
+        plan = mig.build_plan(conn, registry_path)
+        summary = mig.apply_plan(conn, plan)
+    finally:
+        conn.close()
+
+    assert summary["interaction_rows_rekeyed"] == 2
+
+    from atocore.interactions.service import list_interactions
+
+    rows = list_interactions(project="p06-polisher", limit=50)
+    prompts = {i.prompt for i in rows}
+    assert prompts == {"alias-keyed 1", "alias-keyed 2"}
+
+
+def test_apply_is_idempotent(project_registry):
+    """Running apply twice produces the same final state as running it once."""
+    registry_path = project_registry(
+        ("p05-interferometer", ["p05", "interferometer"])
+    )
+
+    conn = _open_db_connection()
+    try:
+        shadow_id = _seed_shadow_project(conn, "p05")
+        _seed_state_row(conn, shadow_id, "status", "next_focus", "Wave 1")
+        _seed_memory_row(conn, "project", "m1", "p05")
+        _seed_interaction_row(conn, "i1", "p05")
+
+        # first apply
+        plan_a = mig.build_plan(conn, registry_path)
+        summary_a = mig.apply_plan(conn, plan_a)
+
+        # second apply: plan should be empty
+        plan_b = mig.build_plan(conn, registry_path)
+        assert plan_b.is_empty
+
+        # forcing a second apply on the empty plan via the function
+        # directly should also succeed as a no-op (caller normally
+        # has to pass --allow-empty through the CLI, but apply_plan
+        # itself doesn't enforce that — the refusal is in run())
+        summary_b = mig.apply_plan(conn, plan_b)
+    finally:
+        conn.close()
+
+    assert summary_a["state_rows_rekeyed"] == 1
+    assert summary_a["memory_rows_rekeyed"] == 1
+    assert summary_a["interaction_rows_rekeyed"] == 1
+    assert summary_b["state_rows_rekeyed"] == 0
+    assert summary_b["memory_rows_rekeyed"] == 0
+    assert summary_b["interaction_rows_rekeyed"] == 0
+
+
+def test_apply_refuses_with_integrity_errors(project_registry):
+    """If the projects table has two case-variant rows for the canonical id, refuse.
+
+    The projects.name column has a case-sensitive UNIQUE constraint,
+    so exact duplicates can't exist. But case-variant rows
+    ``p05-interferometer`` and ``P05-Interferometer`` can both
+    survive the UNIQUE constraint while both matching the
+    case-insensitive ``lower(name) = lower(?)`` lookup that the
+    migration uses to find the canonical row. That ambiguity
+    (which canonical row should dependents rekey into?) is exactly
+    the integrity failure the migration is guarding against.
+    """
+    registry_path = project_registry(
+        ("p05-interferometer", ["p05", "interferometer"])
+    )
+
+    conn = _open_db_connection()
+    try:
+        _seed_shadow_project(conn, "p05-interferometer")
+        _seed_shadow_project(conn, "P05-Interferometer")
+        plan = mig.build_plan(conn, registry_path)
+        assert plan.integrity_errors
+        with pytest.raises(mig.MigrationRefused):
+            mig.apply_plan(conn, plan)
+    finally:
+        conn.close()
+
+
+# ---------------------------------------------------------------------------
+# reporting tests
+# ---------------------------------------------------------------------------
+
+
+def test_plan_to_json_dict_is_serializable(project_registry):
+    registry_path = project_registry(
+        ("p05-interferometer", ["p05", "interferometer"])
+    )
+
+    conn = _open_db_connection()
+    try:
+        shadow_id = _seed_shadow_project(conn, "p05")
+        _seed_state_row(conn, shadow_id, "status", "next_focus", "Wave 1")
+        plan = mig.build_plan(conn, registry_path)
+    finally:
+        conn.close()
+
+    payload = mig.plan_to_json_dict(plan)
+    # Must be JSON-serializable
+    json_str = json.dumps(payload, default=str)
+    assert "p05-interferometer" in json_str
+    assert payload["counts"]["state_rekey_rows"] == 1
+
+
+def test_write_report_creates_file(tmp_path, project_registry):
+    registry_path = project_registry(
+        ("p05-interferometer", ["p05", "interferometer"])
+    )
+
+    conn = _open_db_connection()
+    try:
+        plan = mig.build_plan(conn, registry_path)
+    finally:
+        conn.close()
+
+    report_dir = tmp_path / "reports"
+    report_path = mig.write_report(
+        plan,
+        summary=None,
+        db_path=Path("/tmp/fake.db"),
+        registry_path=registry_path,
+        mode="dry-run",
+        report_dir=report_dir,
+    )
+    assert report_path.exists()
+    payload = json.loads(report_path.read_text(encoding="utf-8"))
+    assert payload["mode"] == "dry-run"
+    assert "plan" in payload
+
+
+def test_render_plan_text_on_empty_plan(project_registry):
+    registry_path = project_registry()  # empty
+    conn = _open_db_connection()
+    try:
+        plan = mig.build_plan(conn, registry_path)
+    finally:
+        conn.close()
+
+    text = mig.render_plan_text(plan)
+    assert "nothing to plan" in text.lower()
+
+
+def test_render_plan_text_on_collision(project_registry):
+    registry_path = project_registry(
+        ("p05-interferometer", ["p05"])
+    )
+
+    conn = _open_db_connection()
+    try:
+        shadow_id = _seed_shadow_project(conn, "p05")
+        canonical_id = _seed_shadow_project(conn, "p05-interferometer")
+        _seed_state_row(conn, shadow_id, "status", "phase", "A")
+        _seed_state_row(conn, canonical_id, "status", "phase", "B")
+        plan = mig.build_plan(conn, registry_path)
+    finally:
+        conn.close()
+
+    text = mig.render_plan_text(plan)
+    assert "COLLISION" in text.upper()
+    assert "REFUSE" in text.upper()
+
+
+# ---------------------------------------------------------------------------
+# gap-closed companion test — the flip side of
+# test_legacy_alias_keyed_state_is_invisible_until_migrated in
+# test_project_state.py. After running this migration, the legacy row
+# IS reachable via the canonical id.
+# ---------------------------------------------------------------------------
+
+
+def test_legacy_alias_gap_is_closed_after_migration(project_registry):
+    """End-to-end regression test for the canonicalization gap.
+
+    Simulates the exact scenario from
+    test_legacy_alias_keyed_state_is_invisible_until_migrated in
+    test_project_state.py — a shadow projects row with a state row
+    pointing at it. Runs the migration. Verifies the state is now
+    reachable via the canonical id.
+    """
+    registry_path = project_registry(
+        ("p05-interferometer", ["p05", "interferometer"])
+    )
+
+    conn = _open_db_connection()
+    try:
+        shadow_id = _seed_shadow_project(conn, "p05")
+        _seed_state_row(
+            conn, shadow_id, "status", "legacy_focus", "Wave 1 ingestion"
+        )
+
+        # Before migration: the legacy row is invisible to get_state
+        # (this is the documented gap, covered in test_project_state.py)
+        assert all(
+            entry.value != "Wave 1 ingestion" for entry in get_state("p05")
+        )
+        assert all(
+            entry.value != "Wave 1 ingestion"
+            for entry in get_state("p05-interferometer")
+        )
+
+        # Run the migration
+        plan = mig.build_plan(conn, registry_path)
+        mig.apply_plan(conn, plan)
+    finally:
+        conn.close()
+
+    # After migration: the row is reachable via canonical AND alias
+    via_canonical = get_state("p05-interferometer")
+    via_alias = get_state("p05")
+    assert any(e.value == "Wave 1 ingestion" for e in via_canonical)
+    assert any(e.value == "Wave 1 ingestion" for e in via_alias)
--- a/tests/test_project_state.py
+++ b/tests/test_project_state.py
@@ -131,3 +131,139 @@ def test_format_project_state():
 def test_format_empty():
     """Test formatting empty state."""
     assert format_project_state([]) == ""
+
+
+# --- Alias canonicalization regression tests --------------------------------
+
+
+def test_set_state_canonicalizes_alias(project_registry):
+    """Writing state via an alias should land under the canonical project id.
+
+    Regression for codex's P1 finding: previously /project/state with
+    project="p05" created a separate alias row that later context builds
+    (which canonicalize the hint) would never see.
+    """
+    project_registry(("p05-interferometer", ["p05", "interferometer"]))
+
+    set_state("p05", "status", "next_focus", "Wave 2 ingestion")
+
+    # The state must be reachable via every alias AND the canonical id
+    via_alias = get_state("p05")
+    via_canonical = get_state("p05-interferometer")
+    via_other_alias = get_state("interferometer")
+
+    assert len(via_alias) == 1
+    assert len(via_canonical) == 1
+    assert len(via_other_alias) == 1
+    # All three reads return the same row id (no fragmented duplicates)
+    assert via_alias[0].id == via_canonical[0].id == via_other_alias[0].id
+    assert via_canonical[0].value == "Wave 2 ingestion"
+
+
+def test_get_state_canonicalizes_alias_after_canonical_write(project_registry):
+    """Reading via an alias should find state written under the canonical id."""
+    project_registry(("p04-gigabit", ["p04", "gigabit"]))
+
+    set_state("p04-gigabit", "status", "phase", "Phase 1 baseline")
+    via_alias = get_state("gigabit")
+
+    assert len(via_alias) == 1
+    assert via_alias[0].value == "Phase 1 baseline"
+
+
+def test_invalidate_state_canonicalizes_alias(project_registry):
+    """Invalidating via an alias should hit the canonical row."""
+    project_registry(("p06-polisher", ["p06", "polisher"]))
+
+    set_state("p06-polisher", "decision", "frame", "kinematic mounts")
+    success = invalidate_state("polisher", "decision", "frame")
+
+    assert success is True
+    active = get_state("p06-polisher")
+    assert len(active) == 0
+
+
+def test_unregistered_project_state_still_works(project_registry):
+    """Hand-curated state for an unregistered project must still round-trip.
+
+    Backwards compatibility with state created before the project
+    registry existed: resolve_project_name returns the input unchanged
+    when the registry has no record, so the raw name is used as-is.
+    """
+    project_registry()  # empty registry
+
+    set_state("orphan-project", "status", "phase", "Standalone")
+    entries = get_state("orphan-project")
+    assert len(entries) == 1
+    assert entries[0].value == "Standalone"
+
+
+def test_legacy_alias_keyed_state_is_invisible_until_migrated(project_registry):
+    """Documents the compatibility gap from project-identity-canonicalization.md.
+
+    Rows that were written under a registered alias BEFORE the
+    canonicalization landed in fb6298a are stored in the projects
+    table under the alias name (not the canonical id). Every read
+    path now canonicalizes to the canonical id, so those legacy
+    rows become invisible.
+
+    This test simulates the legacy state by inserting a shadow
+    project row and a state row that points at it via raw SQL,
+    bypassing set_state() which now canonicalizes. Then it
+    verifies the canonicalized get_state() does NOT find the
+    legacy row.
+
+    When the legacy alias migration script lands (see the open
+    follow-ups in docs/architecture/project-identity-canonicalization.md),
+    this test must be inverted: after running the migration the
+    legacy state should be reachable via the canonical project,
+    not invisible. The migration is required before engineering
+    V1 ships.
+    """
+    import uuid
+
+    from atocore.models.database import get_connection
+
+    project_registry(("p05-interferometer", ["p05", "interferometer"]))
+
+    # Simulate a pre-fix legacy row by writing directly under the
+    # alias name. This is what the OLD set_state would have done
+    # before fb6298a added canonicalization.
+    legacy_project_id = str(uuid.uuid4())
+    legacy_state_id = str(uuid.uuid4())
+    with get_connection() as conn:
+        conn.execute(
+            "INSERT INTO projects (id, name, description) VALUES (?, ?, ?)",
+            (legacy_project_id, "p05", "shadow row created before canonicalization"),
+        )
+        conn.execute(
+            "INSERT INTO project_state "
+            "(id, project_id, category, key, value, source, confidence) "
+            "VALUES (?, ?, ?, ?, ?, ?, ?)",
+            (
+                legacy_state_id,
+                legacy_project_id,
+                "status",
+                "legacy_focus",
+                "Wave 1 ingestion",
+                "pre-canonicalization",
+                1.0,
+            ),
+        )
+
+    # The canonicalized read path looks under "p05-interferometer"
+    # and cannot see the legacy row. THIS IS THE GAP.
+    via_alias = get_state("p05")
+    via_canonical = get_state("p05-interferometer")
+    assert all(entry.value != "Wave 1 ingestion" for entry in via_alias)
+    assert all(entry.value != "Wave 1 ingestion" for entry in via_canonical)
+
+    # The legacy row is still in the database — it's just unreachable
+    # from the canonicalized read path. The migration script (open
+    # follow-up) is what closes the gap.
+    with get_connection() as conn:
+        row = conn.execute(
+            "SELECT value FROM project_state WHERE id = ?", (legacy_state_id,)
+        ).fetchone()
+    assert row is not None
+    assert row["value"] == "Wave 1 ingestion"
--- a/tests/test_reinforcement.py
+++ b/tests/test_reinforcement.py
@@ -6,6 +6,8 @@ from atocore.interactions.service import record_interaction
 from atocore.main import app
 from atocore.memory.reinforcement import (
     DEFAULT_CONFIDENCE_DELTA,
+    _stem,
+    _tokenize,
     reinforce_from_interaction,
 )
 from atocore.memory.service import (
@@ -314,3 +316,177 @@ def test_api_post_interactions_accepts_reinforce_false(tmp_data_dir):
     reloaded = [m for m in get_memories(memory_type="preference", limit=20) if m.id == mem.id][0]
     assert reloaded.confidence == 0.5
     assert reloaded.reference_count == 0
+
+
+# --- alias canonicalization end-to-end -------------------------------------
+
+
+def test_reinforcement_works_when_capture_uses_alias(project_registry):
+    """End-to-end: capture under an alias, seed memory under canonical id,
+    verify reinforcement still finds and bumps the memory.
+
+    Regression for codex's P2 finding: previously interaction.project
+    was stored verbatim and reinforcement queried memories using that
+    raw value, so capturing under "p05" while memories live under
+    "p05-interferometer" silently missed everything.
+    """
+    init_db()
+    project_registry(("p05-interferometer", ["p05", "interferometer"]))
+
+    # Seed an active memory under the CANONICAL id
+    mem = create_memory(
+        memory_type="project",
+        content="the lateral support pads use GF-PTFE for thermal stability",
+        project="p05-interferometer",
+        confidence=0.5,
+    )
+
+    # Capture an interaction under the ALIAS — this is the bug case
+    record_interaction(
+        prompt="status update",
+        response=(
+            "Quick note: the lateral support pads use GF-PTFE for thermal "
+            "stability and that's still the current selection."
+        ),
+        project="p05",
+    )
+
+    # The seeded memory should have been reinforced
+    reloaded = [
+        m
+        for m in get_memories(memory_type="project", project="p05-interferometer", limit=20)
+        if m.id == mem.id
+    ][0]
+    assert reloaded.confidence > 0.5
+    assert reloaded.reference_count == 1
+
+
+def test_get_memories_filter_by_alias(project_registry):
+    """Filtering memories by an alias should find rows stored under canonical."""
+    init_db()
+    project_registry(("p04-gigabit", ["p04", "gigabit"]))
+
+    create_memory(memory_type="project", content="m1", project="p04-gigabit")
+    create_memory(memory_type="project", content="m2", project="gigabit")
+
+    via_alias = get_memories(memory_type="project", project="p04")
+    via_canonical = get_memories(memory_type="project", project="p04-gigabit")
+
+    assert len(via_alias) == 2
+    assert len(via_canonical) == 2
+    assert {m.content for m in via_alias} == {"m1", "m2"}
+
+
+# --- token-overlap matcher: unit tests -------------------------------------
+
+
+def test_stem_folds_s_ed_ing():
+    assert _stem("prefers") == "prefer"
+    assert _stem("preferred") == "prefer"
+    assert _stem("services") == "service"
+    assert _stem("processing") == "process"
+    # Short words must not be over-stripped
+    assert _stem("red") == "red"  # 3 chars, don't strip "ed"
+    assert _stem("bus") == "bus"  # 3 chars, don't strip "s"
+    assert _stem("sing") == "sing"  # 4 chars, don't strip "ing"
+    assert _stem("being") == "being"  # 5 chars, "ing" strip leaves "be" (2) — too short
+
+
+def test_tokenize_removes_stop_words():
+    tokens = _tokenize("the quick brown fox jumps over the lazy dog")
+    assert "the" not in tokens
+    assert "quick" in tokens
+    assert "brown" in tokens
+    assert "fox" in tokens
+    assert "dog" in tokens
+    # "over" has len 4, not a stop word → kept (stemmed: "over")
+    assert "over" in tokens
+
+
+# --- token-overlap matcher: paraphrase matching ----------------------------
+
+
+def test_reinforce_matches_paraphrase_prefers_vs_prefer(tmp_data_dir):
+    """The canonical rebase case from phase9-first-real-use.md."""
+    init_db()
+    mem = create_memory(
+        memory_type="preference",
+        content="prefers rebase-based workflows because history stays linear",
+        confidence=0.5,
+    )
+    interaction = _make_interaction(
+        response=(
+            "I prefer rebase-based workflows because the history stays "
+            "linear and reviewers have an easier time."
+        ),
+    )
+    results = reinforce_from_interaction(interaction)
+    assert any(r.memory_id == mem.id for r in results)
+
+
+def test_reinforce_matches_paraphrase_with_articles_and_ed(tmp_data_dir):
+    init_db()
+    mem = create_memory(
+        memory_type="preference",
+        content="preferred structured logging across all backend services",
+        confidence=0.5,
+    )
+    interaction = _make_interaction(
+        response=(
+            "I set up structured logging across all the backend services, "
+            "which the team prefers for consistency."
+        ),
+    )
+    results = reinforce_from_interaction(interaction)
+    assert any(r.memory_id == mem.id for r in results)
+
+
+def test_reinforce_rejects_low_overlap(tmp_data_dir):
+    init_db()
+    mem = create_memory(
+        memory_type="preference",
+        content="always uses Python for data processing scripts",
+        confidence=0.5,
+    )
+    interaction = _make_interaction(
+        response=(
+            "The CI pipeline runs on Node.js and deploys to Kubernetes "
+            "using Helm charts."
+        ),
+    )
+    results = reinforce_from_interaction(interaction)
+    assert all(r.memory_id != mem.id for r in results)
+
+
+def test_reinforce_matches_at_70_percent_threshold(tmp_data_dir):
+    """Exactly 7 of 10 content tokens present → should match."""
+    init_db()
+    # After stop-word removal and stemming, this has 10 tokens:
+    # alpha, bravo, charlie, delta, echo, foxtrot, golf, hotel, india, juliet
+    mem = create_memory(
+        memory_type="preference",
+        content="alpha bravo charlie delta echo foxtrot golf hotel india juliet",
+        confidence=0.5,
+    )
+    # Echo 7 of 10 tokens (70%) plus some noise
+    interaction = _make_interaction(
+        response="alpha bravo charlie delta echo foxtrot golf noise words here",
+    )
+    results = reinforce_from_interaction(interaction)
+    assert any(r.memory_id == mem.id for r in results)
+
+
+def test_reinforce_rejects_below_70_percent(tmp_data_dir):
+    """Only 6 of 10 content tokens present (60%) → should NOT match."""
+    init_db()
+    mem = create_memory(
+        memory_type="preference",
+        content="alpha bravo charlie delta echo foxtrot golf hotel india juliet",
+        confidence=0.5,
+    )
+    # Echo 6 of 10 tokens (60%) plus noise
+    interaction = _make_interaction(
+        response="alpha bravo charlie delta echo foxtrot noise words here only",
+    )
+    results = reinforce_from_interaction(interaction)
+    assert all(r.memory_id != mem.id for r in results)