Compare commits
22 Commits
d0ff8b5738
...
codex/dali
| Author | SHA1 | Date | |
|---|---|---|---|
| | 58c744fd2f | | |
| | a34a7a995f | | |
| | 92fc250b54 | | |
| | 2d911909f8 | | |
| | 1a8fdf4225 | | |
| | 336208004c | | |
| | 03822389a1 | | |
| | be4099486c | | |
| | 2c0b214137 | | |
| | b492f5f7b0 | | |
| | e877e5b8ff | | |
| | fad30d5461 | | |
| | 261277fd51 | | |
| | 7e60f5a0e6 | | |
| | 1953e559f9 | | |
| | f521aab97b | | |
| | fb6298a9a1 | | |
| | f2372eff9e | | |
| | 78d4e979e5 | | |
| | d6ce6128cf | | |
| | 368adf2ebc | | |
| | a637017900 | | |
159
.claude/commands/atocore-context.md
Normal file
@@ -0,0 +1,159 @@
---
description: Pull a context pack from the live AtoCore service for the current prompt
argument-hint: <prompt text> [project-id]
---

You are about to enrich a user prompt with context from the live
AtoCore service. This is the daily-use entry point for AtoCore from
inside Claude Code.

The work happens via the **shared AtoCore operator client** at
`scripts/atocore_client.py`. That client is the canonical Python
backbone for stable AtoCore operations and is meant to be reused by
every LLM client (OpenClaw helper, future Codex skill, etc.) — see
`docs/architecture/llm-client-integration.md` for the layering. This
slash command is a thin Claude Code-specific frontend on top of it.

## Step 1 — parse the arguments

The user invoked `/atocore-context` with:

```
$ARGUMENTS
```

You need to figure out two things:

1. The **prompt text** — what AtoCore will retrieve context for
2. An **optional project hint** — used to scope retrieval to a
   specific project's trusted state and corpus

The user may have passed a project id or alias as the **last
whitespace-separated token**. Don't maintain a hardcoded list of
known aliases — let the shared client decide. Use this rule:

- Take the last token of `$ARGUMENTS`. Call it `MAYBE_HINT`.
- Run `python scripts/atocore_client.py detect-project "$MAYBE_HINT"`
  to ask the registry whether it's a known project id or alias.
  This call is cheap (it just hits `/projects` and does a regex
  match) and inherits the client's fail-open behavior.
- If the response has a non-null `matched_project`, the last
  token was an explicit project hint. `PROMPT_TEXT` is everything
  except the last token; `PROJECT_HINT` is the matched canonical
  project id.
- Otherwise the last token is just part of the prompt.
  `PROMPT_TEXT` is the full `$ARGUMENTS`; `PROJECT_HINT` is empty.

This delegates the alias-knowledge to the registry instead of
embedding a stale list in this markdown file. When you add a new
project to the registry, the slash command picks it up
automatically with no edits here.
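
The last-token rule above can be sketched in Python. This is a minimal
illustration under stated assumptions, not the shared client's code:
`split_prompt_and_hint`, `fake_detect`, and the alias `gigabit` are
hypothetical, and `detect` stands in for `detect-project`, assumed to
return a dict with a `matched_project` key as described above. Treating a
single-token input as prompt-only is also a choice made for this sketch.

```python
from typing import Callable, Optional, Tuple

# Hypothetical stand-in for `atocore_client.py detect-project`: takes one
# token and returns a dict whose "matched_project" is a canonical id or None.
Detector = Callable[[str], dict]


def split_prompt_and_hint(arguments: str, detect: Detector) -> Tuple[str, Optional[str]]:
    """Split raw slash-command arguments into (prompt_text, project_hint)."""
    parts = arguments.strip().rsplit(None, 1)
    if len(parts) < 2:
        # Assumption for this sketch: a lone token is the prompt, not a hint.
        return arguments.strip(), None
    head, maybe_hint = parts
    matched = detect(maybe_hint).get("matched_project")
    if matched:
        # The last token was a known id or alias: drop it from the prompt.
        return head, matched
    return arguments.strip(), None


# Toy registry used only for this example; the alias "gigabit" is made up.
_ALIASES = {"p04-gigabit": "p04-gigabit", "gigabit": "p04-gigabit"}


def fake_detect(token: str) -> dict:
    return {"matched_project": _ALIASES.get(token.lower())}


print(split_prompt_and_hint("what is the mirror spec gigabit", fake_detect))
print(split_prompt_and_hint("what changed this week", fake_detect))
```

The detector is passed in as a callable so the splitting logic never needs
its own alias list, which is the same separation the rule asks for.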

## Step 2 — call the shared client for the context pack

The server resolves project hints through the registry before
looking up trusted state, so you can pass either the canonical id
or any alias to `context-build` and the trusted state lookup will
work either way. (Regression test:
`tests/test_context_builder.py::test_alias_hint_resolves_through_registry`.)

**If `PROJECT_HINT` is non-empty**, call `context-build` directly
with that hint:

```bash
python scripts/atocore_client.py context-build \
    "$PROMPT_TEXT" \
    "$PROJECT_HINT"
```

**If `PROJECT_HINT` is empty**, do the 2-step fallback dance so the
user always gets a context pack regardless of whether the prompt
implies a project:

```bash
# Try project auto-detection first.
RESULT=$(python scripts/atocore_client.py auto-context "$PROMPT_TEXT")

# If auto-context could not detect a project it returns a small
# {"status": "no_project_match", ...} envelope. In that case fall
# back to a corpus-wide context build with no project hint, which
# is the right behaviour for cross-project or generic prompts like
# "what changed in AtoCore backup policy this week?"
if echo "$RESULT" | grep -q '"no_project_match"'; then
    RESULT=$(python scripts/atocore_client.py context-build "$PROMPT_TEXT")
fi

echo "$RESULT"
```

This is the fix for the P2 finding from codex's review: previously
the slash command sent every no-hint prompt through `auto-context`
and returned `no_project_match` to the user with no context, even
though the underlying client's `context-build` subcommand has
always supported corpus-wide context builds.

In both branches the response is the JSON payload from
`/context/build` (or, in the rare case where even the corpus-wide
build fails, a `{"status": "unavailable"}` envelope from the
client's fail-open layer).
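
The two branches above can also be written as one small function, which
makes the fallback easy to test in isolation. The sketch below is
illustrative only: `get_context_pack` and both stubs are hypothetical, the
callables stand in for the `auto-context` and `context-build` subcommands,
and the `scope` key in the stub output is invented for the demo. It parses
the JSON envelope instead of grepping for the status string.

```python
import json
from typing import Callable, Optional

# Hypothetical callables standing in for the client subcommands; each one
# returns the client's JSON output as text.
AutoContext = Callable[[str], str]
ContextBuild = Callable[[str, Optional[str]], str]


def get_context_pack(prompt: str, hint: Optional[str],
                     auto_context: AutoContext,
                     context_build: ContextBuild) -> dict:
    """Return a context pack, falling back to a corpus-wide build."""
    if hint:
        # Explicit hint: build directly, no auto-detection needed.
        return json.loads(context_build(prompt, hint))
    result = json.loads(auto_context(prompt))
    if result.get("status") == "no_project_match":
        # No project detected: retry corpus-wide with no hint.
        result = json.loads(context_build(prompt, None))
    return result


# Stubs for illustration only; the "scope" key is invented for the demo.
def stub_auto_context(prompt: str) -> str:
    return json.dumps({"status": "no_project_match"})


def stub_context_build(prompt: str, hint: Optional[str]) -> str:
    return json.dumps({"chunks_used": 3, "scope": hint or "corpus-wide"})


print(get_context_pack("what changed in backup policy?", None,
                       stub_auto_context, stub_context_build))
```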

## Step 3 — present the context pack to the user

The successful response contains at least:

- `formatted_context` — the assembled context block AtoCore would
  feed an LLM
- `chunks_used`, `total_chars`, `budget`, `budget_remaining`,
  `duration_ms`
- `chunks` — array of source documents that contributed, each with
  `source_file`, `heading_path`, `score`

Render in this order:

1. A one-line stats banner: `chunks=N, chars=X/budget, duration=Yms`
2. The `formatted_context` block verbatim inside a fenced text code
   block so the user can read what AtoCore would feed an LLM
3. The `chunks` array as a small bullet list with `source_file`,
   `heading_path`, and `score` per chunk

Two special cases:

- **`{"status": "unavailable"}`** (fail-open from the client)
  → Tell the user: "AtoCore is unreachable at `$ATOCORE_BASE_URL`.
  Check `python scripts/atocore_client.py health` for diagnostics."
- **Empty `chunks_used: 0` with no project state and no memories**
  → Tell the user: "AtoCore returned no context for this prompt —
  either the corpus does not have relevant information or the
  project hint is wrong. Try a different hint or a longer prompt."
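
The render order and the two special cases above can be sketched as a
single formatting function. This is a rough illustration, not part of the
client: `render_context_pack` is a hypothetical name, the banner is assumed
to use `total_chars` for `X`, `score` is assumed to be a float, and the
empty case checks only `chunks_used` because the field names for project
state and memories are not given here. The user-facing messages are
shortened.

```python
def render_context_pack(pack: dict, base_url: str = "$ATOCORE_BASE_URL") -> str:
    """Render a /context/build payload in the order listed above."""
    if pack.get("status") == "unavailable":
        return (f"AtoCore is unreachable at {base_url}. "
                "Check `python scripts/atocore_client.py health` for diagnostics.")
    if pack.get("chunks_used", 0) == 0:
        return ("AtoCore returned no context for this prompt. "
                "Try a different hint or a longer prompt.")

    # 1. One-line stats banner.
    banner = (f"chunks={pack['chunks_used']}, "
              f"chars={pack['total_chars']}/{pack['budget']}, "
              f"duration={pack['duration_ms']}ms")
    # 2. The formatted context, verbatim, inside a fenced text block.
    fence = "`" * 3
    lines = [banner, "", f"{fence}text", pack["formatted_context"], fence, ""]
    # 3. One bullet per contributing chunk.
    for chunk in pack.get("chunks", []):
        lines.append(f"- {chunk['source_file']} | {chunk['heading_path']} "
                     f"| score={chunk['score']:.2f}")
    return "\n".join(lines)


sample = {
    "formatted_context": "example context text",
    "chunks_used": 1, "total_chars": 120, "budget": 3000,
    "budget_remaining": 2880, "duration_ms": 42,
    "chunks": [{"source_file": "docs/a.md",
                "heading_path": "Intro > Scope", "score": 0.8712}],
}
print(render_context_pack(sample))
```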

## Step 4 — what about capturing the interaction

Capture (Phase 9 Commit A) and the rest of the reflection loop
(reinforcement, extraction, review queue) are intentionally NOT
exposed by the shared client yet. The contracts are stable but the
workflow ergonomics are not, so the daily-use slash command stays
focused on context retrieval until those review flows have been
exercised in real use. See `docs/architecture/llm-client-integration.md`
for the deferral rationale.

When capture is added to the shared client, this slash command will
gain a follow-up `/atocore-record-response` companion command that
posts the LLM's response back to the same interaction. That work is
queued.

## Notes for the assistant

- DO NOT bypass the shared client by calling curl yourself. The
  client is the contract between AtoCore and every LLM frontend; if
  you find a missing capability, the right fix is to extend the
  client, not to work around it.
- DO NOT maintain a hardcoded list of project aliases in this
  file. Use `detect-project` to ask the registry — that's the
  whole point of having a registry.
- DO NOT silently change `ATOCORE_BASE_URL`. If the env var points
  at the wrong instance, surface the error so the user can fix it.
- DO NOT hide the formatted context pack from the user. Showing
  what AtoCore would feed an LLM is the whole point.
- The output goes into the user's working context as background;
  they may follow up with their actual question, and the AtoCore
  context pack acts as informal injected knowledge.
4
.gitignore
vendored
@@ -10,4 +10,6 @@ htmlcov/
 .coverage
 venv/
 .venv/
-.claude/
+.claude/*
+!.claude/commands/
+!.claude/commands/**
349
deploy/dalidou/deploy.sh
Normal file
@@ -0,0 +1,349 @@
#!/usr/bin/env bash
#
# deploy/dalidou/deploy.sh
# -------------------------
# One-shot deploy script for updating the running AtoCore container
# on Dalidou from the current Gitea main branch.
#
# The script is idempotent and safe to re-run. It handles both the
# first-time deploy (where /srv/storage/atocore/app may not yet be
# a git checkout) and the ongoing update case (where it is).
#
# Usage
# -----
#
#   # Normal update from main (most common)
#   bash deploy/dalidou/deploy.sh
#
#   # Deploy a specific branch or tag
#   ATOCORE_BRANCH=codex/some-feature bash deploy/dalidou/deploy.sh
#
#   # Dry-run: show what would happen without touching anything
#   ATOCORE_DEPLOY_DRY_RUN=1 bash deploy/dalidou/deploy.sh
#
# Environment variables
# ---------------------
#
#   ATOCORE_APP_DIR         default /srv/storage/atocore/app
#   ATOCORE_GIT_REMOTE      default http://127.0.0.1:3000/Antoine/ATOCore.git
#                           This is the local Dalidou gitea, reached
#                           via loopback. Override only when running
#                           the deploy from a remote host. The default
#                           is loopback (not the hostname "dalidou")
#                           because the hostname doesn't reliably
#                           resolve on the host itself — Dalidou
#                           Claude's first deploy had to work around
#                           exactly this.
#   ATOCORE_BRANCH          default main
#   ATOCORE_DEPLOY_DRY_RUN  if set to 1, report only, no mutations
#   ATOCORE_HEALTH_URL      default http://127.0.0.1:8100/health
#
# Safety rails
# ------------
#
# - If the app dir exists but is NOT a git repo, the script renames
#   it to <dir>.pre-git-<timestamp> before re-cloning, so you never
#   lose the pre-existing snapshot to a git clobber.
# - If the health check fails after restart, the script exits
#   non-zero and prints the container logs tail for diagnosis.
# - Dry-run mode is the default recommendation for the first deploy
#   on a new environment: it shows the planned git operations and
#   the compose command without actually running them.
#
# What this script does NOT do
# ----------------------------
#
# - Does not manage secrets / .env files. The caller is responsible
#   for placing deploy/dalidou/.env before running.
# - Does not run a backup before deploying. Run the backup endpoint
#   first if you want a pre-deploy snapshot.
# - Does not roll back on health-check failure. If deploy fails,
#   the previous container is already stopped; you need to redeploy
#   a known-good commit to recover.
# - Does not touch the database. The Phase 9 schema migrations in
#   src/atocore/models/database.py::_apply_migrations are idempotent
#   ALTER TABLE ADD COLUMN calls that run at service startup via the
#   lifespan handler. Stale pre-Phase-9 schema is upgraded in place.

set -euo pipefail

APP_DIR="${ATOCORE_APP_DIR:-/srv/storage/atocore/app}"
GIT_REMOTE="${ATOCORE_GIT_REMOTE:-http://127.0.0.1:3000/Antoine/ATOCore.git}"
BRANCH="${ATOCORE_BRANCH:-main}"
HEALTH_URL="${ATOCORE_HEALTH_URL:-http://127.0.0.1:8100/health}"
DRY_RUN="${ATOCORE_DEPLOY_DRY_RUN:-0}"
COMPOSE_DIR="$APP_DIR/deploy/dalidou"

log() { printf '==> %s\n' "$*"; }
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf ' [dry-run] %s\n' "$*"
    else
        eval "$@"
    fi
}

log "AtoCore deploy starting"
log " app dir: $APP_DIR"
log " git remote: $GIT_REMOTE"
log " branch: $BRANCH"
log " health url: $HEALTH_URL"
log " dry run: $DRY_RUN"

# ---------------------------------------------------------------------
# Step 0: pre-flight permission check
# ---------------------------------------------------------------------
#
# If $APP_DIR exists but the current user cannot write to it (because
# a previous manual deploy left it root-owned, for example), the git
# fetch / reset in step 1 will fail with cryptic errors. Detect this
# up front and give the operator a clean remediation command instead
# of letting git produce half-state on partial failure. This was the
# exact workaround the 2026-04-08 Dalidou redeploy needed — pre-
# existing root ownership from the pre-phase9 manual schema fix.

if [ -d "$APP_DIR" ] && [ "$DRY_RUN" != "1" ]; then
    if [ ! -w "$APP_DIR" ] || [ ! -r "$APP_DIR/.git" ] 2>/dev/null; then
        log "WARNING: app dir exists but may not be writable by current user"
    fi
    current_owner="$(stat -c '%U:%G' "$APP_DIR" 2>/dev/null || echo unknown)"
    current_user="$(id -un 2>/dev/null || echo unknown)"
    current_uid_gid="$(id -u 2>/dev/null):$(id -g 2>/dev/null)"
    log "Step 0: permission check"
    log " app dir owner: $current_owner"
    log " current user: $current_user ($current_uid_gid)"
    # Try to write a tiny marker file. If it fails, surface a clean
    # remediation message and exit before git produces confusing
    # half-state.
    marker="$APP_DIR/.deploy-permission-check"
    if ! ( : > "$marker" ) 2>/dev/null; then
        log "FATAL: cannot write to $APP_DIR as $current_user"
        log ""
        log "The app dir is owned by $current_owner and the current user"
        log "doesn't have write permission. This usually happens after a"
        log "manual workaround deploy that ran as root."
        log ""
        log "Remediation (pick the one that matches your setup):"
        log ""
        log " # If you have passwordless sudo and gitea runs as UID 1000:"
        log " sudo chown -R 1000:1000 $APP_DIR"
        log ""
        log " # If you're running deploy.sh itself as root:"
        log " sudo bash $0"
        log ""
        log " # If neither works, do it via a throwaway container:"
        log " docker run --rm -v $APP_DIR:/app alpine \\"
        log " chown -R 1000:1000 /app"
        log ""
        log "Then re-run deploy.sh."
        exit 5
    fi
    rm -f "$marker" 2>/dev/null || true
fi

# ---------------------------------------------------------------------
# Step 1: make sure $APP_DIR is a proper git checkout of the branch
# ---------------------------------------------------------------------

if [ -d "$APP_DIR/.git" ]; then
    log "Step 1: app dir is already a git checkout; fetching latest"
    run "cd '$APP_DIR' && git fetch origin '$BRANCH'"
    run "cd '$APP_DIR' && git reset --hard 'origin/$BRANCH'"
else
    log "Step 1: app dir is NOT a git checkout; converting"
    if [ -d "$APP_DIR" ]; then
        BACKUP="${APP_DIR}.pre-git-$(date -u +%Y%m%dT%H%M%SZ)"
        log " backing up existing snapshot to $BACKUP"
        run "mv '$APP_DIR' '$BACKUP'"
    fi
    log " cloning $GIT_REMOTE -> $APP_DIR (branch: $BRANCH)"
    run "git clone --branch '$BRANCH' '$GIT_REMOTE' '$APP_DIR'"
fi

# ---------------------------------------------------------------------
# Step 1.5: self-update re-exec guard
# ---------------------------------------------------------------------
#
# When deploy.sh itself changes in the commit we just pulled, the bash
# process running this script is still executing the OLD deploy.sh
# from memory — git reset --hard updated the file on disk but our
# in-memory instructions are stale. That's exactly how the first
# 2026-04-09 Dalidou deploy silently wrote "unknown" build_sha: old
# Step 2 logic ran against fresh source. Detect the mismatch and
# re-exec into the fresh copy so every post-update run exercises the
# new script.
#
# Guard rails:
# - Only runs when $APP_DIR exists, holds a git checkout, and a
#   deploy.sh exists there (i.e. after Step 1 succeeded).
# - Uses a sentinel env var ATOCORE_DEPLOY_REEXECED=1 to make sure
#   we only re-exec once, never recurse.
# - Skipped in dry-run mode (no mutation).
# - Skipped if $0 isn't a readable file (bash -c pipe inputs, etc.).

if [ "$DRY_RUN" != "1" ] \
    && [ -z "${ATOCORE_DEPLOY_REEXECED:-}" ] \
    && [ -r "$0" ] \
    && [ -f "$APP_DIR/deploy/dalidou/deploy.sh" ]; then
    ON_DISK_HASH="$(sha1sum "$APP_DIR/deploy/dalidou/deploy.sh" 2>/dev/null | awk '{print $1}')"
    RUNNING_HASH="$(sha1sum "$0" 2>/dev/null | awk '{print $1}')"
    if [ -n "$ON_DISK_HASH" ] \
        && [ -n "$RUNNING_HASH" ] \
        && [ "$ON_DISK_HASH" != "$RUNNING_HASH" ]; then
        log "Step 1.5: deploy.sh changed in the pulled commit; re-exec'ing"
        log " running script hash: $RUNNING_HASH"
        log " on-disk script hash: $ON_DISK_HASH"
        log " re-exec -> $APP_DIR/deploy/dalidou/deploy.sh"
        export ATOCORE_DEPLOY_REEXECED=1
        exec bash "$APP_DIR/deploy/dalidou/deploy.sh" "$@"
    fi
fi

# ---------------------------------------------------------------------
# Step 2: capture build provenance to pass to the container
# ---------------------------------------------------------------------
#
# We compute the full SHA, the short SHA, the UTC build timestamp,
# and the source branch. These get exported as env vars before
# `docker compose up -d --build` so the running container can read
# them at startup and report them via /health. The post-deploy
# verification step (Step 6) reads /health and compares the
# reported SHA against this value to detect any silent drift.

log "Step 2: capturing build provenance"
if [ "$DRY_RUN" != "1" ] && [ -d "$APP_DIR/.git" ]; then
    DEPLOYING_SHA_FULL="$(cd "$APP_DIR" && git rev-parse HEAD)"
    DEPLOYING_SHA="$(echo "$DEPLOYING_SHA_FULL" | cut -c1-7)"
    DEPLOYING_TIME="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    DEPLOYING_BRANCH="$BRANCH"
    log " commit: $DEPLOYING_SHA ($DEPLOYING_SHA_FULL)"
    log " built at: $DEPLOYING_TIME"
    log " branch: $DEPLOYING_BRANCH"
    ( cd "$APP_DIR" && git log --oneline -1 ) | sed 's/^/ /'
    export ATOCORE_BUILD_SHA="$DEPLOYING_SHA_FULL"
    export ATOCORE_BUILD_TIME="$DEPLOYING_TIME"
    export ATOCORE_BUILD_BRANCH="$DEPLOYING_BRANCH"
else
    log " [dry-run] would read git log from $APP_DIR"
    DEPLOYING_SHA="dry-run"
    DEPLOYING_SHA_FULL="dry-run"
fi

# ---------------------------------------------------------------------
# Step 3: make sure the .env file exists (it's not in git)
# ---------------------------------------------------------------------

ENV_FILE="$COMPOSE_DIR/.env"
if [ "$DRY_RUN" != "1" ] && [ ! -f "$ENV_FILE" ]; then
    log "Step 3: WARNING — $ENV_FILE does not exist"
    log " the compose workflow needs this file to map mount points"
    log " copy deploy/dalidou/.env.example to $ENV_FILE and edit it"
    log " before re-running this script"
    exit 2
fi

# ---------------------------------------------------------------------
# Step 4: rebuild and restart the container
# ---------------------------------------------------------------------

log "Step 4: rebuilding and restarting the atocore container"
run "cd '$COMPOSE_DIR' && docker compose up -d --build"

if [ "$DRY_RUN" = "1" ]; then
    log "dry-run complete — no mutations performed"
    exit 0
fi

# ---------------------------------------------------------------------
# Step 5: wait for the service to come up and pass the health check
# ---------------------------------------------------------------------

log "Step 5: waiting for /health to respond"
for i in 1 2 3 4 5 6 7 8 9 10; do
    if curl -fsS "$HEALTH_URL" > /tmp/atocore-health.json 2>/dev/null; then
        log " service is responding"
        break
    fi
    log " not ready yet ($i/10); waiting 3s"
    sleep 3
done

if ! curl -fsS "$HEALTH_URL" > /tmp/atocore-health.json 2>/dev/null; then
    log "FATAL: service did not come up within 30 seconds"
    log " container logs (last 50 lines):"
    cd "$COMPOSE_DIR" && docker compose logs --tail=50 atocore || true
    exit 3
fi

# ---------------------------------------------------------------------
# Step 6: verify the deployed build matches what we just shipped
# ---------------------------------------------------------------------
#
# Two layers of comparison:
#
# - code_version: matches src/atocore/__init__.py::__version__.
#   Coarse: any commit between version bumps reports the same value.
# - build_sha: full git SHA the container was built from. Set as
#   an env var by Step 2 above and read by /health from
#   ATOCORE_BUILD_SHA. This is the precise drift signal — if the
#   live build_sha doesn't match $DEPLOYING_SHA_FULL, the build
#   didn't pick up the new source.

log "Step 6: verifying deployed build"
log " /health response:"
if command -v jq >/dev/null 2>&1; then
    jq . < /tmp/atocore-health.json | sed 's/^/ /'
    REPORTED_VERSION="$(jq -r '.code_version // .version' < /tmp/atocore-health.json)"
    REPORTED_SHA="$(jq -r '.build_sha // "unknown"' < /tmp/atocore-health.json)"
    REPORTED_BUILD_TIME="$(jq -r '.build_time // "unknown"' < /tmp/atocore-health.json)"
else
    sed 's/^/ /' /tmp/atocore-health.json
    echo
    REPORTED_VERSION="$(grep -o '"code_version":"[^"]*"' /tmp/atocore-health.json | head -1 | cut -d'"' -f4)"
    if [ -z "$REPORTED_VERSION" ]; then
        REPORTED_VERSION="$(grep -o '"version":"[^"]*"' /tmp/atocore-health.json | head -1 | cut -d'"' -f4)"
    fi
    REPORTED_SHA="$(grep -o '"build_sha":"[^"]*"' /tmp/atocore-health.json | head -1 | cut -d'"' -f4)"
    REPORTED_SHA="${REPORTED_SHA:-unknown}"
    REPORTED_BUILD_TIME="$(grep -o '"build_time":"[^"]*"' /tmp/atocore-health.json | head -1 | cut -d'"' -f4)"
    REPORTED_BUILD_TIME="${REPORTED_BUILD_TIME:-unknown}"
fi

EXPECTED_VERSION="$(grep -oE "__version__ = \"[^\"]+\"" "$APP_DIR/src/atocore/__init__.py" | head -1 | cut -d'"' -f2)"

log " Layer 1 — coarse version:"
log " expected code_version: $EXPECTED_VERSION (from src/atocore/__init__.py)"
log " reported code_version: $REPORTED_VERSION (from live /health)"

if [ "$REPORTED_VERSION" != "$EXPECTED_VERSION" ]; then
    log "FATAL: code_version mismatch"
    log " the container may not have picked up the new image"
    log " try: docker compose down && docker compose up -d --build"
    exit 4
fi

log " Layer 2 — precise build SHA:"
log " expected build_sha: $DEPLOYING_SHA_FULL (from this deploy.sh run)"
log " reported build_sha: $REPORTED_SHA (from live /health)"
log " reported build_time: $REPORTED_BUILD_TIME"

if [ "$REPORTED_SHA" != "$DEPLOYING_SHA_FULL" ]; then
    log "FATAL: build_sha mismatch"
    log " the live container is reporting a different commit than"
    log " the one this deploy.sh run just shipped. Possible causes:"
    log " - the container is using a cached image instead of the"
    log " freshly-built one (try: docker compose build --no-cache)"
    log " - the env vars didn't propagate (check that"
    log " deploy/dalidou/docker-compose.yml has the environment"
    log " section with ATOCORE_BUILD_SHA)"
    log " - another process restarted the container between the"
    log " build and the health check"
    exit 6
fi

log "Deploy complete."
log " commit: $DEPLOYING_SHA ($DEPLOYING_SHA_FULL)"
log " code_version: $REPORTED_VERSION"
log " build_sha: $REPORTED_SHA"
log " build_time: $REPORTED_BUILD_TIME"
log " health: ok"
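
The two-layer comparison in Step 6 of `deploy.sh` is easy to exercise on
its own. The sketch below mirrors it in Python as a rough illustration
only: `verify_health` and its return strings are hypothetical, while the
field names `code_version`, `version`, and `build_sha` are the ones the
script reads from `/health`. Python's `or` only approximates jq's `//`
operator (it also falls through on empty strings).

```python
import json


def verify_health(health_json: str, expected_version: str, expected_sha: str) -> str:
    """Return "ok", "version_mismatch", or "sha_mismatch" for a /health body."""
    health = json.loads(health_json)
    # Layer 1: coarse version, with the same fallback idea as the jq
    # expression `.code_version // .version`.
    reported_version = health.get("code_version") or health.get("version")
    if reported_version != expected_version:
        return "version_mismatch"
    # Layer 2: precise build SHA; a missing field is treated as "unknown".
    reported_sha = health.get("build_sha") or "unknown"
    if reported_sha != expected_sha:
        return "sha_mismatch"
    return "ok"


fake_sha = "a" * 40  # placeholder, not a real commit
body = json.dumps({"code_version": "0.1.0", "build_sha": fake_sha})
print(verify_health(body, "0.1.0", fake_sha))
print(verify_health(json.dumps({"version": "0.1.0"}), "0.1.0", fake_sha))
```

Checking the version first matches the script's exit-code order: a version
mismatch is reported before the SHA is ever compared.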
@@ -9,6 +9,15 @@ services:
       - "${ATOCORE_PORT:-8100}:8100"
     env_file:
       - .env
+    environment:
+      # Build provenance — set by deploy/dalidou/deploy.sh on each
+      # rebuild so /health can report exactly which commit is live.
+      # Defaults to 'unknown' for direct `docker compose up` runs that
+      # bypass deploy.sh; in that case the operator should run
+      # deploy.sh instead so the deployed SHA is recorded.
+      ATOCORE_BUILD_SHA: "${ATOCORE_BUILD_SHA:-unknown}"
+      ATOCORE_BUILD_TIME: "${ATOCORE_BUILD_TIME:-unknown}"
+      ATOCORE_BUILD_BRANCH: "${ATOCORE_BUILD_BRANCH:-unknown}"
     volumes:
       - ${ATOCORE_DB_DIR}:${ATOCORE_DB_DIR}
       - ${ATOCORE_CHROMA_DIR}:${ATOCORE_CHROMA_DIR}
188
deploy/hooks/capture_stop.py
Normal file
@@ -0,0 +1,188 @@
#!/usr/bin/env python3
"""Claude Code Stop hook: capture interaction to AtoCore.

Reads the Stop hook JSON from stdin, extracts the last user prompt
from the transcript JSONL, and POSTs to the AtoCore /interactions
endpoint in conservative mode (reinforce=false, no extraction).

Fail-open: always exits 0, logs errors to stderr only.

Environment variables:
    ATOCORE_URL                Base URL of the AtoCore instance (default: http://dalidou:8100)
    ATOCORE_CAPTURE_DISABLED   Set to "1" to disable capture (kill switch)

Usage in ~/.claude/settings.json:
    "Stop": [{
        "matcher": "",
        "hooks": [{
            "type": "command",
            "command": "python /path/to/capture_stop.py",
            "timeout": 15
        }]
    }]
"""

from __future__ import annotations

import json
import os
import sys
import urllib.error
import urllib.request

ATOCORE_URL = os.environ.get("ATOCORE_URL", "http://dalidou:8100")
TIMEOUT_SECONDS = 10

# Minimum prompt length to bother capturing. Single-word acks,
# slash commands, and empty lines aren't useful interactions.
MIN_PROMPT_LENGTH = 15

# Maximum response length to capture. Truncate very long assistant
# responses to keep the interactions table manageable.
MAX_RESPONSE_LENGTH = 50_000


def main() -> None:
    """Entry point. Always exits 0."""
    try:
        _capture()
    except Exception as exc:
        print(f"capture_stop: {exc}", file=sys.stderr)


def _capture() -> None:
    if os.environ.get("ATOCORE_CAPTURE_DISABLED") == "1":
        return

    raw = sys.stdin.read()
    if not raw.strip():
        return

    hook_data = json.loads(raw)

    session_id = hook_data.get("session_id", "")
    assistant_message = hook_data.get("last_assistant_message", "")
    transcript_path = hook_data.get("transcript_path", "")
    cwd = hook_data.get("cwd", "")

    prompt = _extract_last_user_prompt(transcript_path)
    if not prompt or len(prompt.strip()) < MIN_PROMPT_LENGTH:
        return

    response = assistant_message or ""
    if len(response) > MAX_RESPONSE_LENGTH:
        response = response[:MAX_RESPONSE_LENGTH] + "\n\n[truncated]"

    project = _infer_project(cwd)

    payload = {
        "prompt": prompt,
        "response": response,
        "client": "claude-code",
        "session_id": session_id,
        "project": project,
        "reinforce": False,
    }

    body = json.dumps(payload, ensure_ascii=True).encode("utf-8")
    req = urllib.request.Request(
        f"{ATOCORE_URL}/interactions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    resp = urllib.request.urlopen(req, timeout=TIMEOUT_SECONDS)
    result = json.loads(resp.read().decode("utf-8"))
    print(
        f"capture_stop: recorded interaction {result.get('id', '?')} "
        f"(project={project or 'none'}, prompt_chars={len(prompt)}, "
        f"response_chars={len(response)})",
        file=sys.stderr,
    )


def _extract_last_user_prompt(transcript_path: str) -> str:
    """Read the JSONL transcript and return the last real user prompt.

    Skips meta messages (isMeta=True) and system/command messages
    (content starting with '<').
    """
    if not transcript_path:
        return ""

    # Normalize path for the current OS
    path = os.path.normpath(transcript_path)
    if not os.path.isfile(path):
        return ""

    last_prompt = ""
    try:
        with open(path, encoding="utf-8", errors="replace") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                try:
                    entry = json.loads(line)
                except json.JSONDecodeError:
                    continue

                if entry.get("type") != "user":
                    continue
                if entry.get("isMeta", False):
                    continue

                msg = entry.get("message", {})
                if not isinstance(msg, dict):
                    continue

                content = msg.get("content", "")

                if isinstance(content, str):
                    text = content.strip()
                elif isinstance(content, list):
                    # Content blocks: extract text blocks
                    parts = []
                    for block in content:
                        if isinstance(block, str):
                            parts.append(block)
                        elif isinstance(block, dict) and block.get("type") == "text":
                            parts.append(block.get("text", ""))
                    text = "\n".join(parts).strip()
                else:
                    continue

                # Skip system/command XML and very short messages
                if text.startswith("<") or len(text) < MIN_PROMPT_LENGTH:
                    continue

                last_prompt = text
    except OSError:
        pass

    return last_prompt


# Project inference from working directory.
# Maps known repo paths to AtoCore project IDs. The user can extend
# this table or replace it with a registry lookup later.
_PROJECT_PATH_MAP: dict[str, str] = {
    # Add mappings as needed, e.g.:
    # "C:\\Users\\antoi\\gigabit": "p04-gigabit",
    # "C:\\Users\\antoi\\interferometer": "p05-interferometer",
}


def _infer_project(cwd: str) -> str:
    """Try to map the working directory to an AtoCore project."""
    if not cwd:
        return ""
    norm = os.path.normpath(cwd).lower()
    for path_prefix, project_id in _PROJECT_PATH_MAP.items():
        prefix = os.path.normpath(path_prefix).lower()
        # Match the mapped directory itself or anything beneath it, but
        # not a sibling that merely shares the prefix ("gigabit-old").
        if norm == prefix or norm.startswith(prefix + os.sep):
            return project_id
    return ""


if __name__ == "__main__":
    main()
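
The working-directory match in `_infer_project` depends on how a path
prefix is compared. The sketch below is a standalone illustration, not part
of the hook: `is_under` is a hypothetical helper, and `ntpath` is used so
the Windows-style example paths from `_PROJECT_PATH_MAP` behave the same on
any OS. It shows that a plain `startswith` also accepts a sibling directory
that shares the prefix, while a separator-aware check does not.

```python
import ntpath  # Windows path rules on any OS, matching the example paths


def is_under(prefix: str, cwd: str) -> bool:
    """True if cwd is the prefix directory itself or lies beneath it."""
    p = ntpath.normpath(prefix).lower()
    c = ntpath.normpath(cwd).lower()
    return c == p or c.startswith(p + ntpath.sep)


repo = "C:\\Users\\antoi\\gigabit"
sibling = "C:\\Users\\antoi\\gigabit-old"  # made-up sibling directory

print(is_under(repo, "C:\\Users\\antoi\\gigabit\\src"))
print(is_under(repo, sibling))
print(sibling.lower().startswith(repo.lower()))  # the naive comparison
```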
434
docs/architecture/engineering-v1-acceptance.md
Normal file
@@ -0,0 +1,434 @@
# Engineering Layer V1 Acceptance Criteria

## Why this document exists

The engineering layer planning sprint produced 7 architecture
docs. None of them on their own says "you're done with V1, ship
it". This document does. It translates the planning into
measurable, falsifiable acceptance criteria so the implementation
sprint can know unambiguously when V1 is complete.

The acceptance criteria are organized into four categories:

1. **Functional** — what the system must be able to do
2. **Quality** — how well it must do it
3. **Operational** — what running it must look like
4. **Documentation** — what must be written down

V1 is "done" only when **every criterion in this document is met
against at least one of the three active projects** (`p04-gigabit`,
`p05-interferometer`, `p06-polisher`). The choice of which
project is the test bed is up to the implementer, but the same
project must satisfy all functional criteria.

## The single-sentence definition

> AtoCore Engineering Layer V1 is done when, against one chosen
> active project, every v1-required query in
> `engineering-query-catalog.md` returns a correct result, the
> Human Mirror renders a coherent project overview, and a real
> KB-CAD or KB-FEM export round-trips through the ingest →
> review queue → active entity flow without violating any
> conflict or trust invariant.

Everything below is the operational form of that sentence.

## Category 1 — Functional acceptance

### F-1: Entity store implemented per the V1 ontology

- The 12 V1 entity types from `engineering-ontology-v1.md` exist
  in the database with the schema described there
- The 4 relationship families (Structural, Intent, Validation,
  Provenance) are implemented as edges with the relationship
  types listed in the catalog
- Every entity has the shared header fields:
  `id, type, name, project_id, status, confidence, source_refs,
  created_at, updated_at, extractor_version, canonical_home`
|
||||
- The status lifecycle matches the memory layer:
|
||||
`candidate → active → superseded | invalid`
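
The shared header and lifecycle can be pictured with a small sketch. This is only an illustration, not the schema from `engineering-ontology-v1.md`: the `EntityHeader` class, the field types, the default confidence, and the `ALLOWED_TRANSITIONS` table are assumptions. Only the field names and the four status values come from this document, and the `candidate -> invalid` edge is taken from the reject path in F-4.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Assumed reading of the lifecycle: candidate -> active -> superseded | invalid,
# plus candidate -> invalid for rejected candidates.
ALLOWED_TRANSITIONS = {
    "candidate": {"active", "invalid"},
    "active": {"superseded", "invalid"},
    "superseded": set(),
    "invalid": set(),
}


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class EntityHeader:
    """Hypothetical in-memory form of the shared header fields."""

    id: str
    type: str
    name: str
    project_id: str
    source_refs: list[str]
    extractor_version: str
    canonical_home: str
    status: str = "candidate"
    confidence: float = 0.5  # placeholder default, not specified in this doc
    created_at: datetime = field(default_factory=_now)
    updated_at: datetime = field(default_factory=_now)

    def transition(self, new_status: str) -> None:
        """Move to new_status, refusing anything outside the lifecycle."""
        if new_status not in ALLOWED_TRANSITIONS[self.status]:
            raise ValueError(f"illegal transition {self.status} -> {new_status}")
        self.status = new_status
        self.updated_at = _now()
```

Keeping the transitions in one table makes the lifecycle easy to test: every status pair is either listed or rejected.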
|
||||
|
||||
### F-2: All v1-required queries return correct results
|
||||
|
||||
For the chosen test project, every query Q-001 through Q-020 in
|
||||
`engineering-query-catalog.md` must:
|
||||
|
||||
- be implemented as an API endpoint with the shape specified in
|
||||
the catalog
|
||||
- return the expected result shape against real data
|
||||
- include the provenance chain when the catalog requires it
|
||||
- handle the empty case (no matches) gracefully — empty array,
|
||||
not 500
|
||||
|
||||
The "killer correctness queries" — Q-006 (orphan requirements),
|
||||
Q-009 (decisions on flagged assumptions), Q-011 (unsupported
|
||||
validation claims) — are non-negotiable. If any of those three
|
||||
returns wrong results, V1 is not done.
|
||||
|
||||
### F-3: Tool ingest endpoints are live
|
||||
|
||||
Both endpoints from `tool-handoff-boundaries.md` are implemented:
|
||||
|
||||
- `POST /ingest/kb-cad/export` accepts the documented JSON
|
||||
shape, validates it, and produces entity candidates
|
||||
- `POST /ingest/kb-fem/export` does the same for the KB-FEM export shape
|
||||
- Both refuse exports with invalid schemas (4xx with a clear
|
||||
error)
|
||||
- Both return a summary of created/dropped/failed counts
|
||||
- Both never auto-promote anything; everything lands as
|
||||
`status="candidate"`
|
||||
- Both carry source identifiers (exporter name, exporter version,
|
||||
source artifact id) into the candidate's provenance fields
|
||||
|
||||
A real KB-CAD export — even a hand-crafted one if the actual
|
||||
exporter doesn't exist yet — must round-trip through the endpoint
|
||||
and produce reviewable candidates for the test project.
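
A hand-crafted export and the ingest behaviour described above could look roughly like the sketch below. The real JSON shape is defined in `tool-handoff-boundaries.md` and is not reproduced here, so the payload keys (`exporter`, `exporter_version`, `artifact_id`, `items`), the `ingest_export` function, and the exact meaning of "dropped" (duplicate inside one export) versus "failed" (unusable item) are all assumptions made for illustration.

```python
from typing import Any

# Assumed top-level keys; the documented schema may differ.
REQUIRED_KEYS = ("exporter", "exporter_version", "artifact_id", "items")


class InvalidExport(ValueError):
    """Would map to a 4xx response in the real endpoint."""


def ingest_export(payload: dict[str, Any]) -> dict[str, Any]:
    """Turn one export into candidates plus a created/dropped/failed summary."""
    missing = [k for k in REQUIRED_KEYS if k not in payload]
    if missing or not isinstance(payload.get("items"), list):
        raise InvalidExport(f"invalid export: {missing or ['items']}")

    candidates, seen = [], set()
    summary = {"created": 0, "dropped": 0, "failed": 0}
    for item in payload["items"]:
        if not isinstance(item, dict) or not item.get("name") or not item.get("type"):
            summary["failed"] += 1  # the item itself is unusable
            continue
        key = (item["type"], item["name"])
        if key in seen:
            summary["dropped"] += 1  # duplicate within the same export
            continue
        seen.add(key)
        candidates.append({
            "type": item["type"],
            "name": item["name"],
            "status": "candidate",  # never auto-promoted
            "provenance": {
                "exporter": payload["exporter"],
                "exporter_version": payload["exporter_version"],
                "source_artifact_id": payload["artifact_id"],
            },
        })
        summary["created"] += 1
    return {"summary": summary, "candidates": candidates}
```

The point of the sketch is the shape of the contract: schema errors reject the whole export, item-level problems are counted rather than raised, and nothing leaves the function with a status other than `candidate`.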
|
||||
|
||||
### F-4: Candidate review queue works end to end
|
||||
|
||||
Per `promotion-rules.md`:
|
||||
|
||||
- `GET /entities?status=candidate` lists the queue
|
||||
- `POST /entities/{id}/promote` moves candidate → active
|
||||
- `POST /entities/{id}/reject` moves candidate → invalid
|
||||
- The same shapes work for memories (already shipped in Phase 9 C)
|
||||
- The reviewer can edit a candidate's content via
|
||||
`PUT /entities/{id}` before promoting
|
||||
- Every promote/reject is logged with timestamp and reason
|
||||
|
||||
### F-5: Conflict detection fires
|
||||
|
||||
Per `conflict-model.md`:
|
||||
|
||||
- The synchronous detector runs at every active write
|
||||
(create, promote, project_state set, KB import)
|
||||
- A test must demonstrate that pushing a contradictory KB-CAD
|
||||
export creates a `conflicts` row with both members linked
|
||||
- The reviewer can resolve the conflict via
|
||||
`POST /conflicts/{id}/resolve` with one of the supported
|
||||
actions (supersede_others, no_action, dismiss)
|
||||
- Resolution updates the underlying entities according to the
|
||||
chosen action
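
The write-time behaviour these bullets describe can be sketched with a toy in-memory store. Everything here is hypothetical: the real slot key, table layout, and conflict ids are defined in `conflict-model.md`, and the `ConflictStore` class and its `(project_id, type, name, attribute)` slot are stand-ins chosen for the example.

```python
from collections import defaultdict
from typing import Optional


class ConflictStore:
    """Toy store showing 'flag, never block' at write time."""

    def __init__(self) -> None:
        self.active: dict[tuple, list[dict]] = defaultdict(list)
        self.conflicts: list[dict] = []

    @staticmethod
    def slot_key(entity: dict) -> tuple:
        # Assumed slot: same project, entity type, name and attribute.
        return (entity["project_id"], entity["type"], entity["name"], entity["attribute"])

    def write_active(self, entity: dict) -> Optional[dict]:
        """Always store the write; open a conflict if values disagree."""
        key = self.slot_key(entity)
        rivals = [e for e in self.active[key] if e["value"] != entity["value"]]
        self.active[key].append(entity)  # the write succeeds regardless
        if not rivals:
            return None
        conflict = {
            "id": f"c-{len(self.conflicts) + 1:03d}",
            "status": "open",
            "members": [e["id"] for e in rivals] + [entity["id"]],
        }
        self.conflicts.append(conflict)
        return conflict
```

This sketch deliberately skips deduplication of already-open conflicts, so a third contradictory write would open a second conflict; a real detector would need to handle that case.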
|
||||
|
||||
### F-6: Human Mirror renders for the test project
|
||||
|
||||
Per `human-mirror-rules.md`:
|
||||
|
||||
- `GET /mirror/{project}/overview` returns rendered markdown
|
||||
- `GET /mirror/{project}/decisions` returns rendered markdown
|
||||
- `GET /mirror/{project}/subsystems/{subsystem}` returns
|
||||
rendered markdown for at least one subsystem
|
||||
- `POST /mirror/{project}/regenerate` triggers regeneration on
|
||||
demand
|
||||
- Generated files appear under `/srv/storage/atocore/data/mirror/`
|
||||
with the "do not edit" header banner
|
||||
- Disputed markers appear inline when conflicts exist
|
||||
- Project-state overrides display with the `(curated)` annotation
|
||||
- Output is deterministic (the same inputs produce the same
|
||||
bytes, suitable for diffing)
|
||||
|
||||
### F-7: Memory-to-entity graduation works for at least one type
|
||||
|
||||
Per `memory-vs-entities.md`:
|
||||
|
||||
- `POST /memory/{id}/graduate` exists
|
||||
- Graduating a memory of type `adaptation` produces a Decision
|
||||
entity candidate with the memory's content as a starting point
|
||||
- The original memory row stays at `status="graduated"` (a new
|
||||
status added by the engineering layer migration)
|
||||
- The graduated memory has a forward pointer to the entity
|
||||
candidate's id
|
||||
- Promoting the entity candidate does NOT delete the original
|
||||
memory
|
||||
- The same graduation flow works for `project` → Requirement
|
||||
and `knowledge` → Fact entity types (test the path; doesn't
|
||||
have to be exhaustive)
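
The graduation flow can be summarized in a few lines. This is a sketch under assumptions: the `graduate` function, the dict-shaped memory and entity, and the `graduated_to` field name for the forward pointer are hypothetical; only the type mapping and the `graduated` / `candidate` statuses come from the criteria above. Provenance handling is left out on purpose.

```python
# Memory type -> entity type, as listed in F-7.
GRADUATION_TARGETS = {
    "adaptation": "Decision",
    "project": "Requirement",
    "knowledge": "Fact",
}


def graduate(memory: dict, new_entity_id: str) -> dict:
    """Return an entity candidate and mark the memory as graduated in place."""
    entity_type = GRADUATION_TARGETS.get(memory["type"])
    if entity_type is None:
        raise ValueError(f"memory type {memory['type']!r} does not graduate")
    candidate = {
        "id": new_entity_id,
        "type": entity_type,
        "status": "candidate",          # still goes through the review queue
        "content": memory["content"],   # starting point, editable before promotion
    }
    memory["status"] = "graduated"          # the memory row is kept, not deleted
    memory["graduated_to"] = new_entity_id  # hypothetical forward-pointer field
    return candidate
```

Memory types outside the mapping (identity, preference, episodic) are refused, which matches the rule that they stay in the memory layer.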
|
||||
|
||||
### F-8: Provenance chain is complete
|
||||
|
||||
For every active entity in the test project, the following must
|
||||
be true:
|
||||
|
||||
- It links back to at least one source via `source_refs` (which
|
||||
is one or more of: source_chunk_id, source_interaction_id,
|
||||
source_artifact_id from KB import)
|
||||
- The provenance chain can be walked from the entity to the
|
||||
underlying raw text (source_chunks) or external artifact
|
||||
- Q-017 (the evidence query) returns at least one row for every
|
||||
active entity
|
||||
|
||||
If any active entity has no provenance, it's a bug — provenance
|
||||
is mandatory at write time per the promotion rules.
|
||||
|
||||
## Category 2 — Quality acceptance
|
||||
|
||||
### Q-1: All existing tests still pass
|
||||
|
||||
The full pre-V1 test suite (currently 160 tests) must still
|
||||
pass. The V1 implementation may add new tests but cannot regress
|
||||
any existing test.
|
||||
|
||||
### Q-2: V1 has its own test coverage
|
||||
|
||||
For each of F-1 through F-8 above, at least one automated test
|
||||
exists that:
|
||||
|
||||
- exercises the happy path
|
||||
- covers at least one error path
|
||||
- runs in CI in under 10 seconds (no real network, no real LLM)
|
||||
|
||||
The full V1 test suite should be under 30 seconds total runtime
|
||||
to keep the development loop fast.
|
||||
|
||||
### Q-3: Conflict invariants are enforced by tests
|
||||
|
||||
Specific tests must demonstrate:
|
||||
|
||||
- Two contradictory KB exports produce a conflict (not silent
|
||||
overwrite)
|
||||
- A reviewer can't accidentally promote both members of an open
|
||||
conflict to active without resolving the conflict first
|
||||
- The "flag, never block" rule holds — writes still succeed
|
||||
even when they create a conflict
|
||||
|
||||
### Q-4: Trust hierarchy is enforced by tests
|
||||
|
||||
Specific tests must demonstrate:
|
||||
|
||||
- Entity candidates can never appear in context packs
|
||||
- Reinforcement only touches active memories (already covered
|
||||
by Phase 9 Commit B tests, but the same property must hold
|
||||
for entities once they exist)
|
||||
- Nothing ever writes to project_state automatically
|
||||
- Candidates can never satisfy Q-005 (only active entities count)
|
||||
|
||||
### Q-5: The Human Mirror is reproducible
|
||||
|
||||
A golden-file test exists for at least one Mirror page. Updating
|
||||
the golden file is a normal part of template work (single
|
||||
command, well-documented). The test fails if the renderer
|
||||
produces different bytes for the same input, catching
|
||||
non-determinism.
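
A golden-file check of this kind might be shaped like the sketch below. The `render_decisions` function is a toy stand-in for the real renderer, and the decision fields (`id`, `decided_at`, `text`) are assumptions; what matters is the explicit sort before rendering and the byte-for-byte comparison.

```python
def render_decisions(decisions: list[dict]) -> str:
    """Toy renderer: sorted iteration so the same input gives the same bytes."""
    lines = ["# Decision Log", ""]
    # Newest first, with the id as a tie-breaker so the order is total.
    ordered = sorted(decisions, key=lambda d: (d["decided_at"], d["id"]), reverse=True)
    for d in ordered:
        lines.append(f"- {d['decided_at']} {d['id']}: {d['text']}")
    return "\n".join(lines) + "\n"


def matches_golden(rendered: str, golden: str) -> bool:
    """Byte-for-byte comparison, as a golden-file test would do."""
    return rendered.encode("utf-8") == golden.encode("utf-8")
```

A test built on this would load the golden text from a checked-in file and also render the same input in a shuffled order, which catches any dependence on insertion order.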
|
||||
|
||||
### Q-6: Killer correctness queries pass against real-ish data
|
||||
|
||||
The test bed for Q-006, Q-009, Q-011 is not synthetic. The
|
||||
implementation must seed the test project with at least:
|
||||
|
||||
- One Requirement that has a satisfying Component (Q-006 should
|
||||
not flag it)
|
||||
- One Requirement with no satisfying Component (Q-006 must flag it)
|
||||
- One Decision based on an Assumption flagged as `needs_review`
|
||||
(Q-009 must flag the Decision)
|
||||
- One ValidationClaim with at least one supporting Result
|
||||
(Q-011 should not flag it)
|
||||
- One ValidationClaim with no supporting Result (Q-011 must flag it)
|
||||
|
||||
These five seed cases run as a single integration test that
|
||||
exercises the killer correctness queries against actual
|
||||
representative data.
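
To make the Q-006 seed cases concrete, here is a minimal sketch against an in-memory SQLite database. The table and column names (`entities`, `relationships`, `source_id`, `target_id`, `rel_type`) and the `satisfies` edge direction are assumptions, not the schema from the ontology doc. The sketch also assumes that a candidate Component does not count as satisfying a Requirement, by analogy with the Q-4 rule for Q-005.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entities (id TEXT PRIMARY KEY, type TEXT, name TEXT, status TEXT);
CREATE TABLE relationships (source_id TEXT, target_id TEXT, rel_type TEXT);
INSERT INTO entities VALUES
  ('r1', 'Requirement', 'covered requirement',  'active'),
  ('r2', 'Requirement', 'orphan requirement',   'active'),
  ('c1', 'Component',   'reviewed component',   'active'),
  ('c2', 'Component',   'unreviewed component', 'candidate');
INSERT INTO relationships VALUES
  ('c1', 'r1', 'satisfies'),
  ('c2', 'r2', 'satisfies');
""")

# Active Requirements with no active Component pointing at them.
ORPHAN_REQUIREMENTS = """
SELECT r.id FROM entities AS r
WHERE r.type = 'Requirement' AND r.status = 'active'
  AND NOT EXISTS (
    SELECT 1 FROM relationships AS rel
    JOIN entities AS c ON c.id = rel.source_id
    WHERE rel.target_id = r.id
      AND rel.rel_type = 'satisfies'
      AND c.status = 'active'
  )
ORDER BY r.id
"""

orphans = [row[0] for row in conn.execute(ORPHAN_REQUIREMENTS)]
```

With this seed data `r1` is covered and `r2` is flagged, because its only `satisfies` edge comes from a candidate. The `ORDER BY` keeps the result stable for assertions.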
|
||||
|
||||
## Category 3 — Operational acceptance
|
||||
|
||||
### O-1: Migration is safe and reversible
|
||||
|
||||
The V1 schema migration (adding the `entities`, `relationships`,
|
||||
`conflicts`, `conflict_members` tables, plus `mirror_regeneration_failures`)
|
||||
must:
|
||||
|
||||
- run cleanly against a production-shape database
|
||||
- be implemented via the same `_apply_migrations` pattern as
|
||||
Phase 9 (additive only, idempotent, safe to run twice)
|
||||
- be tested by spinning up a fresh DB AND running against a
|
||||
copy of the live Dalidou DB taken from a backup
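
"Additive only, idempotent, safe to run twice" can be illustrated as follows. This is not the project's `_apply_migrations` function, which is not shown in this document. It is a sketch that assumes a SQLite database, uses placeholder column lists, and invents a `memories.graduated_to` column purely to show how a guarded `ALTER TABLE` stays idempotent.

```python
import sqlite3

# Table names come from O-1; the column lists are placeholders.
V1_TABLES = {
    "entities": "id TEXT PRIMARY KEY, type TEXT, status TEXT",
    "relationships": "id TEXT PRIMARY KEY, source_id TEXT, target_id TEXT",
    "conflicts": "id TEXT PRIMARY KEY, status TEXT",
    "conflict_members": "conflict_id TEXT, entity_id TEXT",
    "mirror_regeneration_failures": "project_id TEXT, failed_at TEXT, error TEXT",
}


def apply_v1_migration(conn: sqlite3.Connection) -> None:
    """Create missing tables and columns; never drop or rewrite anything."""
    for name, columns in V1_TABLES.items():
        conn.execute(f"CREATE TABLE IF NOT EXISTS {name} ({columns})")
    # Column additions are guarded by a schema check so a rerun is a no-op.
    # Both the table and the column name here are hypothetical.
    existing = {row[1] for row in conn.execute("PRAGMA table_info(memories)")}
    if existing and "graduated_to" not in existing:
        conn.execute("ALTER TABLE memories ADD COLUMN graduated_to TEXT")
    conn.commit()
```

Running the function twice against both a fresh database and a copy of a production-shape database is a cheap way to exercise the "safe to run twice" property.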
|
||||
|
||||
### O-2: Backup and restore still work
|
||||
|
||||
The backup endpoint must include the new tables. A restore drill
|
||||
on the test project must:
|
||||
|
||||
- successfully back up the V1 entity state via
|
||||
`POST /admin/backup`
|
||||
- successfully validate the snapshot
|
||||
- successfully restore from the snapshot per
|
||||
`docs/backup-restore-procedure.md`
|
||||
- pass post-restore verification including a Q-001 query against
|
||||
the test project
|
||||
|
||||
The drill must be performed once before V1 is declared done.
|
||||
|
||||
### O-3: Performance bounds
|
||||
|
||||
These are starting bounds; tune later if real usage shows
|
||||
problems:
|
||||
|
||||
- Single-entity write (`POST /entities/...`): under 100ms p99
|
||||
on the production Dalidou hardware
|
||||
- Single Q-001 / Q-005 / Q-008 query: under 500ms p99 against
|
||||
a project with up to 1000 entities
|
||||
- Mirror regeneration of one project overview: under 5 seconds
|
||||
for a project with up to 1000 entities
|
||||
- Conflict detector at write time: adds no more than 50ms p99
|
||||
to a write that doesn't actually produce a conflict
|
||||
|
||||
These bounds are not tested by automated benchmarks in V1 (that
|
||||
would be over-engineering). They are sanity-checked by the
|
||||
developer running the operations against the test project.
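
For the manual sanity check, a small timing helper is enough. The sketch below is one possible shape, not a prescribed tool: the function names are made up, and the nearest-rank percentile is one of several reasonable definitions of p99.

```python
import time
from typing import Callable


def nearest_rank(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest sample with pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceil(pct * n / 100)
    return ordered[rank - 1]


def p99_ms(operation: Callable[[], object], runs: int = 200) -> float:
    """Call operation `runs` times and return the p99 latency in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    return nearest_rank(samples, 99)
```

The developer would wrap one API call (for example a single-entity write against the test project) in a function, pass it to `p99_ms`, and compare the result with the bound listed above.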
|
||||
|
||||
### O-4: No new manual ops burden
|
||||
|
||||
V1 should not introduce any new "you have to remember to run X
|
||||
every day" requirement. Specifically:
|
||||
|
||||
- Mirror regeneration is automatic (debounced async + daily
|
||||
refresh), no manual cron entry needed
|
||||
- Conflict detection is automatic at write time, no manual sweep
|
||||
needed in V1 (the nightly sweep is V2)
|
||||
- Backup retention cleanup is **still** an open follow-up from
|
||||
the operational baseline; V1 does not block on it
|
||||
|
||||
### O-5: No regressions in Phase 9 reflection loop
|
||||
|
||||
The capture, reinforcement, and extraction loop from Phase 9
|
||||
A/B/C must continue to work end to end with the engineering
|
||||
layer in place. Specifically:
|
||||
|
||||
- Memories whose types are NOT in the engineering layer
|
||||
(identity, preference, episodic) keep working exactly as
|
||||
before
|
||||
- Memories whose types ARE in the engineering layer (project,
|
||||
knowledge, adaptation) can still be created by hand or by
|
||||
extraction; the deprecation rule from `memory-vs-entities.md`
|
||||
("no new writes after V1 ships") is implemented as a
|
||||
configurable warning, not a hard block, so existing
|
||||
workflows aren't disrupted
|
||||
|
||||
## Category 4 — Documentation acceptance
|
||||
|
||||
### D-1: Per-entity-type spec docs
|
||||
|
||||
Each of the 12 V1 entity types has a short spec doc under
|
||||
`docs/architecture/entities/` covering:
|
||||
|
||||
- the entity's purpose
|
||||
- its required and optional fields
|
||||
- its lifecycle quirks (if any beyond the standard
|
||||
candidate/active/superseded/invalid)
|
||||
- which queries it appears in (cross-reference to the catalog)
|
||||
- which relationship types reference it
|
||||
|
||||
These docs can be terse — a page each, mostly bullet lists.
|
||||
Their purpose is to make the entity model legible to a future
|
||||
maintainer, not to be reference manuals.
|
||||
|
||||
### D-2: KB-CAD and KB-FEM export schema docs
|
||||
|
||||
`docs/architecture/kb-cad-export-schema.md` and
|
||||
`docs/architecture/kb-fem-export-schema.md` are written and
|
||||
match the implemented validators.
|
||||
|
||||
### D-3: V1 release notes
|
||||
|
||||
A `docs/v1-release-notes.md` summarizes:
|
||||
|
||||
- What V1 added (entities, relationships, conflicts, mirror,
|
||||
ingest endpoints)
|
||||
- What V1 deferred (auto-promotion, BOM/cost/manufacturing
|
||||
entities, NX direct integration, cross-project rollups)
|
||||
- The migration story for existing memories (graduation flow)
|
||||
- Known limitations and the V2 roadmap pointers
|
||||
|
||||
### D-4: master-plan-status.md and current-state.md updated
|
||||
|
||||
Both top-level status docs reflect V1's completion:
|
||||
|
||||
- Phase 6 (AtoDrive) and the engineering layer are explicitly
|
||||
marked as separate tracks
|
||||
- The engineering planning sprint section is marked complete
|
||||
- Phase 9 stays at "baseline complete" (V1 doesn't change Phase 9)
|
||||
- The engineering layer V1 is added as its own line item
|
||||
|
||||
## What V1 explicitly does NOT need to do
|
||||
|
||||
To prevent scope creep, here is the negative list. None of the
|
||||
following are V1 acceptance criteria:
|
||||
|
||||
- **No LLM extractor.** The Phase 9 C rule-based extractor is
|
||||
the entity extractor for V1 too, just with new rules added for
|
||||
entity types.
|
||||
- **No auto-promotion of candidates.** Per `promotion-rules.md`.
|
||||
- **No write-back to KB-CAD or KB-FEM.** Per
|
||||
`tool-handoff-boundaries.md`.
|
||||
- **No multi-user / per-reviewer auth.** Single-user assumed.
|
||||
- **No real-time UI.** API + Mirror markdown is the V1 surface.
|
||||
A web UI is V2+.
|
||||
- **No cross-project rollups.** Per `human-mirror-rules.md`.
|
||||
- **No time-travel queries** (Q-015 stays v1-stretch).
|
||||
- **No nightly conflict sweep.** Synchronous detection only in V1.
|
||||
- **No incremental Chroma snapshots.** The current full-copy
|
||||
approach in `backup-restore-procedure.md` is fine for V1.
|
||||
- **No retention cleanup script.** Still an open follow-up.
|
||||
- **No backup encryption.** Still an open follow-up.
|
||||
- **No off-Dalidou backup target.** Still an open follow-up.
|
||||
|
||||
## How to use this document during implementation
|
||||
|
||||
When the implementation sprint begins:
|
||||
|
||||
1. Read this doc once, top to bottom
|
||||
2. Pick the test project (probably p05-interferometer because
|
||||
the optical/structural domain has the cleanest entity model)
|
||||
3. For each section, write the test or the implementation, in
|
||||
roughly the order: F-1 → F-2 → F-3 → F-4 → F-5 → F-6 → F-7 → F-8
|
||||
4. Each acceptance criterion's test should be written **before
|
||||
or alongside** the implementation, not after
|
||||
5. Run the full test suite at every commit
|
||||
6. When every box is checked, write D-3 (release notes), update
|
||||
D-4 (status docs), and call V1 done
|
||||
|
||||
The implementation sprint should not touch anything outside the
|
||||
scope listed here. If a desire arises to add something not in
|
||||
this doc, that's a V2 conversation, not a V1 expansion.
|
||||
|
||||
## Anticipated friction points
|
||||
|
||||
These are the things I expect will be hard during implementation:
|
||||
|
||||
1. **The graduation flow (F-7)** is the most cross-cutting
|
||||
change because it touches the existing memory module.
|
||||
Worth doing it last so the memory module is stable for
|
||||
all the V1 entity work first.
|
||||
2. **The Mirror's deterministic-output requirement (Q-5)** will
|
||||
bite if the implementer iterates over Python dicts without
|
||||
sorting. Plan to use `sorted()` literally everywhere.
|
||||
3. **Conflict detection (F-5)** has subtle correctness traps:
|
||||
the slot key extraction must be stable, the dedup-of-existing-conflicts
|
||||
logic must be right, and the synchronous detector must not
|
||||
slow writes meaningfully (Q-3 / O-3 cover this, but watch).
|
||||
4. **Provenance backfill** for entities that come from the
|
||||
existing memory layer via graduation (F-7) is the trickiest
|
||||
part: the original memory may not have had a strict
|
||||
`source_chunk_id`, in which case the graduated entity also
|
||||
doesn't have one. The implementation needs an "orphan
|
||||
provenance" allowance for graduated entities, with a
|
||||
warning surfaced in the Mirror.
|
||||
|
||||
These aren't blockers, just the parts of the V1 spec I'd
|
||||
attack with extra care.
|
||||
|
||||
## TL;DR
|
||||
|
||||
- Engineering V1 is done when every box in this doc is checked
|
||||
against one chosen active project
|
||||
- Functional: 8 criteria covering entities, queries, ingest,
|
||||
review queue, conflicts, mirror, graduation, provenance
|
||||
- Quality: 6 criteria covering tests, golden files, killer
|
||||
correctness, trust enforcement
|
||||
- Operational: 5 criteria covering migration safety, backup
|
||||
drill, performance bounds, no new manual ops, Phase 9 not
|
||||
regressed
|
||||
- Documentation: 4 criteria covering entity specs, KB schema
|
||||
docs, release notes, top-level status updates
|
||||
- Negative list: a clear set of things V1 deliberately does
|
||||
NOT need to do, to prevent scope creep
|
||||
- The implementation sprint follows this doc as a checklist
|
||||
384
docs/architecture/human-mirror-rules.md
Normal file
@@ -0,0 +1,384 @@
|
||||
# Human Mirror Rules (Layer 3 → derived markdown views)
|
||||
|
||||
## Why this document exists
|
||||
|
||||
The engineering layer V1 stores facts as typed entities and
|
||||
relationships in a SQL database. That representation is excellent
|
||||
for queries, conflict detection, and automated reasoning, but
|
||||
it's terrible for the human reading experience. People want to
|
||||
read prose, not crawl JSON.
|
||||
|
||||
The Human Mirror is the layer that turns the typed entity store
|
||||
into human-readable markdown pages. It's strictly a derived view —
|
||||
nothing in the Human Mirror is canonical, every page is regenerated
|
||||
from current entity state on demand.
|
||||
|
||||
This document defines:
|
||||
|
||||
- what the Human Mirror generates
|
||||
- when it regenerates
|
||||
- how the human edits things they see in the Mirror
|
||||
- how the canonical-vs-derived rule is enforced (so editing the
|
||||
derived markdown can't silently corrupt the entity store)
|
||||
|
||||
## The non-negotiable rule
|
||||
|
||||
> **The Human Mirror is read-only from the human's perspective.**
|
||||
>
|
||||
> If the human wants to change a fact they see in the Mirror, they
|
||||
> change it in the canonical home (per `representation-authority.md`),
|
||||
> NOT in the Mirror page. The next regeneration picks up the change.
|
||||
|
||||
This rule is what makes the whole derived-view approach safe. If
|
||||
the human is allowed to edit Mirror pages directly, the
|
||||
canonical-vs-derived split breaks and the Mirror becomes a second
|
||||
source of truth that disagrees with the entity store.
|
||||
|
||||
The technical enforcement is that every Mirror page carries a
|
||||
header banner that says "this file is generated from AtoCore
|
||||
entity state, do not edit", and the file is regenerated from the
|
||||
entity store on every change to its underlying entities. Manual
|
||||
edits will be silently overwritten on the next regeneration.
|
||||
|
||||
## What the Mirror generates in V1
|
||||
|
||||
Three template families, each producing one or more pages per
|
||||
project:
|
||||
|
||||
### 1. Project Overview
|
||||
|
||||
One page per registered project. Renders:
|
||||
|
||||
- Project header (id, aliases, description)
|
||||
- Subsystem tree (from Q-001 / Q-004 in the query catalog)
|
||||
- Active Decisions affecting this project (Q-008, ordered by date)
|
||||
- Open Requirements with coverage status (Q-005, Q-006)
|
||||
- Open ValidationClaims with support status (Q-010, Q-011)
|
||||
- Currently flagged conflicts (from the conflict model)
|
||||
- Recent changes (Q-013) — last 14 days
|
||||
|
||||
This is the most important Mirror page. It's the page someone
|
||||
opens when they want to know "what's the state of this project
|
||||
right now". It deliberately mirrors what `current-state.md` does
|
||||
for AtoCore itself but generated entirely from typed state.
|
||||
|
||||
### 2. Decision Log
|
||||
|
||||
One page per project. Renders:
|
||||
|
||||
- All active Decisions in chronological order (newest first)
|
||||
- Each Decision shows: id, what was decided, when, the affected
|
||||
Subsystem/Component, the supporting evidence (Q-014, Q-017)
|
||||
- Superseded Decisions appear as collapsed "history" entries
|
||||
with a forward link to whatever superseded them
|
||||
- Conflicting Decisions get a "⚠ disputed" marker
|
||||
|
||||
This is the human-readable form of the engineering query catalog's
|
||||
Q-014 query.
|
||||
|
||||
### 3. Subsystem Detail
|
||||
|
||||
One page per Subsystem (so a few per project). Renders:
|
||||
|
||||
- Subsystem header
|
||||
- Components contained in this subsystem (Q-001)
|
||||
- Interfaces this subsystem has (Q-003)
|
||||
- Constraints applying to it (Q-007)
|
||||
- Decisions affecting it (Q-008)
|
||||
- Validation status: which Requirements are satisfied,
|
||||
which are open (Q-005, Q-006)
|
||||
- Change history within this subsystem (Q-013 scoped)
|
||||
|
||||
Subsystem detail pages are what someone reads when they're
|
||||
working on a specific part of the system and want everything
|
||||
relevant in one place.
|
||||
|
||||
## What the Mirror does NOT generate in V1
|
||||
|
||||
Intentionally excluded so the V1 implementation stays scoped:
|
||||
|
||||
- **Per-component detail pages.** Components are listed in
|
||||
Subsystem pages but don't get their own pages. Reduces page
|
||||
count from hundreds to dozens.
|
||||
- **Per-Decision detail pages.** Decisions appear inline in
|
||||
Project Overview and Decision Log; their full text plus
|
||||
evidence chain is shown there, not on a separate page.
|
||||
- **Cross-project rollup pages.** No "all projects at a glance"
|
||||
page in V1. Each project is its own report.
|
||||
- **Time-series / historical pages.** The Mirror is always
|
||||
"current state". History is accessible via Decision Log and
|
||||
superseded chains, but no "what was true on date X" page exists
|
||||
in V1 (Q-015 is v1-stretch in the query catalog for the same
|
||||
reason).
|
||||
- **Diff pages between two timestamps.** Same reasoning.
|
||||
- **Render of the conflict queue itself.** Conflicts appear
|
||||
inline in the relevant Mirror pages with the "⚠ disputed"
|
||||
marker and a link to `/conflicts/{id}`, but there's no
|
||||
Mirror page that lists all conflicts. Use `GET /conflicts`.
|
||||
- **Per-memory pages.** Memories are not engineering entities;
|
||||
they appear in context packs and the review queue, not in the
|
||||
Human Mirror.
|
||||
|
||||
## Where Mirror pages live
|
||||
|
||||
Two options were considered. The chosen V1 path is option B:
|
||||
|
||||
**Option A — write Mirror pages back into the source vault.**
|
||||
Generate `/srv/storage/atocore/sources/vault/mirror/p05/overview.md`
|
||||
so the human reads them in their normal Obsidian / markdown
|
||||
viewer. **Rejected** because writing into the source vault
|
||||
violates the "sources are read-only" rule from
|
||||
`tool-handoff-boundaries.md` and the operating model.
|
||||
|
||||
**Option B (chosen) — write Mirror pages into a dedicated AtoCore
|
||||
output dir, served via the API.** Generate under
|
||||
`/srv/storage/atocore/data/mirror/p05/overview.md`. The human
|
||||
reads them via:
|
||||
|
||||
- the API endpoints `GET /mirror/{project}/overview`,
|
||||
`GET /mirror/{project}/decisions`,
|
||||
`GET /mirror/{project}/subsystems/{subsystem}` (all return
|
||||
rendered markdown as text/markdown)
|
||||
- a future "Mirror viewer" in the Claude Code slash command
|
||||
`/atocore-mirror <project>` that fetches the rendered markdown
|
||||
and displays it inline
|
||||
- direct file access on Dalidou for power users:
|
||||
`cat /srv/storage/atocore/data/mirror/p05/overview.md`
|
||||
|
||||
The dedicated dir keeps the Mirror clearly separated from the
|
||||
canonical sources and makes regeneration safe (it's just a
|
||||
directory wipe + write).
|
||||
|
||||
## When the Mirror regenerates
|
||||
|
||||
Three triggers, in order from cheapest to most expensive:
|
||||
|
||||
### 1. On explicit human request
|
||||
|
||||
```
POST /mirror/{project}/regenerate
```
|
||||
|
||||
Returns the timestamp of the regeneration and the list of files
|
||||
written. This is the path the human takes when they've just
|
||||
curated something into project_state and want to see the Mirror
|
||||
reflect it immediately.
|
||||
|
||||
### 2. On entity write (debounced, async, per project)
|
||||
|
||||
When any entity in a project changes status (candidate → active,
|
||||
active → superseded), a regeneration of that project's Mirror is
|
||||
queued. The queue is debounced — multiple writes within a 30-second
|
||||
window only trigger one regeneration. This keeps the Mirror
|
||||
"close to current" without generating a Mirror update on every
|
||||
single API call.
|
||||
|
||||
The implementation is a simple dict of "next regeneration time"
|
||||
per project, checked by a background task. No cron, no message
|
||||
queue, no Celery. Just a `dict[str, datetime]` and a thread.
|
||||
|
||||
### 3. On scheduled refresh (daily)
|
||||
|
||||
Once per day at a quiet hour, every project's Mirror regenerates
|
||||
unconditionally. This catches any state drift from manual
|
||||
project_state edits that bypassed the entity write hooks, and
|
||||
provides a baseline guarantee that the Mirror is at most 24
|
||||
hours stale.
|
||||
|
||||
The schedule runs from the same machinery as the future backup
|
||||
retention job, so we get one cron-equivalent system to maintain
|
||||
instead of two.
|
||||
|
||||
## What if regeneration fails
|
||||
|
||||
The Mirror has to be resilient. If regeneration fails for a
|
||||
project (e.g. a query catalog query crashes, a template rendering
|
||||
error), the existing Mirror files are **not** deleted. The
|
||||
existing files stay in place (showing the last successful state)
|
||||
and a regeneration error is recorded in:
|
||||
|
||||
- the API response if the trigger was explicit
|
||||
- a log entry at warning level for the async path
|
||||
- a `mirror_regeneration_failures` table for the daily refresh
|
||||
|
||||
This means the human can always read the Mirror, even if the
|
||||
last 5 minutes of changes haven't made it in yet. Stale is
|
||||
better than blank.
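
One way to get "stale is better than blank" at the file level is to render first, write to a temporary file, and only then swap it into place. The sketch below assumes that approach; the `regenerate_page` function is hypothetical and the failure recording described above is left to the caller.

```python
import os
import tempfile
from typing import Callable


def regenerate_page(path: str, render: Callable[[], str]) -> bool:
    """Replace path with fresh output; keep the old file if anything fails."""
    try:
        content = render()
    except Exception:
        return False  # caller records the failure; the old file is untouched
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8", newline="\n") as handle:
            handle.write(content)
        os.replace(tmp_path, path)  # atomic swap within one filesystem
    except OSError:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        return False
    return True
```

Because the old file is only replaced after the new content is fully written, a template error or a crashed query leaves the last successful page readable.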
|
||||
|
||||
## How the human curates "around" the Mirror
|
||||
|
||||
The Mirror reflects the current entity state. If the human
|
||||
doesn't like what they see, the right edits go into one of:
|
||||
|
||||
| What you want to change | Where you change it |
|---|---|
| A Decision's text | `PUT /entities/Decision/{id}` (or `PUT /memory/{id}` if it's still memory-layer) |
| A Decision's status (active → superseded) | `POST /entities/Decision/{id}/supersede` (V1 entity API) |
| Whether a Component "satisfies" a Requirement | edit the relationship directly via the entity API (V1) |
| The current trusted next focus shown on the Project Overview | `POST /project/state` with `category=status, key=next_focus` |
| A typo in a generated heading or label | edit the **template**, not the rendered file. Templates live in `templates/mirror/` (V1 implementation) |
| Source of a fact ("this came from KB-CAD on day X") | not editable by hand — it's automatically populated from provenance |
|
||||
|
||||
The rule is consistent: edit the canonical home, regenerate (or
|
||||
let the auto-trigger fire), see the change reflected in the
|
||||
Mirror.
|
||||
|
||||
## Templates
|
||||
|
||||
The Mirror uses Jinja2-style templates checked into the repo
|
||||
under `templates/mirror/`. Each template is a markdown file with
|
||||
placeholders that the renderer fills from query catalog results.
|
||||
|
||||
Template list for V1:
|
||||
|
||||
- `templates/mirror/project-overview.md.j2`
|
||||
- `templates/mirror/decision-log.md.j2`
|
||||
- `templates/mirror/subsystem-detail.md.j2`
|
||||
|
||||
Editing a template is a code change, reviewed via normal git PRs.
|
||||
The templates are deliberately small and readable so the human
|
||||
can tweak the output format without touching renderer code.
|
||||
|
||||
The renderer is a thin module:
|
||||
|
||||
```python
# src/atocore/mirror/renderer.py (V1, not yet implemented)

def render_project_overview(project: str) -> str:
    """Generate the project overview markdown for one project."""
    facts = collect_project_overview_facts(project)
    template = load_template("project-overview.md.j2")
    return template.render(**facts)
```
|
||||
|
||||
## The "do not edit" header
|
||||
|
||||
Every generated Mirror file starts with a fixed banner:
|
||||
|
||||
```markdown
<!--
This file is generated by AtoCore from current entity state.
DO NOT EDIT — manual changes will be silently overwritten on
the next regeneration.
Edit the canonical home instead. See:
https://docs.atocore.../representation-authority.md
Regenerated: 2026-04-07T12:34:56Z
Source entities: <commit-like checksum of input data>
-->
```
|
||||
|
||||
The checksum at the end lets the renderer skip work when nothing
|
||||
relevant has changed since the last regeneration. If the inputs
|
||||
match the previous run's checksum, the existing file is left
|
||||
untouched.
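
A minimal sketch of the skip-when-unchanged check described above. It assumes the checksum is a SHA-256 over a canonical JSON serialization of the template inputs and that the banner carries it on a `Source entities:` line; the helper names `facts_checksum` and `needs_regeneration` are hypothetical and not part of the planned renderer.

```python
import hashlib
import json
import re
from typing import Optional


def facts_checksum(facts: dict) -> str:
    """Hash the template inputs so that key order does not affect the result."""
    canonical = json.dumps(facts, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_regeneration(existing_markdown: Optional[str], facts: dict) -> bool:
    """Return False only when the banner checksum matches the current inputs."""
    if existing_markdown is None:
        return True  # no file yet, so it must be rendered
    match = re.search(r"^Source entities: (\S+)$", existing_markdown, flags=re.MULTILINE)
    return match is None or match.group(1) != facts_checksum(facts)
```

Hashing the inputs rather than the rendered output means the `Regenerated:` timestamp in the banner never forces a rewrite on its own.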
|
||||
|
||||
## Conflicts in the Mirror
|
||||
|
||||
Per the conflict model, any open conflict on a fact that appears
|
||||
in the Mirror gets a visible disputed marker:
|
||||
|
||||
```markdown
|
||||
- Lateral support material: **GF-PTFE** ⚠ disputed
|
||||
- The KB-CAD import on 2026-04-07 reported PEEK; conflict #c-039.
|
||||
```
|
||||
|
||||
The disputed marker is rendered as a relative markdown link to the
conflict detail page in the API, or it carries the conflict id for
direct lookup. The reviewer follows the
|
||||
link, resolves the conflict via `POST /conflicts/{id}/resolve`,
|
||||
and on the next regeneration the marker disappears.
|
||||
|
||||
## Project-state overrides in the Mirror
|
||||
|
||||
When a Mirror page would show a value derived from entities, but
|
||||
project_state has an override on the same key, **the Mirror shows
|
||||
the project_state value** with a small annotation noting the
|
||||
override:
|
||||
|
||||
```markdown
|
||||
- Next focus: **Wave 2 trusted-operational ingestion** (curated)
|
||||
```
|
||||
|
||||
The `(curated)` annotation tells the reader "this is from the
|
||||
trusted-state Layer 3, not from extracted entities". This makes
|
||||
the trust hierarchy visible in the human reading experience.
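
The precedence rule above can be sketched in a few lines. This is an illustration under assumptions: `render_fact_line` is a hypothetical helper, and both `derived` (entity-derived values) and `overrides` (project_state values) are assumed to be plain dicts keyed by the same fact key.

```python
def render_fact_line(label: str, key: str, derived: dict, overrides: dict) -> str:
    """Render one Mirror bullet, preferring the project_state override."""
    if key in overrides:
        # Layer 3 trusted state wins and is marked so the reader can tell.
        return f"- {label}: **{overrides[key]}** (curated)"
    return f"- {label}: **{derived[key]}**"
```

Keeping the decision in one function means every template shows the annotation the same way.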
|
||||
|
||||
## The "Mirror diff" workflow (post-V1, but designed for)
|
||||
|
||||
A common workflow after V1 ships will be:
|
||||
|
||||
1. Reviewer has curated some new entities
|
||||
2. They want to see "what changed in the Mirror as a result"
|
||||
3. They want to share that diff with someone else as evidence
|
||||
|
||||
To support this, the Mirror generator writes its output
|
||||
deterministically (sorted iteration, stable timestamp formatting)
|
||||
so a `git diff` between two regenerated states is meaningful.
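
As a rough sketch of what "sorted iteration, stable timestamp formatting" could look like in practice, assuming each decision is a dict with `id`, `title`, and a timezone-aware `decided_at` (all hypothetical field names):

```python
from datetime import datetime, timezone


def format_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as UTC, second precision, trailing Z."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def render_decisions(decisions: list[dict]) -> str:
    """Emit one bullet per decision in a stable order, whatever the input order."""
    ordered = sorted(decisions, key=lambda d: (d["decided_at"], d["id"]))
    lines = [f"- {format_timestamp(d['decided_at'])} {d['id']}: {d['title']}" for d in ordered]
    return "\n".join(lines) + "\n"
```

Sorting on an explicit key with a tiebreaker avoids depending on database row order, which is what would otherwise make two regenerations differ for no real reason.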
|
||||
|
||||
V1 doesn't add an explicit "diff between two Mirror snapshots"
|
||||
endpoint — that's deferred. But the deterministic-output
|
||||
property is a V1 requirement so future diffing works without
|
||||
redesigning the renderer.
|
||||
|
||||
## What the Mirror enables
|
||||
|
||||
With the Mirror in place:
|
||||
|
||||
- **OpenClaw can read project state in human form.** The
|
||||
read-only AtoCore helper skill on the T420 already calls
|
||||
`/context/build`; in V1 it gains the option to call
|
||||
`/mirror/{project}/overview` to get a fully-rendered markdown
|
||||
page instead of just retrieved chunks. This is much faster
|
||||
than crawling individual entities for general questions.
|
||||
- **The human gets a daily-readable artifact.** Every morning,
|
||||
Antoine can `cat /srv/storage/atocore/data/mirror/p05/overview.md`
|
||||
and see the current state of p05 in his preferred reading
|
||||
format. No API calls, no JSON parsing.
|
||||
- **Cross-collaborator sharing.** If you ever want to send
|
||||
someone a project overview without giving them AtoCore access,
|
||||
the Mirror file is a self-contained markdown document they can
|
||||
read in any markdown viewer.
|
||||
- **Claude Code integration.** A future
|
||||
`/atocore-mirror <project>` slash command renders the Mirror
|
||||
inline, complementing the existing `/atocore-context` command
|
||||
with a human-readable view of "what does AtoCore think about
|
||||
this project right now".
|
||||
|
||||
## Open questions for V1 implementation
|
||||
|
||||
1. **What's the regeneration debounce window?** 30 seconds is the
|
||||
starting value but should be tuned with real usage.
|
||||
2. **Does the daily refresh need a separate trigger mechanism, or
|
||||
is it just a long-period entry in the same in-process scheduler
|
||||
that handles the debounced async refreshes?** Probably the
|
||||
latter — keep it simple.
|
||||
3. **How are templates tested?** Likely a small set of fixture
|
||||
project states + golden output files, with a single test that
|
||||
asserts `render(fixture) == golden`. Updating golden files is
|
||||
a normal part of template work.
|
||||
4. **Are Mirror pages discoverable via a directory listing
|
||||
endpoint?** `GET /mirror/{project}` returns the list of
|
||||
available pages for that project. Probably yes; cheap to add.
|
||||
5. **How does the Mirror handle a project that has zero entities
|
||||
yet?** Render an empty-state page that says "no curated facts
|
||||
yet — add some via /memory or /entities/Decision". Better than
|
||||
a blank file.
|
||||
|
||||
## TL;DR
|
||||
|
||||
- The Human Mirror generates 3 template families per project
|
||||
(Overview, Decision Log, Subsystem Detail) from current entity
|
||||
state
|
||||
- It's strictly read-only from the human's perspective; edits go
|
||||
to the canonical home and the Mirror picks them up on
|
||||
regeneration
|
||||
- Three regeneration triggers: explicit POST, debounced
|
||||
async-on-write, daily scheduled refresh
|
||||
- Mirror files live in `/srv/storage/atocore/data/mirror/`
|
||||
(NOT in the source vault — sources stay read-only)
|
||||
- Conflicts and project_state overrides are visible inline in
|
||||
the rendered markdown so the trust hierarchy shows through
|
||||
- Templates are checked into the repo and edited via PR; the
|
||||
rendered files are derived and never canonical
|
||||
- Deterministic output is a V1 requirement so future diffing
|
||||
works without rework
|
||||
333
docs/architecture/llm-client-integration.md
Normal file
@@ -0,0 +1,333 @@
|
||||
# LLM Client Integration (the layering)
|
||||
|
||||
## Why this document exists
|
||||
|
||||
AtoCore must be reachable from many different LLM client contexts:
|
||||
|
||||
- **OpenClaw** on the T420 (already integrated via the read-only
|
||||
helper skill at `/home/papa/clawd/skills/atocore-context/`)
|
||||
- **Claude Code** on the laptop (via the slash command shipped in
|
||||
this repo at `.claude/commands/atocore-context.md`)
|
||||
- **Codex** sessions (future)
|
||||
- **Direct API consumers** — scripts, Python code, ad-hoc curl
|
||||
- **The eventual MCP server** when it's worth building
|
||||
|
||||
Without an explicit layering rule, every new client tends to
|
||||
reimplement the same routing logic (project detection, context
|
||||
build, retrieval audit, project-state inspection) in slightly
|
||||
different ways. That is exactly what almost happened in the first
|
||||
draft of the Claude Code slash command, which started as a curl +
|
||||
jq script that duplicated capabilities the existing operator client
|
||||
already had.
|
||||
|
||||
This document defines the layering so future clients don't repeat
|
||||
that mistake.
|
||||
|
||||
## The layering
|
||||
|
||||
Three layers, top to bottom:
|
||||
|
||||
```
+----------------------------------------------------+
|  Per-agent thin frontends                          |
|                                                    |
|    - Claude Code slash command                     |
|      (.claude/commands/atocore-context.md)         |
|    - OpenClaw helper skill                         |
|      (/home/papa/clawd/skills/atocore-context/)    |
|    - Codex skill (future)                          |
|    - MCP server (future)                           |
+----------------------------------------------------+
                         |
                         | shells out to / imports
                         v
+----------------------------------------------------+
|  Shared operator client                            |
|    scripts/atocore_client.py                       |
|                                                    |
|    - subcommands for stable AtoCore operations     |
|    - fail-open on network errors                   |
|    - consistent JSON output across all subcommands |
|    - environment-driven configuration              |
|      (ATOCORE_BASE_URL, ATOCORE_TIMEOUT_SECONDS,   |
|       ATOCORE_REFRESH_TIMEOUT_SECONDS,             |
|       ATOCORE_FAIL_OPEN)                           |
+----------------------------------------------------+
                         |
                         | HTTP
                         v
+----------------------------------------------------+
|  AtoCore HTTP API                                  |
|    src/atocore/api/routes.py                       |
|                                                    |
|    - the universal interface to AtoCore            |
|    - everything else above is glue                 |
+----------------------------------------------------+
```
|
||||
|
||||
## The non-negotiable rules
|
||||
|
||||
These rules are what make the layering work.
|
||||
|
||||
### Rule 1 — every per-agent frontend is a thin wrapper
|
||||
|
||||
A per-agent frontend exists to do exactly two things:
|
||||
|
||||
1. **Translate the agent platform's command/skill format** into an
|
||||
invocation of the shared client (or a small sequence of them)
|
||||
2. **Render the JSON response** into whatever shape the agent
|
||||
platform wants (markdown for Claude Code, plaintext for
|
||||
OpenClaw, MCP tool result for an MCP server, etc.)
|
||||
|
||||
Everything else — talking to AtoCore, project detection, retrieval
|
||||
audit, fail-open behavior, configuration — is the **shared
|
||||
client's** job.
|
||||
|
||||
If a per-agent frontend grows logic beyond the two responsibilities
|
||||
above, that logic is in the wrong place. It belongs in the shared
|
||||
client where every other frontend gets to use it.
|
||||
|
||||
### Rule 2 — the shared client never duplicates the API
|
||||
|
||||
The shared client is allowed to **compose** API calls (e.g.
|
||||
`auto-context` calls `detect-project` then `context-build`), but
|
||||
it never reimplements API logic. If a useful operation can't be
|
||||
expressed via the existing API endpoints, the right fix is to
|
||||
extend the API, not to embed the logic in the client.
|
||||
|
||||
This rule keeps the API as the single source of truth for what
|
||||
AtoCore can do.
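
To make "compose, don't reimplement" concrete, here is a hedged sketch of how `auto-context` might chain the two operations. The `request` callable is injected as a stand-in for the client's HTTP helper, and the payload field names (`projects`, `id`, `aliases`, `prompt`, `project`) are assumptions for illustration, not the documented API shapes.

```python
import re
from typing import Callable, Optional

Request = Callable[..., dict]


def detect_project(request: Request, prompt: str) -> Optional[str]:
    """Match a prompt against registered ids and aliases (payload shape assumed)."""
    for project in request("GET", "/projects")["projects"]:
        for name in [project["id"], *project.get("aliases", [])]:
            if re.search(rf"\b{re.escape(name)}\b", prompt, flags=re.IGNORECASE):
                return project["id"]
    return None


def auto_context(request: Request, prompt: str) -> dict:
    """Compose two existing operations; no retrieval logic lives in the client."""
    body = {"prompt": prompt}
    project = detect_project(request, prompt)
    if project is not None:
        body["project"] = project
    return request("POST", "/context/build", body)
```

Everything the function knows about ranking or retrieval comes back from `/context/build`; the client only decides which calls to make and in what order.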
|
||||
|
||||
### Rule 3 — the shared client only exposes stable operations
|
||||
|
||||
A subcommand only makes it into the shared client when:
|
||||
|
||||
- the API endpoint behind it has been exercised by at least one
|
||||
real workflow
|
||||
- the request and response shapes are unlikely to change
|
||||
- the operation is one that more than one frontend will plausibly
|
||||
want
|
||||
|
||||
This rule keeps the client surface stable so frontends don't have
|
||||
to chase changes. New endpoints land in the API first, get
|
||||
exercised in real use, and only then get a client subcommand.
|
||||
|
||||
## What's in scope for the shared client today
|
||||
|
||||
The currently shipped scope (per `scripts/atocore_client.py`):
|
||||
|
||||
### Stable operations (shipped since the client was introduced)
|
||||
|
||||
| Subcommand | Purpose | API endpoint(s) |
|---|---|---|
| `health` | service status, mount + source readiness | `GET /health` |
| `sources` | enabled source roots and their existence | `GET /sources` |
| `stats` | document/chunk/vector counts | `GET /stats` |
| `projects` | registered projects | `GET /projects` |
| `project-template` | starter shape for a new project | `GET /projects/template` |
| `propose-project` | preview a registration | `POST /projects/proposal` |
| `register-project` | persist a registration | `POST /projects/register` |
| `update-project` | update an existing registration | `PUT /projects/{name}` |
| `refresh-project` | re-ingest a project's roots | `POST /projects/{name}/refresh` |
| `project-state` | list trusted state for a project | `GET /project/state/{name}` |
| `project-state-set` | curate trusted state | `POST /project/state` |
| `project-state-invalidate` | supersede trusted state | `DELETE /project/state` |
| `query` | raw retrieval | `POST /query` |
| `context-build` | full context pack | `POST /context/build` |
| `auto-context` | detect-project then context-build | composes `/projects` + `/context/build` |
| `detect-project` | match a prompt to a registered project | composes `/projects` + local regex |
| `audit-query` | retrieval-quality audit with classification | composes `/query` + local labelling |
| `debug-context` | last context pack inspection | `GET /debug/context` |
| `ingest-sources` | ingest configured source dirs | `POST /ingest/sources` |
|
||||
|
||||
### Phase 9 reflection loop (shipped after migration safety work)
|
||||
|
||||
These were explicitly deferred in earlier versions of this doc
|
||||
pending "exercised workflow". The constraint was real — premature
|
||||
API freeze would have made it harder to iterate on the ergonomics —
|
||||
but the deferral ran into a bootstrap problem: you can't exercise
|
||||
the workflow in real Claude Code sessions without a usable client
|
||||
surface to drive it from. The fix is to ship a minimal Phase 9
|
||||
surface now and treat it as stable-but-refinable: adding new
|
||||
optional parameters is fine, renaming subcommands is not.
|
||||
|
||||
| Subcommand | Purpose | API endpoint(s) |
|---|---|---|
| `capture` | record one interaction round-trip | `POST /interactions` |
| `extract` | run the rule-based extractor (preview or persist) | `POST /interactions/{id}/extract` |
| `reinforce-interaction` | backfill reinforcement on an existing interaction | `POST /interactions/{id}/reinforce` |
| `list-interactions` | paginated list with filters | `GET /interactions` |
| `get-interaction` | fetch one interaction by id | `GET /interactions/{id}` |
| `queue` | list the candidate review queue | `GET /memory?status=candidate` |
| `promote` | move a candidate memory to active | `POST /memory/{id}/promote` |
| `reject` | mark a candidate memory invalid | `POST /memory/{id}/reject` |
|
||||
|
||||
All 8 Phase 9 subcommands have test coverage in
|
||||
`tests/test_atocore_client.py` via mocked `request()`, including
|
||||
an end-to-end test that drives the full capture → extract → queue
|
||||
→ promote/reject cycle through the client.
|
||||
|
||||
### Coverage summary
|
||||
|
||||
That covers everything in the "stable operations" set AND the
|
||||
full Phase 9 reflection loop: project lifecycle, ingestion,
|
||||
project-state curation, retrieval, context build,
|
||||
retrieval-quality audit, health and stats inspection, interaction
|
||||
capture, candidate extraction, candidate review queue.
|
||||
|
||||
## What's intentionally NOT in scope today
|
||||
|
||||
Two families of operations remain deferred:
|
||||
|
||||
### 1. Backup and restore admin operations
|
||||
|
||||
Phase 9 Commit B shipped these endpoints:
|
||||
|
||||
- `POST /admin/backup` (with `include_chroma`)
|
||||
- `GET /admin/backup` (list)
|
||||
- `GET /admin/backup/{stamp}/validate`
|
||||
|
||||
The backup endpoints are stable, but the documented operational
|
||||
procedure (`docs/backup-restore-procedure.md`) intentionally uses
|
||||
direct curl rather than the shared client. The reason is that
|
||||
backup operations are *administrative* and benefit from being
|
||||
explicit about which instance they're targeting, with no
|
||||
fail-open behavior. The shared client's fail-open default would
|
||||
hide a real backup failure.
|
||||
|
||||
If we later decide to add backup commands to the shared client,
|
||||
they would set `ATOCORE_FAIL_OPEN=false` for the duration of the
|
||||
call so the operator gets a real error on failure rather than a
|
||||
silent fail-open envelope.
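
The difference between the two modes can be sketched as follows. This is an assumption-laden illustration: the envelope keys (`ok`, `fail_open`, `error`) and the accepted spelling of the env value are hypothetical, and only the `ATOCORE_FAIL_OPEN` variable name comes from the shared client.

```python
import os
from typing import Callable


def call_with_fail_open(operation: Callable[[], dict]) -> dict:
    """Run one client operation, honoring ATOCORE_FAIL_OPEN on connection errors."""
    fail_open = os.environ.get("ATOCORE_FAIL_OPEN", "true").strip().lower() != "false"
    try:
        return operation()
    except OSError as exc:  # urllib.error.URLError and socket errors subclass OSError
        if not fail_open:
            raise  # admin-style calls surface the real failure
        # Envelope shape is illustrative, not the client's documented format.
        return {"ok": False, "fail_open": True, "error": str(exc)}
```

With fail-open enabled, a caller that ignores the envelope cannot tell a failed backup from a successful one, which is the reason the backup procedure stays on direct curl for now.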
|
||||
|
||||
### 2. Engineering layer entity operations
|
||||
|
||||
The engineering layer is in planning, not implementation. When
|
||||
V1 ships per `engineering-v1-acceptance.md`, the shared client
|
||||
will gain entity, relationship, conflict, and Mirror commands.
|
||||
None of those exist as stable contracts yet, so they are not in
|
||||
the shared client today.
|
||||
|
||||
## How a new agent platform integrates
|
||||
|
||||
When a new LLM client needs AtoCore (e.g. Codex, ChatGPT custom
|
||||
GPT, a Cursor extension), the integration recipe is:
|
||||
|
||||
1. **Don't reimplement.** Don't write a new HTTP client. Use the
|
||||
shared client.
|
||||
2. **Write a thin frontend** that translates the platform's
|
||||
command/skill format into a shell call to
|
||||
`python scripts/atocore_client.py <subcommand> <args...>`.
|
||||
3. **Render the JSON response** in the platform's preferred shape.
|
||||
4. **Inherit fail-open and env-var behavior** from the shared
|
||||
client. Don't override unless the platform explicitly needs
|
||||
to (e.g. an admin tool that wants to see real errors).
|
||||
5. **If a needed capability is missing**, propose adding it to
|
||||
the shared client. If the underlying API endpoint also
|
||||
doesn't exist, propose adding it to the API first. Don't
|
||||
add the logic to your frontend.
|
||||
|
||||
The Claude Code slash command in this repo is a worked example:
|
||||
~50 lines of markdown that parse arguments, call the
shared client, and render the result. It contains zero AtoCore
|
||||
business logic of its own.
|
||||
|
||||
## How OpenClaw fits
|
||||
|
||||
OpenClaw's helper skill at `/home/papa/clawd/skills/atocore-context/`
|
||||
on the T420 currently has its own implementation of `auto-context`,
|
||||
`detect-project`, and the project lifecycle commands. It predates
|
||||
this layering doc.
|
||||
|
||||
The right long-term shape is to **refactor the OpenClaw helper to
|
||||
shell out to the shared client** instead of duplicating the
|
||||
routing logic. This isn't urgent because:
|
||||
|
||||
- OpenClaw's helper works today and is in active use
|
||||
- The duplication is on the OpenClaw side; AtoCore itself is not
|
||||
affected
|
||||
- The shared client and the OpenClaw helper are in different
|
||||
repos (AtoCore vs OpenClaw clawd), so the refactor is a
|
||||
cross-repo coordination
|
||||
|
||||
The refactor is queued as a follow-up. Until then, **the OpenClaw
|
||||
helper and the Claude Code slash command are parallel
|
||||
implementations** of the same idea. The shared client is the
|
||||
canonical backbone going forward; new clients should follow the
|
||||
new pattern even though the existing OpenClaw helper still has
|
||||
its own.
|
||||
|
||||
## How this connects to the master plan
|
||||
|
||||
| Layer | Phase home | Status |
|---|---|---|
| AtoCore HTTP API | Phases 0/0.5/1/2/3/5/7/9 | shipped |
| Shared operator client (`scripts/atocore_client.py`) | implicitly Phase 8 (OpenClaw integration) infrastructure | shipped via codex/port-atocore-ops-client merge |
| OpenClaw helper skill (T420) | Phase 8 — partial | shipped (own implementation, refactor queued) |
| Claude Code slash command (this repo) | precursor to Phase 11 (multi-model) | shipped (refactored to use the shared client) |
| Codex skill | Phase 11 | future |
| MCP server | Phase 11 | future |
| Web UI / dashboard | Phase 11+ | future |
|
||||
|
||||
The shared client is the **substrate Phase 11 will build on**.
|
||||
Every new client added in Phase 11 should be a thin frontend on
|
||||
the shared client, not a fresh reimplementation.
|
||||
|
||||
## Versioning and stability
|
||||
|
||||
The shared client's subcommand surface is **stable**. Adding new
|
||||
subcommands is non-breaking. Changing or removing existing
|
||||
subcommands is breaking and would require a coordinated update
|
||||
of every frontend that depends on them.
|
||||
|
||||
The current shared client has no explicit version constant; the
|
||||
implicit contract is "the subcommands and JSON shapes documented
|
||||
in this file". When the client surface meaningfully changes,
|
||||
add a `CLIENT_VERSION = "x.y.z"` constant to
|
||||
`scripts/atocore_client.py` and bump it per semver:
|
||||
|
||||
- patch: bug fixes, no surface change
|
||||
- minor: new subcommands or new optional fields
|
||||
- major: removed subcommands, renamed fields, changed defaults
|
||||
|
||||
## Open follow-ups
|
||||
|
||||
1. **Refactor the OpenClaw helper** to shell out to the shared
|
||||
client. Cross-repo coordination, not blocking anything in
|
||||
AtoCore itself. With the Phase 9 subcommands now in the shared
|
||||
client, the OpenClaw refactor can reuse all the reflection-loop
|
||||
work instead of duplicating it.
|
||||
2. **Real-usage validation of the Phase 9 loop**, now that the
|
||||
client surface exists. First capture → extract → review cycle
|
||||
against the live Dalidou instance, likely via the Claude Code
|
||||
slash command flow. Findings feed back into subcommand
|
||||
refinement (new optional flags are fine, renames require a
|
||||
semver bump).
|
||||
3. **Add backup admin subcommands** if and when we decide the
|
||||
shared client should be the canonical backup operator
|
||||
interface (with fail-open disabled for admin commands).
|
||||
4. **Add engineering-layer entity subcommands** as part of the
|
||||
engineering V1 implementation sprint, per
|
||||
`engineering-v1-acceptance.md`.
|
||||
5. **Tag a `CLIENT_VERSION` constant** the next time the shared
|
||||
client surface meaningfully changes. Today's surface with the
|
||||
Phase 9 loop added is the v0.2.0 baseline (v0.1.0 was the
|
||||
stable-ops-only version).
|
||||
|
||||
## TL;DR
|
||||
|
||||
- AtoCore HTTP API is the universal interface
|
||||
- `scripts/atocore_client.py` is the canonical shared Python
|
||||
backbone for stable AtoCore operations
|
||||
- Per-agent frontends (Claude Code slash command, OpenClaw
|
||||
helper, future Codex skill, future MCP server) are thin
|
||||
wrappers that shell out to the shared client
|
||||
- The shared client today covers project lifecycle, ingestion,
|
||||
retrieval, context build, project-state, retrieval audit, AND
|
||||
the full Phase 9 reflection loop (capture / extract /
|
||||
reinforce / list / get / queue / promote / reject)
|
||||
- Backup admin and engineering-entity commands remain deferred
|
||||
- The OpenClaw helper is currently a parallel implementation and
|
||||
the refactor to the shared client is a queued follow-up
|
||||
- New LLM clients should never reimplement HTTP calls — they
|
||||
follow the shell-out pattern documented here
|
||||
462
docs/architecture/project-identity-canonicalization.md
Normal file
@@ -0,0 +1,462 @@
|
||||
# Project Identity Canonicalization
|
||||
|
||||
## Why this document exists
|
||||
|
||||
AtoCore identifies projects by name in many places: trusted state
|
||||
rows, memories, captured interactions, query/context API parameters,
|
||||
extractor candidates, future engineering entities. Without an
|
||||
explicit rule, every callsite would have to remember to canonicalize
|
||||
project names through the registry — and the recent codex review
|
||||
caught exactly the bug class that follows when one of them forgets.
|
||||
|
||||
The fix landed in `fb6298a` and works correctly today. This document
|
||||
exists to make the rule **explicit and discoverable** so the
|
||||
engineering layer V1 implementation, future entity write paths, and
|
||||
any new agent integration don't reintroduce the same fragmentation
|
||||
when nobody is looking.
|
||||
|
||||
## The contract
|
||||
|
||||
> **Every read/write that takes a project name MUST canonicalize it
|
||||
> through `resolve_project_name()` before the value crosses a service
|
||||
> boundary.**
|
||||
|
||||
The boundary is wherever a project name becomes a database row, a
|
||||
query filter, an attribute on a stored object, or a key for any
|
||||
lookup. The canonicalization happens **once**, at that boundary,
|
||||
before the underlying storage primitive is called.
|
||||
|
||||
Symbolically:
|
||||
|
||||
```
HTTP layer (raw user input)
        ↓
service entry point
        ↓
project_name = resolve_project_name(project_name)   ← ONLY canonical from this point
        ↓
storage / queries / further service calls
```
|
||||
|
||||
The rule is intentionally simple. There's no per-call exception,
|
||||
no "trust me, the caller already canonicalized it" shortcut, no
|
||||
opt-out flag. Every service-layer entry point applies the helper
|
||||
the moment it receives a project name from outside the service.
|
||||
|
||||
## The helper
|
||||
|
||||
```python
# src/atocore/projects/registry.py

def resolve_project_name(name: str | None) -> str:
    """Canonicalize a project name through the registry.

    Returns the canonical project_id if the input matches any
    registered project's id or alias. Returns the input unchanged
    when it's empty or not in the registry — the second case keeps
    backwards compatibility with hand-curated state, memories, and
    interactions that predate the registry, or for projects that
    are intentionally not registered.
    """
    if not name:
        return name or ""
    project = get_registered_project(name)
    if project is not None:
        return project.project_id
    return name
```
|
||||
|
||||
Three behaviors worth keeping in mind:
|
||||
|
||||
1. **Empty / None input → empty string output.** Callers don't have
|
||||
to pre-check; passing `""` or `None` to a query filter still
|
||||
works as "no project scope".
|
||||
2. **Registered alias → canonical project_id.** The helper does the
|
||||
case-insensitive lookup and returns the project's `project_id`
|
||||
(e.g. `"p05" → "p05-interferometer"`).
|
||||
3. **Unregistered name → input unchanged.** This is the
|
||||
backwards-compatibility path. Hand-curated state, memories, or
|
||||
interactions created under a name that isn't in the registry
|
||||
keep working. The retrieval is then "best effort" — the raw
|
||||
string is used as the SQL key, which still finds the row that
|
||||
was stored under the same raw string. This path exists so the
|
||||
engineering layer V1 doesn't have to also be a data migration.
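
The three behaviors can be exercised with a small stand-in registry. Everything except `resolve_project_name` itself is hypothetical here: the `RegisteredProject` dataclass, the `_REGISTRY` list, and this `get_registered_project` are simplified substitutes for the real registry, included only so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RegisteredProject:
    project_id: str
    aliases: tuple[str, ...] = ()


# Stand-in registry for illustration only.
_REGISTRY = [RegisteredProject("p05-interferometer", ("p05", "interferometer"))]


def get_registered_project(name: str) -> Optional[RegisteredProject]:
    """Case-insensitive match on canonical id or alias."""
    wanted = name.lower()
    for project in _REGISTRY:
        if wanted == project.project_id.lower() or wanted in (a.lower() for a in project.aliases):
            return project
    return None


def resolve_project_name(name: Optional[str]) -> str:
    if not name:
        return name or ""
    project = get_registered_project(name)
    return project.project_id if project is not None else name


print(repr(resolve_project_name(None)))        # behavior 1: empty string
print(resolve_project_name("P05"))             # behavior 2: canonical id
print(resolve_project_name("orphan-project"))  # behavior 3: unchanged
```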
|
||||
|
||||
## Where the helper is currently called
|
||||
|
||||
As of `fb6298a`, the helper is invoked at exactly these eight
|
||||
service-layer entry points:
|
||||
|
||||
| Module | Function | What gets canonicalized |
|---|---|---|
| `src/atocore/context/builder.py` | `build_context` | the `project_hint` parameter, before the trusted state lookup |
| `src/atocore/context/project_state.py` | `set_state` | `project_name`, before `ensure_project()` |
| `src/atocore/context/project_state.py` | `get_state` | `project_name`, before the SQL lookup |
| `src/atocore/context/project_state.py` | `invalidate_state` | `project_name`, before the SQL lookup |
| `src/atocore/interactions/service.py` | `record_interaction` | `project`, before insert |
| `src/atocore/interactions/service.py` | `list_interactions` | `project` filter parameter, before WHERE clause |
| `src/atocore/memory/service.py` | `create_memory` | `project`, before insert |
| `src/atocore/memory/service.py` | `get_memories` | `project` filter parameter, before WHERE clause |
|
||||
|
||||
Every one of those is the **first** thing the function does after
|
||||
input validation. There is no path through any of those eight
|
||||
functions where a project name reaches storage without passing
|
||||
through `resolve_project_name`.
|
||||
|
||||
## Where the helper is NOT called (and why that's correct)
|
||||
|
||||
These places intentionally do not canonicalize:
|
||||
|
||||
1. **`update_memory`'s project field.** The API does not allow
|
||||
changing a memory's project after creation, so there's no
|
||||
project to canonicalize. The function only updates `content`,
|
||||
`confidence`, and `status`.
|
||||
2. **The retriever's `_project_match_boost` substring matcher.** It
|
||||
already calls `get_registered_project` internally to expand the
|
||||
hint into the candidate set (canonical id + all aliases + last
|
||||
path segments). It accepts the raw hint by design.
|
||||
3. **`_rank_chunks`'s secondary substring boost in
|
||||
`builder.py`.** Still uses the raw hint. This is a multiplicative
|
||||
factor on top of correct retrieval, not a filter, so it cannot
|
||||
drop relevant chunks. Tracked as a future cleanup but not
|
||||
critical.
|
||||
4. **Direct SQL queries for the projects table itself** (e.g.
|
||||
`ensure_project`'s lookup). These are intentional case-insensitive
|
||||
raw lookups against the column the canonical id is stored in.
|
||||
`set_state` already canonicalized before reaching `ensure_project`,
|
||||
so the value passed is the canonical id by definition.
|
||||
5. **Hand-authored project names that aren't in the registry.**
|
||||
The helper returns those unchanged. This is the backwards-compat
|
||||
path mentioned above; it is *not* a violation of the rule, it's
|
||||
the rule applied to a name with no registry record.
|
||||
|
||||
## Why this is the trust hierarchy in action
|
||||
|
||||
The whole point of AtoCore is the trust hierarchy from the operating
|
||||
model:
|
||||
|
||||
1. Trusted Project State (Layer 3) is the most authoritative layer
|
||||
2. Memories (active) are second
|
||||
3. Source chunks (raw retrieved content) are last
|
||||
|
||||
If a caller passes the alias `p05` and Layer 3 was written under
|
||||
`p05-interferometer`, and the lookup fails to find the canonical
|
||||
row, **the trust hierarchy collapses**. The most-authoritative
|
||||
layer is silently invisible to the caller. The system would still
|
||||
return *something* — namely, lower-trust retrieved chunks — and the
|
||||
human would never know they got a degraded answer.
|
||||
|
||||
The canonicalization helper is what makes the trust hierarchy
|
||||
**dependable**. Layer 3 is supposed to win every time. To win it
|
||||
has to be findable. To be findable, the lookup key has to match
|
||||
how the row was stored. And the only way to guarantee that match
|
||||
across every entry point is to canonicalize at every boundary.
|
||||
|
||||
## Compatibility gap: legacy alias-keyed rows
|
||||
|
||||
The canonicalization rule fixes new writes going forward, but it
|
||||
does NOT fix rows that were already written under a registered
|
||||
alias before `fb6298a` landed. Those rows have a real, concrete
|
||||
gap that must be closed by a one-time migration before the
|
||||
engineering layer V1 ships.
|
||||
|
||||
The exact failure mode:
|
||||
|
||||
```
time T0 (before fb6298a):
  POST /project/state {project: "p05", ...}
    -> set_state("p05", ...)          # no canonicalization
    -> ensure_project("p05")          # creates a "p05" row
    -> writes state with project_id pointing at the "p05" row

time T1 (after fb6298a):
  POST /project/state {project: "p05", ...}  (or any read)
    -> set_state("p05", ...)
    -> resolve_project_name("p05") -> "p05-interferometer"
    -> ensure_project("p05-interferometer")  # creates a SECOND row
    -> writes new state under the canonical row
    -> the T0 state is still in the "p05" row, INVISIBLE to every
       canonicalized read
```
|
||||
|
||||
The unregistered-name fallback path saves you when the project was
|
||||
never in the registry: a row stored under `"orphan-project"` is read
|
||||
back via `"orphan-project"`, both pass through `resolve_project_name`
|
||||
unchanged, and the strings line up. **It does not save you when the
|
||||
name is a registered alias** — the helper rewrites the read key but
|
||||
not the storage key, and the legacy row becomes invisible.
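
The failure mode can be reproduced in an in-memory SQLite database. The two-table schema below is a deliberately simplified assumption, not the live AtoCore schema, and `ensure_project` and `get_state` are minimal stand-ins for the real service functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE project_state (project_id INTEGER, key TEXT, value TEXT);
    """
)


def ensure_project(name: str) -> int:
    """Create the projects row if missing and return its id."""
    conn.execute("INSERT OR IGNORE INTO projects (name) VALUES (?)", (name,))
    return conn.execute("SELECT id FROM projects WHERE name = ?", (name,)).fetchone()[0]


def get_state(name: str) -> list[tuple[str, str]]:
    """Read state rows for exactly the project name given."""
    return conn.execute(
        "SELECT key, value FROM project_state ps JOIN projects p ON ps.project_id = p.id "
        "WHERE p.name = ? ORDER BY key",
        (name,),
    ).fetchall()


# T0: written under the raw alias, before canonicalization existed.
conn.execute("INSERT INTO project_state VALUES (?, ?, ?)",
             (ensure_project("p05"), "next_focus", "legacy value"))
# T1: the same caller input now resolves to the canonical id first.
conn.execute("INSERT INTO project_state VALUES (?, ?, ?)",
             (ensure_project("p05-interferometer"), "status", "new value"))

canonical_rows = get_state("p05-interferometer")  # what every canonicalized read sees
shadow_rows = get_state("p05")                    # still stored, no longer reachable
```

After both writes the `projects` table holds two rows for what is logically one project, and the legacy state is only reachable by querying the alias directly.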
|
||||
|
||||
What is at risk on the live Dalidou DB:
|
||||
|
||||
1. **`projects` table**: any rows whose `name` column matches a
registered alias (one row for each alias that was actually used
as a write key before the fix landed). These shadow the canonical
project row and silently fragment the projects namespace.
|
||||
2. **`project_state` table**: any rows whose `project_id` points
|
||||
at one of those shadow project rows. **This is the highest-risk
|
||||
case** because it directly defeats the trust hierarchy: Layer 3
|
||||
trusted state becomes invisible to every canonicalized lookup.
|
||||
3. **`memories` table**: any rows whose `project` column is a
|
||||
registered alias. Reinforcement and extraction queries will
|
||||
miss them.
|
||||
4. **`interactions` table**: any rows whose `project` column is a
|
||||
registered alias. Listing and downstream reflection will miss
|
||||
them.
|
||||
|
||||
How to find out the actual blast radius on the live Dalidou DB:
|
||||
|
||||
```sql
-- inspect the projects table for alias-shadow rows
SELECT id, name FROM projects;

-- count alias-keyed memories per known alias
SELECT project, COUNT(*) FROM memories
WHERE project IN ('p04','p05','p06','gigabit','interferometer','polisher','ato core')
GROUP BY project;

-- count alias-keyed interactions
SELECT project, COUNT(*) FROM interactions
WHERE project IN ('p04','p05','p06','gigabit','interferometer','polisher','ato core')
GROUP BY project;

-- count alias-shadowed project_state rows by project name
SELECT p.name, COUNT(*) FROM project_state ps
JOIN projects p ON ps.project_id = p.id
WHERE p.name IN ('p04','p05','p06','gigabit','interferometer','polisher','ato core')
GROUP BY p.name;
```
|
||||
|
||||
The migration that closes the gap has to:
|
||||
|
||||
1. For each registered project, find all `projects` rows whose
|
||||
name matches one of the project's aliases AND is not the
|
||||
canonical id itself. These are the "shadow" rows.
|
||||
2. For each shadow row, MERGE its dependent state into the
|
||||
canonical project's row:
|
||||
- rekey `project_state.project_id` from shadow → canonical
|
||||
- if the merge would create a `(project_id, category, key)`
|
||||
collision (a state row already exists under the canonical
|
||||
id with the same category+key), the migration must surface
|
||||
the conflict via the existing conflict model and pause
|
||||
until the human resolves it
|
||||
- delete the now-empty shadow `projects` row
|
||||
3. For `memories` and `interactions`, the fix is simpler because
|
||||
the alias appears as a string column (not a foreign key):
|
||||
`UPDATE memories SET project = canonical WHERE project = alias`,
|
||||
then same for interactions.
|
||||
4. The migration must run in dry-run mode first, printing the
|
||||
exact rows it would touch and the canonical destinations they
|
||||
would be merged into.
|
||||
5. The migration must be idempotent — running it twice produces
|
||||
the same final state as running it once.
|
||||
|
||||
This work is **required before the engineering layer V1 ships**
|
||||
because V1 will add new `entities`, `relationships`, `conflicts`,
|
||||
and `mirror_regeneration_failures` tables that all key on the
|
||||
canonical project id. Any leaked alias-keyed rows in the existing
|
||||
tables would show up in V1 reads as silently missing data, and
|
||||
the killer-correctness queries from `engineering-query-catalog.md`
|
||||
(orphan requirements, decisions on flagged assumptions,
|
||||
unsupported claims) would report wrong results against any project
|
||||
that has shadow rows.
|
||||
|
||||
The migration script does NOT exist yet. The open follow-ups
|
||||
section below tracks it as the next concrete step.
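Until the real script exists, the five migration requirements above can be sketched roughly as follows. Everything here is an assumption made for illustration: that the database is reachable through Python's `sqlite3` module, that the columns are the ones used in the blast-radius queries (`projects.id`, `projects.name`, `project_state.project_id`, `category`, `key`, and a `project` string column on `memories` and `interactions`), and that the registry is available as a plain dict. The function name `migrate_alias_rows` is hypothetical. Collisions are only collected and returned; routing them through the conflict model is left out.

```python
import sqlite3


def migrate_alias_rows(conn, registry, dry_run=True):
    """Merge alias-keyed rows into their canonical project.

    registry maps canonical id -> list of aliases. Returns
    (actions, conflicts). Nothing is written when dry_run is True.
    """
    actions, conflicts = [], []
    cur = conn.cursor()
    for canonical, aliases in registry.items():
        for alias in aliases:
            if alias == canonical:
                continue
            # String-keyed tables: a plain rekey is enough.
            for table in ("memories", "interactions"):
                n = cur.execute(
                    f"SELECT COUNT(*) FROM {table} WHERE project = ?", (alias,)
                ).fetchone()[0]
                if n:
                    actions.append(f"{table}: {n} row(s) {alias} -> {canonical}")
                    if not dry_run:
                        cur.execute(
                            f"UPDATE {table} SET project = ? WHERE project = ?",
                            (canonical, alias),
                        )
            shadow = cur.execute(
                "SELECT id FROM projects WHERE name = ?", (alias,)
            ).fetchone()
            if shadow is None:
                continue  # nothing left to merge: this is what makes reruns safe
            shadow_id = shadow[0]
            canon = cur.execute(
                "SELECT id FROM projects WHERE name = ?", (canonical,)
            ).fetchone()
            if canon is None:
                # No canonical row yet: renaming the shadow row is the merge.
                actions.append(f"projects: rename {alias} -> {canonical}")
                if not dry_run:
                    cur.execute(
                        "UPDATE projects SET name = ? WHERE id = ?",
                        (canonical, shadow_id),
                    )
                continue
            canon_id = canon[0]
            # A (category, key) pair present under both ids is a collision.
            clashes = cur.execute(
                "SELECT s.category, s.key FROM project_state s "
                "JOIN project_state c "
                "ON c.category = s.category AND c.key = s.key "
                "WHERE s.project_id = ? AND c.project_id = ?",
                (shadow_id, canon_id),
            ).fetchall()
            if clashes:
                conflicts.extend((alias, cat, key) for cat, key in clashes)
                continue  # leave the shadow row untouched for a human
            actions.append(f"project_state: rekey {alias} -> {canonical}")
            if not dry_run:
                cur.execute(
                    "UPDATE project_state SET project_id = ? WHERE project_id = ?",
                    (canon_id, shadow_id),
                )
                cur.execute("DELETE FROM projects WHERE id = ?", (shadow_id,))
    if not dry_run:
        conn.commit()
    return actions, conflicts
```

A supervised run would call it once with `dry_run=True`, review the returned `actions` and `conflicts`, and only then call it again with `dry_run=False`. A second real run finds no shadow rows and returns two empty lists.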
|
||||
|
||||
## The rule for new entry points
|
||||
|
||||
When you add a new service-layer function that takes a project name,
|
||||
follow this checklist:
|
||||
|
||||
1. **Does the function read or write a row keyed by project?** If
|
||||
yes, you must call `resolve_project_name`. If no (e.g. it only
|
||||
takes `project` as a label for logging), you may skip the
|
||||
canonicalization but you should add a comment explaining why.
|
||||
2. **Where does the canonicalization go?** As the first statement
|
||||
after input validation. Not later, not "before storage", not
|
||||
"in the helper that does the actual write". As the first
|
||||
statement, so any subsequent service call inside the function
|
||||
sees the canonical value.
|
||||
3. **Add a regression test that uses an alias.** Use the
|
||||
`project_registry` fixture from `tests/conftest.py` to set up
|
||||
a temp registry with at least one project + aliases, then
|
||||
verify the new function works when called with the alias and
|
||||
when called with the canonical id.
|
||||
4. **If the function can be called with `None` or empty string,
|
||||
verify that path too.** The helper handles it correctly but
|
||||
the function-under-test might not.
|
||||
|
||||
## How the `project_registry` test fixture works
|
||||
|
||||
`tests/conftest.py::project_registry` returns a callable that
|
||||
takes one or more `(project_id, [aliases])` tuples (or just a bare
|
||||
`project_id` string), writes them into a temp registry file,
|
||||
points `ATOCORE_PROJECT_REGISTRY_PATH` at it, and reloads
|
||||
`config.settings`. Use it like:
|
||||
|
||||
```python
def test_my_new_thing_canonicalizes(project_registry):
    project_registry(("p05-interferometer", ["p05", "interferometer"]))

    # ... call your service function with "p05" ...
    # ... assert it works the same as if you'd passed "p05-interferometer" ...
```
|
||||
|
||||
The fixture is reused by all 12 alias-canonicalization regression
|
||||
tests added in `fb6298a`. Following the same pattern for new
|
||||
features is the cheapest way to keep the contract intact.
|
||||
|
||||
## What this rule does NOT cover
|
||||
|
||||
1. **Alias creation / management.** This document is about reading
|
||||
and writing project-keyed data. Adding new projects or new
|
||||
aliases is the registry's own write path
|
||||
(`POST /projects/register`, `PUT /projects/{name}`), which
|
||||
already enforces collision detection and atomic file writes.
|
||||
2. **Registry hot-reloading.** The helper calls
|
||||
`load_project_registry()` on every invocation, which reads the
|
||||
JSON file each time. There is no in-process cache. If the
|
||||
registry file changes, the next call sees the new contents.
|
||||
Performance is fine for the current registry size but if it
|
||||
becomes a bottleneck, add a versioned cache here, not at every
|
||||
call site.
|
||||
3. **Cross-project deduplication.** If two different projects in
|
||||
the registry happen to share an alias, the registry's collision
|
||||
detection blocks the second one at registration time, so this
|
||||
case can't arise in practice. The helper does not handle it
|
||||
defensively.
|
||||
4. **Time-bounded canonicalization.** A project's canonical id is
|
||||
stable. Aliases can be added or removed via
|
||||
`PUT /projects/{name}`, but the canonical `id` field never
|
||||
changes after registration. So a row written today under the
|
||||
canonical id will always remain findable under that id, even
|
||||
if the alias set evolves.
|
||||
5. **Migration of legacy data.** If the live Dalidou DB has rows
|
||||
that were written under aliases before the canonicalization
|
||||
landed (e.g. a `memories` row with `project = "p05"` from
|
||||
before `fb6298a`), those rows are **NOT** automatically
|
||||
reachable from the canonicalized read path. The unregistered-
|
||||
name fallback only helps for project names that were never
|
||||
registered at all; it does **NOT** help for names that are
|
||||
registered as aliases. See the "Compatibility gap" section
|
||||
above for the exact failure mode and the migration path that
|
||||
has to run before the engineering layer V1 ships.
|
||||
|
||||
## What this enables for the engineering layer V1
|
||||
|
||||
When the engineering layer ships per `engineering-v1-acceptance.md`,
|
||||
it adds at least these new project-keyed surfaces:
|
||||
|
||||
- `entities` table with a `project_id` column
|
||||
- `relationships` table that joins entities, indirectly project-keyed
|
||||
- `conflicts` table with a `project` column
|
||||
- `mirror_regeneration_failures` table with a `project` column
|
||||
- new endpoints: `POST /entities/...`, `POST /ingest/kb-cad/export`,
|
||||
`POST /ingest/kb-fem/export`, `GET /mirror/{project}/...`,
|
||||
`GET /conflicts?project=...`
|
||||
|
||||
**Every one of those write/read paths needs to call
|
||||
`resolve_project_name` at its service-layer entry point**, following
|
||||
the same pattern as the eight existing call sites listed above. The
|
||||
implementation sprint should:
|
||||
|
||||
1. Apply the helper at each new service entry point as the first
|
||||
statement after input validation
|
||||
2. Add a regression test using the `project_registry` fixture that
|
||||
exercises an alias against each new entry point
|
||||
3. Treat any new service function that takes a project name without
|
||||
calling `resolve_project_name` as a code review failure
|
||||
|
||||
The pattern is simple enough to follow without thinking, which is
|
||||
exactly the property we want for a contract that has to hold
|
||||
across many independent additions.
|
||||
|
||||
## Open follow-ups
|
||||
|
||||
These are things the canonicalization story still has open. Only
the first is a blocker (for engineering V1); the rest are rough
edges to be aware of.
|
||||
|
||||
1. **Legacy alias data migration — REQUIRED before engineering V1
|
||||
ships, NOT optional.** If the live Dalidou DB has any rows
|
||||
written under aliases before `fb6298a` landed, they are
|
||||
silently invisible to the canonicalized read path (see the
|
||||
"Compatibility gap" section above for the exact failure mode).
|
||||
This is a real correctness issue, not a theoretical one: any
|
||||
trusted state, memory, or interaction stored under `p05`,
|
||||
`gigabit`, `polisher`, etc. before the fix landed is currently
|
||||
unreachable from any service-layer query. The migration script
|
||||
has to walk `projects`, `project_state`, `memories`, and
|
||||
`interactions`, merge shadow rows into their canonical
|
||||
counterparts (with conflict-model handling for any collisions),
|
||||
and run in dry-run mode first. Estimated cost: ~150 LOC for
|
||||
the migration script + ~50 LOC of tests + a one-time supervised
|
||||
run on the live Dalidou DB. **This migration is the next
|
||||
concrete pre-V1 step.**
|
||||
2. **Registry file caching.** `load_project_registry()` reads the
|
||||
JSON file on every `resolve_project_name` call. With ~5
|
||||
projects this is fine; with 50+ it would warrant a versioned
|
||||
cache (cache key = file mtime + size). Defer until measured.
|
||||
3. **Case sensitivity audit.** The helper uses
|
||||
`get_registered_project` which lowercases for comparison. The
|
||||
stored canonical id keeps its original casing. No known bug today
(the current tests pass), but worth re-confirming when the
engineering layer adds entity-side storage.
|
||||
4. **`_rank_chunks`'s secondary substring boost.** Mentioned
|
||||
earlier; still uses the raw hint. Replace it with the same
|
||||
helper-driven approach the retriever uses, OR delete it as
|
||||
redundant once we confirm the retriever's primary boost is
|
||||
sufficient.
|
||||
5. **Documentation discoverability.** This doc lives under
|
||||
`docs/architecture/`. The contract is also restated in the
|
||||
docstring of `resolve_project_name` and referenced from each
|
||||
call site's comment. That redundancy is intentional — the
|
||||
contract is too easy to forget to live in only one place.
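For follow-up 2, a versioned cache keyed on file mtime and size could look roughly like the sketch below. It is not the current behavior of `load_project_registry()`, which re-reads the file on every call. The function name `load_registry_cached` and the module-level `_cache` dict are hypothetical, and the registry is assumed to be a single JSON file.

```python
import json
import os

# One cached snapshot; the key changes whenever the file is rewritten.
_cache = {"key": None, "data": None}


def load_registry_cached(path):
    """Re-read the registry JSON only when (path, mtime, size) changes."""
    st = os.stat(path)
    key = (path, st.st_mtime_ns, st.st_size)
    if _cache["key"] != key:
        with open(path, "r", encoding="utf-8") as fh:
            _cache["data"] = json.load(fh)
        _cache["key"] = key
    return _cache["data"]
```

The cost per call drops to one `os.stat`, and the cache lives in one place rather than at every call site. A rewrite that leaves both mtime and size unchanged would go unnoticed, which is one more reason to defer this until a measurement asks for it.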
|
||||
|
||||
## Quick reference card
|
||||
|
||||
Copy-pasteable for new service functions:
|
||||
|
||||
```python
from atocore.projects.registry import resolve_project_name


def my_new_service_entry_point(
    project_name: str,
    other_args: ...,
) -> ...:
    # Validate inputs first
    if not project_name:
        raise ValueError("project_name is required")

    # Canonicalize through the registry as the first thing after
    # validation. Every subsequent operation in this function uses
    # the canonical id, so storage and queries are guaranteed
    # consistent across alias and canonical-id callers.
    project_name = resolve_project_name(project_name)

    # ... rest of the function ...
```
|
||||
|
||||
## TL;DR
|
||||
|
||||
- One helper, one rule: `resolve_project_name` at every service-layer
|
||||
entry point that takes a project name
|
||||
- Currently called in 8 places across builder, project_state,
|
||||
interactions, and memory; all 8 listed in this doc
|
||||
- Backwards-compat path returns **unregistered** names unchanged
|
||||
(e.g. `"orphan-project"`); this does NOT cover **registered
|
||||
alias** names that were used as storage keys before `fb6298a`
|
||||
- **Real compatibility gap**: any row whose `project` column is a
|
||||
registered alias from before the canonicalization landed is
|
||||
silently invisible to the new read path. A one-time migration
|
||||
is required before engineering V1 ships. See the "Compatibility
|
||||
gap" section.
|
||||
- The trust hierarchy depends on this helper being applied
|
||||
everywhere — Layer 3 trusted state has to be findable for it to
|
||||
win the trust battle
|
||||
- Use the `project_registry` test fixture to add regression tests
|
||||
for any new service function that takes a project name
|
||||
- The engineering layer V1 implementation must follow the same
|
||||
pattern at every new service entry point
|
||||
- Open follow-ups (in priority order): **legacy alias data
|
||||
migration (required pre-V1)**, redundant substring boost
|
||||
cleanup, registry caching when projects scale
|
||||
273
docs/architecture/representation-authority.md
Normal file
@@ -0,0 +1,273 @@
|
||||
# Representation Authority (canonical home matrix)
|
||||
|
||||
## Why this document exists
|
||||
|
||||
The same fact about an engineering project can show up in many
|
||||
places: a markdown note in the PKM, a structured field in KB-CAD,
|
||||
a commit message in a Gitea repo, an active memory in AtoCore, an
|
||||
entity in the engineering layer, a row in trusted project state.
|
||||
**Without an explicit rule about which representation is
|
||||
authoritative for which kind of fact, the system will accumulate
|
||||
contradictions and the human will lose trust in all of them.**
|
||||
|
||||
This document is the canonical-home matrix. Every kind of fact
|
||||
that AtoCore handles has exactly one authoritative representation,
|
||||
and every other place that holds a copy of that fact is, by
|
||||
definition, a derived view that may be stale.
|
||||
|
||||
## The representations in scope
|
||||
|
||||
Six places where facts can live in this ecosystem:
|
||||
|
||||
| Layer | What it is | Who edits it | How it's structured |
|
||||
|---|---|---|---|
|
||||
| **PKM** | Antoine's Obsidian-style markdown vault under `/srv/storage/atocore/sources/vault/` | Antoine, by hand | unstructured markdown with optional frontmatter |
|
||||
| **KB project** | the engineering Knowledge Base (KB-CAD / KB-FEM repos and any companion docs) | Antoine, semi-structured | per-tool typed records |
|
||||
| **Gitea repos** | source code repos under `dalidou:3000/Antoine/*` (Fullum-Interferometer, polisher-sim, ATOCore itself, ...) | Antoine via git commits | code, READMEs, repo-specific markdown |
|
||||
| **AtoCore memories** | rows in the `memories` table | hand-authored or extracted from interactions | typed (identity / preference / project / episodic / knowledge / adaptation) |
|
||||
| **AtoCore entities** | rows in the `entities` table (V1, not yet built) | imported from KB exports or extracted from interactions | typed entities + relationships per the V1 ontology |
|
||||
| **AtoCore project state** | rows in the `project_state` table (Layer 3, trusted) | hand-curated only, never automatic | category + key + value |
|
||||
|
||||
## The canonical home rule
|
||||
|
||||
> For each kind of fact, exactly one of the six representations is
|
||||
> the authoritative source. The other five may hold derived
|
||||
> copies, but they are not allowed to disagree with the
|
||||
> authoritative one. When they disagree, the disagreement is a
|
||||
> conflict and surfaces via the conflict model.
|
||||
|
||||
The matrix below assigns the authoritative representation per fact
|
||||
kind. It is the practical answer to the question "where does this
|
||||
fact actually live?" for daily decisions.
|
||||
|
||||
## The canonical-home matrix
|
||||
|
||||
| Fact kind | Canonical home | Why | How it gets into AtoCore |
|
||||
|---|---|---|---|
|
||||
| **CAD geometry** (the actual model) | NX (or successor CAD tool) | the only place that can render and validate it | not in AtoCore at all in V1 |
|
||||
| **CAD-side structure** (subsystem tree, component list, materials, parameters) | KB-CAD | KB-CAD is the structured wrapper around NX | KB-CAD export → `/ingest/kb-cad/export` → entities |
|
||||
| **FEM mesh & solver settings** | KB-FEM (wrapping the FEM tool) | only the solver representation can run | not in AtoCore at all in V1 |
|
||||
| **FEM results & validation outcomes** | KB-FEM | KB-FEM owns the outcome records | KB-FEM export → `/ingest/kb-fem/export` → entities |
|
||||
| **Source code** | Gitea repos | repos are version-controlled and reviewable | indirectly via repo markdown ingestion (Phase 1) |
|
||||
| **Repo-level documentation** (READMEs, design docs in the repo) | Gitea repos | lives next to the code it documents | ingested as source chunks; never hand-edited in AtoCore |
|
||||
| **Project-level prose notes** (decisions in long-form, journal-style entries, working notes) | PKM | the place Antoine actually writes when thinking | ingested as source chunks; the extractor proposes candidates from these for the review queue |
|
||||
| **Identity** ("the user is a mechanical engineer running AtoCore") | AtoCore memories (`identity` type) | nowhere else holds personal identity | hand-authored via `POST /memory` or extracted from interactions |
|
||||
| **Preference** ("prefers small reviewable diffs", "uses SI units") | AtoCore memories (`preference` type) | nowhere else holds personal preferences | hand-authored or extracted |
|
||||
| **Episodic** ("on April 6 we debugged the EXDEV bug") | AtoCore memories (`episodic` type) | nowhere else has time-bound personal recall | extracted from captured interactions |
|
||||
| **Decision** (a structured engineering decision) | AtoCore **entities** (Decision) once the engineering layer ships; AtoCore memories (`adaptation`) until then | needs structured supersession, audit trail, and link to affected components | extracted from PKM or interactions; promoted via review queue |
|
||||
| **Requirement** | AtoCore **entities** (Requirement) | needs structured satisfaction tracking | extracted from PKM, KB-CAD, or interactions |
|
||||
| **Constraint** | AtoCore **entities** (Constraint) | needs structured link to the entity it constrains | extracted from PKM, KB-CAD, or interactions |
|
||||
| **Validation claim** | AtoCore **entities** (ValidationClaim) | needs structured link to supporting Result | extracted from KB-FEM exports or interactions |
|
||||
| **Material** | KB-CAD if the material is on a real component; AtoCore entity (Material) if it's a project-wide material decision not yet attached to geometry | structured properties live in KB-CAD's material database | KB-CAD export, or hand-authored as a Material entity |
|
||||
| **Parameter** | KB-CAD or KB-FEM depending on whether it's a geometry or solver parameter; AtoCore entity (Parameter) if it's a higher-level project parameter not in either tool | structured numeric values with units live in their tool of origin | KB export, or hand-authored |
|
||||
| **Project status / current focus / next milestone** | AtoCore **project_state** (Layer 3) | the trust hierarchy says trusted state is the highest authority for "what is the current state of the project" | hand-curated via `POST /project/state` |
|
||||
| **Architectural decision records (ADRs)** | depends on form: long-form ADR markdown lives in the repo; the structured fact about which ADR was selected lives in the AtoCore Decision entity | both representations are useful for different audiences | repo ingestion provides the prose; the entity is created by extraction or hand-authored |
|
||||
| **Operational runbooks** | repo (next to the code they describe) | lives with the system it operates | not promoted into AtoCore entities — runbooks are reference material, not facts |
|
||||
| **Backup metadata** (snapshot timestamps, integrity status) | the backup-metadata.json files under `/srv/storage/atocore/backups/` | each snapshot is its own self-describing record | not in AtoCore's database; queried via the `/admin/backup` endpoints |
|
||||
| **Conversation history with AtoCore (interactions)** | AtoCore `interactions` table | nowhere else has the prompt + context pack + response triple | written by capture (Phase 9 Commit A) |
|
||||
|
||||
## The supremacy rule for cross-layer facts
|
||||
|
||||
When the same fact has copies in multiple representations and they
|
||||
disagree, the trust hierarchy applies in this order:
|
||||
|
||||
1. **AtoCore project_state** (Layer 3) is highest authority for any
|
||||
"current state of the project" question. This is why it requires
|
||||
manual curation and never gets touched by automatic processes.
|
||||
2. **The tool-of-origin canonical home** is highest authority for
|
||||
facts that are tool-managed: KB-CAD wins over AtoCore entities
|
||||
for CAD-side structure facts; KB-FEM wins for FEM result facts.
|
||||
3. **AtoCore entities** are highest authority for facts that are
|
||||
AtoCore-managed: Decisions, Requirements, Constraints,
|
||||
ValidationClaims (when the supporting Results are still loose).
|
||||
4. **Active AtoCore memories** are highest authority for personal
|
||||
facts (identity, preference, episodic).
|
||||
5. **Source chunks (PKM, repos, ingested docs)** are lowest
|
||||
authority — they are the raw substrate from which higher layers
|
||||
are extracted, but they may be stale or contradict one
another.
|
||||
|
||||
This is the same hierarchy enforced by `conflict-model.md`. This
|
||||
document just makes it explicit per fact kind.
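As a rough illustration of the ordering above, the sketch below ranks the copies of one fact and reports which lower-ranked copies disagree with the winner. It flattens the hierarchy into a single list, which is a simplification: in the rule above, each level is authoritative only for its own kind of fact. The layer labels and the `pick_winner` function are hypothetical names, not part of AtoCore.

```python
# Rank order taken from the list above; earlier entries win.
TRUST_ORDER = [
    "project_state",   # Layer 3, hand-curated
    "tool_of_origin",  # KB-CAD / KB-FEM
    "entity",
    "memory",
    "source_chunk",    # PKM, repos, ingested docs
]
RANK = {layer: i for i, layer in enumerate(TRUST_ORDER)}


def pick_winner(copies):
    """copies: list of (layer, value) pairs for the same fact.

    Returns (winning copy, list of lower-ranked copies that disagree).
    """
    if not copies:
        return None, []
    ordered = sorted(copies, key=lambda c: RANK[c[0]])
    winner = ordered[0]
    disagreeing = [c for c in ordered[1:] if c[1] != winner[1]]
    return winner, disagreeing
```

The second return value is the part that matters operationally: each disagreeing copy is a candidate for the conflict queue rather than something to overwrite.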
|
||||
|
||||
## Examples
|
||||
|
||||
### Example 1 — "what material does the lateral support pad use?"
|
||||
|
||||
Possible representations:
|
||||
|
||||
- KB-CAD has the field `component.lateral-support-pad.material = "GF-PTFE"`
|
||||
- A PKM note from last month says "considering PEEK for the
|
||||
lateral support, GF-PTFE was the previous choice"
|
||||
- An AtoCore Material entity says `GF-PTFE`
|
||||
- An AtoCore project_state entry says `p05 / decision /
|
||||
lateral_support_material = GF-PTFE`
|
||||
|
||||
Which one wins for the question "what's the current material"?
|
||||
|
||||
- **project_state wins** if the query is "what is the current
|
||||
trusted answer for p05's lateral support material" (Layer 3)
|
||||
- **KB-CAD wins** if project_state has not been curated for this
|
||||
field yet, because KB-CAD is the canonical home for CAD-side
|
||||
structure
|
||||
- **The Material entity** is a derived view from KB-CAD; if it
|
||||
disagrees with KB-CAD, the entity is wrong and a conflict is
|
||||
surfaced
|
||||
- **The PKM note** is historical context, not authoritative for
|
||||
"current"
|
||||
|
||||
### Example 2 — "did we decide to merge the bind mounts?"
|
||||
|
||||
Possible representations:
|
||||
|
||||
- A working session interaction is captured in the `interactions`
|
||||
table with the response containing `## Decision: merge the two
|
||||
bind mounts into one`
|
||||
- The Phase 9 Commit C extractor produced a candidate adaptation
|
||||
memory from that decision
|
||||
- A reviewer promoted the candidate to active
|
||||
- The AtoCore source repo has the actual code change in commit
|
||||
`d0ff8b5` and the docker-compose.yml is in its post-merge form
|
||||
|
||||
Which one wins for "is this decision real and current"?
|
||||
|
||||
- **The Gitea repo** wins for "is this decision implemented" —
|
||||
the docker-compose.yml is the canonical home for the actual
|
||||
bind mount configuration
|
||||
- **The active adaptation memory** wins for "did we decide this"
|
||||
— that's exactly what the Commit C lifecycle is for
|
||||
- **The interaction record** is the audit trail — it's
|
||||
authoritative for "when did this conversation happen and what
|
||||
did the LLM say", but not for "is this decision current"
|
||||
- **The source chunks** from PKM are not relevant here because no
|
||||
PKM note about this decision exists yet (and that's fine —
|
||||
decisions don't have to live in PKM if they live in the repo
|
||||
and the AtoCore memory)
|
||||
|
||||
### Example 3 — "what's p05's current next focus?"
|
||||
|
||||
Possible representations:
|
||||
|
||||
- The PKM has a `current-status.md` note updated last week
|
||||
- AtoCore project_state has `p05 / status / next_focus = "wave 2 ingestion"`
|
||||
- A captured interaction from yesterday discussed the next focus
|
||||
at length
|
||||
|
||||
Which one wins?
|
||||
|
||||
- **project_state wins**, full stop. The trust hierarchy says
|
||||
Layer 3 is canonical for current state. This is exactly the
|
||||
reason project_state exists.
|
||||
- The PKM note is historical context.
|
||||
- The interaction is conversation history.
|
||||
- If project_state and the PKM disagree, the human updates one or
|
||||
the other to bring them in line — usually by re-curating
|
||||
project_state if the conversation revealed a real change.
|
||||
|
||||
## What this means for the engineering layer V1 implementation
|
||||
|
||||
Several concrete consequences fall out of the matrix:
|
||||
|
||||
1. **The Material and Parameter entity types are mostly KB-CAD
|
||||
shadows in V1.** They exist in AtoCore so other entities
|
||||
(Decisions, Requirements) can reference them with structured
|
||||
links, but their authoritative values come from KB-CAD imports.
|
||||
If KB-CAD doesn't know about a material, the AtoCore entity is
|
||||
the canonical home only because nothing else is.
|
||||
2. **Decisions / Requirements / Constraints / ValidationClaims
|
||||
are AtoCore-canonical.** These don't have a natural home in
|
||||
KB-CAD or KB-FEM. They live in AtoCore as first-class entities
|
||||
with full lifecycle and supersession.
|
||||
3. **The PKM is never authoritative.** It is the substrate for
|
||||
extraction. The reviewer promotes things out of it; they don't
|
||||
point at PKM notes as the "current truth".
|
||||
4. **project_state is the override layer.** Whenever the human
|
||||
wants to declare "the current truth is X regardless of what
|
||||
the entities and memories and KB exports say", they curate
|
||||
into project_state. Layer 3 is intentionally small and
|
||||
intentionally manual.
|
||||
5. **The conflict model is the enforcement mechanism.** When two
|
||||
representations disagree on a fact whose canonical home rule
|
||||
should pick a winner, the conflict surfaces via the
|
||||
`/conflicts` endpoint and the reviewer resolves it. The
|
||||
matrix in this document tells the reviewer who is supposed
|
||||
to win in each scenario; they're not making the decision blind.
|
||||
|
||||
## What the matrix does NOT define
|
||||
|
||||
1. **Facts about people other than the user.** No "team member"
|
||||
entity, no per-collaborator preferences. AtoCore is
|
||||
single-user in V1.
|
||||
2. **Facts about AtoCore itself as a project.** Those are project
|
||||
memories and project_state entries under `project=atocore`,
|
||||
same lifecycle as any other project's facts.
|
||||
3. **Vendor / supplier / cost facts.** Out of V1 scope.
|
||||
4. **Time-bounded facts** (a value that was true between two
|
||||
dates and may not be true now). The current matrix treats all
|
||||
active facts as currently-true and uses supersession to
|
||||
represent change. Temporal facts are a V2 concern.
|
||||
5. **Cross-project shared facts** (a Material that is reused across
|
||||
p04, p05, and p06). Currently each project has its own copy.
|
||||
Cross-project deduplication is also a V2 concern.
|
||||
|
||||
## The "single canonical home" invariant in practice
|
||||
|
||||
The hard rule that every fact has exactly one canonical home is
|
||||
the load-bearing invariant of this matrix. To enforce it
|
||||
operationally:
|
||||
|
||||
- **Extraction never duplicates.** When the extractor scans an
|
||||
interaction or a source chunk and proposes a candidate, the
|
||||
candidate is dropped if it duplicates an already-active record
|
||||
in the canonical home (the existing extractor implementation
|
||||
already does this for memories; the entity extractor will
|
||||
follow the same pattern).
|
||||
- **Imports never duplicate.** When KB-CAD pushes the same
|
||||
Component twice with the same value, the second push is
|
||||
recognized as identical and updates the `last_imported_at`
|
||||
timestamp without creating a new entity.
|
||||
- **Imports surface drift as conflict.** When KB-CAD pushes the
|
||||
same Component with a different value, that's a conflict per
|
||||
the conflict model — never a silent overwrite.
|
||||
- **Hand-curation into project_state always wins.** A
|
||||
project_state entry can disagree with an entity or a KB
|
||||
export; the project_state entry is correct by fiat (Layer 3
|
||||
trust), and the reviewer is responsible for bringing the lower
|
||||
layers in line if appropriate.
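The two import rules above can be sketched against an in-memory store. This is an illustration under stated assumptions, not the V1 importer: `apply_import`, the dict-based `entities` store, and the shape of the conflict record are hypothetical, and keeping the stored value while a conflict is pending is an assumption about how "never a silent overwrite" is honored.

```python
from datetime import datetime, timezone


def apply_import(entities, conflicts, entity_id, value, now=None):
    """Apply one KB export record to an in-memory entity store.

    Returns "created", "refreshed" (identical re-import) or "conflict".
    """
    now = now or datetime.now(timezone.utc)
    existing = entities.get(entity_id)
    if existing is None:
        entities[entity_id] = {"value": value, "last_imported_at": now}
        return "created"
    if existing["value"] == value:
        # Same fact pushed again: only the timestamp moves.
        existing["last_imported_at"] = now
        return "refreshed"
    # Drift: record it and keep the stored value until a reviewer decides.
    conflicts.append(
        {"entity_id": entity_id, "stored": existing["value"], "incoming": value}
    )
    return "conflict"
```

Pushing the same Component twice leaves one entity with a newer `last_imported_at`; pushing it with a different value leaves the entity unchanged and adds one entry to `conflicts`.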
|
||||
|
||||
## Open questions for V1 implementation
|
||||
|
||||
1. **How does the reviewer see the canonical home for a fact in
|
||||
the UI?** Probably by including the fact's authoritative
|
||||
layer in the entity / memory detail view: "this Material is
|
||||
currently mirrored from KB-CAD; the canonical home is KB-CAD".
|
||||
2. **Who owns running the KB-CAD / KB-FEM exporter?** The
|
||||
`tool-handoff-boundaries.md` doc lists this as an open
|
||||
question; same answer applies here.
|
||||
3. **Do we need an explicit `canonical_home` field on entity
|
||||
rows?** A field that records "this entity is canonical here"
|
||||
vs "this entity is a mirror of <external system>". Probably
|
||||
yes; deferred to the entity schema spec.
|
||||
4. **How are project_state overrides surfaced in the engineering
|
||||
layer query results?** When a query (e.g. Q-001 "what does
|
||||
this subsystem contain?") would return entity rows, the result
|
||||
should also flag any project_state entries that contradict the
|
||||
entities — letting the reviewer see the override at query
|
||||
time, not just in the conflict queue.
|
||||
|
||||
## TL;DR
|
||||
|
||||
- Six representation layers: PKM, KB project, repos, AtoCore
|
||||
memories, AtoCore entities, AtoCore project_state
|
||||
- Every fact kind has exactly one canonical home
|
||||
- The trust hierarchy resolves cross-layer conflicts:
|
||||
project_state > tool-of-origin (KB-CAD/KB-FEM) > entities >
|
||||
active memories > source chunks
|
||||
- Decisions / Requirements / Constraints / ValidationClaims are
|
||||
AtoCore-canonical (no other system has a natural home for them)
|
||||
- Materials / Parameters / CAD-side structure are KB-CAD-canonical when attached to geometry (solver Parameters are KB-FEM-canonical)
|
||||
- FEM results / validation outcomes are KB-FEM-canonical
|
||||
- project_state is the human override layer, top of the
|
||||
hierarchy, manually curated only
|
||||
- Conflicts surface via `/conflicts` and the reviewer applies the
|
||||
matrix to pick a winner
|
||||
339
docs/architecture/tool-handoff-boundaries.md
Normal file
@@ -0,0 +1,339 @@
|
||||
# Tool Hand-off Boundaries (KB-CAD / KB-FEM and friends)
|
||||
|
||||
## Why this document exists
|
||||
|
||||
The engineering layer V1 will accumulate typed entities about
|
||||
projects, subsystems, components, materials, requirements,
|
||||
constraints, decisions, parameters, analysis models, results, and
|
||||
validation claims. Many of those concepts also live in real
|
||||
external tools — CAD systems, FEM solvers, BOM managers, PLM
|
||||
databases, vendor portals.
|
||||
|
||||
The first big design decision before writing any entity-layer code
|
||||
is: **what is AtoCore's read/write relationship with each of those
|
||||
external tools?**
|
||||
|
||||
The wrong answer in either direction is expensive:
|
||||
|
||||
- Too read-only: AtoCore becomes a stale shadow of the tools and
|
||||
loses the trust battle the moment a value drifts.
|
||||
- Too bidirectional: AtoCore takes on responsibilities it can't
|
||||
reliably honor (live sync, conflict resolution against external
|
||||
schemas, write-back validation), and the project never ships.
|
||||
|
||||
This document picks a position for V1.
|
||||
|
||||
## The position
|
||||
|
||||
> **AtoCore is a one-way mirror in V1.** External tools push
|
||||
> structured exports into AtoCore. AtoCore never pushes back.
|
||||
|
||||
That position has three corollaries:
|
||||
|
||||
1. **External tools remain the source of truth for everything they
|
||||
already manage.** A CAD model is canonical for geometry; a FEM
|
||||
project is canonical for meshes and solver settings; KB-CAD is
|
||||
canonical for whatever KB-CAD already calls canonical.
|
||||
2. **AtoCore is the source of truth for the *AtoCore-shaped*
|
||||
record** of those facts: the Decision that selected the geometry,
|
||||
the Requirement the geometry satisfies, the ValidationClaim the
|
||||
FEM result supports. AtoCore does not duplicate the external
|
||||
tool's primary representation; it stores the structured *facts
|
||||
about* it.
|
||||
3. **The boundary is enforced by absence.** No write endpoint in
|
||||
AtoCore ever generates a `.prt`, a `.fem`, an export to a PLM
|
||||
schema, or a vendor purchase order. If we find ourselves wanting
|
||||
to add such an endpoint in V1, we should stop and reconsider
|
||||
the V1 scope.
|
||||
|
||||
## Why one-way and not bidirectional
|
||||
|
||||
Bidirectional sync between independent systems is one of the
|
||||
hardest problems in engineering software. The honest reasons we
|
||||
are not attempting it in V1:
|
||||
|
||||
1. **Schema drift.** External tools evolve their schemas
|
||||
independently. A bidirectional sync would have to track every
|
||||
schema version of every external tool we touch. That is a
|
||||
permanent maintenance tax.
|
||||
2. **Conflict semantics.** When AtoCore and an external tool
|
||||
disagree on the same field, "who wins" is a per-tool, per-field
|
||||
decision. There is no general rule. Bidirectional sync would
|
||||
require us to specify that decision exhaustively.
|
||||
3. **Trust hierarchy.** AtoCore's whole point is the trust
|
||||
hierarchy: trusted project state > entities > memories. If we
|
||||
let entities push values back into the external tools, we
|
||||
silently elevate AtoCore's confidence to "high enough to write
|
||||
to a CAD model", which it almost never deserves.
|
||||
4. **Velocity.** A bidirectional engineering layer is a
|
||||
multi-year project. A one-way mirror takes months. The
|
||||
value-to-effort ratio favors one-way for V1 by an enormous
|
||||
margin.
|
||||
5. **Reversibility.** We can always add bidirectional sync later
|
||||
on a per-tool basis once V1 has shown itself to be useful. We
|
||||
cannot easily walk back a half-finished bidirectional sync that
|
||||
has already corrupted data in someone's CAD model.
|
||||
|
||||
## Per-tool stance for V1
|
||||
|
||||
| External tool | V1 stance | What AtoCore reads in | What AtoCore writes back |
|
||||
|---|---|---|---|
|
||||
| **KB-CAD** (Antoine's CAD knowledge base) | one-way mirror | structured exports of subsystems, components, materials, parameters via a documented JSON or CSV shape | nothing |
|
||||
| **KB-FEM** (Antoine's FEM knowledge base) | one-way mirror | structured exports of analysis models, results, validation claims | nothing |
|
||||
| **NX / Siemens NX** (the CAD tool itself) | not connected in V1 | nothing direct — only what KB-CAD exports about NX projects | nothing |
|
||||
| **PKM (Obsidian / markdown vault)** | already connected via the ingestion pipeline (Phase 1) | full markdown/text corpus per the ingestion-waves doc | nothing |
|
||||
| **Gitea repos** | already connected via the ingestion pipeline | repo markdown/text per project | nothing |
|
||||
| **OpenClaw** (the LLM agent) | already connected via the read-only helper skill on the T420 | nothing — OpenClaw reads from AtoCore | nothing — OpenClaw does not write into AtoCore |
|
||||
| **AtoDrive** (operational truth layer, future) | future: bidirectional with AtoDrive itself, but AtoDrive is internal to AtoCore so this isn't an external tool boundary | n/a in V1 | n/a in V1 |
|
||||
| **PLM / vendor portals / cost systems** | not in V1 scope | nothing | nothing |
|
||||
|
||||
## What "one-way mirror" actually looks like in code
|
||||
|
||||
AtoCore exposes an ingestion endpoint per external tool that
|
||||
accepts a structured export and turns it into entity candidates.
|
||||
The endpoint is read-side from AtoCore's perspective (it reads
|
||||
from a file or HTTP body), even though the external tool is the
|
||||
one initiating the call.
|
||||
|
||||
Proposed V1 ingestion endpoints:
|
||||
|
||||
```
|
||||
POST /ingest/kb-cad/export body: KB-CAD export JSON
|
||||
POST /ingest/kb-fem/export body: KB-FEM export JSON
|
||||
```
|
||||
|
||||
Each endpoint:
|
||||
|
||||
1. Validates the export against the documented schema
|
||||
2. Maps each export record to an entity candidate (status="candidate")
|
||||
3. Carries the export's source identifier into the candidate's
|
||||
provenance fields (source_artifact_id, exporter_version, etc.)
|
||||
4. Returns a summary: how many candidates were created, how many
|
||||
were dropped as duplicates, how many failed schema validation
|
||||
5. Does NOT auto-promote anything
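
The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not the planned implementation: the function name `ingest_kb_cad_export`, the `IngestSummary` type, the `known` lookup, and the exact candidate dict layout are all assumptions made for the example, and it only walks components (not subsystems, materials, or parameters) to stay short.

```python
from dataclasses import dataclass, field

# Hypothetical minimal schema check: the real validator would follow
# the documented export schema once it exists.
REQUIRED_TOP_LEVEL = ("exporter", "exporter_version", "project", "subsystems")


@dataclass
class IngestSummary:
    created: int = 0
    duplicates: int = 0
    invalid: int = 0
    candidates: list = field(default_factory=list)


def ingest_kb_cad_export(export: dict, known: dict) -> IngestSummary:
    """Map export components to entity candidates.

    `known` maps entity id -> payload already stored (an assumption of
    this sketch), so identical re-imports are dropped as duplicates.
    """
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in export]
    if missing:
        raise ValueError(f"export missing required keys: {missing}")

    summary = IngestSummary()
    for subsystem in export["subsystems"]:
        for component in subsystem.get("components", []):
            # Step 1: records that fail validation are counted, not stored.
            if "id" not in component or "source_artifact" not in component:
                summary.invalid += 1
                continue
            payload = {
                k: v for k, v in component.items()
                if k not in ("id", "source_artifact")
            }
            # Step 4: identical records are reported as duplicates.
            if known.get(component["id"]) == payload:
                summary.duplicates += 1
                continue
            known[component["id"]] = payload
            summary.candidates.append({
                "entity_type": "Component",
                "entity_id": component["id"],
                "status": "candidate",  # Step 5: never auto-promoted
                "payload": payload,
                "provenance": {  # Step 3: carry the source identifiers
                    "source_artifact_id": component["source_artifact"],
                    "exporter": export["exporter"],
                    "exporter_version": export["exporter_version"],
                },
            })
            summary.created += 1
    return summary
```

A record with the same id but a different payload is deliberately not treated as a duplicate here; that case is value drift and belongs to the conflict model.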
|
||||
|
||||
The KB-CAD and KB-FEM teams (which is to say, future-you) own the
|
||||
exporter scripts that produce these JSON bodies. Those scripts
|
||||
live in the KB-CAD / KB-FEM repos respectively, not in AtoCore.
|
||||
|
||||
## The export schemas (sketch, not final)
|
||||
|
||||
These are starting shapes, intentionally minimal. The schemas
|
||||
will be refined in `kb-cad-export-schema.md` and
|
||||
`kb-fem-export-schema.md` once the V1 ontology lands.
|
||||
|
||||
### KB-CAD export shape (starting sketch)
|
||||
|
||||
```json
|
||||
{
|
||||
"exporter": "kb-cad",
|
||||
"exporter_version": "1.0.0",
|
||||
"exported_at": "2026-04-07T12:00:00Z",
|
||||
"project": "p05-interferometer",
|
||||
"subsystems": [
|
||||
{
|
||||
"id": "subsystem.optical-frame",
|
||||
"name": "Optical frame",
|
||||
"parent": null,
|
||||
"components": [
|
||||
{
|
||||
"id": "component.lateral-support-pad",
|
||||
"name": "Lateral support pad",
|
||||
"material": "GF-PTFE",
|
||||
"parameters": {
|
||||
"thickness_mm": 3.0,
|
||||
"preload_n": 12.0
|
||||
},
|
||||
"source_artifact": "kb-cad://p05/subsystems/optical-frame#lateral-support"
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
### KB-FEM export shape (starting sketch)
|
||||
|
||||
```json
|
||||
{
|
||||
"exporter": "kb-fem",
|
||||
"exporter_version": "1.0.0",
|
||||
"exported_at": "2026-04-07T12:00:00Z",
|
||||
"project": "p05-interferometer",
|
||||
"analysis_models": [
|
||||
{
|
||||
"id": "model.optical-frame-modal",
|
||||
"name": "Optical frame modal analysis v3",
|
||||
"subsystem": "subsystem.optical-frame",
|
||||
"results": [
|
||||
{
|
||||
"id": "result.first-mode-frequency",
|
||||
"name": "First-mode frequency",
|
||||
"value": 187.4,
|
||||
"unit": "Hz",
|
||||
"supports_validation_claim": "claim.frame-rigidity-min-150hz",
|
||||
"source_artifact": "kb-fem://p05/models/optical-frame-modal#first-mode"
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
These shapes will evolve. The point of including them now is to
|
||||
make the one-way mirror concrete: it is a small, well-defined
|
||||
JSON shape, not "AtoCore reaches into KB-CAD's database".
|
||||
|
||||
## What AtoCore is allowed to do with the imported records
|
||||
|
||||
After ingestion, the imported records become entity candidates
|
||||
in AtoCore's own table. From that point forward they follow the
|
||||
exact same lifecycle as any other candidate:
|
||||
|
||||
- they sit at status="candidate" until a human reviews them
|
||||
- the reviewer promotes them to status="active" or rejects them
|
||||
- the active entities are queryable via the engineering query
|
||||
catalog (Q-001 through Q-020)
|
||||
- the active entities can be referenced from Decisions, Requirements,
|
||||
ValidationClaims, etc. via the V1 relationship types
|
||||
|
||||
The imported records are never automatically pushed into trusted
|
||||
project state, never modified in place after import (they are
|
||||
superseded by re-imports, not edited), and never written back to
|
||||
the external tool.
|
||||
|
||||
## What happens when KB-CAD changes a value AtoCore already has
|
||||
|
||||
This is the canonical "drift" scenario. The flow:
|
||||
|
||||
1. KB-CAD exports a fresh JSON. Component `component.lateral-support-pad`
|
||||
now has `material: "PEEK"` instead of `material: "GF-PTFE"`.
|
||||
2. AtoCore's ingestion endpoint sees the same `id` and a different
|
||||
value.
|
||||
3. The ingestion endpoint creates a new entity candidate with the
|
||||
new value, **does NOT delete or modify the existing active
|
||||
entity**, and creates a `conflicts` row linking the two members
|
||||
(per the conflict model doc).
|
||||
4. The reviewer sees an open conflict on the next visit to
|
||||
`/conflicts`.
|
||||
5. The reviewer either:
|
||||
- **promotes the new value** (the active is superseded, the
|
||||
candidate becomes the new active, the audit trail keeps both)
|
||||
- **rejects the new value** (the candidate is invalidated, the
|
||||
active stays — useful when the export was wrong)
|
||||
- **dismisses the conflict** (declares them not actually about
|
||||
the same thing, both stay active)
|
||||
|
||||
The reviewer never touches KB-CAD from AtoCore. If the resolution
|
||||
implies a change in KB-CAD itself, the reviewer makes that change
|
||||
in KB-CAD, then re-exports.
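
The drift flow can be sketched as a pure function. This is an illustration under stated assumptions: the `detect_drift` name, the record layout (`entity_id`, `payload`, `status`), and the shape of the conflict row are hypothetical, and the real `conflicts` table is defined by the conflict model doc, not here.

```python
def detect_drift(active: dict, incoming: dict) -> dict:
    """Compare an incoming export record with the active entity of the same id.

    Returns a new candidate plus an open conflict when any payload field
    differs. The active entity is never modified or deleted.
    """
    if active["entity_id"] != incoming["entity_id"]:
        raise ValueError("drift check requires records with the same id")

    all_fields = set(active["payload"]) | set(incoming["payload"])
    changed = sorted(
        name for name in all_fields
        if active["payload"].get(name) != incoming["payload"].get(name)
    )
    if not changed:
        return {"candidate": None, "conflict": None}

    # Build a fresh candidate; the incoming dict is copied, not mutated.
    candidate = {**incoming, "status": "candidate"}
    conflict = {
        "status": "open",
        "entity_id": active["entity_id"],
        "members": ["active", "candidate"],
        "changed_fields": changed,
    }
    return {"candidate": candidate, "conflict": conflict}
```

With the `GF-PTFE` to `PEEK` change from step 1, the result carries one candidate and one open conflict whose `changed_fields` is `["material"]`; resolution stays with the reviewer.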
|
||||
|
||||
## What about NX directly?
|
||||
|
||||
NX (Siemens NX) is the underlying CAD tool that KB-CAD wraps.
|
||||
**NX is not connected to AtoCore in V1.** Any facts about NX
|
||||
projects flow through KB-CAD as the structured intermediate. This
|
||||
gives us:
|
||||
|
||||
- **One schema to maintain.** AtoCore only has to understand the
|
||||
KB-CAD export shape, not the NX API.
|
||||
- **One ownership boundary.** KB-CAD owns the question of "what's
|
||||
in NX". AtoCore owns the question of "what's in the typed
|
||||
knowledge base".
|
||||
- **Future flexibility.** When NX is replaced or upgraded, only
|
||||
KB-CAD has to adapt; AtoCore doesn't notice.
|
||||
|
||||
The same logic applies to FEM solvers (Nastran, Abaqus, ANSYS):
|
||||
KB-FEM is the structured intermediate, AtoCore never talks to the
|
||||
solver directly.
|
||||
|
||||
## The hard-line invariants
|
||||
|
||||
These are the things V1 will not do, regardless of how convenient
|
||||
they might seem:
|
||||
|
||||
1. **No write to external tools.** No POST/PUT/PATCH to any
|
||||
external API, no file generation that gets written into a
|
||||
CAD/FEM project tree, no email/chat sends.
|
||||
2. **No live polling.** AtoCore does not poll KB-CAD or KB-FEM on
|
||||
a schedule. Imports are explicit pushes from the external tool
|
||||
into AtoCore's ingestion endpoint.
|
||||
3. **No silent merging.** Every value drift surfaces as a
|
||||
conflict for the reviewer (per the conflict model doc).
|
||||
4. **No schema fan-out.** AtoCore does not store every field that
|
||||
KB-CAD knows about. Only fields that map to one of the V1
|
||||
entity types make it into AtoCore. Everything else is dropped
|
||||
at the import boundary.
|
||||
5. **No external-tool-specific logic in entity types.** A
|
||||
`Component` in AtoCore is the same shape regardless of whether
|
||||
it came from KB-CAD, KB-FEM, the PKM, or a hand-curated
|
||||
project state entry. The source is recorded in provenance,
|
||||
not in the entity shape.
|
||||
|
||||
## What this enables
|
||||
|
||||
With the one-way mirror locked in, V1 implementation can focus on:
|
||||
|
||||
- The entity table and its lifecycle
|
||||
- The two `/ingest/kb-cad/export` and `/ingest/kb-fem/export`
|
||||
endpoints with their JSON validators
|
||||
- The candidate review queue extension (already designed in
|
||||
`promotion-rules.md`)
|
||||
- The conflict model (already designed in `conflict-model.md`)
|
||||
- The query catalog implementation (already designed in
|
||||
`engineering-query-catalog.md`)
|
||||
|
||||
None of those are unbounded. Each is a finite, well-defined
|
||||
implementation task. The one-way mirror is the choice that makes
|
||||
V1 finishable.
|
||||
|
||||
## What V2 might consider (deferred)
|
||||
|
||||
After V1 has been live and demonstrably useful for a quarter or
two, these questions become reasonable to revisit:
|
||||
|
||||
1. **Selective write-back to KB-CAD for low-risk fields.** For
|
||||
example, AtoCore could push back a "Decision id linked to this
|
||||
component" annotation that KB-CAD then displays without it
|
||||
being canonical there. Read-only annotations from AtoCore's
|
||||
perspective, advisory metadata from KB-CAD's perspective.
|
||||
2. **Live polling for very small payloads.** A daily poll of
|
||||
"what subsystem ids exist in KB-CAD now" so AtoCore can flag
|
||||
subsystems that disappeared from KB-CAD without an explicit
|
||||
AtoCore invalidation.
|
||||
3. **Direct NX integration** if the KB-CAD layer becomes a
|
||||
bottleneck — but only if the friction is real, not theoretical.
|
||||
4. **Cost / vendor / PLM connections** for projects where the
|
||||
procurement cycle is part of the active engineering work.
|
||||
|
||||
None of these are V1 work and they are listed only so the V1
|
||||
design intentionally leaves room for them later.
|
||||
|
||||
## Open questions for the V1 implementation sprint
|
||||
|
||||
1. **Where do the export schemas live?** Probably in
|
||||
`docs/architecture/kb-cad-export-schema.md` and
|
||||
`docs/architecture/kb-fem-export-schema.md`, drafted during
|
||||
the implementation sprint.
|
||||
2. **Who runs the exporter?** A scheduled job on the KB-CAD /
KB-FEM hosts, a manual run triggered by a human after a
meaningful change, or both?
|
||||
3. **Is the export incremental or full?** Full is simpler but
|
||||
more expensive. Incremental needs delta semantics. V1 starts
|
||||
with full and revisits when full becomes too slow.
|
||||
4. **How is the exporter authenticated to AtoCore?** Probably
|
||||
the existing PAT model (one PAT per exporter, scoped to
|
||||
`write:engineering-import` once that scope exists). Worth a
|
||||
quick auth design pass before the endpoints exist.
|
||||
|
||||
## TL;DR
|
||||
|
||||
- AtoCore is a one-way mirror in V1: external tools push,
|
||||
AtoCore reads, AtoCore never writes back
|
||||
- Two import endpoints for V1: KB-CAD and KB-FEM, each with a
|
||||
documented JSON export shape
|
||||
- Drift surfaces as conflicts in the existing conflict model
|
||||
- No NX, no FEM solvers, no PLM, no vendor portals, no
|
||||
cost/BOM systems in V1
|
||||
- Bidirectional sync is reserved for V2+ on a per-tool basis,
|
||||
only after V1 demonstrates value
|
||||
442
docs/backup-restore-procedure.md
Normal file
@@ -0,0 +1,442 @@
|
||||
# AtoCore Backup and Restore Procedure
|
||||
|
||||
## Scope
|
||||
|
||||
This document defines the operational procedure for backing up and
|
||||
restoring AtoCore's machine state on the Dalidou deployment. It is
|
||||
the practical companion to `docs/backup-strategy.md` (which defines
|
||||
the strategy) and `src/atocore/ops/backup.py` (which implements the
|
||||
mechanics).
|
||||
|
||||
The intent is that this procedure can be followed by anyone with
|
||||
SSH access to Dalidou and the AtoCore admin endpoints.
|
||||
|
||||
## What gets backed up
|
||||
|
||||
A `create_runtime_backup` snapshot contains, in order of importance:
|
||||
|
||||
| Artifact | Source path on Dalidou | Backup destination | Always included |
|
||||
|---|---|---|---|
|
||||
| SQLite database | `/srv/storage/atocore/data/db/atocore.db` | `<backup_root>/db/atocore.db` | yes |
|
||||
| Project registry JSON | `/srv/storage/atocore/config/project-registry.json` | `<backup_root>/config/project-registry.json` | yes (if file exists) |
|
||||
| Backup metadata | (generated) | `<backup_root>/backup-metadata.json` | yes |
|
||||
| Chroma vector store | `/srv/storage/atocore/data/chroma/` | `<backup_root>/chroma/` | only when `include_chroma=true` |
|
||||
|
||||
The SQLite snapshot uses the online `conn.backup()` API and is safe
|
||||
to take while the database is in use. The Chroma snapshot is a cold
|
||||
directory copy and is **only safe when no ingestion is running**;
|
||||
the API endpoint enforces this by acquiring the ingestion lock for
|
||||
the duration of the copy.
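
For readers unfamiliar with the online backup API, here is a minimal sketch of what a hot SQLite snapshot looks like with Python's standard `sqlite3` module. The helper name `snapshot_sqlite` is an assumption for illustration; the real logic lives in `src/atocore/ops/backup.py` and may differ.

```python
import sqlite3
from pathlib import Path


def snapshot_sqlite(source_db: Path, dest_db: Path) -> None:
    """Copy a SQLite database with the online backup API.

    Safe while other connections are using the source database,
    unlike a plain file copy.
    """
    dest_db.parent.mkdir(parents=True, exist_ok=True)
    src = sqlite3.connect(str(source_db))
    dst = sqlite3.connect(str(dest_db))
    try:
        src.backup(dst)  # sqlite3.Connection.backup, Python 3.7+
    finally:
        dst.close()
        src.close()
```

Chroma has no equivalent call in this procedure, which is why its snapshot is a cold directory copy taken under the ingestion lock.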
|
||||
|
||||
What is **not** in the backup:
|
||||
|
||||
- Source documents under `/srv/storage/atocore/sources/vault/` and
|
||||
`/srv/storage/atocore/sources/drive/`. These are read-only
|
||||
inputs and live in the user's PKM/Drive, which is backed up
|
||||
separately by their own systems.
|
||||
- Application code. The container image is the source of truth for
|
||||
code; recovery means rebuilding the image, not restoring code from
|
||||
a backup.
|
||||
- Logs under `/srv/storage/atocore/logs/`.
|
||||
- Embeddings cache under `/srv/storage/atocore/data/cache/`.
|
||||
- Temp files under `/srv/storage/atocore/data/tmp/`.
|
||||
|
||||
## Backup root layout
|
||||
|
||||
Each backup snapshot lives in its own timestamped directory:
|
||||
|
||||
```
|
||||
/srv/storage/atocore/backups/snapshots/
|
||||
├── 20260407T060000Z/
|
||||
│ ├── backup-metadata.json
|
||||
│ ├── db/
|
||||
│ │ └── atocore.db
|
||||
│ ├── config/
|
||||
│ │ └── project-registry.json
|
||||
│ └── chroma/ # only if include_chroma=true
|
||||
│ └── ...
|
||||
├── 20260408T060000Z/
|
||||
│ └── ...
|
||||
└── ...
|
||||
```
|
||||
|
||||
The timestamp is UTC, format `YYYYMMDDTHHMMSSZ`.
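
A short sketch of producing and parsing that stamp format in Python. The helper names `make_stamp` and `parse_stamp` are assumptions for illustration and are not taken from `backup.py`.

```python
from datetime import datetime, timezone

STAMP_FORMAT = "%Y%m%dT%H%M%SZ"  # e.g. 20260407T060000Z


def make_stamp(now=None):
    """Format a timezone-aware datetime (default: current time) as a UTC stamp."""
    now = now or datetime.now(timezone.utc)
    return now.astimezone(timezone.utc).strftime(STAMP_FORMAT)


def parse_stamp(stamp):
    """Parse a snapshot directory name back into an aware UTC datetime."""
    return datetime.strptime(stamp, STAMP_FORMAT).replace(tzinfo=timezone.utc)
```

Because the format is fixed-width and most-significant-first, sorting the directory names as plain strings also sorts them chronologically.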
|
||||
|
||||
## Triggering a backup
|
||||
|
||||
### Option A — via the admin endpoint (preferred)
|
||||
|
||||
```bash
|
||||
# DB + registry only (fast, safe at any time)
|
||||
curl -fsS -X POST http://dalidou:8100/admin/backup \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"include_chroma": false}'
|
||||
|
||||
# DB + registry + Chroma (acquires ingestion lock)
|
||||
curl -fsS -X POST http://dalidou:8100/admin/backup \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"include_chroma": true}'
|
||||
```
|
||||
|
||||
The response is the backup metadata JSON. Save the `backup_root`
|
||||
field — that's the directory the snapshot was written to.
|
||||
|
||||
### Option B — via the standalone script (when the API is down)
|
||||
|
||||
```bash
|
||||
docker exec atocore python -m atocore.ops.backup
|
||||
```
|
||||
|
||||
This runs `create_runtime_backup()` directly, without going through
|
||||
the API or the ingestion lock. Use it only when the AtoCore service
|
||||
itself is unhealthy and you can't hit the admin endpoint.
|
||||
|
||||
### Option C — manual file copy (last resort)
|
||||
|
||||
If both the API and the standalone script are unusable:
|
||||
|
||||
```bash
|
||||
# Capture the stamp once so both copies share the same name
STAMP=$(date -u +%Y%m%dT%H%M%SZ)
sudo systemctl stop atocore # or: docker compose stop atocore
sudo cp /srv/storage/atocore/data/db/atocore.db \
/srv/storage/atocore/backups/manual-${STAMP}.db
sudo cp /srv/storage/atocore/config/project-registry.json \
/srv/storage/atocore/backups/manual-${STAMP}.registry.json
sudo systemctl start atocore
|
||||
```
|
||||
|
||||
This is a cold backup and requires brief downtime.
|
||||
|
||||
## Listing backups
|
||||
|
||||
```bash
|
||||
curl -fsS http://dalidou:8100/admin/backup
|
||||
```
|
||||
|
||||
Returns the configured `backup_dir` and a list of all snapshots
|
||||
under it, with their full metadata if available.
|
||||
|
||||
Or, on the host directly:
|
||||
|
||||
```bash
|
||||
ls -la /srv/storage/atocore/backups/snapshots/
|
||||
```
|
||||
|
||||
## Validating a backup
|
||||
|
||||
Before relying on a backup for restore, validate it:
|
||||
|
||||
```bash
|
||||
curl -fsS http://dalidou:8100/admin/backup/20260407T060000Z/validate
|
||||
```
|
||||
|
||||
The validator:
|
||||
- confirms the snapshot directory exists
|
||||
- opens the SQLite snapshot and runs `PRAGMA integrity_check`
|
||||
- parses the registry JSON
|
||||
- confirms the Chroma directory exists (if it was included)
|
||||
|
||||
A valid backup returns `"valid": true` and an empty `errors` array.
|
||||
A failing validation returns `"valid": false` with one or more
|
||||
specific error strings (e.g. `db_integrity_check_failed`,
|
||||
`registry_invalid_json`, `chroma_snapshot_missing`).
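
The checks above can be approximated offline, for example when inspecting a snapshot that was copied to another machine. The sketch below is an assumption-laden illustration, not the code behind the endpoint: `validate_snapshot` and the `snapshot_missing` / `db_snapshot_missing` strings are made up for the example, while the other three error strings are the ones named above.

```python
import json
import sqlite3
from pathlib import Path


def validate_snapshot(snapshot_dir: Path, expect_chroma: bool = False) -> dict:
    """Run the four validator checks against a snapshot directory."""
    if not snapshot_dir.is_dir():
        return {"valid": False, "errors": ["snapshot_missing"]}

    errors = []
    db_path = snapshot_dir / "db" / "atocore.db"
    if not db_path.is_file():
        errors.append("db_snapshot_missing")
    else:
        try:
            # Open read-only so validation can never alter the snapshot.
            uri = db_path.resolve().as_uri() + "?mode=ro"
            conn = sqlite3.connect(uri, uri=True)
            try:
                row = conn.execute("PRAGMA integrity_check").fetchone()
            finally:
                conn.close()
            if row is None or row[0] != "ok":
                errors.append("db_integrity_check_failed")
        except sqlite3.DatabaseError:
            errors.append("db_integrity_check_failed")

    # The registry is only captured when it existed at backup time.
    registry = snapshot_dir / "config" / "project-registry.json"
    if registry.is_file():
        try:
            json.loads(registry.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            errors.append("registry_invalid_json")

    if expect_chroma and not (snapshot_dir / "chroma").is_dir():
        errors.append("chroma_snapshot_missing")

    return {"valid": not errors, "errors": errors}
```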
|
||||
|
||||
**Validate every backup at creation time.** A backup that has never
|
||||
been validated is not actually a backup — it's just a hopeful copy
|
||||
of bytes.
|
||||
|
||||
## Restore procedure
|
||||
|
||||
Since 2026-04-09 the restore is implemented as a proper module
|
||||
function plus CLI entry point: `restore_runtime_backup()` in
|
||||
`src/atocore/ops/backup.py`, invoked as
|
||||
`python -m atocore.ops.backup restore <STAMP> --confirm-service-stopped`.
|
||||
It automatically takes a pre-restore safety snapshot (your rollback
|
||||
anchor), handles SQLite WAL/SHM cleanly, restores the registry, and
|
||||
runs `PRAGMA integrity_check` on the restored db. This replaces the
|
||||
earlier manual `sudo cp` sequence.
|
||||
|
||||
The function refuses to run without `--confirm-service-stopped`.
|
||||
This is deliberate: hot-restoring into a running service corrupts
|
||||
SQLite state.
|
||||
|
||||
### Pre-flight (always)
|
||||
|
||||
1. Identify which snapshot you want to restore. List available
|
||||
snapshots and pick by timestamp:
|
||||
```bash
|
||||
curl -fsS http://127.0.0.1:8100/admin/backup | jq '.backups[].stamp'
|
||||
```
|
||||
2. Validate it. Refuse to restore an invalid backup:
|
||||
```bash
|
||||
STAMP=20260409T060000Z
|
||||
curl -fsS http://127.0.0.1:8100/admin/backup/$STAMP/validate | jq .
|
||||
```
|
||||
3. **Stop AtoCore.** SQLite cannot be hot-restored under a running
|
||||
process and Chroma will not pick up new files until the process
|
||||
restarts.
|
||||
```bash
|
||||
cd /srv/storage/atocore/app/deploy/dalidou
|
||||
docker compose down
|
||||
docker compose ps # atocore should be Exited/gone
|
||||
```
|
||||
|
||||
### Run the restore
|
||||
|
||||
Use a one-shot container that reuses the live service's volume
|
||||
mounts so every path (`db_path`, `chroma_path`, backup dir) resolves
|
||||
to the same place the main service would see:
|
||||
|
||||
```bash
|
||||
cd /srv/storage/atocore/app/deploy/dalidou
|
||||
docker compose run --rm --entrypoint python atocore \
|
||||
-m atocore.ops.backup restore \
|
||||
$STAMP \
|
||||
--confirm-service-stopped
|
||||
```
|
||||
|
||||
Output is a JSON document. The critical fields:
|
||||
|
||||
- `pre_restore_snapshot`: stamp of the safety snapshot of live
|
||||
state taken right before the restore. **Write this down.** If
|
||||
the restore was the wrong call, this is how you roll it back.
|
||||
- `db_restored`: should be `true`
|
||||
- `registry_restored`: `true` if the backup captured a registry
|
||||
- `chroma_restored`: `true` if the backup captured a chroma tree
|
||||
and `include_chroma` resolved to true (default)
|
||||
- `restored_integrity_ok`: **must be `true`** — if this is false,
|
||||
STOP and do not start the service; investigate the integrity
|
||||
error first. The restored file is still on disk but untrusted.
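
If the restore is scripted, the same go/no-go decision can be made mechanically. The sketch below is illustrative only: `assess_restore_output` is a hypothetical helper, and it assumes the restore JSON uses exactly the field names listed above.

```python
import json


def assess_restore_output(raw_json: str) -> dict:
    """Decide whether the service may be started after a restore."""
    result = json.loads(raw_json)
    blockers = []
    if result.get("db_restored") is not True:
        blockers.append("db_restored is not true")
    if result.get("restored_integrity_ok") is not True:
        blockers.append("restored_integrity_ok is not true")
    return {
        "safe_to_start": not blockers,
        "blockers": blockers,
        # Stamp to restore if this restore turns out to be the wrong call;
        # None when the restore ran with --no-pre-snapshot.
        "rollback_stamp": result.get("pre_restore_snapshot"),
    }
```

Only `db_restored` and `restored_integrity_ok` are treated as blockers here, because `registry_restored` and `chroma_restored` legitimately depend on what the snapshot captured.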
|
||||
|
||||
### Controlling the restore
|
||||
|
||||
The CLI supports a few flags for finer control:
|
||||
|
||||
- `--no-pre-snapshot` skips the pre-restore safety snapshot. Use
|
||||
this only when you know you have another rollback path.
|
||||
- `--no-chroma` restores only SQLite + registry, leaving the
|
||||
current Chroma dir alone. Useful if Chroma is consistent but
|
||||
SQLite needs a rollback.
|
||||
- `--chroma` forces Chroma restoration even if the metadata
|
||||
doesn't clearly indicate the snapshot has it (rare).
|
||||
|
||||
### Chroma restore and bind-mounted volumes
|
||||
|
||||
The Chroma dir on Dalidou is a bind-mounted Docker volume. The
|
||||
restore cannot `rmtree` the destination (you can't unlink a mount
|
||||
point — it raises `OSError [Errno 16] Device or resource busy`),
|
||||
so the function clears the dir's CONTENTS and uses
|
||||
`copytree(dirs_exist_ok=True)` to copy the snapshot back in. The
|
||||
regression test `test_restore_chroma_does_not_unlink_destination_directory`
|
||||
in `tests/test_backup.py` captures the destination inode before
|
||||
and after restore and asserts it's stable — the same invariant
|
||||
that protects the bind mount.
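
A minimal sketch of the clear-contents-then-copy technique described above, using only the standard library. The function name `restore_dir_in_place` is an assumption; the actual implementation in `backup.py` may be structured differently.

```python
import shutil
from pathlib import Path


def restore_dir_in_place(snapshot: Path, destination: Path) -> None:
    """Replace the contents of `destination` without removing the directory itself.

    The destination may be a bind mount, so it is never unlinked;
    only its children are deleted before the snapshot is copied in.
    """
    destination.mkdir(parents=True, exist_ok=True)
    for child in destination.iterdir():
        if child.is_dir() and not child.is_symlink():
            shutil.rmtree(child)
        else:
            child.unlink()
    # dirs_exist_ok (Python 3.8+) lets copytree write into the existing dir.
    shutil.copytree(snapshot, destination, dirs_exist_ok=True)
```

Because the destination directory is never removed, its inode is unchanged after the call, which is the property the regression test asserts.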
|
||||
|
||||
This was discovered during the first real Dalidou restore drill
|
||||
on 2026-04-09. If you see a new restore failure with
|
||||
`Device or resource busy`, something has regressed this fix.
|
||||
|
||||
### Restart AtoCore
|
||||
|
||||
```bash
|
||||
cd /srv/storage/atocore/app/deploy/dalidou
|
||||
docker compose up -d
|
||||
# Wait for /health to come up
|
||||
for i in 1 2 3 4 5 6 7 8 9 10; do
|
||||
curl -fsS http://127.0.0.1:8100/health \
|
||||
&& break || { echo "not ready ($i/10)"; sleep 3; }
|
||||
done
|
||||
```
|
||||
|
||||
**Note on build_sha after restore:** The one-shot `docker compose run`
|
||||
container does not carry the build provenance env vars that `deploy.sh`
|
||||
exports at deploy time. After a restore, `/health` will report
|
||||
`build_sha: "unknown"` until you re-run `deploy.sh` or manually
|
||||
re-deploy. This is cosmetic — the data is correctly restored — but if
|
||||
you need `build_sha` to be accurate, run a redeploy after the restore:
|
||||
|
||||
```bash
|
||||
cd /srv/storage/atocore/app
|
||||
bash deploy/dalidou/deploy.sh
|
||||
```
|
||||
|
||||
### Post-restore verification
|
||||
|
||||
```bash
|
||||
# 1. Service is healthy
|
||||
curl -fsS http://127.0.0.1:8100/health | jq .
|
||||
|
||||
# 2. Stats look right
|
||||
curl -fsS http://127.0.0.1:8100/stats | jq .
|
||||
|
||||
# 3. Project registry loads
|
||||
curl -fsS http://127.0.0.1:8100/projects | jq '.projects | length'
|
||||
|
||||
# 4. A known-good context query returns non-empty results
|
||||
curl -fsS -X POST http://127.0.0.1:8100/context/build \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"prompt": "what is p05 about", "project": "p05-interferometer"}' | jq '.chunks_used'
|
||||
```
|
||||
|
||||
If any of these are wrong, the restore is bad. Roll back using the
|
||||
pre-restore safety snapshot whose stamp you recorded from the
|
||||
restore output. The rollback is the same procedure — stop the
|
||||
service and restore that stamp:
|
||||
|
||||
```bash
|
||||
cd /srv/storage/atocore/app/deploy/dalidou
# PRE_RESTORE_SNAPSHOT_STAMP is the pre_restore_snapshot value from the restore output
docker compose down
docker compose run --rm --entrypoint python atocore \
-m atocore.ops.backup restore \
"$PRE_RESTORE_SNAPSHOT_STAMP" \
--confirm-service-stopped \
--no-pre-snapshot
docker compose up -d
|
||||
```
|
||||
|
||||
(`--no-pre-snapshot` because the rollback itself doesn't need one;
|
||||
you already have the original snapshot as a fallback if everything
|
||||
goes sideways.)
|
||||
|
||||
### Restore drill
|
||||
|
||||
The restore is exercised at three levels:
|
||||
|
||||
1. **Unit tests.** `tests/test_backup.py` has seven restore tests
|
||||
(refuse-without-confirm, invalid backup, full round-trip,
|
||||
Chroma round-trip, inode-stability regression, WAL sidecar
|
||||
cleanup, skip-pre-snapshot). These run in CI on every commit.
|
||||
2. **Module-level round-trip.**
|
||||
`test_restore_round_trip_reverses_post_backup_mutations` is
|
||||
the canonical drill in code form: seed baseline, snapshot,
|
||||
mutate, restore, assert mutation reversed + baseline survived
|
||||
+ pre-restore snapshot captured the mutation.
|
||||
3. **Live drill on Dalidou.** Periodically run the full procedure
|
||||
against the real service with a disposable drill-marker
|
||||
memory (created via `POST /memory` with `memory_type=episodic`
|
||||
and `project=drill`), following the sequence above and then
|
||||
verifying the marker is gone afterward via
|
||||
`GET /memory?project=drill`. The first such drill on
|
||||
2026-04-09 surfaced the bind-mount bug; future runs
|
||||
primarily exist to verify the fix stays fixed.
|
||||
|
||||
Run the live drill:
|
||||
|
||||
- **Before** enabling any new write-path automation (auto-capture,
|
||||
automated ingestion, reinforcement sweeps).
|
||||
- **After** any change to `src/atocore/ops/backup.py` or to
|
||||
schema migrations in `src/atocore/models/database.py`.
|
||||
- **After** a Dalidou OS upgrade or docker version bump.
|
||||
- **At least once per quarter** as a standing operational check.
|
||||
- **After any incident** that touched the storage layer.
|
||||
|
||||
Record each drill run (stamp, pre-restore snapshot stamp, pass/fail,
|
||||
any surprises) somewhere durable — a line in the project journal
|
||||
or a git commit message is enough. A drill you ran once and never
|
||||
again is barely more than a drill you never ran.
|
||||
|
||||
## Retention policy
|
||||
|
||||
- **Last 7 daily backups**: kept verbatim
|
||||
- **Last 4 weekly backups** (Sunday): kept verbatim
|
||||
- **Last 6 monthly backups** (1st of month): kept verbatim
|
||||
- **Anything older**: deleted
|
||||
|
||||
The retention job is **not yet implemented** and is tracked as a
|
||||
follow-up. Until then, the snapshots directory grows monotonically.
|
||||
A simple cron-based cleanup script is the next step:
|
||||
|
||||
```cron
|
||||
0 4 * * * /srv/storage/atocore/scripts/cleanup-old-backups.sh
|
||||
```
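
Since the cleanup script does not exist yet, here is one possible sketch of its selection logic. It is a proposal, not the implementation: `select_snapshots_to_keep` is a hypothetical name, and it reads the policy as "the 7 newest snapshots, the 4 newest taken on a Sunday, and the 6 newest taken on the 1st of a month", which is an interpretation of the list above. It only computes what to keep; deletion is left to the caller.

```python
from datetime import datetime

STAMP_FORMAT = "%Y%m%dT%H%M%SZ"


def select_snapshots_to_keep(stamps, daily=7, weekly=4, monthly=6):
    """Return the set of snapshot stamps the retention policy keeps."""
    dated = sorted(
        ((datetime.strptime(stamp, STAMP_FORMAT), stamp) for stamp in stamps),
        reverse=True,  # newest first
    )
    newest = [stamp for _, stamp in dated]
    sundays = [stamp for when, stamp in dated if when.weekday() == 6]
    firsts = [stamp for when, stamp in dated if when.day == 1]
    return set(newest[:daily]) | set(sundays[:weekly]) | set(firsts[:monthly])
```

Anything in the snapshots directory that is not in the returned set is a deletion candidate. A cautious first version of the script would print that list and require an explicit flag before removing anything.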
|
||||
|
||||
## Common failure modes and what to do about them
|
||||
|
||||
| Symptom | Likely cause | Action |
|
||||
|---|---|---|
|
||||
| `db_integrity_check_failed` on validation | SQLite snapshot copied while a write was in progress, or disk corruption | Take a fresh backup and validate again. If it fails twice, suspect the underlying disk. |
|
||||
| `registry_invalid_json` | Registry was being edited at backup time | Take a fresh backup. The registry is small so this is cheap. |
|
||||
| Restore: `restored_integrity_ok: false` | Source snapshot was itself corrupt (validation should have caught it — file a bug) or copy was interrupted mid-write | Do NOT start the service. Validate the snapshot directly with `python -m atocore.ops.backup validate <STAMP>`, try a different older snapshot, or roll back to the pre-restore safety snapshot. |
|
||||
| Restore: `OSError [Errno 16] Device or resource busy` on Chroma | Old code tried to `rmtree` the Chroma mount point. Fixed on 2026-04-09 and guarded by the regression test `test_restore_chroma_does_not_unlink_destination_directory` | Ensure you're running a build from 2026-04-09 or later; if you need to work around an older build, use `--no-chroma` and restore Chroma contents manually. |
|
||||
| `chroma_snapshot_missing` after a restore | Snapshot was DB-only | Either rebuild via fresh ingestion or restore an older snapshot that includes Chroma. |
|
||||
| Service won't start after restore | Permissions wrong on the restored files | Re-run `chown 1000:1000` (or whatever the gitea/atocore container user is) on the data dir. |
|
||||
| `/stats` returns 0 documents after restore | The SQL store was restored but the source paths in `source_documents` don't match the current Dalidou paths | This means the backup came from a different deployment. Don't trust this restore — it's pulling from the wrong layout. |
|
||||
| Drill marker still present after restore | Wrong stamp, service still writing during `docker compose down`, or the restore JSON didn't report `db_restored: true` | Roll back via the pre-restore safety snapshot and retry with the correct source snapshot. |
|
||||
|
||||
## Open follow-ups (not yet implemented)
|
||||
|
||||
Tracked separately in `docs/next-steps.md` — the list below is the
|
||||
backup-specific subset.
|
||||
|
||||
1. **Retention cleanup script**: see the cron entry above. The
|
||||
snapshots directory grows monotonically until this exists.
|
||||
2. **Off-Dalidou backup target**: currently snapshots live on the
|
||||
same disk as the live data. A real disaster-recovery story
|
||||
needs at least one snapshot on a different physical machine.
|
||||
The simplest first step is a periodic `rsync` to the user's
|
||||
laptop or to another server.
|
||||
3. **Backup encryption**: snapshots contain raw SQLite and JSON.
|
||||
Consider age/gpg encryption if backups will be shipped off-site.
|
||||
4. **Automatic post-backup validation**: today the validator must
|
||||
be invoked manually. The `create_runtime_backup` function
|
||||
should call `validate_backup` on its own output and refuse to
|
||||
declare success if validation fails.
|
||||
5. **Chroma backup is currently a full directory copy** every time.
|
||||
For large vector stores this gets expensive. A future
|
||||
improvement would be incremental snapshots via filesystem-level
|
||||
snapshotting (LVM, btrfs, ZFS).
|
||||
|
||||
**Done** (kept for historical reference):
|
||||
|
||||
- ~~Implement `restore_runtime_backup()` as a proper module
|
||||
function so the restore isn't a manual `sudo cp` dance~~ —
|
||||
landed 2026-04-09 in commit 3362080, followed by the
|
||||
Chroma bind-mount fix from the first real drill.
|
||||
|
||||
## Quickstart cheat sheet
|
||||
|
||||
```bash
|
||||
# Daily backup (DB + registry only — fast)
|
||||
curl -fsS -X POST http://127.0.0.1:8100/admin/backup \
|
||||
-H "Content-Type: application/json" -d '{}'
|
||||
|
||||
# Weekly backup (DB + registry + Chroma — slower, holds ingestion lock)
|
||||
curl -fsS -X POST http://127.0.0.1:8100/admin/backup \
|
||||
-H "Content-Type: application/json" -d '{"include_chroma": true}'
|
||||
|
||||
# List backups
|
||||
curl -fsS http://127.0.0.1:8100/admin/backup | jq '.backups[].stamp'
|
||||
|
||||
# Validate the most recent backup
|
||||
LATEST=$(curl -fsS http://127.0.0.1:8100/admin/backup | jq -r '.backups[-1].stamp')
|
||||
curl -fsS http://127.0.0.1:8100/admin/backup/$LATEST/validate | jq .
|
||||
|
||||
# Full restore (service must be stopped first)
|
||||
cd /srv/storage/atocore/app/deploy/dalidou
|
||||
docker compose down
|
||||
docker compose run --rm --entrypoint python atocore \
|
||||
-m atocore.ops.backup restore $STAMP --confirm-service-stopped
|
||||
docker compose up -d
|
||||
|
||||
# Live drill: exercise the full create -> mutate -> restore flow
|
||||
# against the running service. The marker memory uses
|
||||
# memory_type=episodic (valid types: identity, preference, project,
|
||||
# episodic, knowledge, adaptation) and project=drill so it's easy
|
||||
# to find via GET /memory?project=drill before and after.
|
||||
#
|
||||
# See the "Restore drill" section above for the full sequence.
|
||||
STAMP=$(curl -fsS -X POST http://127.0.0.1:8100/admin/backup \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"include_chroma": true}' | jq -r '.backup_root' | awk -F/ '{print $NF}')
|
||||
|
||||
curl -fsS -X POST http://127.0.0.1:8100/memory \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"memory_type":"episodic","content":"DRILL-MARKER","project":"drill","confidence":1.0}'
|
||||
|
||||
cd /srv/storage/atocore/app/deploy/dalidou
|
||||
docker compose down
|
||||
docker compose run --rm --entrypoint python atocore \
|
||||
-m atocore.ops.backup restore $STAMP --confirm-service-stopped
|
||||
docker compose up -d
|
||||
|
||||
# Marker should be gone:
|
||||
curl -fsS 'http://127.0.0.1:8100/memory?project=drill' | jq .
|
||||
```
|
||||
@@ -200,10 +200,30 @@ The runtime has now been hardened in a few practical ways:
|
||||
- SQLite connections use a configurable busy timeout
|
||||
- SQLite uses WAL mode to reduce transient lock pain under normal concurrent use
|
||||
- project registry writes are atomic file replacements rather than in-place rewrites
|
||||
- a first runtime backup path now exists for:
|
||||
- SQLite
|
||||
- project registry
|
||||
- a full runtime backup and restore path now exists and has been exercised on
|
||||
live Dalidou:
|
||||
- SQLite (hot online backup via `conn.backup()`)
|
||||
- project registry (file copy)
|
||||
- Chroma vector store (cold directory copy under `exclusive_ingestion()`)
|
||||
- backup metadata
|
||||
- `restore_runtime_backup()` with CLI entry point
|
||||
(`python -m atocore.ops.backup restore <STAMP>
|
||||
--confirm-service-stopped`), pre-restore safety snapshot for
|
||||
rollback, WAL/SHM sidecar cleanup, `PRAGMA integrity_check`
|
||||
on the restored file
|
||||
- the first live drill on 2026-04-09 surfaced and fixed a Chroma
|
||||
restore bug on Docker bind-mounted volumes (`shutil.rmtree`
|
||||
on a mount point); a regression test now asserts the
|
||||
destination inode is stable across restore
|
||||
- deploy provenance is visible end-to-end:
|
||||
- `/health` reports `build_sha`, `build_time`, `build_branch`
|
||||
from env vars wired by `deploy.sh`
|
||||
- `deploy.sh` Step 6 verifies the live `build_sha` matches the
|
||||
just-built commit (exit code 6 on drift) so "live is current?"
|
||||
can be answered precisely, not just by `__version__`
|
||||
- `deploy.sh` Step 1.5 detects that the script itself changed
|
||||
in the pulled commit and re-execs into the fresh copy, so
|
||||
the deploy never silently runs the old script against new source
|
||||
|
||||
This does not eliminate every concurrency edge, but it materially improves the
|
||||
current operational baseline.
|
||||
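The hot SQLite backup and the post-restore integrity check mentioned in the list above can be illustrated with the standard library alone. This is a minimal sketch under stated assumptions: `hot_backup` is a hypothetical name, not the AtoCore function, and the real backup path also handles the registry, Chroma, and metadata.

```python
import sqlite3


def hot_backup(source_path: str, dest_path: str) -> bool:
    """Copy a live SQLite database and verify the copy.

    Hypothetical helper: uses sqlite3.Connection.backup(), which copies
    a consistent snapshot while other connections keep working, then
    runs PRAGMA integrity_check on the copy.
    """
    src = sqlite3.connect(source_path)
    dst = sqlite3.connect(dest_path)
    try:
        src.backup(dst)
        row = dst.execute("PRAGMA integrity_check").fetchone()
        # integrity_check returns a single row containing "ok" when healthy.
        return row is not None and row[0] == "ok"
    finally:
        dst.close()
        src.close()
```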
@@ -224,15 +244,23 @@ This separation is healthy:
|
||||
|
||||
## Immediate Next Focus
|
||||
|
||||
1. Use the new T420-side organic routing layer in real OpenClaw workflows
|
||||
2. Tighten retrieval quality for the now fully ingested active project corpora
|
||||
3. Move to Wave 2 trusted-operational ingestion instead of blindly widening raw corpus further
|
||||
4. Keep the new engineering-knowledge architecture docs as implementation guidance while avoiding premature schema work
|
||||
5. Expand the boring operations baseline:
|
||||
- restore validation
|
||||
- Chroma rebuild / backup policy
|
||||
- retention
|
||||
6. Only later consider write-back, reflection, or deeper autonomous behaviors
|
||||
1. ~~Re-run the full backup/restore drill~~ — DONE 2026-04-11,
|
||||
full pass (db, registry, chroma, integrity all true)
|
||||
2. ~~Turn on auto-capture of Claude Code sessions in conservative
|
||||
mode~~ — DONE 2026-04-11, Stop hook wired via
|
||||
`deploy/hooks/capture_stop.py` → `POST /interactions`
|
||||
with `reinforce=false`; kill switch via
|
||||
`ATOCORE_CAPTURE_DISABLED=1`
|
||||
3. Run a short real-use pilot with auto-capture on, verify
|
||||
interactions are landing in Dalidou, review quality
|
||||
4. Use the new T420-side organic routing layer in real OpenClaw workflows
|
||||
5. Tighten retrieval quality for the now fully ingested active project corpora
|
||||
6. Move to Wave 2 trusted-operational ingestion instead of blindly widening raw corpus further
|
||||
7. Keep the new engineering-knowledge architecture docs as implementation guidance while avoiding premature schema work
|
||||
8. Expand the remaining boring operations baseline:
|
||||
- retention policy cleanup script
|
||||
- off-Dalidou backup target (rsync or similar)
|
||||
9. Only later consider write-back, reflection, or deeper autonomous behaviors
|
||||
|
||||
See also:
|
||||
|
||||
|
||||
@@ -50,26 +50,205 @@ starting from:
|
||||
deploy/dalidou/.env.example
|
||||
```
|
||||
|
||||
## Deployment steps
|
||||
## First-time deployment steps
|
||||
|
||||
1. Place the repository under `/srv/storage/atocore/app` — ideally as a
|
||||
proper git clone so future updates can be pulled, not as a static
|
||||
snapshot:
|
||||
|
||||
```bash
|
||||
sudo git clone http://dalidou:3000/Antoine/ATOCore.git \
|
||||
/srv/storage/atocore/app
|
||||
```
|
||||
|
||||
1. Place the repository under `/srv/storage/atocore/app`.
|
||||
2. Create the canonical directories listed above.
|
||||
3. Copy `deploy/dalidou/.env.example` to `deploy/dalidou/.env`.
|
||||
4. Adjust the source paths if your AtoVault/AtoDrive mirrors live elsewhere.
|
||||
5. Run:
|
||||
|
||||
```bash
|
||||
cd /srv/storage/atocore/app/deploy/dalidou
|
||||
docker compose up -d --build
|
||||
```
|
||||
```bash
|
||||
cd /srv/storage/atocore/app/deploy/dalidou
|
||||
docker compose up -d --build
|
||||
```
|
||||
|
||||
6. Validate:
|
||||
|
||||
```bash
|
||||
curl http://127.0.0.1:8100/health
|
||||
curl http://127.0.0.1:8100/sources
|
||||
```
|
||||
|
||||
## Updating a running deployment
|
||||
|
||||
**Use `deploy/dalidou/deploy.sh` for every code update.** It is the
|
||||
one-shot sync script that:
|
||||
|
||||
- fetches latest main from Gitea into `/srv/storage/atocore/app`
|
||||
- (if the app dir is not a git checkout) backs it up as
|
||||
`<dir>.pre-git-<timestamp>` and re-clones
|
||||
- rebuilds the container image
|
||||
- restarts the container
|
||||
- waits for `/health` to respond
|
||||
- compares the reported `code_version` against the
|
||||
`__version__` in the freshly-pulled source, and exits non-zero
|
||||
if they don't match (deployment drift detection)
|
||||
|
||||
```bash
|
||||
curl http://127.0.0.1:8100/health
|
||||
curl http://127.0.0.1:8100/sources
|
||||
# Normal update from main
|
||||
bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh
|
||||
|
||||
# Deploy a specific branch or tag
|
||||
ATOCORE_BRANCH=codex/some-feature \
|
||||
bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh
|
||||
|
||||
# Dry-run: show what would happen without touching anything
|
||||
ATOCORE_DEPLOY_DRY_RUN=1 \
|
||||
bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh
|
||||
|
||||
# Deploy from a remote host (e.g. the laptop) using the Tailscale
|
||||
# or LAN address instead of loopback
|
||||
ATOCORE_GIT_REMOTE=http://192.168.86.50:3000/Antoine/ATOCore.git \
|
||||
bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh
|
||||
```
|
||||
|
||||
The script is idempotent and safe to re-run. It never touches the
|
||||
database directly — schema migrations are applied automatically at
|
||||
service startup by the lifespan handler in `src/atocore/main.py`
|
||||
which calls `init_db()` (which in turn runs the ALTER TABLE
|
||||
statements in `_apply_migrations`).
|
||||
|
||||
### Troubleshooting hostname resolution
|
||||
|
||||
`deploy.sh` defaults `ATOCORE_GIT_REMOTE` to
|
||||
`http://127.0.0.1:3000/Antoine/ATOCore.git` (loopback) because the
|
||||
hostname "dalidou" doesn't reliably resolve on the host itself —
|
||||
the first real Dalidou deploy hit exactly this on 2026-04-08. If
|
||||
you need to override (e.g. running deploy.sh from a laptop against
|
||||
the Dalidou LAN), set `ATOCORE_GIT_REMOTE` explicitly.
|
||||
|
||||
The same applies to `scripts/atocore_client.py`: its default
|
||||
`ATOCORE_BASE_URL` is `http://dalidou:8100` for remote callers, but
|
||||
when running the client on Dalidou itself (or inside the container
|
||||
via `docker exec`), override to loopback:
|
||||
|
||||
```bash
|
||||
ATOCORE_BASE_URL=http://127.0.0.1:8100 \
|
||||
python scripts/atocore_client.py health
|
||||
```
|
||||
|
||||
If you see `{"status": "unavailable", "fail_open": true}` from the
|
||||
client, the first thing to check is whether the base URL resolves
|
||||
from where you're running the client.
|
||||
|
||||
### The deploy.sh self-update race
|
||||
|
||||
When `deploy.sh` itself changes in the commit being pulled, the
|
||||
first run after the update is still executing the *old* script from
|
||||
the bash process's in-memory copy. `git reset --hard` updates the
|
||||
file on disk, but the running bash has already loaded the
|
||||
instructions. On 2026-04-09 this silently shipped an "unknown"
|
||||
`build_sha` because the old Step 2 (which predated env-var export)
|
||||
ran against fresh source.
|
||||
|
||||
`deploy.sh` now detects this: Step 1.5 compares the sha1 of `$0`
|
||||
(the running script) against the sha1 of
|
||||
`$APP_DIR/deploy/dalidou/deploy.sh` (the on-disk copy) after the
|
||||
git reset. If they differ, it sets `ATOCORE_DEPLOY_REEXECED=1` and
|
||||
`exec`s the fresh copy so the rest of the deploy runs under the new
|
||||
script. The sentinel env var prevents infinite recursion.
|
||||
|
||||
You'll see this in the logs as:
|
||||
|
||||
```text
|
||||
==> Step 1.5: deploy.sh changed in the pulled commit; re-exec'ing
|
||||
==> running script hash: <old>
|
||||
==> on-disk script hash: <new>
|
||||
==> re-exec -> /srv/storage/atocore/app/deploy/dalidou/deploy.sh
|
||||
```
|
||||
|
||||
To opt out (debugging, for example), pre-set
|
||||
`ATOCORE_DEPLOY_REEXECED=1` before invoking `deploy.sh` and the
|
||||
self-update guard will be skipped.
|
||||
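The decision made by the self-update guard can be sketched in Python for illustration. `deploy.sh` itself is a bash script, so this is not its code: the function names are hypothetical, and only the sha1 comparison and the `ATOCORE_DEPLOY_REEXECED` sentinel come from the description above.

```python
import hashlib
import os


def file_sha1(path):
    """Return the sha1 hex digest of a file's contents."""
    with open(path, "rb") as fh:
        return hashlib.sha1(fh.read()).hexdigest()


def should_reexec(running_script, on_disk_script, env=None):
    """Decide whether the deploy should re-exec into the on-disk script.

    Hypothetical helper mirroring the guard described above:
    - if the sentinel is already set, never re-exec (prevents recursion
      and doubles as the opt-out);
    - otherwise re-exec only when the two script hashes differ.
    """
    env = os.environ if env is None else env
    if env.get("ATOCORE_DEPLOY_REEXECED") == "1":
        return False
    return file_sha1(running_script) != file_sha1(on_disk_script)
```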
|
||||
### Deployment drift detection
|
||||
|
||||
`/health` reports drift signals at three increasing levels of
|
||||
precision:
|
||||
|
||||
| Field | Source | Precision | When to use |
|
||||
|---|---|---|---|
|
||||
| `version` / `code_version` | `atocore.__version__` (manual bump) | coarse — same value across many commits | quick smoke check that the right *release* is running |
|
||||
| `build_sha` | `ATOCORE_BUILD_SHA` env var, set by `deploy.sh` per build | precise — changes per commit | the canonical drift signal |
|
||||
| `build_time` / `build_branch` | same env var path | per-build | forensics when multiple branches are in flight |
|
||||
|
||||
The **precise** check (run on the laptop or any host that can curl
|
||||
the live service AND has the source repo at hand):
|
||||
|
||||
```bash
|
||||
# What's actually running on Dalidou
|
||||
LIVE_SHA=$(curl -fsS http://dalidou:8100/health | grep -o '"build_sha":"[^"]*"' | cut -d'"' -f4)
|
||||
|
||||
# What the deployed branch tip should be
|
||||
EXPECTED_SHA=$(cd /srv/storage/atocore/app && git rev-parse HEAD)
|
||||
|
||||
# Compare
|
||||
if [ "$LIVE_SHA" = "$EXPECTED_SHA" ]; then
|
||||
echo "live is current at $LIVE_SHA"
|
||||
else
|
||||
echo "DRIFT: live $LIVE_SHA vs expected $EXPECTED_SHA"
|
||||
echo "run deploy.sh to sync"
|
||||
fi
|
||||
```
|
||||
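The `grep` pattern in the check above only matches when the JSON is serialized without a space after the colon. Where Python is available, the same comparison can be done with a real JSON parser. This is a minimal sketch: the function names are hypothetical, and it assumes only that `/health` returns a JSON object with a `build_sha` field, as described above.

```python
import json


def extract_build_sha(health_body: str) -> str:
    """Pull build_sha out of a /health response body.

    Falls back to "unknown" when the field is absent, matching the
    value reported when provenance was never recorded.
    """
    payload = json.loads(health_body)
    return str(payload.get("build_sha", "unknown"))


def drift_status(live_sha: str, expected_sha: str) -> str:
    """Classify the comparison as "current", "drift", or "unknown"."""
    if live_sha == "unknown":
        return "unknown"
    return "current" if live_sha == expected_sha else "drift"
```

A caller would feed it the body returned by `curl -fsS http://dalidou:8100/health` and the output of `git rev-parse HEAD`.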
|
||||
The `deploy.sh` script does exactly this comparison automatically
|
||||
in its post-deploy verification step (Step 6) and exits non-zero
|
||||
on mismatch. So the **simplest drift check** is just to run
|
||||
`deploy.sh` — if there's nothing to deploy, it succeeds quickly;
|
||||
if the live service is stale, it deploys and verifies.
|
||||
|
||||
If `/health` reports `build_sha: "unknown"`, the running container
|
||||
was started without `deploy.sh` (probably via `docker compose up`
|
||||
directly), and the build provenance was never recorded. Re-run
|
||||
via `deploy.sh` to fix.
|
||||
|
||||
The coarse `code_version` check is still useful as a quick visual
|
||||
sanity check — bumping `__version__` from `0.2.0` to `0.3.0`
|
||||
signals a meaningful release boundary even if the precise
|
||||
`build_sha` is what tools should compare against:
|
||||
|
||||
```bash
|
||||
# Quick sanity check (coarse)
|
||||
curl -s http://127.0.0.1:8100/health | grep -o '"code_version":"[^"]*"'
|
||||
grep '__version__' /srv/storage/atocore/app/src/atocore/__init__.py
|
||||
```
|
||||
|
||||
### Schema migrations on redeploy
|
||||
|
||||
When updating from an older `__version__`, the first startup after
|
||||
the redeploy runs the idempotent ALTER TABLE migrations in
|
||||
`_apply_migrations`. For a pre-0.2.0 → 0.2.0 upgrade the migrations
|
||||
add these columns to existing tables (all with safe defaults so no
|
||||
data is touched):
|
||||
|
||||
- `memories.project TEXT DEFAULT ''`
|
||||
- `memories.last_referenced_at DATETIME`
|
||||
- `memories.reference_count INTEGER DEFAULT 0`
|
||||
- `interactions.response TEXT DEFAULT ''`
|
||||
- `interactions.memories_used TEXT DEFAULT '[]'`
|
||||
- `interactions.chunks_used TEXT DEFAULT '[]'`
|
||||
- `interactions.client TEXT DEFAULT ''`
|
||||
- `interactions.session_id TEXT DEFAULT ''`
|
||||
- `interactions.project TEXT DEFAULT ''`
|
||||
|
||||
Plus new indexes on the new columns. No row data is modified. The
|
||||
migration is safe to run against a database that already has the
|
||||
columns — the `_column_exists` check makes each ALTER a no-op in
|
||||
that case.
|
||||
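The idempotent pattern described above can be sketched with the standard `sqlite3` module. This is an illustration under assumptions, not the code in `_apply_migrations`: the helper names and signatures are hypothetical, and table and column names are assumed to come from trusted, hard-coded migration definitions because they are interpolated into SQL.

```python
import sqlite3


def column_exists(conn, table, column):
    """Return True if `table` already has a column named `column`."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each table_info row is (cid, name, type, notnull, dflt_value, pk).
    return any(row[1] == column for row in rows)


def add_column_if_missing(conn, table, column, ddl):
    """Add a column only when it is absent; return True if it was added.

    Hypothetical helper: re-running it against an already-migrated
    database is a no-op, which is what makes the migration idempotent.
    """
    if column_exists(conn, table, column):
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl}")
    return True
```

For example, `add_column_if_missing(conn, "memories", "project", "TEXT DEFAULT ''")` adds the column on the first startup and does nothing on later ones.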
|
||||
Back up the database before any redeploy (via `POST /admin/backup`)
|
||||
if you want a pre-upgrade snapshot. The migration is additive and
|
||||
reversible by restoring the snapshot.
|
||||
|
||||
## Deferred
|
||||
|
||||
- backup automation
|
||||
|
||||
@@ -44,8 +44,9 @@ read-only additive mode.
|
||||
|
||||
### Engineering Layer Planning Sprint
|
||||
|
||||
The engineering layer is intentionally in planning, not implementation.
|
||||
The architecture docs below are the current state of that planning:
|
||||
**Status: complete.** All 8 architecture docs are drafted. The
|
||||
engineering layer is now ready for V1 implementation against the
|
||||
active project set.
|
||||
|
||||
- [engineering-query-catalog.md](architecture/engineering-query-catalog.md) —
|
||||
the 20 v1-required queries the engineering layer must answer
|
||||
@@ -55,17 +56,44 @@ The architecture docs below are the current state of that planning:
|
||||
Layer 0 → Layer 2 pipeline, triggers, review queue mechanics
|
||||
- [conflict-model.md](architecture/conflict-model.md) —
|
||||
detection, representation, and resolution of contradictory facts
|
||||
- [tool-handoff-boundaries.md](architecture/tool-handoff-boundaries.md) —
|
||||
KB-CAD / KB-FEM one-way mirror stance, ingest endpoints, drift handling
|
||||
- [representation-authority.md](architecture/representation-authority.md) —
|
||||
canonical home matrix across PKM / KB / repos / AtoCore for 22 fact kinds
|
||||
- [human-mirror-rules.md](architecture/human-mirror-rules.md) —
|
||||
templates, regeneration triggers, edit flow, "do not edit" enforcement
|
||||
- [engineering-v1-acceptance.md](architecture/engineering-v1-acceptance.md) —
|
||||
measurable done definition with 23 acceptance criteria
|
||||
- [engineering-knowledge-hybrid-architecture.md](architecture/engineering-knowledge-hybrid-architecture.md) —
|
||||
the 5-layer model (from the previous planning wave)
|
||||
- [engineering-ontology-v1.md](architecture/engineering-ontology-v1.md) —
|
||||
the initial V1 object and relationship inventory (previous wave)
|
||||
- [project-identity-canonicalization.md](architecture/project-identity-canonicalization.md) —
|
||||
the helper-at-every-service-boundary contract that keeps the
|
||||
trust hierarchy dependable across alias and canonical-id callers;
|
||||
required reading before adding new project-keyed entity surfaces
|
||||
in the V1 implementation sprint
|
||||
|
||||
Still to draft before engineering-layer implementation begins:
|
||||
The next concrete step is the V1 implementation sprint, which
|
||||
should follow engineering-v1-acceptance.md as its checklist, and
|
||||
must apply the project-identity-canonicalization contract at every
|
||||
new service-layer entry point.
|
||||
|
||||
- tool-handoff-boundaries.md (KB-CAD / KB-FEM read vs write)
|
||||
- human-mirror-rules.md (templates, triggers, edit flow)
|
||||
- representation-authority.md (PKM / KB / repo / AtoCore canonical home matrix)
|
||||
- engineering-v1-acceptance.md (done definition)
|
||||
### LLM Client Integration
|
||||
|
||||
A separate but related architectural concern: how AtoCore is reachable
|
||||
from many different LLM client contexts (OpenClaw, Claude Code, future
|
||||
Codex skills, future MCP server). The layering rule is documented in:
|
||||
|
||||
- [llm-client-integration.md](architecture/llm-client-integration.md) —
|
||||
three-layer shape: HTTP API → shared operator client
|
||||
(`scripts/atocore_client.py`) → per-agent thin frontends; the
|
||||
shared client is the canonical backbone every new client should
|
||||
shell out to instead of reimplementing HTTP calls
|
||||
|
||||
This sits implicitly between Phase 8 (OpenClaw) and Phase 11
|
||||
(multi-model). Memory-review and engineering-entity commands are
|
||||
deferred from the shared client until their workflows are exercised.
|
||||
|
||||
## What Is Real Today
|
||||
|
||||
|
||||
@@ -20,45 +20,65 @@ This working list should be read alongside:
|
||||
|
||||
## Immediate Next Steps
|
||||
|
||||
1. Use the T420 `atocore-context` skill and the new organic routing layer in
|
||||
1. ~~Re-run the backup/restore drill~~ — DONE 2026-04-11, full pass
|
||||
2. ~~Turn on auto-capture of Claude Code sessions~~ — DONE 2026-04-11,
|
||||
Stop hook via `deploy/hooks/capture_stop.py` → `POST /interactions`
|
||||
with `reinforce=false`; kill switch: `ATOCORE_CAPTURE_DISABLED=1`
|
||||
2a. Run a short real-use pilot with auto-capture on
|
||||
- verify interactions are landing in Dalidou
|
||||
- check prompt/response quality and truncation
|
||||
- confirm fail-open: no user-visible impact when Dalidou is down
|
||||
3. Use the T420 `atocore-context` skill and the new organic routing layer in
|
||||
real OpenClaw workflows
|
||||
- confirm `auto-context` feels natural
|
||||
- confirm project inference is good enough in practice
|
||||
- confirm the fail-open behavior remains acceptable in practice
|
||||
2. Review retrieval quality after the first real project ingestion batch
|
||||
4. Review retrieval quality after the first real project ingestion batch
|
||||
- check whether the top hits are useful
|
||||
- check whether trusted project state remains dominant
|
||||
- reduce cross-project competition and prompt ambiguity where needed
|
||||
- use `debug-context` to inspect the exact last AtoCore supplement
|
||||
3. Treat the active-project full markdown/text wave as complete
|
||||
5. Treat the active-project full markdown/text wave as complete
|
||||
- `p04-gigabit`
|
||||
- `p05-interferometer`
|
||||
- `p06-polisher`
|
||||
4. Define a cleaner source refresh model
|
||||
6. Define a cleaner source refresh model
|
||||
- make the difference between source truth, staged inputs, and machine store
|
||||
explicit
|
||||
- move toward a project source registry and refresh workflow
|
||||
- foundation now exists via project registry + per-project refresh API
|
||||
- registration policy + template + proposal + approved registration are now
|
||||
the normal path for new projects
|
||||
5. Move to Wave 2 trusted-operational ingestion
|
||||
7. Move to Wave 2 trusted-operational ingestion
|
||||
- curated dashboards
|
||||
- decision logs
|
||||
- milestone/current-status views
|
||||
- operational truth, not just raw project notes
|
||||
6. Integrate the new engineering architecture docs into active planning, not immediate schema code
|
||||
8. Integrate the new engineering architecture docs into active planning, not immediate schema code
|
||||
- keep `docs/architecture/engineering-knowledge-hybrid-architecture.md` as the target layer model
|
||||
- keep `docs/architecture/engineering-ontology-v1.md` as the V1 structured-domain target
|
||||
- do not start entity/relationship persistence until the ingestion, retrieval, registry, and backup baseline feels boring and stable
|
||||
7. Define backup and export procedures for Dalidou
|
||||
- exercise the new SQLite + registry snapshot path on Dalidou
|
||||
- Chroma backup or rebuild policy
|
||||
- retention and restore validation
|
||||
- admin backup endpoint now supports `include_chroma` cold snapshot
|
||||
under the ingestion lock and `validate` confirms each snapshot is
|
||||
openable; remaining work is the operational retention policy
|
||||
8. Keep deeper automatic runtime integration modest until the organic read-only
|
||||
model has proven value
|
||||
9. Finish the boring operations baseline around backup
|
||||
- retention policy cleanup script (snapshots dir grows
|
||||
monotonically today)
|
||||
- off-Dalidou backup target (at minimum an rsync to laptop or
|
||||
another host so a single-disk failure isn't terminal)
|
||||
- automatic post-backup validation (have `create_runtime_backup`
|
||||
call `validate_backup` on its own output and refuse to
|
||||
declare success if validation fails)
|
||||
- DONE in commits be40994 / 0382238 / 3362080 / this one:
|
||||
- `create_runtime_backup` + `list_runtime_backups` +
|
||||
`validate_backup` + `restore_runtime_backup` with CLI
|
||||
- `POST /admin/backup` with `include_chroma=true` under
|
||||
the ingestion lock
|
||||
- `/health` build_sha / build_time / build_branch provenance
|
||||
- `deploy.sh` self-update re-exec guard + build_sha drift
|
||||
verification
|
||||
- live drill procedure in `docs/backup-restore-procedure.md`
|
||||
with failure-mode table and the memory_type=episodic
|
||||
marker pattern from the 2026-04-09 drill
|
||||
10. Keep deeper automatic runtime integration modest until the organic read-only
|
||||
model has proven value
|
||||
|
||||
## Trusted State Status
|
||||
|
||||
|
||||
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
|
||||
|
||||
[project]
|
||||
name = "atocore"
|
||||
version = "0.1.0"
|
||||
version = "0.2.0"
|
||||
description = "Personal context engine for LLM interactions"
|
||||
requires-python = ">=3.11"
|
||||
dependencies = [
|
||||
|
||||
@@ -1,8 +1,43 @@
|
||||
"""Operator-facing API client for live AtoCore instances.
|
||||
|
||||
This script is intentionally external to the app runtime. It is for admins and
|
||||
operators who want a convenient way to inspect live project state, refresh
|
||||
projects, audit retrieval quality, and manage trusted project-state entries.
|
||||
This script is intentionally external to the app runtime. It is for admins
|
||||
and operators who want a convenient way to inspect live project state,
|
||||
refresh projects, audit retrieval quality, manage trusted project-state
|
||||
entries, and drive the Phase 9 reflection loop (capture, extract, queue,
|
||||
promote, reject).
|
||||
|
||||
Environment variables
|
||||
---------------------
|
||||
|
||||
ATOCORE_BASE_URL
|
||||
Base URL of the AtoCore service (default: ``http://dalidou:8100``).
|
||||
|
||||
When running ON the Dalidou host itself or INSIDE the Dalidou
|
||||
container, override this with loopback or the real IP::
|
||||
|
||||
ATOCORE_BASE_URL=http://127.0.0.1:8100 \\
|
||||
python scripts/atocore_client.py health
|
||||
|
||||
The default hostname "dalidou" is meant for cases where the
|
||||
caller is a remote machine (laptop, T420/OpenClaw, etc.) with
|
||||
"dalidou" in its /etc/hosts or resolvable via Tailscale. It does
|
||||
NOT reliably resolve on the host itself or inside the container,
|
||||
and when it fails the client returns
|
||||
``{"status": "unavailable", "fail_open": true}`` — the right
|
||||
diagnosis when that happens is to set ATOCORE_BASE_URL explicitly
|
||||
to 127.0.0.1:8100 and retry.
|
||||
|
||||
ATOCORE_TIMEOUT_SECONDS
|
||||
Request timeout for most operations (default: 30).
|
||||
|
||||
ATOCORE_REFRESH_TIMEOUT_SECONDS
|
||||
Longer timeout for project refresh operations which can be slow
|
||||
(default: 1800).
|
||||
|
||||
ATOCORE_FAIL_OPEN
|
||||
When "true" (default), network errors return a small fail-open
|
||||
envelope instead of raising. Set to "false" for admin operations
|
||||
where you need the real error.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
@@ -23,6 +58,15 @@ TIMEOUT = int(os.environ.get("ATOCORE_TIMEOUT_SECONDS", "30"))
|
||||
REFRESH_TIMEOUT = int(os.environ.get("ATOCORE_REFRESH_TIMEOUT_SECONDS", "1800"))
|
||||
FAIL_OPEN = os.environ.get("ATOCORE_FAIL_OPEN", "true").lower() == "true"
|
||||
|
||||
# Bumped when the subcommand surface or JSON output shapes meaningfully
|
||||
# change. See docs/architecture/llm-client-integration.md for the
|
||||
# semver rules. History:
|
||||
# 0.1.0 initial stable-ops-only client
|
||||
# 0.2.0 Phase 9 reflection loop added: capture, extract,
|
||||
# reinforce-interaction, list-interactions, get-interaction,
|
||||
# queue, promote, reject
|
||||
CLIENT_VERSION = "0.2.0"
|
||||
|
||||
|
||||
def print_json(payload: Any) -> None:
|
||||
print(json.dumps(payload, ensure_ascii=True, indent=2))
|
||||
@@ -243,6 +287,59 @@ def build_parser() -> argparse.ArgumentParser:
|
||||
p.add_argument("top_k", nargs="?", type=int, default=5)
|
||||
p.add_argument("project", nargs="?", default="")
|
||||
|
||||
# --- Phase 9 reflection loop surface --------------------------------
|
||||
#
|
||||
# capture: record one interaction (prompt + response + context used).
|
||||
# Mirrors POST /interactions. response is positional so shell
|
||||
# callers can pass it via $(cat file.txt) or heredoc. project,
|
||||
# client, and session_id are optional positionals with empty
|
||||
# defaults, matching the existing script's style.
|
||||
p = sub.add_parser("capture")
|
||||
p.add_argument("prompt")
|
||||
p.add_argument("response", nargs="?", default="")
|
||||
p.add_argument("project", nargs="?", default="")
|
||||
p.add_argument("client", nargs="?", default="")
|
||||
p.add_argument("session_id", nargs="?", default="")
|
||||
p.add_argument("reinforce", nargs="?", default="true")
|
||||
|
||||
# extract: run the Phase 9 C rule-based extractor against an
|
||||
# already-captured interaction. persist='true' writes the
|
||||
# candidates as status='candidate' memories; default is
|
||||
# preview-only.
|
||||
p = sub.add_parser("extract")
|
||||
p.add_argument("interaction_id")
|
||||
p.add_argument("persist", nargs="?", default="false")
|
||||
|
||||
# reinforce: backfill reinforcement on an already-captured interaction.
|
||||
p = sub.add_parser("reinforce-interaction")
|
||||
p.add_argument("interaction_id")
|
||||
|
||||
# list-interactions: paginated listing with filters.
|
||||
p = sub.add_parser("list-interactions")
|
||||
p.add_argument("project", nargs="?", default="")
|
||||
p.add_argument("session_id", nargs="?", default="")
|
||||
p.add_argument("client", nargs="?", default="")
|
||||
p.add_argument("since", nargs="?", default="")
|
||||
p.add_argument("limit", nargs="?", type=int, default=50)
|
||||
|
||||
# get-interaction: fetch one by id
|
||||
p = sub.add_parser("get-interaction")
|
||||
p.add_argument("interaction_id")
|
||||
|
||||
# queue: list the candidate review queue
|
||||
p = sub.add_parser("queue")
|
||||
p.add_argument("memory_type", nargs="?", default="")
|
||||
p.add_argument("project", nargs="?", default="")
|
||||
p.add_argument("limit", nargs="?", type=int, default=50)
|
||||
|
||||
# promote: candidate -> active
|
||||
p = sub.add_parser("promote")
|
||||
p.add_argument("memory_id")
|
||||
|
||||
# reject: candidate -> invalid
|
||||
p = sub.add_parser("reject")
|
||||
p.add_argument("memory_id")
|
||||
|
||||
return parser
|
||||
|
||||
|
||||
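The subcommands above rely on trailing positionals with `nargs="?"` and string-valued booleans. A minimal standalone sketch of that pattern follows, assuming only the `capture` subcommand; `build_capture_parser` and `as_bool` are hypothetical names used for illustration and are not part of the client.

```python
import argparse

# Same truthy strings the dispatcher accepts for `reinforce` and `persist`.
TRUTHY = {"1", "true", "yes", "y"}


def as_bool(value: str) -> bool:
    """Interpret a CLI string flag the way the capture/extract branches do."""
    return value.lower() in TRUTHY


def build_capture_parser() -> argparse.ArgumentParser:
    """Hypothetical standalone parser mirroring only the `capture` subcommand."""
    parser = argparse.ArgumentParser(prog="atocore-client-sketch")
    sub = parser.add_subparsers(dest="command", required=True)
    p = sub.add_parser("capture")
    p.add_argument("prompt")
    # Optional positionals fill left to right; omitted ones keep their default.
    for name in ("response", "project", "client", "session_id"):
        p.add_argument(name, nargs="?", default="")
    p.add_argument("reinforce", nargs="?", default="true")
    return parser


args = build_capture_parser().parse_args(
    ["capture", "What changed?", "Nothing yet.", "p05"]
)
```

Because the arguments are positional, they can only be omitted from the right: to set `session_id` a caller would also have to pass placeholder values (for example empty strings) for `response`, `project`, and `client`.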
@@ -304,6 +401,79 @@ def main() -> int:
|
||||
print_json(request("POST", "/context/build", {"prompt": args.prompt, "project": args.project or None, "budget": args.budget}))
|
||||
elif cmd == "audit-query":
|
||||
print_json(audit_query(args.prompt, args.top_k, args.project or None))
|
||||
# --- Phase 9 reflection loop surface ------------------------------
|
||||
elif cmd == "capture":
|
||||
body: dict[str, Any] = {
|
||||
"prompt": args.prompt,
|
||||
"response": args.response,
|
||||
"project": args.project,
|
||||
"client": args.client or "atocore-client",
|
||||
"session_id": args.session_id,
|
||||
"reinforce": args.reinforce.lower() in {"1", "true", "yes", "y"},
|
||||
}
|
||||
print_json(request("POST", "/interactions", body))
|
||||
elif cmd == "extract":
|
||||
persist = args.persist.lower() in {"1", "true", "yes", "y"}
|
||||
print_json(
|
||||
request(
|
||||
"POST",
|
||||
f"/interactions/{urllib.parse.quote(args.interaction_id, safe='')}/extract",
|
||||
{"persist": persist},
|
||||
)
|
||||
)
|
||||
elif cmd == "reinforce-interaction":
|
||||
print_json(
|
||||
request(
|
||||
"POST",
|
||||
f"/interactions/{urllib.parse.quote(args.interaction_id, safe='')}/reinforce",
|
||||
{},
|
||||
)
|
||||
)
|
||||
elif cmd == "list-interactions":
|
||||
query_parts: list[str] = []
|
||||
if args.project:
|
||||
query_parts.append(f"project={urllib.parse.quote(args.project)}")
|
||||
if args.session_id:
|
||||
query_parts.append(f"session_id={urllib.parse.quote(args.session_id)}")
|
||||
if args.client:
|
||||
query_parts.append(f"client={urllib.parse.quote(args.client)}")
|
||||
if args.since:
|
||||
query_parts.append(f"since={urllib.parse.quote(args.since)}")
|
||||
query_parts.append(f"limit={int(args.limit)}")
|
||||
query = "?" + "&".join(query_parts)
|
||||
print_json(request("GET", f"/interactions{query}"))
|
||||
elif cmd == "get-interaction":
|
||||
print_json(
|
||||
request(
|
||||
"GET",
|
||||
f"/interactions/{urllib.parse.quote(args.interaction_id, safe='')}",
|
||||
)
|
||||
)
|
||||
elif cmd == "queue":
|
||||
query_parts = ["status=candidate"]
|
||||
if args.memory_type:
|
||||
query_parts.append(f"memory_type={urllib.parse.quote(args.memory_type)}")
|
||||
if args.project:
|
||||
query_parts.append(f"project={urllib.parse.quote(args.project)}")
|
||||
query_parts.append(f"limit={int(args.limit)}")
|
||||
query = "?" + "&".join(query_parts)
|
||||
print_json(request("GET", f"/memory{query}"))
|
||||
elif cmd == "promote":
|
||||
print_json(
|
||||
request(
|
||||
"POST",
|
||||
f"/memory/{urllib.parse.quote(args.memory_id, safe='')}/promote",
|
||||
{},
|
||||
)
|
||||
)
|
||||
elif cmd == "reject":
|
||||
print_json(
|
||||
request(
|
||||
"POST",
|
||||
f"/memory/{urllib.parse.quote(args.memory_id, safe='')}/reject",
|
||||
{},
|
||||
)
|
||||
)
|
||||
else:
|
||||
return 1
|
||||
return 0
|
||||
|
||||
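The `list-interactions` and `queue` branches assemble query strings by hand with `urllib.parse.quote`. As an alternative sketch (not what the diff does), the standard library's `urlencode` can build the same kind of query in one call; `build_query` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def build_query(limit: int = 50, **filters: str) -> str:
    """Build a query string, skipping empty filters; `limit` always comes last."""
    params = {key: value for key, value in filters.items() if value}
    params["limit"] = str(int(limit))
    # urlencode escapes reserved characters such as ':' and '&' in values.
    return "?" + urlencode(params)


interactions_query = build_query(
    project="p05", session_id="", since="2026-04-09T10:00:00Z"
)
queue_query = build_query(status="candidate", memory_type="", project="p05", limit=10)
```

One behavioral difference to keep in mind: `urlencode` uses `quote_plus` by default, so spaces become `+` rather than `%20`.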
1018
scripts/migrate_legacy_aliases.py
Normal file
File diff suppressed because it is too large
@@ -1,3 +1,15 @@
|
||||
"""AtoCore — Personal Context Engine."""
|
||||
|
||||
__version__ = "0.1.0"
|
||||
# Bumped when a commit meaningfully changes the API surface, schema, or
|
||||
# user-visible behavior. The /health endpoint reports this value so
|
||||
# deployment drift is immediately visible: if the running service's
|
||||
# /health reports an older version than the main branch's __version__,
|
||||
# the deployment is stale and needs a redeploy (see
|
||||
# docs/dalidou-deployment.md and deploy/dalidou/deploy.sh).
|
||||
#
|
||||
# History:
|
||||
# 0.1.0 Phase 0/0.5/1/2/3/5/7 baseline
|
||||
# 0.2.0 Phase 9 reflection loop (capture/reinforce/extract + review
|
||||
# queue), shared client v0.2.0, project identity
|
||||
# canonicalization at every service-layer entry point
|
||||
__version__ = "0.2.0"
|
||||
|
||||
@@ -742,12 +742,45 @@ def api_validate_backup(stamp: str) -> dict:
|
||||
|
||||
@router.get("/health")
|
||||
def api_health() -> dict:
|
||||
"""Health check."""
|
||||
"""Health check.
|
||||
|
||||
Three layers of version reporting, in increasing precision:
|
||||
|
||||
- ``version`` / ``code_version``: ``atocore.__version__`` (e.g.
|
||||
"0.2.0"). Bumped manually on commits that change the API
|
||||
surface, schema, or user-visible behavior. Coarse — any
|
||||
number of commits can land between bumps without changing
|
||||
this value.
|
||||
- ``build_sha``: full git SHA of the commit the running
|
||||
container was built from. Set by ``deploy/dalidou/deploy.sh``
|
||||
via the ``ATOCORE_BUILD_SHA`` env var on every rebuild.
|
||||
Reports ``"unknown"`` for builds that bypass deploy.sh
|
||||
(direct ``docker compose up`` etc.). This is the precise
|
||||
drift signal: if the live ``build_sha`` doesn't match the
|
||||
tip of the deployed branch on Gitea, the service is stale
|
||||
regardless of what ``code_version`` says.
|
||||
- ``build_time`` / ``build_branch``: when and from which branch
|
||||
the live container was built. Useful for forensics when
|
||||
multiple branches are in flight or when build_sha is
|
||||
ambiguous (e.g. a force-push to the same SHA).
|
||||
|
||||
The deploy.sh post-deploy verification step compares the live
|
||||
``build_sha`` to the SHA it just set, and exits non-zero on
|
||||
mismatch.
|
||||
"""
|
||||
import os
|
||||
|
||||
from atocore import __version__
|
||||
|
||||
store = get_vector_store()
|
||||
source_status = get_source_status()
|
||||
return {
|
||||
"status": "ok",
|
||||
"version": "0.1.0",
|
||||
"version": __version__,
|
||||
"code_version": __version__,
|
||||
"build_sha": os.environ.get("ATOCORE_BUILD_SHA", "unknown"),
|
||||
"build_time": os.environ.get("ATOCORE_BUILD_TIME", "unknown"),
|
||||
"build_branch": os.environ.get("ATOCORE_BUILD_BRANCH", "unknown"),
|
||||
"vectors_count": store.count,
|
||||
"env": _config.settings.env,
|
||||
"machine_paths": {
|
||||
|
||||
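The docstring above describes `build_sha` as the precise drift signal. A minimal sketch of a client-side check over an already-fetched `/health` payload might look like this; `deployment_is_stale` is a hypothetical helper, and treating `"unknown"` as stale is an assumption made here, not something the diff specifies.

```python
def deployment_is_stale(health: dict, expected_sha: str) -> bool:
    """Return True when the live build does not match the expected commit."""
    live_sha = health.get("build_sha", "unknown")
    if live_sha == "unknown":
        # Built outside deploy.sh, so drift cannot be ruled out (assumption:
        # we choose to flag this case rather than ignore it).
        return True
    return live_sha != expected_sha


# Example payloads shaped like the fields the endpoint returns.
fresh = {"code_version": "0.2.0", "build_sha": "abc123"}
bypassed = {"code_version": "0.2.0", "build_sha": "unknown"}
```

Note that `code_version` is deliberately ignored: per the docstring, any number of commits can land between version bumps.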
@@ -14,6 +14,7 @@ import atocore.config as _config
|
||||
from atocore.context.project_state import format_project_state, get_state
|
||||
from atocore.memory.service import get_memories_for_context
|
||||
from atocore.observability.logger import get_logger
|
||||
from atocore.projects.registry import resolve_project_name
|
||||
from atocore.retrieval.retriever import ChunkResult, retrieve
|
||||
|
||||
log = get_logger("context_builder")
|
||||
@@ -84,8 +85,16 @@ def build_context(
|
||||
max(0, int(budget * PROJECT_STATE_BUDGET_RATIO)),
|
||||
)
|
||||
|
||||
if project_hint:
|
||||
state_entries = get_state(project_hint)
|
||||
# Canonicalize the project hint through the registry so callers
|
||||
# can pass an alias (`p05`, `gigabit`) and still find trusted
|
||||
# state stored under the canonical project id. The same helper
|
||||
# is used everywhere a project name crosses a trust boundary
|
||||
# (project_state, memories, interactions). When the registry has
|
||||
# no entry the helper returns the input unchanged so hand-curated
|
||||
# state that predates the registry still works.
|
||||
canonical_project = resolve_project_name(project_hint) if project_hint else ""
|
||||
if canonical_project:
|
||||
state_entries = get_state(canonical_project)
|
||||
if state_entries:
|
||||
project_state_text = format_project_state(state_entries)
|
||||
project_state_text, project_state_chars = _truncate_text_block(
|
||||
|
||||
@@ -18,6 +18,7 @@ from datetime import datetime, timezone
|
||||
|
||||
from atocore.models.database import get_connection
|
||||
from atocore.observability.logger import get_logger
|
||||
from atocore.projects.registry import resolve_project_name
|
||||
|
||||
log = get_logger("project_state")
|
||||
|
||||
@@ -101,11 +102,19 @@ def set_state(
|
||||
source: str = "",
|
||||
confidence: float = 1.0,
|
||||
) -> ProjectStateEntry:
|
||||
"""Set or update a project state entry. Upsert semantics."""
|
||||
"""Set or update a project state entry. Upsert semantics.
|
||||
|
||||
The ``project_name`` is canonicalized through the registry so a
|
||||
caller passing an alias (``p05``) ends up writing into the same
|
||||
row as the canonical id (``p05-interferometer``). Without this
|
||||
step, alias and canonical names would create two parallel
|
||||
project rows and fragmented state.
|
||||
"""
|
||||
if category not in CATEGORIES:
|
||||
raise ValueError(f"Invalid category '{category}'. Must be one of: {CATEGORIES}")
|
||||
_validate_confidence(confidence)
|
||||
|
||||
project_name = resolve_project_name(project_name)
|
||||
project_id = ensure_project(project_name)
|
||||
entry_id = str(uuid.uuid4())
|
||||
now = datetime.now(timezone.utc).isoformat()
|
||||
@@ -153,7 +162,12 @@ def get_state(
|
||||
category: str | None = None,
|
||||
active_only: bool = True,
|
||||
) -> list[ProjectStateEntry]:
|
||||
"""Get project state entries, optionally filtered by category."""
|
||||
"""Get project state entries, optionally filtered by category.
|
||||
|
||||
The lookup is canonicalized through the registry so an alias hint
|
||||
finds the same rows as the canonical id.
|
||||
"""
|
||||
project_name = resolve_project_name(project_name)
|
||||
with get_connection() as conn:
|
||||
project = conn.execute(
|
||||
"SELECT id FROM projects WHERE lower(name) = lower(?)", (project_name,)
|
||||
@@ -191,7 +205,12 @@ def get_state(
|
||||
|
||||
|
||||
def invalidate_state(project_name: str, category: str, key: str) -> bool:
|
||||
"""Mark a project state entry as superseded."""
|
||||
"""Mark a project state entry as superseded.
|
||||
|
||||
The lookup is canonicalized through the registry so an alias is
|
||||
treated as the canonical project for the invalidation lookup.
|
||||
"""
|
||||
project_name = resolve_project_name(project_name)
|
||||
with get_connection() as conn:
|
||||
project = conn.execute(
|
||||
"SELECT id FROM projects WHERE lower(name) = lower(?)", (project_name,)
|
||||
|
||||
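The hunks above route every project name through `resolve_project_name` before touching storage. The real helper lives in `atocore.projects.registry` and is not shown in this diff, so the following is only a sketch of the contract the comments describe (alias maps to canonical id, unknown names pass through unchanged). The registry layout and the `_sketch` function name are assumptions.

```python
# Hypothetical registry layout: canonical project id -> known aliases.
REGISTRY = {"p05-interferometer": ["p05"]}


def resolve_project_name_sketch(name: str) -> str:
    """Map an alias to its canonical id; unknown names pass through unchanged."""
    for canonical, aliases in REGISTRY.items():
        if name == canonical or name in aliases:
            return canonical
    return name


# Toy state store keyed by canonical id, showing why both sides canonicalize.
state: dict[str, dict[str, str]] = {}


def set_state_sketch(project: str, key: str, value: str) -> None:
    state.setdefault(resolve_project_name_sketch(project), {})[key] = value


def get_state_sketch(project: str) -> dict[str, str]:
    return state.get(resolve_project_name_sketch(project), {})


set_state_sketch("p05", "status", "active")
```

With canonicalization on both the write and the read path, a value written under the alias is found under the canonical id and vice versa, so no parallel project rows appear.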
@@ -18,15 +18,24 @@ violating the AtoCore trust hierarchy.
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import re
|
||||
import uuid
|
||||
from dataclasses import dataclass, field
|
||||
from datetime import datetime, timezone
|
||||
|
||||
from atocore.models.database import get_connection
|
||||
from atocore.observability.logger import get_logger
|
||||
from atocore.projects.registry import resolve_project_name
|
||||
|
||||
log = get_logger("interactions")
|
||||
|
||||
# Stored timestamps use 'YYYY-MM-DD HH:MM:SS' (no timezone offset, UTC by
|
||||
# convention) so they sort lexically and compare cleanly with the SQLite
|
||||
# CURRENT_TIMESTAMP default. The since filter accepts ISO 8601 strings
|
||||
# (with 'T', optional 'Z' or +offset, optional fractional seconds) and
|
||||
# normalizes them to the storage format before the SQL comparison.
|
||||
_STORAGE_TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"
|
||||
|
||||
|
||||
@dataclass
|
||||
class Interaction:
|
||||
@@ -72,6 +81,13 @@ def record_interaction(
|
||||
if not prompt or not prompt.strip():
|
||||
raise ValueError("Interaction prompt must be non-empty")
|
||||
|
||||
# Canonicalize the project through the registry so an alias and
|
||||
# the canonical id store under the same bucket. Without this,
|
||||
# reinforcement and extraction (which both query by raw
|
||||
# interaction.project) would silently miss memories and create
|
||||
# candidates in the wrong project.
|
||||
project = resolve_project_name(project)
|
||||
|
||||
interaction_id = str(uuid.uuid4())
|
||||
# Store created_at explicitly so the same string lives in both the DB
|
||||
# column and the returned dataclass. SQLite's CURRENT_TIMESTAMP uses
|
||||
@@ -159,9 +175,14 @@ def list_interactions(
|
||||
) -> list[Interaction]:
|
||||
"""List captured interactions, optionally filtered.
|
||||
|
||||
``since`` is an ISO timestamp string; only interactions created at or
|
||||
after that time are returned. ``limit`` is hard-capped at 500 to keep
|
||||
casual API listings cheap.
|
||||
``since`` accepts an ISO 8601 timestamp string (with ``T``, an
|
||||
optional ``Z`` or numeric offset, optional fractional seconds).
|
||||
The value is normalized to the storage format (UTC,
|
||||
``YYYY-MM-DD HH:MM:SS``) before the SQL comparison so external
|
||||
callers can pass any of the common ISO shapes without filter
|
||||
drift. ``project`` is canonicalized through the registry so an
|
||||
alias finds rows stored under the canonical project id.
|
||||
``limit`` is hard-capped at 500 to keep casual API listings cheap.
|
||||
"""
|
||||
if limit <= 0:
|
||||
return []
|
||||
@@ -172,7 +193,7 @@ def list_interactions(
|
||||
|
||||
if project:
|
||||
query += " AND project = ?"
|
||||
params.append(project)
|
||||
params.append(resolve_project_name(project))
|
||||
if session_id:
|
||||
query += " AND session_id = ?"
|
||||
params.append(session_id)
|
||||
@@ -181,7 +202,7 @@ def list_interactions(
|
||||
params.append(client)
|
||||
if since:
|
||||
query += " AND created_at >= ?"
|
||||
params.append(since)
|
||||
params.append(_normalize_since(since))
|
||||
|
||||
query += " ORDER BY created_at DESC LIMIT ?"
|
||||
params.append(limit)
|
||||
@@ -243,3 +264,41 @@ def _safe_json_dict(raw: str | None) -> dict:
|
||||
if not isinstance(value, dict):
|
||||
return {}
|
||||
return value
|
||||
|
||||
|
||||
def _normalize_since(since: str) -> str:
|
||||
"""Normalize an ISO 8601 ``since`` filter to the storage format.
|
||||
|
||||
Stored ``created_at`` values are ``YYYY-MM-DD HH:MM:SS`` (no
|
||||
timezone, UTC by convention). External callers naturally pass
|
||||
ISO 8601 with ``T`` separator, optional ``Z`` suffix, optional
|
||||
fractional seconds, and optional ``+HH:MM`` offsets. A naive
|
||||
string comparison between the two formats fails on the same
|
||||
day because the lexically-greater ``T`` makes any ISO value
|
||||
sort after any space-separated value.
|
||||
|
||||
This helper accepts the common ISO shapes plus the bare
|
||||
storage format and returns the storage format. On a parse
|
||||
failure it returns the input unchanged, so the SQL comparison
|
||||
runs against the raw string (typically matching no rows)
|
||||
instead of raising and breaking the listing endpoint.
|
||||
"""
|
||||
if not since:
|
||||
return since
|
||||
candidate = since.strip()
|
||||
# Python's fromisoformat only understands a trailing 'Z' from 3.11
|
||||
# onward, so we replace it explicitly to support earlier versions.
|
||||
if candidate.endswith("Z"):
|
||||
candidate = candidate[:-1] + "+00:00"
|
||||
try:
|
||||
dt = datetime.fromisoformat(candidate)
|
||||
except ValueError:
|
||||
# Already in storage format, or unparseable: best-effort
|
||||
# match the storage format with a regex; if that fails too,
|
||||
# return the raw input.
|
||||
if re.fullmatch(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}", since):
|
||||
return since
|
||||
return since
|
||||
if dt.tzinfo is not None:
|
||||
dt = dt.astimezone(timezone.utc).replace(tzinfo=None)
|
||||
return dt.strftime(_STORAGE_TIMESTAMP_FORMAT)
|
||||
|
||||
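The lexical-ordering problem that `_normalize_since` addresses is easy to reproduce. The following condensed sketch mirrors the helper's happy path (it is a standalone illustration, not the module's code; `normalize_since` and `STORAGE_FORMAT` are local names) and shows the comparison before and after normalization.

```python
from datetime import datetime, timezone

STORAGE_FORMAT = "%Y-%m-%d %H:%M:%S"


def normalize_since(since: str) -> str:
    """Convert a common ISO 8601 shape to 'YYYY-MM-DD HH:MM:SS' in UTC."""
    candidate = since.strip()
    if candidate.endswith("Z"):
        candidate = candidate[:-1] + "+00:00"
    try:
        dt = datetime.fromisoformat(candidate)
    except ValueError:
        return since  # unparseable: hand the raw value back
    if dt.tzinfo is not None:
        dt = dt.astimezone(timezone.utc).replace(tzinfo=None)
    return dt.strftime(STORAGE_FORMAT)


stored = "2026-04-09 23:59:59"           # a row created late in the day
raw_filter = "2026-04-09T00:00:00Z"      # caller asks for "since midnight"

# 'T' (0x54) sorts after ' ' (0x20), so the raw comparison wrongly excludes the row.
raw_match = stored >= raw_filter
normalized_match = stored >= normalize_since(raw_filter)
```

Offsets are folded into UTC as well, so `2026-04-09T12:00:00+02:00` and `2026-04-09T10:00:00Z` normalize to the same stored value.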
@@ -4,6 +4,7 @@ from contextlib import asynccontextmanager
|
||||
|
||||
from fastapi import FastAPI
|
||||
|
||||
from atocore import __version__
|
||||
from atocore.api.routes import router
|
||||
import atocore.config as _config
|
||||
from atocore.context.project_state import init_project_state_schema
|
||||
@@ -43,7 +44,7 @@ async def lifespan(app: FastAPI):
|
||||
app = FastAPI(
|
||||
title="AtoCore",
|
||||
description="Personal Context Engine for LLM interactions",
|
||||
version="0.1.0",
|
||||
version=__version__,
|
||||
lifespan=lifespan,
|
||||
)
|
||||
|
||||
|
||||
@@ -8,10 +8,11 @@ given memory, without ever promoting anything new into trusted state.
|
||||
|
||||
Design notes
|
||||
------------
|
||||
- Matching is intentionally simple and explainable:
|
||||
* normalize both sides (lowercase, collapse whitespace)
|
||||
* require the normalized memory content (or its first 80 chars) to
|
||||
appear as a substring in the normalized response
|
||||
- Matching uses token-overlap: tokenize both sides (lowercase, stem,
|
||||
drop stop words), then check whether >= 70 % of the memory's content
|
||||
tokens appear in the response token set. This handles natural
|
||||
paraphrases (e.g. "prefers" vs "prefer", "because history" vs
|
||||
"because the history") that substring matching missed.
|
||||
- Candidates and invalidated memories are NEVER considered — reinforcement
|
||||
must not revive history.
|
||||
- Reinforcement is capped at 1.0 and monotonically non-decreasing.
|
||||
@@ -43,9 +44,12 @@ log = get_logger("reinforcement")
|
||||
# memories like "prefers Python".
|
||||
_MIN_MEMORY_CONTENT_LENGTH = 12
|
||||
|
||||
# When a memory's content is very long, match on its leading window only
|
||||
# to avoid punishing small paraphrases further into the body.
|
||||
_MATCH_WINDOW_CHARS = 80
|
||||
# Token-overlap matching constants.
|
||||
_STOP_WORDS: frozenset[str] = frozenset({
|
||||
"the", "a", "an", "and", "or", "of", "to", "is", "was",
|
||||
"that", "this", "with", "for", "from", "into",
|
||||
})
|
||||
_MATCH_THRESHOLD = 0.70
|
||||
|
||||
DEFAULT_CONFIDENCE_DELTA = 0.02
|
||||
|
||||
@@ -144,12 +148,58 @@ def _normalize(text: str) -> str:
|
||||
return collapsed.strip()
|
||||
|
||||
|
||||
def _stem(word: str) -> str:
|
||||
"""Aggressive suffix-folding so inflected forms collapse.
|
||||
|
||||
Handles trailing ``ing``, ``ed``, and ``s`` — good enough for
|
||||
reinforcement matching without pulling in nltk/snowball.
|
||||
"""
|
||||
# Order matters: try longest suffix first.
|
||||
if word.endswith("ing") and len(word) >= 6:
|
||||
return word[:-3]
|
||||
if word.endswith("ed") and len(word) > 4:
|
||||
stem = word[:-2]
|
||||
# "preferred" → "preferr" → "prefer" (doubled consonant before -ed)
|
||||
if len(stem) >= 3 and stem[-1] == stem[-2]:
|
||||
stem = stem[:-1]
|
||||
return stem
|
||||
if word.endswith("s") and len(word) > 3:
|
||||
return word[:-1]
|
||||
return word
|
||||
|
||||
|
||||
def _tokenize(text: str) -> set[str]:
|
||||
"""Split normalized text into a stemmed token set.
|
||||
|
||||
Strips punctuation, drops words shorter than 3 chars and stop words.
|
||||
"""
|
||||
tokens: set[str] = set()
|
||||
for raw in text.split():
|
||||
# Strip leading/trailing punctuation (commas, periods, quotes, etc.)
|
||||
word = raw.strip(".,;:!?\"'()[]{}-/")
|
||||
if len(word) < 3:
|
||||
continue
|
||||
if word in _STOP_WORDS:
|
||||
continue
|
||||
tokens.add(_stem(word))
|
||||
return tokens
|
||||
|
||||
|
||||
def _memory_matches(memory_content: str, normalized_response: str) -> bool:
|
||||
"""Return True if the memory content appears in the response."""
|
||||
"""Return True if enough of the memory's tokens appear in the response.
|
||||
|
||||
Uses token-overlap: tokenize both sides (lowercase, stem, drop stop
|
||||
words), then check whether >= 70 % of the memory's content tokens
|
||||
appear in the response token set.
|
||||
"""
|
||||
if not memory_content:
|
||||
return False
|
||||
normalized_memory = _normalize(memory_content)
|
||||
if len(normalized_memory) < _MIN_MEMORY_CONTENT_LENGTH:
|
||||
return False
|
||||
window = normalized_memory[:_MATCH_WINDOW_CHARS]
|
||||
return window in normalized_response
|
||||
memory_tokens = _tokenize(normalized_memory)
|
||||
if not memory_tokens:
|
||||
return False
|
||||
response_tokens = _tokenize(normalized_response)
|
||||
overlap = memory_tokens & response_tokens
|
||||
return len(overlap) / len(memory_tokens) >= _MATCH_THRESHOLD
|
||||
|
||||
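To make the 70 % token-overlap rule concrete, here is a condensed standalone sketch of the same stem/tokenize/overlap steps applied to a paraphrase. The function names and the sample sentences are illustrative; lowercasing is done inline here instead of through the module's `_normalize` helper.

```python
STOP_WORDS = frozenset({
    "the", "a", "an", "and", "or", "of", "to", "is", "was",
    "that", "this", "with", "for", "from", "into",
})


def stem(word: str) -> str:
    """Fold trailing 'ing', 'ed', and 's' so inflected forms collapse."""
    if word.endswith("ing") and len(word) >= 6:
        return word[:-3]
    if word.endswith("ed") and len(word) > 4:
        base = word[:-2]
        if len(base) >= 3 and base[-1] == base[-2]:
            base = base[:-1]  # doubled consonant, e.g. "preferr" -> "prefer"
        return base
    if word.endswith("s") and len(word) > 3:
        return word[:-1]
    return word


def tokenize(text: str) -> set[str]:
    """Lowercase, strip punctuation, drop short and stop words, then stem."""
    tokens = set()
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"'()[]{}-/")
        if len(word) >= 3 and word not in STOP_WORDS:
            tokens.add(stem(word))
    return tokens


def overlap_ratio(memory: str, response: str) -> float:
    """Fraction of the memory's tokens that also occur in the response."""
    memory_tokens = tokenize(memory)
    if not memory_tokens:
        return 0.0
    return len(memory_tokens & tokenize(response)) / len(memory_tokens)


memory = "prefers rebase because history stays linear"
paraphrase = "I prefer to rebase, because the history then stays linear."
unrelated = "We use merge commits here."
```

A substring check would reject the paraphrase because of "prefer" versus "prefers" and the inserted "the"; the token sets, by contrast, overlap completely, while the unrelated sentence shares no tokens with the memory.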
@@ -29,6 +29,7 @@ from datetime import datetime, timezone
|
||||
|
||||
from atocore.models.database import get_connection
|
||||
from atocore.observability.logger import get_logger
|
||||
from atocore.projects.registry import resolve_project_name
|
||||
|
||||
log = get_logger("memory")
|
||||
|
||||
@@ -84,6 +85,13 @@ def create_memory(
|
||||
raise ValueError(f"Invalid status '{status}'. Must be one of: {MEMORY_STATUSES}")
|
||||
_validate_confidence(confidence)
|
||||
|
||||
# Canonicalize the project through the registry so an alias and
|
||||
# the canonical id store under the same bucket. This keeps
|
||||
# reinforcement queries (which use the interaction's project) and
|
||||
# context retrieval (which uses the registry-canonicalized hint)
|
||||
# consistent with how memories are created.
|
||||
project = resolve_project_name(project)
|
||||
|
||||
memory_id = str(uuid.uuid4())
|
||||
now = datetime.now(timezone.utc).isoformat()
|
||||
|
||||
@@ -162,8 +170,13 @@ def get_memories(
|
||||
query += " AND memory_type = ?"
|
||||
params.append(memory_type)
|
||||
if project is not None:
|
||||
# Canonicalize on the read side so a caller passing an alias
|
||||
# finds rows that were stored under the canonical id (and
|
||||
# vice versa). resolve_project_name returns the input
|
||||
# unchanged for unregistered names so empty-string queries
|
||||
# for "no project scope" still work.
|
||||
query += " AND project = ?"
|
||||
params.append(project)
|
||||
params.append(resolve_project_name(project))
|
||||
if status is not None:
|
||||
query += " AND status = ?"
|
||||
params.append(status)
|
||||
|
||||
@@ -71,14 +71,18 @@ CREATE TABLE IF NOT EXISTS interactions (
|
||||
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
|
||||
);
|
||||
|
||||
-- Only indexes on columns that have existed since the first release
|
||||
-- are created here. Indexes that reference columns added by later
|
||||
-- migrations (memories.project, interactions.project,
|
||||
-- interactions.session_id) are created inside _apply_migrations AFTER
|
||||
-- the corresponding ALTER TABLE, NOT here. Creating them here would
|
||||
-- fail on upgrade from a pre-migration schema because CREATE TABLE
|
||||
-- IF NOT EXISTS is a no-op on an existing table, so the new columns
|
||||
-- wouldn't be added before the CREATE INDEX runs.
|
||||
CREATE INDEX IF NOT EXISTS idx_chunks_document ON source_chunks(document_id);
|
||||
CREATE INDEX IF NOT EXISTS idx_memories_type ON memories(memory_type);
|
||||
CREATE INDEX IF NOT EXISTS idx_memories_project ON memories(project);
|
||||
CREATE INDEX IF NOT EXISTS idx_memories_status ON memories(status);
|
||||
CREATE INDEX IF NOT EXISTS idx_interactions_project ON interactions(project_id);
|
||||
CREATE INDEX IF NOT EXISTS idx_interactions_project_name ON interactions(project);
|
||||
CREATE INDEX IF NOT EXISTS idx_interactions_session ON interactions(session_id);
|
||||
CREATE INDEX IF NOT EXISTS idx_interactions_created_at ON interactions(created_at);
|
||||
"""
|
||||
|
||||
|
||||
|
||||
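The ordering constraint in the schema comment can be reproduced with an in-memory SQLite database. This is a sketch under assumptions: the table is reduced to two columns, and `column_exists` / `apply_migrations` are illustrative stand-ins for whatever `_apply_migrations` actually does.

```python
import sqlite3


def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """Check PRAGMA table_info for a column name (row[1] is the name)."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def apply_migrations(conn: sqlite3.Connection) -> None:
    """Add the late column first, then create the index that depends on it."""
    if not column_exists(conn, "memories", "project"):
        conn.execute("ALTER TABLE memories ADD COLUMN project TEXT DEFAULT ''")
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_memories_project ON memories(project)"
    )


def demo() -> tuple[bool, bool, set[str]]:
    conn = sqlite3.connect(":memory:")
    # Simulate a database created by an older release (no `project` column).
    conn.execute("CREATE TABLE memories (id TEXT PRIMARY KEY, memory_type TEXT)")
    # The newer CREATE TABLE IF NOT EXISTS is a no-op on the existing table.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memories "
        "(id TEXT PRIMARY KEY, memory_type TEXT, project TEXT)"
    )
    before = column_exists(conn, "memories", "project")
    apply_migrations(conn)
    after = column_exists(conn, "memories", "project")
    index_names = {row[1] for row in conn.execute("PRAGMA index_list(memories)")}
    conn.close()
    return before, after, index_names
```

Running `CREATE INDEX ... ON memories(project)` before the `ALTER TABLE` in this scenario would raise `sqlite3.OperationalError`, which is the upgrade failure the comment warns about.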
@@ -103,12 +103,27 @@ def create_runtime_backup(
|
||||
encoding="utf-8",
|
||||
)
|
||||
|
||||
# Automatic post-backup validation. Failures log a warning but do
|
||||
# not raise — the backup files are still on disk and may be useful.
|
||||
validation = validate_backup(stamp)
|
||||
validated = validation.get("valid", False)
|
||||
validation_errors = validation.get("errors", [])
|
||||
if not validated:
|
||||
log.warning(
|
||||
"post_backup_validation_failed",
|
||||
backup_root=str(backup_root),
|
||||
errors=validation_errors,
|
||||
)
|
||||
metadata["validated"] = validated
|
||||
metadata["validation_errors"] = validation_errors
|
||||
|
||||
log.info(
|
||||
"runtime_backup_created",
|
||||
backup_root=str(backup_root),
|
||||
db_snapshot=str(db_snapshot_path),
|
||||
chroma_included=include_chroma,
|
||||
chroma_bytes=chroma_bytes_copied,
|
||||
validated=validated,
|
||||
)
|
||||
return metadata
|
||||
|
||||
@@ -216,6 +231,286 @@ def validate_backup(stamp: str) -> dict:
|
||||
return result
|
||||
|
||||
|
||||
def restore_runtime_backup(
|
||||
stamp: str,
|
||||
*,
|
||||
include_chroma: bool | None = None,
|
||||
pre_restore_snapshot: bool = True,
|
||||
confirm_service_stopped: bool = False,
|
||||
) -> dict:
|
||||
"""Restore a previously captured runtime backup.
|
||||
|
||||
CRITICAL: the AtoCore service MUST be stopped before calling this.
|
||||
Overwriting a live SQLite database corrupts state and can break
|
||||
the running container's open connections. The caller must pass
|
||||
``confirm_service_stopped=True`` as an explicit acknowledgment —
|
||||
otherwise this function refuses to run.
|
||||
|
||||
The restore procedure:
|
||||
|
||||
1. Validate the backup via ``validate_backup``; refuse on any error.
|
||||
2. (default) Create a pre-restore safety snapshot of the CURRENT
|
||||
state so the restore itself is reversible. The snapshot stamp
|
||||
is returned in the result for the operator to record.
|
||||
3. Remove stale SQLite WAL/SHM sidecar files next to the target db
|
||||
before copying — the snapshot is a self-contained main-file
|
||||
image from ``conn.backup()``, and leftover WAL/SHM from the old
|
||||
live db would desync against the restored main file.
|
||||
4. Copy the snapshot db over the target db path.
|
||||
5. Restore the project registry file if the snapshot captured one.
|
||||
6. Restore the Chroma directory if ``include_chroma`` resolves to
|
||||
true. When ``include_chroma is None`` the function defers to
|
||||
whether the snapshot captured Chroma (the common case).
|
||||
7. Run ``PRAGMA integrity_check`` on the restored db and report
|
||||
the result.
|
||||
|
||||
Returns a dict describing what was restored. Raises ``RuntimeError``
|
||||
if the stop is not confirmed, validation fails, or the db snapshot is missing.
|
||||
"""
|
||||
if not confirm_service_stopped:
|
||||
raise RuntimeError(
|
||||
"restore_runtime_backup refuses to run without "
|
||||
"confirm_service_stopped=True — stop the AtoCore container "
|
||||
"first (e.g. `docker compose down` from deploy/dalidou) "
|
||||
"before calling this function"
|
||||
)
|
||||
|
||||
validation = validate_backup(stamp)
|
||||
if not validation.get("valid"):
|
||||
raise RuntimeError(
|
||||
f"backup {stamp} failed validation: {validation.get('errors')}"
|
||||
)
|
||||
metadata = validation.get("metadata") or {}
|
||||
|
||||
pre_snapshot_stamp: str | None = None
|
||||
if pre_restore_snapshot:
|
||||
pre = create_runtime_backup(include_chroma=False)
|
||||
pre_snapshot_stamp = Path(pre["backup_root"]).name
|
||||
|
||||
target_db = _config.settings.db_path
|
||||
source_db = Path(metadata.get("db_snapshot_path", ""))
|
||||
if not source_db.exists():
|
||||
raise RuntimeError(
|
||||
f"db snapshot not found at {source_db} — backup "
|
||||
f"metadata may be stale"
|
||||
)
|
||||
|
||||
# Force sqlite to flush any lingering WAL into the main file and
|
||||
# release OS-level file handles on -wal/-shm before we swap the
|
||||
# main file. Passing through conn.backup() in the pre-restore
|
||||
# snapshot can leave sidecars momentarily locked on Windows;
|
||||
# an explicit checkpoint(TRUNCATE) is the reliable way to flush
|
||||
# and release. Best-effort: if the target db can't be opened
|
||||
# (missing, corrupt), fall through and trust the copy step.
|
||||
if target_db.exists():
|
||||
try:
|
||||
with sqlite3.connect(str(target_db)) as checkpoint_conn:
|
||||
checkpoint_conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
|
||||
except sqlite3.DatabaseError as exc:
|
||||
log.warning(
|
||||
"restore_pre_checkpoint_failed",
|
||||
target_db=str(target_db),
|
||||
error=str(exc),
|
||||
)
|
||||
|
||||
# Remove stale WAL/SHM sidecars from the old live db so SQLite
|
||||
# can't read inconsistent state on next open. Tolerant to
|
||||
# Windows file-lock races — the subsequent copy replaces the
|
||||
# main file anyway, and the integrity check afterward is the
|
||||
# actual correctness signal.
|
||||
wal_path = target_db.with_name(target_db.name + "-wal")
|
||||
shm_path = target_db.with_name(target_db.name + "-shm")
|
||||
for stale in (wal_path, shm_path):
|
||||
if stale.exists():
|
||||
try:
|
||||
stale.unlink()
|
||||
except OSError as exc:
|
||||
log.warning(
|
||||
"restore_sidecar_unlink_failed",
|
||||
path=str(stale),
|
||||
error=str(exc),
|
||||
)
|
||||
|
||||
target_db.parent.mkdir(parents=True, exist_ok=True)
|
||||
shutil.copy2(source_db, target_db)
|
||||
|
||||
registry_restored = False
|
||||
registry_snapshot_path = metadata.get("registry_snapshot_path", "")
|
||||
if registry_snapshot_path:
|
||||
src_reg = Path(registry_snapshot_path)
|
||||
if src_reg.exists():
|
||||
dst_reg = _config.settings.resolved_project_registry_path
|
||||
dst_reg.parent.mkdir(parents=True, exist_ok=True)
|
||||
shutil.copy2(src_reg, dst_reg)
|
||||
registry_restored = True
|
||||
|
||||
chroma_snapshot_path = metadata.get("chroma_snapshot_path", "")
|
||||
if include_chroma is None:
|
||||
include_chroma = bool(chroma_snapshot_path)
|
||||
chroma_restored = False
|
||||
if include_chroma and chroma_snapshot_path:
|
||||
src_chroma = Path(chroma_snapshot_path)
|
||||
if src_chroma.exists() and src_chroma.is_dir():
|
||||
dst_chroma = _config.settings.chroma_path
|
||||
# Do NOT rmtree the destination itself: in a Dockerized
|
||||
# deployment the chroma dir is a bind-mounted volume, and
|
||||
# unlinking a mount point raises
|
||||
# OSError [Errno 16] Device or resource busy.
|
||||
# Instead, clear the directory's CONTENTS and copytree into
|
||||
# it with dirs_exist_ok=True. This is equivalent to an
|
||||
# rmtree+copytree for restore purposes but stays inside the
|
||||
# mount boundary. Discovered during the first real restore
|
||||
# drill on Dalidou (2026-04-09).
|
||||
dst_chroma.mkdir(parents=True, exist_ok=True)
|
||||
for item in dst_chroma.iterdir():
|
||||
if item.is_dir() and not item.is_symlink():
|
||||
shutil.rmtree(item)
|
||||
else:
|
||||
item.unlink()
|
||||
shutil.copytree(src_chroma, dst_chroma, dirs_exist_ok=True)
|
||||
chroma_restored = True
|
||||
|
||||
restored_integrity_ok = False
|
||||
integrity_error: str | None = None
|
||||
try:
|
||||
with sqlite3.connect(str(target_db)) as conn:
|
||||
row = conn.execute("PRAGMA integrity_check").fetchone()
|
||||
restored_integrity_ok = bool(row and row[0] == "ok")
|
||||
if not restored_integrity_ok:
|
||||
integrity_error = row[0] if row else "no_row"
|
||||
except sqlite3.DatabaseError as exc:
|
||||
integrity_error = f"db_open_failed: {exc}"
|
||||
|
||||
result: dict = {
|
||||
"stamp": stamp,
|
||||
"pre_restore_snapshot": pre_snapshot_stamp,
|
||||
"target_db": str(target_db),
|
||||
"db_restored": True,
|
||||
"registry_restored": registry_restored,
|
||||
"chroma_restored": chroma_restored,
|
||||
"restored_integrity_ok": restored_integrity_ok,
|
||||
}
|
||||
if integrity_error:
|
||||
result["integrity_error"] = integrity_error
|
||||
|
||||
log.info(
|
||||
"runtime_backup_restored",
|
||||
stamp=stamp,
|
||||
pre_restore_snapshot=pre_snapshot_stamp,
|
||||
registry_restored=registry_restored,
|
||||
chroma_restored=chroma_restored,
|
||||
integrity_ok=restored_integrity_ok,
|
||||
)
|
||||
return result
|
||||
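The Chroma step in `restore_runtime_backup` clears the destination's contents instead of removing the directory, so a bind-mounted path is never unlinked. That technique can be isolated as follows; `replace_dir_contents` and `demo` are hypothetical names, and the demo uses a temporary directory in place of a real mount.

```python
import shutil
import tempfile
from pathlib import Path


def replace_dir_contents(src: Path, dst: Path) -> None:
    """Make dst mirror src without ever removing dst itself."""
    dst.mkdir(parents=True, exist_ok=True)
    for item in dst.iterdir():
        if item.is_dir() and not item.is_symlink():
            shutil.rmtree(item)
        else:
            item.unlink()  # regular files and symlinks
    # dirs_exist_ok (Python 3.8+) lets copytree write into the existing dst.
    shutil.copytree(src, dst, dirs_exist_ok=True)


def demo() -> list[str]:
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp) / "snapshot", Path(tmp) / "live"
        (src / "index").mkdir(parents=True)
        (src / "index" / "data.bin").write_bytes(b"new")
        dst.mkdir()
        (dst / "stale.bin").write_bytes(b"old")
        replace_dir_contents(src, dst)
        return sorted(p.relative_to(dst).as_posix() for p in dst.rglob("*"))
```

After the call, the stale file is gone and only the snapshot's tree remains, while the `dst` directory entry itself has never been deleted or recreated.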
|
||||
|
||||
def cleanup_old_backups(*, confirm: bool = False) -> dict:
|
||||
"""Apply retention policy and remove old snapshots.
|
||||
|
||||
Retention keeps:
|
||||
- Last 7 daily snapshots (most recent per calendar day)
|
||||
- Last 4 weekly snapshots (most recent on each Sunday)
|
||||
- Last 6 monthly snapshots (most recent on the 1st of each month)
|
||||
|
||||
All other snapshots are candidates for deletion. Runs as dry-run by
|
||||
default; pass ``confirm=True`` to actually delete.
|
||||
|
||||
Returns a dict with kept/deleted counts and any errors.
|
||||
"""
|
||||
snapshots_root = _config.settings.resolved_backup_dir / "snapshots"
|
||||
if not snapshots_root.exists() or not snapshots_root.is_dir():
|
||||
return {"kept": 0, "deleted": 0, "would_delete": 0, "dry_run": not confirm, "errors": []}
|
||||
|
||||
# Parse all stamp directories into (datetime, dir_path) pairs.
|
||||
stamps: list[tuple[datetime, Path]] = []
|
||||
unparseable: list[str] = []
|
||||
for entry in sorted(snapshots_root.iterdir()):
|
||||
if not entry.is_dir():
|
||||
continue
|
||||
try:
|
||||
dt = datetime.strptime(entry.name, "%Y%m%dT%H%M%SZ").replace(tzinfo=UTC)
|
||||
stamps.append((dt, entry))
|
||||
except ValueError:
|
||||
unparseable.append(entry.name)
|
||||
|
||||
if not stamps:
|
||||
return {
|
||||
"kept": 0, "deleted": 0, "would_delete": 0,
|
||||
"dry_run": not confirm, "errors": [],
|
||||
"unparseable": unparseable,
|
||||
}
|
||||
|
||||
# Sort newest first so "most recent per bucket" is a simple first-seen.
|
||||
stamps.sort(key=lambda t: t[0], reverse=True)
|
||||
|
||||
keep_set: set[Path] = set()
|
||||
|
||||
# Last 7 daily: most recent snapshot per calendar day.
|
||||
seen_days: set[str] = set()
|
||||
for dt, path in stamps:
|
||||
day_key = dt.strftime("%Y-%m-%d")
|
||||
if day_key not in seen_days:
|
||||
seen_days.add(day_key)
|
||||
keep_set.add(path)
|
||||
if len(seen_days) >= 7:
|
||||
break
|
||||
|
||||
# Last 4 weekly: most recent snapshot that falls on a Sunday.
|
||||
seen_weeks: set[str] = set()
|
||||
for dt, path in stamps:
|
||||
if dt.weekday() == 6: # Sunday
|
||||
week_key = dt.strftime("%Y-W%W")
|
||||
if week_key not in seen_weeks:
|
||||
seen_weeks.add(week_key)
|
||||
keep_set.add(path)
|
||||
if len(seen_weeks) >= 4:
|
||||
break
|
||||
|
||||
# Last 6 monthly: most recent snapshot on the 1st of a month.
|
||||
seen_months: set[str] = set()
|
||||
for dt, path in stamps:
|
||||
if dt.day == 1:
|
||||
month_key = dt.strftime("%Y-%m")
|
||||
if month_key not in seen_months:
|
||||
seen_months.add(month_key)
|
||||
keep_set.add(path)
|
||||
if len(seen_months) >= 6:
|
||||
break
|
||||
|
||||
to_delete = [path for _, path in stamps if path not in keep_set]
|
||||
|
||||
errors: list[str] = []
|
||||
deleted_count = 0
|
||||
if confirm:
|
||||
for path in to_delete:
|
||||
try:
|
||||
shutil.rmtree(path)
|
||||
deleted_count += 1
|
||||
except OSError as exc:
|
||||
errors.append(f"{path.name}: {exc}")
|
||||
|
||||
result: dict = {
|
||||
"kept": len(keep_set),
|
||||
"dry_run": not confirm,
|
||||
"errors": errors,
|
||||
}
|
||||
if confirm:
|
||||
result["deleted"] = deleted_count
|
||||
else:
|
||||
result["would_delete"] = len(to_delete)
|
||||
if unparseable:
|
||||
result["unparseable"] = unparseable
|
||||
|
||||
log.info(
|
||||
"cleanup_old_backups",
|
||||
kept=len(keep_set),
|
||||
deleted=deleted_count if confirm else 0,
|
||||
would_delete=len(to_delete) if not confirm else 0,
|
||||
dry_run=not confirm,
|
||||
)
|
||||
return result
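The bucket selection in `cleanup_old_backups` can be sketched as a pure function over timestamps, which makes the policy easier to reason about without touching the filesystem. This is an illustrative sketch under stated assumptions: `select_kept` and its parameters are hypothetical names, and it works on bare `datetime` values instead of snapshot directories.

```python
from __future__ import annotations

from datetime import datetime


def select_kept(
    stamps: list[datetime], daily: int = 7, weekly: int = 4, monthly: int = 6
) -> set[datetime]:
    """Pick the snapshots a daily/weekly/monthly policy would keep.

    Illustrative only; every name here is an assumption.
    """
    ordered = sorted(stamps, reverse=True)  # newest first
    keep: set[datetime] = set()

    def first_per_bucket(candidates: list[datetime], key_fmt: str, limit: int) -> None:
        # Newest-first order means the first stamp seen in a bucket
        # is the most recent one for that bucket.
        seen: set[str] = set()
        for dt in candidates:
            key = dt.strftime(key_fmt)
            if key not in seen:
                seen.add(key)
                keep.add(dt)
                if len(seen) >= limit:
                    break

    first_per_bucket(ordered, "%Y-%m-%d", daily)
    first_per_bucket([d for d in ordered if d.weekday() == 6], "%Y-W%W", weekly)
    first_per_bucket([d for d in ordered if d.day == 1], "%Y-%m", monthly)
    return keep
```

Everything not in the returned set corresponds to the `to_delete` list above. A snapshot can satisfy more than one rule, so the kept count is at most, not exactly, the sum of the three limits.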
|
||||
|
||||
|
||||
def _backup_sqlite_db(source_path: Path, dest_path: Path) -> None:
|
||||
source_conn = sqlite3.connect(str(source_path))
|
||||
dest_conn = sqlite3.connect(str(dest_path))
|
||||
@@ -242,7 +537,98 @@ def _copy_directory_tree(source: Path, dest: Path) -> tuple[int, int]:
|
||||
|
||||
|
||||
def main() -> None:
|
||||
result = create_runtime_backup()
|
||||
"""CLI entry point for the backup module.
|
||||
|
||||
Supports five subcommands:

- ``create``   run ``create_runtime_backup`` (default if none given)
- ``list``     list all runtime backup snapshots
- ``validate`` validate a specific snapshot by stamp
- ``cleanup``  apply the retention policy (dry-run unless ``--confirm``)
- ``restore``  restore a specific snapshot by stamp
|
||||
|
||||
The restore subcommand is the one used by the backup/restore drill
|
||||
and MUST be run only when the AtoCore service is stopped. It takes
|
||||
``--confirm-service-stopped`` as an explicit acknowledgment.
|
||||
"""
|
||||
import argparse
|
||||
|
||||
parser = argparse.ArgumentParser(
|
||||
prog="python -m atocore.ops.backup",
|
||||
description="AtoCore runtime backup create/list/validate/cleanup/restore",
|
||||
)
|
||||
sub = parser.add_subparsers(dest="command")
|
||||
|
||||
p_create = sub.add_parser("create", help="create a new runtime backup")
|
||||
p_create.add_argument(
|
||||
"--chroma",
|
||||
action="store_true",
|
||||
help="also snapshot the Chroma vector store (cold copy)",
|
||||
)
|
||||
|
||||
sub.add_parser("list", help="list runtime backup snapshots")
|
||||
|
||||
p_validate = sub.add_parser("validate", help="validate a snapshot by stamp")
|
||||
p_validate.add_argument("stamp", help="snapshot stamp (e.g. 20260409T010203Z)")
|
||||
|
||||
p_cleanup = sub.add_parser("cleanup", help="remove old snapshots per retention policy")
|
||||
p_cleanup.add_argument(
|
||||
"--confirm",
|
||||
action="store_true",
|
||||
help="actually delete (default is dry-run)",
|
||||
)
|
||||
|
||||
p_restore = sub.add_parser(
|
||||
"restore",
|
||||
help="restore a snapshot by stamp (service must be stopped)",
|
||||
)
|
||||
p_restore.add_argument("stamp", help="snapshot stamp to restore")
|
||||
p_restore.add_argument(
|
||||
"--confirm-service-stopped",
|
||||
action="store_true",
|
||||
help="explicit acknowledgment that the AtoCore container is stopped",
|
||||
)
|
||||
p_restore.add_argument(
|
||||
"--no-pre-snapshot",
|
||||
action="store_true",
|
||||
help="skip the pre-restore safety snapshot of current state",
|
||||
)
|
||||
chroma_group = p_restore.add_mutually_exclusive_group()
|
||||
chroma_group.add_argument(
|
||||
"--chroma",
|
||||
dest="include_chroma",
|
||||
action="store_true",
|
||||
default=None,
|
||||
help="force-restore the Chroma snapshot",
|
||||
)
|
||||
chroma_group.add_argument(
|
||||
"--no-chroma",
|
||||
dest="include_chroma",
|
||||
action="store_false",
|
||||
help="skip the Chroma snapshot even if it was captured",
|
||||
)
|
||||
|
||||
args = parser.parse_args()
|
||||
command = args.command or "create"
|
||||
|
||||
if command == "create":
|
||||
include_chroma = getattr(args, "chroma", False)
|
||||
result = create_runtime_backup(include_chroma=include_chroma)
|
||||
elif command == "list":
|
||||
result = {"backups": list_runtime_backups()}
|
||||
elif command == "validate":
|
||||
result = validate_backup(args.stamp)
|
||||
elif command == "cleanup":
|
||||
result = cleanup_old_backups(confirm=getattr(args, "confirm", False))
|
||||
elif command == "restore":
|
||||
result = restore_runtime_backup(
|
||||
args.stamp,
|
||||
include_chroma=args.include_chroma,
|
||||
pre_restore_snapshot=not args.no_pre_snapshot,
|
||||
confirm_service_stopped=args.confirm_service_stopped,
|
||||
)
|
||||
else: # pragma: no cover — argparse guards this
|
||||
parser.error(f"unknown command: {command}")
|
||||
|
||||
print(json.dumps(result, indent=2, ensure_ascii=True))
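The `--chroma` / `--no-chroma` pair on the restore subcommand is a tri-state flag: forced on, forced off, or left unset so the callee can decide. A minimal sketch of that argparse pattern follows; `build_parser` and the `restore-sketch` program name are hypothetical, and setting `default=None` on both flags is a defensive choice made for this example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a tri-state include_chroma option (sketch)."""
    parser = argparse.ArgumentParser(prog="restore-sketch")
    group = parser.add_mutually_exclusive_group()
    # Both flags write to the same destination. None means "not given",
    # which leaves the decision to whatever consumes the value.
    group.add_argument(
        "--chroma", dest="include_chroma", action="store_true", default=None
    )
    group.add_argument(
        "--no-chroma", dest="include_chroma", action="store_false", default=None
    )
    return parser


if __name__ == "__main__":
    for argv in ([], ["--chroma"], ["--no-chroma"]):
        print(argv, build_parser().parse_args(argv).include_chroma)
```

Passing both flags at once makes argparse exit with a usage error, because the two arguments belong to the same mutually exclusive group.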
|
||||
|
||||
|
||||
|
||||
@@ -254,6 +254,30 @@ def get_registered_project(project_name: str) -> RegisteredProject | None:
|
||||
return None
|
||||
|
||||
|
||||
def resolve_project_name(name: str | None) -> str:
    """Canonicalize a project name through the registry.

    Returns the canonical ``project_id`` if the input matches any
    registered project's id or alias. Returns an empty string when
    the input is empty or ``None``, and returns the input unchanged
    when it is not in the registry. The pass-through case keeps
    backwards compatibility with hand-curated state, memories, and
    interactions that predate the registry, and with projects that
    are intentionally not registered.

    This helper is the single canonicalization boundary for project
    names across the trust hierarchy. Every read/write that takes a
    project name should pass it through ``resolve_project_name``
    before storing or querying. The contract is documented in
    ``docs/architecture/representation-authority.md``.
    """
    if not name:
        return name or ""
    project = get_registered_project(name)
    if project is not None:
        return project.project_id
    return name
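The canonicalization contract of `resolve_project_name` can be sketched against a tiny in-memory registry. This is an assumption-laden illustration: `Project`, `REGISTRY`, and `resolve` are hypothetical names, the real registry is loaded from a JSON file, and exact string matching is assumed because the actual matching rules of `get_registered_project` are not shown here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    project_id: str
    aliases: tuple[str, ...] = ()


# Hypothetical registry contents, for illustration only.
REGISTRY = [Project("p05-interferometer", ("p05", "interferometer"))]


def resolve(name: str | None, registry: list[Project] = REGISTRY) -> str:
    """Map an id or alias to its canonical id; pass unknown names through."""
    if not name:
        return ""
    for project in registry:
        # Assumes exact, case-sensitive matching.
        if name == project.project_id or name in project.aliases:
            return project.project_id
    return name


if __name__ == "__main__":
    for value in ("p05", "p05-interferometer", "legacy-project", None):
        print(repr(value), "->", repr(resolve(value)))
```

Passing unknown names through unchanged, rather than raising, is what lets unregistered or pre-registry project names keep working.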
|
||||
|
||||
|
||||
def refresh_registered_project(project_name: str, purge_deleted: bool = False) -> dict:
|
||||
"""Ingest all configured source roots for a registered project.
|
||||
|
||||
|
||||
@@ -1,5 +1,6 @@
|
||||
"""pytest configuration and shared fixtures."""
|
||||
|
||||
import json
|
||||
import os
|
||||
import sys
|
||||
import tempfile
|
||||
@@ -29,6 +30,45 @@ def tmp_data_dir(tmp_path):
|
||||
return tmp_path
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def project_registry(tmp_path, monkeypatch):
|
||||
"""Stand up an isolated project registry pointing at a temp file.
|
||||
|
||||
Returns a callable that takes one or more (project_id, [aliases])
|
||||
tuples and writes them into the registry, then forces the in-process
|
||||
settings singleton to re-resolve. Use this when a test needs the
|
||||
canonicalization helpers (resolve_project_name, get_registered_project)
|
||||
to recognize aliases.
|
||||
"""
|
||||
registry_path = tmp_path / "test-project-registry.json"
|
||||
|
||||
def _set(*projects):
|
||||
payload = {"projects": []}
|
||||
for entry in projects:
|
||||
if isinstance(entry, str):
|
||||
project_id, aliases = entry, []
|
||||
else:
|
||||
project_id, aliases = entry
|
||||
payload["projects"].append(
|
||||
{
|
||||
"id": project_id,
|
||||
"aliases": list(aliases),
|
||||
"description": f"test project {project_id}",
|
||||
"ingest_roots": [
|
||||
{"source": "vault", "subpath": f"incoming/projects/{project_id}"}
|
||||
],
|
||||
}
|
||||
)
|
||||
registry_path.write_text(json.dumps(payload), encoding="utf-8")
|
||||
monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
|
||||
from atocore import config
|
||||
|
||||
config.settings = config.Settings()
|
||||
return registry_path
|
||||
|
||||
return _set
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def sample_markdown(tmp_path) -> Path:
|
||||
"""Create a sample markdown file for testing."""
|
||||
|
||||
@@ -50,6 +50,65 @@ def test_health_endpoint_exposes_machine_paths_and_source_readiness(tmp_data_dir
|
||||
assert "run_dir" in body["machine_paths"]
|
||||
|
||||
|
||||
def test_health_endpoint_reports_code_version_from_module(tmp_data_dir):
|
||||
"""The /health response must include code_version reflecting
|
||||
atocore.__version__, so deployment drift detection works."""
|
||||
from atocore import __version__
|
||||
|
||||
client = TestClient(app)
|
||||
response = client.get("/health")
|
||||
|
||||
assert response.status_code == 200
|
||||
body = response.json()
|
||||
assert body["version"] == __version__
|
||||
assert body["code_version"] == __version__
|
||||
|
||||
|
||||
def test_health_endpoint_reports_build_metadata_from_env(tmp_data_dir, monkeypatch):
|
||||
"""The /health response must include build_sha, build_time, and
|
||||
build_branch from the ATOCORE_BUILD_* env vars, so deploy.sh can
|
||||
detect precise drift via SHA comparison instead of relying on
|
||||
the coarse code_version field.
|
||||
|
||||
Regression test for the codex finding from 2026-04-08:
|
||||
code_version 0.2.0 is too coarse to trust as a 'live is current'
|
||||
signal because it only changes on manual bumps. The build_sha
|
||||
field changes per commit and is set by deploy.sh.
|
||||
"""
|
||||
monkeypatch.setenv("ATOCORE_BUILD_SHA", "abc1234567890fedcba0987654321")
|
||||
monkeypatch.setenv("ATOCORE_BUILD_TIME", "2026-04-09T01:23:45Z")
|
||||
monkeypatch.setenv("ATOCORE_BUILD_BRANCH", "main")
|
||||
|
||||
client = TestClient(app)
|
||||
response = client.get("/health")
|
||||
|
||||
assert response.status_code == 200
|
||||
body = response.json()
|
||||
assert body["build_sha"] == "abc1234567890fedcba0987654321"
|
||||
assert body["build_time"] == "2026-04-09T01:23:45Z"
|
||||
assert body["build_branch"] == "main"
|
||||
|
||||
|
||||
def test_health_endpoint_reports_unknown_when_build_env_unset(tmp_data_dir, monkeypatch):
|
||||
"""When deploy.sh hasn't set the build env vars (e.g. someone
|
||||
ran `docker compose up` directly), /health reports 'unknown'
|
||||
for all three build fields. This is a clear signal to the
|
||||
operator that the deploy provenance is missing and they should
|
||||
re-run via deploy.sh."""
|
||||
monkeypatch.delenv("ATOCORE_BUILD_SHA", raising=False)
|
||||
monkeypatch.delenv("ATOCORE_BUILD_TIME", raising=False)
|
||||
monkeypatch.delenv("ATOCORE_BUILD_BRANCH", raising=False)
|
||||
|
||||
client = TestClient(app)
|
||||
response = client.get("/health")
|
||||
|
||||
assert response.status_code == 200
|
||||
body = response.json()
|
||||
assert body["build_sha"] == "unknown"
|
||||
assert body["build_time"] == "unknown"
|
||||
assert body["build_branch"] == "unknown"
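The behaviour these three health tests pin down (build fields read from `ATOCORE_BUILD_*` with an `"unknown"` fallback) can be sketched as a small function. This is not the service's actual handler: `build_metadata` is a hypothetical name, and taking the environment as a mapping argument is a choice made here so the example can be exercised without mutating `os.environ`.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def build_metadata(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect deploy provenance fields, defaulting each to "unknown"."""
    return {
        "build_sha": env.get("ATOCORE_BUILD_SHA", "unknown"),
        "build_time": env.get("ATOCORE_BUILD_TIME", "unknown"),
        "build_branch": env.get("ATOCORE_BUILD_BRANCH", "unknown"),
    }


if __name__ == "__main__":
    print(build_metadata({}))
    print(build_metadata({"ATOCORE_BUILD_SHA": "abc1234", "ATOCORE_BUILD_BRANCH": "main"}))
```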
|
||||
|
||||
|
||||
def test_projects_endpoint_reports_registered_projects(tmp_data_dir, monkeypatch):
|
||||
vault_dir = tmp_data_dir / "vault-source"
|
||||
drive_dir = tmp_data_dir / "drive-source"
|
||||
|
||||
313
tests/test_atocore_client.py
Normal file
@@ -0,0 +1,313 @@
|
||||
"""Tests for scripts/atocore_client.py — the shared operator CLI.
|
||||
|
||||
Specifically covers the Phase 9 reflection-loop subcommands added
|
||||
after codex's sequence-step-3 review: ``capture``, ``extract``,
|
||||
``reinforce-interaction``, ``list-interactions``, ``get-interaction``,
|
||||
``queue``, ``promote``, ``reject``.
|
||||
|
||||
The tests mock the client's ``request()`` helper and verify each
|
||||
subcommand:
|
||||
|
||||
- calls the correct HTTP method and path
|
||||
- builds the correct JSON body (or the correct query string)
|
||||
- passes the right subset of CLI arguments through
|
||||
|
||||
This is the same "wiring test" shape used by tests/test_api_storage.py:
|
||||
we don't exercise the live HTTP stack; we verify the client builds
|
||||
the request correctly. The server side is already covered by its
|
||||
own route tests.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
import pytest
|
||||
|
||||
# Make scripts/ importable
|
||||
_REPO_ROOT = Path(__file__).resolve().parent.parent
|
||||
sys.path.insert(0, str(_REPO_ROOT / "scripts"))
|
||||
|
||||
import atocore_client as client # noqa: E402
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Request capture helper
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class _RequestCapture:
|
||||
"""Drop-in replacement for client.request() that records calls."""
|
||||
|
||||
def __init__(self, response: dict | None = None):
|
||||
self.calls: list[dict] = []
|
||||
self._response = response if response is not None else {"ok": True}
|
||||
|
||||
def __call__(self, method, path, data=None, timeout=None):
|
||||
self.calls.append(
|
||||
{"method": method, "path": path, "data": data, "timeout": timeout}
|
||||
)
|
||||
return self._response
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def capture_requests(monkeypatch):
|
||||
"""Replace client.request with a recording stub and return it."""
|
||||
stub = _RequestCapture()
|
||||
monkeypatch.setattr(client, "request", stub)
|
||||
return stub
|
||||
|
||||
|
||||
def _run_client(monkeypatch, argv: list[str]) -> int:
|
||||
"""Simulate a CLI invocation with the given argv."""
|
||||
monkeypatch.setattr(sys, "argv", ["atocore_client.py", *argv])
|
||||
return client.main()
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# capture
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_capture_posts_to_interactions_endpoint(capture_requests, monkeypatch):
|
||||
_run_client(
|
||||
monkeypatch,
|
||||
[
|
||||
"capture",
|
||||
"what is p05's current focus",
|
||||
"The current focus is wave 2 operational ingestion.",
|
||||
"p05-interferometer",
|
||||
"claude-code-test",
|
||||
"session-abc",
|
||||
],
|
||||
)
|
||||
assert len(capture_requests.calls) == 1
|
||||
call = capture_requests.calls[0]
|
||||
assert call["method"] == "POST"
|
||||
assert call["path"] == "/interactions"
|
||||
body = call["data"]
|
||||
assert body["prompt"] == "what is p05's current focus"
|
||||
assert body["response"].startswith("The current focus")
|
||||
assert body["project"] == "p05-interferometer"
|
||||
assert body["client"] == "claude-code-test"
|
||||
assert body["session_id"] == "session-abc"
|
||||
assert body["reinforce"] is True # default
|
||||
|
||||
|
||||
def test_capture_sets_default_client_when_omitted(capture_requests, monkeypatch):
|
||||
_run_client(
|
||||
monkeypatch,
|
||||
["capture", "hi", "hello"],
|
||||
)
|
||||
call = capture_requests.calls[0]
|
||||
assert call["data"]["client"] == "atocore-client"
|
||||
assert call["data"]["project"] == ""
|
||||
assert call["data"]["reinforce"] is True
|
||||
|
||||
|
||||
def test_capture_accepts_reinforce_false(capture_requests, monkeypatch):
|
||||
_run_client(
|
||||
monkeypatch,
|
||||
["capture", "prompt", "response", "p05", "claude", "sess", "false"],
|
||||
)
|
||||
call = capture_requests.calls[0]
|
||||
assert call["data"]["reinforce"] is False
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# extract
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_extract_default_is_preview(capture_requests, monkeypatch):
|
||||
_run_client(monkeypatch, ["extract", "abc-123"])
|
||||
call = capture_requests.calls[0]
|
||||
assert call["method"] == "POST"
|
||||
assert call["path"] == "/interactions/abc-123/extract"
|
||||
assert call["data"] == {"persist": False}
|
||||
|
||||
|
||||
def test_extract_persist_true(capture_requests, monkeypatch):
|
||||
_run_client(monkeypatch, ["extract", "abc-123", "true"])
|
||||
call = capture_requests.calls[0]
|
||||
assert call["data"] == {"persist": True}
|
||||
|
||||
|
||||
def test_extract_url_encodes_interaction_id(capture_requests, monkeypatch):
|
||||
_run_client(monkeypatch, ["extract", "abc/def"])
|
||||
call = capture_requests.calls[0]
|
||||
assert call["path"] == "/interactions/abc%2Fdef/extract"
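The path encoding asserted by `test_extract_url_encodes_interaction_id` matches what `urllib.parse.quote` produces when no characters are treated as safe. A minimal sketch, assuming a hypothetical `interaction_path` helper (the client's real helper is not shown here):

```python
from urllib.parse import quote


def interaction_path(interaction_id: str, action: str = "") -> str:
    """Build an /interactions path with the id encoded as one segment."""
    # safe="" also escapes "/", so an id containing slashes cannot
    # be read as extra path segments.
    segment = quote(interaction_id, safe="")
    base = f"/interactions/{segment}"
    return f"{base}/{action}" if action else base


if __name__ == "__main__":
    print(interaction_path("abc/def", "extract"))
```

With the default `safe="/"`, `quote` would leave the slash untouched and the server would see a different route.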
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# reinforce-interaction
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_reinforce_interaction_posts_to_correct_path(capture_requests, monkeypatch):
|
||||
_run_client(monkeypatch, ["reinforce-interaction", "int-xyz"])
|
||||
call = capture_requests.calls[0]
|
||||
assert call["method"] == "POST"
|
||||
assert call["path"] == "/interactions/int-xyz/reinforce"
|
||||
assert call["data"] == {}
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# list-interactions
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_list_interactions_no_filters(capture_requests, monkeypatch):
|
||||
_run_client(monkeypatch, ["list-interactions"])
|
||||
call = capture_requests.calls[0]
|
||||
assert call["method"] == "GET"
|
||||
assert call["path"] == "/interactions?limit=50"
|
||||
|
||||
|
||||
def test_list_interactions_with_project_filter(capture_requests, monkeypatch):
|
||||
_run_client(monkeypatch, ["list-interactions", "p05-interferometer"])
|
||||
call = capture_requests.calls[0]
|
||||
assert "project=p05-interferometer" in call["path"]
|
||||
assert "limit=50" in call["path"]
|
||||
|
||||
|
||||
def test_list_interactions_full_filter_set(capture_requests, monkeypatch):
|
||||
_run_client(
|
||||
monkeypatch,
|
||||
[
|
||||
"list-interactions",
|
||||
"p05",
|
||||
"sess-1",
|
||||
"claude-code",
|
||||
"2026-04-07T00:00:00Z",
|
||||
"20",
|
||||
],
|
||||
)
|
||||
call = capture_requests.calls[0]
|
||||
path = call["path"]
|
||||
assert "project=p05" in path
|
||||
assert "session_id=sess-1" in path
|
||||
assert "client=claude-code" in path
|
||||
# ``since`` is URL-encoded (the ":" characters are escaped), so
# assert only on the unescaped date prefix.
|
||||
assert "since=2026-04-07" in path
|
||||
assert "limit=20" in path
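The query-string shape checked by the `list-interactions` tests can be reproduced with `urllib.parse.urlencode`. The sketch below is an assumption about how such a path might be built, not the client's actual code: `list_interactions_path` is a hypothetical helper, and dropping empty filters is inferred from the `"/interactions?limit=50"` expectation above.

```python
from urllib.parse import urlencode


def list_interactions_path(
    project: str = "",
    session_id: str = "",
    client: str = "",
    since: str = "",
    limit: int = 50,
) -> str:
    """Build the GET path, omitting filters that were not supplied."""
    filters = {
        "project": project,
        "session_id": session_id,
        "client": client,
        "since": since,
    }
    query = {key: value for key, value in filters.items() if value}
    query["limit"] = limit
    # urlencode percent-escapes reserved characters such as ":".
    return "/interactions?" + urlencode(query)


if __name__ == "__main__":
    print(list_interactions_path())
    print(list_interactions_path("p05", since="2026-04-07T00:00:00Z", limit=20))
```

Because the colons in the timestamp are escaped as `%3A`, a test can only match the date prefix literally, which is why the assertion above checks `since=2026-04-07` rather than the full value.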
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# get-interaction
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_get_interaction_fetches_by_id(capture_requests, monkeypatch):
|
||||
_run_client(monkeypatch, ["get-interaction", "int-42"])
|
||||
call = capture_requests.calls[0]
|
||||
assert call["method"] == "GET"
|
||||
assert call["path"] == "/interactions/int-42"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# queue
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_queue_always_filters_by_candidate_status(capture_requests, monkeypatch):
|
||||
_run_client(monkeypatch, ["queue"])
|
||||
call = capture_requests.calls[0]
|
||||
assert call["method"] == "GET"
|
||||
assert call["path"].startswith("/memory?")
|
||||
assert "status=candidate" in call["path"]
|
||||
assert "limit=50" in call["path"]
|
||||
|
||||
|
||||
def test_queue_with_memory_type_and_project(capture_requests, monkeypatch):
|
||||
_run_client(monkeypatch, ["queue", "adaptation", "p05-interferometer", "10"])
|
||||
call = capture_requests.calls[0]
|
||||
path = call["path"]
|
||||
assert "status=candidate" in path
|
||||
assert "memory_type=adaptation" in path
|
||||
assert "project=p05-interferometer" in path
|
||||
assert "limit=10" in path
|
||||
|
||||
|
||||
def test_queue_limit_coercion(capture_requests, monkeypatch):
|
||||
"""limit is typed as int by argparse so string '25' becomes 25."""
|
||||
_run_client(monkeypatch, ["queue", "", "", "25"])
|
||||
call = capture_requests.calls[0]
|
||||
assert "limit=25" in call["path"]
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# promote / reject
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_promote_posts_to_memory_promote_path(capture_requests, monkeypatch):
|
||||
_run_client(monkeypatch, ["promote", "mem-abc"])
|
||||
call = capture_requests.calls[0]
|
||||
assert call["method"] == "POST"
|
||||
assert call["path"] == "/memory/mem-abc/promote"
|
||||
assert call["data"] == {}
|
||||
|
||||
|
||||
def test_reject_posts_to_memory_reject_path(capture_requests, monkeypatch):
|
||||
_run_client(monkeypatch, ["reject", "mem-xyz"])
|
||||
call = capture_requests.calls[0]
|
||||
assert call["method"] == "POST"
|
||||
assert call["path"] == "/memory/mem-xyz/reject"
|
||||
assert call["data"] == {}
|
||||
|
||||
|
||||
def test_promote_url_encodes_memory_id(capture_requests, monkeypatch):
|
||||
_run_client(monkeypatch, ["promote", "mem/with/slashes"])
|
||||
call = capture_requests.calls[0]
|
||||
assert "mem%2Fwith%2Fslashes" in call["path"]
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# end-to-end: ensure the Phase 9 loop can be driven entirely through
|
||||
# the client
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_phase9_full_loop_via_client_shape(capture_requests, monkeypatch):
|
||||
"""Simulate the full capture -> extract -> queue -> promote cycle.
|
||||
|
||||
This doesn't exercise real HTTP — each call is intercepted by
|
||||
the mock request. But it proves every step of the Phase 9 loop
|
||||
is reachable through the shared client, which is the whole point
|
||||
of the codex-step-3 work.
|
||||
"""
|
||||
# Step 1: capture
|
||||
_run_client(
|
||||
monkeypatch,
|
||||
[
|
||||
"capture",
|
||||
"what about GF-PTFE for lateral support",
|
||||
"## Decision: use GF-PTFE pads for thermal stability",
|
||||
"p05-interferometer",
|
||||
],
|
||||
)
|
||||
# Step 2: extract candidates (preview)
|
||||
_run_client(monkeypatch, ["extract", "fake-interaction-id"])
|
||||
# Step 3: extract and persist
|
||||
_run_client(monkeypatch, ["extract", "fake-interaction-id", "true"])
|
||||
# Step 4: list the review queue
|
||||
_run_client(monkeypatch, ["queue"])
|
||||
# Step 5: promote a candidate
|
||||
_run_client(monkeypatch, ["promote", "fake-memory-id"])
|
||||
# Step 6: reject another
|
||||
_run_client(monkeypatch, ["reject", "fake-memory-id-2"])
|
||||
|
||||
methods_and_paths = [
|
||||
(c["method"], c["path"]) for c in capture_requests.calls
|
||||
]
|
||||
assert methods_and_paths == [
|
||||
("POST", "/interactions"),
|
||||
("POST", "/interactions/fake-interaction-id/extract"),
|
||||
("POST", "/interactions/fake-interaction-id/extract"),
|
||||
("GET", "/memory?status=candidate&limit=50"),
|
||||
("POST", "/memory/fake-memory-id/promote"),
|
||||
("POST", "/memory/fake-memory-id-2/reject"),
|
||||
]
|
||||
@@ -1,14 +1,18 @@
|
||||
"""Tests for runtime backup creation."""
|
||||
"""Tests for runtime backup creation, restore, and retention cleanup."""
|
||||
|
||||
import json
|
||||
import sqlite3
|
||||
from datetime import UTC, datetime
|
||||
from datetime import UTC, datetime, timedelta
|
||||
|
||||
import pytest
|
||||
|
||||
import atocore.config as config
|
||||
from atocore.models.database import init_db
|
||||
from atocore.ops.backup import (
|
||||
cleanup_old_backups,
|
||||
create_runtime_backup,
|
||||
list_runtime_backups,
|
||||
restore_runtime_backup,
|
||||
validate_backup,
|
||||
)
|
||||
|
||||
@@ -156,3 +160,531 @@ def test_create_runtime_backup_handles_missing_registry(tmp_path, monkeypatch):
|
||||
config.settings = original_settings
|
||||
|
||||
assert result["registry_snapshot_path"] == ""
|
||||
|
||||
|
||||
def test_restore_refuses_without_confirm_service_stopped(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
|
||||
monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
|
||||
monkeypatch.setenv(
|
||||
"ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
|
||||
)
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
init_db()
|
||||
create_runtime_backup(datetime(2026, 4, 9, 10, 0, 0, tzinfo=UTC))
|
||||
|
||||
with pytest.raises(RuntimeError, match="confirm_service_stopped"):
|
||||
restore_runtime_backup("20260409T100000Z")
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
|
||||
def test_restore_raises_on_invalid_backup(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
|
||||
monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
|
||||
monkeypatch.setenv(
|
||||
"ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
|
||||
)
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
init_db()
|
||||
with pytest.raises(RuntimeError, match="failed validation"):
|
||||
restore_runtime_backup(
|
||||
"20250101T000000Z", confirm_service_stopped=True
|
||||
)
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
|
||||
def test_restore_round_trip_reverses_post_backup_mutations(tmp_path, monkeypatch):
|
||||
"""Canonical drill: snapshot -> mutate -> restore -> mutation gone."""
|
||||
monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
|
||||
monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
|
||||
monkeypatch.setenv(
|
||||
"ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
|
||||
)
|
||||
|
||||
registry_path = tmp_path / "config" / "project-registry.json"
|
||||
registry_path.parent.mkdir(parents=True)
|
||||
registry_path.write_text(
|
||||
'{"projects":[{"id":"p01-example","aliases":[],'
|
||||
'"ingest_roots":[{"source":"vault","subpath":"incoming/projects/p01-example"}]}]}\n',
|
||||
encoding="utf-8",
|
||||
)
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
init_db()
|
||||
|
||||
# 1. Seed baseline state that should SURVIVE the restore.
|
||||
with sqlite3.connect(str(config.settings.db_path)) as conn:
|
||||
conn.execute(
|
||||
"INSERT INTO projects (id, name) VALUES (?, ?)",
|
||||
("p01", "Baseline Project"),
|
||||
)
|
||||
conn.commit()
|
||||
|
||||
# 2. Create the backup we're going to restore to.
|
||||
create_runtime_backup(datetime(2026, 4, 9, 11, 0, 0, tzinfo=UTC))
|
||||
stamp = "20260409T110000Z"
|
||||
|
||||
# 3. Mutate live state AFTER the backup — this is what the
|
||||
# restore should reverse.
|
||||
with sqlite3.connect(str(config.settings.db_path)) as conn:
|
||||
conn.execute(
|
||||
"INSERT INTO projects (id, name) VALUES (?, ?)",
|
||||
("p99", "Post Backup Mutation"),
|
||||
)
|
||||
conn.commit()
|
||||
|
||||
# Confirm the mutation is present before restore.
|
||||
with sqlite3.connect(str(config.settings.db_path)) as conn:
|
||||
row = conn.execute(
|
||||
"SELECT name FROM projects WHERE id = ?", ("p99",)
|
||||
).fetchone()
|
||||
assert row is not None and row[0] == "Post Backup Mutation"
|
||||
|
||||
# 4. Restore — the drill procedure. Explicit confirm_service_stopped.
|
||||
result = restore_runtime_backup(
|
||||
stamp, confirm_service_stopped=True
|
||||
)
|
||||
|
||||
# 5. Verify restore report
|
||||
assert result["stamp"] == stamp
|
||||
assert result["db_restored"] is True
|
||||
assert result["registry_restored"] is True
|
||||
assert result["restored_integrity_ok"] is True
|
||||
assert result["pre_restore_snapshot"] is not None
|
||||
|
||||
# 6. Verify live state reflects the restore: baseline survived,
|
||||
# post-backup mutation is gone.
|
||||
with sqlite3.connect(str(config.settings.db_path)) as conn:
|
||||
baseline = conn.execute(
|
||||
"SELECT name FROM projects WHERE id = ?", ("p01",)
|
||||
).fetchone()
|
||||
mutation = conn.execute(
|
||||
"SELECT name FROM projects WHERE id = ?", ("p99",)
|
||||
).fetchone()
|
||||
assert baseline is not None and baseline[0] == "Baseline Project"
|
||||
assert mutation is None
|
||||
|
||||
# 7. Pre-restore safety snapshot DOES contain the mutation —
|
||||
# it captured current state before overwriting. This is the
|
||||
# reversibility guarantee: the operator can restore back to
|
||||
# it if the restore itself was a mistake.
|
||||
pre_stamp = result["pre_restore_snapshot"]
|
||||
pre_validation = validate_backup(pre_stamp)
|
||||
assert pre_validation["valid"] is True
|
||||
pre_db_path = pre_validation["metadata"]["db_snapshot_path"]
|
||||
with sqlite3.connect(pre_db_path) as conn:
|
||||
pre_mutation = conn.execute(
|
||||
"SELECT name FROM projects WHERE id = ?", ("p99",)
|
||||
).fetchone()
|
||||
assert pre_mutation is not None and pre_mutation[0] == "Post Backup Mutation"
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
|
||||
def test_restore_round_trip_with_chroma(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
|
||||
monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
|
||||
monkeypatch.setenv(
|
||||
"ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
|
||||
)
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
init_db()
|
||||
|
||||
# Seed baseline chroma state that should survive restore.
|
||||
chroma_dir = config.settings.chroma_path
|
||||
(chroma_dir / "coll-a").mkdir(parents=True, exist_ok=True)
|
||||
(chroma_dir / "coll-a" / "baseline.bin").write_bytes(b"baseline")
|
||||
|
||||
create_runtime_backup(
|
||||
datetime(2026, 4, 9, 12, 0, 0, tzinfo=UTC), include_chroma=True
|
||||
)
|
||||
stamp = "20260409T120000Z"
|
||||
|
||||
# Mutate chroma after backup: add a file + remove baseline.
|
||||
(chroma_dir / "coll-a" / "post_backup.bin").write_bytes(b"post")
|
||||
(chroma_dir / "coll-a" / "baseline.bin").unlink()
|
||||
|
||||
result = restore_runtime_backup(
|
||||
stamp, confirm_service_stopped=True
|
||||
)
|
||||
|
||||
assert result["chroma_restored"] is True
|
||||
assert (chroma_dir / "coll-a" / "baseline.bin").exists()
|
||||
assert not (chroma_dir / "coll-a" / "post_backup.bin").exists()
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
|
||||
def test_restore_chroma_does_not_unlink_destination_directory(tmp_path, monkeypatch):
|
||||
"""Regression: restore must not rmtree the chroma dir itself.
|
||||
|
||||
In a Dockerized deployment the chroma dir is a bind-mounted
|
||||
volume. Calling shutil.rmtree on a mount point raises
|
||||
``OSError [Errno 16] Device or resource busy``, which broke the
|
||||
first real Dalidou drill on 2026-04-09. The fix clears the
|
||||
directory's CONTENTS and copytree(dirs_exist_ok=True) into it,
|
||||
keeping the directory inode (and any bind mount) intact.
|
||||
|
||||
This test captures the inode of the destination directory before
|
||||
and after restore and asserts they match — that's what a
|
||||
bind-mounted chroma dir would also see.
|
||||
"""
|
||||
monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
|
||||
monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
|
||||
monkeypatch.setenv(
|
||||
"ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
|
||||
)
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
init_db()
|
||||
|
||||
chroma_dir = config.settings.chroma_path
|
||||
(chroma_dir / "coll-a").mkdir(parents=True, exist_ok=True)
|
||||
(chroma_dir / "coll-a" / "baseline.bin").write_bytes(b"baseline")
|
||||
|
||||
create_runtime_backup(
|
||||
datetime(2026, 4, 9, 15, 0, 0, tzinfo=UTC), include_chroma=True
|
||||
)
|
||||
|
||||
# Capture the destination directory's stat signature before restore.
|
||||
chroma_stat_before = chroma_dir.stat()
|
||||
|
||||
# Add a file post-backup so restore has work to do.
|
||||
(chroma_dir / "coll-a" / "post_backup.bin").write_bytes(b"post")
|
||||
|
||||
restore_runtime_backup(
|
||||
"20260409T150000Z", confirm_service_stopped=True
|
||||
)
|
||||
|
||||
# Directory still exists (would have failed on mount point) and
|
||||
# its st_ino matches — the mount itself wasn't unlinked.
|
||||
assert chroma_dir.exists()
|
||||
chroma_stat_after = chroma_dir.stat()
|
||||
assert chroma_stat_before.st_ino == chroma_stat_after.st_ino, (
|
||||
"chroma directory inode changed — restore recreated the "
|
||||
"directory instead of clearing its contents; this would "
|
||||
"fail on a Docker bind-mounted volume"
|
||||
)
|
||||
# And the contents did actually get restored.
|
||||
assert (chroma_dir / "coll-a" / "baseline.bin").exists()
|
||||
assert not (chroma_dir / "coll-a" / "post_backup.bin").exists()
|
||||
finally:
|
||||
config.settings = original_settings
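
The docstring above describes the fix as clearing the directory's contents and then copying into it with `copytree(dirs_exist_ok=True)`. A minimal sketch of that approach follows; the helper name `replace_dir_contents` is hypothetical and is not taken from `atocore.ops.backup`, whose actual implementation is not shown in this diff.

```python
import shutil
from pathlib import Path


def replace_dir_contents(src: Path, dst: Path) -> None:
    """Make dst mirror src without removing dst itself.

    Hypothetical helper for illustration: dst is never unlinked, so its
    inode (and any bind mount on it) stays intact.
    """
    dst.mkdir(parents=True, exist_ok=True)
    # Remove every child of dst, but never dst itself.
    for child in dst.iterdir():
        if child.is_dir() and not child.is_symlink():
            shutil.rmtree(child)
        else:
            child.unlink()
    # Copy the snapshot into the now-empty, still-existing directory.
    shutil.copytree(src, dst, dirs_exist_ok=True)
```

Because only the children are deleted, `dst.stat().st_ino` is the same before and after the call, which is the property the regression test asserts.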
|
||||
|
||||
|
||||
def test_restore_skips_pre_snapshot_when_requested(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
|
||||
monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
|
||||
monkeypatch.setenv(
|
||||
"ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
|
||||
)
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
init_db()
|
||||
create_runtime_backup(datetime(2026, 4, 9, 13, 0, 0, tzinfo=UTC))
|
||||
|
||||
before_count = len(list_runtime_backups())
|
||||
|
||||
result = restore_runtime_backup(
|
||||
"20260409T130000Z",
|
||||
confirm_service_stopped=True,
|
||||
pre_restore_snapshot=False,
|
||||
)
|
||||
|
||||
after_count = len(list_runtime_backups())
|
||||
assert result["pre_restore_snapshot"] is None
|
||||
assert after_count == before_count
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
|
||||
def test_create_backup_includes_validation_fields(tmp_path, monkeypatch):
|
||||
"""Task B: create_runtime_backup auto-validates and reports result."""
|
||||
monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
|
||||
monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
|
||||
monkeypatch.setenv(
|
||||
"ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
|
||||
)
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
init_db()
|
||||
result = create_runtime_backup(datetime(2026, 4, 11, 10, 0, 0, tzinfo=UTC))
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
assert "validated" in result
|
||||
assert "validation_errors" in result
|
||||
assert result["validated"] is True
|
||||
assert result["validation_errors"] == []
|
||||
|
||||
|
||||
def test_create_backup_validation_failure_does_not_raise(tmp_path, monkeypatch):
|
||||
"""Task B: if post-backup validation fails, backup still returns metadata."""
|
||||
monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
|
||||
monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
|
||||
monkeypatch.setenv(
|
||||
"ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
|
||||
)
|
||||
|
||||
def _broken_validate(stamp):
|
||||
return {"valid": False, "errors": ["db_missing", "metadata_missing"]}
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
init_db()
|
||||
monkeypatch.setattr("atocore.ops.backup.validate_backup", _broken_validate)
|
||||
result = create_runtime_backup(datetime(2026, 4, 11, 11, 0, 0, tzinfo=UTC))
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
# Should NOT have raised — backup still returned metadata
|
||||
assert result["validated"] is False
|
||||
assert result["validation_errors"] == ["db_missing", "metadata_missing"]
|
||||
# Core backup fields still present
|
||||
assert "db_snapshot_path" in result
|
||||
assert "created_at" in result
|
||||
|
||||
|
||||
def test_restore_cleans_stale_wal_sidecars(tmp_path, monkeypatch):
|
||||
"""Stale WAL/SHM sidecars must not carry bytes past the restore.
|
||||
|
||||
Note: after restore runs, PRAGMA integrity_check reopens the
|
||||
restored db which may legitimately recreate a fresh -wal. So we
|
||||
assert that the STALE byte marker no longer appears in either
|
||||
sidecar, not that the files are absent.
|
||||
"""
|
||||
monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
|
||||
monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
|
||||
monkeypatch.setenv(
|
||||
"ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
|
||||
)
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
init_db()
|
||||
create_runtime_backup(datetime(2026, 4, 9, 14, 0, 0, tzinfo=UTC))
|
||||
|
||||
# Write fake stale WAL/SHM next to the live db with an
|
||||
# unmistakable marker.
|
||||
target_db = config.settings.db_path
|
||||
wal = target_db.with_name(target_db.name + "-wal")
|
||||
shm = target_db.with_name(target_db.name + "-shm")
|
||||
stale_marker = b"STALE-SIDECAR-MARKER-DO-NOT-SURVIVE"
|
||||
wal.write_bytes(stale_marker)
|
||||
shm.write_bytes(stale_marker)
|
||||
assert wal.exists() and shm.exists()
|
||||
|
||||
restore_runtime_backup(
|
||||
"20260409T140000Z", confirm_service_stopped=True
|
||||
)
|
||||
|
||||
# The restored db must pass integrity check (tested elsewhere);
|
||||
# here we just confirm that no file next to it still contains
|
||||
# the stale marker from the old live process.
|
||||
for sidecar in (wal, shm):
|
||||
if sidecar.exists():
|
||||
assert stale_marker not in sidecar.read_bytes(), (
|
||||
f"{sidecar.name} still carries stale marker"
|
||||
)
|
||||
finally:
|
||||
config.settings = original_settings
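
One way to satisfy the test above is to delete any `-wal` and `-shm` files next to the target database before the snapshot is copied over it. The sketch below assumes that approach; `remove_stale_sidecars` is a hypothetical name, and the real restore code may handle sidecars differently.

```python
from pathlib import Path

# SQLite names its sidecars by appending these suffixes to the db file name.
SIDECAR_SUFFIXES = ("-wal", "-shm")


def remove_stale_sidecars(db_path: Path) -> list[str]:
    """Delete WAL/SHM sidecars next to db_path; return the names removed.

    Hypothetical helper for illustration only.
    """
    removed = []
    for suffix in SIDECAR_SUFFIXES:
        sidecar = db_path.with_name(db_path.name + suffix)
        if sidecar.exists():
            sidecar.unlink()
            removed.append(sidecar.name)
    return removed
```

SQLite may recreate a fresh `-wal` the next time the database is opened, which is why the test checks for the stale marker bytes rather than for the absence of the files.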
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Task C: Backup retention cleanup
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def _setup_cleanup_env(tmp_path, monkeypatch):
|
||||
"""Helper: configure env, init db, return snapshots_root."""
|
||||
monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
|
||||
monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
|
||||
monkeypatch.setenv(
|
||||
"ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
|
||||
)
|
||||
original = config.settings
|
||||
config.settings = config.Settings()
|
||||
init_db()
|
||||
snapshots_root = config.settings.resolved_backup_dir / "snapshots"
|
||||
snapshots_root.mkdir(parents=True, exist_ok=True)
|
||||
return original, snapshots_root
|
||||
|
||||
|
||||
def _seed_snapshots(snapshots_root, dates):
|
||||
"""Create minimal valid snapshot dirs for the given datetimes."""
|
||||
for dt in dates:
|
||||
stamp = dt.strftime("%Y%m%dT%H%M%SZ")
|
||||
snap_dir = snapshots_root / stamp
|
||||
db_dir = snap_dir / "db"
|
||||
db_dir.mkdir(parents=True, exist_ok=True)
|
||||
db_path = db_dir / "atocore.db"
|
||||
conn = sqlite3.connect(str(db_path))
|
||||
conn.execute("CREATE TABLE IF NOT EXISTS _marker (id INTEGER)")
|
||||
conn.close()
|
||||
metadata = {
|
||||
"created_at": dt.isoformat(),
|
||||
"backup_root": str(snap_dir),
|
||||
"db_snapshot_path": str(db_path),
|
||||
"db_size_bytes": db_path.stat().st_size,
|
||||
"registry_snapshot_path": "",
|
||||
"chroma_snapshot_path": "",
|
||||
"chroma_snapshot_bytes": 0,
|
||||
"chroma_snapshot_files": 0,
|
||||
"chroma_snapshot_included": False,
|
||||
"vector_store_note": "",
|
||||
}
|
||||
(snap_dir / "backup-metadata.json").write_text(
|
||||
json.dumps(metadata, indent=2) + "\n", encoding="utf-8"
|
||||
)
|
||||
|
||||
|
||||
def test_cleanup_empty_dir(tmp_path, monkeypatch):
|
||||
original, _ = _setup_cleanup_env(tmp_path, monkeypatch)
|
||||
try:
|
||||
result = cleanup_old_backups()
|
||||
assert result["kept"] == 0
|
||||
assert result["would_delete"] == 0
|
||||
assert result["dry_run"] is True
|
||||
finally:
|
||||
config.settings = original
|
||||
|
||||
|
||||
def test_cleanup_dry_run_identifies_old_snapshots(tmp_path, monkeypatch):
|
||||
original, snapshots_root = _setup_cleanup_env(tmp_path, monkeypatch)
|
||||
try:
|
||||
# 10 daily snapshots Apr 2-11 (avoiding Apr 1 which is monthly).
|
||||
base = datetime(2026, 4, 2, 12, 0, 0, tzinfo=UTC)
|
||||
dates = [base + timedelta(days=i) for i in range(10)]
|
||||
_seed_snapshots(snapshots_root, dates)
|
||||
|
||||
result = cleanup_old_backups()
|
||||
assert result["dry_run"] is True
|
||||
# The newest 7 snapshots (Apr 5-11) are kept as daily.
# Apr 5 is a Sunday, so it also qualifies as weekly, but it is
# already counted in the daily set.
# The remaining snapshots (Apr 2, 3, 4) are neither a Sunday nor
# the 1st of a month, so all 3 are marked for deletion.
|
||||
assert result["kept"] == 7
|
||||
assert result["would_delete"] == 3
|
||||
assert len(list(snapshots_root.iterdir())) == 10
|
||||
finally:
|
||||
config.settings = original
|
||||
|
||||
|
||||
def test_cleanup_confirm_deletes(tmp_path, monkeypatch):
|
||||
original, snapshots_root = _setup_cleanup_env(tmp_path, monkeypatch)
|
||||
try:
|
||||
base = datetime(2026, 4, 2, 12, 0, 0, tzinfo=UTC)
|
||||
dates = [base + timedelta(days=i) for i in range(10)]
|
||||
_seed_snapshots(snapshots_root, dates)
|
||||
|
||||
result = cleanup_old_backups(confirm=True)
|
||||
assert result["dry_run"] is False
|
||||
assert result["deleted"] == 3
|
||||
assert result["kept"] == 7
|
||||
assert len(list(snapshots_root.iterdir())) == 7
|
||||
finally:
|
||||
config.settings = original
|
||||
|
||||
|
||||
def test_cleanup_keeps_last_7_daily(tmp_path, monkeypatch):
|
||||
"""Exactly 7 snapshots on different days → all kept."""
|
||||
original, snapshots_root = _setup_cleanup_env(tmp_path, monkeypatch)
|
||||
try:
|
||||
base = datetime(2026, 4, 5, 12, 0, 0, tzinfo=UTC)
|
||||
dates = [base + timedelta(days=i) for i in range(7)]
|
||||
_seed_snapshots(snapshots_root, dates)
|
||||
|
||||
result = cleanup_old_backups()
|
||||
assert result["kept"] == 7
|
||||
assert result["would_delete"] == 0
|
||||
finally:
|
||||
config.settings = original
|
||||
|
||||
|
||||
def test_cleanup_keeps_sunday_weekly(tmp_path, monkeypatch):
|
||||
"""Snapshots on Sundays outside the 7-day window are kept as weekly."""
|
||||
original, snapshots_root = _setup_cleanup_env(tmp_path, monkeypatch)
|
||||
try:
|
||||
# 7 daily snapshots covering Apr 5-11
|
||||
base = datetime(2026, 4, 5, 12, 0, 0, tzinfo=UTC)
|
||||
daily = [base + timedelta(days=i) for i in range(7)]
|
||||
|
||||
# 2 older Sunday snapshots
|
||||
sun1 = datetime(2026, 3, 29, 12, 0, 0, tzinfo=UTC) # Sunday
|
||||
sun2 = datetime(2026, 3, 22, 12, 0, 0, tzinfo=UTC) # Sunday
|
||||
# A non-Sunday old snapshot that should be deleted
|
||||
wed = datetime(2026, 3, 25, 12, 0, 0, tzinfo=UTC) # Wednesday
|
||||
|
||||
_seed_snapshots(snapshots_root, daily + [sun1, sun2, wed])
|
||||
|
||||
result = cleanup_old_backups()
|
||||
# 7 daily + 2 Sunday weekly = 9 kept, 1 Wednesday deleted
|
||||
assert result["kept"] == 9
|
||||
assert result["would_delete"] == 1
|
||||
finally:
|
||||
config.settings = original
|
||||
|
||||
|
||||
def test_cleanup_keeps_monthly_first(tmp_path, monkeypatch):
|
||||
"""Snapshots on the 1st of a month outside daily+weekly are kept as monthly."""
|
||||
original, snapshots_root = _setup_cleanup_env(tmp_path, monkeypatch)
|
||||
try:
|
||||
# 7 daily in April 2026
|
||||
base = datetime(2026, 4, 5, 12, 0, 0, tzinfo=UTC)
|
||||
daily = [base + timedelta(days=i) for i in range(7)]
|
||||
|
||||
# Old monthly 1st snapshots
|
||||
m1 = datetime(2026, 1, 1, 12, 0, 0, tzinfo=UTC)
|
||||
m2 = datetime(2025, 12, 1, 12, 0, 0, tzinfo=UTC)
|
||||
# Old non-1st, non-Sunday snapshot — should be deleted
|
||||
old = datetime(2026, 1, 15, 12, 0, 0, tzinfo=UTC)
|
||||
|
||||
_seed_snapshots(snapshots_root, daily + [m1, m2, old])
|
||||
|
||||
result = cleanup_old_backups()
|
||||
# 7 daily + 2 monthly = 9 kept, 1 deleted
|
||||
assert result["kept"] == 9
|
||||
assert result["would_delete"] == 1
|
||||
finally:
|
||||
config.settings = original
|
||||
|
||||
|
||||
def test_cleanup_unparseable_stamp_skipped(tmp_path, monkeypatch):
|
||||
"""Directories with unparseable names are ignored, not deleted."""
|
||||
original, snapshots_root = _setup_cleanup_env(tmp_path, monkeypatch)
|
||||
try:
|
||||
base = datetime(2026, 4, 5, 12, 0, 0, tzinfo=UTC)
|
||||
_seed_snapshots(snapshots_root, [base])
|
||||
|
||||
bad_dir = snapshots_root / "not-a-timestamp"
|
||||
bad_dir.mkdir()
|
||||
|
||||
result = cleanup_old_backups(confirm=True)
|
||||
assert result.get("unparseable") == ["not-a-timestamp"]
|
||||
assert bad_dir.exists()
|
||||
assert result["kept"] == 1
|
||||
finally:
|
||||
config.settings = original
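
The retention tests above are consistent with a simple rule: keep the newest 7 snapshots, keep any snapshot taken on a Sunday, keep any snapshot taken on the 1st of a month, and leave unparseable directory names alone. The sketch below implements only that rule as inferred from the tests. `plan_retention` is a hypothetical name, and the real `cleanup_old_backups` may cap the number of weekly or monthly snapshots or group by calendar day.

```python
from datetime import datetime

STAMP_FORMAT = "%Y%m%dT%H%M%SZ"  # same stamp format the tests use


def plan_retention(stamps, keep_daily=7):
    """Split snapshot stamps into keep / delete / unparseable lists.

    Hypothetical planner inferred from the tests, not the real function.
    """
    parsed, unparseable = [], []
    for stamp in stamps:
        try:
            parsed.append((datetime.strptime(stamp, STAMP_FORMAT), stamp))
        except ValueError:
            # Unknown names are reported but never scheduled for deletion.
            unparseable.append(stamp)

    parsed.sort(reverse=True)  # newest first
    keep, delete = [], []
    for index, (dt, stamp) in enumerate(parsed):
        is_recent = index < keep_daily
        is_sunday = dt.weekday() == 6  # Monday is 0, Sunday is 6
        is_first_of_month = dt.day == 1
        if is_recent or is_sunday or is_first_of_month:
            keep.append(stamp)
        else:
            delete.append(stamp)
    return {"keep": keep, "delete": delete, "unparseable": unparseable}
```

Separating the plan from the deletion step also makes a dry-run mode straightforward: the caller reports the plan by default and only removes directories when explicitly confirmed.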
|
||||
|
||||
249
tests/test_capture_stop.py
Normal file
@@ -0,0 +1,249 @@
|
||||
"""Tests for deploy/hooks/capture_stop.py — Claude Code Stop hook."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import os
|
||||
import sys
|
||||
import tempfile
|
||||
import textwrap
|
||||
from io import StringIO
|
||||
from pathlib import Path
|
||||
from unittest import mock
|
||||
|
||||
import pytest
|
||||
|
||||
# The hook script lives outside of the normal package tree, so import
|
||||
# it by manipulating sys.path.
|
||||
_HOOK_DIR = str(Path(__file__).resolve().parent.parent / "deploy" / "hooks")
|
||||
if _HOOK_DIR not in sys.path:
|
||||
sys.path.insert(0, _HOOK_DIR)
|
||||
|
||||
import capture_stop # noqa: E402
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Helpers
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def _write_transcript(tmp: Path, entries: list[dict]) -> str:
|
||||
"""Write a JSONL transcript and return the path."""
|
||||
path = tmp / "transcript.jsonl"
|
||||
with open(path, "w", encoding="utf-8") as f:
|
||||
for entry in entries:
|
||||
f.write(json.dumps(entry, ensure_ascii=False) + "\n")
|
||||
return str(path)
|
||||
|
||||
|
||||
def _user_entry(content: str, *, is_meta: bool = False) -> dict:
|
||||
return {
|
||||
"type": "user",
|
||||
"isMeta": is_meta,
|
||||
"message": {"role": "user", "content": content},
|
||||
}
|
||||
|
||||
|
||||
def _assistant_entry() -> dict:
|
||||
return {
|
||||
"type": "assistant",
|
||||
"message": {
|
||||
"role": "assistant",
|
||||
"content": [{"type": "text", "text": "Sure, here's the answer."}],
|
||||
},
|
||||
}
|
||||
|
||||
|
||||
def _system_entry() -> dict:
|
||||
return {"type": "system", "message": {"role": "system", "content": "system init"}}
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# _extract_last_user_prompt
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class TestExtractLastUserPrompt:
|
||||
def test_returns_last_real_prompt(self, tmp_path):
|
||||
path = _write_transcript(tmp_path, [
|
||||
_user_entry("First prompt that is long enough to capture"),
|
||||
_assistant_entry(),
|
||||
_user_entry("Second prompt that should be the one we capture"),
|
||||
_assistant_entry(),
|
||||
])
|
||||
result = capture_stop._extract_last_user_prompt(path)
|
||||
assert result == "Second prompt that should be the one we capture"
|
||||
|
||||
def test_skips_meta_messages(self, tmp_path):
|
||||
path = _write_transcript(tmp_path, [
|
||||
_user_entry("Real prompt that is definitely long enough"),
|
||||
_user_entry("<local-command>some system stuff</local-command>"),
|
||||
_user_entry("Meta message that looks real enough", is_meta=True),
|
||||
])
|
||||
result = capture_stop._extract_last_user_prompt(path)
|
||||
assert result == "Real prompt that is definitely long enough"
|
||||
|
||||
def test_skips_xml_content(self, tmp_path):
|
||||
path = _write_transcript(tmp_path, [
|
||||
_user_entry("Actual prompt from a real human user"),
|
||||
_user_entry("<command-name>/help</command-name>"),
|
||||
])
|
||||
result = capture_stop._extract_last_user_prompt(path)
|
||||
assert result == "Actual prompt from a real human user"
|
||||
|
||||
def test_skips_short_messages(self, tmp_path):
|
||||
path = _write_transcript(tmp_path, [
|
||||
_user_entry("This prompt is long enough to be captured"),
|
||||
_user_entry("yes"), # too short
|
||||
])
|
||||
result = capture_stop._extract_last_user_prompt(path)
|
||||
assert result == "This prompt is long enough to be captured"
|
||||
|
||||
def test_handles_content_blocks(self, tmp_path):
|
||||
entry = {
|
||||
"type": "user",
|
||||
"message": {
|
||||
"role": "user",
|
||||
"content": [
|
||||
{"type": "text", "text": "First paragraph of the prompt."},
|
||||
{"type": "text", "text": "Second paragraph continues here."},
|
||||
],
|
||||
},
|
||||
}
|
||||
path = _write_transcript(tmp_path, [entry])
|
||||
result = capture_stop._extract_last_user_prompt(path)
|
||||
assert "First paragraph" in result
|
||||
assert "Second paragraph" in result
|
||||
|
||||
def test_empty_transcript(self, tmp_path):
|
||||
path = _write_transcript(tmp_path, [])
|
||||
result = capture_stop._extract_last_user_prompt(path)
|
||||
assert result == ""
|
||||
|
||||
def test_missing_file(self):
|
||||
result = capture_stop._extract_last_user_prompt("/nonexistent/path.jsonl")
|
||||
assert result == ""
|
||||
|
||||
def test_empty_path(self):
|
||||
result = capture_stop._extract_last_user_prompt("")
|
||||
assert result == ""
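
Taken together, the tests in this class pin down the extraction behaviour: walk the transcript from the end, skip meta entries, skip XML-like command content, skip very short messages, flatten content blocks, and return an empty string on any failure. A minimal sketch under those assumptions follows. The function name and the `MIN_PROMPT_LENGTH` value are hypothetical; the hook's real threshold is not shown in this diff.

```python
import json
from pathlib import Path

MIN_PROMPT_LENGTH = 15  # assumed threshold, not the hook's actual value


def _content_to_text(content) -> str:
    """Flatten a string or a list of text blocks into one string."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        parts = [
            block.get("text", "")
            for block in content
            if isinstance(block, dict) and block.get("type") == "text"
        ]
        return "\n".join(parts)
    return ""


def extract_last_user_prompt(transcript_path: str) -> str:
    """Return the last real user prompt in a JSONL transcript, or ""."""
    if not transcript_path:
        return ""
    try:
        lines = Path(transcript_path).read_text(encoding="utf-8").splitlines()
    except OSError:
        return ""
    for line in reversed(lines):
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(entry, dict):
            continue
        if entry.get("type") != "user" or entry.get("isMeta"):
            continue
        message = entry.get("message")
        if not isinstance(message, dict):
            continue
        text = _content_to_text(message.get("content", "")).strip()
        # Skip slash-command wrappers and one-word confirmations.
        if text.startswith("<") or len(text) < MIN_PROMPT_LENGTH:
            continue
        return text
    return ""
```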
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# _infer_project
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class TestInferProject:
|
||||
def test_empty_cwd(self):
|
||||
assert capture_stop._infer_project("") == ""
|
||||
|
||||
def test_unknown_path(self):
|
||||
assert capture_stop._infer_project("C:\\Users\\antoi\\random") == ""
|
||||
|
||||
def test_mapped_path(self):
|
||||
with mock.patch.dict(capture_stop._PROJECT_PATH_MAP, {
|
||||
"C:\\Users\\antoi\\gigabit": "p04-gigabit",
|
||||
}):
|
||||
result = capture_stop._infer_project("C:\\Users\\antoi\\gigabit\\src")
|
||||
assert result == "p04-gigabit"
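
The mapped-path test implies a prefix match: a working directory below a mapped root resolves to that root's project id. The sketch below shows one way to do this. `infer_project` here takes the map as an explicit argument, and the case-insensitive, separator-normalising comparison is an assumption; it is not necessarily how `capture_stop._infer_project` compares paths.

```python
def _normalize(path: str) -> str:
    """Lowercase and use forward slashes so Windows paths compare equal."""
    return path.replace("\\", "/").rstrip("/").lower()


def infer_project(cwd: str, path_map: dict[str, str]) -> str:
    """Return the project id of the longest mapped root containing cwd.

    Hypothetical helper for illustration; returns "" when nothing matches.
    """
    if not cwd:
        return ""
    current = _normalize(cwd)
    best_id, best_len = "", -1
    for prefix, project_id in path_map.items():
        root = _normalize(prefix)
        # Match the root itself or a real subdirectory, not a sibling
        # directory that merely shares the same leading characters.
        inside = current == root or current.startswith(root + "/")
        if inside and len(root) > best_len:
            best_id, best_len = project_id, len(root)
    return best_id
```

Preferring the longest matching root keeps nested project directories from being shadowed by a parent mapping.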
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# _capture (integration-style, mocking HTTP)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class TestCapture:
|
||||
def _hook_input(self, *, transcript_path: str = "", **overrides) -> str:
|
||||
data = {
|
||||
"session_id": "test-session-123",
|
||||
"transcript_path": transcript_path,
|
||||
"cwd": "C:\\Users\\antoi\\ATOCore",
|
||||
"permission_mode": "default",
|
||||
"hook_event_name": "Stop",
|
||||
"last_assistant_message": "Here is the answer to your question about the code.",
|
||||
"turn_number": 3,
|
||||
}
|
||||
data.update(overrides)
|
||||
return json.dumps(data)
|
||||
|
||||
@mock.patch("capture_stop.urllib.request.urlopen")
|
||||
def test_posts_to_atocore(self, mock_urlopen, tmp_path):
|
||||
transcript = _write_transcript(tmp_path, [
|
||||
_user_entry("Please explain how the backup system works in detail"),
|
||||
_assistant_entry(),
|
||||
])
|
||||
mock_resp = mock.MagicMock()
|
||||
mock_resp.read.return_value = json.dumps({"id": "int-001", "status": "recorded"}).encode()
|
||||
mock_urlopen.return_value = mock_resp
|
||||
|
||||
with mock.patch("sys.stdin", StringIO(self._hook_input(transcript_path=transcript))):
|
||||
capture_stop._capture()
|
||||
|
||||
mock_urlopen.assert_called_once()
|
||||
req = mock_urlopen.call_args[0][0]
|
||||
body = json.loads(req.data.decode())
|
||||
assert body["prompt"] == "Please explain how the backup system works in detail"
|
||||
assert body["client"] == "claude-code"
|
||||
assert body["session_id"] == "test-session-123"
|
||||
assert body["reinforce"] is False
|
||||
|
||||
@mock.patch("capture_stop.urllib.request.urlopen")
|
||||
def test_skips_when_disabled(self, mock_urlopen, tmp_path):
|
||||
transcript = _write_transcript(tmp_path, [
|
||||
_user_entry("A prompt that would normally be captured"),
|
||||
])
|
||||
with mock.patch.dict(os.environ, {"ATOCORE_CAPTURE_DISABLED": "1"}):
|
||||
with mock.patch("sys.stdin", StringIO(self._hook_input(transcript_path=transcript))):
|
||||
capture_stop._capture()
|
||||
mock_urlopen.assert_not_called()
|
||||
|
||||
@mock.patch("capture_stop.urllib.request.urlopen")
|
||||
def test_skips_short_prompt(self, mock_urlopen, tmp_path):
|
||||
transcript = _write_transcript(tmp_path, [
|
||||
_user_entry("yes"),
|
||||
])
|
||||
with mock.patch("sys.stdin", StringIO(self._hook_input(transcript_path=transcript))):
|
||||
capture_stop._capture()
|
||||
mock_urlopen.assert_not_called()
|
||||
|
||||
@mock.patch("capture_stop.urllib.request.urlopen")
|
||||
def test_truncates_long_response(self, mock_urlopen, tmp_path):
|
||||
transcript = _write_transcript(tmp_path, [
|
||||
_user_entry("Tell me everything about the entire codebase architecture"),
|
||||
])
|
||||
long_response = "x" * 60_000
|
||||
mock_resp = mock.MagicMock()
|
||||
mock_resp.read.return_value = json.dumps({"id": "int-002"}).encode()
|
||||
mock_urlopen.return_value = mock_resp
|
||||
|
||||
with mock.patch("sys.stdin", StringIO(
|
||||
self._hook_input(transcript_path=transcript, last_assistant_message=long_response)
|
||||
)):
|
||||
capture_stop._capture()
|
||||
|
||||
req = mock_urlopen.call_args[0][0]
|
||||
body = json.loads(req.data.decode())
|
||||
assert len(body["response"]) <= capture_stop.MAX_RESPONSE_LENGTH + 20
|
||||
assert body["response"].endswith("[truncated]")
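
The two assertions above allow a small overshoot beyond `MAX_RESPONSE_LENGTH` for the marker and require the `[truncated]` suffix. A sketch of a truncation helper that meets both conditions is shown below; `truncate_response` and `TRUNCATION_MARKER` are hypothetical names, and the exact marker the hook appends is assumed.

```python
TRUNCATION_MARKER = "\n\n[truncated]"  # assumed marker text


def truncate_response(text: str, max_length: int) -> str:
    """Cut text to max_length characters and append a visible marker.

    Hypothetical helper for illustration; short text is returned unchanged.
    """
    if len(text) <= max_length:
        return text
    return text[:max_length] + TRUNCATION_MARKER
```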
|
||||
|
||||
def test_main_never_raises(self):
|
||||
"""main() must always exit 0, even on garbage input."""
|
||||
with mock.patch("sys.stdin", StringIO("not json at all")):
|
||||
# Should not raise
|
||||
capture_stop.main()
|
||||
|
||||
@mock.patch("capture_stop.urllib.request.urlopen")
|
||||
def test_uses_atocore_url_env(self, mock_urlopen, tmp_path):
|
||||
transcript = _write_transcript(tmp_path, [
|
||||
_user_entry("Please help me with this particular problem in the code"),
|
||||
])
|
||||
mock_resp = mock.MagicMock()
|
||||
mock_resp.read.return_value = json.dumps({"id": "int-003"}).encode()
|
||||
mock_urlopen.return_value = mock_resp
|
||||
|
||||
with mock.patch.dict(os.environ, {"ATOCORE_URL": "http://localhost:9999"}):
|
||||
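# Patch the module-level constant as well; changing the env var
# alone would not affect a value already read at import time.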
|
||||
with mock.patch.object(capture_stop, "ATOCORE_URL", "http://localhost:9999"):
|
||||
with mock.patch("sys.stdin", StringIO(self._hook_input(transcript_path=transcript))):
|
||||
capture_stop._capture()
|
||||
|
||||
req = mock_urlopen.call_args[0][0]
|
||||
assert req.full_url == "http://localhost:9999/interactions"
|
||||
@@ -1,5 +1,8 @@
|
||||
"""Tests for the context builder."""
|
||||
|
||||
import json
|
||||
|
||||
import atocore.config as config
|
||||
from atocore.context.builder import build_context, get_last_context_pack
|
||||
from atocore.context.project_state import init_project_state_schema, set_state
|
||||
from atocore.ingestion.pipeline import ingest_file
|
||||
@@ -162,3 +165,89 @@ def test_no_project_state_without_hint(tmp_data_dir, sample_markdown):
|
||||
pack = build_context("What is AtoCore?")
|
||||
assert pack.project_state_chars == 0
|
||||
assert "--- Trusted Project State ---" not in pack.formatted_context
|
||||
|
||||
|
||||
def test_alias_hint_resolves_through_registry(tmp_data_dir, sample_markdown, monkeypatch):
|
||||
"""An alias hint like 'p05' should find project state stored under 'p05-interferometer'.
|
||||
|
||||
This is the regression test for the P1 finding from codex's review:
|
||||
/context/build was previously doing an exact-name lookup that
|
||||
silently dropped trusted project state when the caller passed an
|
||||
alias instead of the canonical project id.
|
||||
"""
|
||||
init_db()
|
||||
init_project_state_schema()
|
||||
ingest_file(sample_markdown)
|
||||
|
||||
# Stand up a minimal project registry that knows the aliases.
|
||||
# The registry lives in a JSON file pointed to by
|
||||
# ATOCORE_PROJECT_REGISTRY_PATH; the dataclass-driven loader picks
|
||||
# it up on every call (no in-process cache to invalidate).
|
||||
registry_path = tmp_data_dir / "project-registry.json"
|
||||
registry_path.write_text(
|
||||
json.dumps(
|
||||
{
|
||||
"projects": [
|
||||
{
|
||||
"id": "p05-interferometer",
|
||||
"aliases": ["p05", "interferometer"],
|
||||
"description": "P05 alias-resolution regression test",
|
||||
"ingest_roots": [
|
||||
{"source": "vault", "subpath": "incoming/projects/p05"}
|
||||
],
|
||||
}
|
||||
]
|
||||
}
|
||||
),
|
||||
encoding="utf-8",
|
||||
)
|
||||
monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
|
||||
config.settings = config.Settings()
|
||||
|
||||
# Trusted state is stored under the canonical id (the way the
|
||||
# /project/state endpoint always writes it).
|
||||
set_state(
|
||||
"p05-interferometer",
|
||||
"status",
|
||||
"next_focus",
|
||||
"Wave 2 trusted-operational ingestion",
|
||||
)
|
||||
|
||||
# The bug: pack with alias hint used to silently miss the state.
|
||||
pack_with_alias = build_context("status?", project_hint="p05", budget=2000)
|
||||
assert "Wave 2 trusted-operational ingestion" in pack_with_alias.formatted_context
|
||||
assert pack_with_alias.project_state_chars > 0
|
||||
|
||||
# The canonical id should still work the same way.
|
||||
pack_with_canonical = build_context(
|
||||
"status?", project_hint="p05-interferometer", budget=2000
|
||||
)
|
||||
assert "Wave 2 trusted-operational ingestion" in pack_with_canonical.formatted_context
|
||||
|
||||
# A second alias should also resolve.
|
||||
pack_with_other_alias = build_context(
|
||||
"status?", project_hint="interferometer", budget=2000
|
||||
)
|
||||
assert "Wave 2 trusted-operational ingestion" in pack_with_other_alias.formatted_context
|
||||
|
||||
|
||||
def test_unknown_hint_falls_back_to_raw_lookup(tmp_data_dir, sample_markdown, monkeypatch):
|
||||
"""A hint that isn't in the registry should still try the raw name.
|
||||
|
||||
This preserves backwards compatibility with hand-curated
|
||||
project_state entries that predate the project registry.
|
||||
"""
|
||||
init_db()
|
||||
init_project_state_schema()
|
||||
ingest_file(sample_markdown)
|
||||
|
||||
# Empty registry — the hint won't resolve through it.
|
||||
registry_path = tmp_data_dir / "project-registry.json"
|
||||
registry_path.write_text('{"projects": []}', encoding="utf-8")
|
||||
monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
|
||||
config.settings = config.Settings()
|
||||
|
||||
set_state("orphan-project", "status", "phase", "Solo run")
|
||||
|
||||
pack = build_context("status?", project_hint="orphan-project", budget=2000)
|
||||
assert "Solo run" in pack.formatted_context
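
The two tests above describe a lookup order: resolve the hint through the registry's ids and aliases first, then fall back to the raw hint when the registry does not know it. The sketch below illustrates that order on the same registry shape the test writes to disk. `resolve_project_hint` is a hypothetical name, and the case-insensitive comparison is an assumption rather than confirmed behaviour of `build_context`.

```python
def resolve_project_hint(hint: str, registry: dict) -> str:
    """Map an id or alias to the canonical project id.

    Hypothetical helper for illustration. Unknown hints are returned
    unchanged so that state stored under a raw name is still found.
    """
    wanted = hint.strip().lower()
    for project in registry.get("projects", []):
        canonical = project.get("id", "")
        names = {canonical.lower()}
        names.update(alias.lower() for alias in project.get("aliases", []))
        if canonical and wanted in names:
            return canonical
    return hint
```

Returning the raw hint instead of an empty string is what preserves compatibility with project state entries that predate the registry.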
|
||||
|
||||
@@ -47,3 +47,138 @@ def test_get_connection_uses_configured_timeout_value(tmp_path, monkeypatch):
|
||||
|
||||
assert calls
|
||||
assert calls[0] == 2.5
|
||||
|
||||
|
||||
def test_init_db_upgrades_pre_phase9_schema_without_failing(tmp_path, monkeypatch):
|
||||
"""Regression test for the schema init ordering bug caught during
|
||||
the first real Dalidou deploy (report from 2026-04-08).
|
||||
|
||||
Before the fix, SCHEMA_SQL contained CREATE INDEX statements that
|
||||
referenced columns (memories.project, interactions.project,
|
||||
interactions.session_id) added by _apply_migrations later in
|
||||
init_db. On a fresh install this worked because CREATE TABLE
|
||||
created the tables with the new columns before the CREATE INDEX
|
||||
ran, but on UPGRADE from a pre-Phase-9 schema the CREATE TABLE
|
||||
IF NOT EXISTS was a no-op and the CREATE INDEX hit
|
||||
OperationalError: no such column.
|
||||
|
||||
This test seeds the tables with the OLD pre-Phase-9 shape then
|
||||
calls init_db() and verifies that:
|
||||
|
||||
- init_db does not raise
|
||||
- The new columns were added via _apply_migrations
|
||||
- The new indexes exist
|
||||
|
||||
If the bug is reintroduced by moving a CREATE INDEX for a
|
||||
migration column back into SCHEMA_SQL, this test will fail
|
||||
with OperationalError before reaching the assertions.
|
||||
"""
|
||||
monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
|
||||
# Step 1: create the data dir and open a direct connection
|
||||
config.ensure_runtime_dirs()
|
||||
db_path = config.settings.db_path
|
||||
|
||||
# Step 2: seed the DB with the old pre-Phase-9 shape. No
|
||||
# project/last_referenced_at/reference_count on memories; no
|
||||
# project/client/session_id/response/memories_used/chunks_used
|
||||
# on interactions. We also need the prerequisite tables
|
||||
# (projects, source_documents, source_chunks) because the
|
||||
# memories table has an FK to source_chunks.
|
||||
with sqlite3.connect(str(db_path)) as conn:
|
||||
conn.executescript(
|
||||
"""
|
||||
CREATE TABLE source_documents (
|
||||
id TEXT PRIMARY KEY,
|
||||
file_path TEXT UNIQUE NOT NULL,
|
||||
file_hash TEXT NOT NULL,
|
||||
title TEXT,
|
||||
doc_type TEXT DEFAULT 'markdown',
|
||||
tags TEXT DEFAULT '[]',
|
||||
ingested_at DATETIME DEFAULT CURRENT_TIMESTAMP,
|
||||
updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
|
||||
);
|
||||
|
||||
CREATE TABLE source_chunks (
|
||||
id TEXT PRIMARY KEY,
|
||||
document_id TEXT NOT NULL REFERENCES source_documents(id) ON DELETE CASCADE,
|
||||
chunk_index INTEGER NOT NULL,
|
||||
content TEXT NOT NULL,
|
||||
heading_path TEXT DEFAULT '',
|
||||
char_count INTEGER NOT NULL,
|
||||
metadata TEXT DEFAULT '{}',
|
||||
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
|
||||
);
|
||||
|
||||
CREATE TABLE memories (
|
||||
id TEXT PRIMARY KEY,
|
||||
memory_type TEXT NOT NULL,
|
||||
content TEXT NOT NULL,
|
||||
source_chunk_id TEXT REFERENCES source_chunks(id),
|
||||
confidence REAL DEFAULT 1.0,
|
||||
status TEXT DEFAULT 'active',
|
||||
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
|
||||
updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
|
||||
);
|
||||
|
||||
CREATE TABLE projects (
|
||||
id TEXT PRIMARY KEY,
|
||||
name TEXT UNIQUE NOT NULL,
|
||||
description TEXT DEFAULT '',
|
||||
status TEXT DEFAULT 'active',
|
||||
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
|
||||
updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
|
||||
);
|
||||
|
||||
CREATE TABLE interactions (
|
||||
id TEXT PRIMARY KEY,
|
||||
prompt TEXT NOT NULL,
|
||||
context_pack TEXT DEFAULT '{}',
|
||||
response_summary TEXT DEFAULT '',
|
||||
project_id TEXT REFERENCES projects(id),
|
||||
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
|
||||
);
|
||||
"""
|
||||
)
|
||||
conn.commit()
|
||||
|
||||
# Step 3: call init_db — this used to raise on the upgrade
|
||||
# path. After the fix it should succeed.
|
||||
init_db()
|
||||
|
||||
# Step 4: verify the migrations ran — Phase 9 columns present
|
||||
with sqlite3.connect(str(db_path)) as conn:
|
||||
conn.row_factory = sqlite3.Row
|
||||
memories_cols = {
|
||||
row["name"] for row in conn.execute("PRAGMA table_info(memories)")
|
||||
}
|
||||
interactions_cols = {
|
||||
row["name"]
|
||||
for row in conn.execute("PRAGMA table_info(interactions)")
|
||||
}
|
||||
|
||||
assert "project" in memories_cols
|
||||
assert "last_referenced_at" in memories_cols
|
||||
assert "reference_count" in memories_cols
|
||||
|
||||
assert "project" in interactions_cols
|
||||
assert "client" in interactions_cols
|
||||
assert "session_id" in interactions_cols
|
||||
assert "response" in interactions_cols
|
||||
assert "memories_used" in interactions_cols
|
||||
assert "chunks_used" in interactions_cols
|
||||
|
||||
# Step 5: verify the indexes on migration columns exist
|
||||
index_rows = conn.execute(
|
||||
"SELECT name FROM sqlite_master WHERE type='index' AND tbl_name IN ('memories','interactions')"
|
||||
).fetchall()
|
||||
index_names = {row["name"] for row in index_rows}
|
||||
|
||||
assert "idx_memories_project" in index_names
|
||||
assert "idx_interactions_project_name" in index_names
|
||||
assert "idx_interactions_session" in index_names
|
||||
finally:
|
||||
config.settings = original_settings
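
The upgrade-path failure described in the docstring above can be reproduced in miniature. The sketch below is not AtoCore's `init_db`; the names `SCHEMA_SQL`, `INDEX_SQL`, and `init_db_sketch` are hypothetical, and the single `project` column stands in for the full set of Phase 9 columns. It only illustrates why an index on a migration column has to be created after the column is added.

```python
import sqlite3

# Hypothetical miniature schema: on an existing database the
# CREATE TABLE IF NOT EXISTS is a no-op, so the old table shape stays.
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS memories (
    id TEXT PRIMARY KEY,
    content TEXT NOT NULL,
    project TEXT DEFAULT ''
);
"""
INDEX_SQL = "CREATE INDEX IF NOT EXISTS idx_memories_project ON memories(project)"


def column_names(conn, table):
    """Return the set of column names currently present on a table."""
    return {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}


def init_db_sketch(conn):
    """Create tables, then migrate columns, then create dependent indexes."""
    conn.executescript(SCHEMA_SQL)
    if "project" not in column_names(conn, "memories"):
        conn.execute("ALTER TABLE memories ADD COLUMN project TEXT DEFAULT ''")
    conn.execute(INDEX_SQL)


def demo():
    conn = sqlite3.connect(":memory:")
    # Seed the old shape: no project column yet.
    conn.execute("CREATE TABLE memories (id TEXT PRIMARY KEY, content TEXT NOT NULL)")
    try:
        # Buggy ordering: index creation before the column migration.
        conn.execute(INDEX_SQL)
        index_failed_early = False
    except sqlite3.OperationalError:
        index_failed_early = True
    init_db_sketch(conn)
    indexes = {
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type='index' AND tbl_name='memories'"
        )
    }
    conn.close()
    return index_failed_early, "idx_memories_project" in indexes
```

Keeping the index statement out of the table-creation script means a fresh database and an upgraded one go through the same ordering.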
|
||||
|
||||
@@ -209,3 +209,96 @@ def test_list_interactions_endpoint_returns_summaries(tmp_data_dir):
|
||||
assert body["interactions"][0]["response_chars"] == 50
|
||||
# The list endpoint never includes the full response body
|
||||
assert "response" not in body["interactions"][0]
|
||||
|
||||
|
||||
# --- alias canonicalization on interaction capture/list -------------------
|
||||
|
||||
|
||||
def test_record_interaction_canonicalizes_project(project_registry):
|
||||
"""Capturing under an alias should store the canonical project id.
|
||||
|
||||
Regression for codex's P2 finding: reinforcement and extraction
|
||||
query memories by interaction.project; if the captured project is
|
||||
a raw alias they would silently miss memories stored under the
|
||||
canonical id.
|
||||
"""
|
||||
init_db()
|
||||
project_registry(("p05-interferometer", ["p05", "interferometer"]))
|
||||
|
||||
interaction = record_interaction(
|
||||
prompt="quick capture", response="response body", project="p05", reinforce=False
|
||||
)
|
||||
assert interaction.project == "p05-interferometer"
|
||||
|
||||
fetched = get_interaction(interaction.id)
|
||||
assert fetched.project == "p05-interferometer"
|
||||
|
||||
|
||||
def test_list_interactions_canonicalizes_project_filter(project_registry):
|
||||
init_db()
|
||||
project_registry(("p06-polisher", ["p06", "polisher"]))
|
||||
|
||||
record_interaction(prompt="a", response="ra", project="p06-polisher", reinforce=False)
|
||||
record_interaction(prompt="b", response="rb", project="polisher", reinforce=False)
|
||||
record_interaction(prompt="c", response="rc", project="atocore", reinforce=False)
|
||||
|
||||
# Querying by an alias should still find both p06 captures
|
||||
via_alias = list_interactions(project="p06")
|
||||
via_canonical = list_interactions(project="p06-polisher")
|
||||
assert len(via_alias) == 2
|
||||
assert len(via_canonical) == 2
|
||||
assert {i.prompt for i in via_alias} == {"a", "b"}
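
As a rough illustration of what these two tests expect, alias canonicalization can be sketched as a lookup in a map built from the registry. This is an assumption about the shape of the logic, not the project's actual resolver: `build_alias_map` and `canonicalize` are hypothetical names, and passing unregistered names through unchanged is inferred from the `atocore` capture in the test above.

```python
def build_alias_map(registry):
    """Map every canonical id and alias (lowercased) to its canonical id."""
    alias_map = {}
    for project in registry:
        canonical = project["id"]
        alias_map[canonical.lower()] = canonical
        for alias in project.get("aliases", []):
            alias_map[alias.lower()] = canonical
    return alias_map


def canonicalize(name, alias_map):
    """Resolve a project name; unregistered names are returned unchanged."""
    return alias_map.get(name.strip().lower(), name)


REGISTRY = [{"id": "p06-polisher", "aliases": ["p06", "polisher"]}]
ALIAS_MAP = build_alias_map(REGISTRY)
```

Applying the same function on both the write path and the filter path is what lets a query by `p06` find rows captured under `polisher`.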
|
||||
|
||||
|
||||
# --- since filter format normalization ------------------------------------
|
||||
|
||||
|
||||
def test_list_interactions_since_accepts_iso_with_t_separator(tmp_data_dir):
|
||||
init_db()
|
||||
record_interaction(prompt="early", response="r", reinforce=False)
|
||||
time.sleep(1.05)
|
||||
pivot = record_interaction(prompt="late", response="r", reinforce=False)
|
||||
|
||||
# pivot.created_at is in storage format 'YYYY-MM-DD HH:MM:SS'.
|
||||
# Build the equivalent ISO 8601 with 'T' that an external client
|
||||
# would naturally send.
|
||||
iso_with_t = pivot.created_at.replace(" ", "T")
|
||||
items = list_interactions(since=iso_with_t)
|
||||
assert any(i.id == pivot.id for i in items)
|
||||
# The early row was recorded more than a second before the pivot, so
# the inclusive `since` cutoff must exclude it.
early_ids = {i.id for i in items if i.prompt == "early"}
assert early_ids == set()
|
||||
|
||||
|
||||
def test_list_interactions_since_accepts_z_suffix(tmp_data_dir):
|
||||
init_db()
|
||||
pivot = record_interaction(prompt="pivot", response="r", reinforce=False)
|
||||
time.sleep(1.05)
|
||||
after = record_interaction(prompt="after", response="r", reinforce=False)
|
||||
|
||||
iso_with_z = pivot.created_at.replace(" ", "T") + "Z"
|
||||
items = list_interactions(since=iso_with_z)
|
||||
ids = {i.id for i in items}
|
||||
assert pivot.id in ids
|
||||
assert after.id in ids
|
||||
|
||||
|
||||
def test_list_interactions_since_accepts_offset(tmp_data_dir):
|
||||
init_db()
|
||||
pivot = record_interaction(prompt="pivot", response="r", reinforce=False)
|
||||
time.sleep(1.05)
|
||||
after = record_interaction(prompt="after", response="r", reinforce=False)
|
||||
|
||||
iso_with_offset = pivot.created_at.replace(" ", "T") + "+00:00"
|
||||
items = list_interactions(since=iso_with_offset)
|
||||
assert any(i.id == after.id for i in items)
|
||||
|
||||
|
||||
def test_list_interactions_since_storage_format_still_works(tmp_data_dir):
|
||||
"""The bare storage format must still work for backwards compatibility."""
|
||||
init_db()
|
||||
pivot = record_interaction(prompt="pivot", response="r", reinforce=False)
|
||||
|
||||
items = list_interactions(since=pivot.created_at)
|
||||
assert any(i.id == pivot.id for i in items)
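
The four `since` tests above accept the bare storage format, a `T` separator, a `Z` suffix, and an explicit offset. A minimal sketch of a normalizer with that behavior follows. The helper name `normalize_since` is hypothetical and the real service may do this differently; the sketch assumes stored timestamps are UTC, which matches SQLite's `CURRENT_TIMESTAMP`.

```python
from datetime import datetime, timezone

STORAGE_FORMAT = "%Y-%m-%d %H:%M:%S"


def normalize_since(value):
    """Convert an ISO 8601 timestamp into the 'YYYY-MM-DD HH:MM:SS' storage form."""
    text = value.strip()
    # Older Python versions do not parse a trailing 'Z', so rewrite it.
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is not None:
        # Shift aware timestamps to UTC, then drop the offset.
        parsed = parsed.astimezone(timezone.utc).replace(tzinfo=None)
    return parsed.strftime(STORAGE_FORMAT)
```

Normalizing before the comparison matters because a plain string comparison between `2026-04-01T12:00:00` and `2026-04-01 12:00:00` orders by the separator character rather than by time.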
|
||||
|
||||
802
tests/test_migrate_legacy_aliases.py
Normal file
@@ -0,0 +1,802 @@
"""Tests for scripts/migrate_legacy_aliases.py.

The migration script closes the compatibility gap documented in
docs/architecture/project-identity-canonicalization.md. These tests
cover:

- empty/clean database behavior
- shadow projects detection
- state rekey without collisions
- state collision detection + apply refusal
- memory rekey + supersession of duplicates
- interaction rekey
- end-to-end apply on a realistic shadow
- idempotency (running twice produces the same final state)
- report artifact is written
- the pre-fix regression gap is actually closed after migration
"""

from __future__ import annotations

import json
import sqlite3
import sys
import uuid
from pathlib import Path

import pytest

from atocore.context.project_state import (
    get_state,
    init_project_state_schema,
)
from atocore.models.database import init_db

# Make scripts/ importable
_REPO_ROOT = Path(__file__).resolve().parent.parent
sys.path.insert(0, str(_REPO_ROOT / "scripts"))

import migrate_legacy_aliases as mig  # noqa: E402
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Helpers that seed "legacy" rows the way they would have looked before fb6298a
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def _open_db_connection():
    """Open a direct SQLite connection to the test data dir's DB."""
    import atocore.config as config

    conn = sqlite3.connect(str(config.settings.db_path))
    conn.row_factory = sqlite3.Row
    conn.execute("PRAGMA foreign_keys = ON")
    return conn


def _seed_shadow_project(
    conn: sqlite3.Connection, shadow_name: str
) -> str:
    """Insert a projects row keyed under an alias, like the old set_state would have."""
    project_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO projects (id, name, description) VALUES (?, ?, ?)",
        (project_id, shadow_name, f"shadow row for {shadow_name}"),
    )
    conn.commit()
    return project_id


def _seed_state_row(
    conn: sqlite3.Connection,
    project_id: str,
    category: str,
    key: str,
    value: str,
    status: str = "active",
) -> str:
    row_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO project_state "
        "(id, project_id, category, key, value, source, confidence, status) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (row_id, project_id, category, key, value, "legacy-test", 1.0, status),
    )
    conn.commit()
    return row_id


def _seed_memory_row(
    conn: sqlite3.Connection,
    memory_type: str,
    content: str,
    project: str,
    status: str = "active",
) -> str:
    row_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO memories "
        "(id, memory_type, content, project, source_chunk_id, confidence, status) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (row_id, memory_type, content, project, None, 1.0, status),
    )
    conn.commit()
    return row_id


def _seed_interaction_row(
    conn: sqlite3.Connection, prompt: str, project: str
) -> str:
    row_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO interactions "
        "(id, prompt, context_pack, response_summary, response, "
        " memories_used, chunks_used, client, session_id, project, created_at) "
        "VALUES (?, ?, '{}', '', '', '[]', '[]', 'legacy-test', '', ?, '2026-04-01 12:00:00')",
        (row_id, prompt, project),
    )
    conn.commit()
    return row_id
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# plan-building tests
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@pytest.fixture(autouse=True)
def _setup(tmp_data_dir):
    init_db()
    init_project_state_schema()
|
||||
|
||||
|
||||
def test_dry_run_on_empty_registry_reports_empty_plan(tmp_data_dir):
|
||||
"""Empty registry -> empty alias map -> empty plan."""
|
||||
registry_path = tmp_data_dir / "empty-registry.json"
|
||||
registry_path.write_text('{"projects": []}', encoding="utf-8")
|
||||
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
plan = mig.build_plan(conn, registry_path)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
assert plan.alias_map == {}
|
||||
assert plan.is_empty
|
||||
assert not plan.has_collisions
|
||||
assert plan.counts() == {
|
||||
"shadow_projects": 0,
|
||||
"state_rekey_rows": 0,
|
||||
"state_collisions": 0,
|
||||
"state_historical_drops": 0,
|
||||
"memory_rekey_rows": 0,
|
||||
"memory_supersede_rows": 0,
|
||||
"interaction_rekey_rows": 0,
|
||||
}
|
||||
|
||||
|
||||
def test_dry_run_on_clean_registered_db_reports_empty_plan(project_registry):
|
||||
"""A registry with projects but no legacy rows -> empty plan."""
|
||||
registry_path = project_registry(
|
||||
("p05-interferometer", ["p05", "interferometer"])
|
||||
)
|
||||
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
plan = mig.build_plan(conn, registry_path)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
assert plan.alias_map != {}
|
||||
assert plan.is_empty
|
||||
|
||||
|
||||
def test_dry_run_finds_shadow_project(project_registry):
|
||||
registry_path = project_registry(
|
||||
("p05-interferometer", ["p05", "interferometer"])
|
||||
)
|
||||
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
_seed_shadow_project(conn, "p05")
|
||||
plan = mig.build_plan(conn, registry_path)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
assert len(plan.shadow_projects) == 1
|
||||
assert plan.shadow_projects[0].shadow_name == "p05"
|
||||
assert plan.shadow_projects[0].canonical_project_id == "p05-interferometer"
|
||||
|
||||
|
||||
def test_dry_run_plans_state_rekey_without_collisions(project_registry):
|
||||
registry_path = project_registry(
|
||||
("p05-interferometer", ["p05", "interferometer"])
|
||||
)
|
||||
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
shadow_id = _seed_shadow_project(conn, "p05")
|
||||
_seed_state_row(conn, shadow_id, "status", "next_focus", "Wave 1 ingestion")
|
||||
_seed_state_row(conn, shadow_id, "decision", "lateral_support", "GF-PTFE")
|
||||
plan = mig.build_plan(conn, registry_path)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
assert len(plan.state_plans) == 1
|
||||
sp = plan.state_plans[0]
|
||||
assert len(sp.rows_to_rekey) == 2
|
||||
assert sp.collisions == []
|
||||
assert not plan.has_collisions
|
||||
|
||||
|
||||
def test_dry_run_detects_state_collision(project_registry):
|
||||
"""Shadow and canonical both have state under the same (category, key) with different values."""
|
||||
registry_path = project_registry(
|
||||
("p05-interferometer", ["p05", "interferometer"])
|
||||
)
|
||||
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
shadow_id = _seed_shadow_project(conn, "p05")
|
||||
canonical_id = _seed_shadow_project(conn, "p05-interferometer")
|
||||
_seed_state_row(conn, shadow_id, "status", "next_focus", "Wave 1")
|
||||
_seed_state_row(
|
||||
conn, canonical_id, "status", "next_focus", "Wave 2"
|
||||
)
|
||||
plan = mig.build_plan(conn, registry_path)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
assert plan.has_collisions
|
||||
collision = plan.state_plans[0].collisions[0]
|
||||
assert collision["shadow"]["value"] == "Wave 1"
|
||||
assert collision["canonical"]["value"] == "Wave 2"
|
||||
|
||||
|
||||
def test_dry_run_plans_memory_rekey_and_supersession(project_registry):
|
||||
registry_path = project_registry(
|
||||
("p04-gigabit", ["p04", "gigabit"])
|
||||
)
|
||||
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
# A clean memory under the alias that will just be rekeyed
|
||||
_seed_memory_row(conn, "project", "clean rekey memory", "p04")
|
||||
# A memory that collides with an existing canonical memory
|
||||
_seed_memory_row(conn, "project", "duplicate content", "p04")
|
||||
_seed_memory_row(conn, "project", "duplicate content", "p04-gigabit")
|
||||
plan = mig.build_plan(conn, registry_path)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
# There's exactly one memory plan (one alias matched)
|
||||
assert len(plan.memory_plans) == 1
|
||||
mp = plan.memory_plans[0]
|
||||
# Two rows are candidates for rekey or supersession — one clean,
|
||||
# one duplicate. The duplicate is handled via to_supersede; the
|
||||
# other via rows_to_rekey.
|
||||
total_affected = len(mp.rows_to_rekey) + len(mp.to_supersede)
|
||||
assert total_affected == 2
|
||||
|
||||
|
||||
def test_dry_run_plans_interaction_rekey(project_registry):
|
||||
registry_path = project_registry(
|
||||
("p06-polisher", ["p06", "polisher"])
|
||||
)
|
||||
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
_seed_interaction_row(conn, "quick capture under alias", "polisher")
|
||||
_seed_interaction_row(conn, "another alias-keyed row", "p06")
|
||||
plan = mig.build_plan(conn, registry_path)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
total = sum(len(p.rows_to_rekey) for p in plan.interaction_plans)
|
||||
assert total == 2
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# apply tests
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_apply_refuses_on_state_collision(project_registry):
|
||||
registry_path = project_registry(
|
||||
("p05-interferometer", ["p05", "interferometer"])
|
||||
)
|
||||
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
shadow_id = _seed_shadow_project(conn, "p05")
|
||||
canonical_id = _seed_shadow_project(conn, "p05-interferometer")
|
||||
_seed_state_row(conn, shadow_id, "status", "next_focus", "Wave 1")
|
||||
_seed_state_row(conn, canonical_id, "status", "next_focus", "Wave 2")
|
||||
|
||||
plan = mig.build_plan(conn, registry_path)
|
||||
assert plan.has_collisions
|
||||
|
||||
with pytest.raises(mig.MigrationRefused):
|
||||
mig.apply_plan(conn, plan)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
|
||||
def test_apply_migrates_clean_shadow_end_to_end(project_registry):
|
||||
"""The happy path: one shadow project with clean state rows, rekey into a freshly-created canonical row, verify reachability via get_state."""
|
||||
registry_path = project_registry(
|
||||
("p05-interferometer", ["p05", "interferometer"])
|
||||
)
|
||||
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
shadow_id = _seed_shadow_project(conn, "p05")
|
||||
_seed_state_row(
|
||||
conn, shadow_id, "status", "next_focus", "Wave 1 ingestion"
|
||||
)
|
||||
_seed_state_row(
|
||||
conn, shadow_id, "decision", "lateral_support", "GF-PTFE"
|
||||
)
|
||||
|
||||
plan = mig.build_plan(conn, registry_path)
|
||||
assert not plan.has_collisions
|
||||
summary = mig.apply_plan(conn, plan)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
assert summary["state_rows_rekeyed"] == 2
|
||||
assert summary["shadow_projects_deleted"] == 1
|
||||
assert summary["canonical_rows_created"] == 1
|
||||
|
||||
# The regression gap is now closed: the service layer can see
# the state under the canonical id via either the alias or the
# canonical id itself.
|
||||
via_alias = get_state("p05")
|
||||
via_canonical = get_state("p05-interferometer")
|
||||
assert len(via_alias) == 2
|
||||
assert len(via_canonical) == 2
|
||||
values = {entry.value for entry in via_canonical}
|
||||
assert values == {"Wave 1 ingestion", "GF-PTFE"}
|
||||
|
||||
|
||||
def test_apply_drops_shadow_state_duplicate_without_collision(project_registry):
|
||||
"""Shadow and canonical both have the same (category, key, value) — shadow gets marked superseded rather than hitting the UNIQUE constraint."""
|
||||
registry_path = project_registry(
|
||||
("p05-interferometer", ["p05", "interferometer"])
|
||||
)
|
||||
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
shadow_id = _seed_shadow_project(conn, "p05")
|
||||
canonical_id = _seed_shadow_project(conn, "p05-interferometer")
|
||||
_seed_state_row(
|
||||
conn, shadow_id, "status", "next_focus", "Wave 1 ingestion"
|
||||
)
|
||||
_seed_state_row(
|
||||
conn, canonical_id, "status", "next_focus", "Wave 1 ingestion"
|
||||
)
|
||||
|
||||
plan = mig.build_plan(conn, registry_path)
|
||||
assert not plan.has_collisions
|
||||
summary = mig.apply_plan(conn, plan)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
assert summary["state_rows_merged_as_duplicate"] == 1
|
||||
|
||||
via_canonical = get_state("p05-interferometer")
|
||||
# Exactly one active row survives
|
||||
assert len(via_canonical) == 1
|
||||
assert via_canonical[0].value == "Wave 1 ingestion"
|
||||
|
||||
|
||||
def test_apply_preserves_superseded_shadow_state_when_no_collision(project_registry):
|
||||
"""Regression test for the codex-flagged data-loss bug.
|
||||
|
||||
Before the fix, plan_state_migration only selected status='active'
|
||||
rows. Any superseded or invalid row on the shadow project was
|
||||
invisible to the plan and got silently cascade-deleted when the
|
||||
shadow projects row was dropped at the end of apply. That's
|
||||
exactly the kind of audit loss a cleanup migration must not cause.
|
||||
|
||||
This test seeds a shadow project with a superseded state row on
|
||||
a triple the canonical project doesn't have, runs the migration,
|
||||
and verifies the row survived and is now attached to the
|
||||
canonical project (still with status='superseded').
|
||||
"""
|
||||
registry_path = project_registry(
|
||||
("p05-interferometer", ["p05", "interferometer"])
|
||||
)
|
||||
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
shadow_id = _seed_shadow_project(conn, "p05")
|
||||
# Superseded row on a triple the canonical won't have
|
||||
_seed_state_row(
|
||||
conn,
|
||||
shadow_id,
|
||||
"status",
|
||||
"historical_phase",
|
||||
"Phase 0 legacy",
|
||||
status="superseded",
|
||||
)
|
||||
|
||||
plan = mig.build_plan(conn, registry_path)
|
||||
assert not plan.has_collisions
|
||||
summary = mig.apply_plan(conn, plan)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
# The superseded row should have been rekeyed, not dropped
|
||||
assert summary["state_rows_rekeyed"] == 1
|
||||
assert summary["state_rows_historical_dropped"] == 0
|
||||
|
||||
# Verify via raw SQL that the row is now attached to the canonical
|
||||
# projects row and still has status='superseded'
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
row = conn.execute(
|
||||
"SELECT ps.status, ps.value, p.name "
|
||||
"FROM project_state ps JOIN projects p ON ps.project_id = p.id "
|
||||
"WHERE ps.category = ? AND ps.key = ?",
|
||||
("status", "historical_phase"),
|
||||
).fetchone()
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
assert row is not None, "superseded shadow row was lost during migration"
|
||||
assert row["status"] == "superseded"
|
||||
assert row["value"] == "Phase 0 legacy"
|
||||
assert row["name"] == "p05-interferometer"
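
The data-loss mechanism this regression test guards against can be shown with plain SQLite. The schema below is a simplified, hypothetical stand-in for `projects` and `project_state`; it assumes the state table references `projects(id)` with `ON DELETE CASCADE`, which is what the docstring's "cascade-deleted" implies. The function name is illustrative only.

```python
import sqlite3

# Simplified stand-in schema (assumption: ON DELETE CASCADE on project_id).
SCHEMA = """
CREATE TABLE projects (
    id TEXT PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE project_state (
    id TEXT PRIMARY KEY,
    project_id TEXT NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    key TEXT NOT NULL,
    value TEXT NOT NULL,
    status TEXT DEFAULT 'active'
);
"""


def surviving_rows_after_shadow_delete(rekey_first):
    """Delete the shadow project and count the state rows that remain."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO projects VALUES ('shadow', 'p05')")
    conn.execute("INSERT INTO projects VALUES ('canon', 'p05-interferometer')")
    conn.execute(
        "INSERT INTO project_state VALUES "
        "('s1', 'shadow', 'historical_phase', 'Phase 0 legacy', 'superseded')"
    )
    if rekey_first:
        # Move every shadow row, whatever its status, before the delete.
        conn.execute(
            "UPDATE project_state SET project_id = 'canon' WHERE project_id = 'shadow'"
        )
    conn.execute("DELETE FROM projects WHERE id = 'shadow'")
    count = conn.execute("SELECT COUNT(*) FROM project_state").fetchone()[0]
    conn.close()
    return count
```

A plan that only looks at `status='active'` rows behaves like the `rekey_first=False` branch for everything else, which is why the superseded row disappeared before the fix.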
|
||||
|
||||
|
||||
def test_apply_drops_shadow_inactive_row_when_canonical_holds_same_triple(project_registry):
|
||||
"""Shadow is inactive (superseded) and collides with an active canonical row.
|
||||
|
||||
The canonical wins by definition of the UPSERT schema. The shadow
|
||||
row is recorded as a historical_drop in the plan so the operator
|
||||
sees the audit loss, and the apply cascade-deletes it via the
|
||||
shadow projects row. This is the unavoidable data-loss case
|
||||
documented in the migration module docstring.
|
||||
"""
|
||||
registry_path = project_registry(
|
||||
("p05-interferometer", ["p05", "interferometer"])
|
||||
)
|
||||
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
shadow_id = _seed_shadow_project(conn, "p05")
|
||||
canonical_id = _seed_shadow_project(conn, "p05-interferometer")
|
||||
|
||||
# Shadow has a superseded value on a triple where the canonical
|
||||
# has a different active value. Can't preserve both: UNIQUE
|
||||
# allows only one row per triple.
|
||||
_seed_state_row(
|
||||
conn,
|
||||
shadow_id,
|
||||
"status",
|
||||
"next_focus",
|
||||
"Old wave 1",
|
||||
status="superseded",
|
||||
)
|
||||
_seed_state_row(
|
||||
conn,
|
||||
canonical_id,
|
||||
"status",
|
||||
"next_focus",
|
||||
"Wave 2 trusted-operational",
|
||||
status="active",
|
||||
)
|
||||
|
||||
plan = mig.build_plan(conn, registry_path)
|
||||
assert not plan.has_collisions # not an active-vs-active collision
|
||||
assert plan.counts()["state_historical_drops"] == 1
|
||||
|
||||
summary = mig.apply_plan(conn, plan)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
assert summary["state_rows_historical_dropped"] == 1
|
||||
|
||||
# The canonical's active row survives unchanged
|
||||
via_canonical = get_state("p05-interferometer")
|
||||
active_next_focus = [
|
||||
e
|
||||
for e in via_canonical
|
||||
if e.category == "status" and e.key == "next_focus"
|
||||
]
|
||||
assert len(active_next_focus) == 1
|
||||
assert active_next_focus[0].value == "Wave 2 trusted-operational"
|
||||
|
||||
|
||||
def test_apply_replaces_inactive_canonical_with_active_shadow(project_registry):
|
||||
"""Shadow is active, canonical has an inactive row at the same triple.
|
||||
|
||||
The shadow wins: canonical inactive row is deleted, shadow is
|
||||
rekeyed into canonical's project_id. This covers the
|
||||
cross-contamination case where the old alias path was used for
|
||||
the live value while the canonical path had a stale row.
|
||||
"""
|
||||
registry_path = project_registry(
|
||||
("p06-polisher", ["p06", "polisher"])
|
||||
)
|
||||
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
shadow_id = _seed_shadow_project(conn, "p06")
|
||||
canonical_id = _seed_shadow_project(conn, "p06-polisher")
|
||||
|
||||
# Canonical has a stale invalid row; shadow has the live value.
|
||||
_seed_state_row(
|
||||
conn,
|
||||
canonical_id,
|
||||
"decision",
|
||||
"frame",
|
||||
"Old frame (no longer current)",
|
||||
status="invalid",
|
||||
)
|
||||
_seed_state_row(
|
||||
conn,
|
||||
shadow_id,
|
||||
"decision",
|
||||
"frame",
|
||||
"kinematic mount frame",
|
||||
status="active",
|
||||
)
|
||||
|
||||
plan = mig.build_plan(conn, registry_path)
|
||||
assert not plan.has_collisions
|
||||
assert plan.counts()["state_historical_drops"] == 0
|
||||
|
||||
summary = mig.apply_plan(conn, plan)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
assert summary["state_rows_replaced_inactive_canonical"] == 1
|
||||
|
||||
# The active shadow value now lives on the canonical row
|
||||
via_canonical = get_state("p06-polisher")
|
||||
frame_entries = [
|
||||
e for e in via_canonical if e.category == "decision" and e.key == "frame"
|
||||
]
|
||||
assert len(frame_entries) == 1
|
||||
assert frame_entries[0].value == "kinematic mount frame"
|
||||
|
||||
# Confirm via raw SQL that the previously-inactive canonical row
|
||||
# no longer exists
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
stale = conn.execute(
|
||||
"SELECT COUNT(*) AS c FROM project_state WHERE value = ?",
|
||||
("Old frame (no longer current)",),
|
||||
).fetchone()
|
||||
finally:
|
||||
conn.close()
|
||||
assert stale["c"] == 0
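
Taken together, the state tests above describe a small decision table for a shadow row and the canonical row at the same (category, key). The sketch below restates that table as a pure function. `classify_state_pair` and its return labels are hypothetical and do not come from the migration script; the case where both rows are inactive is not covered by the tests shown here, so treating it as a historical drop is an assumption.

```python
def classify_state_pair(
    shadow_status,
    shadow_value,
    canonical_status=None,
    canonical_value=None,
):
    """Decide what happens to a shadow state row during migration.

    canonical_status is None when the canonical project has no row at
    the same (category, key).
    """
    if canonical_status is None:
        # Nothing in the way: move the row, whatever its status.
        return "rekey"
    shadow_active = shadow_status == "active"
    canonical_active = canonical_status == "active"
    if shadow_active and canonical_active:
        # Same value is a harmless duplicate; different values block apply.
        return "duplicate" if shadow_value == canonical_value else "collision"
    if shadow_active:
        # Live shadow value replaces a stale canonical row.
        return "replace_inactive_canonical"
    # Inactive shadow row loses to whatever the canonical holds
    # (assumption for the inactive-vs-inactive case).
    return "historical_drop"
```

Only the `collision` outcome makes the apply step refuse; the other outcomes are resolved automatically and surfaced in the plan counts.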
|
||||
|
||||
|
||||
def test_apply_migrates_memories(project_registry):
|
||||
registry_path = project_registry(
|
||||
("p04-gigabit", ["p04", "gigabit"])
|
||||
)
|
||||
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
_seed_memory_row(conn, "project", "lateral support uses GF-PTFE", "p04")
|
||||
_seed_memory_row(conn, "preference", "I prefer descriptive commits", "gigabit")
|
||||
plan = mig.build_plan(conn, registry_path)
|
||||
summary = mig.apply_plan(conn, plan)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
assert summary["memory_rows_rekeyed"] == 2
|
||||
|
||||
# Both memories should now read as living under the canonical id
|
||||
from atocore.memory.service import get_memories
|
||||
|
||||
rows = get_memories(project="p04-gigabit", limit=50)
|
||||
contents = {m.content for m in rows}
|
||||
assert "lateral support uses GF-PTFE" in contents
|
||||
assert "I prefer descriptive commits" in contents
|
||||
|
||||
|
||||
def test_apply_migrates_interactions(project_registry):
|
||||
registry_path = project_registry(
|
||||
("p06-polisher", ["p06", "polisher"])
|
||||
)
|
||||
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
_seed_interaction_row(conn, "alias-keyed 1", "polisher")
|
||||
_seed_interaction_row(conn, "alias-keyed 2", "p06")
|
||||
plan = mig.build_plan(conn, registry_path)
|
||||
summary = mig.apply_plan(conn, plan)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
assert summary["interaction_rows_rekeyed"] == 2
|
||||
|
||||
from atocore.interactions.service import list_interactions
|
||||
|
||||
rows = list_interactions(project="p06-polisher", limit=50)
|
||||
prompts = {i.prompt for i in rows}
|
||||
assert prompts == {"alias-keyed 1", "alias-keyed 2"}
|
||||
|
||||
|
||||
def test_apply_is_idempotent(project_registry):
|
||||
"""Running apply twice produces the same final state as running it once."""
|
||||
registry_path = project_registry(
|
||||
("p05-interferometer", ["p05", "interferometer"])
|
||||
)
|
||||
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
shadow_id = _seed_shadow_project(conn, "p05")
|
||||
_seed_state_row(conn, shadow_id, "status", "next_focus", "Wave 1")
|
||||
_seed_memory_row(conn, "project", "m1", "p05")
|
||||
_seed_interaction_row(conn, "i1", "p05")
|
||||
|
||||
# first apply
|
||||
plan_a = mig.build_plan(conn, registry_path)
|
||||
summary_a = mig.apply_plan(conn, plan_a)
|
||||
|
||||
# second apply: plan should be empty
|
||||
plan_b = mig.build_plan(conn, registry_path)
|
||||
assert plan_b.is_empty
|
||||
|
||||
# forcing a second apply on the empty plan via the function
|
||||
# directly should also succeed as a no-op (caller normally
|
||||
# has to pass --allow-empty through the CLI, but apply_plan
|
||||
# itself doesn't enforce that — the refusal is in run())
|
||||
summary_b = mig.apply_plan(conn, plan_b)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
assert summary_a["state_rows_rekeyed"] == 1
|
||||
assert summary_a["memory_rows_rekeyed"] == 1
|
||||
assert summary_a["interaction_rows_rekeyed"] == 1
|
||||
assert summary_b["state_rows_rekeyed"] == 0
|
||||
assert summary_b["memory_rows_rekeyed"] == 0
|
||||
assert summary_b["interaction_rows_rekeyed"] == 0
|
||||
|
||||
|
||||
def test_apply_refuses_with_integrity_errors(project_registry):
|
||||
"""If the projects table has two case-variant rows for the canonical id, refuse.
|
||||
|
||||
The projects.name column has a case-sensitive UNIQUE constraint,
|
||||
so exact duplicates can't exist. But case-variant rows
|
||||
``p05-interferometer`` and ``P05-Interferometer`` can both
|
||||
survive the UNIQUE constraint while both matching the
|
||||
case-insensitive ``lower(name) = lower(?)`` lookup that the
|
||||
migration uses to find the canonical row. That ambiguity
|
||||
(which canonical row should dependents rekey into?) is exactly
|
||||
the integrity failure the migration is guarding against.
|
||||
"""
|
||||
registry_path = project_registry(
|
||||
("p05-interferometer", ["p05", "interferometer"])
|
||||
)
|
||||
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
_seed_shadow_project(conn, "p05-interferometer")
|
||||
_seed_shadow_project(conn, "P05-Interferometer")
|
||||
plan = mig.build_plan(conn, registry_path)
|
||||
assert plan.integrity_errors
|
||||
with pytest.raises(mig.MigrationRefused):
|
||||
mig.apply_plan(conn, plan)
|
||||
finally:
|
||||
conn.close()
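
The ambiguity described in this docstring is easy to reproduce: SQLite's default collation makes a `UNIQUE` text column case-sensitive, while a `lower(name) = lower(?)` lookup is not. The sketch below uses a cut-down `projects` table and a hypothetical `find_canonical_rows` helper; it is an illustration, not the migration script's query.

```python
import sqlite3


def find_canonical_rows(conn, canonical_name):
    """Case-insensitive lookup; may return more than one row."""
    return conn.execute(
        "SELECT id, name FROM projects WHERE lower(name) = lower(?)",
        (canonical_name,),
    ).fetchall()


def demo_case_variants():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE projects (id TEXT PRIMARY KEY, name TEXT UNIQUE NOT NULL)")
    conn.execute("INSERT INTO projects VALUES ('1', 'p05-interferometer')")
    # A case variant passes the UNIQUE constraint.
    conn.execute("INSERT INTO projects VALUES ('2', 'P05-Interferometer')")
    try:
        # An exact duplicate does not.
        conn.execute("INSERT INTO projects VALUES ('3', 'p05-interferometer')")
        exact_duplicate_rejected = False
    except sqlite3.IntegrityError:
        exact_duplicate_rejected = True
    matches = find_canonical_rows(conn, "p05-interferometer")
    conn.close()
    return exact_duplicate_rejected, sorted(name for _, name in matches)
```

When the lookup returns two rows there is no safe choice of rekey target, so refusing and reporting an integrity error is the conservative outcome.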
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# reporting tests
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_plan_to_json_dict_is_serializable(project_registry):
|
||||
registry_path = project_registry(
|
||||
("p05-interferometer", ["p05", "interferometer"])
|
||||
)
|
||||
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
shadow_id = _seed_shadow_project(conn, "p05")
|
||||
_seed_state_row(conn, shadow_id, "status", "next_focus", "Wave 1")
|
||||
plan = mig.build_plan(conn, registry_path)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
payload = mig.plan_to_json_dict(plan)
|
||||
# Must be JSON-serializable
|
||||
json_str = json.dumps(payload, default=str)
|
||||
assert "p05-interferometer" in json_str
|
||||
assert payload["counts"]["state_rekey_rows"] == 1
|
||||
|
||||
|
||||
def test_write_report_creates_file(tmp_path, project_registry):
|
||||
registry_path = project_registry(
|
||||
("p05-interferometer", ["p05", "interferometer"])
|
||||
)
|
||||
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
plan = mig.build_plan(conn, registry_path)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
report_dir = tmp_path / "reports"
|
||||
report_path = mig.write_report(
|
||||
plan,
|
||||
summary=None,
|
||||
db_path=Path("/tmp/fake.db"),
|
||||
registry_path=registry_path,
|
||||
mode="dry-run",
|
||||
report_dir=report_dir,
|
||||
)
|
||||
assert report_path.exists()
|
||||
payload = json.loads(report_path.read_text(encoding="utf-8"))
|
||||
assert payload["mode"] == "dry-run"
|
||||
assert "plan" in payload
|
||||
|
||||
|
||||
def test_render_plan_text_on_empty_plan(project_registry):
|
||||
registry_path = project_registry() # empty
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
plan = mig.build_plan(conn, registry_path)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
text = mig.render_plan_text(plan)
|
||||
assert "nothing to plan" in text.lower()
|
||||
|
||||
|
||||
def test_render_plan_text_on_collision(project_registry):
|
||||
registry_path = project_registry(
|
||||
("p05-interferometer", ["p05"])
|
||||
)
|
||||
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
shadow_id = _seed_shadow_project(conn, "p05")
|
||||
canonical_id = _seed_shadow_project(conn, "p05-interferometer")
|
||||
_seed_state_row(conn, shadow_id, "status", "phase", "A")
|
||||
_seed_state_row(conn, canonical_id, "status", "phase", "B")
|
||||
plan = mig.build_plan(conn, registry_path)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
text = mig.render_plan_text(plan)
|
||||
assert "COLLISION" in text.upper()
|
||||
assert "REFUSE" in text.upper()
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# gap-closed companion test — the flip side of
|
||||
# test_legacy_alias_keyed_state_is_invisible_until_migrated in
|
||||
# test_project_state.py. After running this migration, the legacy row
|
||||
# IS reachable via the canonical id.
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_legacy_alias_gap_is_closed_after_migration(project_registry):
|
||||
"""End-to-end regression test for the canonicalization gap.
|
||||
|
||||
Simulates the exact scenario from
|
||||
test_legacy_alias_keyed_state_is_invisible_until_migrated in
|
||||
test_project_state.py — a shadow projects row with a state row
|
||||
pointing at it. Runs the migration. Verifies the state is now
|
||||
reachable via the canonical id.
|
||||
"""
|
||||
registry_path = project_registry(
|
||||
("p05-interferometer", ["p05", "interferometer"])
|
||||
)
|
||||
|
||||
conn = _open_db_connection()
|
||||
try:
|
||||
shadow_id = _seed_shadow_project(conn, "p05")
|
||||
_seed_state_row(
|
||||
conn, shadow_id, "status", "legacy_focus", "Wave 1 ingestion"
|
||||
)
|
||||
|
||||
# Before migration: the legacy row is invisible to get_state
|
||||
# (this is the documented gap, covered in test_project_state.py)
|
||||
assert all(
|
||||
entry.value != "Wave 1 ingestion" for entry in get_state("p05")
|
||||
)
|
||||
assert all(
|
||||
entry.value != "Wave 1 ingestion"
|
||||
for entry in get_state("p05-interferometer")
|
||||
)
|
||||
|
||||
# Run the migration
|
||||
plan = mig.build_plan(conn, registry_path)
|
||||
mig.apply_plan(conn, plan)
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
# After migration: the row is reachable via canonical AND alias
|
||||
via_canonical = get_state("p05-interferometer")
|
||||
via_alias = get_state("p05")
|
||||
assert any(e.value == "Wave 1 ingestion" for e in via_canonical)
|
||||
assert any(e.value == "Wave 1 ingestion" for e in via_alias)
|
||||
@@ -131,3 +131,139 @@ def test_format_project_state():
|
||||
def test_format_empty():
|
||||
"""Test formatting empty state."""
|
||||
assert format_project_state([]) == ""
|
||||
|
||||
|
||||
# --- Alias canonicalization regression tests --------------------------------
|
||||
|
||||
|
||||
def test_set_state_canonicalizes_alias(project_registry):
|
||||
"""Writing state via an alias should land under the canonical project id.
|
||||
|
||||
Regression for codex's P1 finding: previously /project/state with
|
||||
project="p05" created a separate alias row that later context builds
|
||||
(which canonicalize the hint) would never see.
|
||||
"""
|
||||
project_registry(("p05-interferometer", ["p05", "interferometer"]))
|
||||
|
||||
set_state("p05", "status", "next_focus", "Wave 2 ingestion")
|
||||
|
||||
# The state must be reachable via every alias AND the canonical id
|
||||
via_alias = get_state("p05")
|
||||
via_canonical = get_state("p05-interferometer")
|
||||
via_other_alias = get_state("interferometer")
|
||||
|
||||
assert len(via_alias) == 1
|
||||
assert len(via_canonical) == 1
|
||||
assert len(via_other_alias) == 1
|
||||
# All three reads return the same row id (no fragmented duplicates)
|
||||
assert via_alias[0].id == via_canonical[0].id == via_other_alias[0].id
|
||||
assert via_canonical[0].value == "Wave 2 ingestion"
|
||||
|
||||
|
||||
def test_get_state_canonicalizes_alias_after_canonical_write(project_registry):
|
||||
"""Reading via an alias should find state written under the canonical id."""
|
||||
project_registry(("p04-gigabit", ["p04", "gigabit"]))
|
||||
|
||||
set_state("p04-gigabit", "status", "phase", "Phase 1 baseline")
|
||||
via_alias = get_state("gigabit")
|
||||
|
||||
assert len(via_alias) == 1
|
||||
assert via_alias[0].value == "Phase 1 baseline"
|
||||
|
||||
|
||||
def test_invalidate_state_canonicalizes_alias(project_registry):
|
||||
"""Invalidating via an alias should hit the canonical row."""
|
||||
project_registry(("p06-polisher", ["p06", "polisher"]))
|
||||
|
||||
set_state("p06-polisher", "decision", "frame", "kinematic mounts")
|
||||
success = invalidate_state("polisher", "decision", "frame")
|
||||
|
||||
assert success is True
|
||||
active = get_state("p06-polisher")
|
||||
assert len(active) == 0
|
||||
|
||||
|
||||
def test_unregistered_project_state_still_works(project_registry):
|
||||
"""Hand-curated state for an unregistered project must still round-trip.
|
||||
|
||||
Backwards compatibility with state created before the project
|
||||
registry existed: resolve_project_name returns the input unchanged
|
||||
when the registry has no record, so the raw name is used as-is.
|
||||
"""
|
||||
project_registry() # empty registry
|
||||
|
||||
set_state("orphan-project", "status", "phase", "Standalone")
|
||||
entries = get_state("orphan-project")
|
||||
assert len(entries) == 1
|
||||
assert entries[0].value == "Standalone"
|
||||
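The fallback behaviour that the docstring above attributes to `resolve_project_name` can be sketched as follows. This is an illustration only: `resolve_name_sketch` and the dict-shaped registry are assumptions, and the real function may match differently (for example, case-insensitively).

```python
def resolve_name_sketch(name, registry):
    """Map an alias to its canonical project id, or return the input unchanged.

    `registry` is a hypothetical dict of canonical id -> list of aliases.
    """
    for canonical, aliases in registry.items():
        if name == canonical or name in aliases:
            return canonical
    # No record in the registry: use the raw name as-is so that state
    # created before the registry existed still round-trips.
    return name


registry = {"p05-interferometer": ["p05", "interferometer"]}
print(resolve_name_sketch("p05", registry))
print(resolve_name_sketch("orphan-project", registry))
```

Applying the same resolution on every read and write path is what lets the alias tests above see a single row regardless of which name the caller used.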
|
||||
|
||||
def test_legacy_alias_keyed_state_is_invisible_until_migrated(project_registry):
|
||||
"""Documents the compatibility gap from project-identity-canonicalization.md.
|
||||
|
||||
Rows that were written under a registered alias BEFORE the
|
||||
canonicalization landed in fb6298a are stored in the projects
|
||||
table under the alias name (not the canonical id). Every read
|
||||
path now canonicalizes to the canonical id, so those legacy
|
||||
rows become invisible.
|
||||
|
||||
This test simulates the legacy state by inserting a shadow
|
||||
project row and a state row that points at it via raw SQL,
|
||||
bypassing set_state() which now canonicalizes. Then it
|
||||
verifies the canonicalized get_state() does NOT find the
|
||||
legacy row.
|
||||
|
||||
When the legacy alias migration script lands (see the open
|
||||
follow-ups in docs/architecture/project-identity-canonicalization.md),
|
||||
this test must be inverted: after running the migration the
|
||||
legacy state should be reachable via the canonical project,
|
||||
not invisible. The migration is required before engineering
|
||||
V1 ships.
|
||||
"""
|
||||
import uuid
|
||||
|
||||
from atocore.models.database import get_connection
|
||||
|
||||
project_registry(("p05-interferometer", ["p05", "interferometer"]))
|
||||
|
||||
# Simulate a pre-fix legacy row by writing directly under the
|
||||
# alias name. This is what the OLD set_state would have done
|
||||
# before fb6298a added canonicalization.
|
||||
legacy_project_id = str(uuid.uuid4())
|
||||
legacy_state_id = str(uuid.uuid4())
|
||||
with get_connection() as conn:
|
||||
conn.execute(
|
||||
"INSERT INTO projects (id, name, description) VALUES (?, ?, ?)",
|
||||
(legacy_project_id, "p05", "shadow row created before canonicalization"),
|
||||
)
|
||||
conn.execute(
|
||||
"INSERT INTO project_state "
|
||||
"(id, project_id, category, key, value, source, confidence) "
|
||||
"VALUES (?, ?, ?, ?, ?, ?, ?)",
|
||||
(
|
||||
legacy_state_id,
|
||||
legacy_project_id,
|
||||
"status",
|
||||
"legacy_focus",
|
||||
"Wave 1 ingestion",
|
||||
"pre-canonicalization",
|
||||
1.0,
|
||||
),
|
||||
)
|
||||
|
||||
# The canonicalized read path looks under "p05-interferometer"
|
||||
# and cannot see the legacy row. THIS IS THE GAP.
|
||||
via_alias = get_state("p05")
|
||||
via_canonical = get_state("p05-interferometer")
|
||||
assert all(entry.value != "Wave 1 ingestion" for entry in via_alias)
|
||||
assert all(entry.value != "Wave 1 ingestion" for entry in via_canonical)
|
||||
|
||||
# The legacy row is still in the database — it's just unreachable
|
||||
# from the canonicalized read path. The migration script (open
|
||||
# follow-up) is what closes the gap.
|
||||
with get_connection() as conn:
|
||||
row = conn.execute(
|
||||
"SELECT value FROM project_state WHERE id = ?", (legacy_state_id,)
|
||||
).fetchone()
|
||||
assert row is not None
|
||||
assert row["value"] == "Wave 1 ingestion"
|
||||
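The gap documented by the test above, and the effect of rekeying, can be shown with a self-contained sketch. The two-table schema, the ids `shadow` and `canon`, and the single `UPDATE` are simplifying assumptions; the actual migration also detects collisions and refuses to run on integrity errors, which this sketch does not attempt.

```python
import sqlite3


def demo_legacy_gap():
    """Read state via the canonical name before and after a naive rekey."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical, trimmed-down schema for illustration only.
        conn.executescript(
            """
            CREATE TABLE projects (id TEXT PRIMARY KEY, name TEXT UNIQUE);
            CREATE TABLE project_state (
                id INTEGER PRIMARY KEY,
                project_id TEXT REFERENCES projects(id),
                key TEXT,
                value TEXT
            );
            """
        )
        conn.execute("INSERT INTO projects VALUES ('shadow', 'p05')")
        conn.execute("INSERT INTO projects VALUES ('canon', 'p05-interferometer')")
        conn.execute(
            "INSERT INTO project_state (project_id, key, value) "
            "VALUES ('shadow', 'legacy_focus', 'Wave 1 ingestion')"
        )

        def read(name):
            rows = conn.execute(
                "SELECT s.value FROM project_state s "
                "JOIN projects p ON p.id = s.project_id WHERE p.name = ?",
                (name,),
            ).fetchall()
            return [row[0] for row in rows]

        # Canonicalized reads look under the canonical name and miss the row.
        before = read("p05-interferometer")
        # Naive rekey: point the shadow row's state at the canonical project.
        conn.execute(
            "UPDATE project_state SET project_id = 'canon' "
            "WHERE project_id = 'shadow'"
        )
        after = read("p05-interferometer")
        return before, after
    finally:
        conn.close()


before, after = demo_legacy_gap()
print(before, after)
```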
|
||||
@@ -6,6 +6,8 @@ from atocore.interactions.service import record_interaction
|
||||
from atocore.main import app
|
||||
from atocore.memory.reinforcement import (
|
||||
DEFAULT_CONFIDENCE_DELTA,
|
||||
_stem,
|
||||
_tokenize,
|
||||
reinforce_from_interaction,
|
||||
)
|
||||
from atocore.memory.service import (
|
||||
@@ -314,3 +316,177 @@ def test_api_post_interactions_accepts_reinforce_false(tmp_data_dir):
|
||||
reloaded = [m for m in get_memories(memory_type="preference", limit=20) if m.id == mem.id][0]
|
||||
assert reloaded.confidence == 0.5
|
||||
assert reloaded.reference_count == 0
|
||||
|
||||
|
||||
# --- alias canonicalization end-to-end -------------------------------------
|
||||
|
||||
|
||||
def test_reinforcement_works_when_capture_uses_alias(project_registry):
|
||||
"""End-to-end: capture under an alias, seed memory under canonical id,
|
||||
verify reinforcement still finds and bumps the memory.
|
||||
|
||||
Regression for codex's P2 finding: previously interaction.project
|
||||
was stored verbatim and reinforcement queried memories using that
|
||||
raw value, so capturing under "p05" while memories live under
|
||||
"p05-interferometer" silently missed everything.
|
||||
"""
|
||||
init_db()
|
||||
project_registry(("p05-interferometer", ["p05", "interferometer"]))
|
||||
|
||||
# Seed an active memory under the CANONICAL id
|
||||
mem = create_memory(
|
||||
memory_type="project",
|
||||
content="the lateral support pads use GF-PTFE for thermal stability",
|
||||
project="p05-interferometer",
|
||||
confidence=0.5,
|
||||
)
|
||||
|
||||
# Capture an interaction under the ALIAS — this is the bug case
|
||||
record_interaction(
|
||||
prompt="status update",
|
||||
response=(
|
||||
"Quick note: the lateral support pads use GF-PTFE for thermal "
|
||||
"stability and that's still the current selection."
|
||||
),
|
||||
project="p05",
|
||||
)
|
||||
|
||||
# The seeded memory should have been reinforced
|
||||
reloaded = [
|
||||
m
|
||||
for m in get_memories(memory_type="project", project="p05-interferometer", limit=20)
|
||||
if m.id == mem.id
|
||||
][0]
|
||||
assert reloaded.confidence > 0.5
|
||||
assert reloaded.reference_count == 1
|
||||
|
||||
|
||||
def test_get_memories_filter_by_alias(project_registry):
|
||||
"""Filtering memories by an alias should find rows stored under canonical."""
|
||||
init_db()
|
||||
project_registry(("p04-gigabit", ["p04", "gigabit"]))
|
||||
|
||||
create_memory(memory_type="project", content="m1", project="p04-gigabit")
|
||||
create_memory(memory_type="project", content="m2", project="gigabit")
|
||||
|
||||
via_alias = get_memories(memory_type="project", project="p04")
|
||||
via_canonical = get_memories(memory_type="project", project="p04-gigabit")
|
||||
|
||||
assert len(via_alias) == 2
|
||||
assert len(via_canonical) == 2
|
||||
assert {m.content for m in via_alias} == {"m1", "m2"}
|
||||
|
||||
|
||||
# --- token-overlap matcher: unit tests -------------------------------------
|
||||
|
||||
|
||||
def test_stem_folds_s_ed_ing():
|
||||
assert _stem("prefers") == "prefer"
|
||||
assert _stem("preferred") == "prefer"
|
||||
assert _stem("services") == "service"
|
||||
assert _stem("processing") == "process"
|
||||
# Short words must not be over-stripped
|
||||
assert _stem("red") == "red" # 3 chars, don't strip "ed"
|
||||
assert _stem("bus") == "bus" # 3 chars, don't strip "s"
|
||||
assert _stem("sing") == "sing" # 4 chars, don't strip "ing"
|
||||
assert _stem("being") == "being" # 5 chars, "ing" strip leaves "be" (2) — too short
|
||||
|
||||
|
||||
def test_tokenize_removes_stop_words():
|
||||
tokens = _tokenize("the quick brown fox jumps over the lazy dog")
|
||||
assert "the" not in tokens
|
||||
assert "quick" in tokens
|
||||
assert "brown" in tokens
|
||||
assert "fox" in tokens
|
||||
assert "dog" in tokens
|
||||
# "over" has len 4, not a stop word → kept (stemmed: "over")
|
||||
assert "over" in tokens
|
||||
|
||||
|
||||
# --- token-overlap matcher: paraphrase matching ----------------------------
|
||||
|
||||
|
||||
def test_reinforce_matches_paraphrase_prefers_vs_prefer(tmp_data_dir):
|
||||
"""The canonical rebase case from phase9-first-real-use.md."""
|
||||
init_db()
|
||||
mem = create_memory(
|
||||
memory_type="preference",
|
||||
content="prefers rebase-based workflows because history stays linear",
|
||||
confidence=0.5,
|
||||
)
|
||||
interaction = _make_interaction(
|
||||
response=(
|
||||
"I prefer rebase-based workflows because the history stays "
|
||||
"linear and reviewers have an easier time."
|
||||
),
|
||||
)
|
||||
results = reinforce_from_interaction(interaction)
|
||||
assert any(r.memory_id == mem.id for r in results)
|
||||
|
||||
|
||||
def test_reinforce_matches_paraphrase_with_articles_and_ed(tmp_data_dir):
|
||||
init_db()
|
||||
mem = create_memory(
|
||||
memory_type="preference",
|
||||
content="preferred structured logging across all backend services",
|
||||
confidence=0.5,
|
||||
)
|
||||
interaction = _make_interaction(
|
||||
response=(
|
||||
"I set up structured logging across all the backend services, "
|
||||
"which the team prefers for consistency."
|
||||
),
|
||||
)
|
||||
results = reinforce_from_interaction(interaction)
|
||||
assert any(r.memory_id == mem.id for r in results)
|
||||
|
||||
|
||||
def test_reinforce_rejects_low_overlap(tmp_data_dir):
|
||||
init_db()
|
||||
mem = create_memory(
|
||||
memory_type="preference",
|
||||
content="always uses Python for data processing scripts",
|
||||
confidence=0.5,
|
||||
)
|
||||
interaction = _make_interaction(
|
||||
response=(
|
||||
"The CI pipeline runs on Node.js and deploys to Kubernetes "
|
||||
"using Helm charts."
|
||||
),
|
||||
)
|
||||
results = reinforce_from_interaction(interaction)
|
||||
assert all(r.memory_id != mem.id for r in results)
|
||||
|
||||
|
||||
def test_reinforce_matches_at_70_percent_threshold(tmp_data_dir):
|
||||
"""Exactly 7 of 10 content tokens present → should match."""
|
||||
init_db()
|
||||
# After stop-word removal and stemming, this has 10 tokens:
|
||||
# alpha, bravo, charlie, delta, echo, foxtrot, golf, hotel, india, juliet
|
||||
mem = create_memory(
|
||||
memory_type="preference",
|
||||
content="alpha bravo charlie delta echo foxtrot golf hotel india juliet",
|
||||
confidence=0.5,
|
||||
)
|
||||
# Echo 7 of 10 tokens (70%) plus some noise
|
||||
interaction = _make_interaction(
|
||||
response="alpha bravo charlie delta echo foxtrot golf noise words here",
|
||||
)
|
||||
results = reinforce_from_interaction(interaction)
|
||||
assert any(r.memory_id == mem.id for r in results)
|
||||
|
||||
|
||||
def test_reinforce_rejects_below_70_percent(tmp_data_dir):
|
||||
"""Only 6 of 10 content tokens present (60%) → should NOT match."""
|
||||
init_db()
|
||||
mem = create_memory(
|
||||
memory_type="preference",
|
||||
content="alpha bravo charlie delta echo foxtrot golf hotel india juliet",
|
||||
confidence=0.5,
|
||||
)
|
||||
# Echo 6 of 10 tokens (60%) plus noise
|
||||
interaction = _make_interaction(
|
||||
response="alpha bravo charlie delta echo foxtrot noise words here only",
|
||||
)
|
||||
results = reinforce_from_interaction(interaction)
|
||||
assert all(r.memory_id != mem.id for r in results)
|
||||
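The two threshold tests above pin down a 70% boundary on token overlap. A minimal sketch of such a check follows, assuming the ratio is "memory content tokens found in the response" over "memory content tokens". `content_tokens`, `overlaps_enough`, and the stop-word list are hypothetical stand-ins and do not reproduce the `_tokenize` or `_stem` helpers in `atocore.memory.reinforcement`.

```python
STOP_WORDS = frozenset({"the", "a", "an", "and", "of", "to"})  # hypothetical list


def content_tokens(text):
    """Lowercase, split on whitespace, and drop stop words (no stemming here)."""
    return {word for word in text.lower().split() if word not in STOP_WORDS}


def overlaps_enough(memory_text, response_text, threshold_percent=70):
    """True when at least threshold_percent of the memory's tokens appear."""
    memory = content_tokens(memory_text)
    if not memory:
        return False
    matched = len(memory & content_tokens(response_text))
    # Integer arithmetic keeps the exact-boundary case (7 of 10) free of
    # floating-point rounding concerns.
    return matched * 100 >= threshold_percent * len(memory)


memory = "alpha bravo charlie delta echo foxtrot golf hotel india juliet"
seven = "alpha bravo charlie delta echo foxtrot golf noise words here"
six = "alpha bravo charlie delta echo foxtrot noise words here only"
print(overlaps_enough(memory, seven), overlaps_enough(memory, six))
```

Normalizing by the memory's token count rather than the response's means a long response with extra noise words can still match a short memory, which is the behaviour the "plus some noise" cases above exercise.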
|
||||