Dalidou Claude's validation run against the live service exposed a
structural gap: the deployment at /srv/storage/atocore/app has no
git connection, the running container was built from pre-Phase-9
source, and /health hardcoded 'version: 0.1.0' so drift is
invisible. Weeks of work have been shipping to Gitea but never
reaching the live service.
This commit fixes both the drift-invisibility problem and the
absence of an update workflow, so the next deploy to Dalidou can
go live cleanly and future drifts surface immediately.
Layer 1: deployment drift is now visible via /health
----------------------------------------------------
- src/atocore/__init__.py: __version__ bumped from 0.1.0 to 0.2.0
and documented as the source of truth for the deployed code
version, with a history block explaining when each bump happens
(API surface change, schema change, user-visible behavior change)
- src/atocore/main.py: FastAPI constructor now uses __version__
instead of the hardcoded '0.1.0' string, so the OpenAPI docs
reflect the actual code version
- src/atocore/api/routes.py: /health now reads from __version__
dynamically. Both the existing 'version' field and a new
'code_version' field report the same value for backwards compat.
A new docstring explains that comparing this to the main
branch's __version__ is the fastest way to detect drift.
- pyproject.toml: version bumped to 0.2.0 to stay in sync
The comparison is now:
curl /health -> "code_version": "0.2.0"
grep __version__ src/atocore/__init__.py -> "0.2.0"
If those differ, the deployment is stale. Concrete, unambiguous.
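A minimal sketch of the wiring, assuming the module layout named above
(the route decorator and response shape may differ in the real code):

    # src/atocore/__init__.py
    __version__ = "0.2.0"  # source of truth for the deployed code version

    # src/atocore/api/routes.py (illustrative)
    from fastapi import APIRouter
    from atocore import __version__

    router = APIRouter()

    @router.get("/health")
    async def health():
        # comparing code_version to main's __version__ detects drift
        return {
            "status": "ok",
            "version": __version__,       # kept for backwards compat
            "code_version": __version__,  # new field, same value
        }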
Layer 2: deploy.sh as the canonical update path
-----------------------------------------------
New file: deploy/dalidou/deploy.sh
One-shot bash script that handles both the first-time deploy
(where /srv/storage/atocore/app may not be a git repo yet) and
the ongoing update case. Steps:
1. If app dir is not a git checkout, back it up as
<dir>.pre-git-<utc-stamp> and re-clone from Gitea.
If it IS a checkout, fetch + reset --hard origin/<branch>.
2. Report the deployable commit SHA
3. Check that deploy/dalidou/.env exists (hard fail if missing
with a clear message pointing at .env.example)
4. docker compose up -d --build — rebuilds the image from
current source, restarts the container
5. Poll /health for up to 30 seconds; on failure, print the
last 50 lines of container logs and exit non-zero
6. Parse /health.code_version and compare to the __version__
in the freshly-pulled source. If they differ, exit non-zero
with a message suggesting docker compose down && up
7. On success, report commit + code_version + "health: ok"
Configurable via env vars:
- ATOCORE_APP_DIR (default /srv/storage/atocore/app)
- ATOCORE_GIT_REMOTE (default http://dalidou:3000/Antoine/ATOCore.git)
- ATOCORE_BRANCH (default main)
- ATOCORE_HEALTH_URL (default http://127.0.0.1:8100/health)
- ATOCORE_DEPLOY_DRY_RUN=1 for preview-only mode
Explicit non-goals documented in the script header:
- does not manage secrets (.env is the caller's responsibility)
- does not take a pre-deploy backup (call /admin/backup first
if you want one)
- does not roll back on failure (redeploy a known-good commit
to recover)
- does not touch the DB directly — schema migrations run at
service startup via the lifespan handler, and all existing
_apply_migrations ALTERs are idempotent ADD COLUMN operations
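For illustration, here is the health-poll and drift-check logic
(steps 5-6) rendered in Python; the actual script is bash, and names
like the health URL mirror the env vars above:

    import json, re, time, urllib.request

    def poll_and_check(health_url, init_py_path, timeout=30):
        # step 5: poll /health until the service answers or we time out
        deadline = time.time() + timeout
        payload = None
        while time.time() < deadline:
            try:
                with urllib.request.urlopen(health_url, timeout=2) as resp:
                    payload = json.load(resp)
                break
            except OSError:
                time.sleep(1)
        if payload is None:
            raise SystemExit("health check failed")
        # step 6: compare the running code_version to the pulled source
        source = open(init_py_path).read()
        wanted = re.search(r'__version__\s*=\s*"([^"]+)"', source).group(1)
        if payload.get("code_version") != wanted:
            raise SystemExit(
                f"drift: running {payload.get('code_version')}, source {wanted}")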
Layer 3: updated docs/dalidou-deployment.md
-------------------------------------------
- First-time deployment steps now explicitly say "git clone", not
"place the repository", so future first-time deploys don't end
up as static snapshots again
- New "Updating a running deployment" section covering deploy.sh
usage with all three modes (normal / branch override / dry-run)
- New "Deployment drift detection" section with the one-liner
comparison between /health code_version and the repo's
__version__
- New "Schema migrations on redeploy" section enumerating the
exact ALTER TABLE statements that run on a pre-0.2.0 -> 0.2.0
upgrade, confirming they are additive-only and safe, and
recommending a backup via /admin/backup before any redeploy
Full suite: 215 passing, 1 warning. No test was hardcoded to the
old version string, so the version bump was safe without test
changes.
What this commit does NOT do
----------------------------
- Does NOT execute the deploy on the live Dalidou instance. That
requires Dalidou access and is the next step. A ready-to-paste
prompt for Dalidou Claude will be provided separately.
- Does NOT add CI/CD, webhook-based auto-deploy, or reverse
proxy. Those remain in the 'deferred' section of the
deployment doc.
- Does NOT change the Dockerfile. The existing 'COPY source at
build time' pattern is what deploy.sh relies on — rebuilding
the image picks up new code.
- Does NOT modify the database schema. The Phase 9 migrations
that Dalidou's DB needs will be applied automatically on next
service startup via the existing _apply_migrations path.
Step 3 of Codex's sequence: finish the Phase 9 operator surface in the
shared client. The previous client version (0.1.0) covered stable
operations (project lifecycle, retrieval, context build, trusted
state, audit-query) but explicitly deferred capture/extract/queue/
promote/reject pending "exercised workflow". That deferral ran
into a bootstrap problem: real Claude Code sessions can't exercise
the Phase 9 loop without a usable client surface to drive it. This
commit ships the 8 missing subcommands so the next step (real
validation on Dalidou) is unblocked.
Bumps CLIENT_VERSION from 0.1.0 to 0.2.0 per the semver rules in
llm-client-integration.md (new subcommands = minor bump).
New subcommands in scripts/atocore_client.py
--------------------------------------------
| Subcommand | Endpoint |
|-----------------------|-------------------------------------------|
| capture | POST /interactions |
| extract | POST /interactions/{id}/extract |
| reinforce-interaction | POST /interactions/{id}/reinforce |
| list-interactions | GET /interactions |
| get-interaction | GET /interactions/{id} |
| queue | GET /memory?status=candidate |
| promote | POST /memory/{id}/promote |
| reject | POST /memory/{id}/reject |
Each follows the existing client style: positional arguments with
empty-string defaults for optional filters, truthy-string arguments
for booleans (matching the existing refresh-project pattern), JSON
output via print_json(), fail-open behavior inherited from
request().
capture accepts prompt + response + project + client + session_id +
reinforce as positionals, defaulting the client field to
"atocore-client" when omitted so every capture from the shared
client is identifiable in the interactions audit trail.
extract defaults to preview mode (persist=false). Pass "true" as
the second positional to create candidate memories.
list-interactions and queue build URL query strings with
url-encoded values and always include the limit, matching how the
existing context-build subcommand handles its parameters.
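A sketch of that construction, with parameter names assumed from the
filters described above:

    from urllib.parse import urlencode

    def interactions_query(project="", session_id="", limit=50):
        # empty-string positionals mean "no filter"; limit is always sent
        params = {"limit": int(limit)}
        if project:
            params["project"] = project
        if session_id:
            params["session_id"] = session_id
        return "/interactions?" + urlencode(params)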
Security fix: ID-field URL encoding
-----------------------------------
The initial draft used urllib.parse.quote() with the default safe
set, which does NOT encode "/" because it's a reserved path
character. That's a security footgun on ID fields: passing
"promote mem/evil/action" would build /memory/mem/evil/action/promote
and hit a completely different endpoint than intended.
Fixed by passing safe="" to urllib.parse.quote() on every ID field
(interaction_id and memory_id). The tests cover this explicitly via
test_extract_url_encodes_interaction_id and test_promote_url_encodes_memory_id,
both of which would have failed with the default behavior.
Project names keep the default quote behavior because a project
name with a slash would already be broken elsewhere in the system
(ingest root resolution, file paths, etc).
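The difference is easy to see in a REPL:

    >>> from urllib.parse import quote
    >>> quote("mem/evil/action")          # default: "/" is in the safe set
    'mem/evil/action'
    >>> quote("mem/evil/action", safe="")
    'mem%2Fevil%2Faction'

With the default, the built path is /memory/mem/evil/action/promote,
a different endpoint; with safe="" the id stays one path segment.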
tests/test_atocore_client.py (new, 18 tests, all green)
-------------------------------------------------------
A dedicated test file for the shared client that mocks the
request() helper and verifies each subcommand:
- calls the correct HTTP method and path
- builds the correct JSON body (or query string)
- passes the right subset of CLI arguments through
- URL-encodes ID fields so path traversal isn't possible
Tests are structured as unit tests (not integration tests) because
the API surface on the server side already has its own route tests
in test_api_storage.py and the Phase 9 specific files. These tests
are the wiring contract between CLI args and HTTP calls.
Test file highlights:
- capture: default values, custom client, reinforce=false
- extract: preview by default, persist=true opt-in, URL encoding
- reinforce-interaction: correct path construction
- list-interactions: no filters, single filter, full filter set
  (including an ISO 8601 `since` value with T separator and Z suffix)
- get-interaction: fetch by id
- queue: always filters status=candidate, accepts memory_type
and project, coerces limit to int
- promote / reject: correct path + URL encoding
- test_phase9_full_loop_via_client_shape: end-to-end sequence
that drives capture -> extract preview -> extract persist ->
queue list -> promote -> reject through the shared client and
verifies the exact sequence of HTTP calls that would be made
These tests run in ~0.2s because they mock request() — no DB, no
Chroma, no HTTP. The fast feedback loop matters because the
client surface is what every agent integration eventually depends
on.
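A sketch of the mocking pattern, assuming a module-level
request(method, path, ...) helper and a main() entry point (the real
signatures may differ):

    from unittest import mock

    import atocore_client  # scripts/atocore_client.py on sys.path

    def test_promote_url_encodes_memory_id_sketch():
        with mock.patch.object(atocore_client, "request") as req:
            req.return_value = {"status": "ok"}
            atocore_client.main(["promote", "mem/evil/action"])
        method, path = req.call_args.args[:2]
        assert method == "POST"
        assert path == "/memory/mem%2Fevil%2Faction/promote"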
docs/architecture/llm-client-integration.md updates
---------------------------------------------------
- New "Phase 9 reflection loop (shipped after migration safety
work)" section under "What's in scope for the shared client
today" with the full 8-subcommand table and a note explaining
the bootstrap-problem rationale
- Removed the "Memory review queue and reflection loop" section
from "What's intentionally NOT in scope today"; backup admin
and engineering-entity commands remain the only deferred
families
- Renumbered the deferred-commands list (was 3 items, now 2)
- Open follow-ups updated: memory-review-subcommand item replaced
with "real-usage validation of the Phase 9 loop" as the next
concrete dependency
- TL;DR updated to list the reflection-loop subcommands
- Versioning note records the v0.1.0 -> v0.2.0 bump with the
subcommands included
Full suite: 215 passing (was 197), 1 warning. The +18 is
tests/test_atocore_client.py. Runtime unchanged because the new
tests don't touch the DB.
What this commit does NOT do
----------------------------
- Does NOT change the server-side endpoints. All 8 subcommands
call existing API routes that were shipped in Phase 9 Commits
A/B/C. This is purely a client-side wiring commit.
- Does NOT run the reflection loop against the live Dalidou
instance. That's the next concrete step and is explicitly
called out in the open-follow-ups section of the updated doc.
- Does NOT modify the Claude Code slash command. It still pulls
context only; the capture/extract/queue/promote companion
commands (e.g. /atocore-record-response) are deferred until the
capture workflow has been exercised in real use at least once.
- Does NOT refactor the OpenClaw helper. That's a cross-repo
change and remains a queued follow-up, now unblocked by the
shared client having the reflection-loop subcommands.
Codex caught a real documentation accuracy bug in the previous
canonicalization doc commit (f521aab). The doc claimed that rows
written under aliases before fb6298a "still work via the
unregistered-name fallback path" — that is wrong for REGISTERED
aliases, which is exactly the case that matters.
The unregistered-name fallback only saves you when the project was
never in the registry: a row stored under "orphan-project" is read
back via "orphan-project", both pass through resolve_project_name
unchanged, and the strings line up. For a registered alias like
"p05", the helper rewrites the read key to "p05-interferometer"
but does NOT rewrite the storage key, so the legacy row becomes
silently invisible.
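In miniature, with the table and helper behavior described above (the
query shape is illustrative):

    # legacy row, written before canonicalization existed:
    #   project_state(project="p05", key="next_focus", ...)

    def get_state(project, key):
        project = resolve_project_name(project)  # "p05" -> "p05-interferometer"
        return db.fetch_one(
            "SELECT * FROM project_state WHERE project = ? AND key = ?",
            (project, key),
        )

    get_state("p05", "next_focus")                 # reads key "p05-interferometer": miss
    get_state("p05-interferometer", "next_focus")  # same rewritten key: miss
    # the legacy row is still in the table, keyed "p05" -- unreachable either way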
This commit corrects the doc and locks the gap behavior in with
a regression test, so the issue cannot be lost again.
docs/architecture/project-identity-canonicalization.md
------------------------------------------------------
- Removed the misleading claim from the "What this rule does NOT
cover" section. Replaced with a pointer to the new gap section
and an explicit statement that the migration is required before
engineering V1 ships.
- New "Compatibility gap: legacy alias-keyed rows" section between
"Why this is the trust hierarchy in action" and "The rule for
new entry points". This is the natural insertion point because
the gap is exactly the trust hierarchy failing for legacy data.
The section covers:
* a worked T0/T1 timeline showing the exact failure mode
* what is at risk on the live Dalidou DB, ranked by trust tier:
projects table (shadow rows), project_state (highest risk
because Layer 3 is most-authoritative), memories, interactions
* inspection SQL queries for measuring the actual blast radius
on the live DB before running any migration
* the spec for the migration script: walk projects, find shadow
rows, merge dependent state via the conflict model when there
are collisions, dry-run mode, idempotent
* explicit statement that this is required pre-V1 because V1
will add new project-keyed tables and the killer correctness
queries from engineering-query-catalog.md would report wrong
results against any project that has shadow rows
- "Open follow-ups" item 1 promoted from "tracked optional" to
"REQUIRED before engineering V1 ships, NOT optional" with a
more honest cost estimate (~150 LOC migration + ~50 LOC tests
+ supervised live run, not the previous optimistic ~30 LOC)
- TL;DR rewritten to mention the gap explicitly and re-order
the open follow-ups so the migration is the top priority
tests/test_project_state.py
---------------------------
- New test_legacy_alias_keyed_state_is_invisible_until_migrated
- Inserts a "p05" project row + a project_state row pointing at
it via raw SQL (bypassing set_state which now canonicalizes),
simulating a pre-fix legacy row
- Verifies the canonicalized get_state path can NOT see the row
via either the alias or the canonical id — this is the bug
- Verifies the row is still in the database (just unreachable),
so the migration script has something to find
- The docstring explicitly says: "When the legacy alias migration
script lands, this test must be inverted." Future readers will
know exactly when and how to update it.
Full suite: 175 passing (was 174), 1 warning. The +1 is the new
gap regression test.
What this commit does NOT do
----------------------------
- The migration script itself is NOT in this commit. Codex's
finding was a doc accuracy issue, and the right scope is fix
the doc + lock the gap behavior in. Writing the migration is
the next concrete step but is bigger (~200 LOC + dry-run mode
+ collision handling via the conflict model + supervised run
on the live Dalidou DB), warrants its own commit, and probably
warrants a "draft + review the dry-run output before applying"
workflow rather than a single shot.
- Existing tests are unchanged. The new test stands alone as a
documented gap; the 12 canonicalization tests from fb6298a
still pass without modification.
Codifies the helper-at-every-service-boundary rule that fb6298a
implemented across the eight current callsites. The contract is
intentionally simple but easy to forget, so it lives in its own
doc that the engineering layer V1 implementation sprint can read
before adding new project-keyed entity surfaces.
docs/architecture/project-identity-canonicalization.md
------------------------------------------------------
- The contract: every read/write that takes a project name MUST
call resolve_project_name() before the value crosses a service
boundary; canonicalization happens once, at the first statement
after input validation, never later
- The helper API: resolve_project_name(name) returns the canonical
project_id for registered names, the input unchanged for empty
or unregistered names (the second case is the backwards-compat
path for hand-curated state predating the registry)
- Full table of the 8 current callsites: builder.build_context,
project_state.set_state/get_state/invalidate_state,
interactions.record_interaction/list_interactions,
memory.create_memory/get_memories
- Where the helper is intentionally NOT called and why: legacy
ensure_project lookup, retriever's own _project_match_boost
(which already calls get_registered_project), _rank_chunks
secondary substring boost (multiplicative not filter, can't
drop relevant chunks), update_memory (no project field update),
  unregistered names (applying the rule to a name with no record is a no-op)
- Why this is the trust hierarchy in action: Layer 3 trusted
state has to be findable to win the trust battle; an
un-canonicalized lookup silently makes Layer 3 invisible and
the system falls through to lower-trust retrieved chunks with
no signal to the human
- The 4-step rule for new entry points: identify project-keyed
reads/writes, place the call as the first statement after
validation, add a regression test using the project_registry
fixture, verify None/empty paths
- How the project_registry fixture works with a copy-pasteable
example
- What the rule does NOT cover: alias creation (registry's own
write path), registry hot-reloading (no in-process cache by
design), cross-project dedup (collision detection at
registration), time-bounded canonicalization (canonical id is
stable forever), legacy data migration (open follow-up)
- Engineering layer V1 implications: every new service entry
point in the entities/relationships/conflicts/mirror modules
must apply the helper at the first statement after validation;
treated as code review failure if missing
- Open follow-ups: legacy data migration script (~30 LOC),
registry file caching when projects scale beyond ~50, case
sensitivity audit when entity-side storage lands, _rank_chunks
cleanup, documentation discoverability (intentional redundancy
between this doc, the helper docstring, and per-callsite comments)
- Quick reference card: copy-pasteable template for new service
functions
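As a sketch of what that template amounts to (names illustrative, not
the actual implementation):

    def set_state(project: str, key: str, value: str) -> None:
        if not project or not key:
            raise ValueError("project and key are required")
        # canonicalize once, at the first statement after input
        # validation, before the value crosses a service boundary
        project = resolve_project_name(project)
        _write_state_row(project, key, value)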
master-plan-status.md updated
-----------------------------
- New doc added to the engineering-layer planning sprint listing
- Marked as required reading before V1 implementation begins
- Note that V1 must apply the contract at every new service-layer
entry point
Pure doc work, no code changes. Full suite stays at 174 passing
because no source changed.
Codex's review caught that the Claude Code slash command shipped in
Session 2 was a parallel reimplementation of routing logic the
existing scripts/atocore_client.py already had. That client was
introduced via the codex/port-atocore-ops-client merge and is
already a comprehensive operator client (auto-context,
detect-project, refresh-project, project-state, audit-query, etc.).
The slash command should have been a thin wrapper from the start.
This commit fixes the shape without expanding scope.
.claude/commands/atocore-context.md
-----------------------------------
Rewritten as a thin Claude Code-specific frontend that shells out
to the shared client:
- explicit project hint -> calls `python scripts/atocore_client.py
context-build "<prompt>" "<project>"`
- no explicit hint -> calls `python scripts/atocore_client.py
auto-context "<prompt>"` which runs the client's detect-project
routing first and falls through to context-build with the match
Inherits the client's stable behavior for free:
- ATOCORE_BASE_URL env var (default http://dalidou:8100)
- fail-open on network errors via ATOCORE_FAIL_OPEN
- consistent JSON output shape
- the same project alias matching the OpenClaw helper uses
Removes the speculative `--capture` path that was in the
original draft. Capture/extract/queue/promote/reject are
intentionally NOT in the shared client yet (memory-review
workflow not exercised in real use), so the slash command can't
expose them either.
docs/architecture/llm-client-integration.md
-------------------------------------------
New planning doc that defines the layering rule for AtoCore's
relationship with LLM client contexts:
Three layers:
1. AtoCore HTTP API (universal, src/atocore/api/routes.py)
2. Shared operator client (scripts/atocore_client.py) — the
canonical Python backbone for stable AtoCore operations
3. Per-agent thin frontends (Claude Code slash command,
OpenClaw helper, future Codex skill, future MCP server)
that shell out to the shared client
Three non-negotiable rules:
- every per-agent frontend is a thin wrapper (translate the
agent's command format and render the JSON; nothing else)
- the shared client never duplicates the API (it composes
endpoints; new logic goes in the API first)
- the shared client only exposes stable operations (subcommands
land only after the API has been exercised in a real workflow)
Doc covers:
- the full table of subcommands currently in scope (project
lifecycle, ingestion, project-state, retrieval, context build,
audit-query, debug-context, health/stats)
- the three deferred families with rationale: memory review
queue (workflow not exercised), backup admin (fail-open
default would hide errors), engineering layer entities (V1
not yet implemented)
- the integration recipe for new agent platforms
- explicit acknowledgement that the OpenClaw helper currently
duplicates routing logic and that the refactor to the shared
client is a queued cross-repo follow-up
- how the layering connects to Phase 8 (OpenClaw) and Phase 11
  (multi-model)
- versioning and stability rules for the shared client surface
- open follow-ups: OpenClaw refactor, memory-review subcommands
when ready, optional backup admin subcommands, engineering
entity subcommands during V1 implementation
master-plan-status.md updated
-----------------------------
- New "LLM Client Integration" subsection that points to the
layering doc and explicitly notes the deferral of memory-review
and engineering-entity subcommands
- Frames the layering as sitting between Phase 8 and Phase 11
Scope is intentionally narrow per Codex's framing: promote the
existing client to canonical status, refactor the slash command
to use it, document the layering. No new client subcommands
added in this commit. The OpenClaw helper refactor is a
separate cross-repo follow-up. Memory-review and engineering-
entity work stay deferred.
Full suite: 160 passing, no behavior changes.
Session 4 of the four-session plan. Final two engineering planning
docs, plus master-plan-status.md updated to reflect that the
engineering layer planning sprint is now complete.
docs/architecture/human-mirror-rules.md
---------------------------------------
The Layer 3 derived markdown view spec:
- The non-negotiable rule: the Mirror is read-only from the
human's perspective; edits go to the canonical home and the
Mirror picks them up on regeneration
- 3 V1 template families: Project Overview, Decision Log,
Subsystem Detail
- Explicit V1 exclusions: per-component pages, per-decision
pages, cross-project rollups, time-series pages, diff pages,
conflict queue render, per-memory pages
- Mirror files live in /srv/storage/atocore/data/mirror/ NOT in
the source vault (sources stay read-only per the operating
model)
- 3 regeneration triggers: explicit POST, debounced async on
entity write, daily scheduled refresh
- "Do not edit" header banner with checksum so unchanged inputs
skip work
- Conflicts and project_state overrides surface inline so the
trust hierarchy is visible in the human reading experience
- Templates checked in under templates/mirror/, edited via PR
- Deterministic output is a V1 requirement so future Mirror
diffing works without rework
- Open questions for V1: debounce window, scheduler integration,
template testing approach, directory listing endpoint, empty
state rendering
docs/architecture/engineering-v1-acceptance.md
----------------------------------------------
The measurable done definition:
- Single-sentence definition: V1 is done when every v1-required
query in engineering-query-catalog.md returns a correct result
for one chosen test project, the Human Mirror renders a
coherent overview, and a real KB-CAD or KB-FEM export round-
trips through ingest -> review queue -> active entity without
violating any conflict or trust invariant
- 23 acceptance criteria across 4 categories:
* Functional (8): entity store, all 20 v1-required queries,
tool ingest endpoints, candidate review queue, conflict
detection, Human Mirror, memory-to-entity graduation,
complete provenance chain
* Quality (6): existing tests pass, V1 has its own coverage,
conflict invariants enforced, trust hierarchy enforced,
Mirror reproducible via golden file, killer correctness
queries pass against representative data
* Operational (5): safe migration, backup/restore drill,
performance bounds, no new manual ops burden, Phase 9 not
regressed
* Documentation (4): per-entity-type spec docs, KB schema docs,
V1 release notes, master-plan-status updated
- Explicit negative list of things V1 does NOT need to do:
no LLM extractor, no auto-promotion, no write-back, no
multi-user, no real-time UI, no cross-project rollups,
no time-travel, no nightly conflict sweep, no incremental
Chroma, no retention cleanup, no encryption, no off-Dalidou
backup target
- Recommended implementation order: F-1 -> F-8 in sequence,
with the graduation flow (F-7) saved for last as the most
cross-cutting change
- Anticipated friction points called out in advance:
graduation cross-cuts memory module, Mirror determinism trap,
conflict detector subtle correctness, provenance backfill
for graduated entities
master-plan-status.md updated
-----------------------------
- Engineering Layer Planning Sprint section now marked complete
with all 8 architecture docs listed
- Note that the next concrete step is the V1 implementation
sprint following engineering-v1-acceptance.md as its checklist
Pure doc work. No code, no schema, no behavior changes.
After this commit, the engineering planning sprint is fully done
(8/8 docs) and Phase 9 is fully complete (Commits A/B/C all
shipped, validated, and pushed). AtoCore is ready for either
the engineering V1 implementation sprint OR a pause for real-
world Phase 9 usage, depending on which the user prefers next.
Session 3 of the four-session plan. Two more engineering planning
docs that lock in the most contentious architectural decisions
before V1 implementation begins.
docs/architecture/tool-handoff-boundaries.md
--------------------------------------------
Locks in the V1 read/write relationship with external tools:
- AtoCore is a one-way mirror in V1. External tools push,
AtoCore reads, AtoCore never writes back.
- Per-tool stance table covering KB-CAD, KB-FEM, NX, PKM, Gitea
repos, OpenClaw, AtoDrive, PLM/vendor systems
- Two new ingest endpoints proposed for V1:
POST /ingest/kb-cad/export and POST /ingest/kb-fem/export
- Sketch JSON shapes for both exports (intentionally minimal,
to be refined in dedicated schema docs during implementation)
- Drift handling: KB-CAD changes a value -> creates an entity
candidate -> existing active becomes a conflict member ->
human resolves via the conflict model
- Hard-line invariants V1 will not cross: no write to external
tools, no live polling, no silent merging, no schema fan-out,
no external-tool-specific logic in entity types
- Why not bidirectional: schema drift, conflict semantics, trust
hierarchy, velocity, reversibility
- V2+ deferred items: selective write-back annotations, light
polling, direct NX integration, cost/vendor/PLM connections
- Open questions for the implementation sprint: schema location,
who runs the exporter, full-vs-incremental, exporter auth
docs/architecture/representation-authority.md
---------------------------------------------
The canonical-home matrix that says where each kind of fact
actually lives:
- Six representation layers identified: PKM, KB project,
Gitea repos, AtoCore memories, AtoCore entities, AtoCore
project_state
- The hard rule: every fact kind has exactly one canonical
home; other layers may hold derived copies but never disagree
- Comprehensive matrix covering 22 fact kinds (CAD geometry,
CAD-side structure, FEM mesh, FEM results, code, repo docs,
PKM prose, identity, preference, episodic, decision,
requirement, constraint, validation claim, material,
parameter, project status, ADRs, runbooks, backup metadata,
interactions)
- Cross-layer supremacy rule: project_state > tool-of-origin >
entities > active memories > source chunks
- Three worked examples showing how the rules apply:
* "what material does the lateral support pad use?" (KB-CAD
canonical, project_state override possible)
* "did we decide to merge the bind mounts?" (Gitea + memory
both canonical for different aspects)
* "what's p05's current next focus?" (project_state always
wins for current state queries)
- Concrete consequences for V1 implementation: Material and
Parameter are mostly KB-CAD shadows; Decisions / Requirements /
Constraints / ValidationClaims are AtoCore-canonical; PKM is
never authoritative; project_state is the override layer;
the conflict model is the enforcement mechanism
- Out of scope for V1: facts about other people, vendor/cost
facts, time-bounded facts, cross-project shared facts
- Open questions for V1: how the reviewer sees canonical home
in the UI, whether entities need an explicit canonical_home
field, how project_state overrides surface in query results
This is pure doc work. No code, no schema, no behavior changes.
After this commit the engineering planning sprint is 6 of 8 docs
done — only human-mirror-rules and engineering-v1-acceptance
remain.
Session 2 of the four-session plan. Lands two operational pieces:
the Claude Code slash command that makes AtoCore reachable from
inside any Claude Code session, and the full backup/restore
procedure doc that turns the backup endpoint code into a real
operational drill.
Slash command (.claude/commands/atocore-context.md)
---------------------------------------------------
- Project-level slash command following the standard frontmatter
format (description + argument-hint)
- Parses the user prompt and an optional trailing project id, with
case-insensitive matching against the registered project ids
(atocore, p04-gigabit, p05-interferometer, p06-polisher and
their aliases)
- Calls POST /context/build on the live AtoCore service, defaulting
to http://dalidou:8100 (overridable via ATOCORE_API_BASE env var)
- Renders the formatted context pack inline so the user can see
exactly what AtoCore would feed an LLM, plus a stats banner and a
per-chunk source list
- Includes graceful failure handling for network errors, 4xx, 5xx,
and the empty-result case
- Defines a future capture path that POSTs to /interactions for the
Phase 9 reflection loop. The current command leaves capture as
manual / opt-in pending a clean post-turn hook design
.gitignore changes
------------------
- Replaced wholesale .claude/ ignore with .claude/* + exceptions
for .claude/commands/ so project slash commands can be tracked
- Other .claude/* paths (worktrees, settings, local state) remain
ignored
Backup-restore procedure (docs/backup-restore-procedure.md)
-----------------------------------------------------------
- Defines what gets backed up (SQLite + registry always, Chroma
optional under ingestion lock) and what doesn't (sources, code,
logs, cache, tmp)
- Documents the snapshot directory layout and the timestamp format
- Three trigger paths in priority order:
- via POST /admin/backup with {include_chroma: true|false}
- via the standalone src/atocore/ops/backup.py module
- via cold filesystem copy with brief downtime as last resort
- Listing and validation procedure with the /admin/backup and
/admin/backup/{stamp}/validate endpoints
- Full step-by-step restore procedure with mandatory pre-flight
safety snapshot, ownership/permission requirements, and the
post-restore verification checks
- Rollback path using the pre-restore safety copy
- Retention policy (last 7 daily / 4 weekly / 6 monthly) and
explicit acknowledgment that the cleanup job is not yet
implemented
- Drill schedule: quarterly full restore drill, post-migration
drill, post-incident validation
- Common failure mode table with diagnoses
- Quickstart cheat sheet at the end for daily reference
- Open follow-ups: cleanup script, off-Dalidou target,
encryption, automatic post-backup validation, incremental
Chroma snapshots
The procedure has not yet been exercised against the live Dalidou
instance — that is the next step the user runs themselves once
the slash command is in place.
Integrate codex/port-atocore-ops-client (operator client + operations
playbook) so the dalidou-storage-foundation branch can fast-forward
into main.
# Conflicts:
# README.md
Session 1 of the four-session plan. Empirically exercises the Phase 9
loop (capture -> reinforce -> extract) for the first time and lands
three small hygiene fixes.
Validation script + report
--------------------------
scripts/phase9_first_real_use.py — reproducible script that:
- sets up an isolated SQLite + Chroma store under
data/validation/phase9-first-use (gitignored)
- seeds 3 active memories
- runs 8 sample interactions through capture + reinforce + extract
- prints what each step produced and reinforcement state at the end
- supports --json output for downstream tooling
docs/phase9-first-real-use.md — narrative report of the run with:
- extraction results table (8/8 expectations met exactly)
- the empirical finding that REINFORCEMENT MATCHED ZERO seeds
despite sample 5 clearly echoing the rebase preference memory
- root cause analysis: the substring matcher is too brittle for
natural paraphrases (e.g. "prefers" vs "I prefer", "history"
vs "the history")
- recommended fix: replace substring matcher with a token-overlap
matcher (>=70% of memory tokens present in response, with
light stemming and a small stop list)
- explicit note that the fix is queued as a follow-up commit, not
bundled into the report — keeps the audit trail clean
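A sketch of the recommended matcher shape (threshold, stemming rule,
and stop list are all illustrative):

    import re

    STOP = {"a", "an", "the", "i", "to", "of", "and", "is"}

    def _stem(word: str) -> str:
        # very light stemming: "prefers" -> "prefer", "history" unchanged
        return word[:-1] if len(word) > 3 and word.endswith("s") else word

    def _tokens(text: str) -> set[str]:
        return {_stem(w) for w in re.findall(r"[a-z0-9]+", text.lower())
                if w not in STOP}

    def reinforces(memory_text: str, response_text: str,
                   threshold: float = 0.7) -> bool:
        memory_tokens = _tokens(memory_text)
        if not memory_tokens:
            return False
        overlap = len(memory_tokens & _tokens(response_text))
        return overlap / len(memory_tokens) >= threshold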
Key extraction results from the run:
- all 7 heading/sentence rules fired correctly
- 0 false positives on the prose-only sample (the most important
sanity check)
- long content preserved without truncation
- dedup correctly kept three distinct cues from one interaction
- project scoping flowed cleanly through the pipeline
Hygiene 1: FastAPI lifespan migration (src/atocore/main.py)
- Replaced @app.on_event("startup") with the modern @asynccontextmanager
lifespan handler
- Same setup work (setup_logging, ensure_runtime_dirs, init_db,
init_project_state_schema, startup_ready log)
- Removes the two on_event deprecation warnings from every test run
- Test suite now shows 1 warning instead of 3
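The pattern, sketched with the setup calls named above:

    from contextlib import asynccontextmanager
    from fastapi import FastAPI

    @asynccontextmanager
    async def lifespan(app: FastAPI):
        # same startup work the deprecated on_event handler did
        setup_logging()
        ensure_runtime_dirs()
        init_db()
        init_project_state_schema()  # then the startup_ready log entry
        yield  # shutdown work, if any, would go after the yield

    app = FastAPI(lifespan=lifespan)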
Hygiene 2: EXTRACTOR_VERSION constant (src/atocore/memory/extractor.py)
- Added EXTRACTOR_VERSION = "0.1.0" with a versioned change log comment
- MemoryCandidate dataclass carries extractor_version on every candidate
- POST /interactions/{id}/extract response now includes extractor_version
on both the top level (current run) and on each candidate
- Implements the versioning requirement called out in
docs/architecture/promotion-rules.md so old candidates can be
identified and re-evaluated when the rule set evolves
Hygiene 3: ~/.git-credentials cleanup (out-of-tree, not committed)
- Removed the dead OAUTH_USER:<jwt> line for dalidou:3000 that was
being silently rewritten by the system credential manager on every
push attempt
- Configured credential.http://dalidou:3000.helper with the empty-string
sentinel pattern so the URL-specific helper chain is exactly
["", store] instead of inheriting the system-level "manager" helper
that ships with Git for Windows
- Same fix for the 100.80.199.40 (Tailscale) entry
- Verified end to end: a fresh push using only the cleaned credentials
file (no embedded URL) authenticates as Antoine and lands cleanly
Full suite: 160 passing (no change from previous), 1 warning
(was 3) thanks to the lifespan migration.
Three planning docs that answer the architectural questions the
engineering query catalog raised. Together with the catalog they
form roughly half of the pre-implementation planning sprint.
docs/architecture/memory-vs-entities.md
---------------------------------------
Resolves the central question blocking every other engineering
layer doc: is a Decision a memory or an entity?
Key decisions:
- memories stay the canonical home for identity, preference, and
episodic facts
- entities become the canonical home for project, knowledge, and
adaptation facts once the engineering layer V1 ships
- no concept lives in both layers at full fidelity; one canonical
home per concept
- a "graduation" flow lets active memories upgrade into entities
(memory stays as a frozen historical pointer, never deleted)
- one shared candidate review queue across both layers
- context builder budget gains a 15% slot for engineering entities,
  placed between identity/preference memories and retrieved chunks
- the Phase 9 memory extractor's structural cues (decision heading,
constraint heading, requirement heading) are explicitly an
intentional temporary overlap, cleanly migrated via graduation
when the entity extractor ships
docs/architecture/promotion-rules.md
------------------------------------
Defines the full Layer 0 → Layer 2 pipeline:
- four layers: L0 raw source, L1 memory candidate/active, L2 entity
candidate/active, L3 trusted project state
- three extraction triggers: on interaction capture (existing),
on ingestion wave (new, batched per wave), on explicit request
- per-rule prior confidence tuned at write time by structural
signal (echoes the retriever's high/low signal hints) and
freshness bonus
- batch cap of 50 candidates per pass to protect the reviewer
- full provenance requirements: every candidate carries rule id,
source_chunk_id, source_interaction_id, and extractor_version
- reversibility matrix for every promotion step
- explicit no-auto-promotion-in-V1 stance with the schema designed
so auto-promotion policies can be added later without migration
- the hard invariant: nothing ever moves into L3 automatically
- ingestion-wave extraction produces a report artifact under
data/extraction-reports/<wave-id>/
docs/architecture/conflict-model.md
-----------------------------------
Defines how AtoCore handles contradictory facts without violating
the "bad memory is worse than no memory" rule.
- conflict = two or more active rows claiming the same slot with
incompatible values
- per-type "slot key" tuples for both memory and entity types
- cross-layer conflict detection respects the trust hierarchy:
trusted project state > active entities > active memories
- new conflicts and conflict_members tables (schema proposal)
- detection at two latencies: synchronous at write time,
asynchronous nightly sweep
- "flag, never block" rule: writes always succeed, conflicts are
surfaced via /conflicts, /health open_conflicts_count, per-row
response bodies, and the Human Mirror's disputed marker
- resolution is always human: promote-winner + supersede-others,
or dismiss-as-not-a-real-conflict, both with audit trail
- explicitly out of scope for V1: cross-project conflicts,
temporal-overlap conflicts, tolerance-aware numeric comparisons
Also updates:
- master-plan-status.md: Phase 9 moved from "started" to "baseline
complete" now that Commits A, B, C are all landed
- master-plan-status.md: adds an "Engineering Layer Planning Sprint"
section listing the doc wave so far and the remaining docs
(tool-handoff-boundaries, human-mirror-rules,
representation-authority, engineering-v1-acceptance)
- current-state.md: Phase 9 moved from "not started" to "baseline
complete" with the A/B/C annotation
This is pure doc work. No code changes, no schema changes, no
behavior changes. Per the working rule in master-plan-status.md:
the architecture docs shape decisions, they do not force premature
schema work.
First doc in the engineering-layer planning sprint. The premise of
this document is the inverse of the existing ontology doc: instead of
listing objects and seeing what they could do, we list the questions
we need to answer and let those drive what objects and relationships
must exist.
The rule established here:
> If a typed object or relationship does not serve at least one query
> in this catalog, it is not in V1.
Contents:
- 20 v1-required queries grouped into 5 tiers:
- structure (Q-001..Q-004)
- intent (Q-005..Q-009)
- validation (Q-010..Q-012)
- change/time (Q-013..Q-014)
- cross-cutting (Q-016..Q-020)
- 3 v1-stretch queries (Q-021..Q-023)
- 4 v2 deferred queries (Q-024..Q-027) so V1 does not paint us into
a corner
Each entry has: id, question, invocation, expected result shape,
required objects, required relationships, provenance requirement,
and tier.
Three queries are flagged as the "killer correctness" queries:
- Q-006 orphan requirements (engineering equivalent of untested code)
- Q-009 decisions based on flagged assumptions (catches fragile design)
- Q-011 validation claims with no supporting result (catches
unevidenced claims)
The catalog ends with the implied implementation order for V1, the
list of object families intentionally deferred (BOM, manufacturing,
software, electrical, test correlation), and the open questions this
catalog raises for the next planning docs:
- when do orphan/unsupported queries flag (insert time vs query time)?
- when an Assumption flips, are dependent Decisions auto-flagged?
- does AtoCore block conflicts or always save-and-flag?
- is EVIDENCED_BY mandatory at insert?
- when does the Human Mirror regenerate?
These are the questions the next planning docs (memory-vs-entities,
conflict-model, promotion-rules) should answer before any engineering
layer code is written.
This is doc work only. No code, no schema, no behavior change.
Per the working rule in master-plan-status.md: the architecture docs
shape decisions, they do not force premature schema work.
Phase 9 Commit A from the agreed plan: turn AtoCore from a stateless
context enhancer into a system that records what it actually fed to an
LLM and what came back. This is the audit trail Reflection (Commit B)
and Extraction (Commit C) will be layered on top of.
The interactions table existed in the schema since the original PoC
but nothing wrote to it. This change makes it real:
Schema migration (additive only):
- response full LLM response (caller decides how much)
- memories_used JSON list of memory ids in the context pack
- chunks_used JSON list of chunk ids in the context pack
- client identifier of the calling system
(openclaw, claude-code, manual, ...)
- session_id groups multi-turn conversations
- project project name (mirrors the memory module pattern,
no FK so capture stays cheap)
- indexes on session_id, project, created_at
The created_at column is now written explicitly with a SQLite-compatible
'YYYY-MM-DD HH:MM:SS' format so the same string lives in the DB and the
returned dataclass. Without this the `since` filter on list_interactions
would silently fail because CURRENT_TIMESTAMP and isoformat use different
shapes that do not compare cleanly as strings.
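A sketch of the explicit write, using the format quoted above (whether
the real code uses UTC is an assumption):

    from datetime import datetime, timezone

    created_at = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    # the same string goes into the INSERT and the returned dataclass,
    # so `WHERE created_at >= ?` compares cleanly against the since cutoff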
New module src/atocore/interactions/:
- Interaction dataclass
- record_interaction() persists one round-trip (prompt required;
everything else optional). Refuses empty prompts.
- list_interactions() filters by project / session_id / client / since,
newest-first, hard-capped at 500
- get_interaction() fetch by id, full response + context pack
API endpoints:
- POST /interactions: capture one interaction
- GET /interactions: list with summaries (no full response)
- GET /interactions/{id}: full record incl. response + pack
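A minimal capture call with stdlib urllib, assuming the field names
above (the response shape is an assumption):

    import json, urllib.request

    body = json.dumps({
        "prompt": "How should I structure the retriever tests?",
        "response": "...",
        "client": "manual",
        "session_id": "sess-001",
        "project": "atocore",
    }).encode()
    req = urllib.request.Request(
        "http://dalidou:8100/interactions", data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp))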
Trust model:
- Capture is read-only with respect to memories, project state, and
source chunks. Nothing here promotes anything into trusted state.
- The audit trail becomes the dataset Commit B (reinforcement) and
Commit C (extraction + review queue) will operate on.
Tests (13 new, all green):
- service: persist + roundtrip every field
- service: minimum-fields path (prompt only)
- service: empty / whitespace prompt rejected
- service: get by id returns None for missing
- service: filter by project, session, client
- service: ordering newest-first with limit
- service: since filter inclusive on cutoff (the bug the timestamp
fix above caught)
- service: limit=0 returns empty
- API: POST records and round-trips through GET /interactions/{id}
- API: empty prompt returns 400
- API: missing id returns 404
- API: list filter returns summaries (not full response bodies)
Full suite: 118 passing (was 105).
master-plan-status.md updated to move Phase 9 from "not started" to
"started" with the explicit note that Commit A is in and Commits B/C
remain.
Three small improvements that move the operational baseline forward
without changing the existing trust model.
1. Tunable retrieval ranking weights
- rank_project_match_boost, rank_query_token_step,
rank_query_token_cap, rank_path_high_signal_boost,
rank_path_low_signal_penalty are now Settings fields
- all overridable via ATOCORE_* env vars
- retriever no longer hard-codes 2.0 / 1.18 / 0.72 / 0.08 / 1.32
- lets ranking be tuned per environment as Wave 1 is exercised
without code changes
2. /projects/{name}/refresh status
- refresh_registered_project now returns an overall status field
("ingested", "partial", "nothing_to_ingest") plus roots_ingested
and roots_skipped counters
- ProjectRefreshResponse advertises the new fields so callers can
rely on them
- covers the case where every configured root is missing on disk
3. Chroma cold snapshot + admin backup endpoints
- create_runtime_backup now accepts include_chroma and writes a
cold directory copy of the chroma persistence path
- new list_runtime_backups() and validate_backup() helpers
- new endpoints:
- POST /admin/backup create snapshot (optional chroma)
- GET /admin/backup list snapshots
- GET /admin/backup/{stamp}/validate structural validation
- chroma snapshots are taken under exclusive_ingestion() so a refresh
or ingest cannot race with the cold copy
- backup metadata records what was actually included and how big
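A sketch of the snapshot-under-lock shape from item 3 (paths and copy
details assumed):

    import shutil
    from pathlib import Path

    def snapshot_chroma(chroma_dir: Path, snapshot_dir: Path) -> None:
        # exclusive_ingestion() is the same lock the ingest/refresh
        # paths take, so the cold copy cannot race a concurrent write
        with exclusive_ingestion():
            shutil.copytree(chroma_dir, snapshot_dir / "chroma")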
Tests:
- 8 new tests covering tunable weights, refresh status branches
(ingested / partial / nothing_to_ingest), chroma snapshot, list,
validate, and the API endpoints (including the lock-acquisition path)
- existing fake refresh stubs in test_api_storage.py updated for the
expanded ProjectRefreshResponse model
- full suite: 105 passing (was 97)
next-steps doc updated to reflect that the chroma snapshot + restore
validation gap from current-state.md is now closed in code; only the
operational retention policy remains.