Verified t420-openclaw/atocore.py against live Dalidou from both
the development machine and the T420 (clawdbot @ 192.168.86.39):
- health: returns 0.2.0 + build_sha + vector count
- auto-context: project detection + context/build produces full
packs with Trusted Project State, Project Memories band, and
retrieved chunks (tested p05 vendor query and p06 firmware query)
- fail-open: unreachable host returns {status: unavailable,
fail_open: true} without crashing or blocking the session
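The fail-open shape, as a minimal sketch assuming a plain urllib call
against /health; the real atocore.py helper may differ in names and
timeout handling:

    # Hypothetical sketch of the fail-open contract: never raise, never block.
    import json
    import urllib.error
    import urllib.request

    def fetch_health(base_url: str = "http://dalidou:8100",
                     timeout: float = 2.0) -> dict:
        """Return the health payload, or a fail-open marker when unreachable."""
        try:
            with urllib.request.urlopen(f"{base_url}/health",
                                        timeout=timeout) as resp:
                return json.loads(resp.read())
        except (urllib.error.URLError, OSError):
            # An unreachable host degrades to a marker the caller can ignore.
            return {"status": "unavailable", "fail_open": True}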
API surface coverage: atocore.py hits 15/33 endpoints (core
retrieval + project state + context build). Memory management,
interactions, and backup endpoints are correctly excluded — those
belong to the operator client (scripts/atocore_client.py) per the
read-only additive integration model.
No code changes needed — the April 6 atocore.py already matches
the current API surface. Wave 2 state entries and project-memory
band changes are transparent to the client (they enrich
formatted_context without requiring client-side updates).
Cloned repo to T420 at /home/papa/ATOCore for future OpenClaw use.
Updated master-plan-status.md: Phase 8 moved from Partial to
Baseline Complete.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
scripts/retrieval_eval.py walks a fixture file of project-hinted
questions, runs each against POST /context/build, and scores the
returned formatted_context against per-fixture expect_present and
expect_absent substring checklists. Exit 0 on all-pass, 1 on any
miss. Human-readable by default, --json for automation.
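Roughly, a fixture and the scoring loop look like the sketch below;
the fixture field names and the /context/build request payload are
illustrative, only expect_present / expect_absent and
formatted_context come from the harness description:

    # Illustrative fixture scoring: substring checklists vs formatted_context.
    import json
    import sys
    import urllib.request

    def run_fixture(fixture: dict,
                    base_url: str = "http://dalidou:8100") -> bool:
        body = json.dumps({"prompt": fixture["question"],
                           "project": fixture.get("project")}).encode("utf-8")
        req = urllib.request.Request(
            f"{base_url}/context/build", data=body,
            headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            pack = json.loads(resp.read())["formatted_context"]
        present_ok = all(s in pack for s in fixture.get("expect_present", []))
        absent_ok = all(s not in pack for s in fixture.get("expect_absent", []))
        return present_ok and absent_ok

    if __name__ == "__main__":
        fixtures = json.load(open(sys.argv[1]))
        sys.exit(0 if all(run_fixture(f) for f in fixtures) else 1)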
First live run against Dalidou at SHA 1161645: 4/6 pass. The two
failures are real findings, not harness bugs:
- p05-configuration FAIL: "GigaBIT M1" appears in the p05 pack.
Cross-project bleed from a shared p05 doc that legitimately
mentions the p04 mirror under test. Fixture kept strict so
future ranker tuning can close the gap.
- p05-vendor-signal FAIL: "Zygo" missing. The vendor memory exists
  with confidence 0.9, but get_memories_for_context walks memories in
  fixed order (effectively by updated_at / confidence). Lower-ranked
  memories therefore get pushed out of the per-project budget slice by
  higher-confidence ones even when the query is specifically about the
  lower-ranked content. Query-relevance ordering of memories is the
  natural next fix.
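A possible shape for that fix, as a sketch only; the memory fields
("content", "confidence") and the function name are assumptions, not
the actual get_memories_for_context internals:

    # Hypothetical query-relevance ordering: rank by term overlap with the
    # query first, keeping confidence as the tie-break.
    def rank_memories_for_query(memories: list[dict],
                                query: str) -> list[dict]:
        query_terms = set(query.lower().split())

        def relevance(memory: dict) -> tuple:
            content_terms = set(memory["content"].lower().split())
            return (len(query_terms & content_terms),
                    memory.get("confidence", 0.0))

        return sorted(memories, key=relevance, reverse=True)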
Docs sync:
- master-plan-status.md: Phase 9 reflection entry now notes that
capture→reinforce runs automatically and project memories reach
the context pack, while extract remains batch/manual. First batch-
extract pass surfaced 1 candidate from 42 interactions — extractor
rule tuning is a known follow-up.
- next-steps.md: the 2026-04-11 retrieval quality review entry now
shows the project-memory-band work as DONE, and a new
"Reflection Loop Live Check" subsection records the extractor-
coverage finding from the first batch run.
- Both files now agree with the code; follow-up reviewers
(Codex, future Claude) should no longer see narrative drift.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Codifies the helper-at-every-service-boundary rule that fb6298a
implemented across the eight current callsites. The contract is
intentionally simple but easy to forget, so it lives in its own
doc that the engineering layer V1 implementation sprint can read
before adding new project-keyed entity surfaces.
docs/architecture/project-identity-canonicalization.md
------------------------------------------------------
- The contract: every read/write that takes a project name MUST
call resolve_project_name() before the value crosses a service
boundary; canonicalization happens once, at the first statement
after input validation, never later
- The helper API: resolve_project_name(name) returns the canonical
  project_id for registered names, and the input unchanged for empty
  or unregistered names (the second case is the backwards-compat
  path for hand-curated state predating the registry)
- Full table of the 8 current callsites: builder.build_context,
project_state.set_state/get_state/invalidate_state,
interactions.record_interaction/list_interactions,
memory.create_memory/get_memories
- Where the helper is intentionally NOT called and why: legacy
  ensure_project lookup, retriever's own _project_match_boost
  (which already calls get_registered_project), _rank_chunks
  secondary substring boost (multiplicative not filter, can't
  drop relevant chunks), update_memory (no project field update),
  unregistered names (canonicalization is a no-op for a name with
  no registry record)
- Why this is the trust hierarchy in action: Layer 3 trusted
state has to be findable to win the trust battle; an
un-canonicalized lookup silently makes Layer 3 invisible and
the system falls through to lower-trust retrieved chunks with
no signal to the human
- The 4-step rule for new entry points: identify project-keyed
reads/writes, place the call as the first statement after
validation, add a regression test using the project_registry
fixture, verify None/empty paths
- How the project_registry fixture works with a copy-pasteable
example
- What the rule does NOT cover: alias creation (registry's own
write path), registry hot-reloading (no in-process cache by
design), cross-project dedup (collision detection at
registration), time-bounded canonicalization (canonical id is
stable forever), legacy data migration (open follow-up)
- Engineering layer V1 implications: every new service entry
point in the entities/relationships/conflicts/mirror modules
must apply the helper at the first statement after validation;
treated as a code review failure if missing
- Open follow-ups: legacy data migration script (~30 LOC),
registry file caching when projects scale beyond ~50, case
sensitivity audit when entity-side storage lands, _rank_chunks
cleanup, documentation discoverability (intentional redundancy
between this doc, the helper docstring, and per-callsite comments)
- Quick reference card: copy-pasteable template for new service
functions
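For orientation, the template amounts to something like the sketch
below; the import path and function name are illustrative, and the
doc's copy-pasteable card is the authoritative version:

    # Hypothetical new service entry point following the contract.
    from atocore.projects import resolve_project_name  # illustrative import path

    def create_entity(project: str, payload: dict) -> dict:
        if not project or not project.strip():
            raise ValueError("project is required")
        # Canonicalize once, as the first statement after input validation.
        project = resolve_project_name(project)
        # Every project-keyed read/write below uses the canonical id.
        return {"project": project, **payload}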
master-plan-status.md updated
-----------------------------
- New doc added to the engineering-layer planning sprint listing
- Marked as required reading before V1 implementation begins
- Note that V1 must apply the contract at every new service-layer
entry point
Pure doc work, no code changes. Full suite stays at 174 passing
because no source changed.
Codex's review caught that the Claude Code slash command shipped in
Session 2 was a parallel reimplementation of routing logic the
existing scripts/atocore_client.py already had. That client was
introduced via the codex/port-atocore-ops-client merge and is
already a comprehensive operator client (auto-context,
detect-project, refresh-project, project-state, audit-query, etc.).
The slash command should have been a thin wrapper from the start.
This commit fixes the shape without expanding scope.
.claude/commands/atocore-context.md
-----------------------------------
Rewritten as a thin Claude Code-specific frontend that shells out
to the shared client:
- explicit project hint -> calls `python scripts/atocore_client.py
context-build "<prompt>" "<project>"`
- no explicit hint -> calls `python scripts/atocore_client.py
auto-context "<prompt>"` which runs the client's detect-project
routing first and falls through to context-build with the match
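In Python terms the routing is no more than the sketch below (the
slash command itself is a markdown prompt file; this is just the
logic it encodes, with an illustrative function name):

    # Thin-wrapper dispatch: pick a subcommand, forward the args, render JSON.
    import subprocess
    import sys

    def dispatch(prompt: str, project: str = "") -> int:
        if project:
            cmd = [sys.executable, "scripts/atocore_client.py",
                   "context-build", prompt, project]
        else:
            cmd = [sys.executable, "scripts/atocore_client.py",
                   "auto-context", prompt]
        return subprocess.run(cmd).returncode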
Inherits the client's stable behaviour for free:
- ATOCORE_BASE_URL env var (default http://dalidou:8100)
- fail-open on network errors via ATOCORE_FAIL_OPEN
- consistent JSON output shape
- the same project alias matching the OpenClaw helper uses
Removes the speculative `--capture` path that was in the
original draft. Capture/extract/queue/promote/reject are
intentionally NOT in the shared client yet (memory-review
workflow not exercised in real use), so the slash command can't
expose them either.
docs/architecture/llm-client-integration.md
-------------------------------------------
New planning doc that defines the layering rule for AtoCore's
relationship with LLM client contexts:
Three layers:
1. AtoCore HTTP API (universal, src/atocore/api/routes.py)
2. Shared operator client (scripts/atocore_client.py) — the
canonical Python backbone for stable AtoCore operations
3. Per-agent thin frontends (Claude Code slash command,
OpenClaw helper, future Codex skill, future MCP server)
that shell out to the shared client
Three non-negotiable rules:
- every per-agent frontend is a thin wrapper (translate the
agent's command format and render the JSON; nothing else)
- the shared client never duplicates the API (it composes
endpoints; new logic goes in the API first)
- the shared client only exposes stable operations (subcommands
land only after the API has been exercised in a real workflow)
Doc covers:
- the full table of subcommands currently in scope (project
lifecycle, ingestion, project-state, retrieval, context build,
audit-query, debug-context, health/stats)
- the three deferred families with rationale: memory review
queue (workflow not exercised), backup admin (fail-open
default would hide errors), engineering layer entities (V1
not yet implemented)
- the integration recipe for new agent platforms
- explicit acknowledgement that the OpenClaw helper currently
duplicates routing logic and that the refactor to the shared
client is a queued cross-repo follow-up
- how the layering connects to phase 8 (OpenClaw) and phase 11
(multi-model)
- versioning and stability rules for the shared client surface
- open follow-ups: OpenClaw refactor, memory-review subcommands
when ready, optional backup admin subcommands, engineering
entity subcommands during V1 implementation
master-plan-status.md updated
-----------------------------
- New "LLM Client Integration" subsection that points to the
layering doc and explicitly notes the deferral of memory-review
and engineering-entity subcommands
- Frames the layering as sitting between phase 8 and phase 11
Scope is intentionally narrow per Codex's framing: promote the
existing client to canonical status, refactor the slash command
to use it, document the layering. No new client subcommands
added in this commit. The OpenClaw helper refactor is a
separate cross-repo follow-up. Memory-review and engineering-
entity work stay deferred.
Full suite: 160 passing, no behavior changes.
Session 4 of the four-session plan. Final two engineering planning
docs, plus master-plan-status.md updated to reflect that the
engineering layer planning sprint is now complete.
docs/architecture/human-mirror-rules.md
---------------------------------------
The Layer 3 derived markdown view spec:
- The non-negotiable rule: the Mirror is read-only from the
human's perspective; edits go to the canonical home and the
Mirror picks them up on regeneration
- 3 V1 template families: Project Overview, Decision Log,
Subsystem Detail
- Explicit V1 exclusions: per-component pages, per-decision
pages, cross-project rollups, time-series pages, diff pages,
conflict queue render, per-memory pages
- Mirror files live in /srv/storage/atocore/data/mirror/ NOT in
the source vault (sources stay read-only per the operating
model)
- 3 regeneration triggers: explicit POST, debounced async on
entity write, daily scheduled refresh
- "Do not edit" header banner with checksum so unchanged inputs
skip work
- Conflicts and project_state overrides surface inline so the
trust hierarchy is visible in the human reading experience
- Templates checked in under templates/mirror/, edited via PR
- Deterministic output is a V1 requirement so future Mirror
diffing works without rework
- Open questions for V1: debounce window, scheduler integration,
template testing approach, directory listing endpoint, empty
state rendering
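The checksum skip could look roughly like this; banner format,
hashing choice, and function name are placeholders until V1:

    # Hypothetical regeneration guard: skip rewriting a Mirror page whose
    # inputs have not changed, using a checksum embedded in the banner.
    import hashlib
    from pathlib import Path

    def render_if_changed(page: Path, body: str, inputs_blob: str) -> bool:
        checksum = hashlib.sha256(inputs_blob.encode("utf-8")).hexdigest()
        banner = f"<!-- AtoCore Mirror: do not edit. inputs={checksum} -->\n"
        if page.exists() and page.read_text().startswith(banner):
            return False  # unchanged inputs, skip the write
        page.write_text(banner + body)
        return True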
docs/architecture/engineering-v1-acceptance.md
----------------------------------------------
The measurable definition of done:
- Single-sentence definition: V1 is done when every v1-required
query in engineering-query-catalog.md returns a correct result
for one chosen test project, the Human Mirror renders a
coherent overview, and a real KB-CAD or KB-FEM export round-
trips through ingest -> review queue -> active entity without
violating any conflict or trust invariant
- 23 acceptance criteria across 4 categories:
* Functional (8): entity store, all 20 v1-required queries,
tool ingest endpoints, candidate review queue, conflict
detection, Human Mirror, memory-to-entity graduation,
complete provenance chain
* Quality (6): existing tests pass, V1 has its own coverage,
conflict invariants enforced, trust hierarchy enforced,
Mirror reproducible via golden file, killer correctness
queries pass against representative data
* Operational (5): safe migration, backup/restore drill,
performance bounds, no new manual ops burden, Phase 9 not
regressed
* Documentation (4): per-entity-type spec docs, KB schema docs,
V1 release notes, master-plan-status updated
- Explicit negative list of things V1 does NOT need to do:
no LLM extractor, no auto-promotion, no write-back, no
multi-user, no real-time UI, no cross-project rollups,
no time-travel, no nightly conflict sweep, no incremental
Chroma, no retention cleanup, no encryption, no off-Dalidou
backup target
- Recommended implementation order: F-1 -> F-8 in sequence,
with the graduation flow (F-7) saved for last as the most
cross-cutting change
- Anticipated friction points called out in advance:
graduation cross-cuts memory module, Mirror determinism trap,
conflict detector subtle correctness, provenance backfill
for graduated entities
master-plan-status.md updated
-----------------------------
- Engineering Layer Planning Sprint section now marked complete
with all 8 architecture docs listed
- Note that the next concrete step is the V1 implementation
sprint following engineering-v1-acceptance.md as its checklist
Pure doc work. No code, no schema, no behavior changes.
After this commit, the engineering planning sprint is fully done
(8/8 docs) and Phase 9 is fully complete (Commits A/B/C all
shipped, validated, and pushed). AtoCore is ready for either
the engineering V1 implementation sprint OR a pause for real-
world Phase 9 usage, depending on which the user prefers next.
Three planning docs that answer the architectural questions the
engineering query catalog raised. Together with the catalog they
form roughly half of the pre-implementation planning sprint.
docs/architecture/memory-vs-entities.md
---------------------------------------
Resolves the central question blocking every other engineering
layer doc: is a Decision a memory or an entity?
Key decisions:
- memories stay the canonical home for identity, preference, and
episodic facts
- entities become the canonical home for project, knowledge, and
adaptation facts once the engineering layer V1 ships
- no concept lives in both layers at full fidelity; one canonical
home per concept
- a "graduation" flow lets active memories upgrade into entities
(memory stays as a frozen historical pointer, never deleted)
- one shared candidate review queue across both layers
- context builder budget gains a 15% slot for engineering entities,
  slotted between identity/preference memories and retrieved chunks
  (see the sketch after this list)
- the Phase 9 memory extractor's structural cues (decision heading,
constraint heading, requirement heading) are explicitly an
intentional temporary overlap, cleanly migrated via graduation
when the entity extractor ships
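The budget split amounts to something like the sketch below; only
the 15% figure comes from the doc, the other numbers and band names
are illustrative:

    # Illustrative context budget split with the new engineering-entity slot.
    def split_context_budget(total_tokens: int) -> dict:
        entity_tokens = int(total_tokens * 0.15)
        remaining = total_tokens - entity_tokens
        return {
            "identity_preference_memories": remaining // 2,  # illustrative
            "engineering_entities": entity_tokens,
            "retrieved_chunks": remaining - remaining // 2,
        }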
docs/architecture/promotion-rules.md
------------------------------------
Defines the full Layer 0 → Layer 2 pipeline:
- four layers: L0 raw source, L1 memory candidate/active, L2 entity
candidate/active, L3 trusted project state
- three extraction triggers: on interaction capture (existing),
on ingestion wave (new, batched per wave), on explicit request
- per-rule prior confidence tuned at write time by structural
  signal (echoes the retriever's high/low signal hints) and
  freshness bonus (sketched after this list)
- batch cap of 50 candidates per pass to protect the reviewer
- full provenance requirements: every candidate carries rule id,
source_chunk_id, source_interaction_id, and extractor_version
- reversibility matrix for every promotion step
- explicit no-auto-promotion-in-V1 stance with the schema designed
so auto-promotion policies can be added later without migration
- the hard invariant: nothing ever moves into L3 automatically
- ingestion-wave extraction produces a report artifact under
data/extraction-reports/<wave-id>/
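The prior tuning sketched in code, with made-up weights; the doc
defines the actual signal names and values:

    # Hypothetical prior-confidence calculation for an extraction candidate.
    def candidate_confidence(base_prior: float, structural_signal: str,
                             source_age_days: float) -> float:
        # Structural cues (decision/constraint/requirement headings) push the
        # prior up, weak cues pull it down; a recent source earns a small bonus.
        signal_adjust = {"high": 0.1, "low": -0.1}.get(structural_signal, 0.0)
        freshness_bonus = 0.05 if source_age_days <= 30 else 0.0
        score = base_prior + signal_adjust + freshness_bonus
        return max(0.0, min(1.0, score))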
docs/architecture/conflict-model.md
-----------------------------------
Defines how AtoCore handles contradictory facts without violating
the "bad memory is worse than no memory" rule.
- conflict = two or more active rows claiming the same slot with
incompatible values
- per-type "slot key" tuples for both memory and entity types
- cross-layer conflict detection respects the trust hierarchy:
trusted project state > active entities > active memories
- new conflicts and conflict_members tables (schema proposal;
  sketched after this list)
- detection at two latencies: synchronous at write time,
asynchronous nightly sweep
- "flag, never block" rule: writes always succeed, conflicts are
surfaced via /conflicts, /health open_conflicts_count, per-row
response bodies, and the Human Mirror's disputed marker
- resolution is always human: promote-winner + supersede-others,
or dismiss-as-not-a-real-conflict, both with audit trail
- explicitly out of scope for V1: cross-project conflicts,
temporal-overlap conflicts, tolerance-aware numeric comparisons
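A rough sqlite3 rendering of the proposed tables; column names and
types here are a guess at the proposal, not the final schema:

    # Illustrative schema for the proposed conflict tables.
    import sqlite3

    def create_conflict_tables(conn: sqlite3.Connection) -> None:
        # status is one of: open, resolved, dismissed.
        conn.executescript("""
            CREATE TABLE IF NOT EXISTS conflicts (
                id         INTEGER PRIMARY KEY,
                slot_key   TEXT NOT NULL,   -- serialized per-type slot tuple
                status     TEXT NOT NULL DEFAULT 'open',
                created_at TEXT NOT NULL
            );
            CREATE TABLE IF NOT EXISTS conflict_members (
                conflict_id INTEGER NOT NULL REFERENCES conflicts(id),
                layer       TEXT NOT NULL,  -- project_state / entity / memory
                row_id      INTEGER NOT NULL,
                PRIMARY KEY (conflict_id, layer, row_id)
            );
        """)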
Also updates:
- master-plan-status.md: Phase 9 moved from "started" to "baseline
complete" now that Commits A, B, C are all landed
- master-plan-status.md: adds an "Engineering Layer Planning Sprint"
section listing the doc wave so far and the remaining docs
(tool-handoff-boundaries, human-mirror-rules,
representation-authority, engineering-v1-acceptance)
- current-state.md: Phase 9 moved from "not started" to "baseline
complete" with the A/B/C annotation
This is pure doc work. No code changes, no schema changes, no
behavior changes. Per the working rule in master-plan-status.md:
the architecture docs shape decisions, they do not force premature
schema work.
Phase 9 Commit A from the agreed plan: turn AtoCore from a stateless
context enhancer into a system that records what it actually fed to an
LLM and what came back. This is the audit trail Reflection (Commit B)
and Extraction (Commit C) will be layered on top of.
The interactions table has been in the schema since the original PoC,
but nothing wrote to it. This change makes it real:
Schema migration (additive only):
- response: full LLM response (caller decides how much)
- memories_used: JSON list of memory ids in the context pack
- chunks_used: JSON list of chunk ids in the context pack
- client: identifier of the calling system
  (openclaw, claude-code, manual, ...)
- session_id: groups multi-turn conversations
- project: project name (mirrors the memory module pattern,
  no FK so capture stays cheap)
- indexes on session_id, project, created_at
The created_at column is now written explicitly with a SQLite-compatible
'YYYY-MM-DD HH:MM:SS' format so the same string lives in the DB and the
returned dataclass. Without this the `since` filter on list_interactions
would silently fail because CURRENT_TIMESTAMP and isoformat use different
shapes that do not compare cleanly as strings.
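The pattern, as a minimal sketch (the real insert lives in the
interactions module and writes more columns):

    # Write created_at in the same 'YYYY-MM-DD HH:MM:SS' shape that
    # CURRENT_TIMESTAMP produces, so the `since` filter compares cleanly.
    import sqlite3
    from datetime import datetime, timezone

    def record_minimal(conn: sqlite3.Connection, prompt: str) -> str:
        created_at = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
        conn.execute(
            "INSERT INTO interactions (prompt, created_at) VALUES (?, ?)",
            (prompt, created_at),
        )
        return created_at  # same string ends up on the returned dataclass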
New module src/atocore/interactions/:
- Interaction dataclass
- record_interaction() persists one round-trip (prompt required;
everything else optional). Refuses empty prompts.
- list_interactions() filters by project / session_id / client / since,
newest-first, hard-capped at 500
- get_interaction() fetch by id, full response + context pack
API endpoints:
- POST /interactions: capture one interaction
- GET /interactions: list with summaries (no full response)
- GET /interactions/{id}: full record incl. response + pack
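A hedged usage sketch of the capture round-trip, assuming the POST
response echoes the new interaction id and using only the fields
from the column list above:

    # Illustrative capture + readback against the new endpoints.
    import json
    import urllib.request

    BASE = "http://dalidou:8100"  # assumed default base URL

    def post_json(path: str, payload: dict) -> dict:
        req = urllib.request.Request(
            BASE + path, data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())

    captured = post_json("/interactions", {
        "prompt": "Which vendor supplies the p05 interferometer?",
        "response": "Zygo, per the vendor memory.",
        "client": "manual",
        "project": "p05",
    })

    url = f"{BASE}/interactions/{captured['id']}"
    with urllib.request.urlopen(url) as resp:
        full_record = json.loads(resp.read())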
Trust model:
- Capture is read-only with respect to memories, project state, and
source chunks. Nothing here promotes anything into trusted state.
- The audit trail becomes the dataset Commit B (reinforcement) and
Commit C (extraction + review queue) will operate on.
Tests (13 new, all green):
- service: persist + roundtrip every field
- service: minimum-fields path (prompt only)
- service: empty / whitespace prompt rejected
- service: get by id returns None for missing
- service: filter by project, session, client
- service: ordering newest-first with limit
- service: since filter inclusive on cutoff (the bug the timestamp
fix above caught)
- service: limit=0 returns empty
- API: POST records and round-trips through GET /interactions/{id}
- API: empty prompt returns 400
- API: missing id returns 404
- API: list filter returns summaries (not full response bodies)
Full suite: 118 passing (was 105).
master-plan-status.md updated to move Phase 9 from "not started" to
"started" with the explicit note that Commit A is in and Commits B/C
remain.