Anto01 368adf2ebc docs(arch): tool-handoff-boundaries + representation-authority

Session 3 of the four-session plan. Two more engineering planning
docs that lock in the most contentious architectural decisions
before V1 implementation begins.

docs/architecture/tool-handoff-boundaries.md
--------------------------------------------
Locks in the V1 read/write relationship with external tools:

- AtoCore is a one-way mirror in V1. External tools push,
  AtoCore reads, AtoCore never writes back.
- Per-tool stance table covering KB-CAD, KB-FEM, NX, PKM, Gitea
  repos, OpenClaw, AtoDrive, PLM/vendor systems
- Two new ingest endpoints proposed for V1:
  POST /ingest/kb-cad/export and POST /ingest/kb-fem/export
- Sketch JSON shapes for both exports (intentionally minimal,
  to be refined in dedicated schema docs during implementation)
- Drift handling: KB-CAD changes a value -> creates an entity
  candidate -> existing active becomes a conflict member ->
  human resolves via the conflict model
- Hard-line invariants V1 will not cross: no write to external
  tools, no live polling, no silent merging, no schema fan-out,
  no external-tool-specific logic in entity types
- Why not bidirectional: schema drift, conflict semantics, trust
  hierarchy, velocity, reversibility
- V2+ deferred items: selective write-back annotations, light
  polling, direct NX integration, cost/vendor/PLM connections
- Open questions for the implementation sprint: schema location,
  who runs the exporter, full-vs-incremental, exporter auth

docs/architecture/representation-authority.md
---------------------------------------------
The canonical-home matrix that says where each kind of fact
actually lives:

- Six representation layers identified: PKM, KB project,
  Gitea repos, AtoCore memories, AtoCore entities, AtoCore
  project_state
- The hard rule: every fact kind has exactly one canonical
  home; other layers may hold derived copies but never disagree
- Comprehensive matrix covering 21 fact kinds (CAD geometry,
  CAD-side structure, FEM mesh, FEM results, code, repo docs,
  PKM prose, identity, preference, episodic, decision,
  requirement, constraint, validation claim, material,
  parameter, project status, ADRs, runbooks, backup metadata,
  interactions)
- Cross-layer supremacy rule: project_state > tool-of-origin >
  entities > active memories > source chunks
- Three worked examples showing how the rules apply:
  * "what material does the lateral support pad use?" (KB-CAD
    canonical, project_state override possible)
  * "did we decide to merge the bind mounts?" (Gitea + memory
    both canonical for different aspects)
  * "what's p05's current next focus?" (project_state always
    wins for current state queries)
- Concrete consequences for V1 implementation: Material and
  Parameter are mostly KB-CAD shadows; Decisions / Requirements /
  Constraints / ValidationClaims are AtoCore-canonical; PKM is
  never authoritative; project_state is the override layer;
  the conflict model is the enforcement mechanism
- Out of scope for V1: facts about other people, vendor/cost
  facts, time-bounded facts, cross-project shared facts
- Open questions for V1: how the reviewer sees canonical home
  in the UI, whether entities need an explicit canonical_home
  field, how project_state overrides surface in query results

This is pure doc work. No code, no schema, no behavior changes.
After this commit the engineering planning sprint is 6 of 8 docs
done — only human-mirror-rules and engineering-v1-acceptance
remain.

2026-04-07 06:50:56 -04:00

16 KiB


Representation Authority (canonical home matrix)

Why this document exists

The same fact about an engineering project can show up in many places: a markdown note in the PKM, a structured field in KB-CAD, a commit message in a Gitea repo, an active memory in AtoCore, an entity in the engineering layer, a row in trusted project state. Without an explicit rule about which representation is authoritative for which kind of fact, the system will accumulate contradictions and the human will lose trust in all of them.

This document is the canonical-home matrix. Every kind of fact that AtoCore handles has exactly one authoritative representation, and every other place that holds a copy of that fact is, by definition, a derived view that may be stale.

The representations in scope

Six places where facts can live in this ecosystem:

| Layer | What it is | Who edits it | How it's structured |
|---|---|---|---|
| PKM | Antoine's Obsidian-style markdown vault under `/srv/storage/atocore/sources/vault/` | Antoine, by hand | unstructured markdown with optional frontmatter |
| KB project | the engineering Knowledge Base (KB-CAD / KB-FEM repos and any companion docs) | Antoine, semi-structured | per-tool typed records |
| Gitea repos | source code repos under `dalidou:3000/Antoine/*` (Fullum-Interferometer, polisher-sim, ATOCore itself, ...) | Antoine via git commits | code, READMEs, repo-specific markdown |
| AtoCore memories | rows in the `memories` table | hand-authored or extracted from interactions | typed (identity / preference / project / episodic / knowledge / adaptation) |
| AtoCore entities | rows in the `entities` table (V1, not yet built) | imported from KB exports or extracted from interactions | typed entities + relationships per the V1 ontology |
| AtoCore project state | rows in the `project_state` table (Layer 3, trusted) | hand-curated only, never automatic | category + key + value |

The canonical home rule

For each kind of fact, exactly one of the six representations is the authoritative source. The other five may hold derived copies, but they are not allowed to disagree with the authoritative one. When they disagree, the disagreement is a conflict and surfaces via the conflict model.

The matrix below assigns the authoritative representation per fact kind. It is the practical answer to the question "where does this fact actually live?" for daily decisions.

The canonical-home matrix

| Fact kind | Canonical home | Why | How it gets into AtoCore |
|---|---|---|---|
| CAD geometry (the actual model) | NX (or successor CAD tool) | the only place that can render and validate it | not in AtoCore at all in V1 |
| CAD-side structure (subsystem tree, component list, materials, parameters) | KB-CAD | KB-CAD is the structured wrapper around NX | KB-CAD export → `/ingest/kb-cad/export` → entities |
| FEM mesh & solver settings | KB-FEM (wrapping the FEM tool) | only the solver representation can run | not in AtoCore at all in V1 |
| FEM results & validation outcomes | KB-FEM | KB-FEM owns the outcome records | KB-FEM export → `/ingest/kb-fem/export` → entities |
| Source code | Gitea repos | repos are version-controlled and reviewable | indirectly via repo markdown ingestion (Phase 1) |
| Repo-level documentation (READMEs, design docs in the repo) | Gitea repos | lives next to the code it documents | ingested as source chunks; never hand-edited in AtoCore |
| Project-level prose notes (decisions in long-form, journal-style entries, working notes) | PKM | the place Antoine actually writes when thinking | ingested as source chunks; the extractor proposes candidates from these for the review queue |
| Identity ("the user is a mechanical engineer running AtoCore") | AtoCore memories (`identity` type) | nowhere else holds personal identity | hand-authored via `POST /memory` or extracted from interactions |
| Preference ("prefers small reviewable diffs", "uses SI units") | AtoCore memories (`preference` type) | nowhere else holds personal preferences | hand-authored or extracted |
| Episodic ("on April 6 we debugged the EXDEV bug") | AtoCore memories (`episodic` type) | nowhere else has time-bound personal recall | extracted from captured interactions |
| Decision (a structured engineering decision) | AtoCore entities (Decision) once the engineering layer ships; AtoCore memories (`adaptation`) until then | needs structured supersession, audit trail, and link to affected components | extracted from PKM or interactions; promoted via review queue |
| Requirement | AtoCore entities (Requirement) | needs structured satisfaction tracking | extracted from PKM, KB-CAD, or interactions |
| Constraint | AtoCore entities (Constraint) | needs structured link to the entity it constrains | extracted from PKM, KB-CAD, or interactions |
| Validation claim | AtoCore entities (ValidationClaim) | needs structured link to supporting Result | extracted from KB-FEM exports or interactions |
| Material | KB-CAD if the material is on a real component; AtoCore entity (Material) if it's a project-wide material decision not yet attached to geometry | structured properties live in KB-CAD's material database | KB-CAD export, or hand-authored as a Material entity |
| Parameter | KB-CAD or KB-FEM depending on whether it's a geometry or solver parameter; AtoCore entity (Parameter) if it's a higher-level project parameter not in either tool | structured numeric values with units live in their tool of origin | KB export, or hand-authored |
| Project status / current focus / next milestone | AtoCore project_state (Layer 3) | the trust hierarchy says trusted state is the highest authority for "what is the current state of the project" | hand-curated via `POST /project/state` |
| Architectural decision records (ADRs) | depends on form: long-form ADR markdown lives in the repo; the structured fact about which ADR was selected lives in the AtoCore Decision entity | both representations are useful for different audiences | repo ingestion provides the prose; the entity is created by extraction or hand-authored |
| Operational runbooks | repo (next to the code they describe) | lives with the system it operates | not promoted into AtoCore entities; runbooks are reference material, not facts |
| Backup metadata (snapshot timestamps, integrity status) | the backup-metadata.json files under `/srv/storage/atocore/backups/` | each snapshot is its own self-describing record | not in AtoCore's database; queried via the `/admin/backup` endpoints |
| Conversation history with AtoCore (interactions) | AtoCore `interactions` table | nowhere else has the prompt + context pack + response triple | written by capture (Phase 9 Commit A) |
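
As an illustration only, the unconditional rows of the matrix could be encoded as a lookup table. This is a sketch, not AtoCore code: the `Layer` enum, the fact-kind keys, and both helper functions are hypothetical names invented here. Rows whose canonical home depends on context (Decision, Material, Parameter, ADRs) are deliberately left out.

```python
from enum import Enum


class Layer(Enum):
    """Places a fact can live, per the matrix (values are illustrative labels)."""
    NX = "nx"
    KB_CAD = "kb-cad"
    KB_FEM = "kb-fem"
    GITEA = "gitea-repo"
    PKM = "pkm"
    MEMORY = "atocore-memory"
    ENTITY = "atocore-entity"
    PROJECT_STATE = "atocore-project-state"
    BACKUP_FILES = "backup-metadata-files"
    INTERACTIONS = "atocore-interactions"


# Hypothetical fact-kind keys; only rows with a single unconditional home.
CANONICAL_HOME = {
    "cad_geometry": Layer.NX,
    "cad_structure": Layer.KB_CAD,
    "fem_mesh": Layer.KB_FEM,
    "fem_result": Layer.KB_FEM,
    "source_code": Layer.GITEA,
    "repo_doc": Layer.GITEA,
    "prose_note": Layer.PKM,
    "identity": Layer.MEMORY,
    "preference": Layer.MEMORY,
    "episodic": Layer.MEMORY,
    "requirement": Layer.ENTITY,
    "constraint": Layer.ENTITY,
    "validation_claim": Layer.ENTITY,
    "project_status": Layer.PROJECT_STATE,
    "runbook": Layer.GITEA,
    "backup_metadata": Layer.BACKUP_FILES,
    "interaction": Layer.INTERACTIONS,
}


def canonical_home(fact_kind: str) -> Layer:
    """Return the single authoritative layer for a fact kind.

    Raises KeyError for unknown or conditional kinds so callers cannot guess.
    """
    return CANONICAL_HOME[fact_kind]


def is_derived_copy(fact_kind: str, held_in: Layer) -> bool:
    """A copy held anywhere other than the canonical home is a derived view."""
    return canonical_home(fact_kind) is not held_in
```

For example, `is_derived_copy("cad_structure", Layer.ENTITY)` is true, which matches the matrix: an AtoCore entity holding CAD-side structure is a mirror of KB-CAD.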

The supremacy rule for cross-layer facts

When the same fact has copies in multiple representations and they disagree, the trust hierarchy applies in this order:

1. AtoCore project_state (Layer 3) is the highest authority for any "current state of the project" question. This is why it requires manual curation and is never touched by automatic processes.
2. The tool-of-origin canonical home is the highest authority for facts that are tool-managed: KB-CAD wins over AtoCore entities for CAD-side structure facts; KB-FEM wins for FEM result facts.
3. AtoCore entities are the highest authority for facts that are AtoCore-managed: Decisions, Requirements, Constraints, ValidationClaims (when the supporting Results are still loose).
4. Active AtoCore memories are the highest authority for personal facts (identity, preference, episodic).
5. Source chunks (PKM, repos, ingested docs) are the lowest authority. They are the raw substrate from which higher layers are extracted, but they may be stale or contradict one another.

This is the same hierarchy enforced by conflict-model.md. This document just makes it explicit per fact kind.
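
A minimal sketch of how the ordering above could be applied when several copies of one fact disagree. The layer labels, the `Copy` type, and `resolve` are assumptions made for illustration; the actual enforcement is specified in conflict-model.md, not here.

```python
from dataclasses import dataclass

# Lower rank = higher authority, following the order listed above.
# The string labels are hypothetical, not AtoCore identifiers.
AUTHORITY_RANK = {
    "project_state": 0,
    "tool_of_origin": 1,
    "entity": 2,
    "memory": 3,
    "source_chunk": 4,
}


@dataclass(frozen=True)
class Copy:
    layer: str  # one of the AUTHORITY_RANK keys
    value: str


def resolve(copies: list[Copy]) -> tuple[Copy, list[Copy]]:
    """Return (winner, dissenting copies).

    The winner is the copy from the highest-authority layer present.
    Dissenting copies are those whose value differs from the winner's;
    they are what a reviewer would look at, not something to overwrite silently.
    """
    if not copies:
        raise ValueError("no copies to resolve")
    winner = min(copies, key=lambda c: AUTHORITY_RANK[c.layer])
    dissenting = [c for c in copies if c.value != winner.value]
    return winner, dissenting
```

Returning the dissenting copies alongside the winner keeps the disagreement visible, in line with the rule that disagreements surface through the conflict model.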

Examples

Example 1 — "what material does the lateral support pad use?"

Possible representations:

- KB-CAD has the field `component.lateral-support-pad.material = "GF-PTFE"`
- A PKM note from last month says "considering PEEK for the lateral support, GF-PTFE was the previous choice"
- An AtoCore Material entity says GF-PTFE
- An AtoCore project_state entry says `p05 / decision / lateral_support_material = GF-PTFE`

Which one wins for the question "what's the current material"?

- project_state wins if the query is "what is the current trusted answer for p05's lateral support material" (Layer 3)
- KB-CAD wins if project_state has not been curated for this field yet, because KB-CAD is the canonical home for CAD-side structure
- The Material entity is a derived view from KB-CAD; if it disagrees with KB-CAD, the entity is wrong and a conflict is surfaced
- The PKM note is historical context, not authoritative for "current"
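
The decision in this example can be written out as a small function. This is a sketch under assumptions: `current_material` and its return shape are hypothetical, and each argument stands for "the value this layer holds, or `None` if it holds nothing". The PKM note is intentionally not an input because it is never authoritative for "current".

```python
from typing import Optional


def current_material(
    project_state: Optional[str],
    kb_cad: Optional[str],
    entity: Optional[str],
) -> tuple[Optional[str], str, bool]:
    """Return (answer, source_layer, entity_conflict) for a material lookup."""
    # The Material entity mirrors KB-CAD, so a mismatch is a conflict to surface.
    entity_conflict = (
        entity is not None and kb_cad is not None and entity != kb_cad
    )
    if project_state is not None:
        # Layer 3 is curated by hand and wins whenever it has an entry.
        return project_state, "project_state", entity_conflict
    if kb_cad is not None:
        # KB-CAD is the canonical home for CAD-side structure.
        return kb_cad, "kb-cad", entity_conflict
    if entity is not None:
        # Nothing else knows this material, so the entity is canonical by default.
        return entity, "entity", False
    return None, "none", False
```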

Example 2 — "did we decide to merge the bind mounts?"

Possible representations:

- A working session interaction is captured in the interactions table with the response containing `## Decision: merge the two bind mounts into one`
- The Phase 9 Commit C extractor produced a candidate adaptation memory from that decision
- A reviewer promoted the candidate to active
- The AtoCore source repo has the actual code change in commit `d0ff8b5` and the `docker-compose.yml` is in its post-merge form

Which one wins for "is this decision real and current"?

- The Gitea repo wins for "is this decision implemented": the `docker-compose.yml` is the canonical home for the actual bind mount configuration
- The active adaptation memory wins for "did we decide this"; that's exactly what the Commit C lifecycle is for
- The interaction record is the audit trail. It's authoritative for "when did this conversation happen and what did the LLM say", but not for "is this decision current"
- The source chunks from PKM are not relevant here because no PKM note about this decision exists yet (and that's fine: decisions don't have to live in PKM if they live in the repo and the AtoCore memory)

Example 3 — "what's p05's current next focus?"

Possible representations:

- The PKM has a `current-status.md` note updated last week
- AtoCore project_state has `p05 / status / next_focus = "wave 2 ingestion"`
- A captured interaction from yesterday discussed the next focus at length

Which one wins?

- project_state wins, full stop. The trust hierarchy says Layer 3 is canonical for current state. This is exactly the reason project_state exists.
- The PKM note is historical context.
- The interaction is conversation history.
- If project_state and the PKM disagree, the human updates one or the other to bring them in line, usually by re-curating project_state if the conversation revealed a real change.

What this means for the engineering layer V1 implementation

Several concrete consequences fall out of the matrix:

- The Material and Parameter entity types are mostly KB-CAD shadows in V1. They exist in AtoCore so other entities (Decisions, Requirements) can reference them with structured links, but their authoritative values come from KB-CAD imports. If KB-CAD doesn't know about a material, the AtoCore entity is the canonical home only because nothing else is.
- Decisions / Requirements / Constraints / ValidationClaims are AtoCore-canonical. These don't have a natural home in KB-CAD or KB-FEM. They live in AtoCore as first-class entities with full lifecycle and supersession.
- The PKM is never authoritative. It is the substrate for extraction. The reviewer promotes things out of it; they don't point at PKM notes as the "current truth".
- project_state is the override layer. Whenever the human wants to declare "the current truth is X regardless of what the entities and memories and KB exports say", they curate into project_state. Layer 3 is intentionally small and intentionally manual.
- The conflict model is the enforcement mechanism. When two representations disagree on a fact whose canonical home rule should pick a winner, the conflict surfaces via the `/conflicts` endpoint and the reviewer resolves it. The matrix in this document tells the reviewer who is supposed to win in each scenario, so they are not deciding blind.

What the matrix does NOT define

- Facts about people other than the user. No "team member" entity, no per-collaborator preferences. AtoCore is single-user in V1.
- Facts about AtoCore itself as a project. Those are project memories and project_state entries under `project=atocore`, same lifecycle as any other project's facts.
- Vendor / supplier / cost facts. Out of V1 scope.
- Time-bounded facts (a value that was true between two dates and may not be true now). The current matrix treats all active facts as currently-true and uses supersession to represent change. Temporal facts are a V2 concern.
- Cross-project shared facts (a Material that is reused across p04, p05, and p06). Currently each project has its own copy. Cross-project deduplication is also a V2 concern.

The "single canonical home" invariant in practice

The hard rule that every fact has exactly one canonical home is the load-bearing invariant of this matrix. To enforce it operationally:

- Extraction never duplicates. When the extractor scans an interaction or a source chunk and proposes a candidate, the candidate is dropped if it duplicates an already-active record in the canonical home. The existing extractor implementation already does this for memories; the entity extractor will follow the same pattern.
- Imports never duplicate. When KB-CAD pushes the same Component twice with the same value, the second push is recognized as identical and updates the `last_imported_at` timestamp without creating a new entity.
- Imports surface drift as conflict. When KB-CAD pushes the same Component with a different value, that's a conflict per the conflict model, never a silent overwrite.
- Hand-curation into project_state always wins. A project_state entry can disagree with an entity or a KB export; the project_state entry is correct by fiat (Layer 3 trust), and the reviewer is responsible for bringing the lower layers in line if appropriate.
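
The two import rules can be sketched as follows. Everything here is hypothetical scaffolding (the `EntityRow` and `Store` types, the in-memory dicts, the return strings); only the `last_imported_at` field name comes from the text above. Treating a first-time import as an immediately active row is also an assumption of this sketch.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class EntityRow:
    key: str
    value: str
    status: str = "active"  # "active" or "candidate"
    last_imported_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


@dataclass
class Store:
    active: dict[str, EntityRow] = field(default_factory=dict)
    conflicts: list[tuple[EntityRow, EntityRow]] = field(default_factory=list)


def import_record(store: Store, key: str, value: str) -> str:
    """Apply one pushed record; returns 'created', 'unchanged', or 'conflict'."""
    now = datetime.now(timezone.utc)
    existing = store.active.get(key)
    if existing is None:
        # Assumption: a first import becomes active directly.
        store.active[key] = EntityRow(key, value, last_imported_at=now)
        return "created"
    if existing.value == value:
        # Identical push: refresh the timestamp, create nothing new.
        existing.last_imported_at = now
        return "unchanged"
    # Drift: keep the active row untouched and queue a conflict for review.
    candidate = EntityRow(key, value, status="candidate", last_imported_at=now)
    store.conflicts.append((existing, candidate))
    return "conflict"
```

The drift branch never mutates the active row, which is the point of the third rule: a changed value waits for a human decision.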

Open questions for V1 implementation

- How does the reviewer see the canonical home for a fact in the UI? Probably by including the fact's authoritative layer in the entity / memory detail view: "this Material is currently mirrored from KB-CAD; the canonical home is KB-CAD".
- Who owns running the KB-CAD / KB-FEM exporter? The tool-handoff-boundaries.md doc lists this as an open question; same answer applies here.
- Do we need an explicit `canonical_home` field on entity rows? A field that records whether the entity is canonical here or is a mirror of another layer. Probably yes; deferred to the entity schema spec.
- How are project_state overrides surfaced in the engineering layer query results? When a query (e.g. Q-001 "what does this subsystem contain?") would return entity rows, the result should also flag any project_state entries that contradict the entities, letting the reviewer see the override at query time, not just in the conflict queue.
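
One possible shape for the last question, offered only as a starting point for discussion. The `EntityHit` and `FlaggedHit` types, the `override_value` field, and the idea of keying project_state by the same key as the entity are all assumptions; none of this is decided.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EntityHit:
    key: str
    value: str


@dataclass(frozen=True)
class FlaggedHit:
    key: str
    entity_value: str
    # Set only when a project_state entry contradicts the entity value.
    override_value: Optional[str]


def flag_overrides(
    hits: list[EntityHit], project_state: dict[str, str]
) -> list[FlaggedHit]:
    """Annotate query hits with any contradicting project_state entry."""
    flagged = []
    for hit in hits:
        curated = project_state.get(hit.key)
        contradicts = curated is not None and curated != hit.value
        flagged.append(
            FlaggedHit(hit.key, hit.value, curated if contradicts else None)
        )
    return flagged
```

A hit whose `override_value` is set tells the reviewer at query time that Layer 3 says something different, without waiting for the conflict queue.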

TL;DR

- Six representation layers: PKM, KB project, repos, AtoCore memories, AtoCore entities, AtoCore project_state
- Every fact kind has exactly one canonical home
- The trust hierarchy resolves cross-layer conflicts: project_state > tool-of-origin (KB-CAD/KB-FEM) > entities > active memories > source chunks
- Decisions / Requirements / Constraints / ValidationClaims are AtoCore-canonical (no other system has a natural home for them)
- Materials / Parameters / CAD-side structure are KB-CAD-canonical when tool-managed (solver Parameters live in KB-FEM; project-wide Materials and Parameters not in either tool are AtoCore entities)
- FEM results / validation outcomes are KB-FEM-canonical
- project_state is the human override layer, top of the hierarchy, manually curated only
- Conflicts surface via `/conflicts` and the reviewer applies the matrix to pick a winner
