ATOCore/docs/architecture/representation-authority.md
Anto01 368adf2ebc docs(arch): tool-handoff-boundaries + representation-authority
Session 3 of the four-session plan. Two more engineering planning
docs that lock in the most contentious architectural decisions
before V1 implementation begins.

docs/architecture/tool-handoff-boundaries.md
--------------------------------------------
Locks in the V1 read/write relationship with external tools:

- AtoCore is a one-way mirror in V1. External tools push,
  AtoCore reads, AtoCore never writes back.
- Per-tool stance table covering KB-CAD, KB-FEM, NX, PKM, Gitea
  repos, OpenClaw, AtoDrive, PLM/vendor systems
- Two new ingest endpoints proposed for V1:
  POST /ingest/kb-cad/export and POST /ingest/kb-fem/export
- Sketch JSON shapes for both exports (intentionally minimal,
  to be refined in dedicated schema docs during implementation)
- Drift handling: KB-CAD changes a value -> creates an entity
  candidate -> existing active becomes a conflict member ->
  human resolves via the conflict model
- Hard-line invariants V1 will not cross: no write to external
  tools, no live polling, no silent merging, no schema fan-out,
  no external-tool-specific logic in entity types
- Why not bidirectional: schema drift, conflict semantics, trust
  hierarchy, velocity, reversibility
- V2+ deferred items: selective write-back annotations, light
  polling, direct NX integration, cost/vendor/PLM connections
- Open questions for the implementation sprint: schema location,
  who runs the exporter, full-vs-incremental, exporter auth

docs/architecture/representation-authority.md
---------------------------------------------
The canonical-home matrix that says where each kind of fact
actually lives:

- Six representation layers identified: PKM, KB project,
  Gitea repos, AtoCore memories, AtoCore entities, AtoCore
  project_state
- The hard rule: every fact kind has exactly one canonical
  home; other layers may hold derived copies but never disagree
- Comprehensive matrix covering 21 fact kinds (CAD geometry,
  CAD-side structure, FEM mesh, FEM results, code, repo docs,
  PKM prose, identity, preference, episodic, decision,
  requirement, constraint, validation claim, material,
  parameter, project status, ADRs, runbooks, backup metadata,
  interactions)
- Cross-layer supremacy rule: project_state > tool-of-origin >
  entities > active memories > source chunks
- Three worked examples showing how the rules apply:
  * "what material does the lateral support pad use?" (KB-CAD
    canonical, project_state override possible)
  * "did we decide to merge the bind mounts?" (Gitea + memory
    both canonical for different aspects)
  * "what's p05's current next focus?" (project_state always
    wins for current state queries)
- Concrete consequences for V1 implementation: Material and
  Parameter are mostly KB-CAD shadows; Decisions / Requirements /
  Constraints / ValidationClaims are AtoCore-canonical; PKM is
  never authoritative; project_state is the override layer;
  the conflict model is the enforcement mechanism
- Out of scope for V1: facts about other people, vendor/cost
  facts, time-bounded facts, cross-project shared facts
- Open questions for V1: how the reviewer sees canonical home
  in the UI, whether entities need an explicit canonical_home
  field, how project_state overrides surface in query results

This is pure doc work. No code, no schema, no behavior changes.
After this commit the engineering planning sprint is 6 of 8 docs
done — only human-mirror-rules and engineering-v1-acceptance
remain.
2026-04-07 06:50:56 -04:00

Representation Authority (canonical home matrix)

Why this document exists

The same fact about an engineering project can show up in many places: a markdown note in the PKM, a structured field in KB-CAD, a commit message in a Gitea repo, an active memory in AtoCore, an entity in the engineering layer, a row in trusted project state. Without an explicit rule about which representation is authoritative for which kind of fact, the system will accumulate contradictions and the human will lose trust in all of them.

This document is the canonical-home matrix. Every kind of fact that AtoCore handles has exactly one authoritative representation, and every other place that holds a copy of that fact is, by definition, a derived view that may be stale.

The representations in scope

Six places where facts can live in this ecosystem:

| Layer | What it is | Who edits it | How it's structured |
|---|---|---|---|
| PKM | Antoine's Obsidian-style markdown vault under /srv/storage/atocore/sources/vault/ | Antoine, by hand | unstructured markdown with optional frontmatter |
| KB project | the engineering Knowledge Base (KB-CAD / KB-FEM repos and any companion docs) | Antoine, semi-structured | per-tool typed records |
| Gitea repos | source code repos under dalidou:3000/Antoine/* (Fullum-Interferometer, polisher-sim, ATOCore itself, ...) | Antoine via git commits | code, READMEs, repo-specific markdown |
| AtoCore memories | rows in the memories table | hand-authored or extracted from interactions | typed (identity / preference / project / episodic / knowledge / adaptation) |
| AtoCore entities | rows in the entities table (V1, not yet built) | imported from KB exports or extracted from interactions | typed entities + relationships per the V1 ontology |
| AtoCore project state | rows in the project_state table (Layer 3, trusted) | hand-curated only, never automatic | category + key + value |

The canonical home rule

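For each kind of fact, exactly one representation is the authoritative source. It is usually one of the six layers above, though the matrix also names a few homes outside them (NX for CAD geometry, the backup metadata files, the interactions table). Every other representation may hold derived copies, but those copies are not allowed to disagree with the authoritative one. When they disagree, the disagreement is a conflict and surfaces via the conflict model.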

The matrix below assigns the authoritative representation per fact kind. It is the practical answer to the question "where does this fact actually live?" for daily decisions.

The canonical-home matrix

| Fact kind | Canonical home | Why | How it gets into AtoCore |
|---|---|---|---|
| CAD geometry (the actual model) | NX (or successor CAD tool) | the only place that can render and validate it | not in AtoCore at all in V1 |
| CAD-side structure (subsystem tree, component list, materials, parameters) | KB-CAD | KB-CAD is the structured wrapper around NX | KB-CAD export → /ingest/kb-cad/export → entities |
| FEM mesh & solver settings | KB-FEM (wrapping the FEM tool) | only the solver representation can run | not in AtoCore at all in V1 |
| FEM results & validation outcomes | KB-FEM | KB-FEM owns the outcome records | KB-FEM export → /ingest/kb-fem/export → entities |
| Source code | Gitea repos | repos are version-controlled and reviewable | indirectly via repo markdown ingestion (Phase 1) |
| Repo-level documentation (READMEs, design docs in the repo) | Gitea repos | lives next to the code it documents | ingested as source chunks; never hand-edited in AtoCore |
| Project-level prose notes (decisions in long-form, journal-style entries, working notes) | PKM | the place Antoine actually writes when thinking | ingested as source chunks; the extractor proposes candidates from these for the review queue |
| Identity ("the user is a mechanical engineer running AtoCore") | AtoCore memories (identity type) | nowhere else holds personal identity | hand-authored via POST /memory or extracted from interactions |
| Preference ("prefers small reviewable diffs", "uses SI units") | AtoCore memories (preference type) | nowhere else holds personal preferences | hand-authored or extracted |
| Episodic ("on April 6 we debugged the EXDEV bug") | AtoCore memories (episodic type) | nowhere else has time-bound personal recall | extracted from captured interactions |
| Decision (a structured engineering decision) | AtoCore entities (Decision) once the engineering layer ships; AtoCore memories (adaptation) until then | needs structured supersession, audit trail, and link to affected components | extracted from PKM or interactions; promoted via review queue |
| Requirement | AtoCore entities (Requirement) | needs structured satisfaction tracking | extracted from PKM, KB-CAD, or interactions |
| Constraint | AtoCore entities (Constraint) | needs structured link to the entity it constrains | extracted from PKM, KB-CAD, or interactions |
| Validation claim | AtoCore entities (ValidationClaim) | needs structured link to supporting Result | extracted from KB-FEM exports or interactions |
| Material | KB-CAD if the material is on a real component; AtoCore entity (Material) if it's a project-wide material decision not yet attached to geometry | structured properties live in KB-CAD's material database | KB-CAD export, or hand-authored as a Material entity |
| Parameter | KB-CAD or KB-FEM depending on whether it's a geometry or solver parameter; AtoCore entity (Parameter) if it's a higher-level project parameter not in either tool | structured numeric values with units live in their tool of origin | KB export, or hand-authored |
| Project status / current focus / next milestone | AtoCore project_state (Layer 3) | the trust hierarchy says trusted state is the highest authority for "what is the current state of the project" | hand-curated via POST /project/state |
| Architectural decision records (ADRs) | depends on form: long-form ADR markdown lives in the repo; the structured fact about which ADR was selected lives in the AtoCore Decision entity | both representations are useful for different audiences | repo ingestion provides the prose; the entity is created by extraction or hand-authored |
| Operational runbooks | repo (next to the code they describe) | lives with the system it operates | not promoted into AtoCore entities; runbooks are reference material, not facts |
| Backup metadata (snapshot timestamps, integrity status) | the backup-metadata.json files under /srv/storage/atocore/backups/ | each snapshot is its own self-describing record | not in AtoCore's database; queried via the /admin/backup endpoints |
| Conversation history with AtoCore (interactions) | AtoCore interactions table | nowhere else has the prompt + context pack + response triple | written by capture (Phase 9 Commit A) |
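
The matrix can be read as a lookup from fact kind to authoritative layer. The sketch below is one possible encoding, not an existing AtoCore module: the `Layer` enum, the fact-kind keys, and both helper functions are hypothetical names. It covers only rows with a single unconditional home inside the six layers. Conditional rows (Material, Parameter, Decision, ADRs) and homes outside the six layers would need more than a flat dict.

```python
from enum import Enum


class Layer(Enum):
    """The six representation layers named in this document."""
    PKM = "PKM"
    KB_PROJECT = "KB project"
    GITEA_REPOS = "Gitea repos"
    MEMORIES = "AtoCore memories"
    ENTITIES = "AtoCore entities"
    PROJECT_STATE = "AtoCore project_state"


# Hypothetical fact-kind keys; the mapping itself follows the matrix rows.
CANONICAL_HOME = {
    "cad_side_structure": Layer.KB_PROJECT,   # KB-CAD
    "fem_results": Layer.KB_PROJECT,          # KB-FEM
    "source_code": Layer.GITEA_REPOS,
    "repo_documentation": Layer.GITEA_REPOS,
    "prose_notes": Layer.PKM,
    "identity": Layer.MEMORIES,
    "preference": Layer.MEMORIES,
    "episodic": Layer.MEMORIES,
    "requirement": Layer.ENTITIES,
    "constraint": Layer.ENTITIES,
    "validation_claim": Layer.ENTITIES,
    "project_status": Layer.PROJECT_STATE,
}


def canonical_home(fact_kind: str) -> Layer:
    """Return the authoritative layer for a fact kind, or raise KeyError."""
    return CANONICAL_HOME[fact_kind]


def is_derived_copy(fact_kind: str, held_in: Layer) -> bool:
    """True when a copy held in `held_in` is a derived view, not the source."""
    return canonical_home(fact_kind) is not held_in
```

For instance, `is_derived_copy("cad_side_structure", Layer.ENTITIES)` is true: an AtoCore entity holding CAD-side structure is a mirror of KB-CAD.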

The supremacy rule for cross-layer facts

When the same fact has copies in multiple representations and they disagree, the trust hierarchy applies in this order:

  1. AtoCore project_state (Layer 3) is highest authority for any "current state of the project" question. This is why it requires manual curation and never gets touched by automatic processes.
  2. The tool-of-origin canonical home is highest authority for facts that are tool-managed: KB-CAD wins over AtoCore entities for CAD-side structure facts; KB-FEM wins for FEM result facts.
  3. AtoCore entities are highest authority for facts that are AtoCore-managed: Decisions, Requirements, Constraints, ValidationClaims (when the supporting Results are still loose).
  4. Active AtoCore memories are highest authority for personal facts (identity, preference, episodic).
  5. Source chunks (PKM, repos, ingested docs) are lowest authority. They are the raw substrate from which higher layers are extracted, but they may be stale or contradict one another.

This is the same hierarchy enforced by conflict-model.md. This document just makes it explicit per fact kind.
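
As a rough illustration, the ordering can be flattened into a single rank, as the TL;DR does. This is a simplification: the list above scopes each level to a class of facts, and the sketch ignores that scoping. The layer labels and the `resolve` function are hypothetical and do not come from conflict-model.md.

```python
from __future__ import annotations

# Trust order from the list above, highest authority first.
# The string labels are illustrative, not AtoCore identifiers.
TRUST_ORDER = [
    "project_state",
    "tool_of_origin",
    "entity",
    "active_memory",
    "source_chunk",
]
RANK = {layer: i for i, layer in enumerate(TRUST_ORDER)}


def resolve(copies: dict[str, str]) -> tuple[str, str, bool]:
    """Pick the winning copy of one fact.

    `copies` maps a layer label to the value that layer holds.
    Returns (winning_layer, winning_value, has_conflict), where
    has_conflict is True when any two layers hold different values.
    """
    if not copies:
        raise ValueError("no copies to resolve")
    # Lowest rank index means highest authority.
    winner = min(copies, key=RANK.__getitem__)
    has_conflict = len(set(copies.values())) > 1
    return winner, copies[winner], has_conflict
```

The function reports a winner even when the copies disagree, but it also returns the conflict flag so the caller can route the disagreement to review rather than resolve it silently.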

Examples

Example 1 — "what material does the lateral support pad use?"

Possible representations:

  • KB-CAD has the field component.lateral-support-pad.material = "GF-PTFE"
  • A PKM note from last month says "considering PEEK for the lateral support, GF-PTFE was the previous choice"
  • An AtoCore Material entity says GF-PTFE
  • An AtoCore project_state entry says p05 / decision / lateral_support_material = GF-PTFE

Which one wins for the question "what's the current material"?

  • project_state wins if the query is "what is the current trusted answer for p05's lateral support material" (Layer 3)
  • KB-CAD wins if project_state has not been curated for this field yet, because KB-CAD is the canonical home for CAD-side structure
  • The Material entity is a derived view from KB-CAD; if it disagrees with KB-CAD, the entity is wrong and a conflict is surfaced
  • The PKM note is historical context, not authoritative for "current"

Example 2 — "did we decide to merge the bind mounts?"

Possible representations:

  • A working session interaction is captured in the interactions table with the response containing ## Decision: merge the two bind mounts into one
  • The Phase 9 Commit C extractor produced a candidate adaptation memory from that decision
  • A reviewer promoted the candidate to active
  • The AtoCore source repo has the actual code change in commit d0ff8b5 and the docker-compose.yml is in its post-merge form

Which one wins for "is this decision real and current"?

  • The Gitea repo wins for "is this decision implemented" — the docker-compose.yml is the canonical home for the actual bind mount configuration
  • The active adaptation memory wins for "did we decide this" — that's exactly what the Commit C lifecycle is for
  • The interaction record is the audit trail — it's authoritative for "when did this conversation happen and what did the LLM say", but not for "is this decision current"
  • The source chunks from PKM are not relevant here because no PKM note about this decision exists yet (and that's fine — decisions don't have to live in PKM if they live in the repo and the AtoCore memory)

Example 3 — "what's p05's current next focus?"

Possible representations:

  • The PKM has a current-status.md note updated last week
  • AtoCore project_state has p05 / status / next_focus = "wave 2 ingestion"
  • A captured interaction from yesterday discussed the next focus at length

Which one wins?

  • project_state wins, full stop. The trust hierarchy says Layer 3 is canonical for current state. This is exactly the reason project_state exists.
  • The PKM note is historical context.
  • The interaction is conversation history.
  • If project_state and the PKM disagree, the human updates one or the other to bring them in line — usually by re-curating project_state if the conversation revealed a real change.

What this means for the engineering layer V1 implementation

Several concrete consequences fall out of the matrix:

  1. The Material and Parameter entity types are mostly KB-CAD shadows in V1. They exist in AtoCore so other entities (Decisions, Requirements) can reference them with structured links, but their authoritative values come from KB-CAD imports. If KB-CAD doesn't know about a material, the AtoCore entity is the canonical home only because nothing else is.
  2. Decisions / Requirements / Constraints / ValidationClaims are AtoCore-canonical. These don't have a natural home in KB-CAD or KB-FEM. They live in AtoCore as first-class entities with full lifecycle and supersession.
  3. The PKM is never authoritative. It is the substrate for extraction. The reviewer promotes things out of it; they don't point at PKM notes as the "current truth".
  4. project_state is the override layer. Whenever the human wants to declare "the current truth is X regardless of what the entities and memories and KB exports say", they curate into project_state. Layer 3 is intentionally small and intentionally manual.
  5. The conflict model is the enforcement mechanism. When two representations disagree on a fact whose canonical home rule should pick a winner, the conflict surfaces via the /conflicts endpoint and the reviewer resolves it. The matrix in this document tells the reviewer who is supposed to win in each scenario; they're not making the decision blind.

What the matrix does NOT define

  1. Facts about people other than the user. No "team member" entity, no per-collaborator preferences. AtoCore is single-user in V1.
  2. Facts about AtoCore itself as a project. Those are project memories and project_state entries under project=atocore, same lifecycle as any other project's facts.
  3. Vendor / supplier / cost facts. Out of V1 scope.
  4. Time-bounded facts (a value that was true between two dates and may not be true now). The current matrix treats all active facts as currently-true and uses supersession to represent change. Temporal facts are a V2 concern.
  5. Cross-project shared facts (a Material that is reused across p04, p05, and p06). Currently each project has its own copy. Cross-project deduplication is also a V2 concern.

The "single canonical home" invariant in practice

The hard rule that every fact has exactly one canonical home is the load-bearing invariant of this matrix. To enforce it operationally:

  • Extraction never duplicates. When the extractor scans an interaction or a source chunk and proposes a candidate, the candidate is dropped if it duplicates an already-active record in the canonical home (the existing extractor implementation already does this for memories; the entity extractor will follow the same pattern).
  • Imports never duplicate. When KB-CAD pushes the same Component twice with the same value, the second push is recognized as identical and updates the last_imported_at timestamp without creating a new entity.
  • Imports surface drift as conflict. When KB-CAD pushes the same Component with a different value, that's a conflict per the conflict model — never a silent overwrite.
  • Hand-curation into project_state always wins. A project_state entry can disagree with an entity or a KB export; the project_state entry is correct by fiat (Layer 3 trust), and the reviewer is responsible for bringing the lower layers in line if appropriate.
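
The two import rules can be sketched with an in-memory store. This is a minimal illustration under stated assumptions: `Entity`, `Store`, and `import_record` are hypothetical names, and recording a conflict as a plain tuple stands in for whatever the conflict model actually stores. Only `last_imported_at` is taken from the text above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Entity:
    key: str
    value: str
    last_imported_at: datetime


@dataclass
class Store:
    """Hypothetical in-memory stand-in for the entities table."""
    active: dict[str, Entity] = field(default_factory=dict)
    # Each conflict is (key, active_value, incoming_value).
    conflicts: list[tuple[str, str, str]] = field(default_factory=list)

    def import_record(self, key: str, value: str) -> str:
        """Apply one pushed record; return 'created', 'unchanged', or 'conflict'."""
        now = datetime.now(timezone.utc)
        existing = self.active.get(key)
        if existing is None:
            self.active[key] = Entity(key, value, now)
            return "created"
        if existing.value == value:
            # Identical push: refresh the timestamp, create nothing new.
            existing.last_imported_at = now
            return "unchanged"
        # Drift: keep the active value untouched and queue a conflict for review.
        self.conflicts.append((key, existing.value, value))
        return "conflict"
```

Pushing `"GF-PTFE"` twice for the same key leaves one entity with a refreshed timestamp. A later push of `"PEEK"` leaves the active value as `"GF-PTFE"` and adds one entry to `conflicts`.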

Open questions for V1 implementation

  1. How does the reviewer see the canonical home for a fact in the UI? Probably by including the fact's authoritative layer in the entity / memory detail view: "this Material is currently mirrored from KB-CAD; the canonical home is KB-CAD".
  2. Who owns running the KB-CAD / KB-FEM exporter? The tool-handoff-boundaries.md doc lists this as an open question; same answer applies here.
  3. Do we need an explicit canonical_home field on entity rows? A field that records "this entity is canonical here" vs "this entity is a mirror of another layer". Probably yes; deferred to the entity schema spec.
  4. How are project_state overrides surfaced in the engineering layer query results? When a query (e.g. Q-001 "what does this subsystem contain?") would return entity rows, the result should also flag any project_state entries that contradict the entities — letting the reviewer see the override at query time, not just in the conflict queue.
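
One possible shape for question 4, offered as a sketch only. The function name, the row shape, and the assumption that entity rows and project_state entries share a comparable key are all hypothetical. Nothing here reflects an existing AtoCore API or schema.

```python
def annotate_with_overrides(entity_rows, project_state):
    """Attach any contradicting project_state value to each entity row.

    entity_rows: list of dicts, each with "key" and "value".
    project_state: dict mapping the same keys to hand-curated values.
    Each returned row gains an "override" field: the curated value when
    it contradicts the entity, otherwise None.
    """
    annotated = []
    for row in entity_rows:
        curated = project_state.get(row["key"])
        overridden = curated is not None and curated != row["value"]
        annotated.append({**row, "override": curated if overridden else None})
    return annotated
```

Returning the override alongside the entity row, rather than substituting it, keeps both values visible to the reviewer at query time.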

TL;DR

  • Six representation layers: PKM, KB project, repos, AtoCore memories, AtoCore entities, AtoCore project_state
  • Every fact kind has exactly one canonical home
  • The trust hierarchy resolves cross-layer conflicts: project_state > tool-of-origin (KB-CAD/KB-FEM) > entities > active memories > source chunks
  • Decisions / Requirements / Constraints / ValidationClaims are AtoCore-canonical (no other system has a natural home for them)
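  • Materials / Parameters / CAD-side structure are KB-CAD-canonical when attached to real geometry; solver parameters are KB-FEM-canonical, and project-wide materials or parameters not yet in either tool live as AtoCore entities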
  • FEM results / validation outcomes are KB-FEM-canonical
  • project_state is the human override layer, top of the hierarchy, manually curated only
  • Conflicts surface via /conflicts and the reviewer applies the matrix to pick a winner