ATOCore/docs/architecture/representation-authority.md

# Representation Authority (canonical home matrix)

## Why this document exists

The same fact about an engineering project can show up in many
places: a markdown note in the PKM, a structured field in KB-CAD,
a commit message in a Gitea repo, an active memory in AtoCore, an
entity in the engineering layer, a row in trusted project state.
**Without an explicit rule about which representation is
authoritative for which kind of fact, the system will accumulate
contradictions and the human will lose trust in all of them.**

This document is the canonical-home matrix. Every kind of fact
that AtoCore handles has exactly one authoritative representation,
and every other place that holds a copy of that fact is, by
definition, a derived view that may be stale.

## The representations in scope

Six places where facts can live in this ecosystem:

| Layer | What it is | Who edits it | How it's structured |
|---|---|---|---|
| **PKM** | Antoine's Obsidian-style markdown vault under `/srv/storage/atocore/sources/vault/` | Antoine, by hand | unstructured markdown with optional frontmatter |
| **KB project** | the engineering Knowledge Base (KB-CAD / KB-FEM repos and any companion docs) | Antoine, semi-structured | per-tool typed records |
| **Gitea repos** | source code repos under `dalidou:3000/Antoine/*` (Fullum-Interferometer, polisher-sim, ATOCore itself, ...) | Antoine via git commits | code, READMEs, repo-specific markdown |
| **AtoCore memories** | rows in the `memories` table | hand-authored or extracted from interactions | typed (identity / preference / project / episodic / knowledge / adaptation) |
| **AtoCore entities** | rows in the `entities` table (V1, not yet built) | imported from KB exports or extracted from interactions | typed entities + relationships per the V1 ontology |
| **AtoCore project state** | rows in the `project_state` table (Layer 3, trusted) | hand-curated only, never automatic | category + key + value |

## The canonical home rule

> For each kind of fact, exactly one representation is the
> authoritative source. It is usually one of the six above; a few
> fact kinds (CAD geometry, backup metadata, interactions) live
> outside them. Every other representation may hold derived
> copies, but they are not allowed to disagree with the
> authoritative one. When they disagree, the disagreement is a
> conflict and surfaces via the conflict model.

The matrix below assigns the authoritative representation per fact
kind. It is the practical answer to the question "where does this
fact actually live?" for daily decisions.

## The canonical-home matrix

| Fact kind | Canonical home | Why | How it gets into AtoCore |
|---|---|---|---|
| **CAD geometry** (the actual model) | NX (or successor CAD tool) | the only place that can render and validate it | not in AtoCore at all in V1 |
| **CAD-side structure** (subsystem tree, component list, materials, parameters) | KB-CAD | KB-CAD is the structured wrapper around NX | KB-CAD export → `/ingest/kb-cad/export` → entities |
| **FEM mesh & solver settings** | KB-FEM (wrapping the FEM tool) | only the solver representation can run | not in AtoCore at all in V1 |
| **FEM results & validation outcomes** | KB-FEM | KB-FEM owns the outcome records | KB-FEM export → `/ingest/kb-fem/export` → entities |
| **Source code** | Gitea repos | repos are version-controlled and reviewable | indirectly via repo markdown ingestion (Phase 1) |
| **Repo-level documentation** (READMEs, design docs in the repo) | Gitea repos | lives next to the code it documents | ingested as source chunks; never hand-edited in AtoCore |
| **Project-level prose notes** (decisions in long-form, journal-style entries, working notes) | PKM | the place Antoine actually writes when thinking | ingested as source chunks; the extractor proposes candidates from these for the review queue |
| **Identity** ("the user is a mechanical engineer running AtoCore") | AtoCore memories (`identity` type) | nowhere else holds personal identity | hand-authored via `POST /memory` or extracted from interactions |
| **Preference** ("prefers small reviewable diffs", "uses SI units") | AtoCore memories (`preference` type) | nowhere else holds personal preferences | hand-authored or extracted |
| **Episodic** ("on April 6 we debugged the EXDEV bug") | AtoCore memories (`episodic` type) | nowhere else has time-bound personal recall | extracted from captured interactions |
| **Decision** (a structured engineering decision) | AtoCore **entities** (Decision) once the engineering layer ships; AtoCore memories (`adaptation`) until then | needs structured supersession, audit trail, and link to affected components | extracted from PKM or interactions; promoted via review queue |
| **Requirement** | AtoCore **entities** (Requirement) | needs structured satisfaction tracking | extracted from PKM, KB-CAD, or interactions |
| **Constraint** | AtoCore **entities** (Constraint) | needs structured link to the entity it constrains | extracted from PKM, KB-CAD, or interactions |
| **Validation claim** | AtoCore **entities** (ValidationClaim) | needs structured link to supporting Result | extracted from KB-FEM exports or interactions |
| **Material** | KB-CAD if the material is on a real component; AtoCore entity (Material) if it's a project-wide material decision not yet attached to geometry | structured properties live in KB-CAD's material database | KB-CAD export, or hand-authored as a Material entity |
| **Parameter** | KB-CAD or KB-FEM depending on whether it's a geometry or solver parameter; AtoCore entity (Parameter) if it's a higher-level project parameter not in either tool | structured numeric values with units live in their tool of origin | KB export, or hand-authored |
| **Project status / current focus / next milestone** | AtoCore **project_state** (Layer 3) | the trust hierarchy says trusted state is the highest authority for "what is the current state of the project" | hand-curated via `POST /project/state` |
| **Architectural decision records (ADRs)** | depends on form: long-form ADR markdown lives in the repo; the structured fact about which ADR was selected lives in the AtoCore Decision entity | both representations are useful for different audiences | repo ingestion provides the prose; the entity is created by extraction or hand-authored |
| **Operational runbooks** | repo (next to the code they describe) | lives with the system it operates | not promoted into AtoCore entities — runbooks are reference material, not facts |
| **Backup metadata** (snapshot timestamps, integrity status) | the `backup-metadata.json` files under `/srv/storage/atocore/backups/` | each snapshot is its own self-describing record | not in AtoCore's database; queried via the `/admin/backup` endpoints |
| **Conversation history with AtoCore (interactions)** | AtoCore `interactions` table | nowhere else has the prompt + context pack + response triple | written by capture (Phase 9 Commit A) |
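
The unconditional rows of the matrix can be read as a plain lookup table. The sketch below is illustrative only: the `Home` enum, the fact-kind keys, and both helper functions are hypothetical names, not part of AtoCore. Conditional rows (Material, Parameter, ADRs, Decision) are deliberately left out because they need more context than a key lookup can express.

```python
from enum import Enum


class Home(Enum):
    """Canonical homes named in the matrix (labels are illustrative)."""
    KB_CAD = "kb-cad"
    KB_FEM = "kb-fem"
    GITEA = "gitea-repos"
    PKM = "pkm"
    MEMORIES = "atocore-memories"
    ENTITIES = "atocore-entities"
    PROJECT_STATE = "atocore-project-state"


# Hypothetical fact-kind keys; only rows with a single unconditional
# canonical home are listed here.
CANONICAL_HOME = {
    "cad_structure": Home.KB_CAD,
    "fem_results": Home.KB_FEM,
    "source_code": Home.GITEA,
    "repo_documentation": Home.GITEA,
    "project_prose_notes": Home.PKM,
    "identity": Home.MEMORIES,
    "preference": Home.MEMORIES,
    "episodic": Home.MEMORIES,
    "requirement": Home.ENTITIES,
    "constraint": Home.ENTITIES,
    "validation_claim": Home.ENTITIES,
    "project_status": Home.PROJECT_STATE,
}


def canonical_home(fact_kind: str) -> Home:
    """Return the single canonical home for a fact kind (KeyError if unknown)."""
    return CANONICAL_HOME[fact_kind]


def is_derived_copy(fact_kind: str, held_in: Home) -> bool:
    """A copy held anywhere other than the canonical home is a derived view."""
    return canonical_home(fact_kind) is not held_in
```

For example, `is_derived_copy("cad_structure", Home.ENTITIES)` is true: an AtoCore entity holding CAD-side structure is a mirror of KB-CAD and may be stale.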

## The supremacy rule for cross-layer facts

When the same fact has copies in multiple representations and they
disagree, the trust hierarchy applies in this order:

1. **AtoCore project_state** (Layer 3) is highest authority for any
   "current state of the project" question. This is why it requires
   manual curation and never gets touched by automatic processes.
2. **The tool-of-origin canonical home** is highest authority for
   facts that are tool-managed: KB-CAD wins over AtoCore entities
   for CAD-side structure facts; KB-FEM wins for FEM result facts.
3. **AtoCore entities** are highest authority for facts that are
   AtoCore-managed: Decisions, Requirements, Constraints,
   ValidationClaims (when the supporting Results are still loose).
4. **Active AtoCore memories** are highest authority for personal
   facts (identity, preference, episodic).
5. **Source chunks (PKM, repos, ingested docs)** are lowest
   authority. They are the raw substrate from which higher layers
   are extracted, but they may be stale or contradict one
   another.

This is the same hierarchy enforced by `conflict-model.md`. This
document just makes it explicit per fact kind.
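
As a rough sketch of how the ordering above could be applied, the snippet below ranks copies of one fact by tier and reports the copies that disagree with the winner. It assumes the caller has already mapped each copy to the tier that applies for that fact kind. `Trust`, `Copy`, and `resolve` are hypothetical names, and this is not the implementation described in `conflict-model.md`.

```python
from dataclasses import dataclass
from enum import IntEnum


class Trust(IntEnum):
    """Lower value = higher authority, mirroring the numbered list."""
    PROJECT_STATE = 1
    TOOL_OF_ORIGIN = 2
    ENTITY = 3
    ACTIVE_MEMORY = 4
    SOURCE_CHUNK = 5


@dataclass(frozen=True)
class Copy:
    """One representation's copy of a fact (illustrative shape)."""
    layer: Trust
    value: str


def resolve(copies: list[Copy]) -> tuple[Copy, list[Copy]]:
    """Return the most-trusted copy and the other copies that disagree with it."""
    if not copies:
        raise ValueError("no copies to resolve")
    winner = min(copies, key=lambda c: c.layer)
    conflicts = [c for c in copies if c is not winner and c.value != winner.value]
    return winner, conflicts
```

The second return value matters as much as the first: a disagreeing copy is not silently discarded, it is what a conflict queue would be fed with.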

## Examples

### Example 1 — "what material does the lateral support pad use?"

Possible representations:

- KB-CAD has the field `component.lateral-support-pad.material = "GF-PTFE"`
- A PKM note from last month says "considering PEEK for the
  lateral support, GF-PTFE was the previous choice"
- An AtoCore Material entity says `GF-PTFE`
- An AtoCore project_state entry says `p05 / decision /
  lateral_support_material = GF-PTFE`

Which one wins for the question "what's the current material"?

- **project_state wins** if the query is "what is the current
  trusted answer for p05's lateral support material" (Layer 3)
- **KB-CAD wins** if project_state has not been curated for this
  field yet, because KB-CAD is the canonical home for CAD-side
  structure
- **The Material entity** is a derived view from KB-CAD; if it
  disagrees with KB-CAD, the entity is wrong and a conflict is
  surfaced
- **The PKM note** is historical context, not authoritative for
  "current"

### Example 2 — "did we decide to merge the bind mounts?"

Possible representations:

- A working session interaction is captured in the `interactions`
  table with the response containing `## Decision: merge the two
  bind mounts into one`
- The Phase 9 Commit C extractor produced a candidate adaptation
  memory from that decision
- A reviewer promoted the candidate to active
- The AtoCore source repo has the actual code change in commit
  `d0ff8b5` and the `docker-compose.yml` is in its post-merge form

Which one wins for "is this decision real and current"?

- **The Gitea repo** wins for "is this decision implemented":
  the `docker-compose.yml` is the canonical home for the actual
  bind mount configuration
- **The active adaptation memory** wins for "did we decide this"
  — that's exactly what the Commit C lifecycle is for
- **The interaction record** is the audit trail — it's
  authoritative for "when did this conversation happen and what
  did the LLM say", but not for "is this decision current"
- **The source chunks** from PKM are not relevant here because no
  PKM note about this decision exists yet (and that's fine —
  decisions don't have to live in PKM if they live in the repo
  and the AtoCore memory)

### Example 3 — "what's p05's current next focus?"

Possible representations:

- The PKM has a `current-status.md` note updated last week
- AtoCore project_state has `p05 / status / next_focus = "wave 2 ingestion"`
- A captured interaction from yesterday discussed the next focus
  at length

Which one wins?

- **project_state wins**, full stop. The trust hierarchy says
  Layer 3 is canonical for current state. This is exactly the
  reason project_state exists.
- The PKM note is historical context.
- The interaction is conversation history.
- If project_state disagrees with the PKM note or the captured
  interaction, the human brings them in line, usually by
  re-curating project_state if the newer material revealed a
  real change.

## What this means for the engineering layer V1 implementation

Several concrete consequences fall out of the matrix:

1. **The Material and Parameter entity types are mostly KB-CAD
   shadows in V1.** They exist in AtoCore so other entities
   (Decisions, Requirements) can reference them with structured
   links, but their authoritative values come from KB-CAD imports.
   If KB-CAD doesn't know about a material, the AtoCore entity is
   the canonical home only because nothing else is.
2. **Decisions / Requirements / Constraints / ValidationClaims
   are AtoCore-canonical.** These don't have a natural home in
   KB-CAD or KB-FEM. They live in AtoCore as first-class entities
   with full lifecycle and supersession.
3. **The PKM is never authoritative for structured facts.** It is
   the canonical home only for its own prose notes; for everything
   else it is the substrate for extraction. The reviewer promotes
   things out of it and does not point at PKM notes as the
   "current truth".
4. **project_state is the override layer.** Whenever the human
   wants to declare "the current truth is X regardless of what
   the entities and memories and KB exports say", they curate
   into project_state. Layer 3 is intentionally small and
   intentionally manual.
5. **The conflict model is the enforcement mechanism.** When two
   representations disagree on a fact whose canonical home rule
   should pick a winner, the conflict surfaces via the
   `/conflicts` endpoint and the reviewer resolves it. The
   matrix in this document tells the reviewer who is supposed
   to win in each scenario; they're not making the decision blind.

## What the matrix does NOT define

1. **Facts about people other than the user.** No "team member"
   entity, no per-collaborator preferences. AtoCore is
   single-user in V1.
2. **Facts about AtoCore itself as a project.** Those are project
   memories and project_state entries under `project=atocore`,
   same lifecycle as any other project's facts.
3. **Vendor / supplier / cost facts.** Out of V1 scope.
4. **Time-bounded facts** (a value that was true between two
   dates and may not be true now). The current matrix treats all
   active facts as currently-true and uses supersession to
   represent change. Temporal facts are a V2 concern.
5. **Cross-project shared facts** (a Material that is reused across
   p04, p05, and p06). Currently each project has its own copy.
   Cross-project deduplication is also a V2 concern.

## The "single canonical home" invariant in practice

The hard rule that every fact has exactly one canonical home is
the load-bearing invariant of this matrix. To enforce it
operationally:

- **Extraction never duplicates.** When the extractor scans an
  interaction or a source chunk and proposes a candidate, the
  candidate is dropped if it duplicates an already-active record
  in the canonical home (the existing extractor implementation
  already does this for memories; the entity extractor will
  follow the same pattern).
- **Imports never duplicate.** When KB-CAD pushes the same
  Component twice with the same value, the second push is
  recognized as identical and updates the `last_imported_at`
  timestamp without creating a new entity.
- **Imports surface drift as conflict.** When KB-CAD pushes the
  same Component with a different value, that's a conflict per
  the conflict model — never a silent overwrite.
- **Hand-curation into project_state always wins.** A
  project_state entry can disagree with an entity or a KB
  export; the project_state entry is correct by fiat (Layer 3
  trust), and the reviewer is responsible for bringing the lower
  layers in line if appropriate.
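
The two import rules above (identical pushes refresh a timestamp, differing pushes surface as conflicts) could look roughly like the following. This is a minimal sketch under stated assumptions: the in-memory `store` dict, the `EntityRow` shape, and the `apply_import` function are hypothetical stand-ins for the real `entities` table and importer, which are not yet built. Only the `last_imported_at` name comes from this document.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class EntityRow:
    """Illustrative stand-in for a row in the entities table."""
    key: str
    value: str
    last_imported_at: datetime


def apply_import(store: dict[str, EntityRow], key: str, value: str,
                 now: Optional[datetime] = None) -> str:
    """Apply one KB push; return 'created', 'refreshed', or 'conflict'."""
    now = now or datetime.now(timezone.utc)
    row = store.get(key)
    if row is None:
        # First time this key is seen: create the mirror entity.
        store[key] = EntityRow(key, value, now)
        return "created"
    if row.value == value:
        # Identical push: no new entity, only the timestamp moves.
        row.last_imported_at = now
        return "refreshed"
    # Drift: leave the stored row untouched so nothing is silently
    # overwritten; the caller is expected to record a conflict.
    return "conflict"
```

Returning a status instead of overwriting keeps the decision with the reviewer, which is the point of the "never a silent overwrite" rule.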

## Open questions for V1 implementation

1. **How does the reviewer see the canonical home for a fact in
   the UI?** Probably by including the fact's authoritative
   layer in the entity / memory detail view: "this Material is
   currently mirrored from KB-CAD; the canonical home is KB-CAD".
2. **Who owns running the KB-CAD / KB-FEM exporter?** The
   `tool-handoff-boundaries.md` doc lists this as an open
   question; same answer applies here.
3. **Do we need an explicit `canonical_home` field on entity
   rows?** A field that records "this entity is canonical here"
   vs "this entity is a mirror of <external system>". Probably
   yes; deferred to the entity schema spec.
4. **How are project_state overrides surfaced in the engineering
   layer query results?** When a query (e.g. Q-001 "what does
   this subsystem contain?") would return entity rows, the result
   should also flag any project_state entries that contradict the
   entities — letting the reviewer see the override at query
   time, not just in the conflict queue.
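
One possible shape for question 4, offered only as a starting point for discussion: each returned entity row carries an optional field that is set when a project_state entry for the same key disagrees. The `QueryRow` type, the `annotate_overrides` function, and the assumption that entities and project_state share comparable keys are all hypothetical; the real schema is still open.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueryRow:
    """Illustrative query result row with an override flag."""
    key: str
    entity_value: str
    # Set only when a project_state entry for the same key disagrees.
    project_state_override: Optional[str] = None


def annotate_overrides(entities: dict[str, str],
                       project_state: dict[str, str]) -> list[QueryRow]:
    """Attach contradicting project_state values to entity query results."""
    rows = []
    for key, value in entities.items():
        curated = project_state.get(key)
        override = curated if curated is not None and curated != value else None
        rows.append(QueryRow(key, value, override))
    return rows
```

A row whose `project_state_override` is not `None` tells the reviewer at query time that Layer 3 says something different from the entity.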

## TL;DR

- Six representation layers: PKM, KB project, repos, AtoCore
  memories, AtoCore entities, AtoCore project_state
- Every fact kind has exactly one canonical home
- The trust hierarchy resolves cross-layer conflicts:
  project_state > tool-of-origin (KB-CAD/KB-FEM) > entities >
  active memories > source chunks
- Decisions / Requirements / Constraints / ValidationClaims are
  AtoCore-canonical (no other system has a natural home for them)
- Materials / geometry Parameters / CAD-side structure are
  KB-CAD-canonical
- FEM results / validation outcomes / solver Parameters are
  KB-FEM-canonical
- project_state is the human override layer, top of the
  hierarchy, manually curated only
- Conflicts surface via `/conflicts` and the reviewer applies the
  matrix to pick a winner