docs(arch): memory-vs-entities, promotion-rules, conflict-model

Three planning docs that answer the architectural questions the
engineering query catalog raised. Together with the catalog they
form roughly half of the pre-implementation planning sprint.

docs/architecture/memory-vs-entities.md
---------------------------------------
Resolves the central question blocking every other engineering
layer doc: is a Decision a memory or an entity?

Key decisions:
- memories stay the canonical home for identity, preference, and
  episodic facts
- entities become the canonical home for project, knowledge, and
  adaptation facts once the engineering layer V1 ships
- no concept lives in both layers at full fidelity; one canonical
  home per concept
- a "graduation" flow lets active memories upgrade into entities
  (memory stays as a frozen historical pointer, never deleted)
- one shared candidate review queue across both layers
- context builder budget gains a 15% slot for engineering entities,
  slotted between identity/preference memories and retrieved chunks
- the Phase 9 memory extractor's structural cues (decision heading,
  constraint heading, requirement heading) are explicitly an
  intentional temporary overlap, cleanly migrated via graduation
  when the entity extractor ships

docs/architecture/promotion-rules.md
------------------------------------
Defines the full Layer 0 → Layer 2 pipeline:

- four layers: L0 raw source, L1 memory candidate/active, L2 entity
  candidate/active, L3 trusted project state
- three extraction triggers: on interaction capture (existing),
  on ingestion wave (new, batched per wave), on explicit request
- per-rule prior confidence tuned at write time by structural
  signal (echoes the retriever's high/low signal hints) and
  freshness bonus
- batch cap of 50 candidates per pass to protect the reviewer
- full provenance requirements: every candidate carries rule id,
  source_chunk_id, source_interaction_id, and extractor_version
- reversibility matrix for every promotion step
- explicit no-auto-promotion-in-V1 stance with the schema designed
  so auto-promotion policies can be added later without migration
- the hard invariant: nothing ever moves into L3 automatically
- ingestion-wave extraction produces a report artifact under
  data/extraction-reports/<wave-id>/

docs/architecture/conflict-model.md
-----------------------------------
Defines how AtoCore handles contradictory facts without violating
the "bad memory is worse than no memory" rule.

- conflict = two or more active rows claiming the same slot with
  incompatible values
- per-type "slot key" tuples for both memory and entity types
- cross-layer conflict detection respects the trust hierarchy:
  trusted project state > active entities > active memories
- new conflicts and conflict_members tables (schema proposal)
- detection at two latencies: synchronous at write time,
  asynchronous nightly sweep
- "flag, never block" rule: writes always succeed, conflicts are
  surfaced via /conflicts, /health open_conflicts_count, per-row
  response bodies, and the Human Mirror's disputed marker
- resolution is always human: promote-winner + supersede-others,
  or dismiss-as-not-a-real-conflict, both with audit trail
- explicitly out of scope for V1: cross-project conflicts,
  temporal-overlap conflicts, tolerance-aware numeric comparisons

Also updates:
- master-plan-status.md: Phase 9 moved from "started" to "baseline
  complete" now that Commits A, B, C are all landed
- master-plan-status.md: adds an "Engineering Layer Planning Sprint"
  section listing the doc wave so far and the remaining docs
  (tool-handoff-boundaries, human-mirror-rules,
  representation-authority, engineering-v1-acceptance)
- current-state.md: Phase 9 moved from "not started" to "baseline
  complete" with the A/B/C annotation

This is pure doc work. No code changes, no schema changes, no
behavior changes. Per the working rule in master-plan-status.md:
the architecture docs shape decisions, they do not force premature
schema work.
2026-04-06 21:30:35 -04:00
parent 53147d326c
commit 480f13a6df
5 changed files with 1013 additions and 4 deletions


@@ -0,0 +1,332 @@
# Conflict Model (how AtoCore handles contradictory facts)
## Why this document exists
Any system that accumulates facts from multiple sources — interactions,
ingested documents, repo history, PKM notes — will eventually see
contradictory facts about the same thing. AtoCore's operating model
already has the hard rule:
> **Bad memory is worse than no memory.**
The practical consequence of that rule is: AtoCore must never
silently merge contradictory facts, never silently pick a winner,
and never silently discard evidence. Every conflict must be
surfaced to a human reviewer with full audit context.
This document defines what "conflict" means in AtoCore, how
conflicts are detected, how they are represented, how they are
surfaced, and how they are resolved.
## What counts as a conflict
A conflict exists when two or more facts in the system claim
incompatible values for the same conceptual slot. More precisely:
A conflict is a set of two or more **active** rows (across memories,
entities, project_state) such that:
1. They share the same **target identity** — same entity type and
same semantic key
2. Their **claimed values** are incompatible
3. They are all in an **active** status (not superseded, not
invalid, not candidate)
Examples that are conflicts:
- Two active `Decision` entities affecting the same `Subsystem`
with contradictory values for the same decided field (e.g.
lateral support material = GF-PTFE vs lateral support material = PEEK)
- An active `preference` memory "prefers rebase workflow" and an
active `preference` memory "prefers merge-commit workflow"
- A `project_state` entry `p05 / decision / lateral_support_material = GF-PTFE`
and an active `Decision` entity also claiming the lateral support
material is PEEK (cross-layer conflict)
Examples that are NOT conflicts:
- Two active memories both saying "prefers small diffs" — same
meaning, not contradictory
- An active memory saying X and a candidate memory saying Y —
candidates are not active, so this is part of the review queue
flow, not the conflict flow
- A superseded `Decision` saying X and an active `Decision` saying Y
— supersession is a resolved history, not a conflict
- Two active `Requirement` entities each constraining the same
component in different but compatible ways (e.g. one caps mass,
one caps heat flux) — different fields, no contradiction
## Detection triggers
Conflict detection must fire at every write that could create a new
active fact. That means the following hook points:
1. **`POST /memory` creating an active memory** (legacy path)
2. **`POST /memory/{id}/promote`** (candidate → active)
3. **`POST /entities` creating an active entity** (future)
4. **`POST /entities/{id}/promote`** (candidate → active, future)
5. **`POST /project/state`** (curating trusted state directly)
6. **`POST /memory/{id}/graduate`** (memory → entity graduation,
future — the resulting entity could conflict with something)
Extraction passes do NOT trigger conflict detection at candidate
write time. Candidates are allowed to sit in the queue in an
apparently-conflicting state; the reviewer sees them during
promotion, and conflict detection fires at that moment.
## Detection strategy per layer
### Memory layer
For identity / preference / episodic memories (the ones that stay
in the memory layer):
- Matching key: `(memory_type, project, normalized_content_family)`
- `normalized_content_family` is not a hash of the content — that
would require exact equality — but a slot identifier extracted
by a small per-type rule set:
- identity: slot is "role" / "background" / "credentials"
- preference: slot is the first content word after "prefers" / "uses" / "likes"
normalized to a lowercase noun stem, OR the rule id that extracted it
- episodic: no slot — episodic entries are intrinsically tied to
a moment in time and rarely conflict
A conflict is flagged when two active memories share a
`(memory_type, project, slot)` but have different content bodies.
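The preference-slot rule can be sketched as a small pure function. This is illustrative only: the function name is invented and the "strip a plural s" step is a naive stand-in for real noun stemming.

```python
import re

STOPWORDS = {"a", "an", "the", "to"}

def preference_slot(content: str, rule_id: str) -> str:
    """Slot for a preference memory: the first content word after
    "prefers"/"uses"/"likes", lowercased and crudely stemmed; the
    extracting rule id is the fallback when no trigger verb appears."""
    m = re.search(r"\b(?:prefers|uses|likes)\b\s+(.*)", content, re.IGNORECASE)
    if not m:
        return rule_id
    for word in m.group(1).split():
        w = word.strip(".,;:").lower()
        if w in STOPWORDS:
            continue
        # naive stand-in for real noun stemming: strip a plural "s"
        return w[:-1] if w.endswith("s") and len(w) > 3 else w
    return rule_id
```

Two memories that produce the same slot string with different content bodies are flagged; the rule-id fallback keeps rule-extracted preferences comparable even when no trigger verb is present.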
### Entity layer (V1)
For each V1 entity type, the conflict key is a short tuple that
uniquely identifies the "slot" that entity is claiming:
| Entity type | Conflict slot |
|-------------------|-------------------------------------------------------|
| Project | `(project_id)` |
| Subsystem | `(project_id, subsystem_name)` |
| Component | `(project_id, subsystem_name, component_name)` |
| Requirement | `(project_id, requirement_key)` |
| Constraint | `(project_id, constraint_target, constraint_kind)` |
| Decision | `(project_id, decision_target, decision_field)` |
| Material | `(project_id, component_id)` |
| Parameter | `(project_id, parameter_scope, parameter_name)` |
| AnalysisModel | `(project_id, subsystem_id, model_name)` |
| Result | `(project_id, analysis_model_id, result_key)` |
| ValidationClaim | `(project_id, claim_key)` |
| Artifact | no conflict detection — artifacts are additive |
A conflict is two active entities with the same slot but
different structural values. The exact "which fields count as
structural" list is per-type and lives in the entity schema doc
(not yet written — tracked as future `engineering-ontology-v1.md`
updates).
### Cross-layer (memory vs entity vs trusted project state)
Trusted project state trumps active entities trumps active
memories. This is the trust hierarchy from the operating model.
Cross-layer conflict detection works by a nightly job that walks
the three layers and flags any slot that has entries in more than
one layer with incompatible values:
- If trusted project state and an entity disagree: the entity is
flagged; trusted state is assumed correct
- If an entity and a memory disagree: the memory is flagged; the
entity is assumed correct
- If trusted state and a memory disagree: the memory is flagged;
trusted state is assumed correct
In all three cases the lower-trust row gets a `conflicts_with`
reference pointing at the higher-trust row but does NOT auto-move
to superseded. The flag is an alert, not an action.
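The flag-assignment rule reduces to "everything below the single highest-trust member gets an alert". A sketch, with illustrative row shapes:

```python
LAYER_TRUST = {"memory": 1, "entity": 2, "project_state": 3}

def flag_lower_trust(members: list[dict]) -> tuple[str, list[str]]:
    """Return (assumed-correct member id, ids to flag). Every member
    below the single highest-trust row gets a conflicts_with alert;
    no row changes status. Row shapes here are illustrative."""
    ranked = sorted(members, key=lambda m: LAYER_TRUST[m["layer"]], reverse=True)
    winner, losers = ranked[0], ranked[1:]
    return winner["id"], [m["id"] for m in losers]
```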
## Representation
Conflicts are represented as rows in a new `conflicts` table
(V1 schema, not yet shipped):
```sql
CREATE TABLE conflicts (
    id          TEXT PRIMARY KEY,
    detected_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    slot_kind   TEXT NOT NULL,    -- "memory_slot" | "entity_slot" | "cross_layer"
    slot_key    TEXT NOT NULL,    -- JSON-encoded tuple identifying the slot
    project     TEXT DEFAULT '',
    status      TEXT NOT NULL DEFAULT 'open',  -- open | resolved | dismissed
    resolved_at DATETIME,
    resolution  TEXT DEFAULT ''   -- free text from the reviewer
    -- links to conflicting rows live in conflict_members
);
-- at most one OPEN conflict per slot; resolved/dismissed history is unlimited
CREATE UNIQUE INDEX conflicts_one_open_per_slot
    ON conflicts(slot_kind, slot_key) WHERE status = 'open';
CREATE TABLE conflict_members (
    conflict_id        TEXT NOT NULL REFERENCES conflicts(id) ON DELETE CASCADE,
    member_kind        TEXT NOT NULL,     -- "memory" | "entity" | "project_state"
    member_id          TEXT NOT NULL,
    member_layer_trust INTEGER NOT NULL,  -- 1=memory, 2=entity, 3=project_state
    PRIMARY KEY (conflict_id, member_kind, member_id)
);
```
Constraint rationale:
- the partial unique index on `(slot_kind, slot_key) WHERE status='open'`
prevents duplicate "conflict already open for this slot" rows. At most
one open conflict exists per slot at a time; new conflicting rows are
added as members to the existing conflict, not as a new conflict. (A
plain `UNIQUE(slot_kind, slot_key, status)` would not work: it would
also forbid a second *resolved* conflict on a slot that conflicts
again later.)
- `conflict_members.member_layer_trust` is denormalized so the
conflict resolution UI can sort conflicting rows by trust tier
without re-querying.
- `status='dismissed'` exists separately from `resolved` because
"the reviewer looked at this and declared it not a real conflict"
is a valid distinct outcome (the two rows really do describe
different things and the detector was overfitting).
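The open-or-join write path implied by that index can be sketched against a trimmed-down copy of the schema (columns reduced for brevity; the partial unique index carries the one-open-conflict-per-slot rule):

```python
import sqlite3
import uuid

# Trimmed-down schema for illustration only.
SCHEMA = """
CREATE TABLE conflicts (
    id        TEXT PRIMARY KEY,
    slot_kind TEXT NOT NULL,
    slot_key  TEXT NOT NULL,
    status    TEXT NOT NULL DEFAULT 'open'
);
CREATE UNIQUE INDEX one_open_per_slot
    ON conflicts(slot_kind, slot_key) WHERE status = 'open';
CREATE TABLE conflict_members (
    conflict_id TEXT NOT NULL REFERENCES conflicts(id),
    member_kind TEXT NOT NULL,
    member_id   TEXT NOT NULL,
    PRIMARY KEY (conflict_id, member_kind, member_id)
);
"""

def record_conflict(db, slot_kind, slot_key, members):
    """Open a conflict for the slot, or attach new members to the
    already-open one. Returns the conflict id either way."""
    row = db.execute(
        "SELECT id FROM conflicts WHERE slot_kind=? AND slot_key=? "
        "AND status='open'", (slot_kind, slot_key)).fetchone()
    if row:
        conflict_id = row[0]
    else:
        conflict_id = str(uuid.uuid4())
        db.execute("INSERT INTO conflicts (id, slot_kind, slot_key) "
                   "VALUES (?, ?, ?)", (conflict_id, slot_kind, slot_key))
    for member_kind, member_id in members:
        db.execute("INSERT OR IGNORE INTO conflict_members VALUES (?, ?, ?)",
                   (conflict_id, member_kind, member_id))
    return conflict_id
```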
## API shape
```
GET /conflicts list open conflicts
GET /conflicts?status=resolved list resolved conflicts
GET /conflicts?project=p05-interferometer scope by project
GET /conflicts/{id} full detail including all members
POST /conflicts/{id}/resolve mark resolved with notes
body: {
"resolution_notes": "...",
"winner_member_id": "...", # optional: if specified,
# other members are auto-superseded
"action": "supersede_others" # or "no_action" if reviewer
# wants to resolve without touching rows
}
POST /conflicts/{id}/dismiss mark dismissed ("not a real conflict")
body: {
"reason": "..."
}
```
Conflict detection must also surface in existing endpoints:
- `GET /memory/{id}` — response includes a `conflicts` array if
the memory is a member of any open conflict
- `GET /entities/{type}/{id}` (future) — same
- `GET /health` — includes `open_conflicts_count` so the operator
sees at a glance that review is pending
## Supersession as a conflict resolution tool
When the reviewer resolves a conflict with `action: "supersede_others"`,
the winner stays active and every other member is flipped to
status="superseded" with a `superseded_by` pointer to the winner.
This is the normal path: "we used to think X, now we know Y, flag
X as superseded so the audit trail keeps X visible but X no longer
influences context".
The conflict resolution audit record links back to all superseded
members, so the conflict history itself is queryable:
- "Show me every conflict that touched Subsystem X"
- "Show me every Decision that superseded another Decision because
of a conflict"
These are entries in the V1 query catalog (see Q-014 decision history).
## Detection latency
Conflict detection runs at two latencies:
1. **Synchronous (at write time)** — every create/promote/update of
an active row in a conflict-enabled type runs a synchronous
same-layer detector. If a conflict is detected the write still
succeeds but a row is inserted into `conflicts` and the API
response includes a `conflict_id` field so the caller knows
immediately.
2. **Asynchronous (nightly sweep)** — a scheduled job walks all
three layers looking for cross-layer conflicts that slipped
past write-time detection (e.g. a memory that was already
active before an entity with the same slot was promoted). The
sweep also looks for slot overlaps that the synchronous
detector can't see because the slot key extraction rules have
improved since the row was written.
Both paths write to the same `conflicts` table and both are
surfaced in the same review queue.
## The "flag, never block" rule
Detection **never** blocks writes. The operating rule is:
- If the write is otherwise valid (schema, permissions, trust
hierarchy), accept it
- Log the conflict
- Surface it to the reviewer
- Let the system keep functioning with the conflict in place
The alternative — blocking writes on conflict — would mean that
one stale fact could prevent all future writes until manually
resolved, which in practice makes the system unusable for normal
work. The "flag, never block" rule keeps AtoCore responsive while
still making conflicts impossible to ignore (the `/health`
endpoint's `open_conflicts_count` makes them loud).
The one exception: writing to `project_state` (layer 3) when an
open conflict already exists on that slot will return a warning
in the response body. The write still happens, but the reviewer
is explicitly told "you just wrote to a slot that has an open
conflict". This is the highest-trust layer so we want extra
friction there without actually blocking.
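The write path implied by the rule is: persist first, detect second, report in the body. A sketch with a dict-backed stand-in for the real persistence layer (all names here are invented):

```python
class InMemoryStore:
    """Dict-backed stand-in for the real persistence layer;
    names and shapes are illustrative only."""
    def __init__(self):
        self.rows = {}
        self.open_conflicts = {}          # slot -> conflict id

    def save(self, row):
        self.rows[row["id"]] = row        # the write ALWAYS lands

    def clashing_rows(self, row):
        return [r for r in self.rows.values()
                if r["id"] != row["id"]
                and r["slot"] == row["slot"]
                and r["status"] == "active"]

    def open_conflict(self, slot):
        return self.open_conflicts.setdefault(
            slot, f"c-{len(self.open_conflicts) + 1}")

def write_active_row(store, row):
    """Flag, never block: persist first, detect second, and report
    any conflict in the response body rather than failing the call."""
    store.save(row)
    response = {"id": row["id"], "status": "created"}
    if store.clashing_rows(row):
        response["conflict_id"] = store.open_conflict(row["slot"])
    return response
```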
## Showing conflicts in the Human Mirror
When the Human Mirror template renders a project overview, any
open conflict in that project shows as a **"⚠ disputed"** marker
next to the affected field, with a link to the conflict detail.
This makes conflicts visible to anyone reading the derived
human-facing pages, not just to reviewers who think to check the
`/conflicts` endpoint.
The Human Mirror render rules (not yet written — tracked as future
`human-mirror-rules.md`) will specify exactly where and how the
disputed marker appears.
## What this document does NOT solve
1. **Automatic conflict resolution.** No policy will ever
automatically promote one conflict member over another. The
trust hierarchy is an *alert ordering* for reviewers, not an
auto-resolve rule. The human signs off on every resolution.
2. **Cross-project conflicts.** If p04 and p06 both have
entities claiming conflicting things about a shared component,
that is currently out of scope because the V1 slot keys all
include `project_id`. Cross-project conflict detection is a
future concern that needs its own slot key strategy.
3. **Temporal conflicts with partial overlap.** If a fact was
true during a time window and another fact is true in a
different time window, that is not a conflict — it's history.
Representing time-bounded facts is deferred to a future
temporal-entities doc.
4. **Probabilistic "soft" conflicts.** If two entities claim the
same slot with slightly different values (e.g. "4.8 kg" vs
"4.82 kg"), is that a conflict? For V1, yes — the string
values are unequal so they're flagged. Tolerance-aware
numeric comparisons are a V2 concern.
## TL;DR
- Conflicts = two or more active rows claiming the same slot with
incompatible values
- Detection fires on every active write AND in a nightly sweep
- Conflicts are stored in a dedicated `conflicts` table with a
`conflict_members` join
- Resolution is always human (promote-winner / supersede-others
/ dismiss-as-not-a-conflict)
- "Flag, never block" — writes always succeed, conflicts are
surfaced via `/conflicts`, `/health`, per-entity responses, and
the Human Mirror
- Trusted project state is the top of the trust hierarchy and is
assumed correct in any cross-layer conflict until the reviewer
says otherwise


@@ -0,0 +1,309 @@
# Memory vs Entities (Engineering Layer V1 boundary)
## Why this document exists
The engineering layer introduces a new representation — typed
entities with explicit relationships — alongside AtoCore's existing
memory system and its six memory types. The question that blocks
every other engineering-layer planning doc is:
> When we extract a fact from an interaction or a document, does it
> become a memory, an entity, or both? And if both, which one is
> canonical?
Without an answer, the rest of the engineering layer cannot be
designed. This document is the answer.
## The short version
- **Memories stay.** They are still the canonical home for
*unstructured, attributed, personal, natural-language* facts.
- **Entities are new.** They are the canonical home for *structured,
typed, relational, engineering-domain* facts.
- **No concept lives in both at full fidelity.** Every concept has
exactly one canonical home. The other layer may hold a pointer or
a rendered view, never a second source of truth.
- **The two layers share one review queue.** Candidates from
extraction flow into the same `status=candidate` lifecycle
regardless of whether they are memory-bound or entity-bound.
- **Memories can "graduate" into entities** when enough structure has
accumulated, but the upgrade is an explicit, logged promotion, not
a silent rewrite.
## The split per memory type
The six memory types from the current Phase 2 implementation each
map to exactly one outcome in V1:
| Memory type | V1 destination | Rationale |
|---------------|-------------------------------|-------------------------------------------------------------------------------------------------------------|
| identity | **memory only** | Always about the human user. No engineering domain structure. Never gets entity-shaped. |
| preference | **memory only** | Always about the human user's working style. Same reasoning. |
| episodic | **memory only** | "What happened in this conversation / this day." Attribution and time are the point, not typed structure. |
| knowledge | **entity when possible**, memory otherwise | If the knowledge maps to a typed engineering object (material property, constant, tolerance), it becomes a Fact entity with provenance. If it's loose general knowledge, stays a memory. |
| project | **entity** | Anything that belonged in the "project" memory type is really a Requirement, Constraint, Decision, Subsystem attribute, etc. It belongs in the engineering layer once entities exist. |
| adaptation | **entity (Decision)** | "We decided to X" is literally a Decision entity in the ontology. This is the clearest migration. |
**Practical consequence:** when the engineering layer V1 ships, the
`project`, `knowledge`, and `adaptation` memory types are deprecated
as a canonical home for new facts. Existing rows are not deleted —
they are backfilled as entities through the promotion-rules flow
(see `promotion-rules.md`), and the old memory rows become frozen
references pointing at their graduated entity.
The `identity`, `preference`, and `episodic` memory types continue
to exist exactly as they do today and do not interact with the
engineering layer at all.
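The routing rule in the table above is simple enough to state as code. A sketch only: `is_structured` stands in for the real "maps to a typed engineering object" check, which is not yet specified.

```python
MEMORY_ONLY = {"identity", "preference", "episodic"}
ENTITY_BOUND = {"project", "adaptation"}

def destination(memory_type: str, is_structured: bool = False) -> str:
    """Canonical home for a NEW fact once engineering V1 ships,
    per the table above. `is_structured` is a placeholder for the
    real "maps to a typed engineering object" check."""
    if memory_type in MEMORY_ONLY:
        return "memory"
    if memory_type in ENTITY_BOUND:
        return "entity"
    if memory_type == "knowledge":
        return "entity" if is_structured else "memory"
    raise ValueError(f"unknown memory type: {memory_type}")
```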
## What "canonical home" actually means
A concept's canonical home is the single place where:
- its *current active value* is stored
- its *status lifecycle* is managed (active/superseded/invalid)
- its *confidence* is tracked
- its *provenance chain* is rooted
- edits, supersessions, and invalidations are applied
- conflict resolution is arbitrated
Everything else is a derived view of that canonical row.
If a `Decision` entity is the canonical home for "we switched to
GF-PTFE pads", then:
- there is no `adaptation` memory row with the same content; the
extractor creates a `Decision` candidate directly
- the context builder, when asked to include relevant state, reaches
into the entity store via the engineering layer, not the memory
store
- if the user wants to see "recent decisions" they hit the entity
API, never the memory API
- if they want to invalidate the decision, they do so via the entity
API
The memory API remains the canonical home for `identity`,
`preference`, and `episodic` — same rules, just a different set of
types.
## Why not a unified table with a `kind` column?
It would be simpler to implement. It is rejected for three reasons:
1. **Different query shapes.** Memories are queried by type, project,
confidence, recency. Entities are queried by type, relationships,
graph traversal, coverage gaps ("orphan requirements"). Cramming
both into one table forces the schema to be the union of both
worlds and makes each query slower.
2. **Different lifecycles.** Memories have a simple four-state
lifecycle (candidate/active/superseded/invalid). Entities have
the same four states *plus* per-relationship supersession,
per-field versioning for the killer correctness queries, and
structured conflict flagging. The unified table would have to
carry all entity apparatus for every memory row.
3. **Different provenance semantics.** A preference memory is
provenanced by "the user told me" — one author, one time.
An entity like a `Requirement` is provenanced by "this source
chunk + this source document + these supporting Results" — a
graph. The tables want to be different because their provenance
models are different.
So: two tables, one review queue, one promotion flow, one trust
hierarchy.
## The shared review queue
Both the memory extractor (Phase 9 Commit C, already shipped) and
the future entity extractor write into the same conceptual queue:
everything lands at `status=candidate` in its own table, and the
human reviewer sees a unified list. The reviewer UI (future work)
shows candidates of all kinds side by side, grouped by source
interaction / source document, with the rule that fired.
From the data side this means:
- the memories table gets a `candidate` status (**already done in
Phase 9 Commit B/C**)
- the future entities table will get the same `candidate` status
- both tables get the same `promote` / `reject` API shape: one verb
per candidate, with an audit log entry
Implementation note: the API routes should evolve from
`POST /memory/{id}/promote` to `POST /candidates/{id}/promote` once
both tables exist, so the reviewer tooling can treat them
uniformly. The current memory-only route stays in place for
backward compatibility and is aliased by the unified route.
## Memory-to-entity graduation
Even though the split is clean on paper, real usage will reveal
memories that deserve to be entities but started as plain text.
Four signals are good candidates for proposing graduation:
1. **Reference count crosses a threshold.** A memory that has been
reinforced 5+ times across multiple interactions is a strong
signal that it deserves structure.
2. **Memory content matches a known entity template.** If a
`knowledge` memory's content matches the shape "X = value [unit]"
it can be proposed as a `Fact` or `Parameter` entity.
3. **A user explicitly asks for promotion.** `POST /memory/{id}/graduate`
is the simplest explicit path — it returns a proposal for an
entity structured from the memory's content, which the user can
accept or reject.
4. **Extraction pass proposes an entity that happens to match an
existing memory.** The entity extractor, when scanning a new
interaction, sees the same content already exists as a memory
and proposes graduation as part of its candidate output.
The graduation flow is:
```
memory row (active, confidence C)
|
| propose_graduation()
v
entity candidate row (candidate, confidence C)
+
memory row gets status="graduated" and a forward pointer to the
entity candidate
|
| human promotes the candidate entity
v
entity row (active)
+
memory row stays "graduated" permanently (historical record)
```
The memory is never deleted. It becomes a frozen historical
pointer to the entity it became. This keeps the audit trail intact
and lets the Human Mirror show "this decision started life as a
memory on April 2, was graduated to an entity on April 15, now has
2 supporting ValidationClaims".
The `graduated` status is a new memory status added when the
graduation flow is implemented. For now (Phase 9) no memory carries
it: the three non-graduating types (identity/preference/episodic)
never will, and the three graduating types (project/knowledge/
adaptation) stay in their current memory-only state until the
engineering layer ships.
## Context pack assembly after the split
The context builder today (`src/atocore/context/builder.py`) pulls:
1. Trusted Project State
2. Identity + Preference memories
3. Retrieved chunks
After the split, it pulls:
1. Trusted Project State (unchanged)
2. **Identity + Preference memories** (unchanged — these stay memories)
3. **Engineering-layer facts relevant to the prompt**, queried through
the entity API (new)
4. Retrieved chunks (unchanged, lowest trust)
Note the ordering: identity/preference memories stay above entities,
because personal style information is always more trusted than
extracted engineering facts. Entities sit below the personal layer
but above raw retrieval, because they have structured provenance
that raw chunks lack.
The budget allocation gains a new slot:
- trusted project state: 20% (unchanged, highest trust)
- identity memories: 5% (unchanged)
- preference memories: 5% (unchanged)
- **engineering entities: 15%** (new — pulls only V1-required
objects relevant to the prompt)
- retrieval: 55% (reduced from 70% to make room)
These are starting numbers. After the engineering layer ships and
real usage tunes retrieval quality, these will be revisited.
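The split is a straight proportional allocation over the context window. A sketch (the real builder may round or rebalance differently):

```python
# Starting shares from the list above; subject to tuning after V1.
BUDGET_SHARES = {
    "trusted_project_state": 0.20,
    "identity_memories":     0.05,
    "preference_memories":   0.05,
    "engineering_entities":  0.15,   # the new slot
    "retrieval":             0.55,   # reduced from 0.70
}

def token_budget(total_tokens: int) -> dict:
    """Split a context window into per-section token budgets.
    Illustrative: the real builder may round differently."""
    assert abs(sum(BUDGET_SHARES.values()) - 1.0) < 1e-9
    return {section: int(total_tokens * share)
            for section, share in BUDGET_SHARES.items()}
```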
## What the shipped memory types still mean after the split
| Memory type | Still accepts new writes? | V1 destination for new extractions |
|-------------|---------------------------|------------------------------------|
| identity | **yes** | memory (no change) |
| preference | **yes** | memory (no change) |
| episodic | **yes** | memory (no change) |
| knowledge | yes, but only for loose facts | entity (Fact / Parameter) for structured things; memory is a fallback |
| project | **no new writes after engineering V1 ships** | entity (Requirement / Constraint / Subsystem attribute) |
| adaptation | **no new writes after engineering V1 ships** | entity (Decision) |
"No new writes" means the `create_memory` path will refuse to
create new `project` or `adaptation` memories once the engineering
layer V1 ships. Existing rows stay queryable and reinforceable but
new facts of those kinds must become entities. This keeps the
canonical-home rule clean going forward.
The deprecation is deferred: it does not happen until the engineering
layer V1 is demonstrably working against the active project set. Until
then, the existing memory types continue to accept writes so the
Phase 9 loop can be exercised without waiting on the engineering
layer.
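The deferred deprecation is a single gate on the create path. A sketch; the `engineering_v1_shipped` flag is illustrative (the real switch might be a config value or feature flag):

```python
DEPRECATED_AFTER_V1 = {"project", "adaptation"}

def check_create_allowed(memory_type: str,
                         engineering_v1_shipped: bool) -> None:
    """Gate on the create_memory path. Once engineering V1 ships,
    new project/adaptation facts must become entities instead.
    The flag name is illustrative."""
    if engineering_v1_shipped and memory_type in DEPRECATED_AFTER_V1:
        raise ValueError(
            f"new '{memory_type}' memories are deprecated; "
            "create an entity instead")
```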
## Consequences for Phase 9 (what we just built)
The capture loop, reinforcement, and extractor we shipped today
are *memory-facing*. They produce memory candidates, reinforce
memory confidence, and respect the memory status lifecycle. None
of that changes.
When the engineering layer V1 ships, the extractor in
`src/atocore/memory/extractor.py` gets a sibling in
`src/atocore/entities/extractor.py` that uses the same
interaction-scanning approach but produces entity candidates
instead. The `POST /interactions/{id}/extract` endpoint either:
- runs both extractors and returns a combined result, or
- gains a `?target=memory|entities|both` query parameter
and the decision between those two shapes can wait until the
entity extractor actually exists.
Until the entity layer is real, the memory extractor also has to
cover some things that will eventually move to entities (decisions,
constraints, requirements). **That overlap is temporary and
intentional.** Rather than leave those cues unextracted for months
while the entity layer is being built, the memory extractor
surfaces them as memory candidates. Later, a migration pass will
propose graduation on every active memory created by
`decision_heading`, `constraint_heading`, and `requirement_heading`
rules once the entity types exist to receive them.
So: **no rework in Phase 9, no wasted extraction, clean handoff
once the entity layer lands**.
## Open questions this document does NOT answer
These are deliberately deferred to later planning docs:
1. **When exactly does extraction fire?** (answered by
`promotion-rules.md`)
2. **How are conflicts between a memory and an entity handled
during graduation?** (answered by `conflict-model.md`)
3. **Does the context builder traverse the entity graph for
relationship-rich queries, or does it only surface direct facts?**
(answered by the context-builder spec in a future
`engineering-context-integration.md` doc)
4. **What is the exact API shape of the unified candidate review
queue?** (answered by a future `review-queue-api.md` doc when
the entity extractor exists and both tables need one UI)
## TL;DR
- memories = user-facing unstructured facts, still own identity/preference/episodic
- entities = engineering-facing typed facts, own project/knowledge/adaptation
- one canonical home per concept, never both
- one shared candidate-review queue, same promote/reject shape
- graduated memories stay as frozen historical pointers
- Phase 9 stays memory-only and ships today; entity V1 follows the
remaining architecture docs in this planning sprint
- no rework required when the entity layer lands; the current memory
extractor's structural cues get migrated forward via explicit
graduation


@@ -0,0 +1,343 @@
# Promotion Rules (Layer 0 → Layer 2 pipeline)
## Purpose
AtoCore ingests raw human-authored content (markdown, repo notes,
interaction transcripts) and eventually must turn some of it into
typed engineering entities that the V1 query catalog can answer.
The path from raw text to typed entity has to be:
- **explicit**: every step has a named operation, a trigger, and an
audit log
- **reversible**: every promotion can be undone without data loss
- **conservative**: no automatic movement into trusted state; a human
(or later, a very confident policy) always signs off
- **traceable**: every typed entity must carry a back-pointer to
the raw source that produced it
This document defines that path.
## The four layers
Promotion is described in terms of four layers, all of which exist
simultaneously in the system once the engineering layer V1 ships:
| Layer | Name | Canonical storage | Trust | Who writes |
|-------|-------------------|------------------------------------------|-------|------------|
| L0 | Raw source | source_documents + source_chunks | low | ingestion pipeline |
| L1 | Memory candidate | memories (status="candidate") | low | extractor |
| L1' | Active memory | memories (status="active") | med | human promotion |
| L2 | Entity candidate | entities (status="candidate") | low | extractor + graduation |
| L2' | Active entity | entities (status="active") | high | human promotion |
| L3 | Trusted state | project_state | highest | human curation |
Layer 3 (trusted project state) is already implemented and stays
manually curated — automatic promotion into L3 is **never** allowed.
## The promotion graph
```
[L0] source chunks
|
| extraction (memory extractor, Phase 9 Commit C)
v
[L1] memory candidate
|
| promote_memory()
v
[L1'] active memory
|
| (optional) propose_graduation()
v
[L2] entity candidate
|
| promote_entity()
v
[L2'] active entity
|
| (manual curation, NEVER automatic)
v
[L3] trusted project state
```
Short path (direct entity extraction, once the entity extractor
exists):
```
[L0] source chunks
|
| entity extractor
v
[L2] entity candidate
|
| promote_entity()
v
[L2'] active entity
```
A single fact can travel either path depending on what the
extractor saw. The graduation path exists for facts that started
life as memories before the entity layer existed, and for the
memory extractor's structural cues (decisions, constraints,
requirements) which are eventually entity-shaped.
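The two paths above can be condensed into a small transition table. A minimal sketch, assuming the edge set below is an accurate reading of the graphs (the layer labels are the doc's own; the dict shape is illustrative):

```python
# Legal edges in the promotion graph; labels name the operation
# that performs each step. The (src, dst) pairs are read off the
# two ASCII graphs above -- nothing else is a legal promotion.
LEGAL_TRANSITIONS = {
    ("L0", "L1"): "memory extraction",
    ("L0", "L2"): "direct entity extraction (short path)",
    ("L1", "L1'"): "promote_memory()",
    ("L1'", "L2"): "propose_graduation()",
    ("L2", "L2'"): "promote_entity()",
    ("L2'", "L3"): "manual curation, never automatic",
}

def can_promote(src: str, dst: str) -> bool:
    """True only if (src, dst) is an edge in the promotion graph."""
    return (src, dst) in LEGAL_TRANSITIONS
```

Note that `("L1", "L2")` is deliberately absent: a candidate memory must be promoted to active before it can graduate.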
## Triggers (when does extraction fire?)
Phase 9 already shipped one trigger: **on explicit API request**
(`POST /interactions/{id}/extract`). The V1 engineering layer adds
two more:
1. **On interaction capture (automatic)**
- Same event that runs reinforcement today
- Controlled by an `extract` boolean flag on the record request
(default: `false` for memory extractor, `true` once an
engineering extractor exists and has been validated)
- Output goes to the candidate queue; nothing auto-promotes
2. **On ingestion (batched, per wave)**
- After a wave of markdown ingestion finishes, a batch extractor
pass sweeps all newly added source chunks and produces
candidates from them
- Batched per wave (not per chunk) to keep the review queue
digestible and to let the reviewer see all candidates from a
single ingestion in one place
- Output: a report artifact plus a review queue entry per
candidate
3. **On explicit human request (existing)**
- `POST /interactions/{id}/extract` for a single interaction
- Future: `POST /ingestion/wave/{id}/extract` for a whole wave
- Future: `POST /memory/{id}/graduate` to propose graduation
of one specific memory into an entity
Batch size rule: **extraction passes never write more than N
candidates per human review cycle, where N = 50 by default**. If
a pass produces more, it ranks by (rule confidence × content
length × novelty) and only writes the top N. The remaining
candidates are logged, not persisted. This protects the reviewer
from getting buried.
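The cap-and-rank rule can be sketched as follows. Field names (`confidence`, `content`, `seen_before`) are assumptions; the scoring weights mirror the `(rule confidence × content length × novelty)` formula above:

```python
BATCH_CAP = 50  # default N from the batch size rule

def rank_and_cap(candidates: list[dict], cap: int = BATCH_CAP):
    """Return (persisted, logged_only): top `cap` candidates by score."""
    def score(c: dict) -> float:
        # Novelty is an assumed boolean signal; a previously-seen fact
        # scores lower so fresh facts win the limited review slots.
        novelty = 0.5 if c.get("seen_before") else 1.0
        return c["confidence"] * len(c["content"]) * novelty

    ranked = sorted(candidates, key=score, reverse=True)
    return ranked[:cap], ranked[cap:]  # the tail is logged, not persisted
```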
## Confidence and ranking of candidates
Each extraction rule carries a *prior confidence* based on how
specific its pattern is:
| Rule class | Prior | Rationale |
|---------------------------|-------|-----------|
| Heading with explicit type (`## Decision:`) | 0.7 | Very specific structural cue, intentional author marker |
| Typed list item (`- [Decision] ...`) | 0.65 | Explicit but often embedded in looser prose |
| Sentence pattern (`I prefer X`) | 0.5 | Moderate structure, more false positives |
| Regex pattern matching a value+unit (`X = 4.8 kg`) | 0.6 | Structural but prone to coincidence |
| LLM-based (future) | variable | Depends on model's returned confidence |
The candidate's final confidence at write time is:
```
final = prior * structural_signal_multiplier * freshness_bonus
```
Where:
- `structural_signal_multiplier` is 1.1 if the source chunk path
contains any of `_HIGH_SIGNAL_HINTS` from the retriever (status,
decision, requirements, charter, ...) and 0.9 if it contains
`_LOW_SIGNAL_HINTS` (`_archive`, `_history`, ...)
- `freshness_bonus` is 1.05 if the source chunk was updated in the
last 30 days, else 1.0
This formula will be tuned later; the numbers are starting values.
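As a worked sketch of the write-time formula (the hint tuples below are abbreviations of the retriever's actual lists, and the function signature is an assumption):

```python
# Starting values from the doc; the real hint lists live in the retriever.
HIGH_SIGNAL_HINTS = ("status", "decision", "requirements", "charter")
LOW_SIGNAL_HINTS = ("_archive", "_history")

def final_confidence(prior: float, chunk_path: str,
                     days_since_update: int) -> float:
    """final = prior * structural_signal_multiplier * freshness_bonus"""
    if any(h in chunk_path for h in HIGH_SIGNAL_HINTS):
        multiplier = 1.1
    elif any(h in chunk_path for h in LOW_SIGNAL_HINTS):
        multiplier = 0.9
    else:
        multiplier = 1.0
    freshness = 1.05 if days_since_update <= 30 else 1.0
    return prior * multiplier * freshness
```

A heading-rule hit (prior 0.7) in a fresh `decision-log.md` would land at roughly 0.81, while the same rule firing in an `_archive` path more than 30 days old drops to 0.63.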
## Review queue mechanics
### Queue population
- Each candidate writes one row into its target table
(memories or entities) with `status="candidate"`
- Each candidate carries: `rule`, `source_span`, `source_chunk_id`,
`source_interaction_id`, `extractor_version`
- No two candidates ever share the same (type, normalized_content,
project) — if a second extraction pass produces a duplicate, it
is dropped before being written
### Queue surfacing
- `GET /memory?status=candidate` lists memory candidates
- `GET /entities?status=candidate` (future) lists entity candidates
- `GET /candidates` (future unified route) lists both
### Reviewer actions
For each candidate, exactly one of:
- **promote**: `POST /memory/{id}/promote` or
`POST /entities/{id}/promote`
- sets `status="active"`
- preserves the audit trail (source_chunk_id, rule, source_span)
- **reject**: `POST /memory/{id}/reject` or
`POST /entities/{id}/reject`
- sets `status="invalid"`
- preserves audit trail so repeat extractions don't re-propose
- **edit-then-promote**: `PUT /memory/{id}` to adjust content, then
`POST /memory/{id}/promote`
- every edit is logged, original content preserved in a
`previous_content_log` column (schema addition deferred to
the first implementation sprint)
- **defer**: no action; candidate stays in queue indefinitely
(future: add a `pending_since` staleness indicator to the UI)
### Reviewer authentication
In V1 the review queue is single-user by convention. There is no
per-reviewer authorization. Every promote/reject call is logged
with the same default identity. Multi-user review is a V2 concern.
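The reviewer actions reduce to a small set of status transitions. A sketch, assuming an in-memory row and a list-shaped audit log (edit-then-promote is just a content `PUT` followed by the `promote` branch):

```python
def review(candidate: dict, action: str, audit_log: list) -> dict:
    """Apply one reviewer action to a candidate row, in place."""
    if candidate["status"] != "candidate":
        raise ValueError("only candidates can be reviewed")
    if action == "promote":
        candidate["status"] = "active"    # audit fields stay untouched
    elif action == "reject":
        candidate["status"] = "invalid"   # kept on record to block re-proposal
    elif action == "defer":
        pass                              # stays in the queue indefinitely
    else:
        raise ValueError(f"unknown action: {action}")
    audit_log.append({"candidate_id": candidate["id"], "action": action})
    return candidate
```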
## Auto-promotion policies (deferred, but designed for)
The current V1 stance is: **no auto-promotion, ever**. All
promotions require a human reviewer.
The schema and API are designed so that automatic policies can be
added later without schema changes. The anticipated policies:
1. **Reference-count threshold**
- If a candidate accumulates N+ references across multiple
interactions within M days AND the reviewer hasn't seen it yet
(indicating the system sees it often but the human hasn't
gotten to it), propose auto-promote
- Starting thresholds: N=5, M=7 days. Never auto-promote
entity candidates that affect validation claims or decisions
without explicit human review — those are too consequential.
2. **Confidence threshold**
- If `final_confidence >= 0.85` AND the rule is a heading
rule (not a sentence rule), eligible for auto-promotion
3. **Identity/preference lane**
- identity and preference memories extracted from an
interaction where the user explicitly says "I am X" or
"I prefer X" with a first-person subject and high-signal
verb could auto-promote. This is the safest lane because
the user is the authoritative source for their own identity.
None of these run in V1. The APIs and data shape are designed so
they can be added as a separate policy module without disrupting
existing tests.
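A hypothetical eligibility check for such a policy module, purely illustrative (every field name is an assumption, and the "too consequential" caveat from policy 1 is applied to every lane here as a deliberately conservative reading):

```python
# Facts that always require a human, regardless of policy.
CONSEQUENTIAL_TYPES = ("decision", "validation")

def auto_promote_eligible(c: dict) -> bool:
    """Would this candidate qualify under the deferred policies? V1 never calls this."""
    if c["type"] in CONSEQUENTIAL_TYPES:
        return False
    # Policy 1: N=5+ references within M=7 days, reviewer hasn't acted
    if (c["reference_count"] >= 5 and c["reference_window_days"] <= 7
            and not c["reviewed"]):
        return True
    # Policy 2: very confident extraction from a heading rule
    if c["final_confidence"] >= 0.85 and c["rule_class"] == "heading":
        return True
    # Policy 3: first-person identity/preference statements
    if c["type"] in ("identity", "preference") and c.get("first_person"):
        return True
    return False
```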
## Reversibility
Every promotion step must be undoable:
| Operation | How to undo |
|---------------------------|-------------------------------------------------------|
| memory candidate written | delete the candidate row (low-risk, it was never in context) |
| memory candidate promoted | `PUT /memory/{id}` status=candidate (reverts to queue) |
| memory candidate rejected | `PUT /memory/{id}` status=candidate |
| memory graduated | memory stays as a frozen pointer; delete the entity candidate to undo |
| entity candidate promoted | `PUT /entities/{id}` status=candidate |
| entity promoted to active | supersede with a new active, or `PUT` back to candidate |
The only irreversible operation is manual curation into L3
(trusted project state). That is by design — L3 is small, curated,
and human-authored end to end.
## Provenance (what every candidate must carry)
Every candidate row, memory or entity, MUST have:
- `source_chunk_id` — if extracted from ingested content, the chunk it came from
- `source_interaction_id` — if extracted from a captured interaction, the interaction it came from
- `rule` — the extractor rule id that fired
- `extractor_version` — a semver-ish string the extractor module carries
so old candidates can be re-evaluated with a newer extractor
If both `source_chunk_id` and `source_interaction_id` are null, the
candidate was hand-authored (via `POST /memory` directly) and must
be flagged as such. Hand-authored candidates are allowed but
discouraged — the preference is to extract from real content, not
dictate candidates directly.
The active rows inherit all of these fields from their candidate
row at promotion time. They are never overwritten.
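The provenance invariant can be sketched as a classifier over the four required fields (field names follow the list above; raising on a missing audit field is an assumed enforcement choice):

```python
def provenance_kind(candidate: dict) -> str:
    """Classify a candidate row by its back-pointers; raise if incomplete."""
    chunk = candidate.get("source_chunk_id")
    interaction = candidate.get("source_interaction_id")
    if chunk is None and interaction is None:
        return "hand-authored"  # allowed but discouraged; must be flagged
    # Extracted candidates must carry their full extraction audit trail.
    for field in ("rule", "extractor_version"):
        if not candidate.get(field):
            raise ValueError(f"extracted candidate missing {field}")
    return "ingested" if chunk is not None else "captured"
```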
## Extractor versioning
The extractor is going to change — new rules added, old rules
refined, precision/recall tuned over time. The promotion flow
must survive extractor changes:
- every extractor module exposes an `EXTRACTOR_VERSION = "0.1.0"`
constant
- every candidate row records this version
- when the extractor version changes, the change log explains
what the new rules do
- old candidates are NOT automatically re-evaluated by the new
extractor — that would lose the auditable history of why the
old candidate was created
- future `POST /memory/{id}/re-extract` can optionally propose
an updated candidate from the same source chunk with the new
extractor, but it produces a *new* candidate alongside the old
one, never a silent rewrite
## Ingestion-wave extraction semantics
When the batched extraction pass fires on an ingestion wave, it
produces a report artifact:
```
data/extraction-reports/<wave-id>/
├── report.json # summary counts, rule distribution
├── candidates.ndjson # one JSON line per persisted candidate
├── dropped.ndjson # one JSON line per candidate dropped
│ # (over batch cap, duplicate, below
│ # min content length, etc.)
└── errors.log # any rule-level errors
```
The report artifact lives under the configured `data_dir` and is
retained per the backup retention policy. The ingestion-waves doc
(`docs/ingestion-waves.md`) is updated to include an "extract"
step after each wave, with the expectation that the human
reviews the candidates before the next wave fires.
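A minimal sketch of the report writer, assuming the directory layout above and JSON/NDJSON serialization (the function name and summary fields are illustrative; `errors.log` is omitted for brevity):

```python
import json
from pathlib import Path

def write_report(data_dir: str, wave_id: str,
                 kept: list[dict], dropped: list[dict]) -> Path:
    """Write the per-wave extraction report artifact; return its directory."""
    out = Path(data_dir) / "extraction-reports" / wave_id
    out.mkdir(parents=True, exist_ok=True)
    (out / "report.json").write_text(json.dumps(
        {"persisted": len(kept), "dropped": len(dropped)}))
    # One JSON line per candidate, persisted and dropped alike.
    with open(out / "candidates.ndjson", "w") as f:
        for c in kept:
            f.write(json.dumps(c) + "\n")
    with open(out / "dropped.ndjson", "w") as f:
        for c in dropped:
            f.write(json.dumps(c) + "\n")
    return out
```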
## Candidate-to-candidate deduplication across passes
Two extraction passes over the same chunk (or two different
chunks containing the same fact) should not produce two identical
candidate rows. The deduplication key is:
```
(memory_type_or_entity_type, normalized_content, project, status)
```
Normalization strips whitespace variants, lowercases, and drops
trailing punctuation (same rules as the extractor's `_clean_value`
function). If a second pass would produce a duplicate, it instead
increments a `re_extraction_count` column on the existing
candidate row and updates `last_re_extracted_at`. This gives the
reviewer a "saw this N times" signal without flooding the queue.
This column is a future schema addition — current candidates do
not track re-extraction. The promotion-rules implementation will
land the column as part of its first migration.
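The key and normalization described above can be sketched directly; the normalization mirrors the stated rules, not the extractor's actual `_clean_value` code:

```python
import re

def normalize(content: str) -> str:
    """Collapse whitespace variants, lowercase, drop trailing punctuation."""
    content = re.sub(r"\s+", " ", content).strip().lower()
    return content.rstrip(".,;:!")

def dedup_key(candidate: dict) -> tuple:
    """The deduplication key two passes must never both write."""
    return (candidate["type"], normalize(candidate["content"]),
            candidate["project"], candidate["status"])
```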
## The "never auto-promote into trusted state" invariant
Regardless of what auto-promotion policies might exist between
L0 → L2', **nothing ever moves into L3 (trusted project state)
without explicit human action via `POST /project/state`**. This
is the one hard line in the promotion graph and it is enforced
by having no API endpoint that takes a candidate id and writes
to `project_state`.
## Summary
- Four layers: L0 raw, L1 memory candidate/active, L2 entity
candidate/active, L3 trusted state
- Three triggers for extraction: on capture, on ingestion wave, on
explicit request
- Per-rule prior confidence, tuned by structural signals at write time
- Shared candidate review queue, promote/reject/edit/defer actions
- No auto-promotion in V1 (but the schema allows it later)
- Every candidate carries full provenance and extractor version
- Every promotion step is reversible except L3 curation
- L3 is never touched automatically

View File

@@ -19,12 +19,12 @@ now includes a first curated ingestion batch for the active projects.
- Phase 3
- Phase 5
- Phase 7
- Phase 9 (Commits A/B/C: capture, reinforcement, extractor + review queue)
- partial
- Phase 4
- Phase 8
- not started
- Phase 6
- Phase 9
- Phase 10
- Phase 11
- Phase 12

View File

@@ -29,10 +29,10 @@ read-only additive mode.
- Phase 4 - Identity / Preferences
- Phase 8 - OpenClaw Integration
### Started
### Baseline Complete
- Phase 9 - Reflection (Commit A: capture loop in place; Commits B/C
reinforcement and extraction still pending)
- Phase 9 - Reflection (all three foundation commits landed:
A capture, B reinforcement, C candidate extraction + review queue)
### Not Yet Complete In The Intended Sense
@@ -42,6 +42,31 @@ read-only additive mode.
- Phase 12 - Evaluation
- Phase 13 - Hardening
### Engineering Layer Planning Sprint
The engineering layer is intentionally in planning, not implementation.
The architecture docs below are the current state of that planning:
- [engineering-query-catalog.md](architecture/engineering-query-catalog.md) —
the 20 v1-required queries the engineering layer must answer
- [memory-vs-entities.md](architecture/memory-vs-entities.md) —
canonical home split between memory and entity tables
- [promotion-rules.md](architecture/promotion-rules.md) —
Layer 0 → Layer 2 pipeline, triggers, review queue mechanics
- [conflict-model.md](architecture/conflict-model.md) —
detection, representation, and resolution of contradictory facts
- [engineering-knowledge-hybrid-architecture.md](architecture/engineering-knowledge-hybrid-architecture.md) —
the 5-layer model (from the previous planning wave)
- [engineering-ontology-v1.md](architecture/engineering-ontology-v1.md) —
the initial V1 object and relationship inventory (previous wave)
Still to draft before engineering-layer implementation begins:
- tool-handoff-boundaries.md (KB-CAD / KB-FEM read vs write)
- human-mirror-rules.md (templates, triggers, edit flow)
- representation-authority.md (PKM / KB / repo / AtoCore canonical home matrix)
- engineering-v1-acceptance.md (done definition)
## What Is Real Today
- canonical AtoCore runtime on Dalidou