# Promotion Rules (Layer 0 → Layer 2 pipeline)

## Purpose

AtoCore ingests raw human-authored content (markdown, repo notes,
interaction transcripts) and eventually must turn some of it into
typed engineering entities that the V1 query catalog can answer.
The path from raw text to typed entity has to be:

- **explicit**: every step has a named operation, a trigger, and an
  audit log
- **reversible**: every promotion can be undone without data loss
- **conservative**: no automatic movement into trusted state; a human
  (or later, a very confident policy) always signs off
- **traceable**: every typed entity must carry a back-pointer to
  the raw source that produced it

This document defines that path.

## The four layers

Promotion is described in terms of four layers, all of which exist
simultaneously in the system once the engineering layer V1 ships:

| Layer | Name | Canonical storage | Trust | Who writes |
|-------|------------------|------------------------------------|---------|--------------------|
| L0 | Raw source | source_documents + source_chunks | low | ingestion pipeline |
| L1 | Memory candidate | memories (status="candidate") | low | extractor |
| L1' | Active memory | memories (status="active") | med | human promotion |
| L2 | Entity candidate | entities (status="candidate") | low | extractor + graduation |
| L2' | Active entity | entities (status="active") | high | human promotion |
| L3 | Trusted state | project_state | highest | human curation |

Layer 3 (trusted project state) is already implemented and stays
manually curated — automatic promotion into L3 is **never** allowed.

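The layer table can be read as a lookup from layer label to (canonical storage, status). A minimal sketch of that mapping (illustrative only — the real system stores status on the rows themselves, not in a constant):

```python
# Layer label -> (canonical storage table, status value).
# None means the table has no candidate/active status column.
LAYERS = {
    "L0":  ("source_chunks", None),
    "L1":  ("memories", "candidate"),
    "L1'": ("memories", "active"),
    "L2":  ("entities", "candidate"),
    "L2'": ("entities", "active"),
    "L3":  ("project_state", None),
}

def storage_for(layer: str) -> tuple:
    """Return (table, status) for a layer label like "L1'"."""
    return LAYERS[layer]
```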
## The promotion graph

```
[L0] source chunks
  |
  | extraction (memory extractor, Phase 9 Commit C)
  v
[L1] memory candidate
  |
  | promote_memory()
  v
[L1'] active memory
  |
  | (optional) propose_graduation()
  v
[L2] entity candidate
  |
  | promote_entity()
  v
[L2'] active entity
  |
  | (manual curation, NEVER automatic)
  v
[L3] trusted project state
```

Short path (direct entity extraction, once the entity extractor
exists):

```
[L0] source chunks
  |
  | entity extractor
  v
[L2] entity candidate
  |
  | promote_entity()
  v
[L2'] active entity
```

A single fact can travel either path depending on what the
extractor saw. The graduation path exists for facts that started
life as memories before the entity layer existed, and for the
memory extractor's structural cues (decisions, constraints,
requirements) which are eventually entity-shaped.

## Triggers (when does extraction fire?)

Phase 9 already shipped one trigger: **on explicit API request**
(`POST /interactions/{id}/extract`). The V1 engineering layer adds
two more:

1. **On interaction capture (automatic)**
   - Same event that runs reinforcement today
   - Controlled by an `extract` boolean flag on the record request
     (default: `false` for the memory extractor, `true` once an
     engineering extractor exists and has been validated)
   - Output goes to the candidate queue; nothing auto-promotes

2. **On ingestion (batched, per wave)**
   - After a wave of markdown ingestion finishes, a batch extractor
     pass sweeps all newly-added source chunks and produces
     candidates from them
   - Batched per wave (not per chunk) to keep the review queue
     digestible and to let the reviewer see all candidates from a
     single ingestion in one place
   - Output: a report artifact plus a review queue entry per
     candidate

3. **On explicit human request (existing)**
   - `POST /interactions/{id}/extract` for a single interaction
   - Future: `POST /ingestion/wave/{id}/extract` for a whole wave
   - Future: `POST /memory/{id}/graduate` to propose graduation
     of one specific memory into an entity

Batch size rule: **extraction passes never write more than N
candidates per human review cycle, where N = 50 by default**. If
a pass produces more, it ranks by (rule confidence × content
length × novelty) and only writes the top N. The remaining
candidates are logged, not persisted. This protects the reviewer
from getting buried.

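A minimal sketch of that capping rule. The field names are assumptions, and the document does not pin down how novelty is computed, so it is taken here as a precomputed 0..1 score:

```python
from dataclasses import dataclass

BATCH_CAP = 50  # N: max candidates persisted per review cycle

@dataclass
class Candidate:
    content: str
    rule_confidence: float  # prior from the rule-class table
    novelty: float          # assumed precomputed 0..1 score

def cap_batch(candidates: list[Candidate], cap: int = BATCH_CAP):
    """Rank by (rule confidence x content length x novelty);
    return (top `cap` to persist, remainder to log only)."""
    ranked = sorted(
        candidates,
        key=lambda c: c.rule_confidence * len(c.content) * c.novelty,
        reverse=True,
    )
    return ranked[:cap], ranked[cap:]
```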
## Confidence and ranking of candidates

Each extraction rule carries a *prior confidence* based on how
specific its pattern is:

| Rule class | Prior | Rationale |
|---------------------------|-------|-----------|
| Heading with explicit type (`## Decision:`) | 0.7 | Very specific structural cue, intentional author marker |
| Typed list item (`- [Decision] ...`) | 0.65 | Explicit but often embedded in looser prose |
| Sentence pattern (`I prefer X`) | 0.5 | Moderate structure, more false positives |
| Regex pattern matching a value+unit (`X = 4.8 kg`) | 0.6 | Structural but prone to coincidence |
| LLM-based (future) | variable | Depends on model's returned confidence |

The candidate's final confidence at write time is:

```
final = prior * structural_signal_multiplier * freshness_bonus
```

Where:

- `structural_signal_multiplier` is 1.1 if the source chunk path
  contains any of `_HIGH_SIGNAL_HINTS` from the retriever (status,
  decision, requirements, charter, ...) and 0.9 if it contains
  `_LOW_SIGNAL_HINTS` (`_archive`, `_history`, ...)
- `freshness_bonus` is 1.05 if the source chunk was updated in the
  last 30 days, else 1.0

This formula is tuned later; the numbers are starting values.

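The formula can be sketched as follows. The hint tuples stand in for the retriever's `_HIGH_SIGNAL_HINTS` / `_LOW_SIGNAL_HINTS` (their exact contents here are illustrative), and giving high-signal hints priority when a path matches both lists is an assumption:

```python
from datetime import datetime, timedelta, timezone

# Illustrative stand-ins for the retriever's hint lists.
HIGH_SIGNAL_HINTS = ("status", "decision", "requirements", "charter")
LOW_SIGNAL_HINTS = ("_archive", "_history")

def final_confidence(prior: float, chunk_path: str,
                     updated_at: datetime) -> float:
    """final = prior * structural_signal_multiplier * freshness_bonus"""
    path = chunk_path.lower()
    if any(h in path for h in HIGH_SIGNAL_HINTS):
        structural = 1.1          # high-signal path
    elif any(h in path for h in LOW_SIGNAL_HINTS):
        structural = 0.9          # low-signal path
    else:
        structural = 1.0
    fresh = datetime.now(timezone.utc) - updated_at <= timedelta(days=30)
    return prior * structural * (1.05 if fresh else 1.0)
```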
## Review queue mechanics

### Queue population

- Each candidate writes one row into its target table
  (memories or entities) with `status="candidate"`
- Each candidate carries: `rule`, `source_span`, `source_chunk_id`,
  `source_interaction_id`, `extractor_version`
- No two candidates ever share the same (type, normalized_content,
  project) — if a second extraction pass produces a duplicate, it
  is dropped before being written

### Queue surfacing

- `GET /memory?status=candidate` lists memory candidates
- `GET /entities?status=candidate` (future) lists entity candidates
- `GET /candidates` (future unified route) lists both

### Reviewer actions

For each candidate, exactly one of:

- **promote**: `POST /memory/{id}/promote` or
  `POST /entities/{id}/promote`
  - sets `status="active"`
  - preserves the audit trail (source_chunk_id, rule, source_span)
- **reject**: `POST /memory/{id}/reject` or
  `POST /entities/{id}/reject`
  - sets `status="invalid"`
  - preserves audit trail so repeat extractions don't re-propose
- **edit-then-promote**: `PUT /memory/{id}` to adjust content, then
  `POST /memory/{id}/promote`
  - every edit is logged, original content preserved in a
    `previous_content_log` column (schema addition deferred to
    the first implementation sprint)
- **defer**: no action; candidate stays in queue indefinitely
  (future: add a `pending_since` staleness indicator to the UI)

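The status transitions those actions imply can be sketched as a small guard. The transition set is inferred from this section together with the Reversibility table (reverting a promotion or rejection puts the row back to `candidate`); it is not the real API:

```python
# Allowed status transitions implied by the reviewer actions.
# Illustrative; the actual enforcement lives in the API layer.
ALLOWED = {
    ("candidate", "active"),   # promote
    ("candidate", "invalid"),  # reject
    ("active", "candidate"),   # revert a promotion
    ("invalid", "candidate"),  # revert a rejection
}

def transition(current: str, new: str) -> str:
    """Apply a status change, refusing anything outside ALLOWED."""
    if (current, new) not in ALLOWED:
        raise ValueError(f"illegal transition {current!r} -> {new!r}")
    return new
```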
### Reviewer authentication

In V1 the review queue is single-user by convention. There is no
per-reviewer authorization. Every promote/reject call is logged
with the same default identity. Multi-user review is a V2 concern.

## Auto-promotion policies (deferred, but designed for)

The current V1 stance is: **no auto-promotion, ever**. All
promotions require a human reviewer.

The schema and API are designed so that automatic policies can be
added later without schema changes. The anticipated policies:

1. **Reference-count threshold**
   - If a candidate accumulates N+ references across multiple
     interactions within M days AND the reviewer hasn't seen it yet
     (indicating the system sees it often but the human hasn't
     gotten to it), propose auto-promote
   - Starting thresholds: N=5, M=7 days. Never auto-promote
     entity candidates that affect validation claims or decisions
     without explicit human review — those are too consequential.

2. **Confidence threshold**
   - If `final_confidence >= 0.85` AND the rule is a heading
     rule (not a sentence rule), eligible for auto-promotion

3. **Identity/preference lane**
   - Identity and preference memories extracted from an
     interaction where the user explicitly says "I am X" or
     "I prefer X" with a first-person subject and high-signal
     verb could auto-promote. This is the safest lane because
     the user is the authoritative source for their own identity.

None of these run in V1. The APIs and data shape are designed so
they can be added as a separate policy module without disrupting
existing tests.

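The first two policies can be sketched as a single eligibility check. Field names are assumptions, "within M days" is read here as the candidate's age in the queue, and none of this runs in V1:

```python
def auto_promote_eligible(reference_count: int, days_in_queue: int,
                          final_confidence: float, rule_class: str,
                          reviewed: bool) -> bool:
    """Deferred policies 1 and 2 as a boolean check (sketch)."""
    # Policy 1: N=5 references within M=7 days, and only while the
    # reviewer has not looked at the candidate yet.
    by_references = (reference_count >= 5
                     and days_in_queue <= 7
                     and not reviewed)
    # Policy 2: confidence threshold, heading rules only.
    by_confidence = final_confidence >= 0.85 and rule_class == "heading"
    return by_references or by_confidence
```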
## Reversibility

Every promotion step must be undoable:

| Operation | How to undo |
|---------------------------|-------------------------------------------------------|
| memory candidate written | delete the candidate row (low-risk, it was never in context) |
| memory candidate promoted | `PUT /memory/{id}` status=candidate (reverts to queue) |
| memory candidate rejected | `PUT /memory/{id}` status=candidate |
| memory graduated | memory stays as a frozen pointer; delete the entity candidate to undo |
| entity candidate promoted | `PUT /entities/{id}` status=candidate |
| entity promoted to active | supersede with a new active, or `PUT` back to candidate |

The only irreversible operation is manual curation into L3
(trusted project state). That is by design — L3 is small, curated,
and human-authored end to end.

## Provenance (what every candidate must carry)

Every candidate row, memory or entity, MUST have:

- `source_chunk_id` — if extracted from ingested content, the chunk it came from
- `source_interaction_id` — if extracted from a captured interaction, the interaction it came from
- `rule` — the extractor rule id that fired
- `extractor_version` — a semver-ish string the extractor module carries
  so old candidates can be re-evaluated with a newer extractor

If both `source_chunk_id` and `source_interaction_id` are null, the
candidate was hand-authored (via `POST /memory` directly) and must
be flagged as such. Hand-authored candidates are allowed but
discouraged — the preference is to extract from real content, not
dictate candidates directly.

The active rows inherit all of these fields from their candidate
row at promotion time. They are never overwritten.

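The required fields, and the hand-authored rule, can be sketched as a row shape (id types are assumptions; this is not the actual schema):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Provenance:
    """Provenance fields every candidate row must carry (sketch)."""
    rule: str                 # extractor rule id that fired
    extractor_version: str    # e.g. "0.1.0"
    source_chunk_id: Optional[int] = None
    source_interaction_id: Optional[int] = None

    @property
    def hand_authored(self) -> bool:
        # Both source ids null => written via POST /memory directly,
        # and must be flagged as such.
        return (self.source_chunk_id is None
                and self.source_interaction_id is None)
```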
## Extractor versioning

The extractor is going to change — new rules added, old rules
refined, precision/recall tuned over time. The promotion flow
must survive extractor changes:

- every extractor module exposes an `EXTRACTOR_VERSION = "0.1.0"`
  constant
- every candidate row records this version
- when the extractor version changes, the change log explains
  what the new rules do
- old candidates are NOT automatically re-evaluated by the new
  extractor — that would lose the auditable history of why the
  old candidate was created
- future `POST /memory/{id}/re-extract` can optionally propose
  an updated candidate from the same source chunk with the new
  extractor, but it produces a *new* candidate alongside the old
  one, never a silent rewrite

## Ingestion-wave extraction semantics

When the batched extraction pass fires on an ingestion wave, it
produces a report artifact:

```
data/extraction-reports/<wave-id>/
├── report.json       # summary counts, rule distribution
├── candidates.ndjson # one JSON line per persisted candidate
├── dropped.ndjson    # one JSON line per candidate dropped
│                     # (over batch cap, duplicate, below
│                     # min content length, etc.)
└── errors.log        # any rule-level errors
```

The report artifact lives under the configured `data_dir` and is
retained per the backup retention policy. The ingestion-waves doc
(`docs/ingestion-waves.md`) is updated to include an "extract"
step after each wave, with the expectation that the human
reviews the candidates before the next wave fires.

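Writing that artifact might look like the sketch below. The layout follows the tree above; the exact fields in `report.json` (beyond "summary counts") are assumptions:

```python
import json
from pathlib import Path

def write_extraction_report(data_dir: Path, wave_id: str,
                            persisted: list[dict],
                            dropped: list[dict]) -> Path:
    """Write the per-wave report artifact under data_dir (sketch)."""
    out = data_dir / "extraction-reports" / wave_id
    out.mkdir(parents=True, exist_ok=True)
    # report.json: summary counts (assumed field names)
    (out / "report.json").write_text(json.dumps({
        "wave_id": wave_id,
        "persisted": len(persisted),
        "dropped": len(dropped),
    }, indent=2))
    # one JSON line per candidate, persisted and dropped
    (out / "candidates.ndjson").write_text(
        "".join(json.dumps(c) + "\n" for c in persisted))
    (out / "dropped.ndjson").write_text(
        "".join(json.dumps(c) + "\n" for c in dropped))
    (out / "errors.log").touch()  # rule-level errors appended here
    return out
```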
## Candidate-to-candidate deduplication across passes

Two extraction passes over the same chunk (or two different
chunks containing the same fact) should not produce two identical
candidate rows. The deduplication key is:

```
(memory_type_or_entity_type, normalized_content, project, status)
```

Normalization strips whitespace variants, lowercases, and drops
trailing punctuation (same rules as the extractor's `_clean_value`
function). If a second pass would produce a duplicate, it instead
increments a `re_extraction_count` column on the existing
candidate row and updates `last_re_extracted_at`. This gives the
reviewer a "saw this N times" signal without flooding the queue.

This column is a future schema addition — current candidates do
not track re-extraction. The promotion-rules implementation will
land the column as part of its first migration.

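The normalization and key can be sketched as follows, mirroring the `_clean_value` rules as described here (collapse whitespace, lowercase, drop trailing punctuation); the exact punctuation set is an assumption:

```python
import re

def normalize(content: str) -> str:
    """Whitespace variants collapsed, lowercased, trailing
    punctuation dropped — per the _clean_value description."""
    text = re.sub(r"\s+", " ", content).strip().lower()
    return text.rstrip(".,;:!?")

def dedup_key(type_: str, content: str, project: str,
              status: str) -> tuple:
    """The (type, normalized_content, project, status) dedup key."""
    return (type_, normalize(content), project, status)
```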
## The "never auto-promote into trusted state" invariant

Regardless of what auto-promotion policies might exist between
L0 → L2', **nothing ever moves into L3 (trusted project state)
without explicit human action via `POST /project/state`**. This
is the one hard line in the promotion graph and it is enforced
by having no API endpoint that takes a candidate id and writes
to `project_state`.

## Summary

- Four layers: L0 raw, L1 memory candidate/active, L2 entity
  candidate/active, L3 trusted state
- Three triggers for extraction: on capture, on ingestion wave, on
  explicit request
- Per-rule prior confidence, tuned by structural signals at write time
- Shared candidate review queue, promote/reject/edit/defer actions
- No auto-promotion in V1 (but the schema allows it later)
- Every candidate carries full provenance and extractor version
- Every promotion step is reversible except L3 curation
- L3 is never touched automatically