docs(planning): V1 Completion Plan + gbrain-plan rejection record

- docs/decisions/2026-04-22-gbrain-plan-rejection.md: record of gbrain-inspired "Phase 8 Minions + typed edges" plan rejection. Three high findings from Codex verified against cited architecture docs (ontology V1 predicate set, canonical entity contract, master-plan-status Now list sequencing).
- docs/plans/engineering-v1-completion-plan.md: seven-phase plan for finishing Engineering V1 against engineering-v1-acceptance.md. V1-0 (write-time invariants: F-8 provenance + F-5 hooks + F-1 audit) as hard prerequisite per Codex first-round review. Per-criterion gap audit against each F/Q/O/D acceptance item with code:line references. Explicit collision points with the Now list; schedule shifted ~4 weeks to avoid pipeline-soak window. Awaiting Codex file-level audit.
- DEV-LEDGER.md: Recent Decisions + Session Log entries covering both the rejection and the revised plan.

No code changes. Docs + ledger only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 13:58:10 -04:00
parent e147ab2abd
commit ce3a87857e
3 changed files with 708 additions and 0 deletions
--- a/DEV-LEDGER.md
+++ b/DEV-LEDGER.md
@@ -146,6 +146,8 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
 ## Recent Decisions
 - **2026-04-22** **Engineering V1 Completion Plan revised per Codex first-round review** — original six-phase order (queries → ingest → mirror → graduation → provenance → ops) rejected by Codex as backward: provenance-at-write (F-8) and conflict-detection hooks (F-5 minimal) must precede any phase that writes active entities. Revised to seven phases: V1-0 write-time invariants (F-8 + F-5 hooks + F-1 audit) as hard prerequisite, V1-A minimum query slice proving the model, V1-B ingest, V1-C full query catalog, V1-D mirror, V1-E graduation, V1-F full F-5 spec + ops + docs. Also softened "parallel with Now list" — real collision points listed explicitly; schedule shifted ~4 weeks to reflect that V1-0 cannot start during pipeline soak. Withdrew the "50–70% built" global framing in favor of the per-criterion gap table. Workspace sync note added: Codex's Playground workspace can't see the plan file; canonical dev tree is Windows `C:\Users\antoi\ATOCore`. Plan: `docs/plans/engineering-v1-completion-plan.md`. Awaiting Codex file-level audit once workspace syncs. *Proposed by:* Claude. *First-round review by:* Codex.
 - **2026-04-22** gbrain-inspired "Phase 8 Minions + typed edges" plan **rejected as packaged** — wrong sequencing (leapfrogged `master-plan-status.md` Now list), wrong predicate set (6 vs V1's 17), wrong canonical boundary (edges-on-wikilinks instead of typed entities+relationships per `memory-vs-entities.md`). Mechanic (durable jobs + typed graph) deferred to V1 home. Record: `docs/decisions/2026-04-22-gbrain-plan-rejection.md`. *Proposed by:* Claude. *Reviewed/rejected by:* Codex. *Ratified by:* Antoine.
 - **2026-04-12** Day 4 gate cleared: LLM-assisted extraction via `claude -p` (OAuth, no API key) is the path forward. Rule extractor stays as default for structural cues. *Proposed by:* Claude. *Ratified by:* Antoine.
 - **2026-04-12** First live triage: 16 promoted, 35 rejected from 51 LLM-extracted candidates. 31% accept rate. Active memory count 20->36. *Executed by:* Claude. *Ratified by:* Antoine.
 - **2026-04-12** No API keys allowed in AtoCore — LLM-assisted features use OAuth via `claude -p` or equivalent CLI-authenticated paths. *Proposed by:* Antoine.
@@ -160,6 +162,12 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
 ## Session Log
 - **2026-04-22 Claude (night)** Codex first-round review of the V1 Completion Plan summary came back with four findings. Three substantive, one workspace-sync: (1) "50–70% built" too loose: replaced with per-criterion table, global framing withdrawn; (2) phase order backward: provenance-at-write (F-8) and conflict hooks (F-5 minimal) are depended upon by every later phase but were in V1-E; new V1-0 prerequisite phase inserted to establish write-time invariants, and V1-A shrunk to a minimum query slice (four pillars Q-001/Q-005/Q-006/Q-017 + Q-6 integration) rather than full catalog closure; (3) "parallel with Now list / disjoint surfaces" too strong: real collisions listed explicitly (V1-0 provenance + memory extractor write path, V1-E graduation + memory module, V1-F conflicts migration + memory promote); schedule shifted ~4 weeks, V1-0 cannot start during pipeline soak; (4) Codex's Playground workspace can't see the plan file or the `src/atocore/engineering/` code: added a Workspace note to the plan directing per-file audit at the Windows canonical dev tree (`C:\Users\antoi\ATOCore`) and noting the three visible file paths (`docs/plans/engineering-v1-completion-plan.md`, `docs/decisions/2026-04-22-gbrain-plan-rejection.md`, `DEV-LEDGER.md`). Revised plan estimate: 12–17 days across 7 phases (up from 11–14 / 6), ~65 tests added (up from ~50). V1-0 is a hard prerequisite; no later phase starts until it lands. Pending Antoine decision on workspace sync (commit+push vs paste-to-Codex) so Codex can do the file-level audit. No code changes this session.
 - **2026-04-22 Claude (late eve)** After the rejection, read the four core V1 architecture docs end-to-end (`engineering-ontology-v1.md`, `engineering-query-catalog.md`, `memory-vs-entities.md`, `engineering-v1-acceptance.md`) plus the four supporting docs (`promotion-rules.md`, `conflict-model.md`, `human-mirror-rules.md`, `tool-handoff-boundaries.md`). Cross-referenced against current code in `src/atocore/engineering/`. **Key finding:** V1 is already 50–70% built — entity types (16, superset of V1's 12), all 18 V1 relationship types, 4-state lifecycle, CRUD + supersede + invalidate + PATCH, queries module with most killer-correctness queries (orphan_requirements, risky_decisions, unsupported_claims, impact_analysis, evidence_chain), conflicts module scaffolded, mirror scaffolded, graduation endpoint scaffolded. Recent commits e147ab2/b94f9df/081c058/069d155/b1a3dd0 are all V1 entity-layer work. Drafted `docs/plans/engineering-v1-completion-plan.md` reframing the work as **V1 completion, not V1 start**. Six sequential phases V1-A through V1-F, estimated 11–14 days, ~50 new tests (533 → ~580). Phases run in parallel with the Now list (pipeline soak + density + multi-model triage + p04-constraints) because surfaces are disjoint. Plan explicitly defers the minions/queue mechanic per acceptance-doc negative list. Pending Codex audit of the plan itself — especially the F-2 query gap list (Claude didn't read each query function end-to-end), F-5 conflicts schema divergence (per-type detectors vs spec's generic slot-keyed shape), and F-7 graduation depth. No code changes this session.
 - **2026-04-22 Claude (eve)** gbrain review session. Antoine surfaced https://github.com/garrytan/gbrain for compare/contrast. Claude drafted a "Phase 8 Minions + typed edges" plan pairing a durable job queue with a 6-predicate edge upgrade over wikilinks. Codex reviewed and rejected as packaged: (1) sequencing leapfrogged the `master-plan-status.md` Now list (pipeline soak → 100+ memories → multi-model triage → p04-constraints fix); (2) 6 predicates vs V1's 17 across Structural/Intent/Validation/Provenance families — would have been schema debt on day one per `engineering-ontology-v1.md:112-137`; (3) "edges over wikilinks" bypassed the V1 canonical entity + promotion contract in `memory-vs-entities.md`. Claude verified each high finding against the cited files and concurred. The underlying mechanic (durable background jobs + typed relationship graph) is still a valid future direction, but its correct home is the Engineering V1 sprint under **Next** in `master-plan-status.md:179`, not a leapfrog phase. Decision record: `docs/decisions/2026-04-22-gbrain-plan-rejection.md`. No code changes this session. Next (pending Antoine ack): read the four V1 architecture docs end-to-end, then draft an Engineering V1 foundation plan that follows the existing contract, not a new external reference. Phase 8 (OpenClaw) name remains untouched — Claude's misuse of "Phase 8" in the rejected plan was a naming collision, not a renaming.
 - **2026-04-22 Claude (pm)** Issue B (wiki redlinks) landed — last remaining P2 from Antoine's sprint plan. `_wikilink_transform(text, current_project)` in `src/atocore/engineering/wiki.py` replaces `[[Name]]` / `[[Name|Display]]` tokens (pre-markdown) with HTML anchors. Resolution order: same-project exact-name match → live `wikilink`; other-project match → live link with `(in project X)` scope indicator (`wikilink-cross`); no match → `redlink` pointing at `/wiki/new?name=<quoted>&project=<current>`. New route `GET /wiki/new` renders a pre-filled "create this entity" form that POSTs to `/v1/entities` via a minimal inline fetch() and redirects to the new entity's wiki page on success. Transform applied in `render_project` (over the mirror markdown) and `render_entity` (over the description body). CSS: dashed-underline accent for live wikilinks, red italic + dashed for redlinks. 12 new tests including the regression from the spec (entity A references `[[EntityB]]` → initial render has `class="redlink"`; after EntityB is created, re-render no longer has redlink and includes `/wiki/entities/{b.id}`). Tests 521 → 533. All 6 acceptance criteria from the sprint plan ("daily-usable") now green: retract/supersede, edit without cloning, cross-project has a home, visual evidence, wiki readable, AKC can capture reliably.
 - **2026-04-22 Claude** PATCH `/entities/{id}` + Issue D (/v1/engineering/* aliases) landed. New `update_entity()` in `src/atocore/engineering/service.py` supports partial updates to description (replace), properties (shallow merge — `null` value deletes a key), confidence (0..1, 400 on bounds violation), source_refs (append + dedup). Writes an `updated` audit row with full before/after snapshots. Forbidden via this path: entity_type / project / name / status — those require supersede+create or the dedicated status endpoints, by design. New route `PATCH /entities/{id}` aliased under `/v1`. Issue D: all 10 `/engineering/*` query paths (decisions, systems, components/{id}/requirements, changes, gaps + sub-paths, impact, evidence) added to the `/v1` allowlist. 12 new PATCH tests (merge, null-delete, confidence bounds, source_refs dedup, 404, audit row, v1 alias). Tests 509 → 521. Next: commit + deploy, then Issue B (wiki redlinks) as the last remaining P2 per Antoine's sprint order.
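
The PATCH semantics recorded in the entry above (description replace, shallow property merge with `null` deleting a key, bounded confidence, `source_refs` append + dedup, forbidden fields) can be sketched as a pure function. This is an illustrative sketch only, not the code in `update_entity()`: the name `apply_patch`, the dict-shaped entity, and raising `ValueError` where the API would return 400 are all assumptions.

```python
from copy import deepcopy

# Fields the ledger entry says cannot be changed through PATCH.
FORBIDDEN_FIELDS = {"entity_type", "project", "name", "status"}


def apply_patch(entity: dict, patch: dict) -> dict:
    """Return a patched copy of `entity`; the input dict is left untouched."""
    blocked = FORBIDDEN_FIELDS.intersection(patch)
    if blocked:
        raise ValueError(f"fields not patchable: {sorted(blocked)}")

    updated = deepcopy(entity)

    # description: plain replace
    if "description" in patch:
        updated["description"] = patch["description"]

    # properties: shallow merge, a None (JSON null) value deletes the key
    for key, value in patch.get("properties", {}).items():
        if value is None:
            updated["properties"].pop(key, None)
        else:
            updated["properties"][key] = value

    # confidence: must stay within 0..1
    if "confidence" in patch:
        confidence = patch["confidence"]
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must be within 0..1")
        updated["confidence"] = confidence

    # source_refs: append new refs, skip ones already present
    for ref in patch.get("source_refs", []):
        if ref not in updated["source_refs"]:
            updated["source_refs"].append(ref)

    return updated
```

Returning a copy keeps the "before" snapshot available, which is convenient if the caller also has to write a before/after audit row.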
--- a/docs/decisions/2026-04-22-gbrain-plan-rejection.md
+++ b/docs/decisions/2026-04-22-gbrain-plan-rejection.md
@@ -0,0 +1,113 @@
 # Decision record: gbrain-inspired "Phase 8 Minions + typed edges" plan rejected
 **Date:** 2026-04-22
 **Author of plan:** Claude
 **Reviewer:** Codex
 **Ratified by:** Antoine
 **Status:** Rejected as packaged. Underlying mechanic (durable background jobs + typed relationships) deferred to its correct home.
 ## Context
 Antoine surfaced https://github.com/garrytan/gbrain and asked for a compare/contrast and a
 plan to improve AtoCore. Claude proposed a "Phase 8" plan pairing:
 1. A Minion-style durable job queue replacing the nightly cron pipeline
 2. A typed-edge upgrade over existing wikilinks, with a six-predicate set
   (`mentions`, `decided_by`, `supersedes`, `evidences`, `part_of`, `blocks`)
 Codex reviewed and rejected the plan as packaged. This record captures what went wrong,
 what was right, and where the ideas should actually land.
 ## What Codex flagged (verified against repo)
 ### High — wrong sequencing
 `docs/master-plan-status.md` defines the **Now** list:
 1. Observe the enhanced pipeline for a week
 2. Knowledge density — batch-extract over all 234 interactions, target 100+ memories
 3. Multi-model triage (Phase 11 entry)
 4. Fix p04-constraints harness failure
 Engineering V1 appears under **Next** (line 179) as
 "Engineering V1 implementation sprint — once knowledge density is sufficient and the
 pipeline feels boring and dependable."
 Claude's plan jumped over all four **Now** items. That was the primary sequencing error.
 ### High — wrong predicate set
 `docs/architecture/engineering-ontology-v1.md` already defines a 17-predicate V1
 ontology across four families:
 - **Structural:** `CONTAINS`, `PART_OF`, `INTERFACES_WITH`
 - **Intent / logic:** `SATISFIES`, `CONSTRAINED_BY`, `BASED_ON_ASSUMPTION`,
  `AFFECTED_BY_DECISION`, `SUPERSEDES`
 - **Validation:** `ANALYZED_BY`, `VALIDATED_BY`, `SUPPORTS`, `CONFLICTS_WITH`,
  `DEPENDS_ON`
 - **Artifact / provenance:** `DESCRIBED_BY`, `UPDATED_BY_SESSION`, `EVIDENCED_BY`,
  `SUMMARIZED_IN`
 Claude's six-predicate set was a gbrain-shaped subset that could not express the V1
 example statements at lines 141–147 of that doc. Shipping it first would have been
 schema debt on day one.
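
To make the size of the gap concrete, the four families listed above can be written out as data and compared against the rejected six-predicate set. This is a rough sketch: the comparison is by name only, so near-synonyms such as `evidences` and `EVIDENCED_BY` count as a mismatch, and the names `V1_PREDICATES`, `REJECTED_SET` and `uncovered_by_rejected_set` are hypothetical, not identifiers from the repo.

```python
# V1 predicate families as listed in this record (17 predicates in total).
V1_PREDICATES = {
    "structural": ("CONTAINS", "PART_OF", "INTERFACES_WITH"),
    "intent": (
        "SATISFIES", "CONSTRAINED_BY", "BASED_ON_ASSUMPTION",
        "AFFECTED_BY_DECISION", "SUPERSEDES",
    ),
    "validation": (
        "ANALYZED_BY", "VALIDATED_BY", "SUPPORTS", "CONFLICTS_WITH",
        "DEPENDS_ON",
    ),
    "provenance": (
        "DESCRIBED_BY", "UPDATED_BY_SESSION", "EVIDENCED_BY",
        "SUMMARIZED_IN",
    ),
}

# The six predicates from the rejected plan.
REJECTED_SET = (
    "mentions", "decided_by", "supersedes", "evidences", "part_of", "blocks",
)


def all_predicates() -> set:
    """Flatten the four families into one set of predicate names."""
    return {name for family in V1_PREDICATES.values() for name in family}


def uncovered_by_rejected_set() -> list:
    """V1 predicates with no same-named counterpart in the rejected set."""
    rejected = {name.upper() for name in REJECTED_SET}
    return sorted(all_predicates() - rejected)
```

Under this name-only comparison, just `PART_OF` and `SUPERSEDES` overlap; predicates such as `SATISFIES` and `CONSTRAINED_BY` have no counterpart at all.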
 ### High — wrong canonical boundary
 `docs/architecture/memory-vs-entities.md` and
 `docs/architecture/engineering-v1-acceptance.md` establish that V1 is **typed
 entities plus typed relationships**, with one canonical home per concept, a shared
 candidate-review / promotion flow, provenance, conflict handling, and mirror
 generation. Claude's "typed edges on top of wikilinks" framing bypassed the canonical
 entity contract — it would have produced labelled links over notes without the
 promotion / canonicalization machinery V1 actually requires.
 ### Medium — overstated problem
 Claude described the nightly pipeline as a "monolithic bash script" that needed to be
 replaced. The actual runtime is API-driven (`src/atocore/api/routes.py:516`,
 `src/atocore/interactions/service.py:55`), SQLite is already in WAL with a busy
 timeout (`src/atocore/models/database.py:151`), and the reflection loop is explicit
 capture / reinforce / extract. The queue argument overstated the current shape.
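
For readers unfamiliar with the setup being described, WAL mode plus a busy timeout looks roughly like this in Python's standard `sqlite3` module. This is a minimal sketch, not the code at `src/atocore/models/database.py:151`; the function name `open_connection` and the 5000 ms timeout are assumptions chosen for illustration.

```python
import os
import sqlite3
import tempfile


def open_connection(path: str, busy_timeout_ms: int = 5000) -> sqlite3.Connection:
    """Open a SQLite connection in WAL mode with a busy timeout."""
    conn = sqlite3.connect(path)
    # WAL lets readers proceed while a single writer appends to the log.
    conn.execute("PRAGMA journal_mode=WAL")
    # Wait up to busy_timeout_ms for a lock instead of failing immediately.
    conn.execute(f"PRAGMA busy_timeout={int(busy_timeout_ms)}")
    return conn


def demo() -> tuple:
    """Create a throwaway database file and report the effective settings."""
    with tempfile.TemporaryDirectory() as tmp:
        conn = open_connection(os.path.join(tmp, "example.db"))
        try:
            mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
            timeout = conn.execute("PRAGMA busy_timeout").fetchone()[0]
        finally:
            conn.close()
    return mode, timeout
```

WAL only applies to file-backed databases, which is why the demo uses a temporary file rather than `:memory:`.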
 ## What was right
 - gbrain is genuine validation of the general pattern: **durable background jobs +
  typed relationship graph compound value**. The gbrain v0.12.0 graph release and
  Minions benchmark (both 2026-04-18) are evidence, not just inspiration.
 - Async-ification of extraction with retries, per-job visibility, and SLOs remains a
  real future win for AtoCore — but **additively, behind flags, after V1**, not as a
  replacement for the current explicit endpoints.
 ## What we will do instead
 1. **Keep to the `master-plan-status.md` Now list.** No leapfrog. Observe the
   pipeline (including the confidence-decay Step F4 first real run), land knowledge
   density via full-backlog batch extract, progress multi-model triage, fix
   p04-constraints.
 2. **When Engineering V1 is ready to start** (criterion: pipeline feels boring and
   dependable, knowledge density ≥ 100 active memories), write a V1 foundation plan
   that follows `engineering-ontology-v1.md`, `engineering-query-catalog.md`,
   `memory-vs-entities.md`, and `engineering-v1-acceptance.md` — entities +
   relationships + memory-to-entity bridge + mirror / query surfaces, in that order.
 3. **Async workerization is optional and later.** Only after V1 is working, and only
   if observed contention or latency warrants it. Jobs stay in the primary SQLite
   (WAL already in place). No separate DB unless contention is measured.
 ## Lesson for future plans
 A plan built from a **new external reference** (gbrain) without reading the
 repository's own architecture docs will mis-specify predicates, boundaries, and
 sequencing — even when the underlying mechanic is valid. Read the four V1
 architecture docs end-to-end before proposing schema work.
 ## References
 - https://github.com/garrytan/gbrain
 - `docs/master-plan-status.md` (Now / Next / Later)
 - `docs/architecture/engineering-ontology-v1.md`
 - `docs/architecture/engineering-query-catalog.md`
 - `docs/architecture/memory-vs-entities.md`
 - `docs/architecture/engineering-v1-acceptance.md`
 - `docs/architecture/llm-client-integration.md`
 - `docs/architecture/human-mirror-rules.md`
--- a/docs/plans/engineering-v1-completion-plan.md
+++ b/docs/plans/engineering-v1-completion-plan.md
@@ -0,0 +1,587 @@
 # Engineering V1 Completion Plan
 **Date:** 2026-04-22
 **Author:** Claude (after reading the four V1 architecture docs + promotion-rules,
 conflict-model, human-mirror-rules, tool-handoff-boundaries end-to-end)
 **Status:** Draft, pending Codex review
 **Replaces:** the rejected "Phase 8 Minions + typed edges" plan (see
 `docs/decisions/2026-04-22-gbrain-plan-rejection.md`)
 ## Position
 This is **not** a plan to start Engineering V1. It is a plan to **finish** V1.
 **Against what criterion?** Each F/Q/O/D item in `engineering-v1-acceptance.md`
 gets scored individually in the Gap audit table below with exact code/test/doc
 references. No global percentage. The headline framing from the first draft
 ("50–70% built") is withdrawn: each criterion is either done or it isn't.
 The relevant observation is narrower: the entity schema, the full
 relationship type set, the 4-state lifecycle, basic CRUD and most of the
 killer-correctness query functions are already implemented in
 `src/atocore/engineering/*.py` in the Windows working tree at
 `C:\Users\antoi\ATOCore` (the canonical dev workspace, per
 `CLAUDE.md`). The recent commits e147ab2, b94f9df, 081c058, 069d155, b1a3dd0
 are V1 entity-layer work. **Codex auditors working in a different
 workspace / branch should sync from the canonical dev tree before
 per-file review** — see the "Workspace note" at the end of this doc.
 The question this plan answers: given the current code state, in what
 order should the remaining V1 acceptance criteria be closed so that
 every phase builds on invariants the earlier phases already enforced?
 ## Corrected sequencing principle (post-Codex review 2026-04-22)
 The first draft ordered phases F-1 → F-2 → F-3 → F-4 → F-5 → F-6 → F-7 → F-8
 following the acceptance doc's suggested reading order. Codex rejected
 that ordering. The correct dependency order, which this revision adopts, is:
 1. **Write-time invariants come first.** Every later phase creates active
   entities. Provenance-at-write (F-8) and synchronous conflict-detection
   hooks (F-5 minimal) must be enforced **before** any phase that writes
   entities at scale (ingest, graduation, or broad query coverage that
   depends on the model being populated).
 2. **Query closure sits on top of the schema + invariants**, not ahead of
   them. A minimum query slice that proves the model is fine early. The
   full catalog closure waits until after the write paths are invariant-safe.
 3. **Mirror is a derived consumer** of the entity layer, not a midstream
   milestone. It comes after the entity layer produces enforced, correct data.
 4. **Graduation and full conflict-spec compliance** are finishing work that
   depend on everything above being stable.
 The acceptance criteria are unchanged. Only the order of closing them changes.
 ## How this plan respects the rejected-plan lessons
 - **No new predicates.** `engineering-ontology-v1.md:112-137` already
  defines the V1 relationship types, and `service.py:38-62` already
  implements them (18 types in code). Nothing added, nothing reshaped.
 - **No new canonical boundary.** Typed entities + typed relationships with
  promotion-based candidate flow per `memory-vs-entities.md`. Not
  edges-over-wikilinks.
 - **No leapfrog of `master-plan-status.md` Now list.** This plan runs
  **alongside** (not ahead of) the Now items, because V1 entity work is
  already happening next to them, but the surfaces are not fully disjoint:
  the sequencing section below lists the collision points explicitly.
 - **Queue/worker infrastructure is explicitly out of scope.** The "flag it for
  later" note at the end of this doc is the only mention, per
  `engineering-v1-acceptance.md:378` negative list.
 ---
 ## Gap audit against `engineering-v1-acceptance.md`
 Each criterion marked: ✅ done / 🟡 partial / ❌ missing. "Partial" means the
 capability exists but does not yet match spec shape or coverage.
 ### Functional (F-1 through F-8)
 | ID | Criterion | Status | Evidence |
 |----|-----------|--------|----------|
 | F-1 | 12 V1 entity types, 4 relationship families, shared header fields, 4-state lifecycle | ✅ done | `service.py:16-36` (16 types, superset of V1 minimum), `service.py:38-62` (18 relationship types), `service.py:64` statuses, `Entity` dataclass at line 67 |
 | F-2 | All v1-required Q-001 through Q-020 implemented, with provenance where required | 🟡 partial | `queries.py` has system_map (Q-004), decisions_affecting (Q-008), requirements_for (Q-005 component side), recent_changes (Q-013), orphan_requirements (Q-006 killer), risky_decisions (Q-009 killer), unsupported_claims (Q-011 killer), impact_analysis (Q-016), evidence_chain (Q-017). Likely missing or partial: Q-001 (expand=contains), Q-002 (expand=parents), Q-003 (interfaces), Q-007 (constraints on component), Q-010 (supports trace), Q-012 (conflicting results), Q-014 (decision-log ordered chain), Q-018 (include=superseded chain), Q-019 (material→components), Q-020 (project overview mirror endpoint in V1-required shape) |
 | F-3 | `POST /ingest/kb-cad/export` and `POST /ingest/kb-fem/export` | ❌ missing | No `/ingest/kb-cad` or `/ingest/kb-fem` route in `api/routes.py`. No schema doc under `docs/architecture/` |
 | F-4 | Candidate review queue end-to-end (list/promote/reject/edit) | 🟡 partial for entities | Memory side shipped in Phase 9 Commit C. Entity side has `promote_entity`, `supersede_entity`, `invalidate_active_entity` but reject path and editable-before-promote may not match spec shape. Need to verify `GET /entities?status=candidate` returns spec shape |
 | F-5 | Conflict detector fires synchronously; `POST /conflicts/{id}/resolve` + dismiss | 🟡 partial | `conflicts.py` has `detect_conflicts_for_entity`, `list_open_conflicts`, `resolve_conflict`. API at `/admin/conflicts` + `/admin/conflicts/{id}/resolve`. **Gap vs spec**: spec wants generic slot-key model with `conflicts` + `conflict_members` tables; current code has per-type detectors (`_check_component_conflicts`, `_check_requirement_conflicts`) — need to verify schema, and spec routes are `/conflicts/*` not `/admin/conflicts/*` |
 | F-6 | Mirror: `/mirror/{project}/overview`, `/decisions`, `/subsystems/{id}`, `/regenerate`; files under `/srv/storage/atocore/data/mirror/`; disputed + curated markers; deterministic output | 🟡 partial | `mirror.py` has `generate_project_overview` with header/state/system/decisions/requirements/materials/vendors/memories/footer sections. API at `/projects/{project_name}/mirror` and `.html`. **Gaps**: no separate `/mirror/{project}/decisions` or `/mirror/{project}/subsystems/{id}` routes, no `POST /regenerate` endpoint, no debounced-async-on-write, no daily refresh, no `⚠ disputed` markers wired to conflicts, no `(curated)` override annotations verified, no golden-file test for determinism |
 | F-7 | Memory→entity graduation: `POST /memory/{id}/graduate` + `graduated` status + forward pointer + original preserved | 🟡 partial | `_graduation_prompt.py` exists; `api_request_graduation` + `api_graduation_status` + `api_graduation_stats` routes exist (routes.py:1573, 1607, 2065). Need to verify full flow against F-7 spec — original preserved? `graduated` status row added? forward pointer column present? |
 | F-8 | Every active entity has `source_refs`; Q-017 returns ≥1 row for every active entity | 🟡 partial | `Entity.source_refs` field exists; Q-017 (`evidence_chain`) exists. **Gap**: is provenance enforced at write time (not NULL), or just encouraged? Per spec it must be mandatory |
 ### Quality (Q-1 through Q-6)
 | ID | Criterion | Status | Evidence |
 |----|-----------|--------|----------|
 | Q-1 | All pre-V1 tests still pass | ✅ presumed | 533 tests passing per DEV-LEDGER line 12 |
 | Q-2 | Each F criterion has happy-path + error-path test, <10s each, <30s total | 🟡 partial | 16 + 15 + 15 + 12 = 58 tests in engineering/queries/v1-phase5/patch files. Need to verify coverage of each F criterion one-for-one |
 | Q-3 | Conflict invariants enforced by tests (contradictory imports produce conflict, can't promote both, flag-never-block) | 🟡 partial | Tests likely exist in `test_engineering_v1_phase5.py` — verify explicit coverage of the three invariants |
 | Q-4 | Trust hierarchy enforced by tests (candidates never in context, active-only reinforcement, no auto-project-state writes) | 🟡 partial | Phase 9 Commit B covered the memory side; verify entity side has equivalent tests |
 | Q-5 | Mirror has golden-file test, deterministic output | ❌ missing | No golden file seen; mirror output includes `now` timestamp (line 326) which is non-deterministic — would fail Q-5 as written |
 | Q-6 | Killer correctness queries pass against seeded real-ish data (5 seed cases per Q-006/Q-009/Q-011) | ❌ likely missing | No fixture file named for this seen. The three queries exist but there's no evidence of the single integration test described in Q-6 |
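
The Q-5 row notes that a `now` timestamp in the mirror output defeats a golden-file test. One common way around that is to inject the clock, sketched below. This is not the shape of `generate_project_overview`; `render_overview`, its parameters, and the sorted-decisions ordering are assumptions made for illustration.

```python
from datetime import datetime, timezone
from typing import Callable, List, Optional


def render_overview(
    project: str,
    decisions: List[str],
    clock: Optional[Callable[[], datetime]] = None,
) -> str:
    """Render a tiny overview page; `clock` is injectable for tests."""
    # Production callers omit `clock` and get the real UTC time.
    now = (clock or (lambda: datetime.now(timezone.utc)))()
    lines = [f"# {project} overview", ""]
    # Sort so that input order cannot change the output.
    lines += [f"- {decision}" for decision in sorted(decisions)]
    lines += ["", f"_Generated {now.isoformat()}_"]
    return "\n".join(lines)
```

A golden-file test would then pass a fixed clock and compare the rendered string byte-for-byte against the stored file.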
 ### Operational (O-1 through O-5)
 | ID | Criterion | Status | Evidence |
 |----|-----------|--------|----------|
 | O-1 | Schema migration additive, idempotent, tested against fresh + prod-copy DB | 🟡 presumed | `_apply_migrations` pattern is in use per CLAUDE.md sessions; tables exist. Need one confirmation run against a Dalidou backup copy |
 | O-2 | Backup includes new tables; full restore drill passes; post-restore Q-001 works | ❌ not done | No evidence a restore drill has been run on V1 entity state. `docs/backup-restore-procedure.md` exists but drill is an explicit V1 prerequisite |
 | O-3 | Performance bounds: write <100ms p99, query <500ms p99 at 1000 entities, mirror <5s per project | 🟡 unmeasured | 35 entities in system — bounds unmeasured at scale. Spec says "sanity-checked, not benchmarked", so this is a one-off manual check |
 | O-4 | No new manual ops burden | 🟡 | Mirror regen auto-triggers not wired yet (see F-6 gap) — they must be wired for O-4 to pass |
 | O-5 | Phase 9 reflection loop unchanged for identity/preference/episodic | ✅ presumed | `memory-vs-entities.md` says these three types don't interact with engineering layer. No recent change to memory extractor for these types |
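
Since O-3 is a one-off sanity check rather than a benchmark, a nearest-rank percentile over a list of timed samples is probably enough. The helper below is a hypothetical sketch; `percentile` and `within_bound` are illustrative names, and the thresholds in the usage note come from the O-3 row.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Multiply before dividing so integer inputs avoid float rounding drift.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def within_bound(samples_ms: Sequence[float], bound_ms: float, pct: int = 99) -> bool:
    """True when the chosen percentile of the samples is under the bound."""
    return percentile(samples_ms, pct) < bound_ms
```

For example, `within_bound(write_latencies_ms, 100)` would check the write bound and `within_bound(query_latencies_ms, 500)` the query bound, given latencies collected against a store seeded to 1000 entities.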
 ### Documentation (D-1 through D-4)
 | ID | Criterion | Status | Evidence |
 |----|-----------|--------|----------|
 | D-1 | 12 per-entity-type spec docs under `docs/architecture/entities/` | ❌ missing | No `docs/architecture/entities/` folder |
 | D-2 | `kb-cad-export-schema.md` + `kb-fem-export-schema.md` | ❌ missing | No such files in `docs/architecture/` |
 | D-3 | `docs/v1-release-notes.md` | ❌ missing | Not written yet (appropriately — it's written when V1 is done) |
 | D-4 | `master-plan-status.md` + `current-state.md` updated with V1 completion | ❌ not yet | `master-plan-status.md:179` still has V1 under **Next** |
 ### Summary
 - **Functional:** 1/8 ✅, 6/8 🟡 partial, 1/8 ❌ missing → the entity layer is real; the ingest + mirror + graduation surfaces need completion
 - **Quality:** 1/6 ✅, 3/6 🟡 partial, 2/6 ❌ missing → golden file + killer-correctness integration test are the two clear gaps
 - **Operational:** 1/5 ✅ (O-5, presumed rather than verified), 3/5 🟡, 1/5 ❌ → backup drill is the one hard blocker here
 - **Documentation:** 0/4 ✅, 4/4 ❌ → all 4 docs need writing, D-3/D-4 at the end, D-1/D-2 as part of their respective F criteria
 ---
 ## Proposed completion order (revised post-Codex review)
 Seven phases instead of six. The new V1-0 establishes the write-time
 invariants (provenance enforcement F-8 + synchronous conflict hooks F-5
 minimal) that every later phase depends on. V1-A becomes a **minimal query
 slice** that proves the model on one project, not a full catalog closure.
 Full query catalog closure moves to V1-C. Full F-5 spec compliance (the
 generic `conflicts`/`conflict_members` slot-key schema) stays in V1-F
 because that's the final shape, but the *minimal hooks* that fire
 synchronously on writes land in V1-0.
 Skipped by construction: F-1 core schema (already implemented) and O-5
 (identity/preference/episodic don't touch the engineering layer).
 ### Phase V1-0: Write-time invariants (F-8 + F-5 minimal + F-1 audit)
 **Scope:**
 - **F-1 audit (Codex action).** Before any code change, Codex does a
  per-file audit of `src/atocore/engineering/service.py`,
  `conflicts.py`, `mirror.py`, `queries.py` against the acceptance doc's
  F-1 shared-header-field list (`id, type, name, project_id, status,
  confidence, source_refs, created_at, updated_at, extractor_version,
  canonical_home`). Confirm which fields exist, which are missing. This
  becomes the ground-truth F-1 row in the gap audit table above.
 - **F-8 provenance enforcement.** Add a NOT-NULL invariant at
  `create_entity` and `promote_entity` that `source_refs` is non-empty
  OR an explicit `hand_authored=True` flag is set (per
  `promotion-rules.md:253`). Backfill any existing active entities that
  fail the invariant — either attach provenance, flag as hand-authored,
  or invalidate. Every future active entity has provenance by schema,
  not by discipline.
 - **F-5 minimal hooks.** Wire synchronous conflict detection into every
  active-entity write path (`create_entity` with status=active,
  `promote_entity`, `supersede_entity`). The *detector* can stay in its
  current per-type form (`_check_component_conflicts`,
  `_check_requirement_conflicts`); the *hook* must fire on every write.
  Full generic slot-keyed schema lands in V1-F; the hook shape must be
  generic enough that V1-F is a detector-body swap, not an API refactor.
 - **Q-3 "flag never block" test.** The hook must return conflict-id in
  the response body but never 4xx-block the write. One test per write
  path demonstrating this.
 - **Q-4 trust-hierarchy test for candidates.** One test: entity
  candidates never appear in `/context/build` output. (Full trust tests
  land in V1-E; this is the one that V1-0 can cover without graduation
  being ready.)
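
The F-8 rule in the scope list above (non-empty `source_refs` OR an explicit `hand_authored` flag) can be sketched as a single check called from the write paths. This is a sketch under assumptions: `EntityDraft`, `ProvenanceError` and `check_provenance` are hypothetical names, and applying the check only when the resulting status is `active` is one reading of the plan, not something the plan states.

```python
from dataclasses import dataclass, field
from typing import List


class ProvenanceError(ValueError):
    """Raised when an active entity would be written without provenance."""


@dataclass
class EntityDraft:
    name: str
    status: str = "candidate"
    source_refs: List[str] = field(default_factory=list)
    hand_authored: bool = False


def check_provenance(draft: EntityDraft) -> None:
    """Enforce the invariant for writes that result in an active entity."""
    if draft.status != "active":
        return  # assumption: candidates are checked at promotion time
    # Blank strings do not count as provenance.
    has_refs = any(ref.strip() for ref in draft.source_refs)
    if not has_refs and not draft.hand_authored:
        raise ProvenanceError(
            f"{draft.name!r}: active entity needs source_refs or hand_authored=True"
        )
```

Both `create_entity` (when called with status=active) and `promote_entity` would call the same check, so the rule lives in one place.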
 **Acceptance:** F-8 ✅, F-5 minimal hooks ✅, Q-3 ✅, partial Q-4 ✅,
 F-1 row in gap table is accurate.
 **Estimated size:** 3 days (the audit is the biggest unknown; the
 enforcement patches are small).
 **Tests added:** ~10.
 **Why first:** every later phase writes entities. Without F-8 + F-5
 hooks, V1-A through V1-F can leak invalid state into the store that
 must then be cleaned up.
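
The "flag, never block" behaviour required of the F-5 hook can be illustrated with a write function that always persists the entity and reports any conflicts alongside it. This is a hedged sketch: `write_active_entity`, the list-backed store, the conflict-id format, and the example detector are assumptions for illustration, not the shape of `detect_conflicts_for_entity`.

```python
from typing import Callable, Dict, List

# A detector inspects the incoming entity against the store and returns conflict ids.
Detector = Callable[[Dict, List[Dict]], List[str]]


def same_name_different_value(entity: Dict, store: List[Dict]) -> List[str]:
    """Example detector: same name already stored with a different value."""
    return [
        f"conflict:{other['id']}:{entity['id']}"
        for other in store
        if other["name"] == entity["name"] and other["value"] != entity["value"]
    ]


def write_active_entity(entity: Dict, store: List[Dict], detectors: List[Detector]) -> Dict:
    """Run every detector, then write regardless of what they found."""
    conflict_ids: List[str] = []
    for detect in detectors:
        conflict_ids.extend(detect(entity, store))
    store.append(entity)  # the write is never blocked by a conflict
    return {"entity": entity, "conflict_ids": conflict_ids}
```

Because detectors are passed in as plain callables, swapping the per-type detectors for a generic slot-keyed one later would not change the hook's signature.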
 ### Phase V1-A: Minimal query slice that proves the model (partial F-2 + Q-6)
 **Scope:**
 - Pick the **four queries that prove the model on p05-interferometer**:
  Q-001 (subsystem contents), Q-005 (component satisfies requirements),
  Q-006 (orphan requirements — killer correctness), Q-017 (evidence
  chain). These four exercise structural + intent + killer-correctness +
  provenance, which are the four pillars of the V1 shape.
 - Seed p05-interferometer with Q-6 integration data (one satisfying
  Component + one orphan Requirement + one Decision on flagged
  Assumption + one supported ValidationClaim + one unsupported
  ValidationClaim).
 - Verify each of the four queries returns correct results against the
  seeded data. The three killer-correctness queries (Q-006, Q-009,
  Q-011) run as a single integration test. Q-009 and Q-011 are
  exercised against the seed data here (both already exist in `queries.py`) even though they're not in the
  "four pillars" list, because Q-6 requires all three.
 - Any query function the Codex F-1 audit found to be missing fields
  required by Q-001/Q-005/Q-006/Q-017 gets filled in here, not in V1-C.
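
As an illustration of what the Q-006 seed case checks, an orphan-requirements query over in-memory data might look like the following. It assumes `SATISFIES` edges point from a Component (source) to a Requirement (target) and that only active requirements count; the real `orphan_requirements` in `queries.py` may differ on both points, and the seed ids are made up.

```python
from typing import Dict, List


def orphan_requirements(entities: List[Dict], relationships: List[Dict]) -> List[str]:
    """Ids of active Requirements that no entity SATISFIES."""
    satisfied = {
        rel["target"] for rel in relationships if rel["type"] == "SATISFIES"
    }
    return sorted(
        entity["id"]
        for entity in entities
        if entity["type"] == "Requirement"
        and entity["status"] == "active"
        and entity["id"] not in satisfied
    )


# Hypothetical seed: one satisfying Component, one covered and one orphan Requirement.
SEED_ENTITIES = [
    {"id": "cmp-mount", "type": "Component", "status": "active"},
    {"id": "req-covered", "type": "Requirement", "status": "active"},
    {"id": "req-orphan", "type": "Requirement", "status": "active"},
]
SEED_RELATIONSHIPS = [
    {"source": "cmp-mount", "type": "SATISFIES", "target": "req-covered"},
]
```

With this seed, the query should report only `req-orphan`; a candidate or invalidated requirement would be ignored.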
 **Acceptance:** The four pillar queries + Q-006/Q-009/Q-011 killer
 correctness all return correct results. Q-6 ✅ passes. Partial F-2
 (the remaining queries land in V1-C).
 **Estimated size:** 2 days.
 **Tests added:** ~6.
 **Why second:** proves the entity layer shape works end-to-end on real
 data before we start bolting ingest, graduation, or mirror onto it. If
 the four pillar queries don't work, stopping here is cheap.
 ### Phase V1-B: KB-CAD / KB-FEM ingest (F-3) + D-2 schema docs
 **Scope:**
 - Write `docs/architecture/kb-cad-export-schema.md` and
  `kb-fem-export-schema.md` (matches D-2).
 - Implement `POST /ingest/kb-cad/export` and `POST /ingest/kb-fem/export`
  per `tool-handoff-boundaries.md` sketches. Validator + entity-candidate
  producer + provenance population **using the F-8 invariant from V1-0**.
 - Hand-craft one real KB-CAD export for p05-interferometer and
  round-trip it: export → candidate queue → reviewer promotes → queryable
  via V1-A's four pillar queries.
 - Tests: valid export → candidates created; invalid export → 400;
  duplicate re-export → no duplicate candidates; re-export with changed
  value → new candidate + conflict row (exercises V1-0's F-5 hook on a
  real workload).
 **Acceptance:** F-3 ✅, D-2 ✅.
 **Estimated size:** 2 days.
 **Tests added:** ~8.
 **Why third:** ingest is the first real stress test of the V1-0
 invariants. A re-import that creates a conflict must trigger the V1-0
 hook; if it doesn't, V1-0 is incomplete and we catch it before going
 further.
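The four ingest test cases in the scope list can be sketched against an in-memory stand-in. Everything here is hypothetical: the `CandidateQueue` class, the export shape (`source`, `items`, `key`/`field`/`value`), and the content-hash dedup strategy are assumptions for illustration, not the schema that `kb-cad-export-schema.md` will define.

```python
import hashlib
import json

REQUIRED_ITEM_FIELDS = {"key", "field", "value"}


def fingerprint(item):
    """Stable content hash of one export item (assumed dedup strategy)."""
    payload = json.dumps(
        {k: item[k] for k in sorted(REQUIRED_ITEM_FIELDS)}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class CandidateQueue:
    """In-memory stand-in for the candidate queue plus conflict rows."""

    def __init__(self):
        self.candidates = {}  # fingerprint -> candidate dict
        self.latest = {}      # (key, field) -> latest value seen
        self.conflicts = []   # conflict rows

    def ingest(self, export):
        """Validate first, then enqueue; returns an HTTP-like status code."""
        if not isinstance(export, dict) or not export.get("source"):
            return 400
        items = export.get("items")
        if not isinstance(items, list) or not all(
            isinstance(i, dict) and REQUIRED_ITEM_FIELDS <= i.keys() for i in items
        ):
            return 400
        for item in items:
            fp = fingerprint(item)
            if fp in self.candidates:
                continue  # duplicate re-export: nothing new to enqueue
            slot = (item["key"], item["field"])
            if slot in self.latest and self.latest[slot] != item["value"]:
                # Changed value on a known slot: record a conflict row.
                self.conflicts.append(
                    {"slot": slot, "values": (self.latest[slot], item["value"])}
                )
            # Every candidate carries provenance (the F-8 invariant).
            self.candidates[fp] = {**item, "provenance": export["source"]}
            self.latest[slot] = item["value"]
        return 200


queue = CandidateQueue()
export = {
    "source": "kb-cad:example-export-1",
    "items": [{"key": "mirror-mount", "field": "mass_kg", "value": 1.2}],
}
first = queue.ingest(export)
second = queue.ingest(export)  # duplicate re-export
after_duplicate = len(queue.candidates)
changed = queue.ingest(
    {
        "source": "kb-cad:example-export-2",
        "items": [{"key": "mirror-mount", "field": "mass_kg", "value": 1.4}],
    }
)
bad = queue.ingest({"items": []})  # missing source
```

Validation runs over the whole export before any write, so a malformed item cannot leave a half-ingested export behind.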
 ### Phase V1-C: Close the rest of the query catalog (remaining F-2)
 **Scope:**
 - Implement remaining v1-required queries: Q-002 (component parents),
  Q-003 (subsystem interfaces, with Interface as simple string label),
  Q-004 (project system-map tree), Q-007 (component constraints),
  Q-008 (decisions affecting an entity, full shape), Q-010 (supports
  trace to AnalysisModel), Q-012 (conflicting results on same claim —
  exercises V1-0's F-5 hook), Q-013 (recent changes with window),
  Q-014 (decision log ordered + superseded chain), Q-016 (impact
  analysis — likely already done, just verify shape), Q-018
  (`include=superseded`), Q-019 (material → components).
 - Q-020 (project overview mirror route) is deferred to V1-D where the
  mirror lands in full.
 **Acceptance:** F-2 ✅ (all 19 of 20 v1-required queries; Q-020 in V1-D).
 **Estimated size:** 2 days.
 **Tests added:** ~12.
 **Why fourth:** with the model proven (V1-A) and ingest exercising the
 write invariants (V1-B), filling in the remaining queries is mechanical.
 They all sit on top of the same entity store and V1-0 invariants.
 ### Phase V1-D: Full Mirror surface (F-6) + determinism golden file (Q-5) + Q-020
 **Scope:**
 - Split the single `/projects/{project_name}/mirror` route into the three
  spec routes: `/mirror/{project}/overview` (= Q-020),
  `/mirror/{project}/decisions`, `/mirror/{project}/subsystems/{subsystem}`.
 - Add `POST /mirror/{project}/regenerate`.
 - Move generated files to `/srv/storage/atocore/data/mirror/{project}/`.
 - **Deterministic output:** stabilize the `now` timestamp (input
  parameter pinned by golden tests), sort every iteration, remove
  `dict` ordering dependencies.
 - `⚠ disputed` markers inline wherever an open conflict touches a
  rendered field (uses V1-0's F-5 hook output).
 - `(curated)` annotations where project_state overrides entity state.
 - Regeneration triggers: synchronous on regenerate, debounced async on
  entity write (30s window), daily scheduled refresh via existing
  nightly cron (one new cron line, not a new queue).
 - `mirror_regeneration_failures` table.
 - Golden-file test: fixture project state → render → bytes equal.
 **Acceptance:** F-6 ✅, Q-5 ✅, Q-020 ✅, O-4 moves toward ✅.
 **Estimated size:** 3–4 days.
 **Tests added:** ~15.
 **Why fifth:** mirror is a derived consumer. It cannot be correct
 before the entity store + queries + conflict hooks are correct. It
 lands after everything it depends on is stable.
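A minimal sketch of the determinism rule, assuming a hypothetical `render_overview` function and dict-shaped entities (neither is the real `mirror.py` API): `now` arrives as a parameter, every iteration is sorted, and the result is bytes so a golden-file test can compare exactly.

```python
from datetime import datetime, timezone


def render_overview(project, entities, disputed_ids, now):
    """Render a project overview as markdown bytes, deterministically.

    `now` is an explicit input so a golden-file test can pin it.
    """
    lines = [
        f"# {project} overview",
        f"Regenerated: {now.isoformat()}",
        "",
    ]
    # Sort every iteration so output never depends on insertion order.
    for entity in sorted(entities, key=lambda e: (e["type"], e["name"])):
        marker = " ⚠ disputed" if entity["id"] in disputed_ids else ""
        lines.append(f"- {entity['type']}: {entity['name']}{marker}")
    return ("\n".join(lines) + "\n").encode("utf-8")


# Hypothetical fixture; the pinned timestamp is arbitrary.
PINNED_NOW = datetime(2026, 4, 22, 12, 0, 0, tzinfo=timezone.utc)
entities = [
    {"id": "r1", "type": "Requirement", "name": "mount stiffness"},
    {"id": "c1", "type": "Component", "name": "mirror-mount"},
]

first = render_overview("p05-interferometer", entities, {"r1"}, PINNED_NOW)
second = render_overview(
    "p05-interferometer", list(reversed(entities)), {"r1"}, PINNED_NOW
)
print(first == second)
```

In a real golden-file test, `first` would be compared against the bytes of a checked-in fixture file rather than against a second render.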
 ### Phase V1-E: Memory→entity graduation end-to-end (F-7) + remaining Q-4
 **Scope:**
 - Verify and close F-7 spec gaps:
  - Original memory gets `status="graduated"` (new status).
  - Forward-pointer column from graduated memory to entity candidate id.
  - Promote-entity preserves original memory.
  - Flow tested for `adaptation` → Decision, `project` → Requirement,
    `knowledge` → Fact.
 - Minimal schema additions: one column + one new status value; additive
  migration only.
 - **Q-4 full trust-hierarchy tests**: no auto-write to project_state
  from any promote path; active-only reinforcement for entities; etc.
  (The entity-candidates-excluded-from-context test shipped in V1-0.)
 **Acceptance:** F-7 ✅, Q-4 ✅.
 **Estimated size:** 2 days.
 **Tests added:** ~8.
 **Why sixth:** graduation touches memory-layer semantics (adds a
 `graduated` status, flows memory→entity, requires memory-module changes).
 Doing it after the entity layer is fully invariant-safe + query-complete
 + mirror-derived means the memory side only has to deal with one shape:
 a stable, tested entity layer.
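The graduation contract above can be sketched with plain dataclasses. The type mapping comes from the scope list; the `Memory` and `EntityCandidate` shapes, the `graduated_to` field name, and the refuse-to-graduate-twice guard are assumptions for this example and may not match the existing graduation code.

```python
from dataclasses import dataclass
from typing import Optional

# Mapping taken from the V1-E scope list.
MEMORY_TYPE_TO_ENTITY_TYPE = {
    "adaptation": "Decision",
    "project": "Requirement",
    "knowledge": "Fact",
}


@dataclass
class Memory:
    id: str
    memory_type: str
    content: str
    status: str = "active"
    graduated_to: Optional[str] = None  # forward pointer (hypothetical name)


@dataclass
class EntityCandidate:
    id: str
    entity_type: str
    content: str
    source_memory_id: str
    status: str = "candidate"


def graduate(memory, candidate_id):
    """Create an entity candidate from a memory and mark the memory graduated."""
    if memory.status == "graduated":
        raise ValueError(f"memory {memory.id} already graduated")
    entity_type = MEMORY_TYPE_TO_ENTITY_TYPE.get(memory.memory_type)
    if entity_type is None:
        raise ValueError(f"no graduation target for {memory.memory_type!r}")
    candidate = EntityCandidate(candidate_id, entity_type, memory.content, memory.id)
    memory.status = "graduated"
    memory.graduated_to = candidate.id
    return candidate


def promote(candidate):
    """Human-approved promotion; changes only the candidate, never the memory."""
    candidate.status = "active"
    return candidate


memory = Memory("m1", "adaptation", "Use Invar for the mount")
candidate = promote(graduate(memory, "ec1"))
print(memory.status, memory.graduated_to, candidate.entity_type)
```

Keeping `promote` free of any reference to the memory is one simple way to make "promote-entity preserves original memory" true by construction.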
 ### Phase V1-F: Full F-5 spec compliance + O-1/O-2/O-3 + D-1/D-3/D-4
 **Scope:**
 - **F-5 full spec compliance.** Audit `conflicts.py` against
  `conflict-model.md`. The spec wants a generic `conflicts` +
  `conflict_members` table with slot-keyed detection. V1-0 put the hook
  in place; V1-F is where the detector body gets swapped to the generic
  shape if the audit shows divergence.
  - If schema already matches spec: no work.
  - If divergent: migrate additively (new tables alongside existing,
    dual-read, drop old after one stable release).
  - Rename `/admin/conflicts/*` routes to `/conflicts/*` per spec,
    keep `/admin/conflicts/*` as aliases for one release, deprecate in
    D-3 release notes.
 - **O-1:** Run the full migration against a Dalidou backup copy.
  Confirm additive, idempotent, safe to run twice.
 - **O-2:** Run a full restore drill on the test project per
  `docs/backup-restore-procedure.md`. Post-restore, Q-001 returns
  correct shape. `POST /admin/backup` snapshot includes the new tables.
 - **O-3:** Manual sanity-check of the three performance bounds.
 - **D-1:** Write 12 short spec docs under `docs/architecture/entities/`
  (one per V1 entity type).
 - **D-3:** Write `docs/v1-release-notes.md`.
 - **D-4:** Update `master-plan-status.md` and `current-state.md` —
  move engineering V1 from **Next** to **What Is Real Today**.
 **Acceptance:** F-5 ✅, O-1 ✅, O-2 ✅, O-3 ✅, D-1 ✅, D-3 ✅, D-4 ✅ →
 **V1 is done.**
 **Estimated size:** 3 days (F-5 migration if needed is the main unknown;
 D-1 entity docs at ~30 min each ≈ 6 hours; verification is fast).
 **Tests added:** ~6 (F-5 spec-shape tests; verification adds no automated
 tests).
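The O-1 requirement (additive, idempotent, safe to run twice) can be illustrated as follows. This is a sketch that assumes a SQLite store; the `memories` table, the `graduated_to` column, and the column layout of `conflicts` / `conflict_members` are hypothetical and are not taken from `conflict-model.md`.

```python
import sqlite3


def column_exists(conn, table, column):
    """True if `table` already has `column` (PRAGMA row index 1 is the name)."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def migrate(conn):
    """Additive, idempotent migration: safe to run any number of times."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS conflicts ("
        "id INTEGER PRIMARY KEY, "
        "slot_key TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'open')"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS conflict_members ("
        "conflict_id INTEGER NOT NULL REFERENCES conflicts(id), "
        "entity_id TEXT NOT NULL, "
        "PRIMARY KEY (conflict_id, entity_id))"
    )
    # SQLite has no ADD COLUMN IF NOT EXISTS, so check before altering.
    if not column_exists(conn, "memories", "graduated_to"):
        conn.execute("ALTER TABLE memories ADD COLUMN graduated_to TEXT")
    conn.commit()


def schema_snapshot(conn):
    """Sorted CREATE statements, used to compare schema before and after."""
    return sorted(
        row[0]
        for row in conn.execute("SELECT sql FROM sqlite_master WHERE sql IS NOT NULL")
    )


conn = sqlite3.connect(":memory:")
# Stand-in for a pre-existing table in the backup copy.
conn.execute("CREATE TABLE memories (id TEXT PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO memories VALUES ('m1', 'active')")

migrate(conn)
after_first = schema_snapshot(conn)
migrate(conn)  # second run must be a no-op
after_second = schema_snapshot(conn)
print(after_first == after_second)
```

Comparing schema snapshots between the first and second run gives a cheap "safe to run twice" check; the surviving `m1` row shows the migration is additive.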
 ### Total (revised)
 - Estimated **12–17 days of focused work** across seven phases — up from
  the original 11–14 days to reflect V1-0 overhead and Codex's objection
  that the first estimate was too tight.
 - Adds roughly **65 tests** (533 → ~600).
 - Branch strategy: one branch per phase (V1-0 → V1-F), each squash-merged
  to main after Codex review. Phases sequential because each builds on
  the previous. **V1-0 is a hard prerequisite for all later phases** —
  nothing starts until V1-0 lands.
 ---
 ## Sequencing with the `master-plan-status.md` Now list
 The **Now** list from master-plan-status.md:159-169 is:
 1. Observe the enhanced pipeline (1 week soak — first F4 confidence decay
   run was 2026-04-19 per Trusted State, so soak window ends ~2026-04-26)
 2. Knowledge density — batch extract over 234 interactions, target 100+
   active memories (currently 84)
 3. Multi-model triage (Phase 11 entry)
 4. Fix p04-constraints harness failure
 **Principle (revised per Codex review):** V1 work and the Now list are
 **less disjoint than the first draft claimed**. Real collision points:
 | V1 phase | Collides with Now list at |
 |---|---|
 | V1-0 provenance enforcement | memory extractor write path if it shares helper functions; context assembly for the Q-4 partial trust test |
 | V1-0 F-5 hooks | any write path that creates active rows (limited collision; entity writes are separate from memory writes) |
 | V1-B KB-CAD/FEM ingest | none on the Now list, but adds an ingest surface that becomes operational burden (ties to O-4 "no new manual ops") |
 | V1-D mirror regen triggers | scheduling / ops behavior that intersects with "boring and dependable" gate — mirror regen failures become an observable that the pipeline soak must accommodate |
 | V1-E graduation | memory module (new `graduated` status, memory→entity flow); direct collision with memory extractor + triage |
 | V1-F F-5 migration | conflicts.py touches the write path shared with memory promotion |
 **Recommended schedule (revised):**
 - **This week (2026-04-22 to 2026-04-26):** Pipeline soak continues.
  Density batch-extract continues. V1 work **waits** — V1-0 would start
  touching write paths, which is explicitly something we should not do
  during a soak window. Density target (100+ active memories) and the
  pipeline soak complete first.
 - **Week of 2026-04-27:** If the soak is clean and the density target is
  reached, V1-0 starts. V1-0 is a hard prerequisite and cannot be skipped
  or parallelized.
 - **Weeks of 2026-05-04 and 2026-05-11:** V1-A through V1-D in order.
  Multi-model triage work (Now list item 3) continues in parallel only
  if its touch-surface is triage-path-only (memory side). Any memory
  extractor change pauses V1-E.
 - **Week of 2026-05-18 approx:** V1-E (graduation). **This phase must
  not run in parallel with memory extractor changes** — it directly
  modifies memory module semantics. Multi-model triage should be settled
  before V1-E starts.
 - **Week of 2026-05-25:** V1-F.
 - **End date target:** ~2026-06-01, four weeks later than the first
  draft's 2026-05-18 soft target. The shift is deliberate — the first
  draft's "parallel / disjoint" claim understated the real collisions.
 **Pause points (explicit):**
 - Any Now-list item that regresses the pipeline → V1 pauses immediately.
 - Memory extractor changes in flight → V1-E pauses until they land and
  complete their own soak.
 - p04-constraints fix requires retrieval ranking changes → V1 does not
  pause (retrieval is genuinely disjoint from entities).
 - Multi-model triage work touching the entity extractor path (if one
  gets prototyped) → V1-0 pauses until the triage decision settles.
 ---
 ## Test project
 Per `engineering-v1-acceptance.md:379`, the recommended test bed is
 **p05-interferometer** — "the optical/structural domain has the cleanest
 entity model". I agree. Every F-2, F-3, F-6 criterion asserts against this
 project.
 p06-polisher is the backup test bed if p05 turns out to have data gaps
 (polisher suite is actively worked and has more content).
 ---
 ## What V1 completion does NOT include
 Per the negative list in `engineering-v1-acceptance.md:351-373`, all of the
 following are **explicitly out of scope** for this plan:
 - LLM extractor for entities (rule-based is V1)
 - Auto-promotion of candidates (human-only in V1)
 - Write-back to KB-CAD / KB-FEM
 - Multi-user auth
 - Real-time UI (API + Mirror markdown only)
 - Cross-project rollups
 - Time-travel queries (Q-015 stays stretch)
 - Nightly conflict sweep (synchronous only)
 - Incremental Chroma snapshots
 - Retention cleanup script
 - Backup encryption
 - Off-Dalidou backup target (already exists at clawdbot per ledger, but
  not a V1 criterion)
 - **Async job queue / minions pattern** (the rejected plan's centerpiece —
  explicitly deferred to post-V1 per the negative list)
 ---
 ## Open questions for Codex
 1. **Is the revised schedule against the Now list acceptable?** Claude's
   first-draft read was that V1 work and Now items touch disjoint surfaces;
   the collision table above revises that. Codex may see collisions Claude
   still missed.
 2. **Phase V1-A query audit scope.** Claude listed Q-001, Q-002, Q-003,
   Q-007, Q-010, Q-012, Q-014, Q-018, Q-019, Q-020 as likely gaps without
   reading each query function end-to-end. Codex's per-file audit may find
   more already done (or more missing).
 3. **F-5 conflicts schema divergence.** The current code uses per-type
   detectors (`_check_component_conflicts`, `_check_requirement_conflicts`)
   whereas the spec wants a generic slot-keyed `conflicts` + `conflict_members`.
   Is the existing schema *equivalent* (just implemented differently) or
   *divergent* (needs migration)? This is a one-read decision for Codex.
 4. **Should F-5 route rename (`/admin/conflicts/*` → `/conflicts/*`) be
   breaking?** Spec route path differs from current. Proposal: add
   `/conflicts/*` as aliases, keep `/admin/conflicts/*` for one release,
   deprecate in V1 release notes, remove in V1.1.
 5. **Mirror determinism — where does `now` go?** The current mirror footer
   has a live timestamp (line 326 of `mirror.py`). Spec says deterministic
   output, spec also shows a `Regenerated:` header with timestamp (line
   265 of `human-mirror-rules.md`). Reconciliation: timestamp is allowed
   in the header banner but must be an input parameter so the golden-file
   test can pin it. Sound right?
 6. **F-7 graduation gap depth.** Without running the existing graduation
   flow end-to-end against a real memory, Claude can't tell how close the
   existing code is to F-7 spec. Codex's audit of
   `_graduation_prompt.py` + `api_request_graduation` + DB schema would
   close this question in one read.
 7. **Estimated 12–17 days honest?** Given recent phase velocities (Phase
   7A was a week, Phase 7D fit in a single session), 2–4 days per phase
   across seven phases may be light or heavy. Codex's calibration against
   actual repo velocity would help.
 8. **After V1, the minions/queue mechanic we rejected returns as a
   candidate V2 item.** Should we note it explicitly in V1 release notes
   (D-3) as a future track, or leave it unnamed until V2 planning starts?
 ---
 ## Risks
 | Risk | Mitigation |
 |---|---|
 | V1 work slows the Now list | V1 pauses on any Now-list blocker. Codex veto on any V1 PR that touches memory extractor, retrieval ranking, or triage paths |
 | F-5 schema migration is bigger than estimated | If Codex audit shows material divergence, split V1-F into two phases (F-5 schema migration separate from the ops and docs work) |
 | Mirror determinism regresses existing mirror output | Keep `/projects/{project_name}/mirror` alias returning the current shape; new `/mirror/{project}/overview` is the spec-compliant one. Deprecate old in V1 release notes |
 | Golden file churn as templates evolve | Standard workflow: updating a golden file is a normal part of template work, documented in the V1-D commit message |
 | Backup drill on Dalidou is disruptive | Run against a clone of the Dalidou DB at a safe hour; no production drill required for V1 acceptance |
 | p05-interferometer data gaps | Fall back to p06-polisher per this plan's test-project section |
 | Scope creep during V1-A query audit | Any query that isn't in the v1-required set (Q-021 onward) is out of scope, period |
 ---
 ## What this plan is **for**
 1. A checklist Claude can follow to close V1.
 2. A review target for Codex — every phase has explicit acceptance
   criteria tied to the acceptance doc.
 3. A communication artifact for Antoine — "here's what's left, here's why,
   here's the order, here's the risk."
 ## What this plan is **not**
 1. Not a commitment to start tomorrow. Pipeline soak + density
   batch-extract come first; V1-0 starts only once both are complete
   (see the recommended schedule above).
 2. Not a rewrite. Every phase builds on existing code.
 3. Not an ontology debate. The ontology is fixed in
   `engineering-ontology-v1.md`. Any desire to change it is V2 material.
 ## Workspace note (for Codex audit)
 Codex's first-round review (2026-04-22) flagged that
 `docs/plans/engineering-v1-completion-plan.md` and `DEV-LEDGER.md` were
 **not visible** in the Playground workspace they were running against,
 and that `src/atocore/engineering/` appeared empty there.
 The canonical dev workspace for AtoCore is the Windows path
 `C:\Users\antoi\ATOCore` (per `CLAUDE.md`). The engineering layer code
 (`src/atocore/engineering/service.py`, `queries.py`, `conflicts.py`,
 `mirror.py`, `_graduation_prompt.py`, `wiki.py`, `triage_ui.py`) exists
 there and is what the recent commits (e147ab2, b94f9df, 081c058, 069d155,
 b1a3dd0) touched. The Windows working tree is what this plan was written
 against.
 Before the file-level audit:
 1. Confirm which branch / SHA Codex is reviewing. The Windows working
   tree has uncommitted changes to this plan + DEV-LEDGER as of
   2026-04-22; commit will be made only after Antoine approves sync.
 2. If Codex is reviewing `ATOCore-clean` or a Playground snapshot, that
   tree may lag the canonical dev tree. Sync or re-clone from the
   Windows working tree / current `origin/main` before per-file audit.
 3. The three visible-to-Codex file paths for this plan are:
   - `docs/plans/engineering-v1-completion-plan.md` (this file)
   - `docs/decisions/2026-04-22-gbrain-plan-rejection.md` (prior decision)
   - `DEV-LEDGER.md` (Recent Decisions + Session Log entries 2026-04-22)
 ## References
 - `docs/architecture/engineering-ontology-v1.md`
 - `docs/architecture/engineering-query-catalog.md`
 - `docs/architecture/memory-vs-entities.md`
 - `docs/architecture/engineering-v1-acceptance.md`
 - `docs/architecture/promotion-rules.md`
 - `docs/architecture/conflict-model.md`
 - `docs/architecture/human-mirror-rules.md`
 - `docs/architecture/tool-handoff-boundaries.md`
 - `docs/master-plan-status.md` (Now/Next/Later list)
 - `docs/decisions/2026-04-22-gbrain-plan-rejection.md` (the rejected plan)
 - `src/atocore/engineering/service.py` (current V1 entity service)
 - `src/atocore/engineering/queries.py` (current V1 query implementations)
 - `src/atocore/engineering/conflicts.py` (current conflicts module)
 - `src/atocore/engineering/mirror.py` (current mirror module)