docs(planning): V1 Completion Plan revised per Codex file-level audit

Three findings folded in, all with exact file:line refs from Codex:

- F-1 downgraded from done to partial. Entity dataclass at
  service.py:67 and entities table missing extractor_version and
  canonical_home fields per engineering-v1-acceptance.md:45. V1-0
  scope now adds both via additive migration + doc note that
  project is the project_id per "fields equivalent to" wording.

- F-2 replaced guesses with ground truth per-query status:
  9 of 20 v1-required queries done, 1 partial (Q-001 needs
  subsystem-scoped variant), 10 missing. V1-A scope shrank to
  Q-001 shape fix + Q-6 integration. V1-C closes the 8 net-new
  queries; Q-020 deferred to V1-D (mirror).

- F-5 reframed. Generic conflicts + conflict_members schema
  already present at database.py:190, no migration needed.
  Divergence is detector body (per-type dispatch needs
  generalization) + routes (/admin/conflicts/* needs
  /conflicts/* alias). V1-F scope is detector + routes only.

Totals revised: 16.5-17.5 days, ~60 tests.

Three of Codex's eight open questions now resolved. Remaining:
F-7 graduation depth, mirror determinism, project naming,
velocity calibration, minions-as-V2 naming.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 14:09:50 -04:00
parent ce3a87857e
commit 44724c81ab
2 changed files with 124 additions and 100 deletions


@@ -146,6 +146,7 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
 ## Recent Decisions
+- **2026-04-22** **Engineering V1 Completion Plan revised per Codex second-round file-level audit**: three findings folded in, all with exact file:line refs from Codex: (1) F-1 downgraded from ✅ to 🟡: `extractor_version` and `canonical_home` missing from `Entity` dataclass and `entities` table per `engineering-v1-acceptance.md:45`; V1-0 scope now adds both fields via additive migration + doc note that `project` IS `project_id` per "fields equivalent to" spec wording; (2) F-2 replaced with ground-truth per-query status: 9 of 20 v1-required queries done (Q-004/Q-005/Q-006/Q-008/Q-009/Q-011/Q-013/Q-016/Q-017), 1 partial (Q-001 needs subsystem-scoped variant), 10 missing (Q-002/003/007/010/012/014/018/019/020); V1-A scope shrank to Q-001 shape fix + Q-6 integration (pillar queries already implemented); V1-C closes the 8 remaining new queries + Q-020 deferred to V1-D; (3) F-5 reframed: generic `conflicts` + `conflict_members` schema already present at `database.py:190`, no migration needed; divergence is detector body (per-type dispatch needs generalization) + routes (`/admin/conflicts/*` needs `/conflicts/*` alias). Total revised to 16.5-17.5 days, ~60 tests. Plan: `docs/plans/engineering-v1-completion-plan.md` at commit `ce3a878` (Codex pulled clean). Three of Codex's eight open questions now answered; remaining: F-7 graduation depth, mirror determinism, `project` rename question, velocity calibration, minions naming. *Proposed by:* Claude. *Reviewed by:* Codex (two rounds).
 - **2026-04-22** **Engineering V1 Completion Plan revised per Codex first-round review**: original six-phase order (queries → ingest → mirror → graduation → provenance → ops) rejected by Codex as backward: provenance-at-write (F-8) and conflict-detection hooks (F-5 minimal) must precede any phase that writes active entities. Revised to seven phases: V1-0 write-time invariants (F-8 + F-5 hooks + F-1 audit) as hard prerequisite, V1-A minimum query slice proving the model, V1-B ingest, V1-C full query catalog, V1-D mirror, V1-E graduation, V1-F full F-5 spec + ops + docs. Also softened "parallel with Now list": real collision points listed explicitly; schedule shifted ~4 weeks to reflect that V1-0 cannot start during pipeline soak. Withdrew the "50-70% built" global framing in favor of the per-criterion gap table. Workspace sync note added: Codex's Playground workspace can't see the plan file; canonical dev tree is Windows `C:\Users\antoi\ATOCore`. Plan: `docs/plans/engineering-v1-completion-plan.md`. Awaiting Codex file-level audit once workspace syncs. *Proposed by:* Claude. *First-round review by:* Codex.
 - **2026-04-22** gbrain-inspired "Phase 8 Minions + typed edges" plan **rejected as packaged**: wrong sequencing (leapfrogged `master-plan-status.md` Now list), wrong predicate set (6 vs V1's 17), wrong canonical boundary (edges-on-wikilinks instead of typed entities+relationships per `memory-vs-entities.md`). Mechanic (durable jobs + typed graph) deferred to V1 home. Record: `docs/decisions/2026-04-22-gbrain-plan-rejection.md`. *Proposed by:* Claude. *Reviewed/rejected by:* Codex. *Ratified by:* Antoine.
 - **2026-04-12** Day 4 gate cleared: LLM-assisted extraction via `claude -p` (OAuth, no API key) is the path forward. Rule extractor stays as default for structural cues. *Proposed by:* Claude. *Ratified by:* Antoine.
@@ -162,6 +163,8 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
 ## Session Log
+- **2026-04-22 Claude (late night)** Codex second-round review did the full file-level audit and came back with three P1/P2 findings, all with exact file:line refs. Verified each against current code before revising. (1) **F-1 not clean**: `Entity` dataclass at `service.py:67` and `entities` table schema are missing the `extractor_version` and `canonical_home` shared-header fields required by `engineering-v1-acceptance.md:45`; `project` field is the project identifier but not named `project_id` as spec writes (spec wording "fields equivalent to" allows the naming, but needs explicit doc note). V1-0 scope now includes adding both missing fields via additive `_apply_migrations` pattern. (2) **F-2 needed exact statuses, not guesses**: per-function audit gave ground truth: 9 of 20 v1-required queries done, 1 partial (Q-001 returns project-wide tree not subsystem-scoped expand=contains per `engineering-query-catalog.md:71`), 10 missing. V1-A scope shrank to Q-001 shape fix + Q-6 integration (most pillar queries already implemented); V1-C closes the 8 net-new queries + Q-020 to V1-D. (3) **F-5 misframed**: the generic `conflicts` + `conflict_members` schema is ALREADY spec-compliant at `database.py:190`; divergence is detector body at `conflicts.py:36` (per-type dispatch needs generalization) + route path (`/admin/conflicts/*` needs `/conflicts/*` alias). V1-F no longer includes a schema migration; detector generalization + route alignment only. Totals revised to 16.5-17.5 days, ~60 tests (down from 12-17 / 65 because V1-A and V1-F scopes both shrank after audit). Three of the eight open questions resolved. Remaining open: F-7 graduation depth, mirror determinism, `project` naming, velocity calibration, minions-as-V2 naming. No code changes this session, plan + ledger only. Next: commit + push revised plan, then await Antoine+Codex joint sign-off before V1-0 starts.
 - **2026-04-22 Claude (night)** Codex first-round review of the V1 Completion Plan summary came back with four findings. Three substantive, one workspace-sync: (1) "50-70% built" too loose: replaced with per-criterion table, global framing withdrawn; (2) phase order backward: provenance-at-write (F-8) and conflict hooks (F-5 minimal) are depended upon by every later phase but were in V1-E; new V1-0 prerequisite phase inserted to establish write-time invariants, and V1-A shrunk to a minimum query slice (four pillars Q-001/Q-005/Q-006/Q-017 + Q-6 integration) rather than full catalog closure; (3) "parallel with Now list / disjoint surfaces" too strong: real collisions listed explicitly (V1-0 provenance + memory extractor write path, V1-E graduation + memory module, V1-F conflicts migration + memory promote); schedule shifted ~4 weeks, V1-0 cannot start during pipeline soak; (4) Codex's Playground workspace can't see the plan file or the `src/atocore/engineering/` code: added a Workspace note to the plan directing per-file audit at the Windows canonical dev tree (`C:\Users\antoi\ATOCore`) and noting the three visible file paths (`docs/plans/engineering-v1-completion-plan.md`, `docs/decisions/2026-04-22-gbrain-plan-rejection.md`, `DEV-LEDGER.md`). Revised plan estimate: 12-17 days across 7 phases (up from 11-14 / 6), ~65 tests added (up from ~50). V1-0 is a hard prerequisite; no later phase starts until it lands. Pending Antoine decision on workspace sync (commit+push vs paste-to-Codex) so Codex can do the file-level audit. No code changes this session.
 - **2026-04-22 Claude (late eve)** After the rejection, read the four core V1 architecture docs end-to-end (`engineering-ontology-v1.md`, `engineering-query-catalog.md`, `memory-vs-entities.md`, `engineering-v1-acceptance.md`) plus the four supporting docs (`promotion-rules.md`, `conflict-model.md`, `human-mirror-rules.md`, `tool-handoff-boundaries.md`). Cross-referenced against current code in `src/atocore/engineering/`. **Key finding:** V1 is already 50-70% built: entity types (16, superset of V1's 12), all 18 V1 relationship types, 4-state lifecycle, CRUD + supersede + invalidate + PATCH, queries module with most killer-correctness queries (orphan_requirements, risky_decisions, unsupported_claims, impact_analysis, evidence_chain), conflicts module scaffolded, mirror scaffolded, graduation endpoint scaffolded. Recent commits e147ab2/b94f9df/081c058/069d155/b1a3dd0 are all V1 entity-layer work. Drafted `docs/plans/engineering-v1-completion-plan.md` reframing the work as **V1 completion, not V1 start**. Six sequential phases V1-A through V1-F, estimated 11-14 days, ~50 new tests (533 → ~580). Phases run in parallel with the Now list (pipeline soak + density + multi-model triage + p04-constraints) because surfaces are disjoint. Plan explicitly defers the minions/queue mechanic per acceptance-doc negative list. Pending Codex audit of the plan itself, especially the F-2 query gap list (Claude didn't read each query function end-to-end), F-5 conflicts schema divergence (per-type detectors vs spec's generic slot-keyed shape), and F-7 graduation depth. No code changes this session.

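The F-5 finding in the session log above says the detector should move from per-type dispatch to slot-key-driven detection. The following is a minimal sketch of what that generalization could look like, not the project's implementation: `SlotClaim`, `detect_slot_conflicts`, and the slot-key string format are all hypothetical names chosen for illustration, and the real rules live in `conflict-model.md`.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SlotClaim:
    """Hypothetical view of one active entity's claim on a slot."""
    entity_id: str
    slot_key: str  # illustrative format, e.g. "component:mount:material"
    value: str


def detect_slot_conflicts(claims):
    """Group claims by slot key; a slot with more than one distinct value conflicts.

    Returns {slot_key: sorted list of member entity ids}. No per-type
    branching is needed because the slot key already encodes the type.
    """
    by_slot = defaultdict(list)
    for claim in claims:
        by_slot[claim.slot_key].append(claim)

    conflicts = {}
    for slot_key, members in by_slot.items():
        if len({m.value for m in members}) > 1:
            conflicts[slot_key] = sorted(m.entity_id for m in members)
    return conflicts
```

Under this assumption, each returned slot would correspond to one `conflicts` row and each listed entity id to one `conflict_members` row.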

@@ -77,11 +77,11 @@ capability exists but does not yet match spec shape or coverage.
 | ID | Criterion | Status | Evidence |
 |----|-----------|--------|----------|
-| F-1 | 12 V1 entity types, 4 relationship families, shared header fields, 4-state lifecycle | ✅ done | `service.py:16-36` (16 types, superset of V1 minimum), `service.py:38-62` (18 relationship types), `service.py:64` statuses, `Entity` dataclass at line 67 |
-| F-2 | All v1-required Q-001 through Q-020 implemented, with provenance where required | 🟡 partial | `queries.py` has system_map (Q-004), decisions_affecting (Q-008), requirements_for (Q-005 component side), recent_changes (Q-013), orphan_requirements (Q-006 killer), risky_decisions (Q-009 killer), unsupported_claims (Q-011 killer), impact_analysis (Q-016), evidence_chain (Q-017). Likely missing or partial: Q-001 (expand=contains), Q-002 (expand=parents), Q-003 (interfaces), Q-007 (constraints on component), Q-010 (supports trace), Q-012 (conflicting results), Q-014 (decision-log ordered chain), Q-018 (include=superseded chain), Q-019 (material→components), Q-020 (project overview mirror endpoint in V1-required shape) |
+| F-1 | 12 V1 entity types, 4 relationship families, shared header fields, 4-state lifecycle | 🟡 partial (per Codex 2026-04-22 audit) | `service.py:16-36` has 16 types (superset of V1's 12), `service.py:38-62` has 18 relationship types, `service.py:64` statuses, `Entity` dataclass at line 67. **Gaps vs `engineering-v1-acceptance.md:45`**: `extractor_version` missing from dataclass and `entities` table; `canonical_home` missing from dataclass and table; `project` field is the project identifier but not named `project_id` as spec uses; spec says "fields equivalent to" so naming flexibility is allowed but needs an explicit doc note. Remediation lands in V1-0 |
+| F-2 | All v1-required Q-001 through Q-020 implemented, with provenance where required | 🟡 partial (per Codex 2026-04-22 per-function audit) | **Ground truth from per-function read of `queries.py` + `routes.py:2092+`:** Q-001 partial (`system_map()` returns project-wide tree, not the catalog's subsystem-scoped `GET /entities/Subsystem/<id>?expand=contains` shape per `engineering-query-catalog.md:71`); Q-002 missing; Q-003 missing; Q-004 done (covered by `system_map()`); Q-005 done (`requirements_for()`); Q-006 done (`orphan_requirements()`); Q-007 missing; Q-008 done (`decisions_affecting()`); Q-009 done (`risky_decisions()`); Q-010 missing; Q-011 done (`unsupported_claims()`); Q-012 missing; Q-013 done (`recent_changes()`); Q-014 missing; Q-016 done (`impact_analysis()`); Q-017 done (`evidence_chain()`); Q-018 missing; Q-019 missing; Q-020 missing (mirror route in spec shape). **Net: 9 of 20 v1-required queries done, 1 partial (Q-001), 10 missing.** Q-015 is v1-stretch, out of scope |
 | F-3 | `POST /ingest/kb-cad/export` and `POST /ingest/kb-fem/export` | ❌ missing | No `/ingest/kb-cad` or `/ingest/kb-fem` route in `api/routes.py`. No schema doc under `docs/architecture/` |
 | F-4 | Candidate review queue end-to-end (list/promote/reject/edit) | 🟡 partial for entities | Memory side shipped in Phase 9 Commit C. Entity side has `promote_entity`, `supersede_entity`, `invalidate_active_entity` but reject path and editable-before-promote may not match spec shape. Need to verify `GET /entities?status=candidate` returns spec shape |
-| F-5 | Conflict detector fires synchronously; `POST /conflicts/{id}/resolve` + dismiss | 🟡 partial | `conflicts.py` has `detect_conflicts_for_entity`, `list_open_conflicts`, `resolve_conflict`. API at `/admin/conflicts` + `/admin/conflicts/{id}/resolve`. **Gap vs spec**: spec wants generic slot-key model with `conflicts` + `conflict_members` tables; current code has per-type detectors (`_check_component_conflicts`, `_check_requirement_conflicts`); need to verify schema, and spec routes are `/conflicts/*` not `/admin/conflicts/*` |
+| F-5 | Conflict detector fires synchronously; `POST /conflicts/{id}/resolve` + dismiss | 🟡 partial (per Codex 2026-04-22 audit: schema present, detector+routes divergent) | **Schema is already spec-shaped**: `database.py:190` defines the generic `conflicts` + `conflict_members` tables per `conflict-model.md`; `conflicts.py:154` persists through them. **Divergences are in detection and API, not schema**: (1) `conflicts.py:36` dispatches per-type detectors only (`_check_component_conflicts`, `_check_requirement_conflicts`), needs generalization to slot-key-driven detection; (2) routes live at `/admin/conflicts/*`, spec says `/conflicts/*`, needs alias + deprecation. **No schema migration needed** |
 | F-6 | Mirror: `/mirror/{project}/overview`, `/decisions`, `/subsystems/{id}`, `/regenerate`; files under `/srv/storage/atocore/data/mirror/`; disputed + curated markers; deterministic output | 🟡 partial | `mirror.py` has `generate_project_overview` with header/state/system/decisions/requirements/materials/vendors/memories/footer sections. API at `/projects/{project_name}/mirror` and `.html`. **Gaps**: no separate `/mirror/{project}/decisions` or `/mirror/{project}/subsystems/{id}` routes, no `POST /regenerate` endpoint, no debounced-async-on-write, no daily refresh, no `⚠ disputed` markers wired to conflicts, no `(curated)` override annotations verified, no golden-file test for determinism |
 | F-7 | Memory→entity graduation: `POST /memory/{id}/graduate` + `graduated` status + forward pointer + original preserved | 🟡 partial | `_graduation_prompt.py` exists; `api_request_graduation` + `api_graduation_status` + `api_graduation_stats` routes exist (routes.py:1573, 1607, 2065). Need to verify full flow against F-7 spec: original preserved? `graduated` status row added? forward pointer column present? |
 | F-8 | Every active entity has `source_refs`; Q-017 returns ≥1 row for every active entity | 🟡 partial | `Entity.source_refs` field exists; Q-017 (`evidence_chain`) exists. **Gap**: is provenance enforced at write time (not NULL), or just encouraged? Per spec it must be mandatory |
@@ -116,12 +116,14 @@ capability exists but does not yet match spec shape or coverage.
 | D-3 | `docs/v1-release-notes.md` | ❌ missing | Not written yet (appropriately, since it's written when V1 is done) |
 | D-4 | `master-plan-status.md` + `current-state.md` updated with V1 completion | ❌ not yet | `master-plan-status.md:179` still has V1 under **Next** |
-### Summary
-- **Functional:** 1/8 ✅, 6/8 🟡 partial, 1/8 ❌ missing → the entity layer is real; the ingest + mirror + graduation surfaces need completion
+### Summary (revised per Codex 2026-04-22 per-file audit)
+- **Functional:** 0/8 ✅, 7/8 🟡 partial (F-1 downgraded from ✅: two header fields missing; F-2 and F-4 through F-8 partial), 1/8 ❌ missing (F-3 ingest endpoints) → the entity layer shape is real but not yet spec-clean; write-time invariants come first, then everything builds on stable invariants
+- **F-2 detail:** 9 of 20 v1-required queries done, 1 partial (Q-001 needs subsystem-scoped variant), 10 missing
+- **F-5 detail:** generic `conflicts` + `conflict_members` schema already present (no migration needed); detector body + routes diverge from spec
 - **Quality:** 1/6 ✅, 3/6 🟡 partial, 2/6 ❌ missing → golden file + killer-correctness integration test are the two clear gaps
-- **Operational:** 0/5 ✅ (none marked fully verified), 3/5 🟡, 1/5 ❌ → backup drill is the one hard blocker here
-- **Documentation:** 0/4 ✅, 4/4 ❌ → all 4 docs need writing, D-3/D-4 at the end, D-1/D-2 as part of their respective F criteria
+- **Operational:** 0/5 ✅ (none fully verified), 3/5 🟡, 1/5 ❌ → backup drill is the one hard blocker here
+- **Documentation:** 0/4 ✅, 4/4 ❌ → all 4 docs need writing
 ---
@@ -142,13 +144,23 @@ Skipped by construction: F-1 core schema (already implemented) and O-5
 ### Phase V1-0: Write-time invariants (F-8 + F-5 minimal + F-1 audit)
 **Scope:**
-- **F-1 audit (Codex action).** Before any code change, Codex does a
-  per-file audit of `src/atocore/engineering/service.py`,
-  `conflicts.py`, `mirror.py`, `queries.py` against the acceptance doc's
-  F-1 shared-header-field list (`id, type, name, project_id, status,
-  confidence, source_refs, created_at, updated_at, extractor_version,
-  canonical_home`). Confirm which fields exist, which are missing. This
-  becomes the ground-truth F-1 row in the gap audit table below.
+- **F-1 remediation (Codex audit 2026-04-22 already completed).** Add
+  the two missing shared-header fields to the `Entity` dataclass
+  (`service.py:67`) and the `entities` table schema:
+  - `extractor_version TEXT`: semver-ish string carrying the extractor
+    module version per `promotion-rules.md:268`. Backfill existing rows
+    with `"0.0.0"` or `NULL` flagged as unknown. Every future
+    write carries the current `EXTRACTOR_VERSION` constant.
+  - `canonical_home TEXT`: which layer is canonical for this concept.
+    For entities, value is always `"entity"`. For future graduation
+    records it may be `"memory"` (frozen pointer). Backfill active
+    rows with `"entity"`.
+  - Additive migration via the existing `_apply_migrations` pattern,
+    idempotent, safe on replay.
+  - Add doc note in `engineering-ontology-v1.md` clarifying that the
+    `project` field IS the `project_id` per spec: "fields equivalent
+    to" wording in the spec allows this, but make it explicit so
+    future readers don't trip on the naming.
 - **F-8 provenance enforcement.** Add a NOT-NULL invariant at
   `create_entity` and `promote_entity` that `source_refs` is non-empty
   OR an explicit `hand_authored=True` flag is set (per
@@ -171,11 +183,13 @@ Skipped by construction: F-1 core schema (already implemented) and O-5
   land in V1-E; this is the one that V1-0 can cover without graduation
   being ready.)
-**Acceptance:** F-8, F-5 minimal hooks ✅, Q-3 ✅, partial Q-4 ✅,
-F-1 row in gap table is accurate.
+**Acceptance:** F-1 (after `extractor_version` + `canonical_home`
+land + doc note on `project` naming), F-8 ✅, F-5 hooks ✅, Q-3 ✅,
+partial Q-4 ✅.
-**Estimated size:** 3 days (the audit is the biggest unknown; the
-enforcement patches are small).
+**Estimated size:** 3 days (two small schema additions + invariant
+patches + hook wiring + tests; no audit overhead; Codex already did
+that part).
 **Tests added:** ~10.
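The V1-0 scope calls for an additive, idempotent migration that adds `extractor_version` and `canonical_home` to the `entities` table and backfills active rows. A minimal sketch of that idea follows, assuming a SQLite database with a `status` column on `entities`; the function name `add_header_fields` is hypothetical and the project's real `_apply_migrations` pattern may be shaped differently.

```python
import sqlite3

# The two shared-header columns the plan adds; both are nullable TEXT.
NEW_COLUMNS = {
    "extractor_version": "TEXT",
    "canonical_home": "TEXT",
}


def add_header_fields(conn: sqlite3.Connection) -> None:
    """Add the missing columns if absent, then backfill. Safe to run twice."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute("PRAGMA table_info(entities)")}
    for name, sql_type in NEW_COLUMNS.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE entities ADD COLUMN {name} {sql_type}")

    # Backfill only active rows that have no value yet, so a replay is a no-op.
    conn.execute(
        "UPDATE entities SET canonical_home = 'entity' "
        "WHERE status = 'active' AND canonical_home IS NULL"
    )
    # extractor_version is left NULL here, meaning "unknown"; the plan
    # also allows backfilling "0.0.0" instead.
    conn.commit()
```

Checking the column list before each `ALTER TABLE` is what makes the replay safe in this sketch; adding an already-present column would otherwise raise an error.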
@@ -186,30 +200,34 @@ must then be cleaned up.
 ### Phase V1-A: Minimal query slice that proves the model (partial F-2 + Q-6)
 **Scope:**
-- Pick the **four queries that prove the model on p05-interferometer**:
-  Q-001 (subsystem contents), Q-005 (component satisfies requirements),
-  Q-006 (orphan requirements, killer correctness), Q-017 (evidence
-  chain). These four exercise structural + intent + killer-correctness +
-  provenance, which are the four pillars of the V1 shape.
+- Pick the **four pillar queries**: Q-001 (subsystem contents),
+  Q-005 (component satisfies requirements), Q-006 (orphan requirements,
+  killer correctness), Q-017 (evidence chain). These exercise structural +
+  intent + killer-correctness + provenance.
+- **Q-001 needs a shape fix**: Codex's audit confirms the existing
+  `system_map()` returns a project-wide tree, not the spec's
+  subsystem-scoped `GET /entities/Subsystem/<id>?expand=contains`.
+  Add a subsystem-scoped variant (the existing project-wide route stays
+  for Q-004). This is the only shape fix in V1-A; larger query additions
+  move to V1-C.
+- Q-005, Q-006, Q-017 are already implemented per Codex audit. V1-A
+  verifies them against seeded data; no code changes expected.
 - Seed p05-interferometer with Q-6 integration data (one satisfying
   Component + one orphan Requirement + one Decision on flagged
   Assumption + one supported ValidationClaim + one unsupported
   ValidationClaim).
-- Verify each of the four queries returns correct results against the
-  seeded data. The three killer-correctness queries (Q-006, Q-009,
-  Q-011) run as a single integration test. Q-009 and Q-011 are
-  implemented against the seed data here even though they're not in the
-  "four pillars" list, because Q-6 requires all three.
-- Any query function the Codex F-1 audit found to be missing fields
-  required by Q-001/Q-005/Q-006/Q-017 gets filled in here, not in V1-C.
+- All three killer-correctness queries (Q-006, Q-009, Q-011) are
+  **already implemented** per Codex audit. V1-A runs them as a single
+  integration test against the seed data.
-**Acceptance:** The four pillar queries + Q-006/Q-009/Q-011 killer
-correctness all return correct results. Q-6 ✅ passes. Partial F-2
-(the remaining queries land in V1-C).
+**Acceptance:** Q-001 subsystem-scoped variant + Q-6 integration test.
+Partial F-2 (remaining 10 missing + 1 partial queries land in V1-C).
-**Estimated size:** 2 days.
+**Estimated size:** 1.5 days (scope shrunk: most pillar queries already
+work per Codex audit; only Q-001 shape fix + seed data + integration
+test required).
-**Tests added:** ~6.
+**Tests added:** ~4.
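The Q-001 shape fix above asks for a subtree rooted at one subsystem rather than the project-wide tree. A rough sketch of that scoping is shown here, assuming CONTAINS relationships are available as `(parent_id, child_id)` pairs; `expand_contains` and the returned dict shape are illustrative only and are not taken from `queries.py` or the query catalog.

```python
from collections import defaultdict


def expand_contains(contains_edges, root_id):
    """Return the CONTAINS subtree under root_id as nested dicts.

    contains_edges is an iterable of (parent_id, child_id) pairs. Only
    descendants of root_id appear in the result, which is the difference
    from a project-wide tree.
    """
    children = defaultdict(list)
    for parent, child in contains_edges:
        children[parent].append(child)

    def build(node_id, path):
        # `path` holds the ancestors of this node, so a cyclic edge is skipped
        # instead of recursing forever.
        return {
            "id": node_id,
            "contains": [
                build(child, path | {child})
                for child in sorted(children[node_id])
                if child not in path
            ],
        }

    return build(root_id, frozenset({root_id}))
```

Sorting the children keeps the output deterministic, which matters if the same structure later feeds the mirror's golden-file tests.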
 **Why second:** proves the entity layer shape works end-to-end on real
 data before we start bolting ingest, graduation, or mirror onto it. If
@@ -244,22 +262,25 @@ further.
 ### Phase V1-C: Close the rest of the query catalog (remaining F-2)
-**Scope:**
-- Implement remaining v1-required queries: Q-002 (component parents),
-Q-003 (subsystem interfaces, with Interface as simple string label),
-Q-004 (project system-map tree), Q-007 (component constraints),
-Q-008 (decisions affecting an entity, full shape), Q-010 (supports
-trace to AnalysisModel), Q-012 (conflicting results on same claim —
-exercises V1-0's F-5 hook), Q-013 (recent changes with window),
-Q-014 (decision log ordered + superseded chain), Q-016 (impact
-analysis — likely already done, just verify shape), Q-018
-(`include=superseded`), Q-019 (material → components).
-- Q-020 (project overview mirror route) is deferred to V1-D where the
+**Scope:** close the 10 missing queries per Codex's audit. Already-done
+queries (Q-004/Q-005/Q-006/Q-008/Q-009/Q-011/Q-013/Q-016/Q-017) are
+verified but not rewritten.
+- Q-002 (component → parents, inverse of CONTAINS)
+- Q-003 (subsystem interfaces, Interface as simple string label)
+- Q-007 (component → constraints via CONSTRAINED_BY)
+- Q-010 (ValidationClaim → supporting results + AnalysisModel trace)
+- Q-012 (conflicting results on same claim — exercises V1-0's F-5 hook)
+- Q-014 (decision log ordered + superseded chain)
+- Q-018 (`include=superseded` for supersession chains)
+- Q-019 (Material → components, derived from Component.material field
+per `engineering-query-catalog.md:266`, no edge needed)
+- Q-020 (project overview mirror route) — deferred to V1-D where the
 mirror lands in full.
 **Acceptance:** F-2 ✅ (all 19 of 20 v1-required queries; Q-020 in V1-D).
-**Estimated size:** 2 days.
+**Estimated size:** 2 days (eight new query functions + routes +
+per-query happy-path tests).
 **Tests added:** ~12.
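Two of the new queries in this hunk are simple enough to sketch. The block below is a minimal illustration only, assuming an in-memory list of entities and edges; the `Entity` and `Edge` shapes, the `properties` dict, and both function names are hypothetical and are not taken from `service.py`. It shows Q-002 as a backwards walk over CONTAINS edges and Q-019 as a filter on the component's material field, with no edge lookup.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    # Hypothetical shape for illustration; the real dataclass may differ.
    id: str
    entity_type: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    # Hypothetical edge record: source --relation--> target.
    source_id: str
    relation: str
    target_id: str


def parents_of(component_id: str, edges: list[Edge]) -> list[str]:
    """Q-002 sketch: a parent is any source of a CONTAINS edge into the component."""
    return [
        e.source_id
        for e in edges
        if e.relation == "CONTAINS" and e.target_id == component_id
    ]


def components_using(material: str, entities: list[Entity]) -> list[str]:
    """Q-019 sketch: derived from the component's material field, no edge needed."""
    return [
        e.id
        for e in entities
        if e.entity_type == "Component" and e.properties.get("material") == material
    ]
```

Because Q-019 reads a field rather than traversing a relation, it needs no new edge type, which matches the "no edge needed" note in the scope list.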
@@ -327,17 +348,22 @@ a stable, tested entity layer.
 ### Phase V1-F: Full F-5 spec compliance + O-1/O-2/O-3 + D-1/D-3/D-4
 **Scope:**
-- **F-5 full spec compliance.** Audit `conflicts.py` against
-`conflict-model.md`. The spec wants a generic `conflicts` +
-`conflict_members` table with slot-keyed detection. V1-0 put the hook
-in place; V1-F is where the detector body gets swapped to the generic
-shape if the audit shows divergence.
-- If schema already matches spec: no work.
-- If divergent: migrate additively (new tables alongside existing,
-dual-read, drop old after one stable release).
-- Rename `/admin/conflicts/*` routes to `/conflicts/*` per spec,
-keep `/admin/conflicts/*` as aliases for one release, deprecate in
-D-3 release notes.
+- **F-5 full spec compliance** (Codex 2026-04-22 audit already confirmed
+the gap shape — schema is spec-compliant, divergence is in detector +
+routes only).
+- **Detector generalization.** Replace the per-type dispatch at
+`conflicts.py:36` (`_check_component_conflicts`,
+`_check_requirement_conflicts`) with a slot-key-driven generic
+detector that reads the per-entity-type conflict slot from a
+registry and queries the already-generic `conflicts` +
+`conflict_members` tables. The V1-0 hook shape was chosen to make
+this a detector-body swap, not an API change.
+- **Route alignment.** Add `/conflicts/*` routes as the canonical
+surface per `conflict-model.md:187`. Keep `/admin/conflicts/*` as
+aliases for one release, deprecate in D-3 release notes, remove
+in V1.1.
+- **No schema migration needed** (the tables at `database.py:190`
+already match the spec).
 - **O-1:** Run the full migration against a Dalidou backup copy.
 Confirm additive, idempotent, safe to run twice.
 - **O-2:** Run a full restore drill on the test project per
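The detector generalization named in this hunk can be pictured with a small sketch. It is an assumption-laden illustration, not the code in `conflicts.py`: the `CONFLICT_SLOTS` registry, its field names, the dict-shaped entities, and `detect_conflicts` are all hypothetical. The idea shown is that one generic loop groups entities by a slot key read from a per-type registry and reports any slot whose members disagree, instead of dispatching to one hand-written function per entity type.

```python
from collections import defaultdict

# Hypothetical registry: entity type -> (fields forming the slot key, field whose
# values must agree). Supporting a new entity type means adding a row here.
CONFLICT_SLOTS = {
    "Component": (("name",), "material"),
    "Requirement": (("name",), "target_value"),
}


def detect_conflicts(entities: list[dict]) -> list[dict]:
    """Group entities by slot key; a slot with more than one distinct value conflicts."""
    groups = defaultdict(list)
    for entity in entities:
        slot = CONFLICT_SLOTS.get(entity["type"])
        if slot is None:
            continue  # no conflict slot registered for this entity type
        key_fields, _ = slot
        slot_key = (entity["type"],) + tuple(entity[f] for f in key_fields)
        groups[slot_key].append(entity)

    conflicts = []
    for slot_key, members in groups.items():
        value_field = CONFLICT_SLOTS[slot_key[0]][1]
        distinct_values = {m.get(value_field) for m in members}
        if len(distinct_values) > 1:
            # One record per conflict plus its member ids, mirroring the
            # conflicts + conflict_members split described in the plan.
            conflicts.append(
                {"slot_key": slot_key, "member_ids": sorted(m["id"] for m in members)}
            )
    return conflicts
```

Under this shape the caller's contract (entities in, conflict records out) does not change when a type is added, which is consistent with the plan's claim that the change is a detector-body swap rather than an API change.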
@@ -359,12 +385,14 @@ D-1 entity docs at ~30 min each ≈ 6 hours; verification is fast).
 **Tests added:** ~6 (F-5 spec-shape tests; verification adds no automated
 tests).
-### Total (revised)
-- Estimated **12-17 days of focused work** across seven phases — up from
-the original 11-14 days to reflect V1-0 overhead and Codex's objection
-that the first estimate was too tight.
-- Adds roughly **65 tests** (533 → ~600).
+### Total (revised after Codex 2026-04-22 audit)
+- Phase budgets: V1-0 (3) + V1-A (1.5) + V1-B (2) + V1-C (2) + V1-D (3-4)
++ V1-E (2) + V1-F (3) ≈ **16.5-17.5 days of focused work**. Revised down
+slightly from the previous 12-17 estimate because V1-A scope shrank
+(four pillar queries are mostly already implemented per Codex audit)
+and V1-F F-5 work shrank (no schema migration needed).
+- Adds roughly **60 tests** (533 → ~593).
 - Branch strategy: one branch per phase (V1-0 → V1-F), each squash-merged
 to main after Codex review. Phases sequential because each builds on
 the previous. **V1-0 is a hard prerequisite for all later phases**
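The revised totals in this hunk can be checked with a few lines of arithmetic. The sketch below only re-adds the per-phase budgets and test counts quoted in the plan; the variable names are illustrative.

```python
# Per-phase budgets in days as (low, high), copied from the revised plan text.
PHASE_BUDGETS = {
    "V1-0": (3, 3),
    "V1-A": (1.5, 1.5),
    "V1-B": (2, 2),
    "V1-C": (2, 2),
    "V1-D": (3, 4),
    "V1-E": (2, 2),
    "V1-F": (3, 3),
}

low_days = sum(low for low, _ in PHASE_BUDGETS.values())
high_days = sum(high for _, high in PHASE_BUDGETS.values())

# Test-count projection: current suite size plus the planned additions.
current_tests = 533
planned_new_tests = 60
projected_tests = current_tests + planned_new_tests

print(f"{low_days}-{high_days} days, ~{projected_tests} tests")
```

Only V1-D carries a range, so the whole one-day spread in the total comes from that phase.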
@@ -464,49 +492,42 @@ following are **explicitly out of scope** for this plan:
 ---
-## Open questions for Codex
+## Open questions for Codex (post-second-round revision)
-1. **Is the parallel schedule with the Now list acceptable?** Claude's read
-is that V1 work and Now items touch disjoint surfaces so they run in
-parallel without conflict. Codex may see collisions Claude missed.
-2. **Phase V1-A query audit scope.** Claude listed Q-001, Q-002, Q-003,
-Q-007, Q-010, Q-012, Q-014, Q-018, Q-019, Q-020 as likely gaps without
-reading each query function end-to-end. Codex's per-file audit may find
-more already done (or more missing).
-3. **F-5 conflicts schema divergence.** The current code uses per-type
-detectors (`_check_component_conflicts`, `_check_requirement_conflicts`)
-whereas the spec wants a generic slot-keyed `conflicts` + `conflict_members`.
-Is the existing schema *equivalent* (just implemented differently) or
-*divergent* (needs migration)? This is a one-read decision for Codex.
-4. **Should F-5 route rename (`/admin/conflicts/*` → `/conflicts/*`) be
-breaking?** Spec route path differs from current. Proposal: add
-`/conflicts/*` as aliases, keep `/admin/conflicts/*` for one release,
-deprecate in V1 release notes, remove in V1.1.
-5. **Mirror determinism — where does `now` go?** The current mirror footer
-has a live timestamp (line 326 of `mirror.py`). Spec says deterministic
-output, spec also shows a `Regenerated:` header with timestamp (line
-265 of `human-mirror-rules.md`). Reconciliation: timestamp is allowed
-in the header banner but must be an input parameter so the golden-file
-test can pin it. Sound right?
-6. **F-7 graduation gap depth.** Without running the existing graduation
-flow end-to-end against a real memory, Claude can't tell how close the
-existing code is to F-7 spec. Codex's audit of
-`_graduation_prompt.py` + `api_request_graduation` + DB schema would
-close this question in one read.
-7. **Estimated 11-14 days honest?** Given recent phase velocities (Phase
-7A was a week, Phase 7D fit in a single session), 2-3 days per phase
-across 6 phases may be light or heavy. Codex's calibration against
-actual repo velocity would help.
-8. **After V1, the minions/queue mechanic we rejected returns as a
-candidate V2 item.** Should we note it explicitly in V1 release notes
-(D-3) as a future track, or leave it unnamed until V2 planning starts?
+Three of the original eight questions (F-1 field audit, F-2 per-query
+audit, F-5 schema divergence) were answered by Codex's 2026-04-22 audit
+and folded into the plan. Remaining open questions:
+1. **Parallel schedule vs Now list.** The first-round review correctly
+softened this from "fully parallel" to "less disjoint than claimed".
+Is the revised collision table + pause-points section enough, or
+should specific Now-list items gate specific V1 phases more strictly?
+2. **F-7 graduation gap depth.** Still unaudited. `_graduation_prompt.py`
++ `api_request_graduation` + DB schema need one Codex read to tell
+us whether V1-E is a 2-day phase or a 4-day phase.
+3. **Mirror determinism — where does `now` go?** The current mirror
+footer has a live timestamp (`mirror.py:326`). Spec says
+deterministic output, spec also shows a `Regenerated:` header with
+timestamp (`human-mirror-rules.md:265`). Reconciliation proposal:
+timestamp allowed in the header banner but must be an input
+parameter so the golden-file test can pin it. Sound right?
+4. **`project` field naming.** The spec writes `project_id`; the code
+writes `project`. The spec says "fields equivalent to" so naming is
+technically flexible. Proposal: V1-0 adds a doc note making this
+explicit, no column rename. Is this acceptable, or does Codex want
+the rename for cleanliness?
+5. **Velocity calibration.** Revised 16.5-17.5 days total. Given Phase
+7A took a week and Phase 7D fit in one session, is this a fair
+estimate for a single-operator sprint, or should we build in buffer
+for multi-phase context switches?
+6. **Minions/queue as V2 item in D-3.** Should we name it explicitly in
+V1 release notes as a future track, or leave it unnamed until V2
+planning starts?
 ---
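The mirror-determinism proposal in the open questions (the timestamp is allowed in the header but must arrive as an input) can be illustrated with a minimal sketch. Everything here is hypothetical: `render_overview`, its parameters, and the output layout are stand-ins, not the code in `mirror.py` or the template in `human-mirror-rules.md`. The point shown is only that a renderer which never reads the clock itself produces byte-identical output for identical inputs, so a golden-file test can pin `now`.

```python
from datetime import datetime, timezone


def render_overview(project: str, lines: list[str], now: datetime) -> str:
    """Render a mirror page; `now` is injected so the output is reproducible."""
    header = f"# {project} overview\nRegenerated: {now.isoformat()}\n"
    # Sort the body so input ordering cannot change the rendered bytes.
    body = "\n".join(sorted(lines))
    return f"{header}\n{body}\n"


# In production the caller would pass the real clock, for example
# datetime.now(timezone.utc); a golden-file test passes a fixed instant.
PINNED_NOW = datetime(2026, 4, 22, 18, 0, tzinfo=timezone.utc)

golden = (
    "# demo overview\n"
    "Regenerated: 2026-04-22T18:00:00+00:00\n"
    "\n"
    "- component A\n"
    "- component B\n"
)
rendered = render_overview("demo", ["- component B", "- component A"], PINNED_NOW)
```

Keeping the clock at the call site means the route handler decides what "now" is, while the renderer stays a pure function of its arguments.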
@@ -515,7 +536,7 @@ following are **explicitly out of scope** for this plan:
 | Risk | Mitigation |
 |---|---|
 | V1 work slows the Now list | V1 pauses on any Now-list blocker. Codex veto on any V1 PR that touches memory extractor, retrieval ranking, or triage paths |
-| F-5 schema migration is bigger than estimated | If Codex audit shows material divergence, split V1-E into two phases (schema migration separate from provenance enforcement) |
+| F-5 detector generalization harder than estimated | Codex audit confirmed schema is already spec-compliant; only detector body + routes need work. If detector generalization still slips, keep per-type detectors and document as a V1.1 cleanup (detection correctness is unaffected, only code organization) |
 | Mirror determinism regresses existing mirror output | Keep `/projects/{project_name}/mirror` alias returning the current shape; new `/mirror/{project}/overview` is the spec-compliant one. Deprecate old in V1 release notes |
 | Golden file churn as templates evolve | Standard workflow: updating a golden file is a normal part of template work, documented in V1-C commit message |
 | Backup drill on Dalidou is disruptive | Run against a clone of the Dalidou DB at a safe hour; no production drill required for V1 acceptance |