docs(planning): V1 Completion Plan revised per Codex file-level audit

Three findings folded in, all with exact file:line refs from Codex:

- F-1 downgraded from done to partial. Entity dataclass at
  service.py:67 and entities table missing extractor_version and
  canonical_home fields per engineering-v1-acceptance.md:45. V1-0
  scope now adds both via additive migration + doc note that
  project is the project_id per "fields equivalent to" wording.

- F-2 replaced guesses with ground truth per-query status:
  9 of 20 v1-required queries done, 1 partial (Q-001 needs
  subsystem-scoped variant), 10 missing. V1-A scope shrank to
  Q-001 shape fix + Q-6 integration. V1-C closes the 8 net-new
  queries; Q-020 deferred to V1-D (mirror).

- F-5 reframed. Generic conflicts + conflict_members schema
  already present at database.py:190, no migration needed.
  Divergence is detector body (per-type dispatch needs
  generalization) + routes (/admin/conflicts/* needs
  /conflicts/* alias). V1-F scope is detector + routes only.

Totals revised: 16.5-17.5 days, ~60 tests.

Three of Codex's eight open questions now resolved. Remaining:
F-7 graduation depth, mirror determinism, project naming,
velocity calibration, minions-as-V2 naming.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 14:09:50 -04:00
parent ce3a87857e
commit 44724c81ab
2 changed files with 124 additions and 100 deletions

@@ -146,6 +146,7 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
## Recent Decisions
- **2026-04-22** **Engineering V1 Completion Plan revised per Codex second-round file-level audit** — three findings folded in, all with exact file:line refs from Codex: (1) F-1 downgraded from ✅ to 🟡 — `extractor_version` and `canonical_home` missing from `Entity` dataclass and `entities` table per `engineering-v1-acceptance.md:45`; V1-0 scope now adds both fields via additive migration + doc note that `project` IS `project_id` per "fields equivalent to" spec wording; (2) F-2 replaced with ground-truth per-query status: 9 of 20 v1-required queries done (Q-004/Q-005/Q-006/Q-008/Q-009/Q-011/Q-013/Q-016/Q-017), 1 partial (Q-001 needs subsystem-scoped variant), 10 missing (Q-002/003/007/010/012/014/018/019/020); V1-A scope shrank to Q-001 shape fix + Q-6 integration (pillar queries already implemented); V1-C closes the 8 remaining new queries + Q-020 deferred to V1-D; (3) F-5 reframed — generic `conflicts` + `conflict_members` schema already present at `database.py:190`, no migration needed; divergence is detector body (per-type dispatch needs generalization) + routes (`/admin/conflicts/*` needs `/conflicts/*` alias). Total revised to 16.5-17.5 days, ~60 tests. Plan: `docs/plans/engineering-v1-completion-plan.md` at commit `ce3a878` (Codex pulled clean). Three of Codex's eight open questions now answered; remaining: F-7 graduation depth, mirror determinism, `project` rename question, velocity calibration, minions naming. *Proposed by:* Claude. *Reviewed by:* Codex (two rounds).
- **2026-04-22** **Engineering V1 Completion Plan revised per Codex first-round review** — original six-phase order (queries → ingest → mirror → graduation → provenance → ops) rejected by Codex as backward: provenance-at-write (F-8) and conflict-detection hooks (F-5 minimal) must precede any phase that writes active entities. Revised to seven phases: V1-0 write-time invariants (F-8 + F-5 hooks + F-1 audit) as hard prerequisite, V1-A minimum query slice proving the model, V1-B ingest, V1-C full query catalog, V1-D mirror, V1-E graduation, V1-F full F-5 spec + ops + docs. Also softened "parallel with Now list" — real collision points listed explicitly; schedule shifted ~4 weeks to reflect that V1-0 cannot start during pipeline soak. Withdrew the "50-70% built" global framing in favor of the per-criterion gap table. Workspace sync note added: Codex's Playground workspace can't see the plan file; canonical dev tree is Windows `C:\Users\antoi\ATOCore`. Plan: `docs/plans/engineering-v1-completion-plan.md`. Awaiting Codex file-level audit once workspace syncs. *Proposed by:* Claude. *First-round review by:* Codex.
- **2026-04-22** gbrain-inspired "Phase 8 Minions + typed edges" plan **rejected as packaged** — wrong sequencing (leapfrogged `master-plan-status.md` Now list), wrong predicate set (6 vs V1's 17), wrong canonical boundary (edges-on-wikilinks instead of typed entities+relationships per `memory-vs-entities.md`). Mechanic (durable jobs + typed graph) deferred to V1 home. Record: `docs/decisions/2026-04-22-gbrain-plan-rejection.md`. *Proposed by:* Claude. *Reviewed/rejected by:* Codex. *Ratified by:* Antoine.
- **2026-04-12** Day 4 gate cleared: LLM-assisted extraction via `claude -p` (OAuth, no API key) is the path forward. Rule extractor stays as default for structural cues. *Proposed by:* Claude. *Ratified by:* Antoine.
@@ -162,6 +163,8 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
## Session Log
- **2026-04-22 Claude (late night)** Codex second-round review did the full file-level audit and came back with three P1/P2 findings, all with exact file:line refs. Verified each against current code before revising. (1) **F-1 not clean**: `Entity` dataclass at `service.py:67` and `entities` table schema are missing the `extractor_version` and `canonical_home` shared-header fields required by `engineering-v1-acceptance.md:45`; `project` field is the project identifier but not named `project_id` as spec writes (spec wording "fields equivalent to" allows the naming, but needs explicit doc note). V1-0 scope now includes adding both missing fields via additive `_apply_migrations` pattern. (2) **F-2 needed exact statuses, not guesses**: per-function audit gave ground truth — 9 of 20 v1-required queries done, 1 partial (Q-001 returns project-wide tree not subsystem-scoped expand=contains per `engineering-query-catalog.md:71`), 10 missing. V1-A scope shrank to Q-001 shape fix + Q-6 integration (most pillar queries already implemented); V1-C closes the 8 net-new queries + Q-020 to V1-D. (3) **F-5 misframed**: the generic `conflicts` + `conflict_members` schema is ALREADY spec-compliant at `database.py:190`; divergence is detector body at `conflicts.py:36` (per-type dispatch needs generalization) + route path (`/admin/conflicts/*` needs `/conflicts/*` alias). V1-F no longer includes a schema migration; detector generalization + route alignment only. Totals revised to 16.5-17.5 days, ~60 tests (down from 12-17 / 65 because V1-A and V1-F scopes both shrank after audit). Three of the eight open questions resolved. Remaining open: F-7 graduation depth, mirror determinism, `project` naming, velocity calibration, minions-as-V2 naming. No code changes this session — plan + ledger only. Next: commit + push revised plan, then await Antoine+Codex joint sign-off before V1-0 starts.
- **2026-04-22 Claude (night)** Codex first-round review of the V1 Completion Plan summary came back with four findings. Three substantive, one workspace-sync: (1) "50-70% built" too loose — replaced with per-criterion table, global framing withdrawn; (2) phase order backward — provenance-at-write (F-8) and conflict hooks (F-5 minimal) are depended upon by every later phase but were in V1-E; new V1-0 prerequisite phase inserted to establish write-time invariants, and V1-A shrunk to a minimum query slice (four pillars Q-001/Q-005/Q-006/Q-017 + Q-6 integration) rather than full catalog closure; (3) "parallel with Now list / disjoint surfaces" too strong — real collisions listed explicitly (V1-0 provenance + memory extractor write path, V1-E graduation + memory module, V1-F conflicts migration + memory promote); schedule shifted ~4 weeks, V1-0 cannot start during pipeline soak; (4) Codex's Playground workspace can't see the plan file or the `src/atocore/engineering/` code — added a Workspace note to the plan directing per-file audit at the Windows canonical dev tree (`C:\Users\antoi\ATOCore`) and noting the three visible file paths (`docs/plans/engineering-v1-completion-plan.md`, `docs/decisions/2026-04-22-gbrain-plan-rejection.md`, `DEV-LEDGER.md`). Revised plan estimate: 12-17 days across 7 phases (up from 11-14 / 6), ~65 tests added (up from ~50). V1-0 is a hard prerequisite; no later phase starts until it lands. Pending Antoine decision on workspace sync (commit+push vs paste-to-Codex) so Codex can do the file-level audit. No code changes this session.
- **2026-04-22 Claude (late eve)** After the rejection, read the four core V1 architecture docs end-to-end (`engineering-ontology-v1.md`, `engineering-query-catalog.md`, `memory-vs-entities.md`, `engineering-v1-acceptance.md`) plus the four supporting docs (`promotion-rules.md`, `conflict-model.md`, `human-mirror-rules.md`, `tool-handoff-boundaries.md`). Cross-referenced against current code in `src/atocore/engineering/`. **Key finding:** V1 is already 50-70% built — entity types (16, superset of V1's 12), all 18 V1 relationship types, 4-state lifecycle, CRUD + supersede + invalidate + PATCH, queries module with most killer-correctness queries (orphan_requirements, risky_decisions, unsupported_claims, impact_analysis, evidence_chain), conflicts module scaffolded, mirror scaffolded, graduation endpoint scaffolded. Recent commits e147ab2/b94f9df/081c058/069d155/b1a3dd0 are all V1 entity-layer work. Drafted `docs/plans/engineering-v1-completion-plan.md` reframing the work as **V1 completion, not V1 start**. Six sequential phases V1-A through V1-F, estimated 11-14 days, ~50 new tests (533 → ~580). Phases run in parallel with the Now list (pipeline soak + density + multi-model triage + p04-constraints) because surfaces are disjoint. Plan explicitly defers the minions/queue mechanic per acceptance-doc negative list. Pending Codex audit of the plan itself — especially the F-2 query gap list (Claude didn't read each query function end-to-end), F-5 conflicts schema divergence (per-type detectors vs spec's generic slot-keyed shape), and F-7 graduation depth. No code changes this session.

@@ -77,11 +77,11 @@ capability exists but does not yet match spec shape or coverage.
| ID | Criterion | Status | Evidence |
|----|-----------|--------|----------|
| F-1 | 12 V1 entity types, 4 relationship families, shared header fields, 4-state lifecycle | ✅ done | `service.py:16-36` (16 types, superset of V1 minimum), `service.py:38-62` (18 relationship types), `service.py:64` statuses, `Entity` dataclass at line 67 |
| F-2 | All v1-required Q-001 through Q-020 implemented, with provenance where required | 🟡 partial | `queries.py` has system_map (Q-004), decisions_affecting (Q-008), requirements_for (Q-005 component side), recent_changes (Q-013), orphan_requirements (Q-006 killer), risky_decisions (Q-009 killer), unsupported_claims (Q-011 killer), impact_analysis (Q-016), evidence_chain (Q-017). Likely missing or partial: Q-001 (expand=contains), Q-002 (expand=parents), Q-003 (interfaces), Q-007 (constraints on component), Q-010 (supports trace), Q-012 (conflicting results), Q-014 (decision-log ordered chain), Q-018 (include=superseded chain), Q-019 (material→components), Q-020 (project overview mirror endpoint in V1-required shape) |
| F-1 | 12 V1 entity types, 4 relationship families, shared header fields, 4-state lifecycle | 🟡 partial (per Codex 2026-04-22 audit) | `service.py:16-36` has 16 types (superset of V1's 12), `service.py:38-62` has 18 relationship types, `service.py:64` statuses, `Entity` dataclass at line 67. **Gaps vs `engineering-v1-acceptance.md:45`**: `extractor_version` missing from dataclass and `entities` table; `canonical_home` missing from dataclass and table; `project` field is the project identifier but not named `project_id` as spec uses — spec says "fields equivalent to" so naming flexibility is allowed but needs an explicit doc note. Remediation lands in V1-0 |
| F-2 | All v1-required Q-001 through Q-020 implemented, with provenance where required | 🟡 partial (per Codex 2026-04-22 per-function audit) | **Ground truth from per-function read of `queries.py` + `routes.py:2092+`:** Q-001 partial (`system_map()` returns project-wide tree, not the catalog's subsystem-scoped `GET /entities/Subsystem/<id>?expand=contains` shape per `engineering-query-catalog.md:71`); Q-002 missing; Q-003 missing; Q-004 done (covered by `system_map()`); Q-005 done (`requirements_for()`); Q-006 done (`orphan_requirements()`); Q-007 missing; Q-008 done (`decisions_affecting()`); Q-009 done (`risky_decisions()`); Q-010 missing; Q-011 done (`unsupported_claims()`); Q-012 missing; Q-013 done (`recent_changes()`); Q-014 missing; Q-016 done (`impact_analysis()`); Q-017 done (`evidence_chain()`); Q-018 missing; Q-019 missing; Q-020 missing (mirror route in spec shape). **Net: 9 of 20 v1-required queries done, 1 partial (Q-001), 10 missing.** Q-015 is v1-stretch, out of scope |
| F-3 | `POST /ingest/kb-cad/export` and `POST /ingest/kb-fem/export` | ❌ missing | No `/ingest/kb-cad` or `/ingest/kb-fem` route in `api/routes.py`. No schema doc under `docs/architecture/` |
| F-4 | Candidate review queue end-to-end (list/promote/reject/edit) | 🟡 partial for entities | Memory side shipped in Phase 9 Commit C. Entity side has `promote_entity`, `supersede_entity`, `invalidate_active_entity` but reject path and editable-before-promote may not match spec shape. Need to verify `GET /entities?status=candidate` returns spec shape |
| F-5 | Conflict detector fires synchronously; `POST /conflicts/{id}/resolve` + dismiss | 🟡 partial | `conflicts.py` has `detect_conflicts_for_entity`, `list_open_conflicts`, `resolve_conflict`. API at `/admin/conflicts` + `/admin/conflicts/{id}/resolve`. **Gap vs spec**: spec wants generic slot-key model with `conflicts` + `conflict_members` tables; current code has per-type detectors (`_check_component_conflicts`, `_check_requirement_conflicts`) — need to verify schema, and spec routes are `/conflicts/*` not `/admin/conflicts/*` |
| F-5 | Conflict detector fires synchronously; `POST /conflicts/{id}/resolve` + dismiss | 🟡 partial (per Codex 2026-04-22 audit — schema present, detector+routes divergent) | **Schema is already spec-shaped**: `database.py:190` defines the generic `conflicts` + `conflict_members` tables per `conflict-model.md`; `conflicts.py:154` persists through them. **Divergences are in detection and API, not schema**: (1) `conflicts.py:36` dispatches per-type detectors only (`_check_component_conflicts`, `_check_requirement_conflicts`) — needs generalization to slot-key-driven detection; (2) routes live at `/admin/conflicts/*`, spec says `/conflicts/*` — needs alias + deprecation. **No schema migration needed** |
| F-6 | Mirror: `/mirror/{project}/overview`, `/decisions`, `/subsystems/{id}`, `/regenerate`; files under `/srv/storage/atocore/data/mirror/`; disputed + curated markers; deterministic output | 🟡 partial | `mirror.py` has `generate_project_overview` with header/state/system/decisions/requirements/materials/vendors/memories/footer sections. API at `/projects/{project_name}/mirror` and `.html`. **Gaps**: no separate `/mirror/{project}/decisions` or `/mirror/{project}/subsystems/{id}` routes, no `POST /regenerate` endpoint, no debounced-async-on-write, no daily refresh, no `⚠ disputed` markers wired to conflicts, no `(curated)` override annotations verified, no golden-file test for determinism |
| F-7 | Memory→entity graduation: `POST /memory/{id}/graduate` + `graduated` status + forward pointer + original preserved | 🟡 partial | `_graduation_prompt.py` exists; `api_request_graduation` + `api_graduation_status` + `api_graduation_stats` routes exist (routes.py:1573, 1607, 2065). Need to verify full flow against F-7 spec — original preserved? `graduated` status row added? forward pointer column present? |
| F-8 | Every active entity has `source_refs`; Q-017 returns ≥1 row for every active entity | 🟡 partial | `Entity.source_refs` field exists; Q-017 (`evidence_chain`) exists. **Gap**: is provenance enforced at write time (not NULL), or just encouraged? Per spec it must be mandatory |
@@ -116,12 +116,14 @@ capability exists but does not yet match spec shape or coverage.
| D-3 | `docs/v1-release-notes.md` | ❌ missing | Not written yet (appropriately — it's written when V1 is done) |
| D-4 | `master-plan-status.md` + `current-state.md` updated with V1 completion | ❌ not yet | `master-plan-status.md:179` still has V1 under **Next** |
### Summary
### Summary (revised per Codex 2026-04-22 per-file audit)
- **Functional:** 1/8 ✅, 6/8 🟡 partial, 1/8 ❌ missing → the entity layer is real; the ingest + mirror + graduation surfaces need completion
- **Functional:** 0/8 ✅, 7/8 🟡 partial (F-1 downgraded from ✅ — two header fields missing; F-2 through F-7 partial), 1/8 ❌ missing (F-3 ingest endpoints) → the entity layer shape is real but not yet spec-clean; write-time invariants come first, then everything builds on stable invariants
- **F-2 detail:** 9 of 20 v1-required queries done, 1 partial (Q-001 needs subsystem-scoped variant), 10 missing
- **F-5 detail:** generic `conflicts` + `conflict_members` schema already present (no migration needed); detector body + routes diverge from spec
- **Quality:** 1/6 ✅, 3/6 🟡 partial, 2/6 ❌ missing → golden file + killer-correctness integration test are the two clear gaps
- **Operational:** 0/5 ✅ (none marked fully verified), 3/5 🟡, 1/5 ❌ → backup drill is the one hard blocker here
- **Documentation:** 0/4 ✅, 4/4 ❌ → all 4 docs need writing, D-3/D-4 at the end, D-1/D-2 as part of their respective F criteria
- **Operational:** 0/5 ✅ (none fully verified), 3/5 🟡, 1/5 ❌ → backup drill is the one hard blocker here
- **Documentation:** 0/4 ✅, 4/4 ❌ → all 4 docs need writing
---
@@ -142,13 +144,23 @@ Skipped by construction: F-1 core schema (already implemented) and O-5
### Phase V1-0: Write-time invariants (F-8 + F-5 minimal + F-1 audit)
**Scope:**
- **F-1 audit (Codex action).** Before any code change, Codex does a
per-file audit of `src/atocore/engineering/service.py`,
`conflicts.py`, `mirror.py`, `queries.py` against the acceptance doc's
F-1 shared-header-field list (`id, type, name, project_id, status,
confidence, source_refs, created_at, updated_at, extractor_version,
canonical_home`). Confirm which fields exist, which are missing. This
becomes the ground-truth F-1 row in the gap audit table below.
- **F-1 remediation (Codex audit 2026-04-22 already completed).** Add
the two missing shared-header fields to the `Entity` dataclass
(`service.py:67`) and the `entities` table schema:
- `extractor_version TEXT` — semver-ish string carrying the extractor
module version per `promotion-rules.md:268`. Backfill existing rows
with `"0.0.0"` or `NULL` flagged as unknown. Every future
write carries the current `EXTRACTOR_VERSION` constant.
- `canonical_home TEXT` — which layer is canonical for this concept.
For entities, value is always `"entity"`. For future graduation
records it may be `"memory"` (frozen pointer). Backfill active
rows with `"entity"`.
- Additive migration via the existing `_apply_migrations` pattern,
idempotent, safe on replay.
- Add doc note in `engineering-ontology-v1.md` clarifying that the
`project` field IS the `project_id` per spec — "fields equivalent
to" wording in the spec allows this, but make it explicit so
future readers don't trip on the naming.
- **F-8 provenance enforcement.** Add a NOT-NULL invariant at
`create_entity` and `promote_entity` that `source_refs` is non-empty
OR an explicit `hand_authored=True` flag is set (per
@@ -171,11 +183,13 @@ Skipped by construction: F-1 core schema (already implemented) and O-5
land in V1-E; this is the one that V1-0 can cover without graduation
being ready.)
**Acceptance:** F-8, F-5 minimal hooks ✅, Q-3 ✅, partial Q-4 ✅,
F-1 row in gap table is accurate.
**Acceptance:** F-1 (after `extractor_version` + `canonical_home`
land + doc note on `project` naming), F-8 ✅, F-5 hooks ✅, Q-3 ✅,
partial Q-4 ✅.
**Estimated size:** 3 days (the audit is the biggest unknown; the
enforcement patches are small).
**Estimated size:** 3 days (two small schema additions + invariant
patches + hook wiring + tests; no audit overhead — Codex already did
that part).
**Tests added:** ~10.
@@ -186,30 +200,34 @@ must then be cleaned up.
### Phase V1-A: Minimal query slice that proves the model (partial F-2 + Q-6)
**Scope:**
- Pick the **four queries that prove the model on p05-interferometer**:
Q-001 (subsystem contents), Q-005 (component satisfies requirements),
Q-006 (orphan requirements — killer correctness), Q-017 (evidence
chain). These four exercise structural + intent + killer-correctness +
provenance, which are the four pillars of the V1 shape.
- Pick the **four pillar queries**: Q-001 (subsystem contents),
Q-005 (component satisfies requirements), Q-006 (orphan requirements —
killer correctness), Q-017 (evidence chain). These exercise structural +
intent + killer-correctness + provenance.
- **Q-001 needs a shape fix**: Codex's audit confirms the existing
`system_map()` returns a project-wide tree, not the spec's
subsystem-scoped `GET /entities/Subsystem/<id>?expand=contains`.
Add a subsystem-scoped variant (the existing project-wide route stays
for Q-004). This is the only shape fix in V1-A; larger query additions
move to V1-C.
- Q-005, Q-006, Q-017 are already implemented per Codex audit. V1-A
verifies them against seeded data; no code changes expected.
- Seed p05-interferometer with Q-6 integration data (one satisfying
Component + one orphan Requirement + one Decision on flagged
Assumption + one supported ValidationClaim + one unsupported
ValidationClaim).
- Verify each of the four queries returns correct results against the
seeded data. The three killer-correctness queries (Q-006, Q-009,
Q-011) run as a single integration test. Q-009 and Q-011 are
implemented against the seed data here even though they're not in the
"four pillars" list, because Q-6 requires all three.
- Any query function the Codex F-1 audit found to be missing fields
required by Q-001/Q-005/Q-006/Q-017 gets filled in here, not in V1-C.
- All three killer-correctness queries (Q-006, Q-009, Q-011) are
**already implemented** per Codex audit. V1-A runs them as a single
integration test against the seed data.
**Acceptance:** The four pillar queries + Q-006/Q-009/Q-011 killer
correctness all return correct results. Q-6 ✅ passes. Partial F-2
(the remaining queries land in V1-C).
**Acceptance:** Q-001 subsystem-scoped variant + Q-6 integration test.
Partial F-2 (remaining 10 missing + 1 partial queries land in V1-C).
**Estimated size:** 2 days.
**Estimated size:** 1.5 days (scope shrunk — most pillar queries already
work per Codex audit; only Q-001 shape fix + seed data + integration
test required).
**Tests added:** ~6.
**Tests added:** ~4.
**Why second:** proves the entity layer shape works end-to-end on real
data before we start bolting ingest, graduation, or mirror onto it. If
@@ -244,22 +262,25 @@ further.
### Phase V1-C: Close the rest of the query catalog (remaining F-2)
**Scope:**
- Implement remaining v1-required queries: Q-002 (component parents),
Q-003 (subsystem interfaces, with Interface as simple string label),
Q-004 (project system-map tree), Q-007 (component constraints),
Q-008 (decisions affecting an entity, full shape), Q-010 (supports
trace to AnalysisModel), Q-012 (conflicting results on same claim —
exercises V1-0's F-5 hook), Q-013 (recent changes with window),
Q-014 (decision log ordered + superseded chain), Q-016 (impact
analysis — likely already done, just verify shape), Q-018
(`include=superseded`), Q-019 (material → components).
- Q-020 (project overview mirror route) is deferred to V1-D where the
**Scope:** close the 10 missing queries per Codex's audit. Already-done
queries (Q-004/Q-005/Q-006/Q-008/Q-009/Q-011/Q-013/Q-016/Q-017) are
verified but not rewritten.
- Q-002 (component → parents, inverse of CONTAINS)
- Q-003 (subsystem interfaces, Interface as simple string label)
- Q-007 (component → constraints via CONSTRAINED_BY)
- Q-010 (ValidationClaim → supporting results + AnalysisModel trace)
- Q-012 (conflicting results on same claim — exercises V1-0's F-5 hook)
- Q-014 (decision log ordered + superseded chain)
- Q-018 (`include=superseded` for supersession chains)
- Q-019 (Material → components, derived from Component.material field
per `engineering-query-catalog.md:266`, no edge needed)
- Q-020 (project overview mirror route) — deferred to V1-D where the
mirror lands in full.
**Acceptance:** F-2 ✅ (all 19 of 20 v1-required queries; Q-020 in V1-D).
**Estimated size:** 2 days.
**Estimated size:** 2 days (eight new query functions + routes +
per-query happy-path tests).
**Tests added:** ~12.
@@ -327,17 +348,22 @@ a stable, tested entity layer.
### Phase V1-F: Full F-5 spec compliance + O-1/O-2/O-3 + D-1/D-3/D-4
**Scope:**
- **F-5 full spec compliance.** Audit `conflicts.py` against
`conflict-model.md`. The spec wants a generic `conflicts` +
`conflict_members` table with slot-keyed detection. V1-0 put the hook
in place; V1-F is where the detector body gets swapped to the generic
shape if the audit shows divergence.
- If schema already matches spec: no work.
- If divergent: migrate additively (new tables alongside existing,
dual-read, drop old after one stable release).
- Rename `/admin/conflicts/*` routes to `/conflicts/*` per spec,
keep `/admin/conflicts/*` as aliases for one release, deprecate in
D-3 release notes.
- **F-5 full spec compliance** (Codex 2026-04-22 audit already confirmed
the gap shape — schema is spec-compliant, divergence is in detector +
routes only).
- **Detector generalization.** Replace the per-type dispatch at
`conflicts.py:36` (`_check_component_conflicts`,
`_check_requirement_conflicts`) with a slot-key-driven generic
detector that reads the per-entity-type conflict slot from a
registry and queries the already-generic `conflicts` +
`conflict_members` tables. The V1-0 hook shape was chosen to make
this a detector-body swap, not an API change.
- **Route alignment.** Add `/conflicts/*` routes as the canonical
surface per `conflict-model.md:187`. Keep `/admin/conflicts/*` as
aliases for one release, deprecate in D-3 release notes, remove
in V1.1.
- **No schema migration needed** (the tables at `database.py:190`
already match the spec).
- **O-1:** Run the full migration against a Dalidou backup copy.
Confirm additive, idempotent, safe to run twice.
- **O-2:** Run a full restore drill on the test project per
@@ -359,12 +385,14 @@ D-1 entity docs at ~30 min each ≈ 6 hours; verification is fast).
**Tests added:** ~6 (F-5 spec-shape tests; verification adds no automated
tests).
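
To illustrate the detector generalization in the F-5 scope, here is a rough in-memory sketch of slot-key-driven detection. Everything in it is an assumption made for illustration: the `SketchEntity` type, the `SLOT_REGISTRY` contents, and the slot-key format are hypothetical and are not taken from `conflicts.py` or `conflict-model.md`. The real detector would persist to the existing `conflicts` + `conflict_members` tables instead of returning dicts.

```python
from dataclasses import dataclass, field


@dataclass
class SketchEntity:
    # Hypothetical stand-in for the real Entity dataclass.
    id: str
    type: str
    name: str
    properties: dict = field(default_factory=dict)


# Hypothetical registry: entity type -> function returning (slot_key, value).
# One generic detector reads this table instead of dispatching per type.
SLOT_REGISTRY = {
    "Component": lambda e: (f"component:{e.name}:material", e.properties.get("material")),
    "Requirement": lambda e: (f"requirement:{e.name}:text", e.properties.get("text")),
}


def detect_conflicts(new_entity, active_entities):
    """Return conflict records for entities sharing a slot key with a different value."""
    slot_fn = SLOT_REGISTRY.get(new_entity.type)
    if slot_fn is None:
        return []  # no conflict slot registered for this type
    slot_key, value = slot_fn(new_entity)
    if value is None:
        return []  # nothing asserted in the slot, so nothing to conflict with
    members = [new_entity.id]
    for other in active_entities:
        if other.id == new_entity.id or other.type != new_entity.type:
            continue
        other_key, other_value = slot_fn(other)
        if other_key == slot_key and other_value is not None and other_value != value:
            members.append(other.id)
    if len(members) == 1:
        return []
    # Shape loosely mirrors one conflicts row plus its conflict_members rows.
    return [{"slot_key": slot_key, "members": members}]
```

Under this shape, supporting a new entity type means adding one registry entry, and the write-time hook that calls the detector does not change.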
### Total (revised)
### Total (revised after Codex 2026-04-22 audit)
- Estimated **12-17 days of focused work** across seven phases — up from
the original 11-14 days to reflect V1-0 overhead and Codex's objection
that the first estimate was too tight.
- Adds roughly **65 tests** (533 → ~600).
- Phase budgets: V1-0 (3) + V1-A (1.5) + V1-B (2) + V1-C (2) + V1-D (3-4)
+ V1-E (2) + V1-F (3) ≈ **16.5-17.5 days of focused work**. Revised down
slightly from the previous 12-17 estimate because V1-A scope shrank
(four pillar queries are mostly already implemented per Codex audit)
and V1-F F-5 work shrank (no schema migration needed).
- Adds roughly **60 tests** (533 → ~593).
- Branch strategy: one branch per phase (V1-0 → V1-F), each squash-merged
to main after Codex review. Phases sequential because each builds on
the previous. **V1-0 is a hard prerequisite for all later phases**
@@ -464,49 +492,42 @@ following are **explicitly out of scope** for this plan:
---
## Open questions for Codex
## Open questions for Codex (post-second-round revision)
1. **Is the parallel schedule with the Now list acceptable?** Claude's read
is that V1 work and Now items touch disjoint surfaces so they run in
parallel without conflict. Codex may see collisions Claude missed.
Three of the original eight questions (F-1 field audit, F-2 per-query
audit, F-5 schema divergence) were answered by Codex's 2026-04-22 audit
and folded into the plan. Remaining open questions:
2. **Phase V1-A query audit scope.** Claude listed Q-001, Q-002, Q-003,
Q-007, Q-010, Q-012, Q-014, Q-018, Q-019, Q-020 as likely gaps without
reading each query function end-to-end. Codex's per-file audit may find
more already done (or more missing).
1. **Parallel schedule vs Now list.** The first-round review correctly
softened this from "fully parallel" to "less disjoint than claimed".
Is the revised collision table + pause-points section enough, or
should specific Now-list items gate specific V1 phases more strictly?
3. **F-5 conflicts schema divergence.** The current code uses per-type
detectors (`_check_component_conflicts`, `_check_requirement_conflicts`)
whereas the spec wants a generic slot-keyed `conflicts` + `conflict_members`.
Is the existing schema *equivalent* (just implemented differently) or
*divergent* (needs migration)? This is a one-read decision for Codex.
2. **F-7 graduation gap depth.** Still unaudited. `_graduation_prompt.py`
+ `api_request_graduation` + DB schema need one Codex read to tell
us whether V1-E is a 2-day phase or a 4-day phase.
4. **Should F-5 route rename (`/admin/conflicts/*` → `/conflicts/*`) be
breaking?** Spec route path differs from current. Proposal: add
`/conflicts/*` as aliases, keep `/admin/conflicts/*` for one release,
deprecate in V1 release notes, remove in V1.1.
3. **Mirror determinism — where does `now` go?** The current mirror
footer has a live timestamp (`mirror.py:326`). Spec says
deterministic output, spec also shows a `Regenerated:` header with
timestamp (`human-mirror-rules.md:265`). Reconciliation proposal:
timestamp allowed in the header banner but must be an input
parameter so the golden-file test can pin it. Sound right?
5. **Mirror determinism — where does `now` go?** The current mirror footer
has a live timestamp (line 326 of `mirror.py`). Spec says deterministic
output, spec also shows a `Regenerated:` header with timestamp (line
265 of `human-mirror-rules.md`). Reconciliation: timestamp is allowed
in the header banner but must be an input parameter so the golden-file
test can pin it. Sound right?
4. **`project` field naming.** The spec writes `project_id`; the code
writes `project`. The spec says "fields equivalent to" so naming is
technically flexible. Proposal: V1-0 adds a doc note making this
explicit, no column rename. Is this acceptable, or does Codex want
the rename for cleanliness?
6. **F-7 graduation gap depth.** Without running the existing graduation
flow end-to-end against a real memory, Claude can't tell how close the
existing code is to F-7 spec. Codex's audit of
`_graduation_prompt.py` + `api_request_graduation` + DB schema would
close this question in one read.
5. **Velocity calibration.** Revised 16.5-17.5 days total. Given Phase
7A took a week and Phase 7D fit in one session, is this a fair
estimate for a single-operator sprint, or should we build in buffer
for multi-phase context switches?
7. **Estimated 11-14 days honest?** Given recent phase velocities (Phase
7A was a week, Phase 7D fit in a single session), 2-3 days per phase
across 6 phases may be light or heavy. Codex's calibration against
actual repo velocity would help.
8. **After V1, the minions/queue mechanic we rejected returns as a
candidate V2 item.** Should we note it explicitly in V1 release notes
(D-3) as a future track, or leave it unnamed until V2 planning starts?
6. **Minions/queue as V2 item in D-3.** Should we name it explicitly in
V1 release notes as a future track, or leave it unnamed until V2
planning starts?
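
As a small sketch of the mirror determinism proposal in the open questions above, the renderer below takes the timestamp as an input parameter so a golden-file test can pin it. The function name `render_overview`, its arguments, and the output layout are hypothetical; they do not reproduce `mirror.py` or the template in `human-mirror-rules.md`.

```python
from datetime import datetime, timezone


def render_overview(project: str, decisions: list, now: datetime) -> str:
    """Hypothetical renderer: output depends only on its arguments."""
    lines = [
        f"# {project} overview",
        f"Regenerated: {now.isoformat()}",  # timestamp comes from the caller
        "",
    ]
    # Sort so that input ordering cannot change the rendered bytes.
    for title in sorted(decisions):
        lines.append(f"- {title}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    pinned = datetime(2026, 4, 22, tzinfo=timezone.utc)
    print(render_overview("p05-interferometer", ["B decision", "A decision"], pinned))
```

A production caller would pass `datetime.now(timezone.utc)`, while the golden-file test passes a fixed value and compares the full output byte for byte.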
---
@@ -515,7 +536,7 @@ following are **explicitly out of scope** for this plan:
| Risk | Mitigation |
|---|---|
| V1 work slows the Now list | V1 pauses on any Now-list blocker. Codex veto on any V1 PR that touches memory extractor, retrieval ranking, or triage paths |
| F-5 schema migration is bigger than estimated | If Codex audit shows material divergence, split V1-E into two phases (schema migration separate from provenance enforcement) |
| F-5 detector generalization harder than estimated | Codex audit confirmed schema is already spec-compliant; only detector body + routes need work. If detector generalization still slips, keep per-type detectors and document as a V1.1 cleanup (detection correctness is unaffected, only code organization) |
| Mirror determinism regresses existing mirror output | Keep `/projects/{project_name}/mirror` alias returning the current shape; new `/mirror/{project}/overview` is the spec-compliant one. Deprecate old in V1 release notes |
| Golden file churn as templates evolve | Standard workflow: updating a golden file is a normal part of template work, documented in V1-C commit message |
| Backup drill on Dalidou is disruptive | Run against a clone of the Dalidou DB at a safe hour; no production drill required for V1 acceptance |