diff --git a/DEV-LEDGER.md b/DEV-LEDGER.md
index ad6338b..8861fd7 100644
--- a/DEV-LEDGER.md
+++ b/DEV-LEDGER.md
@@ -146,6 +146,7 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
 ## Recent Decisions

+- **2026-04-22** **Engineering V1 Completion Plan revised per Codex second-round file-level audit** — three findings folded in, all with exact file:line refs from Codex: (1) F-1 downgraded from ✅ to 🟡 — `extractor_version` and `canonical_home` missing from `Entity` dataclass and `entities` table per `engineering-v1-acceptance.md:45`; V1-0 scope now adds both fields via additive migration + doc note that `project` IS `project_id` per "fields equivalent to" spec wording; (2) F-2 replaced with ground-truth per-query status: 9 of 20 v1-required queries done (Q-004/Q-005/Q-006/Q-008/Q-009/Q-011/Q-013/Q-016/Q-017), 1 partial (Q-001 needs subsystem-scoped variant), 10 missing (Q-002/003/007/010/012/014/018/019/020); V1-A scope shrank to Q-001 shape fix + Q-6 integration (pillar queries already implemented); V1-C closes the 8 remaining new queries + Q-020 deferred to V1-D; (3) F-5 reframed — generic `conflicts` + `conflict_members` schema already present at `database.py:190`, no migration needed; divergence is detector body (per-type dispatch needs generalization) + routes (`/admin/conflicts/*` needs `/conflicts/*` alias). Total revised to 16.5–17.5 days, ~60 tests. Plan: `docs/plans/engineering-v1-completion-plan.md` at commit `ce3a878` (Codex pulled clean). Three of Codex's eight open questions now answered; remaining: F-7 graduation depth, mirror determinism, `project` rename question, velocity calibration, minions naming. *Proposed by:* Claude. *Reviewed by:* Codex (two rounds).
 - **2026-04-22** **Engineering V1 Completion Plan revised per Codex first-round review** — original six-phase order (queries → ingest → mirror → graduation → provenance → ops) rejected by Codex as backward: provenance-at-write (F-8) and conflict-detection hooks (F-5 minimal) must precede any phase that writes active entities. Revised to seven phases: V1-0 write-time invariants (F-8 + F-5 hooks + F-1 audit) as hard prerequisite, V1-A minimum query slice proving the model, V1-B ingest, V1-C full query catalog, V1-D mirror, V1-E graduation, V1-F full F-5 spec + ops + docs. Also softened "parallel with Now list" — real collision points listed explicitly; schedule shifted ~4 weeks to reflect that V1-0 cannot start during pipeline soak. Withdrew the "50–70% built" global framing in favor of the per-criterion gap table. Workspace sync note added: Codex's Playground workspace can't see the plan file; canonical dev tree is Windows `C:\Users\antoi\ATOCore`. Plan: `docs/plans/engineering-v1-completion-plan.md`. Awaiting Codex file-level audit once workspace syncs. *Proposed by:* Claude. *First-round review by:* Codex.
 - **2026-04-22** gbrain-inspired "Phase 8 Minions + typed edges" plan **rejected as packaged** — wrong sequencing (leapfrogged `master-plan-status.md` Now list), wrong predicate set (6 vs V1's 17), wrong canonical boundary (edges-on-wikilinks instead of typed entities+relationships per `memory-vs-entities.md`). Mechanic (durable jobs + typed graph) deferred to V1 home. Record: `docs/decisions/2026-04-22-gbrain-plan-rejection.md`. *Proposed by:* Claude. *Reviewed/rejected by:* Codex. *Ratified by:* Antoine.
 - **2026-04-12** Day 4 gate cleared: LLM-assisted extraction via `claude -p` (OAuth, no API key) is the path forward. Rule extractor stays as default for structural cues. *Proposed by:* Claude. *Ratified by:* Antoine.
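The additive migration that the second-round decision above assigns to V1-0 can be pictured with a small SQLite sketch. This is only an illustration under stated assumptions: the function names are hypothetical, the `entities` table is assumed to have a `status` column with an `'active'` value, and the repo's real `_apply_migrations` pattern may be organized differently. It adds `extractor_version` and `canonical_home` only when absent, so replaying it is a no-op.

```python
import sqlite3


def _column_names(conn: sqlite3.Connection, table: str) -> set[str]:
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    return {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}


def apply_entity_header_migration(conn: sqlite3.Connection) -> list[str]:
    """Add the two shared-header columns if absent; return what was added.

    Hypothetical helper, not the repo's actual migration code.
    """
    added = []
    present = _column_names(conn, "entities")
    for column in ("extractor_version", "canonical_home"):
        if column not in present:
            conn.execute(f"ALTER TABLE entities ADD COLUMN {column} TEXT")
            added.append(column)
    # The backfill is safe to repeat: it only touches rows that are still NULL.
    # Assumes an existing `status` column that uses the value 'active'.
    conn.execute(
        "UPDATE entities SET canonical_home = 'entity' "
        "WHERE status = 'active' AND canonical_home IS NULL"
    )
    # extractor_version is left NULL on legacy rows, meaning "unknown".
    conn.commit()
    return added
```

Leaving `extractor_version` as `NULL` on legacy rows is one of the two backfill options the plan mentions; a `"0.0.0"` sentinel would work equally well in this sketch.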
@@ -162,6 +163,8 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
 ## Session Log

+- **2026-04-22 Claude (late night)** Codex second-round review did the full file-level audit and came back with three P1/P2 findings, all with exact file:line refs. Verified each against current code before revising. (1) **F-1 not clean**: `Entity` dataclass at `service.py:67` and `entities` table schema are missing the `extractor_version` and `canonical_home` shared-header fields required by `engineering-v1-acceptance.md:45`; `project` field is the project identifier but not named `project_id` as spec writes (spec wording "fields equivalent to" allows the naming, but needs explicit doc note). V1-0 scope now includes adding both missing fields via additive `_apply_migrations` pattern. (2) **F-2 needed exact statuses, not guesses**: per-function audit gave ground truth — 9 of 20 v1-required queries done, 1 partial (Q-001 returns project-wide tree not subsystem-scoped expand=contains per `engineering-query-catalog.md:71`), 10 missing. V1-A scope shrank to Q-001 shape fix + Q-6 integration (most pillar queries already implemented); V1-C closes the 8 net-new queries + Q-020 to V1-D. (3) **F-5 misframed**: the generic `conflicts` + `conflict_members` schema is ALREADY spec-compliant at `database.py:190`; divergence is detector body at `conflicts.py:36` (per-type dispatch needs generalization) + route path (`/admin/conflicts/*` needs `/conflicts/*` alias). V1-F no longer includes a schema migration; detector generalization + route alignment only. Totals revised to 16.5–17.5 days, ~60 tests (down from 12–17 / 65 because V1-A and V1-F scopes both shrank after audit). Three of the eight open questions resolved. Remaining open: F-7 graduation depth, mirror determinism, `project` naming, velocity calibration, minions-as-V2 naming. No code changes this session — plan + ledger only.
Next: commit + push revised plan, then await Antoine+Codex joint sign-off before V1-0 starts.
+
 - **2026-04-22 Claude (night)** Codex first-round review of the V1 Completion Plan summary came back with four findings. Three substantive, one workspace-sync: (1) "50–70% built" too loose — replaced with per-criterion table, global framing withdrawn; (2) phase order backward — provenance-at-write (F-8) and conflict hooks (F-5 minimal) depend-upon by every later phase but were in V1-E; new V1-0 prerequisite phase inserted to establish write-time invariants, and V1-A shrunk to a minimum query slice (four pillars Q-001/Q-005/Q-006/Q-017 + Q-6 integration) rather than full catalog closure; (3) "parallel with Now list / disjoint surfaces" too strong — real collisions listed explicitly (V1-0 provenance + memory extractor write path, V1-E graduation + memory module, V1-F conflicts migration + memory promote); schedule shifted ~4 weeks, V1-0 cannot start during pipeline soak; (4) Codex's Playground workspace can't see the plan file or the `src/atocore/engineering/` code — added a Workspace note to the plan directing per-file audit at the Windows canonical dev tree (`C:\Users\antoi\ATOCore`) and noting the three visible file paths (`docs/plans/engineering-v1-completion-plan.md`, `docs/decisions/2026-04-22-gbrain-plan-rejection.md`, `DEV-LEDGER.md`). Revised plan estimate: 12–17 days across 7 phases (up from 11–14 / 6), ~65 tests added (up from ~50). V1-0 is a hard prerequisite; no later phase starts until it lands. Pending Antoine decision on workspace sync (commit+push vs paste-to-Codex) so Codex can do the file-level audit. No code changes this session.
 - **2026-04-22 Claude (late eve)** After the rejection, read the four core V1 architecture docs end-to-end (`engineering-ontology-v1.md`, `engineering-query-catalog.md`, `memory-vs-entities.md`, `engineering-v1-acceptance.md`) plus the four supporting docs (`promotion-rules.md`, `conflict-model.md`, `human-mirror-rules.md`, `tool-handoff-boundaries.md`). Cross-referenced against current code in `src/atocore/engineering/`. **Key finding:** V1 is already 50–70% built — entity types (16, superset of V1's 12), all 18 V1 relationship types, 4-state lifecycle, CRUD + supersede + invalidate + PATCH, queries module with most killer-correctness queries (orphan_requirements, risky_decisions, unsupported_claims, impact_analysis, evidence_chain), conflicts module scaffolded, mirror scaffolded, graduation endpoint scaffolded. Recent commits e147ab2/b94f9df/081c058/069d155/b1a3dd0 are all V1 entity-layer work. Drafted `docs/plans/engineering-v1-completion-plan.md` reframing the work as **V1 completion, not V1 start**. Six sequential phases V1-A through V1-F, estimated 11–14 days, ~50 new tests (533 → ~580). Phases run in parallel with the Now list (pipeline soak + density + multi-model triage + p04-constraints) because surfaces are disjoint. Plan explicitly defers the minions/queue mechanic per acceptance-doc negative list. Pending Codex audit of the plan itself — especially the F-2 query gap list (Claude didn't read each query function end-to-end), F-5 conflicts schema divergence (per-type detectors vs spec's generic slot-keyed shape), and F-7 graduation depth. No code changes this session.
diff --git a/docs/plans/engineering-v1-completion-plan.md b/docs/plans/engineering-v1-completion-plan.md
index e94cf4d..ed93fff 100644
--- a/docs/plans/engineering-v1-completion-plan.md
+++ b/docs/plans/engineering-v1-completion-plan.md
@@ -77,11 +77,11 @@ capability exists but does not yet match spec shape or coverage.
 | ID | Criterion | Status | Evidence |
 |----|-----------|--------|----------|
-| F-1 | 12 V1 entity types, 4 relationship families, shared header fields, 4-state lifecycle | ✅ done | `service.py:16-36` (16 types, superset of V1 minimum), `service.py:38-62` (18 relationship types), `service.py:64` statuses, `Entity` dataclass at line 67 |
-| F-2 | All v1-required Q-001 through Q-020 implemented, with provenance where required | 🟡 partial | `queries.py` has system_map (Q-004), decisions_affecting (Q-008), requirements_for (Q-005 component side), recent_changes (Q-013), orphan_requirements (Q-006 killer), risky_decisions (Q-009 killer), unsupported_claims (Q-011 killer), impact_analysis (Q-016), evidence_chain (Q-017). Likely missing or partial: Q-001 (expand=contains), Q-002 (expand=parents), Q-003 (interfaces), Q-007 (constraints on component), Q-010 (supports trace), Q-012 (conflicting results), Q-014 (decision-log ordered chain), Q-018 (include=superseded chain), Q-019 (material→components), Q-020 (project overview mirror endpoint in V1-required shape) |
+| F-1 | 12 V1 entity types, 4 relationship families, shared header fields, 4-state lifecycle | 🟡 partial (per Codex 2026-04-22 audit) | `service.py:16-36` has 16 types (superset of V1's 12), `service.py:38-62` has 18 relationship types, `service.py:64` statuses, `Entity` dataclass at line 67. **Gaps vs `engineering-v1-acceptance.md:45`**: `extractor_version` missing from dataclass and `entities` table; `canonical_home` missing from dataclass and table; `project` field is the project identifier but not named `project_id` as spec uses — spec says "fields equivalent to" so naming flexibility is allowed but needs an explicit doc note.
Remediation lands in V1-0 |
+| F-2 | All v1-required Q-001 through Q-020 implemented, with provenance where required | 🟡 partial (per Codex 2026-04-22 per-function audit) | **Ground truth from per-function read of `queries.py` + `routes.py:2092+`:** Q-001 partial (`system_map()` returns project-wide tree, not the catalog's subsystem-scoped `GET /entities/Subsystem/?expand=contains` shape per `engineering-query-catalog.md:71`); Q-002 missing; Q-003 missing; Q-004 done (covered by `system_map()`); Q-005 done (`requirements_for()`); Q-006 done (`orphan_requirements()`); Q-007 missing; Q-008 done (`decisions_affecting()`); Q-009 done (`risky_decisions()`); Q-010 missing; Q-011 done (`unsupported_claims()`); Q-012 missing; Q-013 done (`recent_changes()`); Q-014 missing; Q-016 done (`impact_analysis()`); Q-017 done (`evidence_chain()`); Q-018 missing; Q-019 missing; Q-020 missing (mirror route in spec shape). **Net: 9 of 20 v1-required queries done, 1 partial (Q-001), 10 missing.** Q-015 is v1-stretch, out of scope |
 | F-3 | `POST /ingest/kb-cad/export` and `POST /ingest/kb-fem/export` | ❌ missing | No `/ingest/kb-cad` or `/ingest/kb-fem` route in `api/routes.py`. No schema doc under `docs/architecture/` |
 | F-4 | Candidate review queue end-to-end (list/promote/reject/edit) | 🟡 partial for entities | Memory side shipped in Phase 9 Commit C. Entity side has `promote_entity`, `supersede_entity`, `invalidate_active_entity` but reject path and editable-before-promote may not match spec shape. Need to verify `GET /entities?status=candidate` returns spec shape |
-| F-5 | Conflict detector fires synchronously; `POST /conflicts/{id}/resolve` + dismiss | 🟡 partial | `conflicts.py` has `detect_conflicts_for_entity`, `list_open_conflicts`, `resolve_conflict`. API at `/admin/conflicts` + `/admin/conflicts/{id}/resolve`.
**Gap vs spec**: spec wants generic slot-key model with `conflicts` + `conflict_members` tables; current code has per-type detectors (`_check_component_conflicts`, `_check_requirement_conflicts`) — need to verify schema, and spec routes are `/conflicts/*` not `/admin/conflicts/*` |
+| F-5 | Conflict detector fires synchronously; `POST /conflicts/{id}/resolve` + dismiss | 🟡 partial (per Codex 2026-04-22 audit — schema present, detector+routes divergent) | **Schema is already spec-shaped**: `database.py:190` defines the generic `conflicts` + `conflict_members` tables per `conflict-model.md`; `conflicts.py:154` persists through them. **Divergences are in detection and API, not schema**: (1) `conflicts.py:36` dispatches per-type detectors only (`_check_component_conflicts`, `_check_requirement_conflicts`) — needs generalization to slot-key-driven detection; (2) routes live at `/admin/conflicts/*`, spec says `/conflicts/*` — needs alias + deprecation. **No schema migration needed** |
 | F-6 | Mirror: `/mirror/{project}/overview`, `/decisions`, `/subsystems/{id}`, `/regenerate`; files under `/srv/storage/atocore/data/mirror/`; disputed + curated markers; deterministic output | 🟡 partial | `mirror.py` has `generate_project_overview` with header/state/system/decisions/requirements/materials/vendors/memories/footer sections. API at `/projects/{project_name}/mirror` and `.html`. **Gaps**: no separate `/mirror/{project}/decisions` or `/mirror/{project}/subsystems/{id}` routes, no `POST /regenerate` endpoint, no debounced-async-on-write, no daily refresh, no `⚠ disputed` markers wired to conflicts, no `(curated)` override annotations verified, no golden-file test for determinism |
 | F-7 | Memory→entity graduation: `POST /memory/{id}/graduate` + `graduated` status + forward pointer + original preserved | 🟡 partial | `_graduation_prompt.py` exists; `api_request_graduation` + `api_graduation_status` + `api_graduation_stats` routes exist (routes.py:1573, 1607, 2065).
Need to verify full flow against F-7 spec — original preserved? `graduated` status row added? forward pointer column present? |
 | F-8 | Every active entity has `source_refs`; Q-017 returns ≥1 row for every active entity | 🟡 partial | `Entity.source_refs` field exists; Q-017 (`evidence_chain`) exists. **Gap**: is provenance enforced at write time (not NULL), or just encouraged? Per spec it must be mandatory |
@@ -116,12 +116,14 @@ capability exists but does not yet match spec shape or coverage.
 | D-3 | `docs/v1-release-notes.md` | ❌ missing | Not written yet (appropriately — it's written when V1 is done) |
 | D-4 | `master-plan-status.md` + `current-state.md` updated with V1 completion | ❌ not yet | `master-plan-status.md:179` still has V1 under **Next** |

-### Summary
+### Summary (revised per Codex 2026-04-22 per-file audit)

-- **Functional:** 1/8 ✅, 6/8 🟡 partial, 1/8 ❌ missing → the entity layer is real; the ingest + mirror + graduation surfaces need completion
+- **Functional:** 0/8 ✅, 7/8 🟡 partial (F-1 downgraded from ✅ — two header fields missing; F-2 through F-7 partial), 1/8 ❌ missing (F-3 ingest endpoints) → the entity layer shape is real but not yet spec-clean; write-time invariants come first, then everything builds on stable invariants
+- **F-2 detail:** 9 of 20 v1-required queries done, 1 partial (Q-001 needs subsystem-scoped variant), 10 missing
+- **F-5 detail:** generic `conflicts` + `conflict_members` schema already present (no migration needed); detector body + routes diverge from spec
 - **Quality:** 1/6 ✅, 3/6 🟡 partial, 2/6 ❌ missing → golden file + killer-correctness integration test are the two clear gaps
-- **Operational:** 0/5 ✅ (none marked fully verified), 3/5 🟡, 1/5 ❌ → backup drill is the one hard blocker here
-- **Documentation:** 0/4 ✅, 4/4 ❌ → all 4 docs need writing, D-3/D-4 at the end, D-1/D-2 as part of their respective F criteria
+- **Operational:** 0/5 ✅ (none fully verified), 3/5 🟡, 1/5 ❌ →
backup drill is the one hard blocker here
+- **Documentation:** 0/4 ✅, 4/4 ❌ → all 4 docs need writing

 ---

@@ -142,13 +144,23 @@ Skipped by construction: F-1 core schema (already implemented) and O-5
 ### Phase V1-0: Write-time invariants (F-8 + F-5 minimal + F-1 audit)

 **Scope:**
-- **F-1 audit (Codex action).** Before any code change, Codex does a
-  per-file audit of `src/atocore/engineering/service.py`,
-  `conflicts.py`, `mirror.py`, `queries.py` against the acceptance doc's
-  F-1 shared-header-field list (`id, type, name, project_id, status,
-  confidence, source_refs, created_at, updated_at, extractor_version,
-  canonical_home`). Confirm which fields exist, which are missing. This
-  becomes the ground-truth F-1 row in the gap audit table below.
+- **F-1 remediation (Codex audit 2026-04-22 already completed).** Add
+  the two missing shared-header fields to the `Entity` dataclass
+  (`service.py:67`) and the `entities` table schema:
+  - `extractor_version TEXT` — semver-ish string carrying the extractor
+    module version per `promotion-rules.md:268`. Backfill existing rows
+    with `"0.0.0"` or `NULL` flagged as unknown. Every future
+    write carries the current `EXTRACTOR_VERSION` constant.
+  - `canonical_home TEXT` — which layer is canonical for this concept.
+    For entities, value is always `"entity"`. For future graduation
+    records it may be `"memory"` (frozen pointer). Backfill active
+    rows with `"entity"`.
+  - Additive migration via the existing `_apply_migrations` pattern,
+    idempotent, safe on replay.
+  - Add doc note in `engineering-ontology-v1.md` clarifying that the
+    `project` field IS the `project_id` per spec — "fields equivalent
+    to" wording in the spec allows this, but make it explicit so
+    future readers don't trip on the naming.
 - **F-8 provenance enforcement.** Add a NOT-NULL invariant at `create_entity` and `promote_entity` that `source_refs` is non-empty OR an explicit `hand_authored=True` flag is set (per
@@ -171,11 +183,13 @@ Skipped by construction: F-1 core schema (already implemented) and O-5
   land in V1-E; this is the one that V1-0 can cover without graduation being ready.)

-**Acceptance:** F-8 ✅, F-5 minimal hooks ✅, Q-3 ✅, partial Q-4 ✅,
-F-1 row in gap table is accurate.
+**Acceptance:** F-1 ✅ (after `extractor_version` + `canonical_home`
+land + doc note on `project` naming), F-8 ✅, F-5 hooks ✅, Q-3 ✅,
+partial Q-4 ✅.

-**Estimated size:** 3 days (the audit is the biggest unknown; the
-enforcement patches are small).
+**Estimated size:** 3 days (two small schema additions + invariant
+patches + hook wiring + tests; no audit overhead — Codex already did
+that part).

 **Tests added:** ~10.

@@ -186,30 +200,34 @@ must then be cleaned up.
 ### Phase V1-A: Minimal query slice that proves the model (partial F-2 + Q-6)

 **Scope:**
-- Pick the **four queries that prove the model on p05-interferometer**:
-  Q-001 (subsystem contents), Q-005 (component satisfies requirements),
-  Q-006 (orphan requirements — killer correctness), Q-017 (evidence
-  chain). These four exercise structural + intent + killer-correctness +
-  provenance, which are the four pillars of the V1 shape.
+- Pick the **four pillar queries**: Q-001 (subsystem contents),
+  Q-005 (component satisfies requirements), Q-006 (orphan requirements —
+  killer correctness), Q-017 (evidence chain). These exercise structural +
+  intent + killer-correctness + provenance.
+- **Q-001 needs a shape fix**: Codex's audit confirms the existing
+  `system_map()` returns a project-wide tree, not the spec's
+  subsystem-scoped `GET /entities/Subsystem/?expand=contains`.
+  Add a subsystem-scoped variant (the existing project-wide route stays
+  for Q-004). This is the only shape fix in V1-A; larger query additions
+  move to V1-C.
+- Q-005, Q-006, Q-017 are already implemented per Codex audit. V1-A
+  verifies them against seeded data; no code changes expected.
 - Seed p05-interferometer with Q-6 integration data (one satisfying Component + one orphan Requirement + one Decision on flagged Assumption + one supported ValidationClaim + one unsupported ValidationClaim).
-- Verify each of the four queries returns correct results against the
-  seeded data. The three killer-correctness queries (Q-006, Q-009,
-  Q-011) run as a single integration test. Q-009 and Q-011 are
-  implemented against the seed data here even though they're not in the
-  "four pillars" list, because Q-6 requires all three.
-- Any query function the Codex F-1 audit found to be missing fields
-  required by Q-001/Q-005/Q-006/Q-017 gets filled in here, not in V1-C.
+- All three killer-correctness queries (Q-006, Q-009, Q-011) are
+  **already implemented** per Codex audit. V1-A runs them as a single
+  integration test against the seed data.

-**Acceptance:** The four pillar queries + Q-006/Q-009/Q-011 killer
-correctness all return correct results. Q-6 ✅ passes. Partial F-2
-(the remaining queries land in V1-C).
+**Acceptance:** Q-001 subsystem-scoped variant + Q-6 integration test.
+Partial F-2 (remaining 10 missing + 1 partial queries land in V1-C).

-**Estimated size:** 2 days.
+**Estimated size:** 1.5 days (scope shrunk — most pillar queries already
+work per Codex audit; only Q-001 shape fix + seed data + integration
+test required).

-**Tests added:** ~6.
+**Tests added:** ~4.

 **Why second:** proves the entity layer shape works end-to-end on real data before we start bolting ingest, graduation, or mirror onto it. If
@@ -244,22 +262,25 @@ further.
 ### Phase V1-C: Close the rest of the query catalog (remaining F-2)

-**Scope:**
-- Implement remaining v1-required queries: Q-002 (component parents),
-  Q-003 (subsystem interfaces, with Interface as simple string label),
-  Q-004 (project system-map tree), Q-007 (component constraints),
-  Q-008 (decisions affecting an entity, full shape), Q-010 (supports
-  trace to AnalysisModel), Q-012 (conflicting results on same claim —
-  exercises V1-0's F-5 hook), Q-013 (recent changes with window),
-  Q-014 (decision log ordered + superseded chain), Q-016 (impact
-  analysis — likely already done, just verify shape), Q-018
-  (`include=superseded`), Q-019 (material → components).
-- Q-020 (project overview mirror route) is deferred to V1-D where the
+**Scope:** close the 10 missing queries per Codex's audit. Already-done
+queries (Q-004/Q-005/Q-006/Q-008/Q-009/Q-011/Q-013/Q-016/Q-017) are
+verified but not rewritten.
+- Q-002 (component → parents, inverse of CONTAINS)
+- Q-003 (subsystem interfaces, Interface as simple string label)
+- Q-007 (component → constraints via CONSTRAINED_BY)
+- Q-010 (ValidationClaim → supporting results + AnalysisModel trace)
+- Q-012 (conflicting results on same claim — exercises V1-0's F-5 hook)
+- Q-014 (decision log ordered + superseded chain)
+- Q-018 (`include=superseded` for supersession chains)
+- Q-019 (Material → components, derived from Component.material field
+  per `engineering-query-catalog.md:266`, no edge needed)
+- Q-020 (project overview mirror route) — deferred to V1-D where the
   mirror lands in full.

 **Acceptance:** F-2 ✅ (all 19 of 20 v1-required queries; Q-020 in V1-D).

-**Estimated size:** 2 days.
+**Estimated size:** 2 days (eight new query functions + routes +
+per-query happy-path tests).

 **Tests added:** ~12.

@@ -327,17 +348,22 @@ a stable, tested entity layer.
 ### Phase V1-F: Full F-5 spec compliance + O-1/O-2/O-3 + D-1/D-3/D-4

 **Scope:**
-- **F-5 full spec compliance.** Audit `conflicts.py` against
-  `conflict-model.md`. The spec wants a generic `conflicts` +
-  `conflict_members` table with slot-keyed detection. V1-0 put the hook
-  in place; V1-F is where the detector body gets swapped to the generic
-  shape if the audit shows divergence.
-  - If schema already matches spec: no work.
-  - If divergent: migrate additively (new tables alongside existing,
-    dual-read, drop old after one stable release).
-  - Rename `/admin/conflicts/*` routes to `/conflicts/*` per spec,
-    keep `/admin/conflicts/*` as aliases for one release, deprecate in
-    D-3 release notes.
+- **F-5 full spec compliance** (Codex 2026-04-22 audit already confirmed
+  the gap shape — schema is spec-compliant, divergence is in detector +
+  routes only).
+  - **Detector generalization.** Replace the per-type dispatch at
+    `conflicts.py:36` (`_check_component_conflicts`,
+    `_check_requirement_conflicts`) with a slot-key-driven generic
+    detector that reads the per-entity-type conflict slot from a
+    registry and queries the already-generic `conflicts` +
+    `conflict_members` tables. The V1-0 hook shape was chosen to make
+    this a detector-body swap, not an API change.
+  - **Route alignment.** Add `/conflicts/*` routes as the canonical
+    surface per `conflict-model.md:187`. Keep `/admin/conflicts/*` as
+    aliases for one release, deprecate in D-3 release notes, remove
+    in V1.1.
+  - **No schema migration needed** (the tables at `database.py:190`
+    already match the spec).
 - **O-1:** Run the full migration against a Dalidou backup copy. Confirm additive, idempotent, safe to run twice.
 - **O-2:** Run a full restore drill on the test project per
@@ -359,12 +385,14 @@ D-1 entity docs at ~30 min each ≈ 6 hours; verification is fast).
 **Tests added:** ~6 (F-5 spec-shape tests; verification adds no automated tests).
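The detector generalization scoped in V1-F above can be illustrated with a minimal in-memory sketch. Everything named here is an assumption made for illustration: the registry contents, the slot-key shape `(project, type, name, field)`, and the dict-based entity records are not taken from `conflict-model.md` or `conflicts.py`, and persistence into the `conflicts` + `conflict_members` tables is left out. The idea shown is only that one generic routine, driven by a per-type registry, can stand in for per-type check functions.

```python
from collections import defaultdict
from typing import Any, Iterator

# Hypothetical registry: entity type -> fields whose value must agree
# across all active entities that describe the same thing.
CONFLICT_SLOT_FIELDS: dict[str, tuple[str, ...]] = {
    "component": ("material",),
    "requirement": ("threshold",),
}

SlotKey = tuple[str, str, str, str]


def slot_values(entity: dict[str, Any]) -> Iterator[tuple[SlotKey, Any]]:
    """Yield (slot key, value) pairs for every registered field present."""
    for field in CONFLICT_SLOT_FIELDS.get(entity["type"], ()):
        if field in entity["properties"]:
            key = (entity["project"], entity["type"], entity["name"], field)
            yield key, entity["properties"][field]


def detect_conflicts(entities: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Report each slot that holds more than one distinct value."""
    by_slot: dict[SlotKey, dict[Any, list[str]]] = defaultdict(dict)
    for entity in entities:
        for key, value in slot_values(entity):
            by_slot[key].setdefault(value, []).append(entity["id"])
    conflicts = []
    for key, values in by_slot.items():
        if len(values) > 1:  # two or more competing values in one slot
            members = sorted(eid for ids in values.values() for eid in ids)
            conflicts.append({"slot": key, "members": members})
    return conflicts
```

Under this shape, supporting a new entity type means adding a registry entry rather than writing another `_check_*` function, which is the kind of detector-body swap the plan describes.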
-### Total (revised)
+### Total (revised after Codex 2026-04-22 audit)

-- Estimated **12–17 days of focused work** across seven phases — up from
-  the original 11–14 days to reflect V1-0 overhead and Codex's objection
-  that the first estimate was too tight.
-- Adds roughly **65 tests** (533 → ~600).
+- Phase budgets: V1-0 (3) + V1-A (1.5) + V1-B (2) + V1-C (2) + V1-D (3-4)
+  + V1-E (2) + V1-F (3) ≈ **16.5–17.5 days of focused work**. Revised down
+  slightly from the previous 12–17 estimate because V1-A scope shrank
+  (four pillar queries are mostly already implemented per Codex audit)
+  and V1-F F-5 work shrank (no schema migration needed).
+- Adds roughly **60 tests** (533 → ~593).
 - Branch strategy: one branch per phase (V1-0 → V1-F), each squash-merged to main after Codex review. Phases sequential because each builds on the previous. **V1-0 is a hard prerequisite for all later phases** —
@@ -464,49 +492,42 @@ following are **explicitly out of scope** for this plan:

 ---

-## Open questions for Codex
+## Open questions for Codex (post-second-round revision)

-1. **Is the parallel schedule with the Now list acceptable?** Claude's read
-   is that V1 work and Now items touch disjoint surfaces so they run in
-   parallel without conflict. Codex may see collisions Claude missed.
+Three of the original eight questions (F-1 field audit, F-2 per-query
+audit, F-5 schema divergence) were answered by Codex's 2026-04-22 audit
+and folded into the plan. Remaining open questions:

-2. **Phase V1-A query audit scope.** Claude listed Q-001, Q-002, Q-003,
-   Q-007, Q-010, Q-012, Q-014, Q-018, Q-019, Q-020 as likely gaps without
-   reading each query function end-to-end. Codex's per-file audit may find
-   more already done (or more missing).
+1. **Parallel schedule vs Now list.** The first-round review correctly
+   softened this from "fully parallel" to "less disjoint than claimed".
+   Is the revised collision table + pause-points section enough, or
+   should specific Now-list items gate specific V1 phases more strictly?

-3. **F-5 conflicts schema divergence.** The current code uses per-type
-   detectors (`_check_component_conflicts`, `_check_requirement_conflicts`)
-   whereas the spec wants a generic slot-keyed `conflicts` + `conflict_members`.
-   Is the existing schema *equivalent* (just implemented differently) or
-   *divergent* (needs migration)? This is a one-read decision for Codex.
+2. **F-7 graduation gap depth.** Still unaudited. `_graduation_prompt.py`
+   + `api_request_graduation` + DB schema need one Codex read to tell
+   us whether V1-E is a 2-day phase or a 4-day phase.

-4. **Should F-5 route rename (`/admin/conflicts/*` → `/conflicts/*`) be
-   breaking?** Spec route path differs from current. Proposal: add
-   `/conflicts/*` as aliases, keep `/admin/conflicts/*` for one release,
-   deprecate in V1 release notes, remove in V1.1.
+3. **Mirror determinism — where does `now` go?** The current mirror
+   footer has a live timestamp (`mirror.py:326`). Spec says
+   deterministic output, spec also shows a `Regenerated:` header with
+   timestamp (`human-mirror-rules.md:265`). Reconciliation proposal:
+   timestamp allowed in the header banner but must be an input
+   parameter so the golden-file test can pin it. Sound right?

-5. **Mirror determinism — where does `now` go?** The current mirror footer
-   has a live timestamp (line 326 of `mirror.py`). Spec says deterministic
-   output, spec also shows a `Regenerated:` header with timestamp (line
-   265 of `human-mirror-rules.md`). Reconciliation: timestamp is allowed
-   in the header banner but must be an input parameter so the golden-file
-   test can pin it. Sound right?
+4. **`project` field naming.** The spec writes `project_id`; the code
+   writes `project`. The spec says "fields equivalent to" so naming is
+   technically flexible. Proposal: V1-0 adds a doc note making this
+   explicit, no column rename.
Is this acceptable, or does Codex want
+   the rename for cleanliness?

-6. **F-7 graduation gap depth.** Without running the existing graduation
-   flow end-to-end against a real memory, Claude can't tell how close the
-   existing code is to F-7 spec. Codex's audit of
-   `_graduation_prompt.py` + `api_request_graduation` + DB schema would
-   close this question in one read.
+5. **Velocity calibration.** Revised 16.5–17.5 days total. Given Phase
+   7A took a week and Phase 7D fit in one session, is this a fair
+   estimate for a single-operator sprint, or should we build in buffer
+   for multi-phase context switches?

-7. **Estimated 11–14 days honest?** Given recent phase velocities (Phase
-   7A was a week, Phase 7D fit in a single session), 2–3 days per phase
-   across 6 phases may be light or heavy. Codex's calibration against
-   actual repo velocity would help.
-
-8. **After V1, the minions/queue mechanic we rejected returns as a
-   candidate V2 item.** Should we note it explicitly in V1 release notes
-   (D-3) as a future track, or leave it unnamed until V2 planning starts?
+6. **Minions/queue as V2 item in D-3.** Should we name it explicitly in
+   V1 release notes as a future track, or leave it unnamed until V2
+   planning starts?

 ---

@@ -515,7 +536,7 @@ following are **explicitly out of scope** for this plan:
 | Risk | Mitigation |
 |---|---|
 | V1 work slows the Now list | V1 pauses on any Now-list blocker. Codex veto on any V1 PR that touches memory extractor, retrieval ranking, or triage paths |
-| F-5 schema migration is bigger than estimated | If Codex audit shows material divergence, split V1-E into two phases (schema migration separate from provenance enforcement) |
+| F-5 detector generalization harder than estimated | Codex audit confirmed schema is already spec-compliant; only detector body + routes need work.
If detector generalization still slips, keep per-type detectors and document as a V1.1 cleanup (detection correctness is unaffected, only code organization) |
 | Mirror determinism regresses existing mirror output | Keep `/projects/{project_name}/mirror` alias returning the current shape; new `/mirror/{project}/overview` is the spec-compliant one. Deprecate old in V1 release notes |
 | Golden file churn as templates evolve | Standard workflow: updating a golden file is a normal part of template work, documented in V1-C commit message |
 | Backup drill on Dalidou is disruptive | Run against a clone of the Dalidou DB at a safe hour; no production drill required for V1 acceptance |