docs(planning): V1 Completion Plan — Codex sign-off (third round)

Codex's third-round audit closed the remaining five open questions with
concrete file:line resolutions, patched inline in the plan:

- F-7 (P1): graduation stack is partially built: graduated_to_entity_id
  at database.py:143-146, graduated memory status, promote preserves
  original at service.py:354-356, tests at test_engineering_v1_phase5.py.
  Gaps: missing direct POST /memory/{id}/graduate route; spec's
  knowledge -> Fact mismatches ontology (no fact type). Reconcile to
  parameter or similar. V1-E 2 days -> 3-4 days.
- Q-5 / V1-D (P2): renderer reads wall-clock in _footer at mirror.py:320.
  Fix is injecting regenerated timestamp + checksum as renderer inputs,
  sorting DB iteration, removing dict ordering deps. Render code must not
  call wall-clock directly.
- project vs project_id (P3): doc note only, no storage rename.
- Total estimate: 17.5-19.5 focused days (calendar buffer on top).
- Release notes must NOT canonize "Minions" as a V2 name. Use neutral
  "queued background processing / async workers" wording.

Sign-off from Codex: "with those edits, I'd sign off on the five
questions. The only non-architectural uncertainty left in the plan is
scheduling discipline against the current Now list; that does not block
V1-0 once the soak window and memory-density gate clear."

Plan frozen. V1-0 starts after pipeline soak (~2026-04-26) and the
100-active-memory density gate clear.

Co-Authored-By: Codex <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 14:24:43 -04:00
parent 44724c81ab
commit 9ab5b3c9d8
2 changed files with 51 additions and 39 deletions
--- a/docs/plans/engineering-v1-completion-plan.md
+++ b/docs/plans/engineering-v1-completion-plan.md
@@ -83,7 +83,7 @@ capability exists but does not yet match spec shape or coverage.
 | F-4 | Candidate review queue end-to-end (list/promote/reject/edit) | 🟡 partial for entities | Memory side shipped in Phase 9 Commit C. Entity side has `promote_entity`, `supersede_entity`, `invalidate_active_entity` but reject path and editable-before-promote may not match spec shape. Need to verify `GET /entities?status=candidate` returns spec shape |
 | F-5 | Conflict detector fires synchronously; `POST /conflicts/{id}/resolve` + dismiss | 🟡 partial (per Codex 2026-04-22 audit — schema present, detector+routes divergent) | **Schema is already spec-shaped**: `database.py:190` defines the generic `conflicts` + `conflict_members` tables per `conflict-model.md`; `conflicts.py:154` persists through them. **Divergences are in detection and API, not schema**: (1) `conflicts.py:36` dispatches per-type detectors only (`_check_component_conflicts`, `_check_requirement_conflicts`) — needs generalization to slot-key-driven detection; (2) routes live at `/admin/conflicts/*`, spec says `/conflicts/*` — needs alias + deprecation. **No schema migration needed** |
 | F-6 | Mirror: `/mirror/{project}/overview`, `/decisions`, `/subsystems/{id}`, `/regenerate`; files under `/srv/storage/atocore/data/mirror/`; disputed + curated markers; deterministic output | 🟡 partial | `mirror.py` has `generate_project_overview` with header/state/system/decisions/requirements/materials/vendors/memories/footer sections. API at `/projects/{project_name}/mirror` and `.html`. **Gaps**: no separate `/mirror/{project}/decisions` or `/mirror/{project}/subsystems/{id}` routes, no `POST /regenerate` endpoint, no debounced-async-on-write, no daily refresh, no `⚠ disputed` markers wired to conflicts, no `(curated)` override annotations verified, no golden-file test for determinism |
-| F-7 | Memory→entity graduation: `POST /memory/{id}/graduate` + `graduated` status + forward pointer + original preserved | 🟡 partial | `_graduation_prompt.py` exists; `api_request_graduation` + `api_graduation_status` + `api_graduation_stats` routes exist (routes.py:1573, 1607, 2065). Need to verify full flow against F-7 spec — original preserved? `graduated` status row added? forward pointer column present? |
+| F-7 | Memory→entity graduation: `POST /memory/{id}/graduate` + `graduated` status + forward pointer + original preserved | 🟡 partial (per Codex 2026-04-22 third-round audit) | `_graduation_prompt.py` exists; `scripts/graduate_memories.py` creates entity candidates from active memories; `database.py:143-146` adds `graduated_to_entity_id`; `memory.service` already has a `graduated` status; `service.py:354-356,389-451` preserves the original memory and marks it `graduated` with a forward pointer on entity promote; `tests/test_engineering_v1_phase5.py:67-90` covers that flow. **Gaps vs spec**: no direct `POST /memory/{id}/graduate` route yet (current surface is batch/admin-driven via `/admin/graduation/request`); no explicit acceptance tests yet for `adaptation→decision` and `project→requirement`; spec wording `knowledge→Fact` does not match the current ontology (there is no `fact` entity type in `service.py` / `_graduation_prompt.py`) and should be reconciled to an actual V1 type such as `parameter` or another ontology-defined entity. |
 | F-8 | Every active entity has `source_refs`; Q-017 returns ≥1 row for every active entity | 🟡 partial | `Entity.source_refs` field exists; Q-017 (`evidence_chain`) exists. **Gap**: is provenance enforced at write time (not NULL), or just encouraged? Per spec it must be mandatory |

 ### Quality (Q-1 through Q-6)
@@ -94,7 +94,7 @@ capability exists but does not yet match spec shape or coverage.
 | Q-2 | Each F criterion has happy-path + error-path test, <10s each, <30s total | 🟡 partial | 16 + 15 + 15 + 12 = 58 tests in engineering/queries/v1-phase5/patch files. Need to verify coverage of each F criterion one-for-one |
 | Q-3 | Conflict invariants enforced by tests (contradictory imports produce conflict, can't promote both, flag-never-block) | 🟡 partial | Tests likely exist in `test_engineering_v1_phase5.py` — verify explicit coverage of the three invariants |
 | Q-4 | Trust hierarchy enforced by tests (candidates never in context, active-only reinforcement, no auto-project-state writes) | 🟡 partial | Phase 9 Commit B covered the memory side; verify entity side has equivalent tests |
-| Q-5 | Mirror has golden-file test, deterministic output | ❌ missing | No golden file seen; mirror output includes `now` timestamp (line 326) which is non-deterministic — would fail Q-5 as written |
+| Q-5 | Mirror has golden-file test, deterministic output | ❌ missing | No golden file seen; mirror output reads wall-clock time inside `_footer()` (`mirror.py:320-327`). Determinism should come from injecting the regenerated timestamp/checksum as inputs to the renderer and pinning them in the golden-file test, not from calling `datetime.now()` inside render code |
 | Q-6 | Killer correctness queries pass against seeded real-ish data (5 seed cases per Q-006/Q-009/Q-011) | ❌ likely missing | No fixture file named for this seen. The three queries exist but there's no evidence of the single integration test described in Q-6 |

 ### Operational (O-1 through O-5)
@@ -296,9 +296,10 @@ They all sit on top of the same entity store and V1-0 invariants.
  `/mirror/{project}/decisions`, `/mirror/{project}/subsystems/{subsystem}`.
 - Add `POST /mirror/{project}/regenerate`.
 - Move generated files to `/srv/storage/atocore/data/mirror/{project}/`.
-- **Deterministic output:** stabilize the `now` timestamp (input
-  parameter pinned by golden tests), sort every iteration, remove
-  `dict` ordering dependencies.
+- **Deterministic output:** inject regenerated timestamp + checksum as
+  renderer inputs (pinned by golden tests), sort every iteration, and
+  remove `dict` / database ordering dependencies. The renderer should
+  not call wall-clock time directly.
 - `⚠ disputed` markers inline wherever an open conflict touches a
  rendered field (uses V1-0's F-5 hook output).
 - `(curated)` annotations where project_state overrides entity state.
@@ -322,20 +323,28 @@ lands after everything it depends on is stable.

 **Scope:**
 - Verify and close F-7 spec gaps:
-  - Original memory gets `status="graduated"` (new status).
-  - Forward-pointer column from graduated memory to entity candidate id.
-  - Promote-entity preserves original memory.
-  - Flow tested for `adaptation` → Decision, `project` → Requirement,
-    `knowledge` → Fact.
-- Minimal schema additions: one column + one new status value; additive
-  migration only.
+  - Add the missing direct `POST /memory/{id}/graduate` route, reusing the
+    same prompt/parser as the batch graduation path.
+  - Keep `/admin/graduation/request` as the bulk lane; direct route is the
+    per-memory acceptance surface.
+  - Preserve current behavior where promote marks source memories
+    `status="graduated"` and sets `graduated_to_entity_id`.
+  - Flow tested for `adaptation` → Decision and `project` → Requirement.
+  - Reconcile the spec's `knowledge` → Fact wording with the actual V1
+    ontology (no `fact` entity type exists today). Prefer doc alignment to
+    an existing typed entity such as `parameter`, rather than adding a vague
+    catch-all `Fact` type late in V1.
+- Schema is mostly already in place: `graduated` status exists in memory
+  service, `graduated_to_entity_id` column + index exist, and promote
+  preserves the original memory. Remaining work is route surface,
+  ontology/spec reconciliation, and targeted end-to-end tests.
 - **Q-4 full trust-hierarchy tests**: no auto-write to project_state
  from any promote path; active-only reinforcement for entities; etc.
  (The entity-candidates-excluded-from-context test shipped in V1-0.)

 **Acceptance:** F-7 ✅, Q-4 ✅.

-**Estimated size:** 2 days.
+**Estimated size:** 3–4 days.

 **Tests added:** ~8.

@@ -388,10 +397,9 @@ tests).
 ### Total (revised after Codex 2026-04-22 audit)

 - Phase budgets: V1-0 (3) + V1-A (1.5) + V1-B (2) + V1-C (2) + V1-D (3-4)
-  + V1-E (2) + V1-F (3) ≈ **16.5–17.5 days of focused work**. Revised down
-  slightly from the previous 12–17 estimate because V1-A scope shrank
-  (four pillar queries are mostly already implemented per Codex audit)
-  and V1-F F-5 work shrank (no schema migration needed).
+  + V1-E (3-4) + V1-F (3) ≈ **17.5–19.5 days of focused work**. This is a
+  realistic engineering-effort estimate, but a single-operator calendar
+  plan should still carry context-switch / soak / review buffer on top.
 - Adds roughly **60 tests** (533 → ~593).
 - Branch strategy: one branch per phase (V1-0 → V1-F), each squash-merged
  to main after Codex review. Phases sequential because each builds on
@@ -496,38 +504,39 @@ following are **explicitly out of scope** for this plan:

 Three of the original eight questions (F-1 field audit, F-2 per-query
 audit, F-5 schema divergence) were answered by Codex's 2026-04-22 audit
-and folded into the plan. Remaining open questions:
+and folded into the plan. One open question remains; the rest are now
+resolved in-plan:

 1. **Parallel schedule vs Now list.** The first-round review correctly
   softened this from "fully parallel" to "less disjoint than claimed".
   Is the revised collision table + pause-points section enough, or
   should specific Now-list items gate specific V1 phases more strictly?

-2. **F-7 graduation gap depth.** Still unaudited. `_graduation_prompt.py`
-   + `api_request_graduation` + DB schema need one Codex read to tell
-   us whether V1-E is a 2-day phase or a 4-day phase.
+2. **F-7 graduation gap depth.** Resolved by Codex audit. The schema and
+   preserve-original-memory hook are already in place, so V1-E is not a
+   greenfield build. But the direct `/memory/{id}/graduate` route and the
+   ontology/spec mismatch around `knowledge` → `Fact` are still open, so
+   V1-E is closer to **3–4 days** than 2.

-3. **Mirror determinism — where does `now` go?** The current mirror
-   footer has a live timestamp (`mirror.py:326`). Spec says
-   deterministic output, spec also shows a `Regenerated:` header with
-   timestamp (`human-mirror-rules.md:265`). Reconciliation proposal:
-   timestamp allowed in the header banner but must be an input
-   parameter so the golden-file test can pin it. Sound right?
+3. **Mirror determinism — where does `now` go?** Resolved. Keep the
+   regenerated timestamp in the rendered output if desired, but pass it
+   into the renderer as an input value. Golden-file tests pin that input;
+   render code must not read the clock directly.

-4. **`project` field naming.** The spec writes `project_id`; the code
-   writes `project`. The spec says "fields equivalent to" so naming is
-   technically flexible. Proposal: V1-0 adds a doc note making this
-   explicit, no column rename. Is this acceptable, or does Codex want
-   the rename for cleanliness?
+4. **`project` field naming.** Resolved. Keep the existing `project`
+   field; add the explicit doc note that it is the project identifier for
+   V1 acceptance purposes. No storage rename needed.

-5. **Velocity calibration.** Revised 16.5–17.5 days total. Given Phase
-   7A took a week and Phase 7D fit in one session, is this a fair
-   estimate for a single-operator sprint, or should we build in buffer
-   for multi-phase context switches?
+5. **Velocity calibration.** Resolved. **17.5–19.5 focused days** is a
+   fair engineering-effort estimate after the F-7 audit. For an actual
+   operator schedule, keep additional buffer for context switching, soak,
+   and review rounds.

-6. **Minions/queue as V2 item in D-3.** Should we name it explicitly in
-   V1 release notes as a future track, or leave it unnamed until V2
-   planning starts?
+6. **Minions/queue as V2 item in D-3.** Resolved. Do not name the
+   rejected "Minions" plan in V1 release notes. If D-3 includes a future
+   work section, refer to it neutrally as "queued background processing /
+   async workers" rather than canonizing a V2 codename before V2 is
+   designed.

 ---
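
The Q-5 resolution above (timestamp and checksum passed in as renderer inputs, sorted iteration, no clock reads in render code) can be illustrated with a minimal sketch. It is not the code in `mirror.py`: the function names `render_footer` and `render_overview`, the entity dict shape, and the footer wording are all hypothetical and only show the pattern a golden-file test could pin.

```python
from datetime import datetime, timezone


def render_footer(regenerated_at: datetime, checksum: str) -> str:
    # Hypothetical footer: both values arrive as arguments, so this
    # function never calls datetime.now() itself.
    return f"Regenerated: {regenerated_at.isoformat()} | checksum: {checksum}"


def render_overview(project, entities, regenerated_at, checksum):
    # Sort explicitly so output does not depend on database or dict order.
    # The (type, name) key is an assumption made for this sketch.
    rows = sorted(entities, key=lambda e: (e["type"], e["name"]))
    lines = [f"# {project}"]
    lines.extend(f"- {e['type']}: {e['name']}" for e in rows)
    lines.append("")
    lines.append(render_footer(regenerated_at, checksum))
    return "\n".join(lines) + "\n"


# A golden-file style check pins the injected inputs.
PINNED = datetime(2026, 4, 22, 12, 0, 0, tzinfo=timezone.utc)
entities = [
    {"type": "decision", "name": "B"},
    {"type": "component", "name": "A"},
]
first = render_overview("demo", entities, PINNED, "abc123")
second = render_overview("demo", list(reversed(entities)), PINNED, "abc123")
```

In this sketch only the caller (for example a regenerate endpoint) would read the clock and compute the checksum. The renderer stays a pure function of its arguments, so `first` and `second` are identical regardless of input order.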