Codex's third-round audit closed the remaining five open questions
with concrete file:line resolutions, patched inline in the plan:
- F-7 (P1): graduation stack is partially built — graduated_to_entity_id
at database.py:143-146, graduated memory status, promote preserves
original at service.py:354-356, tests at test_engineering_v1_phase5.py.
Gaps: missing direct POST /memory/{id}/graduate route; spec's
knowledge -> Fact mismatches ontology (no fact type). Reconcile to
parameter or similar. V1-E 2 days -> 3-4 days.
- Q-5 / V1-D (P2): renderer reads wall-clock in _footer at mirror.py:320.
Fix is injecting regenerated timestamp + checksum as renderer inputs,
sorting DB iteration, removing dict ordering deps. Render code must
not call wall-clock directly.
- project vs project_id (P3): doc note only, no storage rename.
- Total estimate: 17.5-19.5 focused days (calendar buffer on top).
- Release notes must NOT canonize "Minions" as a V2 name. Use neutral
"queued background processing / async workers" wording.
Sign-off from Codex: "with those edits, I'd sign off on the five
questions. The only non-architectural uncertainty left in the plan is
scheduling discipline against the current Now list; that does not
block V1-0 once the soak window and memory-density gate clear."
Plan frozen. V1-0 starts after pipeline soak (~2026-04-26) and the
100-active-memory density gate clear.
Co-Authored-By: Codex <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
618 lines
36 KiB
Markdown
# Engineering V1 Completion Plan

**Date:** 2026-04-22
**Author:** Claude (after reading the four V1 architecture docs + promotion-rules,
conflict-model, human-mirror-rules, tool-handoff-boundaries end-to-end)
**Status:** Draft, pending Codex review
**Replaces:** the rejected "Phase 8 Minions + typed edges" plan (see
`docs/decisions/2026-04-22-gbrain-plan-rejection.md`)

## Position

This is **not** a plan to start Engineering V1. It is a plan to **finish** V1.

**Against what criterion?** Each F/Q/O/D item in `engineering-v1-acceptance.md`
gets scored individually in the Gap audit table below with exact code/test/doc
references. No global percentage. The headline framing from the first draft
("50–70% built") is withdrawn: each criterion is either done or it isn't.

The relevant observation is narrower: the entity schema, the full
relationship type set, the 4-state lifecycle, basic CRUD, and most of the
killer-correctness query functions are already implemented in
`src/atocore/engineering/*.py` in the Windows working tree at
`C:\Users\antoi\ATOCore` (the canonical dev workspace, per
`CLAUDE.md`). The recent commits e147ab2, b94f9df, 081c058, 069d155, b1a3dd0
are V1 entity-layer work. **Codex auditors working in a different
workspace / branch should sync from the canonical dev tree before
per-file review**; see the "Workspace note" at the end of this doc.

The question this plan answers: given the current code state, in what
order should the remaining V1 acceptance criteria be closed so that
every phase builds on invariants the earlier phases already enforced?

## Corrected sequencing principle (post-Codex review 2026-04-22)

The first draft ordered phases F-1 → F-2 → F-3 → F-4 → F-5 → F-6 → F-7 → F-8
following the acceptance doc's suggested reading order. Codex rejected
that ordering. The correct dependency order, which this revision adopts, is:

1. **Write-time invariants come first.** Every later phase creates active
   entities. Provenance-at-write (F-8) and synchronous conflict-detection
   hooks (F-5 minimal) must be enforced **before** any phase that writes
   entities at scale (ingest, graduation, or broad query coverage that
   depends on the model being populated).
2. **Query closure sits on top of the schema + invariants**, not ahead of
   them. A minimal query slice that proves the model can land early. The
   full catalog closure waits until after the write paths are invariant-safe.
3. **Mirror is a derived consumer** of the entity layer, not a midstream
   milestone. It comes after the entity layer produces enforced, correct data.
4. **Graduation and full conflict-spec compliance** are finishing work that
   depends on everything above being stable.

The acceptance criteria are unchanged. Only the order of closing them changes.

## How this plan respects the rejected-plan lessons

- **No new predicates.** The V1 ontology in `engineering-ontology-v1.md:112-137`
  already defines 18 relationship types; `service.py:38-62` already implements
  them. Nothing added, nothing reshaped.
- **No new canonical boundary.** Typed entities + typed relationships with
  promotion-based candidate flow per `memory-vs-entities.md`. Not
  edges-over-wikilinks.
- **No leapfrog of `master-plan-status.md` Now list.** This plan is **in
  parallel** with (not ahead of) the Now items because V1 entity work is
  already happening alongside them. The sequencing section below is explicit.
- **Queue/worker infrastructure is explicitly out of scope.** The "flag it for
  later" note at the end of this doc is the only mention, per
  `engineering-v1-acceptance.md:378` negative list.

---

## Gap audit against `engineering-v1-acceptance.md`

Each criterion is marked ✅ done / 🟡 partial / ❌ missing. "Partial" means the
capability exists but does not yet match spec shape or coverage.

### Functional (F-1 through F-8)

| ID | Criterion | Status | Evidence |
|----|-----------|--------|----------|
| F-1 | 12 V1 entity types, 4 relationship families, shared header fields, 4-state lifecycle | 🟡 partial (per Codex 2026-04-22 audit) | `service.py:16-36` has 16 types (superset of V1's 12), `service.py:38-62` has 18 relationship types, `service.py:64` statuses, `Entity` dataclass at line 67. **Gaps vs `engineering-v1-acceptance.md:45`**: `extractor_version` missing from dataclass and `entities` table; `canonical_home` missing from dataclass and table; `project` field is the project identifier but is not named `project_id` as the spec uses. The spec says "fields equivalent to", so naming flexibility is allowed but needs an explicit doc note. Remediation lands in V1-0 |
| F-2 | All v1-required Q-001 through Q-020 implemented, with provenance where required | 🟡 partial (per Codex 2026-04-22 per-function audit) | **Ground truth from per-function read of `queries.py` + `routes.py:2092+`:** Q-001 partial (`system_map()` returns project-wide tree, not the catalog's subsystem-scoped `GET /entities/Subsystem/<id>?expand=contains` shape per `engineering-query-catalog.md:71`); Q-002 missing; Q-003 missing; Q-004 done (covered by `system_map()`); Q-005 done (`requirements_for()`); Q-006 done (`orphan_requirements()`); Q-007 missing; Q-008 done (`decisions_affecting()`); Q-009 done (`risky_decisions()`); Q-010 missing; Q-011 done (`unsupported_claims()`); Q-012 missing; Q-013 done (`recent_changes()`); Q-014 missing; Q-016 done (`impact_analysis()`); Q-017 done (`evidence_chain()`); Q-018 missing; Q-019 missing; Q-020 missing (mirror route in spec shape). **Net: of the 19 v1-required queries (Q-001 through Q-020 minus Q-015), 9 are done, 1 is partial (Q-001), and 9 are missing.** Q-015 is v1-stretch, out of scope |
| F-3 | `POST /ingest/kb-cad/export` and `POST /ingest/kb-fem/export` | ❌ missing | No `/ingest/kb-cad` or `/ingest/kb-fem` route in `api/routes.py`. No schema doc under `docs/architecture/` |
| F-4 | Candidate review queue end-to-end (list/promote/reject/edit) | 🟡 partial for entities | Memory side shipped in Phase 9 Commit C. Entity side has `promote_entity`, `supersede_entity`, `invalidate_active_entity`, but the reject path and editable-before-promote may not match spec shape. Need to verify `GET /entities?status=candidate` returns spec shape |
| F-5 | Conflict detector fires synchronously; `POST /conflicts/{id}/resolve` + dismiss | 🟡 partial (per Codex 2026-04-22 audit: schema present, detector + routes divergent) | **Schema is already spec-shaped**: `database.py:190` defines the generic `conflicts` + `conflict_members` tables per `conflict-model.md`; `conflicts.py:154` persists through them. **Divergences are in detection and API, not schema**: (1) `conflicts.py:36` dispatches per-type detectors only (`_check_component_conflicts`, `_check_requirement_conflicts`) and needs generalization to slot-key-driven detection; (2) routes live at `/admin/conflicts/*`, spec says `/conflicts/*`, so an alias + deprecation is needed. **No schema migration needed** |
| F-6 | Mirror: `/mirror/{project}/overview`, `/decisions`, `/subsystems/{id}`, `/regenerate`; files under `/srv/storage/atocore/data/mirror/`; disputed + curated markers; deterministic output | 🟡 partial | `mirror.py` has `generate_project_overview` with header/state/system/decisions/requirements/materials/vendors/memories/footer sections. API at `/projects/{project_name}/mirror` and `.html`. **Gaps**: no separate `/mirror/{project}/decisions` or `/mirror/{project}/subsystems/{id}` routes, no `POST /regenerate` endpoint, no debounced-async-on-write, no daily refresh, no `⚠ disputed` markers wired to conflicts, no `(curated)` override annotations verified, no golden-file test for determinism |
| F-7 | Memory→entity graduation: `POST /memory/{id}/graduate` + `graduated` status + forward pointer + original preserved | 🟡 partial (per Codex 2026-04-22 third-round audit) | `_graduation_prompt.py` exists; `scripts/graduate_memories.py` creates entity candidates from active memories; `database.py:143-146` adds `graduated_to_entity_id`; `memory.service` already has a `graduated` status; `service.py:354-356,389-451` preserves the original memory and marks it `graduated` with a forward pointer on entity promote; `tests/test_engineering_v1_phase5.py:67-90` covers that flow. **Gaps vs spec**: no direct `POST /memory/{id}/graduate` route yet (current surface is batch/admin-driven via `/admin/graduation/request`); no explicit acceptance tests yet for `adaptation→decision` and `project→requirement`; spec wording `knowledge→Fact` does not match the current ontology (there is no `fact` entity type in `service.py` / `_graduation_prompt.py`) and should be reconciled to an actual V1 type such as `parameter` or another ontology-defined entity |
| F-8 | Every active entity has `source_refs`; Q-017 returns ≥1 row for every active entity | 🟡 partial | `Entity.source_refs` field exists; Q-017 (`evidence_chain`) exists. **Gap**: is provenance enforced at write time (not NULL), or just encouraged? Per spec it must be mandatory |

### Quality (Q-1 through Q-6)

| ID | Criterion | Status | Evidence |
|----|-----------|--------|----------|
| Q-1 | All pre-V1 tests still pass | ✅ presumed | 533 tests passing per DEV-LEDGER line 12 |
| Q-2 | Each F criterion has happy-path + error-path test, <10s each, <30s total | 🟡 partial | 16 + 15 + 15 + 12 = 58 tests in engineering/queries/v1-phase5/patch files. Need to verify coverage of each F criterion one-for-one |
| Q-3 | Conflict invariants enforced by tests (contradictory imports produce conflict, can't promote both, flag-never-block) | 🟡 partial | Tests likely exist in `test_engineering_v1_phase5.py`; verify explicit coverage of the three invariants |
| Q-4 | Trust hierarchy enforced by tests (candidates never in context, active-only reinforcement, no auto-project-state writes) | 🟡 partial | Phase 9 Commit B covered the memory side; verify entity side has equivalent tests |
| Q-5 | Mirror has golden-file test, deterministic output | ❌ missing | No golden file seen; mirror output reads wall-clock time inside `_footer()` (`mirror.py:320-327`). Determinism should come from injecting the regenerated timestamp/checksum as inputs to the renderer and pinning them in the golden-file test, not from calling `datetime.now()` inside render code |
| Q-6 | Killer correctness queries pass against seeded real-ish data (5 seed cases per Q-006/Q-009/Q-011) | ❌ likely missing | No fixture file named for this seen. The three queries exist but there's no evidence of the single integration test described in Q-6 |

### Operational (O-1 through O-5)

| ID | Criterion | Status | Evidence |
|----|-----------|--------|----------|
| O-1 | Schema migration additive, idempotent, tested against fresh + prod-copy DB | 🟡 presumed | `_apply_migrations` pattern is in use per CLAUDE.md sessions; tables exist. Need one confirmation run against a Dalidou backup copy |
| O-2 | Backup includes new tables; full restore drill passes; post-restore Q-001 works | ❌ not done | No evidence a restore drill has been run on V1 entity state. `docs/backup-restore-procedure.md` exists but drill is an explicit V1 prerequisite |
| O-3 | Performance bounds: write <100ms p99, query <500ms p99 at 1000 entities, mirror <5s per project | 🟡 unmeasured | 35 entities in system; bounds unmeasured at scale. Spec says "sanity-checked, not benchmarked", so this is a one-off manual check |
| O-4 | No new manual ops burden | 🟡 partial | Mirror regen auto-triggers not wired yet (see F-6 gap); they must be wired for O-4 to pass |
| O-5 | Phase 9 reflection loop unchanged for identity/preference/episodic | ✅ presumed | `memory-vs-entities.md` says these three types don't interact with engineering layer. No recent change to memory extractor for these types |

### Documentation (D-1 through D-4)

| ID | Criterion | Status | Evidence |
|----|-----------|--------|----------|
| D-1 | 12 per-entity-type spec docs under `docs/architecture/entities/` | ❌ missing | No `docs/architecture/entities/` folder |
| D-2 | `kb-cad-export-schema.md` + `kb-fem-export-schema.md` | ❌ missing | No such files in `docs/architecture/` |
| D-3 | `docs/v1-release-notes.md` | ❌ missing | Not written yet (appropriately: it is written when V1 is done) |
| D-4 | `master-plan-status.md` + `current-state.md` updated with V1 completion | ❌ not yet | `master-plan-status.md:179` still has V1 under **Next** |

### Summary (revised per Codex 2026-04-22 per-file audit)

- **Functional:** 0/8 ✅, 7/8 🟡 partial (F-1 downgraded from ✅ because two header fields are missing; F-2 and F-4 through F-8 partial), 1/8 ❌ missing (F-3 ingest endpoints) → the entity layer shape is real but not yet spec-clean; write-time invariants come first, then everything builds on stable invariants
- **F-2 detail:** of the 19 v1-required queries (Q-001 through Q-020 minus stretch Q-015), 9 are done, 1 is partial (Q-001 needs a subsystem-scoped variant), and 9 are missing
- **F-5 detail:** generic `conflicts` + `conflict_members` schema already present (no migration needed); detector body + routes diverge from spec
- **Quality:** 1/6 ✅, 3/6 🟡 partial, 2/6 ❌ missing → golden file + killer-correctness integration test are the two clear gaps
- **Operational:** 0/5 ✅ verified (O-5 is presumed ✅ but not yet verified), 3/5 🟡, 1/5 ❌ → backup drill is the one hard blocker here
- **Documentation:** 0/4 ✅, 4/4 ❌ → all 4 docs need writing

---

## Proposed completion order (revised post-Codex review)

Seven phases instead of six. The new V1-0 establishes the write-time
invariants (provenance enforcement F-8 + synchronous conflict hooks F-5
minimal) that every later phase depends on. V1-A becomes a **minimal query
slice** that proves the model on one project, not a full catalog closure.
Full query catalog closure moves to V1-C. Full F-5 spec compliance (the
slot-key-driven generic detector over the existing
`conflicts`/`conflict_members` tables, plus the `/conflicts/*` routes) stays
in V1-F because that's the final shape, but the *minimal hooks* that fire
synchronously on writes land in V1-0.

Skipped by construction: F-1 core schema (already implemented) and O-5
(identity/preference/episodic don't touch the engineering layer).

### Phase V1-0: Write-time invariants (F-8 + F-5 minimal + F-1 audit)

**Scope:**
- **F-1 remediation (Codex audit 2026-04-22 already completed).** Add
  the two missing shared-header fields to the `Entity` dataclass
  (`service.py:67`) and the `entities` table schema:
  - `extractor_version TEXT`: semver-ish string carrying the extractor
    module version per `promotion-rules.md:268`. Backfill existing rows
    with `"0.0.0"` or `NULL` flagged as unknown. Every future
    write carries the current `EXTRACTOR_VERSION` constant.
  - `canonical_home TEXT`: which layer is canonical for this concept.
    For entities, value is always `"entity"`. For future graduation
    records it may be `"memory"` (frozen pointer). Backfill active
    rows with `"entity"`.
  - Additive migration via the existing `_apply_migrations` pattern,
    idempotent, safe on replay.
  - Add doc note in `engineering-ontology-v1.md` clarifying that the
    `project` field IS the `project_id` per spec. The "fields equivalent
    to" wording in the spec allows this, but make it explicit so
    future readers don't trip on the naming.
- **F-8 provenance enforcement.** Add a write-time invariant at
  `create_entity` and `promote_entity` requiring that `source_refs` is
  non-empty OR an explicit `hand_authored=True` flag is set (per
  `promotion-rules.md:253`). Backfill any existing active entities that
  fail the invariant: either attach provenance, flag as hand-authored,
  or invalidate. Every future active entity has provenance by
  enforcement, not by discipline.
- **F-5 minimal hooks.** Wire synchronous conflict detection into every
  active-entity write path (`create_entity` with status=active,
  `promote_entity`, `supersede_entity`). The *detector* can stay in its
  current per-type form (`_check_component_conflicts`,
  `_check_requirement_conflicts`); the *hook* must fire on every write.
  The generic slot-key-driven detector lands in V1-F; the hook shape must be
  generic enough that V1-F is a detector-body swap, not an API refactor.
- **Q-3 "flag never block" test.** The hook must return the conflict id in
  the response body but never 4xx-block the write. One test per write
  path demonstrating this.
- **Q-4 trust-hierarchy test for candidates.** One test: entity
  candidates never appear in `/context/build` output. (Full trust tests
  land in V1-E; this is the one that V1-0 can cover without graduation
  being ready.)

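The additive, replay-safe migration in the F-1 remediation bullet could look roughly like the sketch below. It assumes a SQLite backend (the plan does not name the engine), and `add_shared_header_columns` plus the tuple layout are hypothetical names, not the project's `_apply_migrations` code.

```python
import sqlite3

# (column, DDL type, backfill value, which existing rows to backfill)
# Backfill choices follow the scope text; everything else is illustrative.
NEW_ENTITY_COLUMNS = [
    ("extractor_version", "TEXT", "0.0.0", "1 = 1"),          # unknown version
    ("canonical_home", "TEXT", "entity", "status = 'active'"),
]


def add_shared_header_columns(conn: sqlite3.Connection) -> list:
    """Add any missing columns to `entities`; return the names actually added."""
    existing = {row[1] for row in conn.execute("PRAGMA table_info(entities)")}
    added = []
    for name, ddl_type, backfill, where in NEW_ENTITY_COLUMNS:
        if name in existing:
            continue  # already migrated, so a replay is a no-op
        conn.execute(f"ALTER TABLE entities ADD COLUMN {name} {ddl_type}")
        conn.execute(f"UPDATE entities SET {name} = ? WHERE {where}", (backfill,))
        added.append(name)
    conn.commit()
    return added


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (id TEXT PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO entities VALUES ('e1', 'active')")
first_run = add_shared_header_columns(conn)
second_run = add_shared_header_columns(conn)  # replay adds nothing
row = conn.execute(
    "SELECT extractor_version, canonical_home FROM entities WHERE id = 'e1'"
).fetchone()
```

Checking `PRAGMA table_info` before each `ALTER TABLE` is what makes the second run harmless; the real migration may track applied steps differently.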
**Acceptance:** F-1 ✅ (after `extractor_version` + `canonical_home`
land + doc note on `project` naming), F-8 ✅, F-5 hooks ✅, Q-3 ✅,
partial Q-4 ✅.

**Estimated size:** 3 days (two small schema additions + invariant
patches + hook wiring + tests; no audit overhead, since Codex already did
that part).

**Tests added:** ~10.

**Why first:** every later phase writes entities. Without F-8 + F-5
hooks, V1-A through V1-F can leak invalid state into the store that
must then be cleaned up.

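A minimal sketch of the two V1-0 write-time rules, assuming a single shared write helper: provenance is a hard precondition, while conflicts are reported in the result and never block the write. `write_active_entity`, `WriteResult`, and `ProvenanceError` are hypothetical names, not the signatures in `service.py`.

```python
from dataclasses import dataclass, field


class ProvenanceError(ValueError):
    """An active entity was about to be written without provenance."""


@dataclass
class WriteResult:
    entity_id: str
    conflict_ids: list = field(default_factory=list)  # reported, never blocking


def check_provenance(source_refs, hand_authored=False):
    # F-8: non-empty source_refs, or an explicit hand_authored flag.
    if not source_refs and not hand_authored:
        raise ProvenanceError("need source_refs or hand_authored=True")


def write_active_entity(entity_id, source_refs, detect_conflicts, hand_authored=False):
    """Hypothetical write path shared by create / promote / supersede."""
    check_provenance(source_refs, hand_authored)
    # ... persistence omitted in this sketch ...
    # The detector is injected, so its body can change without touching callers.
    return WriteResult(entity_id, list(detect_conflicts(entity_id)))


flagged = write_active_entity("cmp-1", ["doc:spec#p3"], lambda _id: ["conflict-7"])
authored = write_active_entity("cmp-2", [], lambda _id: [], hand_authored=True)
```

Passing the detector in as a callable is one way to keep the later V1-F change confined to the detector body; the actual hook shape is a design decision for V1-0.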
### Phase V1-A: Minimal query slice that proves the model (partial F-2 + Q-6)

**Scope:**
- Pick the **four pillar queries**: Q-001 (subsystem contents),
  Q-005 (component satisfies requirements), Q-006 (orphan requirements,
  killer correctness), Q-017 (evidence chain). These exercise structural +
  intent + killer-correctness + provenance.
- **Q-001 needs a shape fix**: Codex's audit confirms the existing
  `system_map()` returns a project-wide tree, not the spec's
  subsystem-scoped `GET /entities/Subsystem/<id>?expand=contains`.
  Add a subsystem-scoped variant (the existing project-wide route stays
  for Q-004). This is the only shape fix in V1-A; larger query additions
  move to V1-C.
- Q-005, Q-006, Q-017 are already implemented per Codex audit. V1-A
  verifies them against seeded data; no code changes expected.
- Seed p05-interferometer with Q-6 integration data (one satisfying
  Component + one orphan Requirement + one Decision on flagged
  Assumption + one supported ValidationClaim + one unsupported
  ValidationClaim).
- All three killer-correctness queries (Q-006, Q-009, Q-011) are
  **already implemented** per Codex audit. V1-A runs them as a single
  integration test against the seed data.

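To make the seed-data idea concrete, here is a small in-memory sketch of the Q-006 check (requirements that no component satisfies). The ids, the `satisfies` edge label, and `find_orphan_requirements` are placeholders for illustration; the project's own `orphan_requirements()` runs against the entity store and may differ.

```python
# Seed rows: one satisfied requirement and one orphan, as in the scope above.
entities = [
    {"id": "cmp-mount", "type": "component"},
    {"id": "req-stiffness", "type": "requirement"},  # satisfied by cmp-mount
    {"id": "req-thermal", "type": "requirement"},    # orphan: nothing satisfies it
]
relationships = [
    {"source": "cmp-mount", "type": "satisfies", "target": "req-stiffness"},
]


def find_orphan_requirements(entities, relationships):
    """Return ids of requirements with no incoming 'satisfies' edge, sorted."""
    satisfied = {r["target"] for r in relationships if r["type"] == "satisfies"}
    return sorted(
        e["id"]
        for e in entities
        if e["type"] == "requirement" and e["id"] not in satisfied
    )


orphans = find_orphan_requirements(entities, relationships)
```

An integration test in this style asserts on the exact set of expected ids, so both a missed orphan and a false positive fail loudly.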
**Acceptance:** Q-001 subsystem-scoped variant + Q-6 integration test.
Partial F-2 (the 9 missing queries land in V1-C, with Q-020 in V1-D).

**Estimated size:** 1.5 days (scope shrunk: most pillar queries already
work per Codex audit; only Q-001 shape fix + seed data + integration
test required).

**Tests added:** ~4.

**Why second:** proves the entity layer shape works end-to-end on real
data before we start bolting ingest, graduation, or mirror onto it. If
the four pillar queries don't work, stopping here is cheap.

### Phase V1-B: KB-CAD / KB-FEM ingest (F-3) + D-2 schema docs

**Scope:**
- Write `docs/architecture/kb-cad-export-schema.md` and
  `kb-fem-export-schema.md` (matches D-2).
- Implement `POST /ingest/kb-cad/export` and `POST /ingest/kb-fem/export`
  per `tool-handoff-boundaries.md` sketches. Validator + entity-candidate
  producer + provenance population **using the F-8 invariant from V1-0**.
- Hand-craft one real KB-CAD export for p05-interferometer and
  round-trip it: export → candidate queue → reviewer promotes → queryable
  via V1-A's four pillar queries.
- Tests: valid export → candidates created; invalid export → 400;
  duplicate re-export → no duplicate candidates; re-export with changed
  value → new candidate + conflict row (exercises V1-0's F-5 hook on a
  real workload).

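The four ingest test cases listed above can be illustrated with a toy in-memory queue. This is only a sketch: `CandidateQueue`, the item shape (`id`, `value`), and the keying by `(source, id)` are assumptions, and the real detector compares against active entities rather than earlier candidates.

```python
from dataclasses import dataclass, field


def validate_export(items):
    """Reject malformed exports up front (an API layer could map this to a 400)."""
    for item in items:
        if "id" not in item or "value" not in item:
            raise ValueError(f"export item missing id/value: {item!r}")


@dataclass
class CandidateQueue:
    candidates: dict = field(default_factory=dict)  # (source, id) -> latest value
    conflicts: list = field(default_factory=list)

    def ingest(self, source, items):
        """Return how many new candidates this export produced."""
        validate_export(items)
        created = 0
        for item in items:
            key = (source, item["id"])
            if key in self.candidates:
                if self.candidates[key] == item["value"]:
                    continue  # duplicate re-export: no new candidate
                # Changed value: record a conflict row, but still accept the write.
                self.conflicts.append(
                    {"key": key, "old": self.candidates[key], "new": item["value"]}
                )
            self.candidates[key] = item["value"]
            created += 1
        return created


queue = CandidateQueue()
export = [{"id": "mirror-mount", "value": {"mass_kg": 1.2}}]
first = queue.ingest("kb-cad", export)
repeat = queue.ingest("kb-cad", export)
changed = queue.ingest("kb-cad", [{"id": "mirror-mount", "value": {"mass_kg": 1.4}}])
```

The changed re-export both creates a candidate and records a conflict, which is the flag-never-block behavior the V1-0 hook is meant to guarantee.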
**Acceptance:** F-3 ✅, D-2 ✅.

**Estimated size:** 2 days.

**Tests added:** ~8.

**Why third:** ingest is the first real stress test of the V1-0
invariants. A re-import that creates a conflict must trigger the V1-0
hook; if it doesn't, V1-0 is incomplete and we catch it before going
further.

### Phase V1-C: Close the rest of the query catalog (remaining F-2)

**Scope:** close the 9 missing queries per Codex's audit. Already-done
queries (Q-004/Q-005/Q-006/Q-008/Q-009/Q-011/Q-013/Q-016/Q-017) are
verified but not rewritten.
- Q-002 (component → parents, inverse of CONTAINS)
- Q-003 (subsystem interfaces, Interface as simple string label)
- Q-007 (component → constraints via CONSTRAINED_BY)
- Q-010 (ValidationClaim → supporting results + AnalysisModel trace)
- Q-012 (conflicting results on same claim; exercises V1-0's F-5 hook)
- Q-014 (decision log ordered + superseded chain)
- Q-018 (`include=superseded` for supersession chains)
- Q-019 (Material → components, derived from Component.material field
  per `engineering-query-catalog.md:266`, no edge needed)
- Q-020 (project overview mirror route), deferred to V1-D where the
  mirror lands in full.

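Two of the listed queries are simple enough to sketch in memory: Q-002 walks CONTAINS edges backwards, and Q-019 is derived from a field rather than an edge. The tuple layout, the lowercase `contains` label, and both function names are illustrative assumptions, not the shapes in `queries.py`.

```python
# (parent, relation, child) triples and component rows; all ids are made up.
relationships = [
    ("sys-interferometer", "contains", "sub-optics"),
    ("sub-optics", "contains", "cmp-mirror"),
    ("sub-optics", "contains", "cmp-mount"),
]
components = [
    {"id": "cmp-mirror", "material": "zerodur"},
    {"id": "cmp-mount", "material": "invar"},
    {"id": "cmp-frame", "material": "invar"},
]


def parents_of(entity_id, relationships):
    """Q-002 style: follow 'contains' edges from child back to parent."""
    return sorted(
        src for src, rel, dst in relationships
        if rel == "contains" and dst == entity_id
    )


def components_using(material, components):
    """Q-019 style: read the component's material field, no edge lookup."""
    return sorted(c["id"] for c in components if c.get("material") == material)


mirror_parents = parents_of("cmp-mirror", relationships)
invar_parts = components_using("invar", components)
```

Sorting the results keeps the output stable across storage backends, which also helps the mirror's determinism requirement later.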
**Acceptance:** F-2 ✅ for 18 of the 19 v1-required queries; Q-020 closes in V1-D.

**Estimated size:** 2 days (eight new query functions + routes +
per-query happy-path tests).

**Tests added:** ~12.

**Why fourth:** with the model proven (V1-A) and ingest exercising the
write invariants (V1-B), filling in the remaining queries is mechanical.
They all sit on top of the same entity store and V1-0 invariants.

### Phase V1-D: Full Mirror surface (F-6) + determinism golden file (Q-5) + Q-020

**Scope:**
- Split the single `/projects/{project_name}/mirror` route into the three
  spec routes: `/mirror/{project}/overview` (= Q-020),
  `/mirror/{project}/decisions`, `/mirror/{project}/subsystems/{subsystem}`.
- Add `POST /mirror/{project}/regenerate`.
- Move generated files to `/srv/storage/atocore/data/mirror/{project}/`.
- **Deterministic output:** inject regenerated timestamp + checksum as
  renderer inputs (pinned by golden tests), sort every iteration, and
  remove `dict` / database ordering dependencies. The renderer should
  not call wall-clock time directly.
- `⚠ disputed` markers inline wherever an open conflict touches a
  rendered field (uses V1-0's F-5 hook output).
- `(curated)` annotations where project_state overrides entity state.
- Regeneration triggers: synchronous on regenerate, debounced async on
  entity write (30s window), daily scheduled refresh via existing
  nightly cron (one new cron line, not a new queue).
- `mirror_regeneration_failures` table.
- Golden-file test: fixture project state → render → bytes equal.

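The deterministic-output bullet can be sketched as a renderer that takes the timestamp and checksum as arguments and sorts everything it iterates. `render_overview`, `content_checksum`, and the decision row shape are hypothetical; the real `mirror.py` sections are much richer.

```python
import hashlib
import json


def content_checksum(decisions):
    """Checksum over a canonical (sorted) serialization of the inputs."""
    canonical = json.dumps(sorted(decisions, key=lambda d: d["id"]), sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def render_overview(project, decisions, regenerated_at, checksum):
    """Render markdown from its inputs only: no clock reads, no unordered loops."""
    lines = [f"# {project} overview", ""]
    for decision in sorted(decisions, key=lambda d: d["id"]):
        lines.append(f"- {decision['id']}: {decision['title']}")
    lines += ["", f"_Regenerated {regenerated_at}, checksum {checksum}_", ""]
    return "\n".join(lines)


stamp = "2026-04-22T00:00:00Z"  # pinned by the test instead of read from the clock
rows_a = [{"id": "D-2", "title": "Use invar"}, {"id": "D-1", "title": "Fix aperture"}]
rows_b = list(reversed(rows_a))  # same data, different arrival order
out_a = render_overview("p05-interferometer", rows_a, stamp, content_checksum(rows_a))
out_b = render_overview("p05-interferometer", rows_b, stamp, content_checksum(rows_b))
```

A golden-file test would compare `out_a.encode("utf-8")` against a committed fixture; here the two renders are compared with each other to show that input order no longer matters.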
**Acceptance:** F-6 ✅, Q-5 ✅, Q-020 ✅, O-4 moves toward ✅.

**Estimated size:** 3–4 days.

**Tests added:** ~15.

**Why fifth:** mirror is a derived consumer. It cannot be correct
before the entity store + queries + conflict hooks are correct. It
lands after everything it depends on is stable.

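For the debounced regeneration trigger in the V1-D scope (30s window after entity writes), one possible shape is a small tracker with an injected clock, which keeps the logic testable without sleeping. `RegenDebouncer` and its methods are hypothetical; how the due projects get scheduled (thread, cron tick, request hook) is left open.

```python
class RegenDebouncer:
    """Track last-write time per project; a project is due once writes go quiet."""

    def __init__(self, window_seconds=30.0):
        self.window = window_seconds
        self._last_write = {}

    def note_write(self, project, now):
        # Each write restarts the quiet window for that project.
        self._last_write[project] = now

    def pop_due(self, now):
        """Return projects whose last write is at least one window old, once."""
        due = sorted(
            p for p, t in self._last_write.items() if now - t >= self.window
        )
        for project in due:
            del self._last_write[project]
        return due


debouncer = RegenDebouncer(window_seconds=30.0)
debouncer.note_write("p05-interferometer", now=0.0)
debouncer.note_write("p05-interferometer", now=20.0)  # restarts the window
early = debouncer.pop_due(now=40.0)   # only 20s since the last write
later = debouncer.pop_due(now=50.0)   # 30s of quiet: regenerate once
again = debouncer.pop_due(now=60.0)   # already popped, nothing pending
```

Because the clock is passed in, the same object works with `time.monotonic()` in production and fixed numbers in tests.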
### Phase V1-E: Memory→entity graduation end-to-end (F-7) + remaining Q-4

**Scope:**
- Verify and close F-7 spec gaps:
  - Add the missing direct `POST /memory/{id}/graduate` route, reusing the
    same prompt/parser as the batch graduation path.
  - Keep `/admin/graduation/request` as the bulk lane; direct route is the
    per-memory acceptance surface.
  - Preserve current behavior where promote marks source memories
    `status="graduated"` and sets `graduated_to_entity_id`.
  - Flow tested for `adaptation` → Decision and `project` → Requirement.
  - Reconcile the spec's `knowledge` → Fact wording with the actual V1
    ontology (no `fact` entity type exists today). Prefer doc alignment to
    an existing typed entity such as `parameter`, rather than adding a vague
    catch-all `Fact` type late in V1.
- Schema is mostly already in place: `graduated` status exists in memory
  service, `graduated_to_entity_id` column + index exist, and promote
  preserves the original memory. Remaining work is route surface,
  ontology/spec reconciliation, and targeted end-to-end tests.
- **Q-4 full trust-hierarchy tests**: no auto-write to project_state
  from any promote path; active-only reinforcement for entities; etc.
  (The entity-candidates-excluded-from-context test shipped in V1-0.)

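The behavior to preserve (original memory kept, status set to `graduated`, forward pointer recorded) can be stated as a tiny sketch. The `Memory` dataclass and `graduate` helper here are illustrative stand-ins, and the active-only guard is an assumption based on the batch script operating on active memories.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Memory:
    id: str
    content: str
    status: str = "active"
    graduated_to_entity_id: Optional[str] = None


def graduate(memory: Memory, entity_id: str) -> Memory:
    """Mark a memory as graduated; its content is kept, only status/pointer change."""
    if memory.status != "active":  # assumption: only active memories graduate
        raise ValueError(f"cannot graduate memory in status {memory.status!r}")
    memory.status = "graduated"
    memory.graduated_to_entity_id = entity_id
    return memory


note = Memory(id="mem-42", content="Use invar for the mirror mount")
graduate(note, entity_id="ent-decision-7")
```

An end-to-end test for the direct route would assert the same three facts after the HTTP call: content unchanged, status `graduated`, pointer set.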
**Acceptance:** F-7 ✅, Q-4 ✅.

**Estimated size:** 3–4 days.

**Tests added:** ~8.

**Why sixth:** graduation touches memory-layer semantics (uses the
`graduated` status, flows memory→entity, requires memory-module changes).
Doing it after the entity layer is fully invariant-safe + query-complete
+ mirror-derived means the memory side only has to deal with one shape:
a stable, tested entity layer.

### Phase V1-F: Full F-5 spec compliance + O-1/O-2/O-3 + D-1/D-3/D-4

**Scope:**
- **F-5 full spec compliance** (Codex 2026-04-22 audit already confirmed
  the gap shape: schema is spec-compliant, divergence is in detector +
  routes only).
  - **Detector generalization.** Replace the per-type dispatch at
    `conflicts.py:36` (`_check_component_conflicts`,
    `_check_requirement_conflicts`) with a slot-key-driven generic
    detector that reads the per-entity-type conflict slot from a
    registry and queries the already-generic `conflicts` +
    `conflict_members` tables. The V1-0 hook shape was chosen to make
    this a detector-body swap, not an API change.
  - **Route alignment.** Add `/conflicts/*` routes as the canonical
    surface per `conflict-model.md:187`. Keep `/admin/conflicts/*` as
    aliases for one release, deprecate in D-3 release notes, remove
    in V1.1.
  - **No schema migration needed** (the tables at `database.py:190`
    already match the spec).
- **O-1:** Run the full migration against a Dalidou backup copy.
  Confirm additive, idempotent, safe to run twice.
- **O-2:** Run a full restore drill on the test project per
  `docs/backup-restore-procedure.md`. Post-restore, Q-001 returns
  correct shape. `POST /admin/backup` snapshot includes the new tables.
- **O-3:** Manual sanity-check of the three performance bounds.
- **D-1:** Write 12 short spec docs under `docs/architecture/entities/`
  (one per V1 entity type).
- **D-3:** Write `docs/v1-release-notes.md`.
- **D-4:** Update `master-plan-status.md` and `current-state.md`:
  move engineering V1 from **Next** to **What Is Real Today**.

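The detector generalization can be pictured as a registry that maps each entity type to the fields forming its conflict slot, with one generic comparison replacing the per-type functions. The registry contents, the `value` field, and both function names are assumptions for illustration; the real slot definitions come from `conflict-model.md`.

```python
# Hypothetical registry: entity type -> fields that identify "the same slot".
CONFLICT_SLOTS = {
    "component": ("project", "name"),
    "requirement": ("project", "name"),
}


def slot_key(entity):
    """Build the slot key for an entity, or None if its type has no slot."""
    fields = CONFLICT_SLOTS.get(entity["type"])
    if fields is None:
        return None
    return (entity["type"],) + tuple(entity[f] for f in fields)


def detect_conflicts(new_entity, active_entities):
    """Return ids of active entities in the same slot holding a different value."""
    key = slot_key(new_entity)
    if key is None:
        return []  # unregistered types never conflict in this sketch
    return sorted(
        e["id"]
        for e in active_entities
        if e["id"] != new_entity["id"]
        and slot_key(e) == key
        and e["value"] != new_entity["value"]
    )


active = [
    {"id": "e1", "type": "component", "project": "p05", "name": "mount", "value": "invar"},
    {"id": "e2", "type": "component", "project": "p05", "name": "mirror", "value": "zerodur"},
]
incoming = {"id": "e3", "type": "component", "project": "p05", "name": "mount", "value": "aluminum"}
conflicts = detect_conflicts(incoming, active)
```

Adding a new entity type then means adding one registry entry rather than writing another `_check_*_conflicts` function.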
**Acceptance:** F-5 ✅, O-1 ✅, O-2 ✅, O-3 ✅, D-1 ✅, D-3 ✅, D-4 ✅ →
**V1 is done.**

**Estimated size:** 3 days (F-5 detector generalization is the main unknown;
D-1 entity docs at ~30 min each ≈ 6 hours; verification is fast).

**Tests added:** ~6 (F-5 spec-shape tests; verification adds no automated
tests).

### Total (revised after Codex 2026-04-22 audit)

- Phase budgets: V1-0 (3) + V1-A (1.5) + V1-B (2) + V1-C (2) + V1-D (3-4)
  + V1-E (3-4) + V1-F (3) ≈ **17.5–19.5 days of focused work**. This is a
  realistic engineering-effort estimate, but a single-operator calendar
  plan should still carry context-switch / soak / review buffer on top.
- Adds roughly **60 tests** (533 → ~593).
- Branch strategy: one branch per phase (V1-0 → V1-F), each squash-merged
  to main after Codex review. Phases are sequential because each builds on
  the previous. **V1-0 is a hard prerequisite for all later phases**:
  nothing starts until V1-0 lands.

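As a quick arithmetic check of the totals above, the per-phase budgets and per-phase test counts from this plan can be summed directly. The variable names are just for this check.

```python
# (low, high) day estimates per phase, copied from the phase sections.
phase_days = {
    "V1-0": (3, 3), "V1-A": (1.5, 1.5), "V1-B": (2, 2), "V1-C": (2, 2),
    "V1-D": (3, 4), "V1-E": (3, 4), "V1-F": (3, 3),
}
low_days = sum(low for low, _ in phase_days.values())     # 17.5
high_days = sum(high for _, high in phase_days.values())  # 19.5

# Approximate "tests added" per phase, V1-0 through V1-F.
tests_added = sum([10, 4, 8, 12, 15, 8, 6])  # 63, i.e. "roughly 60"
```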
---

## Sequencing with the `master-plan-status.md` Now list

The **Now** list from master-plan-status.md:159-169 is:

1. Observe the enhanced pipeline (1 week soak; first F4 confidence decay
   run was 2026-04-19 per Trusted State, so soak window ends ~2026-04-26)
2. Knowledge density: batch extract over 234 interactions, target 100+
   active memories (currently 84)
3. Multi-model triage (Phase 11 entry)
4. Fix p04-constraints harness failure

**Principle (revised per Codex review):** V1 work and the Now list are
**less disjoint than the first draft claimed**. Real collision points:

| V1 phase | Collides with Now list at |
|---|---|
| V1-0 provenance enforcement | memory extractor write path if it shares helper functions; context assembly for the Q-4 partial trust test |
| V1-0 F-5 hooks | any write path that creates active rows (limited collision; entity writes are separate from memory writes) |
| V1-B KB-CAD/FEM ingest | none on the Now list, but adds an ingest surface that becomes operational burden (ties to O-4 "no new manual ops") |
| V1-D mirror regen triggers | scheduling / ops behavior that intersects with the "boring and dependable" gate; mirror regen failures become an observable that the pipeline soak must accommodate |
| V1-E graduation | memory module (`graduated` status, memory→entity flow); direct collision with memory extractor + triage |
| V1-F F-5 detector generalization | `conflicts.py` touches the write path shared with memory promotion |

**Recommended schedule (revised):**

- **This week (2026-04-22 to 2026-04-26):** Pipeline soak continues.
  Density batch-extract continues. V1 work **waits**: V1-0 would start
  touching write paths, which is explicitly something we should not do
  during a soak window. Density target (100+ active memories) and the
  pipeline soak complete first.
- **Week of 2026-04-27:** If soak is clean and density reached, V1-0
  starts. V1-0 is a hard prerequisite and cannot be skipped or parallelized.
- **Weeks of 2026-05-04 and 2026-05-11:** V1-A through V1-D in order.
  Multi-model triage work (Now list item 3) continues in parallel only
  if its touch-surface is triage-path-only (memory side). Any memory
  extractor change pauses V1-E.
- **Week of 2026-05-18 approx:** V1-E (graduation). **This phase must
  not run in parallel with memory extractor changes**, because it directly
  modifies memory module semantics. Multi-model triage should be settled
  before V1-E starts.
- **Week of 2026-05-25:** V1-F.
- **End date target:** ~2026-06-01, two weeks later than the first
  draft's 2026-05-18 soft target. The shift is deliberate: the first
  draft's "parallel / disjoint" claim understated the real collisions.

**Pause points (explicit):**
|
||
|
||
- Any Now-list item that regresses the pipeline → V1 pauses immediately.
|
||
- Memory extractor changes in flight → V1-E pauses until they land and
|
||
soak.
|
||
- p04-constraints fix requires retrieval ranking changes → V1 does not
|
||
pause (retrieval is genuinely disjoint from entities).
|
||
- Multi-model triage work touching the entity extractor path (if one
|
||
gets prototyped) → V1-0 pauses until the triage decision settles.
|
||
|
||
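
The V1-0 start gate from the schedule above could be expressed as a small check. This is a minimal sketch, not existing AtoCore code: `GateInputs`, `v1_0_may_start`, and `DENSITY_TARGET` are hypothetical names, and the soak-end date and active-memory count are assumed to be supplied by whatever already tracks Trusted State and the memory store. It treats the gate as cleared only when the soak window has fully elapsed without regressions and the active-memory count has reached the 100 target; either condition alone is not enough.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical constant mirroring the "target 100+ active memories" gate.
DENSITY_TARGET = 100


@dataclass(frozen=True)
class GateInputs:
    """Hypothetical bundle of the facts the schedule depends on."""
    today: date
    soak_end: date        # last day of the soak window (~2026-04-26 in the plan)
    soak_clean: bool      # True if no pipeline regression was seen during soak
    active_memories: int  # 84 at the time the plan was written


def v1_0_may_start(g: GateInputs) -> bool:
    """Return True only when both the soak gate and the density gate clear."""
    soak_done = g.today > g.soak_end and g.soak_clean
    dense_enough = g.active_memories >= DENSITY_TARGET
    return soak_done and dense_enough


if __name__ == "__main__":
    during_soak = GateInputs(date(2026, 4, 24), date(2026, 4, 26), True, 84)
    after_soak = GateInputs(date(2026, 4, 27), date(2026, 4, 26), True, 100)
    print(v1_0_may_start(during_soak), v1_0_may_start(after_soak))
```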
---
## Test project

Per `engineering-v1-acceptance.md:379`, the recommended test bed is
**p05-interferometer**: "the optical/structural domain has the cleanest
entity model". I agree. Every F-2, F-3, F-6 criterion asserts against this
project.

p06-polisher is the backup test bed if p05 turns out to have data gaps
(polisher suite is actively worked and has more content).

---
## What V1 completion does NOT include

Per the negative list in `engineering-v1-acceptance.md:351-373`, all of the
following are **explicitly out of scope** for this plan:

- LLM extractor for entities (rule-based is V1)
- Auto-promotion of candidates (human-only in V1)
- Write-back to KB-CAD / KB-FEM
- Multi-user auth
- Real-time UI (API + Mirror markdown only)
- Cross-project rollups
- Time-travel queries (Q-015 stays stretch)
- Nightly conflict sweep (synchronous only)
- Incremental Chroma snapshots
- Retention cleanup script
- Backup encryption
- Off-Dalidou backup target (already exists at clawdbot per ledger, but
  not a V1 criterion)
- **Async job queue / minions pattern** (the rejected plan's centerpiece;
  explicitly deferred to post-V1 per the negative list)

---
## Open questions for Codex (post-second-round revision)

Three of the original eight questions (F-1 field audit, F-2 per-query
audit, F-5 schema divergence) were answered by Codex's 2026-04-22 audit
and folded into the plan. One open question remains; the rest are now
resolved in-plan:

1. **Parallel schedule vs Now list.** The first-round review correctly
   softened this from "fully parallel" to "less disjoint than claimed".
   Is the revised collision table + pause-points section enough, or
   should specific Now-list items gate specific V1 phases more strictly?

2. **F-7 graduation gap depth.** Resolved by Codex audit. The schema and
   preserve-original-memory hook are already in place, so V1-E is not a
   greenfield build. But the direct `/memory/{id}/graduate` route and the
   ontology/spec mismatch around `knowledge` → `Fact` are still open, so
   V1-E is closer to **3–4 days** than 2.

3. **Mirror determinism: where does `now` go?** Resolved. Keep the
   regenerated timestamp in the rendered output if desired, but pass it
   into the renderer as an input value. Golden-file tests pin that input;
   render code must not read the clock directly.

4. **`project` field naming.** Resolved. Keep the existing `project`
   field; add the explicit doc note that it is the project identifier for
   V1 acceptance purposes. No storage rename needed.

5. **Velocity calibration.** Resolved. **17.5–19.5 focused days** is a
   fair engineering-effort estimate after the F-7 audit. For an actual
   operator schedule, keep additional buffer for context switching, soak,
   and review rounds.

6. **Minions/queue as V2 item in D-3.** Resolved. Do not name the
   rejected "Minions" plan in V1 release notes. If D-3 includes a future
   work section, refer to it neutrally as "queued background processing /
   async workers" rather than canonizing a V2 codename before V2 is
   designed.

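
The resolution of question 3 can be illustrated with a renderer that receives its timestamp as an argument instead of reading the clock. This is a sketch under stated assumptions, not the actual `mirror.py` code: `render_overview`, its parameters, and the footer wording are hypothetical, and the entity mapping stands in for whatever the real renderer reads from the database. The two points it tries to show are that the caller owns `regenerated_at`, so a golden-file test can pin it, and that iteration is sorted so output does not depend on insertion order.

```python
from datetime import datetime, timezone
from typing import Mapping


def render_overview(
    project: str,
    entities: Mapping[str, str],
    regenerated_at: datetime,
) -> str:
    """Render a project overview from explicit inputs only.

    Hypothetical sketch: the function never calls datetime.now(); the
    caller decides what timestamp appears in the footer.
    """
    lines = [f"# {project}", ""]
    # Sort keys so the output is independent of dict insertion order.
    for name in sorted(entities):
        lines.append(f"- **{name}**: {entities[name]}")
    lines.append("")
    lines.append(f"_Regenerated {regenerated_at.isoformat()}_")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # A golden-file test would pin this value rather than use the clock.
    pinned = datetime(2026, 4, 22, 12, 0, tzinfo=timezone.utc)
    first = render_overview("p05-interferometer", {"b": "2", "a": "1"}, pinned)
    second = render_overview("p05-interferometer", {"a": "1", "b": "2"}, pinned)
    print(first == second)
```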
---
## Risks

| Risk | Mitigation |
|---|---|
| V1 work slows the Now list | V1 pauses on any Now-list blocker. Codex veto on any V1 PR that touches memory extractor, retrieval ranking, or triage paths |
| F-5 detector generalization harder than estimated | Codex audit confirmed schema is already spec-compliant; only detector body + routes need work. If detector generalization still slips, keep per-type detectors and document as a V1.1 cleanup (detection correctness is unaffected, only code organization) |
| Mirror determinism regresses existing mirror output | Keep `/projects/{project_name}/mirror` alias returning the current shape; new `/mirror/{project}/overview` is the spec-compliant one. Deprecate old in V1 release notes |
| Golden file churn as templates evolve | Standard workflow: updating a golden file is a normal part of template work, documented in V1-C commit message |
| Backup drill on Dalidou is disruptive | Run against a clone of the Dalidou DB at a safe hour; no production drill required for V1 acceptance |
| p05-interferometer data gaps | Fall back to p06-polisher per this plan's test-project section |
| Scope creep during V1-A query audit | Any query that isn't in the v1-required set (Q-021 onward) is out of scope, period |

---
## What this plan is **for**

1. A checklist Claude can follow to close V1.
2. A review target for Codex: every phase has explicit acceptance
   criteria tied to the acceptance doc.
3. A communication artifact for Antoine: "here's what's left, here's why,
   here's the order, here's the risk."

## What this plan is **not**

1. Not a commitment to start tomorrow. Pipeline soak and the density
   target come first, and V1-0 starts only once both clear (see the
   recommended schedule).
2. Not a rewrite. Every phase builds on existing code.
3. Not an ontology debate. The ontology is fixed in
   `engineering-ontology-v1.md`. Any desire to change it is V2 material.

## Workspace note (for Codex audit)

Codex's first-round review (2026-04-22) flagged that
`docs/plans/engineering-v1-completion-plan.md` and `DEV-LEDGER.md` were
**not visible** in the Playground workspace they were running against,
and that `src/atocore/engineering/` appeared empty there.

The canonical dev workspace for AtoCore is the Windows path
`C:\Users\antoi\ATOCore` (per `CLAUDE.md`). The engineering layer code
(`src/atocore/engineering/service.py`, `queries.py`, `conflicts.py`,
`mirror.py`, `_graduation_prompt.py`, `wiki.py`, `triage_ui.py`) exists
there and is what the recent commits (e147ab2, b94f9df, 081c058, 069d155,
b1a3dd0) touched. The Windows working tree is what this plan was written
against.

Before the file-level audit:

1. Confirm which branch / SHA Codex is reviewing. The Windows working
   tree has uncommitted changes to this plan + DEV-LEDGER as of
   2026-04-22; commit will be made only after Antoine approves sync.
2. If Codex is reviewing `ATOCore-clean` or a Playground snapshot, that
   tree may lag the canonical dev tree. Sync or re-clone from the
   Windows working tree / current `origin/main` before per-file audit.
3. The three visible-to-Codex file paths for this plan are:
   - `docs/plans/engineering-v1-completion-plan.md` (this file)
   - `docs/decisions/2026-04-22-gbrain-plan-rejection.md` (prior decision)
   - `DEV-LEDGER.md` (Recent Decisions + Session Log entries 2026-04-22)

## References

- `docs/architecture/engineering-ontology-v1.md`
- `docs/architecture/engineering-query-catalog.md`
- `docs/architecture/memory-vs-entities.md`
- `docs/architecture/engineering-v1-acceptance.md`
- `docs/architecture/promotion-rules.md`
- `docs/architecture/conflict-model.md`
- `docs/architecture/human-mirror-rules.md`
- `docs/architecture/tool-handoff-boundaries.md`
- `docs/master-plan-status.md` (Now/Next/Later list)
- `docs/decisions/2026-04-22-gbrain-plan-rejection.md` (the rejected plan)
- `src/atocore/engineering/service.py` (current V1 entity service)
- `src/atocore/engineering/queries.py` (current V1 query implementations)
- `src/atocore/engineering/conflicts.py` (current conflicts module)
- `src/atocore/engineering/mirror.py` (current mirror module)