fix(engineering): V1-0 gap closures per Codex review

Codex audit of cbf9e03 surfaced two P1 gaps + one P2 scope concern, all verified with code-level probes. Patches below. P1: promote_entity did not re-check F-8 at status flip. Legacy candidates with source_refs='[]' and hand_authored=0 can exist from before V1-0 enforcement. promote_entity now raises ValueError before flipping status so no F-8 violation can slip into the active store through the promote path. Row stays candidate on rejection. Symmetric error shape with the create side. P1: supersede_entity was missing the F-5 hook. Plan calls for synchronous conflict detection on every active-entity write path. Supersede creates a `supersedes` relationship rooted at the `superseded_by` entity, which can produce a conflict the detector should catch. Added detect_conflicts_for_entity(superseded_by) call with fail-open per conflict-model.md:256. P2: backfill script --invalidate-instead was too broad. Query included both active AND superseded rows; invalidating superseded rows collapses audit history that V1-0 remediation never intended to touch. Now --invalidate-instead scopes to status='active' only. Default hand_authored-flag mode stays broad since it's additive/non-destructive. Help text made the destructive posture explicit. Four new regression tests in test_v1_0_write_invariants.py: - test_promote_rejects_legacy_candidate_without_provenance - test_promote_accepts_candidate_flagged_hand_authored - test_supersede_runs_conflict_detection_on_new_active - test_supersede_hook_fails_open Test count: 543 -> 547 (+4). Full suite green in 81.07s. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 14:49:42 -04:00
parent cbf9e03ab9
commit f16cd5272f
4 changed files with 180 additions and 4 deletions
--- a/DEV-LEDGER.md
+++ b/DEV-LEDGER.md
@@ -164,6 +164,8 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha

 ## Session Log

+- **2026-04-22 Claude (V1-0 patches per Codex review)** Codex audit of commit `cbf9e03` surfaced two P1 gaps + one P2 scope concern, all verified with code-level probes. **P1 #1**: `promote_entity` didn't re-check the F-8 invariant — a legacy candidate with empty `source_refs` and `hand_authored=0` could still promote to active, violating the plan's "invariant at both `create_entity` and `promote_entity`". Fixed: `promote_entity` at `service.py:365-379` now raises `ValueError("source_refs required: cannot promote a candidate with no provenance...")` before flipping status. Stays symmetric with the create-side error. **P1 #2**: `supersede_entity` was missing the F-5 hook the plan requires on every active-entity write path. The `supersedes` relationship rooted at the `superseded_by` entity can create a conflict the detector should catch. Fixed at `service.py:581-591`: calls `detect_conflicts_for_entity(superseded_by)` with fail-open per Q-3. **P2**: backfill script's `--invalidate-instead` flag queried both active AND superseded rows; invalidating already-superseded rows would collapse history. Fixed at `scripts/v1_0_backfill_provenance.py:52-63`: `--invalidate-instead` now scopes to `status='active'` only (default flag-hand_authored mode stays broad as it's additive/non-destructive). Help text tightened to make the destructive posture explicit. **Four new regression tests** in `test_v1_0_write_invariants.py`: (1) `test_promote_rejects_legacy_candidate_without_provenance` — directly inserts a legacy candidate and confirms promote raises + row stays candidate; (2) `test_promote_accepts_candidate_flagged_hand_authored` — symmetry check; (3) `test_supersede_runs_conflict_detection_on_new_active` — monkeypatches detector, confirms hook fires on `superseded_by`; (4) `test_supersede_hook_fails_open` — Q-3 check for supersede path. **Test count**: 543 → 547 (+4 regression). Full suite `547 passed in 81.07s`. Next: commit patches on branch, push, Codex re-review.
+
 - **2026-04-22 Claude (V1-0 landed on branch)** First V1 completion phase done on branch `claude/v1-0-write-invariants`. **F-1 schema remediation**: added `extractor_version`, `canonical_home`, `hand_authored` columns to `entities` via idempotent ALTERs in both `_apply_migrations` (`database.py:148-170`) and `init_engineering_schema` (`service.py:95-139`). CREATE TABLE also updated so fresh DBs get the columns natively. New `_table_exists` helper at `database.py:378`. `Entity` dataclass gains the three fields with sensible defaults. `EXTRACTOR_VERSION = "v1.0.0"` module constant at top of `service.py`. `_row_to_entity` tolerates rows without the new columns so tests predating V1-0 still pass. **F-8 provenance enforcement**: `create_entity` raises `ValueError("source_refs required: ...")` when called without non-empty source_refs AND without `hand_authored=True`. New kwargs `hand_authored: bool = False` and `extractor_version: str | None = None` threaded through `service.create_entity`, the `EntityCreateRequest` Pydantic model, the API route, and the wiki `/wiki/new` form body (form writes `hand_authored: true` since human entries are hand-authored by definition). **F-5 hook on active create**: `create_entity(status="active")` now calls `detect_conflicts_for_entity` with fail-open per `conflict-model.md:256` (errors log warning, write still succeeds). The promote path's existing hook at `service.py:400-404` was kept as-is. **Doc note** added to `engineering-ontology-v1.md` recording that `project` IS the `project_id` per "fields equivalent to" wording. **Backfill script** at `scripts/v1_0_backfill_provenance.py` — idempotent, defaults to flagging no-provenance active entities as `hand_authored=1`, supports `--dry-run` and `--invalidate-instead`. **Tests**: 10 new in `tests/test_v1_0_write_invariants.py` covering F-1 fields, F-8 raise path, F-8 hand_authored bypass, F-5 active-create hook, F-5 candidate-no-hook, Q-3 fail-open on detector error, Q-4 partial (scope_only=active excludes candidates). **Test fixes**: three pre-existing tests adapted — `test_requirement_name_conflict_detected` + `test_conflict_resolution_dismiss_leaves_entities_alone` now read from `list_open_conflicts` because the V1-0 hook records the conflict at create-time (detector dedup returns [] on re-run); `test_api_post_entity_with_null_project_stores_global` sends `hand_authored: true` since the fixture has no source_refs. **conftest.py monkeypatch**: wraps `create_entity` so tests missing both source_refs and hand_authored default to `hand_authored=True` (reasonable since tests author their own fixture data). Production paths (API route, wiki form, graduation scripts) all pass explicit values and are unaffected by the monkeypatch. **Test count**: 533 → 543 (+10), full suite `543 passed in 77.86s`. **Not yet**: commit + push + Codex review + deploy. **Branch**: `claude/v1-0-write-invariants`.

 - **2026-04-22 Codex (late night)** Third-round audit closed the remaining five open questions. Patched `docs/plans/engineering-v1-completion-plan.md` inline (no commit by Codex). **F-7 finding (P1):** graduation stack is partially built — `_graduation_prompt.py`, `scripts/graduate_memories.py`, `database.py:143-146` (`graduated_to_entity_id`), memory `graduated` status, promote-preserves-original at `service.py:354-356,389-451`, tests at `test_engineering_v1_phase5.py:67-90` all exist. Real gaps: no direct `POST /memory/{id}/graduate` route at `routes.py:756`; spec's `knowledge→Fact` doesn't match ontology (`service.py:16` has no `fact` type — reconcile to `parameter` or similar). V1-E estimate 2 → 3–4 days. **Q-5 finding (P2):** "stabilize timestamp" insufficient — renderer reads wall-clock in `_footer()` at `mirror.py:320`; fix is inject regenerated timestamp + checksum as renderer inputs + sort DB iteration + remove dict ordering deps. V1-D scope patched. **Remaining three (P3):** `project` stays as doc-note equivalence (no rename); total estimate 17.5–19.5 focused days; release notes must NOT canonize "Minions" — neutral "queued background processing / async workers" only. **Sign-off:** "with those edits, I'd sign off on the five questions. The only non-architectural uncertainty left in the plan is scheduling discipline against the current Now list; that does not block V1-0 once the soak window and memory-density gate clear." **Status:** Claude + Codex agreed. Plan frozen pending Antoine final accept and gate clearance. Claude to commit Codex's patches + push.
--- a/scripts/v1_0_backfill_provenance.py
+++ b/scripts/v1_0_backfill_provenance.py
@@ -51,10 +51,20 @@ def run(db_path: Path, dry_run: bool, invalidate_instead: bool) -> int:
        )
        return 2

+    # Scope differs by mode:
+    # - Default (flag hand_authored=1): safe/additive, applies to active
+    #   AND superseded rows so the historical trail is consistent.
+    # - --invalidate-instead: destructive — scope to ACTIVE rows only.
+    #   Invalidating already-superseded history would collapse the audit
+    #   trail, which the plan's remediation scope never intended
+    #   (V1-0 talks about existing active no-provenance entities).
+    if invalidate_instead:
+        scope_sql = "status = 'active'"
+    else:
+        scope_sql = "status IN ('active', 'superseded')"
    rows = conn.execute(
-        "SELECT id, entity_type, name, project, status, source_refs, hand_authored "
-        "FROM entities WHERE status IN ('active', 'superseded') "
-        "AND hand_authored = 0"
+        f"SELECT id, entity_type, name, project, status, source_refs, hand_authored "
+        f"FROM entities WHERE {scope_sql} AND hand_authored = 0"
    ).fetchall()

    needs_fix = []
@@ -141,7 +151,13 @@ def main() -> int:
    parser.add_argument(
        "--invalidate-instead",
        action="store_true",
-        help="Invalidate no-provenance rows instead of flagging hand_authored",
+        help=(
+            "DESTRUCTIVE. Invalidate active rows with no provenance instead "
+            "of flagging them hand_authored. Scoped to status='active' only "
+            "(superseded rows are left alone to preserve audit history). "
+            "Not recommended for first run — start with --dry-run, then "
+            "the default hand_authored flag path."
+        ),
    )
    args = parser.parse_args()
    return run(args.db, args.dry_run, args.invalidate_instead)
--- a/src/atocore/engineering/service.py
+++ b/src/atocore/engineering/service.py
@@ -424,6 +424,20 @@ def promote_entity(
    if entity is None or entity.status != "candidate":
        return False

+    # V1-0 (F-8 provenance re-check at promote). The invariant must hold at
+    # BOTH create_entity AND promote_entity per the plan, because candidate
+    # rows can exist in the DB from before V1-0 (no enforcement at their
+    # create time) or can be inserted by code paths that bypass the service
+    # layer. Block any candidate with empty source_refs that is NOT flagged
+    # hand_authored from ever becoming active. Same error shape as the
+    # create-side check for symmetry.
+    if not (entity.source_refs or []) and not entity.hand_authored:
+        raise ValueError(
+            "source_refs required: cannot promote a candidate with no "
+            "provenance. Attach source_refs via PATCH /entities/{id}, "
+            "or flag hand_authored=true before promoting."
+        )
+
    if target_project is not None:
        new_project = (
            resolve_project_name(target_project) if target_project else ""
@@ -566,6 +580,22 @@ def supersede_entity(
                superseded_by=superseded_by,
                error=str(e),
            )
+
+        # V1-0 (F-5 hook on supersede, per plan's "every active-entity
+        # write path"). Supersede demotes `entity_id` AND adds a
+        # `supersedes` relationship rooted at the already-active
+        # `superseded_by`. That new edge can create a conflict the
+        # detector should catch synchronously. Fail-open per
+        # conflict-model.md:256.
+        try:
+            from atocore.engineering.conflicts import detect_conflicts_for_entity
+            detect_conflicts_for_entity(superseded_by)
+        except Exception as e:
+            log.warning(
+                "conflict_detection_failed",
+                entity_id=superseded_by,
+                error=str(e),
+            )
    return True


--- a/tests/test_v1_0_write_invariants.py
+++ b/tests/test_v1_0_write_invariants.py
@@ -22,9 +22,12 @@ import pytest
 from atocore.engineering.service import (
    EXTRACTOR_VERSION,
    create_entity,
+    create_relationship,
    get_entities,
    get_entity,
    init_engineering_schema,
+    promote_entity,
+    supersede_entity,
 )
 from atocore.models.database import get_connection, init_db

@@ -122,6 +125,66 @@ def test_create_entity_with_empty_source_refs_list_is_treated_as_missing(tmp_dat
        )


+def test_promote_rejects_legacy_candidate_without_provenance(tmp_data_dir):
+    """Regression (Codex V1-0 probe): candidate rows can exist in the DB
+    from before V1-0 enforcement (or from paths that bypass create_entity).
+    promote_entity must re-check the invariant and refuse to flip a
+    no-provenance candidate to active. Without this check, the active
+    store can leak F-8 violations in from legacy data."""
+    init_db()
+    init_engineering_schema()
+
+    # Simulate a pre-V1-0 candidate by inserting directly into the table,
+    # bypassing the service-layer invariant. Real legacy rows look exactly
+    # like this: empty source_refs, hand_authored=0.
+    import uuid as _uuid
+    entity_id = str(_uuid.uuid4())
+    with get_connection() as conn:
+        conn.execute(
+            "INSERT INTO entities (id, entity_type, name, project, "
+            "description, properties, status, confidence, source_refs, "
+            "extractor_version, canonical_home, hand_authored, "
+            "created_at, updated_at) "
+            "VALUES (?, 'component', 'Legacy Orphan', 'p04-gigabit', "
+            "'', '{}', 'candidate', 1.0, '[]', '', 'entity', 0, "
+            "CURRENT_TIMESTAMP, CURRENT_TIMESTAMP)",
+            (entity_id,),
+        )
+
+    with pytest.raises(ValueError, match="source_refs required"):
+        promote_entity(entity_id)
+
+    # And the row stays a candidate — no half-transition.
+    got = get_entity(entity_id)
+    assert got is not None
+    assert got.status == "candidate"
+
+
+def test_promote_accepts_candidate_flagged_hand_authored(tmp_data_dir):
+    """The other side of the promote re-check: hand_authored=1 with
+    empty source_refs still lets promote succeed, matching
+    create_entity's symmetry."""
+    init_db()
+    init_engineering_schema()
+
+    import uuid as _uuid
+    entity_id = str(_uuid.uuid4())
+    with get_connection() as conn:
+        conn.execute(
+            "INSERT INTO entities (id, entity_type, name, project, "
+            "description, properties, status, confidence, source_refs, "
+            "extractor_version, canonical_home, hand_authored, "
+            "created_at, updated_at) "
+            "VALUES (?, 'component', 'Hand Authored Candidate', "
+            "'p04-gigabit', '', '{}', 'candidate', 1.0, '[]', '', "
+            "'entity', 1, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP)",
+            (entity_id,),
+        )
+
+    assert promote_entity(entity_id) is True
+    assert get_entity(entity_id).status == "active"
+
+
 # ---------- F-5: synchronous conflict-detection hook ----------


@@ -150,6 +213,71 @@ def test_active_create_runs_conflict_detection_hook(tmp_data_dir, monkeypatch):
    assert called_with == [e.id]


+def test_supersede_runs_conflict_detection_on_new_active(tmp_data_dir, monkeypatch):
+    """Regression (Codex V1-0 probe): per plan's 'every active-entity
+    write path', supersede_entity must trigger synchronous conflict
+    detection. The subject is the `superseded_by` entity — the one
+    whose graph state just changed because a new `supersedes` edge was
+    rooted at it."""
+    init_db()
+    init_engineering_schema()
+
+    old = create_entity(
+        entity_type="component",
+        name="Old Pad",
+        project="p04-gigabit",
+        source_refs=["test:old"],
+        status="active",
+    )
+    new = create_entity(
+        entity_type="component",
+        name="New Pad",
+        project="p04-gigabit",
+        source_refs=["test:new"],
+        status="active",
+    )
+
+    called_with: list[str] = []
+
+    def _fake_detect(entity_id: str):
+        called_with.append(entity_id)
+        return []
+
+    import atocore.engineering.conflicts as conflicts_mod
+    monkeypatch.setattr(conflicts_mod, "detect_conflicts_for_entity", _fake_detect)
+
+    assert supersede_entity(old.id, superseded_by=new.id) is True
+
+    # The detector fires on the `superseded_by` entity — the one whose
+    # edges just grew a new `supersedes` relationship.
+    assert new.id in called_with
+
+
+def test_supersede_hook_fails_open(tmp_data_dir, monkeypatch):
+    """Supersede must survive a broken detector per Q-3 flag-never-block."""
+    init_db()
+    init_engineering_schema()
+
+    old = create_entity(
+        entity_type="component", name="Old2", project="p04-gigabit",
+        source_refs=["test:old"], status="active",
+    )
+    new = create_entity(
+        entity_type="component", name="New2", project="p04-gigabit",
+        source_refs=["test:new"], status="active",
+    )
+
+    def _boom(entity_id: str):
+        raise RuntimeError("synthetic detector failure")
+
+    import atocore.engineering.conflicts as conflicts_mod
+    monkeypatch.setattr(conflicts_mod, "detect_conflicts_for_entity", _boom)
+
+    # The supersede still succeeds despite the detector blowing up.
+    assert supersede_entity(old.id, superseded_by=new.id) is True
+    assert get_entity(old.id).status == "superseded"
+
+
 def test_candidate_create_does_not_run_conflict_hook(tmp_data_dir, monkeypatch):
    """status=candidate writes do NOT trigger detection — the hook is
    for active rows only, per V1-0 scope. Candidates are checked at