fix(engineering): V1-0 gap closures per Codex review

Codex audit of cbf9e03 surfaced two P1 gaps + one P2 scope concern,
all verified with code-level probes. Patches below.

P1: promote_entity did not re-check F-8 at status flip.
Legacy candidates with source_refs='[]' and hand_authored=0 can
exist from before V1-0 enforcement. promote_entity now raises
ValueError before flipping status so no F-8 violation can slip
into the active store through the promote path. Row stays
candidate on rejection. Symmetric error shape with the create
side.

P1: supersede_entity was missing the F-5 hook.
Plan calls for synchronous conflict detection on every
active-entity write path. Supersede creates a `supersedes`
relationship rooted at the `superseded_by` entity, which can
produce a conflict the detector should catch. Added
detect_conflicts_for_entity(superseded_by) call with fail-open
per conflict-model.md:256.

P2: backfill script --invalidate-instead was too broad.
Query included both active AND superseded rows; invalidating
superseded rows collapses audit history that V1-0 remediation
never intended to touch. Now --invalidate-instead scopes to
status='active' only. Default hand_authored-flag mode stays
broad since it's additive/non-destructive. Help text made the
destructive posture explicit.

Four new regression tests in test_v1_0_write_invariants.py:
- test_promote_rejects_legacy_candidate_without_provenance
- test_promote_accepts_candidate_flagged_hand_authored
- test_supersede_runs_conflict_detection_on_new_active
- test_supersede_hook_fails_open

Test count: 543 -> 547 (+4). Full suite green in 81.07s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-04-22 14:49:42 -04:00
parent cbf9e03ab9
commit f16cd5272f
4 changed files with 180 additions and 4 deletions

View File

@@ -164,6 +164,8 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
## Session Log
- **2026-04-22 Claude (V1-0 patches per Codex review)** Codex audit of commit `cbf9e03` surfaced two P1 gaps + one P2 scope concern, all verified with code-level probes. **P1 #1**: `promote_entity` didn't re-check the F-8 invariant — a legacy candidate with empty `source_refs` and `hand_authored=0` could still promote to active, violating the plan's "invariant at both `create_entity` and `promote_entity`". Fixed: `promote_entity` at `service.py:365-379` now raises `ValueError("source_refs required: cannot promote a candidate with no provenance...")` before flipping status. Stays symmetric with the create-side error. **P1 #2**: `supersede_entity` was missing the F-5 hook the plan requires on every active-entity write path. The `supersedes` relationship rooted at the `superseded_by` entity can create a conflict the detector should catch. Fixed at `service.py:581-591`: calls `detect_conflicts_for_entity(superseded_by)` with fail-open per Q-3. **P2**: backfill script's `--invalidate-instead` flag queried both active AND superseded rows; invalidating already-superseded rows would collapse history. Fixed at `scripts/v1_0_backfill_provenance.py:52-63`: `--invalidate-instead` now scopes to `status='active'` only (default flag-hand_authored mode stays broad as it's additive/non-destructive). Help text tightened to make the destructive posture explicit. **Four new regression tests** in `test_v1_0_write_invariants.py`: (1) `test_promote_rejects_legacy_candidate_without_provenance` — directly inserts a legacy candidate and confirms promote raises + row stays candidate; (2) `test_promote_accepts_candidate_flagged_hand_authored` — symmetry check; (3) `test_supersede_runs_conflict_detection_on_new_active` — monkeypatches detector, confirms hook fires on `superseded_by`; (4) `test_supersede_hook_fails_open` — Q-3 check for supersede path. **Test count**: 543 → 547 (+4 regression). Full suite `547 passed in 81.07s`. Next: commit patches on branch, push, Codex re-review.
- **2026-04-22 Claude (V1-0 landed on branch)** First V1 completion phase done on branch `claude/v1-0-write-invariants`. **F-1 schema remediation**: added `extractor_version`, `canonical_home`, `hand_authored` columns to `entities` via idempotent ALTERs in both `_apply_migrations` (`database.py:148-170`) and `init_engineering_schema` (`service.py:95-139`). CREATE TABLE also updated so fresh DBs get the columns natively. New `_table_exists` helper at `database.py:378`. `Entity` dataclass gains the three fields with sensible defaults. `EXTRACTOR_VERSION = "v1.0.0"` module constant at top of `service.py`. `_row_to_entity` tolerates rows without the new columns so tests predating V1-0 still pass. **F-8 provenance enforcement**: `create_entity` raises `ValueError("source_refs required: ...")` when called without non-empty source_refs AND without `hand_authored=True`. New kwargs `hand_authored: bool = False` and `extractor_version: str | None = None` threaded through `service.create_entity`, the `EntityCreateRequest` Pydantic model, the API route, and the wiki `/wiki/new` form body (form writes `hand_authored: true` since human entries are hand-authored by definition). **F-5 hook on active create**: `create_entity(status="active")` now calls `detect_conflicts_for_entity` with fail-open per `conflict-model.md:256` (errors log warning, write still succeeds). The promote path's existing hook at `service.py:400-404` was kept as-is. **Doc note** added to `engineering-ontology-v1.md` recording that `project` IS the `project_id` per "fields equivalent to" wording. **Backfill script** at `scripts/v1_0_backfill_provenance.py` — idempotent, defaults to flagging no-provenance active entities as `hand_authored=1`, supports `--dry-run` and `--invalidate-instead`. **Tests**: 10 new in `tests/test_v1_0_write_invariants.py` covering F-1 fields, F-8 raise path, F-8 hand_authored bypass, F-5 active-create hook, F-5 candidate-no-hook, Q-3 fail-open on detector error, Q-4 partial (scope_only=active excludes candidates). **Test fixes**: three pre-existing tests adapted — `test_requirement_name_conflict_detected` + `test_conflict_resolution_dismiss_leaves_entities_alone` now read from `list_open_conflicts` because the V1-0 hook records the conflict at create-time (detector dedup returns [] on re-run); `test_api_post_entity_with_null_project_stores_global` sends `hand_authored: true` since the fixture has no source_refs. **conftest.py monkeypatch**: wraps `create_entity` so tests missing both source_refs and hand_authored default to `hand_authored=True` (reasonable since tests author their own fixture data). Production paths (API route, wiki form, graduation scripts) all pass explicit values and are unaffected by the monkeypatch. **Test count**: 533 → 543 (+10), full suite `543 passed in 77.86s`. **Not yet**: commit + push + Codex review + deploy. **Branch**: `claude/v1-0-write-invariants`.
- **2026-04-22 Codex (late night)** Third-round audit closed the remaining five open questions. Patched `docs/plans/engineering-v1-completion-plan.md` inline (no commit by Codex). **F-7 finding (P1):** graduation stack is partially built — `_graduation_prompt.py`, `scripts/graduate_memories.py`, `database.py:143-146` (`graduated_to_entity_id`), memory `graduated` status, promote-preserves-original at `service.py:354-356,389-451`, tests at `test_engineering_v1_phase5.py:67-90` all exist. Real gaps: no direct `POST /memory/{id}/graduate` route at `routes.py:756`; spec's `knowledge→Fact` doesn't match ontology (`service.py:16` has no `fact` type — reconcile to `parameter` or similar). V1-E estimate 2 → 34 days. **Q-5 finding (P2):** "stabilize timestamp" insufficient — renderer reads wall-clock in `_footer()` at `mirror.py:320`; fix is inject regenerated timestamp + checksum as renderer inputs + sort DB iteration + remove dict ordering deps. V1-D scope patched. **Remaining three (P3):** `project` stays as doc-note equivalence (no rename); total estimate 17.519.5 focused days; release notes must NOT canonize "Minions" — neutral "queued background processing / async workers" only. **Sign-off:** "with those edits, I'd sign off on the five questions. The only non-architectural uncertainty left in the plan is scheduling discipline against the current Now list; that does not block V1-0 once the soak window and memory-density gate clear." **Status:** Claude + Codex agreed. Plan frozen pending Antoine final accept and gate clearance. Claude to commit Codex's patches + push.

View File

@@ -51,10 +51,20 @@ def run(db_path: Path, dry_run: bool, invalidate_instead: bool) -> int:
)
return 2
# Scope differs by mode:
# - Default (flag hand_authored=1): safe/additive, applies to active
# AND superseded rows so the historical trail is consistent.
# - --invalidate-instead: destructive — scope to ACTIVE rows only.
# Invalidating already-superseded history would collapse the audit
# trail, which the plan's remediation scope never intended
# (V1-0 talks about existing active no-provenance entities).
if invalidate_instead:
scope_sql = "status = 'active'"
else:
scope_sql = "status IN ('active', 'superseded')"
rows = conn.execute(
"SELECT id, entity_type, name, project, status, source_refs, hand_authored "
"FROM entities WHERE status IN ('active', 'superseded') "
"AND hand_authored = 0"
f"SELECT id, entity_type, name, project, status, source_refs, hand_authored "
f"FROM entities WHERE {scope_sql} AND hand_authored = 0"
).fetchall()
needs_fix = []
@@ -141,7 +151,13 @@ def main() -> int:
parser.add_argument(
"--invalidate-instead",
action="store_true",
help="Invalidate no-provenance rows instead of flagging hand_authored",
help=(
"DESTRUCTIVE. Invalidate active rows with no provenance instead "
"of flagging them hand_authored. Scoped to status='active' only "
"(superseded rows are left alone to preserve audit history). "
"Not recommended for first run — start with --dry-run, then "
"the default hand_authored flag path."
),
)
args = parser.parse_args()
return run(args.db, args.dry_run, args.invalidate_instead)

View File

@@ -424,6 +424,20 @@ def promote_entity(
if entity is None or entity.status != "candidate":
return False
# V1-0 (F-8 provenance re-check at promote). The invariant must hold at
# BOTH create_entity AND promote_entity per the plan, because candidate
# rows can exist in the DB from before V1-0 (no enforcement at their
# create time) or can be inserted by code paths that bypass the service
# layer. Block any candidate with empty source_refs that is NOT flagged
# hand_authored from ever becoming active. Same error shape as the
# create-side check for symmetry.
if not (entity.source_refs or []) and not entity.hand_authored:
raise ValueError(
"source_refs required: cannot promote a candidate with no "
"provenance. Attach source_refs via PATCH /entities/{id}, "
"or flag hand_authored=true before promoting."
)
if target_project is not None:
new_project = (
resolve_project_name(target_project) if target_project else ""
@@ -566,6 +580,22 @@ def supersede_entity(
superseded_by=superseded_by,
error=str(e),
)
# V1-0 (F-5 hook on supersede, per plan's "every active-entity
# write path"). Supersede demotes `entity_id` AND adds a
# `supersedes` relationship rooted at the already-active
# `superseded_by`. That new edge can create a conflict the
# detector should catch synchronously. Fail-open per
# conflict-model.md:256.
try:
from atocore.engineering.conflicts import detect_conflicts_for_entity
detect_conflicts_for_entity(superseded_by)
except Exception as e:
log.warning(
"conflict_detection_failed",
entity_id=superseded_by,
error=str(e),
)
return True

View File

@@ -22,9 +22,12 @@ import pytest
from atocore.engineering.service import (
EXTRACTOR_VERSION,
create_entity,
create_relationship,
get_entities,
get_entity,
init_engineering_schema,
promote_entity,
supersede_entity,
)
from atocore.models.database import get_connection, init_db
@@ -122,6 +125,66 @@ def test_create_entity_with_empty_source_refs_list_is_treated_as_missing(tmp_dat
)
def test_promote_rejects_legacy_candidate_without_provenance(tmp_data_dir):
"""Regression (Codex V1-0 probe): candidate rows can exist in the DB
from before V1-0 enforcement (or from paths that bypass create_entity).
promote_entity must re-check the invariant and refuse to flip a
no-provenance candidate to active. Without this check, the active
store can leak F-8 violations in from legacy data."""
init_db()
init_engineering_schema()
# Simulate a pre-V1-0 candidate by inserting directly into the table,
# bypassing the service-layer invariant. Real legacy rows look exactly
# like this: empty source_refs, hand_authored=0.
import uuid as _uuid
entity_id = str(_uuid.uuid4())
with get_connection() as conn:
conn.execute(
"INSERT INTO entities (id, entity_type, name, project, "
"description, properties, status, confidence, source_refs, "
"extractor_version, canonical_home, hand_authored, "
"created_at, updated_at) "
"VALUES (?, 'component', 'Legacy Orphan', 'p04-gigabit', "
"'', '{}', 'candidate', 1.0, '[]', '', 'entity', 0, "
"CURRENT_TIMESTAMP, CURRENT_TIMESTAMP)",
(entity_id,),
)
with pytest.raises(ValueError, match="source_refs required"):
promote_entity(entity_id)
# And the row stays a candidate — no half-transition.
got = get_entity(entity_id)
assert got is not None
assert got.status == "candidate"
def test_promote_accepts_candidate_flagged_hand_authored(tmp_data_dir):
"""The other side of the promote re-check: hand_authored=1 with
empty source_refs still lets promote succeed, matching
create_entity's symmetry."""
init_db()
init_engineering_schema()
import uuid as _uuid
entity_id = str(_uuid.uuid4())
with get_connection() as conn:
conn.execute(
"INSERT INTO entities (id, entity_type, name, project, "
"description, properties, status, confidence, source_refs, "
"extractor_version, canonical_home, hand_authored, "
"created_at, updated_at) "
"VALUES (?, 'component', 'Hand Authored Candidate', "
"'p04-gigabit', '', '{}', 'candidate', 1.0, '[]', '', "
"'entity', 1, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP)",
(entity_id,),
)
assert promote_entity(entity_id) is True
assert get_entity(entity_id).status == "active"
# ---------- F-5: synchronous conflict-detection hook ----------
@@ -150,6 +213,71 @@ def test_active_create_runs_conflict_detection_hook(tmp_data_dir, monkeypatch):
assert called_with == [e.id]
def test_supersede_runs_conflict_detection_on_new_active(tmp_data_dir, monkeypatch):
"""Regression (Codex V1-0 probe): per plan's 'every active-entity
write path', supersede_entity must trigger synchronous conflict
detection. The subject is the `superseded_by` entity — the one
whose graph state just changed because a new `supersedes` edge was
rooted at it."""
init_db()
init_engineering_schema()
old = create_entity(
entity_type="component",
name="Old Pad",
project="p04-gigabit",
source_refs=["test:old"],
status="active",
)
new = create_entity(
entity_type="component",
name="New Pad",
project="p04-gigabit",
source_refs=["test:new"],
status="active",
)
called_with: list[str] = []
def _fake_detect(entity_id: str):
called_with.append(entity_id)
return []
import atocore.engineering.conflicts as conflicts_mod
monkeypatch.setattr(conflicts_mod, "detect_conflicts_for_entity", _fake_detect)
assert supersede_entity(old.id, superseded_by=new.id) is True
# The detector fires on the `superseded_by` entity — the one whose
# edges just grew a new `supersedes` relationship.
assert new.id in called_with
def test_supersede_hook_fails_open(tmp_data_dir, monkeypatch):
"""Supersede must survive a broken detector per Q-3 flag-never-block."""
init_db()
init_engineering_schema()
old = create_entity(
entity_type="component", name="Old2", project="p04-gigabit",
source_refs=["test:old"], status="active",
)
new = create_entity(
entity_type="component", name="New2", project="p04-gigabit",
source_refs=["test:new"], status="active",
)
def _boom(entity_id: str):
raise RuntimeError("synthetic detector failure")
import atocore.engineering.conflicts as conflicts_mod
monkeypatch.setattr(conflicts_mod, "detect_conflicts_for_entity", _boom)
# The supersede still succeeds despite the detector blowing up.
assert supersede_entity(old.id, superseded_by=new.id) is True
assert get_entity(old.id).status == "superseded"
def test_candidate_create_does_not_run_conflict_hook(tmp_data_dir, monkeypatch):
"""status=candidate writes do NOT trigger detection — the hook is
for active rows only, per V1-0 scope. Candidates are checked at