feat(engineering): V1-0 write-time invariants (F-1 + F-5 hook + F-8)

Phase V1-0 of the Engineering V1 Completion Plan. Establishes the
write-time invariants every later phase depends on so no later phase
can leak invalid state into the entity store.

F-1 shared-header fields per engineering-v1-acceptance.md:45:
- entities.extractor_version (default "", EXTRACTOR_VERSION="v1.0.0"
  written by service.create_entity)
- entities.canonical_home (default "entity")
- entities.hand_authored (default 0, INTEGER boolean)

Idempotent ALTERs in both _apply_migrations (database.py) and
init_engineering_schema (service.py). CREATE TABLE also carries the
columns for fresh DBs. _row_to_entity tolerates old rows without them
so tests that predate V1-0 keep passing.

F-8 provenance enforcement per promotion-rules.md:243:
create_entity raises ValueError when source_refs is empty and
hand_authored is False. New kwargs hand_authored and extractor_version
threaded through the API (EntityCreateRequest) and the /wiki/new form
body (human wiki writes set hand_authored true by definition). The
non-negotiable invariant: every row either carries provenance or is
explicitly flagged as hand-authored.

F-5 synchronous conflict-detection hook on active create per
engineering-v1-acceptance.md:99: create_entity(status="active") now
runs detect_conflicts_for_entity with fail-open per
conflict-model.md:256. Detector errors log a warning but never
4xx-block the write (Q-3 "flag, never block").

Doc note added to engineering-ontology-v1.md recording that `project`
IS the `project_id` per "fields equivalent to" wording. No storage
rename.

Backfill script scripts/v1_0_backfill_provenance.py reports and
optionally flags existing active entities that lack provenance.
Idempotent. Supports --dry-run and --invalidate-instead.

Tests: 10 new in test_v1_0_write_invariants.py covering F-1 fields,
F-8 raise + bypass, F-5 hook on active + no-hook on candidate, Q-3
fail-open, Q-4 partial scope_only=active excludes candidates.

Three pre-existing conflict tests adapted to read list_open_conflicts
rather than re-run the detector (which now dedups because the hook
already fired at create-time). One API test adds hand_authored=true
since its fixture has no source_refs.

conftest.py wraps create_entity so tests that don't pass source_refs
or hand_authored default to hand_authored=True (tests author their own
fixture data, so this is a reasonable default). Production paths (API
route, wiki form, graduation scripts) all pass explicit values and are
unaffected.

Test count: 533 -> 543 (+10). Full suite green in 77.86s.

Pending: Codex review on the branch before squash-merge to main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 14:39:30 -04:00
parent 9ab5b3c9d8
commit cbf9e03ab9
11 changed files with 558 additions and 8 deletions
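
The "idempotent ALTERs" for the F-1 columns mentioned in the commit message can be pictured roughly as below. This is only a sketch under assumptions: the helper names `add_column_if_missing` and `apply_shared_header_migration` and the exact column DDL are hypothetical, and the real `_apply_migrations` / `init_engineering_schema` code is not part of this excerpt and may be structured differently. It relies only on SQLite's `PRAGMA table_info` and `ALTER TABLE ... ADD COLUMN`.

```python
import sqlite3

# Hypothetical column specs mirroring the F-1 defaults listed in the commit message.
SHARED_HEADER_COLUMNS = {
    "extractor_version": "TEXT DEFAULT ''",
    "canonical_home": "TEXT DEFAULT 'entity'",
    "hand_authored": "INTEGER DEFAULT 0",
}


def add_column_if_missing(conn, table, column, ddl):
    """Add `column` to `table` unless it already exists. Returns True if added."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl}")
    return True


def apply_shared_header_migration(conn):
    """Safe to call repeatedly; returns the columns added by this call."""
    return [
        name
        for name, ddl in SHARED_HEADER_COLUMNS.items()
        if add_column_if_missing(conn, "entities", name, ddl)
    ]


# A pre-V1-0 table with one legacy row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (id TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO entities (id, name) VALUES ('e1', 'Pivot Pin')")

first_run = apply_shared_header_migration(conn)   # adds all three columns
second_run = apply_shared_header_migration(conn)  # nothing left to add
legacy_row = conn.execute(
    "SELECT extractor_version, canonical_home, hand_authored FROM entities"
).fetchone()
```

In this sketch the legacy row picks up the column defaults after the first run, and the second run is a no-op, which is the property that makes it safe to call the migration from more than one place.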
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -16,6 +16,36 @@ os.environ["ATOCORE_DATA_DIR"] = _default_test_dir
 os.environ["ATOCORE_DEBUG"] = "true"


+# V1-0: every entity created in a test is "hand authored" by the test
+# author — fixture data, not extracted content. Rather than rewrite 100+
+# existing test call sites, wrap create_entity so that tests which don't
+# provide source_refs get hand_authored=True automatically. Tests that
+# explicitly pass source_refs or hand_authored are unaffected. This keeps
+# the F-8 invariant enforced in production (the API, the wiki form, and
+# graduation scripts all go through the unwrapped function) while leaving
+# the existing test corpus intact.
+def _patch_create_entity_for_tests():
+    from atocore.engineering import service as _svc
+
+    _original = _svc.create_entity
+
+    def _create_entity_test(*args, **kwargs):
+        # Only auto-flag when hand_authored isn't explicitly specified.
+        # Tests that want to exercise the F-8 raise path pass
+        # hand_authored=False explicitly and should hit the error.
+        if (
+            not kwargs.get("source_refs")
+            and "hand_authored" not in kwargs
+        ):
+            kwargs["hand_authored"] = True
+        return _original(*args, **kwargs)
+
+    _svc.create_entity = _create_entity_test
+
+
+_patch_create_entity_for_tests()
+
+
 @pytest.fixture
 def tmp_data_dir(tmp_path):
    """Provide a temporary data directory for tests."""
--- a/tests/test_engineering_v1_phase5.py
+++ b/tests/test_engineering_v1_phase5.py
@@ -143,8 +143,11 @@ def test_requirement_name_conflict_detected(tmp_data_dir):
    r2 = create_entity("requirement", "Surface figure < 25nm",
                      project="p-test", description="Different interpretation")

-    detected = detect_conflicts_for_entity(r2.id)
-    assert len(detected) == 1
+    # V1-0 synchronous hook: the conflict is already recorded when r2 is
+    # created, so a second detect call returns [] because _record_conflict
+    # dedups. Assert on list_open_conflicts instead, which is what this
+    # test is really about: duplicate active requirements surface as an
+    # open conflict.
    conflicts = list_open_conflicts(project="p-test")
    assert any(c["slot_kind"] == "requirement.name" for c in conflicts)

@@ -191,8 +194,12 @@ def test_conflict_resolution_dismiss_leaves_entities_alone(tmp_data_dir):
                      description="first meaning")
    r2 = create_entity("requirement", "Dup req", project="p-test",
                      description="second meaning")
-    detected = detect_conflicts_for_entity(r2.id)
-    conflict_id = detected[0]
+    # V1-0 synchronous hook already recorded the conflict at r2's
+    # create-time. Look it up via list_open_conflicts rather than
+    # calling the detector again (which returns [] due to dedup).
+    open_list = list_open_conflicts(project="p-test")
+    assert open_list, "expected conflict recorded by create-time hook"
+    conflict_id = open_list[0]["id"]

    assert resolve_conflict(conflict_id, "dismiss")
    # Both still active — dismiss just clears the conflict marker
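
Why a second detector call comes back empty can be illustrated with a toy dedup store. This is a sketch only: the in-memory `open_conflicts` dict, the `(slot_kind, member set)` key, and the function names are assumptions made for illustration, since the real `_record_conflict` is not shown in this diff.

```python
import uuid

# Hypothetical store: (slot_kind, frozenset of entity ids) -> conflict id.
open_conflicts = {}


def record_conflict(slot_kind, member_ids):
    """Return a new conflict id, or None if an open conflict already covers it."""
    key = (slot_kind, frozenset(member_ids))
    if key in open_conflicts:
        return None
    conflict_id = uuid.uuid4().hex
    open_conflicts[key] = conflict_id
    return conflict_id


def detect(slot_kind, member_ids):
    """Return only the ids of conflicts newly recorded by this call."""
    new_id = record_conflict(slot_kind, member_ids)
    return [new_id] if new_id else []


first = detect("requirement.name", ["r1", "r2"])   # what the create-time hook sees
second = detect("requirement.name", ["r2", "r1"])  # what a redundant test call sees
```

Under this assumption the first call records one conflict and the second returns an empty list, so a test that wants the conflict id has to read it back from the store (here `open_conflicts`, in the tests `list_open_conflicts`) rather than from the detector's return value.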
--- a/tests/test_inbox_crossproject.py
+++ b/tests/test_inbox_crossproject.py
@@ -132,6 +132,7 @@ def test_api_post_entity_with_null_project_stores_global(seeded_db):
        "entity_type": "material",
        "name": "Titanium",
        "project": None,
+        "hand_authored": True,  # V1-0 F-8: test fixture, no source_refs
    })
    assert r.status_code == 200
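
The payload needs the explicit flag because, per the commit message, the API route passes its values explicitly, so the conftest auto-flag never applies to it. The three cases the conftest wrapper distinguishes can be sketched as follows. The stand-in `create_entity` below uses a hypothetical signature, return shape, and error message; only the decision rule inside `wrapper` is taken from the conftest change in this commit.

```python
def create_entity(entity_type, name, source_refs=None, hand_authored=False):
    """Stand-in for the real service function (hypothetical signature)."""
    if not source_refs and not hand_authored:
        raise ValueError("source_refs required unless hand_authored=True")
    return {
        "name": name,
        "source_refs": list(source_refs or []),
        "hand_authored": hand_authored,
    }


def wrap_for_tests(original):
    """Apply the same decision rule as the conftest wrapper."""
    def wrapper(*args, **kwargs):
        # Auto-flag only when the caller gave neither provenance nor a flag.
        if not kwargs.get("source_refs") and "hand_authored" not in kwargs:
            kwargs["hand_authored"] = True
        return original(*args, **kwargs)
    return wrapper


create_entity_test = wrap_for_tests(create_entity)

# 1. No source_refs, no flag: auto-flagged as hand-authored.
auto = create_entity_test("component", "Fixture")

# 2. source_refs supplied: passed through untouched.
sourced = create_entity_test("component", "Extracted", source_refs=["test:fixture"])

# 3. Explicit hand_authored=False: the wrapper steps aside and F-8 raises.
try:
    create_entity_test("component", "No Provenance", hand_authored=False)
    raised = False
except ValueError:
    raised = True
```

Because the rule inspects keyword arguments only, a `source_refs` value passed positionally would not be seen by it; the tests in this commit all use keywords.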

--- a/tests/test_v1_0_write_invariants.py
+++ b/tests/test_v1_0_write_invariants.py
@@ -0,0 +1,234 @@
+"""V1-0 write-time invariant tests.
+
+Covers the Engineering V1 completion plan Phase V1-0 acceptance:
+- F-1 shared-header fields: extractor_version + canonical_home + hand_authored
+  land in the entities table with working defaults
+- F-8 provenance enforcement: create_entity raises without source_refs
+  unless hand_authored=True
+- F-5 synchronous conflict-detection hook on any active-entity write
+  (create_entity with status="active" + the pre-existing promote_entity
+  path); fail-open per conflict-model.md:256
+- Q-3 "flag, never block": a conflict never 4xx-blocks the write
+- Q-4 partial trust: get_entities scope_only filters candidates out
+
+Plan: docs/plans/engineering-v1-completion-plan.md
+Spec: docs/architecture/engineering-v1-acceptance.md
+"""
+
+from __future__ import annotations
+
+import pytest
+
+from atocore.engineering.service import (
+    EXTRACTOR_VERSION,
+    create_entity,
+    get_entities,
+    get_entity,
+    init_engineering_schema,
+)
+from atocore.models.database import get_connection, init_db
+
+
+# ---------- F-1: shared-header fields ----------
+
+
+def test_entity_row_has_shared_header_fields(tmp_data_dir):
+    init_db()
+    init_engineering_schema()
+    with get_connection() as conn:
+        cols = {row["name"] for row in conn.execute("PRAGMA table_info(entities)").fetchall()}
+    assert "extractor_version" in cols
+    assert "canonical_home" in cols
+    assert "hand_authored" in cols
+
+
+def test_created_entity_has_default_extractor_version_and_canonical_home(tmp_data_dir):
+    init_db()
+    init_engineering_schema()
+    e = create_entity(
+        entity_type="component",
+        name="Pivot Pin",
+        project="p04-gigabit",
+        source_refs=["test:fixture"],
+    )
+    assert e.extractor_version == EXTRACTOR_VERSION
+    assert e.canonical_home == "entity"
+    assert e.hand_authored is False
+
+    # round-trip through get_entity to confirm the row mapper returns
+    # the same values (not just the return-by-construct path)
+    got = get_entity(e.id)
+    assert got is not None
+    assert got.extractor_version == EXTRACTOR_VERSION
+    assert got.canonical_home == "entity"
+    assert got.hand_authored is False
+
+
+def test_explicit_extractor_version_is_persisted(tmp_data_dir):
+    init_db()
+    init_engineering_schema()
+    e = create_entity(
+        entity_type="decision",
+        name="Pick GF-PTFE pads",
+        project="p04-gigabit",
+        source_refs=["interaction:abc"],
+        extractor_version="custom-v2.3",
+    )
+    got = get_entity(e.id)
+    assert got.extractor_version == "custom-v2.3"
+
+
+# ---------- F-8: provenance enforcement ----------
+
+
+def test_create_entity_without_provenance_raises(tmp_data_dir):
+    init_db()
+    init_engineering_schema()
+    with pytest.raises(ValueError, match="source_refs required"):
+        create_entity(
+            entity_type="component",
+            name="No Provenance",
+            project="p04-gigabit",
+            hand_authored=False,  # explicit — bypasses the test-conftest auto-flag
+        )
+
+
+def test_create_entity_with_hand_authored_needs_no_source_refs(tmp_data_dir):
+    init_db()
+    init_engineering_schema()
+    e = create_entity(
+        entity_type="component",
+        name="Human Entry",
+        project="p04-gigabit",
+        hand_authored=True,
+    )
+    assert e.hand_authored is True
+    got = get_entity(e.id)
+    assert got.hand_authored is True
+    # source_refs stays empty — the hand_authored flag IS the provenance
+    assert got.source_refs == []
+
+
+def test_create_entity_with_empty_source_refs_list_is_treated_as_missing(tmp_data_dir):
+    init_db()
+    init_engineering_schema()
+    with pytest.raises(ValueError, match="source_refs required"):
+        create_entity(
+            entity_type="component",
+            name="Empty Refs",
+            project="p04-gigabit",
+            source_refs=[],
+            hand_authored=False,
+        )
+
+
+# ---------- F-5: synchronous conflict-detection hook ----------
+
+
+def test_active_create_runs_conflict_detection_hook(tmp_data_dir, monkeypatch):
+    """status=active writes trigger detect_conflicts_for_entity."""
+    init_db()
+    init_engineering_schema()
+
+    called_with: list[str] = []
+
+    def _fake_detect(entity_id: str):
+        called_with.append(entity_id)
+        return []
+
+    import atocore.engineering.conflicts as conflicts_mod
+    monkeypatch.setattr(conflicts_mod, "detect_conflicts_for_entity", _fake_detect)
+
+    e = create_entity(
+        entity_type="component",
+        name="Active With Hook",
+        project="p04-gigabit",
+        source_refs=["test:hook"],
+        status="active",
+    )
+
+    assert called_with == [e.id]
+
+
+def test_candidate_create_does_not_run_conflict_hook(tmp_data_dir, monkeypatch):
+    """status=candidate writes do NOT trigger detection — the hook is
+    for active rows only, per V1-0 scope. Candidates are checked at
+    promote time."""
+    init_db()
+    init_engineering_schema()
+
+    called: list[str] = []
+
+    def _fake_detect(entity_id: str):
+        called.append(entity_id)
+        return []
+
+    import atocore.engineering.conflicts as conflicts_mod
+    monkeypatch.setattr(conflicts_mod, "detect_conflicts_for_entity", _fake_detect)
+
+    create_entity(
+        entity_type="component",
+        name="Candidate No Hook",
+        project="p04-gigabit",
+        source_refs=["test:cand"],
+        status="candidate",
+    )
+
+    assert called == []
+
+
+# ---------- Q-3: flag, never block ----------
+
+
+def test_conflict_detector_failure_does_not_block_write(tmp_data_dir, monkeypatch):
+    """Per conflict-model.md:256: detection errors must not fail the
+    write. The entity is still created; only a warning is logged."""
+    init_db()
+    init_engineering_schema()
+
+    def _boom(entity_id: str):
+        raise RuntimeError("synthetic detector failure")
+
+    import atocore.engineering.conflicts as conflicts_mod
+    monkeypatch.setattr(conflicts_mod, "detect_conflicts_for_entity", _boom)
+
+    # The write still succeeds — no exception propagates.
+    e = create_entity(
+        entity_type="component",
+        name="Hook Fails Open",
+        project="p04-gigabit",
+        source_refs=["test:failopen"],
+        status="active",
+    )
+    assert get_entity(e.id) is not None
+
+
+# ---------- Q-4 (partial): trust-hierarchy — scope_only filters candidates ----------
+
+
+def test_scope_only_active_does_not_return_candidates(tmp_data_dir):
+    """V1-0 partial Q-4: active-scoped listing never returns candidates.
+    Full trust-hierarchy coverage (no-auto-project-state, etc.) ships in
+    V1-E per plan."""
+    init_db()
+    init_engineering_schema()
+
+    active = create_entity(
+        entity_type="component",
+        name="Active Alpha",
+        project="p04-gigabit",
+        source_refs=["test:alpha"],
+        status="active",
+    )
+    candidate = create_entity(
+        entity_type="component",
+        name="Candidate Beta",
+        project="p04-gigabit",
+        source_refs=["test:beta"],
+        status="candidate",
+    )
+
+    listed = get_entities(project="p04-gigabit", status="active", scope_only=True)
+    ids = {e.id for e in listed}
+    assert active.id in ids
+    assert candidate.id not in ids
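
The behaviour exercised by the F-5 and Q-3 tests above (hook on active writes only, detector failures logged but never propagated) can be sketched as a small fail-open helper. This is an illustration under assumptions, not the code in `service.py`: `run_conflict_hook`, its parameters, and the logger name are hypothetical.

```python
import logging

logger = logging.getLogger("engineering.sketch")  # hypothetical logger name


def run_conflict_hook(entity_id, status, detector):
    """Run `detector` for active writes only; never let its errors escape."""
    if status != "active":
        return []  # candidates are checked at promote time instead
    try:
        return detector(entity_id)
    except Exception as exc:  # fail-open: flag, never block
        logger.warning("conflict detection failed for %s: %s", entity_id, exc)
        return []


def ok_detector(entity_id):
    return [f"conflict-for-{entity_id}"]


def failing_detector(entity_id):
    raise RuntimeError("synthetic detector failure")


active_result = run_conflict_hook("e1", "active", ok_detector)
candidate_result = run_conflict_hook("e2", "candidate", ok_detector)
failed_result = run_conflict_hook("e3", "active", failing_detector)
```

Catching the broad `Exception` is deliberate in this sketch: the write has already succeeded by the time the hook runs, so any detector failure is downgraded to a warning rather than surfaced to the caller.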