feat(engineering): V1-0 write-time invariants (F-1 + F-5 hook + F-8)

Phase V1-0 of the Engineering V1 Completion Plan. Establishes the
write-time invariants that every later phase depends on, so that none
of them can leak invalid state into the entity store.

F-1 shared-header fields per engineering-v1-acceptance.md:45:
  - entities.extractor_version (column default ""; service.create_entity
    writes EXTRACTOR_VERSION="v1.0.0" unless the caller passes a value)
  - entities.canonical_home (default "entity")
  - entities.hand_authored (default 0, INTEGER boolean)
  Idempotent ALTERs in both _apply_migrations (database.py) and
  init_engineering_schema (service.py). CREATE TABLE also carries the
  columns for fresh DBs. _row_to_entity tolerates old rows without
  them so tests that predate V1-0 keep passing.
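
A minimal sketch of the idempotent-ALTER pattern described above,
assuming SQLite (the tests query PRAGMA table_info). The helper name
add_shared_header_columns and the column types are illustrative
assumptions, not the actual _apply_migrations code; only the column
names and defaults come from this commit.

```python
import sqlite3

# Column names and defaults are from the commit message; the SQL types
# are an assumption made for this sketch.
SHARED_HEADER_COLUMNS = {
    "extractor_version": "TEXT DEFAULT ''",
    "canonical_home": "TEXT DEFAULT 'entity'",
    "hand_authored": "INTEGER DEFAULT 0",
}


def add_shared_header_columns(conn: sqlite3.Connection) -> list[str]:
    """Add any missing F-1 columns; return the names actually added."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute("PRAGMA table_info(entities)")}
    added = []
    for name, ddl in SHARED_HEADER_COLUMNS.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE entities ADD COLUMN {name} {ddl}")
            added.append(name)
    return added
```

Running it a second time finds every column already present and issues
no ALTER, which is what makes calling it from two init paths safe.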

F-8 provenance enforcement per promotion-rules.md:243:
  create_entity raises ValueError when source_refs is empty and
  hand_authored is False. New kwargs hand_authored and
  extractor_version are threaded through the API (EntityCreateRequest)
  and the /wiki/new form body (human wiki writes set hand_authored
  true by definition). The non-negotiable invariant: every row either
  carries provenance or is explicitly flagged as hand-authored.
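
The F-8 rule can be sketched as a standalone guard. check_provenance is
a hypothetical name used only for illustration; the real check lives
inside create_entity, and only the "source_refs required" prefix of the
error message is confirmed by the tests in this commit.

```python
from typing import Optional, Sequence


def check_provenance(
    source_refs: Optional[Sequence[str]],
    hand_authored: bool = False,
) -> None:
    """Raise unless the row carries provenance or is hand-authored."""
    # None and [] are both treated as "no provenance".
    if not source_refs and not hand_authored:
        # Wording after "source_refs required" is an assumption.
        raise ValueError("source_refs required unless hand_authored=True")
```

An empty list counts as missing, so callers cannot satisfy the
invariant by passing source_refs=[].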

F-5 synchronous conflict-detection hook on active create per
engineering-v1-acceptance.md:99:
  create_entity(status="active") now runs detect_conflicts_for_entity
  and fails open per conflict-model.md:256: detector errors log a
  warning but never 4xx-block the write (Q-3 "flag, never block").
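
A sketch of the fail-open behaviour described above, assuming the
detector is a callable that takes an entity id. run_conflict_hook and
the logger name are hypothetical; the actual hook is inlined in
create_entity.

```python
import logging
from typing import Callable, List

log = logging.getLogger("v1_0_sketch")  # logger name is an assumption


def run_conflict_hook(
    entity_id: str,
    status: str,
    detect: Callable[[str], List[str]],
) -> List[str]:
    """Run the detector for active rows only; never let it fail the write."""
    if status != "active":
        # Candidates are checked at promote time, not at create time.
        return []
    try:
        return detect(entity_id)
    except Exception as exc:
        # Fail open: log a warning and carry on with the write.
        log.warning("conflict detection failed for %s: %s", entity_id, exc)
        return []
```

Catching broad Exception is deliberate in this sketch: any detector
failure is downgraded to a warning so the entity write still succeeds.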

Doc note added to engineering-ontology-v1.md recording that `project`
IS the `project_id` per "fields equivalent to" wording. No storage
rename.

Backfill script scripts/v1_0_backfill_provenance.py reports and
optionally flags existing active entities that lack provenance.
Idempotent. Supports --dry-run and --invalidate-instead.
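
For orientation, a sketch of how such a backfill could be structured.
Everything below is an assumption except the two flag names: the
storage of source_refs as JSON text, the choice of hand_authored=1 as
the "flag", and the 'invalid' status value are illustrative and may not
match scripts/v1_0_backfill_provenance.py.

```python
import argparse
import sqlite3


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="V1-0 provenance backfill (sketch)")
    parser.add_argument("--dry-run", action="store_true",
                        help="report affected rows without writing")
    parser.add_argument("--invalidate-instead", action="store_true",
                        help="invalidate rows instead of flagging them")
    return parser.parse_args(argv)


def backfill(conn: sqlite3.Connection, dry_run: bool = False,
             invalidate: bool = False) -> list[str]:
    """Return ids of active rows lacking provenance; update them unless dry_run."""
    rows = conn.execute(
        "SELECT id FROM entities "
        "WHERE status = 'active' AND hand_authored = 0 "
        "AND (source_refs IS NULL OR source_refs IN ('', '[]')) "
        "ORDER BY id"
    ).fetchall()
    ids = [row[0] for row in rows]
    if dry_run or not ids:
        return ids
    if invalidate:
        sql = "UPDATE entities SET status = 'invalid' WHERE id = ?"  # assumed status value
    else:
        sql = "UPDATE entities SET hand_authored = 1 WHERE id = ?"  # assumed flag
    conn.executemany(sql, [(entity_id,) for entity_id in ids])
    conn.commit()
    return ids
```

Idempotency falls out of the selection: once a row is flagged or
invalidated it no longer matches the WHERE clause, so a second run
reports nothing.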

Tests: 10 new in test_v1_0_write_invariants.py covering F-1 fields,
F-8 raise + bypass, F-5 hook on active + no-hook on candidate, Q-3
fail-open, and Q-4 (partial): scope_only with status="active"
excludes candidates.

Three pre-existing conflict tests adapted to read list_open_conflicts
rather than re-run the detector (which now dedups because the hook
already fired at create-time). One API test adds hand_authored=true
since its fixture has no source_refs.

conftest.py wraps create_entity so tests that don't pass source_refs
or hand_authored default to hand_authored=True (tests author their
own fixture data, so this is a reasonable default). Production paths
(API route, wiki form, graduation scripts) all pass explicit values
and are unaffected.

Test count: 533 -> 543 (+10). Full suite green in 77.86s.

Pending: Codex review on the branch before squash-merge to main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 14:39:30 -04:00
parent 9ab5b3c9d8
commit cbf9e03ab9
11 changed files with 558 additions and 8 deletions


@@ -16,6 +16,36 @@ os.environ["ATOCORE_DATA_DIR"] = _default_test_dir
 os.environ["ATOCORE_DEBUG"] = "true"


+# V1-0: every entity created in a test is "hand authored" by the test
+# author: fixture data, not extracted content. Rather than rewrite 100+
+# existing test call sites, wrap create_entity so that tests which don't
+# provide source_refs get hand_authored=True automatically. Tests that
+# explicitly pass source_refs or hand_authored are unaffected. This keeps
+# the F-8 invariant enforced in production (the API, the wiki form, and
+# graduation scripts all go through the unwrapped function) while leaving
+# the existing test corpus intact.
+def _patch_create_entity_for_tests():
+    from atocore.engineering import service as _svc
+
+    _original = _svc.create_entity
+
+    def _create_entity_test(*args, **kwargs):
+        # Only auto-flag when hand_authored isn't explicitly specified.
+        # Tests that want to exercise the F-8 raise path pass
+        # hand_authored=False explicitly and should hit the error.
+        if (
+            not kwargs.get("source_refs")
+            and "hand_authored" not in kwargs
+        ):
+            kwargs["hand_authored"] = True
+        return _original(*args, **kwargs)
+
+    _svc.create_entity = _create_entity_test
+
+
+_patch_create_entity_for_tests()
+
+
 @pytest.fixture
 def tmp_data_dir(tmp_path):
     """Provide a temporary data directory for tests."""


@@ -143,8 +143,11 @@ def test_requirement_name_conflict_detected(tmp_data_dir):
     r2 = create_entity("requirement", "Surface figure < 25nm",
                        project="p-test", description="Different interpretation")
-    detected = detect_conflicts_for_entity(r2.id)
-    assert len(detected) == 1
+    # V1-0 synchronous hook: the conflict is already detected at r2's
+    # create-time, so a redundant detect call returns [] due to
+    # _record_conflict dedup. Assert on list_open_conflicts instead;
+    # that is what this test is really about: duplicate
+    # active requirements surface as an open conflict.
     conflicts = list_open_conflicts(project="p-test")
     assert any(c["slot_kind"] == "requirement.name" for c in conflicts)
@@ -191,8 +194,12 @@ def test_conflict_resolution_dismiss_leaves_entities_alone(tmp_data_dir):
                        description="first meaning")
     r2 = create_entity("requirement", "Dup req", project="p-test",
                        description="second meaning")
-    detected = detect_conflicts_for_entity(r2.id)
-    conflict_id = detected[0]
+    # V1-0 synchronous hook already recorded the conflict at r2's
+    # create-time. Look it up via list_open_conflicts rather than
+    # calling the detector again (which returns [] due to dedup).
+    open_list = list_open_conflicts(project="p-test")
+    assert open_list, "expected conflict recorded by create-time hook"
+    conflict_id = open_list[0]["id"]
     assert resolve_conflict(conflict_id, "dismiss")
     # Both still active; dismiss just clears the conflict marker


@@ -132,6 +132,7 @@ def test_api_post_entity_with_null_project_stores_global(seeded_db):
         "entity_type": "material",
         "name": "Titanium",
         "project": None,
+        "hand_authored": True,  # V1-0 F-8: test fixture, no source_refs
     })
     assert r.status_code == 200


@@ -0,0 +1,234 @@
+"""V1-0 write-time invariant tests.
+
+Covers the Engineering V1 completion plan Phase V1-0 acceptance:
+- F-1 shared-header fields: extractor_version + canonical_home + hand_authored
+  land in the entities table with working defaults
+- F-8 provenance enforcement: create_entity raises without source_refs
+  unless hand_authored=True
+- F-5 synchronous conflict-detection hook on any active-entity write
+  (create_entity with status="active" + the pre-existing promote_entity
+  path); fail-open per conflict-model.md:256
+- Q-3 "flag, never block": a conflict never 4xx-blocks the write
+- Q-4 partial trust: get_entities scope_only filters candidates out
+
+Plan: docs/plans/engineering-v1-completion-plan.md
+Spec: docs/architecture/engineering-v1-acceptance.md
+"""
+
+from __future__ import annotations
+
+import pytest
+
+from atocore.engineering.service import (
+    EXTRACTOR_VERSION,
+    create_entity,
+    get_entities,
+    get_entity,
+    init_engineering_schema,
+)
+from atocore.models.database import get_connection, init_db
+
+
+# ---------- F-1: shared-header fields ----------
+
+
+def test_entity_row_has_shared_header_fields(tmp_data_dir):
+    init_db()
+    init_engineering_schema()
+    with get_connection() as conn:
+        cols = {row["name"] for row in conn.execute("PRAGMA table_info(entities)").fetchall()}
+    assert "extractor_version" in cols
+    assert "canonical_home" in cols
+    assert "hand_authored" in cols
+
+
+def test_created_entity_has_default_extractor_version_and_canonical_home(tmp_data_dir):
+    init_db()
+    init_engineering_schema()
+    e = create_entity(
+        entity_type="component",
+        name="Pivot Pin",
+        project="p04-gigabit",
+        source_refs=["test:fixture"],
+    )
+    assert e.extractor_version == EXTRACTOR_VERSION
+    assert e.canonical_home == "entity"
+    assert e.hand_authored is False
+
+    # round-trip through get_entity to confirm the row mapper returns
+    # the same values (not just the return-by-construct path)
+    got = get_entity(e.id)
+    assert got is not None
+    assert got.extractor_version == EXTRACTOR_VERSION
+    assert got.canonical_home == "entity"
+    assert got.hand_authored is False
+
+
+def test_explicit_extractor_version_is_persisted(tmp_data_dir):
+    init_db()
+    init_engineering_schema()
+    e = create_entity(
+        entity_type="decision",
+        name="Pick GF-PTFE pads",
+        project="p04-gigabit",
+        source_refs=["interaction:abc"],
+        extractor_version="custom-v2.3",
+    )
+    got = get_entity(e.id)
+    assert got.extractor_version == "custom-v2.3"
+
+
+# ---------- F-8: provenance enforcement ----------
+
+
+def test_create_entity_without_provenance_raises(tmp_data_dir):
+    init_db()
+    init_engineering_schema()
+    with pytest.raises(ValueError, match="source_refs required"):
+        create_entity(
+            entity_type="component",
+            name="No Provenance",
+            project="p04-gigabit",
+            hand_authored=False,  # explicit: bypasses the test-conftest auto-flag
+        )
+
+
+def test_create_entity_with_hand_authored_needs_no_source_refs(tmp_data_dir):
+    init_db()
+    init_engineering_schema()
+    e = create_entity(
+        entity_type="component",
+        name="Human Entry",
+        project="p04-gigabit",
+        hand_authored=True,
+    )
+    assert e.hand_authored is True
+    got = get_entity(e.id)
+    assert got.hand_authored is True
+    # source_refs stays empty; the hand_authored flag IS the provenance
+    assert got.source_refs == []
+
+
+def test_create_entity_with_empty_source_refs_list_is_treated_as_missing(tmp_data_dir):
+    init_db()
+    init_engineering_schema()
+    with pytest.raises(ValueError, match="source_refs required"):
+        create_entity(
+            entity_type="component",
+            name="Empty Refs",
+            project="p04-gigabit",
+            source_refs=[],
+            hand_authored=False,
+        )
+
+
+# ---------- F-5: synchronous conflict-detection hook ----------
+
+
+def test_active_create_runs_conflict_detection_hook(tmp_data_dir, monkeypatch):
+    """status=active writes trigger detect_conflicts_for_entity."""
+    init_db()
+    init_engineering_schema()
+
+    called_with: list[str] = []
+
+    def _fake_detect(entity_id: str):
+        called_with.append(entity_id)
+        return []
+
+    import atocore.engineering.conflicts as conflicts_mod
+    monkeypatch.setattr(conflicts_mod, "detect_conflicts_for_entity", _fake_detect)
+
+    e = create_entity(
+        entity_type="component",
+        name="Active With Hook",
+        project="p04-gigabit",
+        source_refs=["test:hook"],
+        status="active",
+    )
+    assert called_with == [e.id]
+
+
+def test_candidate_create_does_not_run_conflict_hook(tmp_data_dir, monkeypatch):
+    """status=candidate writes do NOT trigger detection; the hook is
+    for active rows only, per V1-0 scope. Candidates are checked at
+    promote time."""
+    init_db()
+    init_engineering_schema()
+
+    called: list[str] = []
+
+    def _fake_detect(entity_id: str):
+        called.append(entity_id)
+        return []
+
+    import atocore.engineering.conflicts as conflicts_mod
+    monkeypatch.setattr(conflicts_mod, "detect_conflicts_for_entity", _fake_detect)
+
+    create_entity(
+        entity_type="component",
+        name="Candidate No Hook",
+        project="p04-gigabit",
+        source_refs=["test:cand"],
+        status="candidate",
+    )
+    assert called == []
+
+
+# ---------- Q-3: flag, never block ----------
+
+
+def test_conflict_detector_failure_does_not_block_write(tmp_data_dir, monkeypatch):
+    """Per conflict-model.md:256: detection errors must not fail the
+    write. The entity is still created; only a warning is logged."""
+    init_db()
+    init_engineering_schema()
+
+    def _boom(entity_id: str):
+        raise RuntimeError("synthetic detector failure")
+
+    import atocore.engineering.conflicts as conflicts_mod
+    monkeypatch.setattr(conflicts_mod, "detect_conflicts_for_entity", _boom)
+
+    # The write still succeeds: no exception propagates.
+    e = create_entity(
+        entity_type="component",
+        name="Hook Fails Open",
+        project="p04-gigabit",
+        source_refs=["test:failopen"],
+        status="active",
+    )
+    assert get_entity(e.id) is not None
+
+
+# ---------- Q-4 (partial): trust-hierarchy, scope_only filters candidates ----------
+
+
+def test_scope_only_active_does_not_return_candidates(tmp_data_dir):
+    """V1-0 partial Q-4: active-scoped listing never returns candidates.
+    Full trust-hierarchy coverage (no-auto-project-state, etc.) ships in
+    V1-E per plan."""
+    init_db()
+    init_engineering_schema()
+
+    active = create_entity(
+        entity_type="component",
+        name="Active Alpha",
+        project="p04-gigabit",
+        source_refs=["test:alpha"],
+        status="active",
+    )
+    candidate = create_entity(
+        entity_type="component",
+        name="Candidate Beta",
+        project="p04-gigabit",
+        source_refs=["test:beta"],
+        status="candidate",
+    )
+
+    listed = get_entities(project="p04-gigabit", status="active", scope_only=True)
+    ids = {e.id for e in listed}
+    assert active.id in ids
+    assert candidate.id not in ids