3 Commits

Author SHA1 Message Date
3888db926f fix(api): R14 — promote route translates V1-0 ValueError to 400
POST /entities/{id}/promote now wraps the promote_entity call in
try/except ValueError → HTTPException(400). Previously the new
V1-0 provenance re-check raised ValueError that the route didn't
catch, so legacy no-provenance candidates promoted via the API
surfaced as 500 instead of 400.

Matches the existing ValueError → 400 handling on POST /entities
(api_create_entity at routes.py:1490).

Test: test_api_promote_returns_400_on_legacy_no_provenance inserts
a pre-V1-0 legacy candidate directly, POSTs to the promote route,
asserts 400 with the expected detail, asserts the row stays
candidate.

Test count: 547 -> 548. Full suite green in 72.91s.

Closes R14.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 15:30:11 -04:00
22a37a7241 docs(ledger): record V1-0 closure + prod backfill + R14 residual
- Orientation bumped: live_sha 775960c -> 2712c5d, test_count
  533 -> 547, main_tip 999788b -> 2712c5d
- Recent Decisions: V1-0 approved, merged, deployed, 31-row prod
  backfill with zero-remainder follow-up dry-run
- Open Review Findings: new R14 (P2) — POST /entities/{id}/promote
  returns 500 instead of 400 on the new ValueError from legacy
  no-provenance candidate promotion. Non-blocking, follow-up tidy
- Session Log: full V1-0 closure narrative + V1-A gate status

V1-A (Q-001 subsystem-scoped variant + Q-6 killer-correctness
integration) holds until the soak window ends ~2026-04-26 and the
100-active-memory density target is hit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 15:17:06 -04:00
2712c5d2d0 feat(engineering): enforce V1-0 write invariants 2026-04-22 14:59:17 -04:00
12 changed files with 794 additions and 18 deletions

5
.gitignore vendored
View File

@@ -14,3 +14,8 @@ venv/
.claude/* .claude/*
!.claude/commands/ !.claude/commands/
!.claude/commands/** !.claude/commands/**
# Editor / IDE state — user-specific, not project config
.obsidian/
.vscode/
.idea/

View File

@@ -6,10 +6,10 @@
## Orientation ## Orientation
- **live_sha** (Dalidou `/health` build_sha): `775960c` (verified 2026-04-16 via /health, build_time 2026-04-16T17:59:30Z) - **live_sha** (Dalidou `/health` build_sha): `2712c5d` (verified 2026-04-22 via Codex after V1-0 deploy; status=ok)
- **last_updated**: 2026-04-18 by Claude (Phase 7A — Memory Consolidation "sleep cycle" V1 on branch, not yet deployed) - **last_updated**: 2026-04-22 by Claude (V1-0 write-time invariants landed + deployed + prod backfill complete)
- **main_tip**: `999788b` - **main_tip**: `2712c5d`
- **test_count**: 533 (prior 521 + 12 new wikilink/redlink tests) - **test_count**: 547 (prior 533 + 10 V1-0 invariant tests + 4 Codex-review regression tests)
- **harness**: `17/18 PASS` on live Dalidou (p04-constraints expects "Zerodur" — retrieval content gap, not regression) - **harness**: `17/18 PASS` on live Dalidou (p04-constraints expects "Zerodur" — retrieval content gap, not regression)
- **vectors**: 33,253 - **vectors**: 33,253
- **active_memories**: 84 (31 project, 23 knowledge, 10 episodic, 8 adaptation, 7 preference, 5 identity) - **active_memories**: 84 (31 project, 23 knowledge, 10 episodic, 8 adaptation, 7 preference, 5 identity)
@@ -143,9 +143,11 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
| R11 | Codex | P2 | src/atocore/api/routes.py:773-845 | `POST /admin/extract-batch` still accepts `mode="llm"` inside the container and returns a successful 0-candidate result instead of surfacing that host-only LLM extraction is unavailable from this runtime. That is a misleading API contract for operators. | fixed | Claude | 2026-04-12 | (pending) | | R11 | Codex | P2 | src/atocore/api/routes.py:773-845 | `POST /admin/extract-batch` still accepts `mode="llm"` inside the container and returns a successful 0-candidate result instead of surfacing that host-only LLM extraction is unavailable from this runtime. That is a misleading API contract for operators. | fixed | Claude | 2026-04-12 | (pending) |
| R12 | Codex | P2 | scripts/batch_llm_extract_live.py:39-190 | The host-side extractor duplicates the LLM system prompt and JSON parsing logic from `src/atocore/memory/extractor_llm.py`. It works today, but this is now a prompt/parser drift risk across the container and host implementations. | fixed | Claude | 2026-04-12 | (pending) | | R12 | Codex | P2 | scripts/batch_llm_extract_live.py:39-190 | The host-side extractor duplicates the LLM system prompt and JSON parsing logic from `src/atocore/memory/extractor_llm.py`. It works today, but this is now a prompt/parser drift risk across the container and host implementations. | fixed | Claude | 2026-04-12 | (pending) |
| R13 | Codex | P2 | DEV-LEDGER.md:12 | The new `286 passing` test-count claim is not reproducibly auditable from the current audit environments: neither Dalidou nor the clean worktree has `pytest` available. The claim may be true in Claude's dev shell, but it remains unverified in this audit. | fixed | Claude | 2026-04-12 | (pending) | | R13 | Codex | P2 | DEV-LEDGER.md:12 | The new `286 passing` test-count claim is not reproducibly auditable from the current audit environments: neither Dalidou nor the clean worktree has `pytest` available. The claim may be true in Claude's dev shell, but it remains unverified in this audit. | fixed | Claude | 2026-04-12 | (pending) |
| R14 | Codex | P2 | src/atocore/api/routes.py (POST /entities/{id}/promote) | The HTTP `POST /entities/{id}/promote` route does not translate the new service-layer `ValueError("source_refs required: cannot promote a candidate with no provenance...")` into a 400. A legacy no-provenance candidate promoted through the API currently surfaces as a 500. Does not block V1-0 acceptance; tidy in a follow-up. | fixed | Claude | 2026-04-22 | (pending) |
## Recent Decisions ## Recent Decisions
- **2026-04-22** **V1-0 done: approved, merged, deployed, prod backfilled.** Codex pulled `f16cd52`, re-ran the two original probes (both pass), re-ran the three targeted regression suites (all pass). Squash-merged to main as `2712c5d`. Dalidou deployed via canonical deploy script; `/health` reports build_sha=`2712c5d2d03cb2a6af38b559664afd1c4cd0e050`, status=ok. Validated backup snapshot taken at `/srv/storage/atocore/backups/snapshots/20260422T190624Z` before backfill. Prod backfill: `--dry-run` found 31 active/superseded entities with no provenance; list reviewed and sane; live run updated 31 rows via the default `hand_authored=1` flag path; follow-up dry-run returned 0 rows remaining. Residual logged as R14 (P2): `POST /entities/{id}/promote` HTTP route doesn't translate the new service-layer `ValueError` into a 400 — legacy bad candidate promotes via the API return 500 instead. Does not block V1-0 acceptance. V1-0 closed. Next: V1-A (Q-001 subsystem-scoped variant + Q-6 integration). V1-A holds until the soak window ends ~2026-04-26 and the 100-memory density target is hit. *Approved + landed by:* Codex. *Ratified by:* Antoine.
- **2026-04-22** **Engineering V1 Completion Plan — Codex sign-off (third round)**. Codex's third-round audit closed the remaining five open questions with concrete resolutions, patched inline in `docs/plans/engineering-v1-completion-plan.md`: (1) F-7 row rewritten with ground truth — schema + preserve-original + test coverage already exist (`graduated_to_entity_id` at `database.py:143-146`, `graduated` status in memory service, promote hook at `service.py:354-356,389-451`, tests at `test_engineering_v1_phase5.py:67-90`); **real gaps** are missing direct `POST /memory/{id}/graduate` route and spec's `knowledge→Fact` mismatch (no `fact` entity type exists; reconcile to `parameter` or similar); V1-E 2 → **34 days**; (2) Q-5 determinism reframed — don't stabilize the call to `datetime.now()`, inject regenerated timestamp + checksum as renderer inputs, remove DB iteration ordering dependencies; V1-D scope updated; (3) `project` vs `project_id` — doc note only, no rename, resolved; (4) total estimate 16.517.5 → **17.519.5 focused days** with calendar buffer on top; (5) "Minions" must not be canonized in D-3 release notes — neutral wording ("queued background processing / async workers") only. **Agreement reached**: Claude + Codex + Antoine aligned. V1-0 is ready to start once the current pipeline soak window ends (~2026-04-26) and the 100-memory density target is hit. *Patched by:* Codex. *Signed off by:* Codex ("with those edits, I'd sign off on the five questions"). *Accepted by:* Antoine. *Executor (V1-0 onwards):* Claude. - **2026-04-22** **Engineering V1 Completion Plan — Codex sign-off (third round)**. Codex's third-round audit closed the remaining five open questions with concrete resolutions, patched inline in `docs/plans/engineering-v1-completion-plan.md`: (1) F-7 row rewritten with ground truth — schema + preserve-original + test coverage already exist (`graduated_to_entity_id` at `database.py:143-146`, `graduated` status in memory service, promote hook at `service.py:354-356,389-451`, tests at `test_engineering_v1_phase5.py:67-90`); **real gaps** are missing direct `POST /memory/{id}/graduate` route and spec's `knowledge→Fact` mismatch (no `fact` entity type exists; reconcile to `parameter` or similar); V1-E 2 → **34 days**; (2) Q-5 determinism reframed — don't stabilize the call to `datetime.now()`, inject regenerated timestamp + checksum as renderer inputs, remove DB iteration ordering dependencies; V1-D scope updated; (3) `project` vs `project_id` — doc note only, no rename, resolved; (4) total estimate 16.517.5 → **17.519.5 focused days** with calendar buffer on top; (5) "Minions" must not be canonized in D-3 release notes — neutral wording ("queued background processing / async workers") only. **Agreement reached**: Claude + Codex + Antoine aligned. V1-0 is ready to start once the current pipeline soak window ends (~2026-04-26) and the 100-memory density target is hit. *Patched by:* Codex. *Signed off by:* Codex ("with those edits, I'd sign off on the five questions"). *Accepted by:* Antoine. *Executor (V1-0 onwards):* Claude.
- **2026-04-22** **Engineering V1 Completion Plan revised per Codex second-round file-level audit** — three findings folded in, all with exact file:line refs from Codex: (1) F-1 downgraded from ✅ to 🟡 — `extractor_version` and `canonical_home` missing from `Entity` dataclass and `entities` table per `engineering-v1-acceptance.md:45`; V1-0 scope now adds both fields via additive migration + doc note that `project` IS `project_id` per "fields equivalent to" spec wording; (2) F-2 replaced with ground-truth per-query status: 9 of 20 v1-required queries done (Q-004/Q-005/Q-006/Q-008/Q-009/Q-011/Q-013/Q-016/Q-017), 1 partial (Q-001 needs subsystem-scoped variant), 10 missing (Q-002/003/007/010/012/014/018/019/020); V1-A scope shrank to Q-001 shape fix + Q-6 integration (pillar queries already implemented); V1-C closes the 8 remaining new queries + Q-020 deferred to V1-D; (3) F-5 reframed — generic `conflicts` + `conflict_members` schema already present at `database.py:190`, no migration needed; divergence is detector body (per-type dispatch needs generalization) + routes (`/admin/conflicts/*` needs `/conflicts/*` alias). Total revised to 16.517.5 days, ~60 tests. Plan: `docs/plans/engineering-v1-completion-plan.md` at commit `ce3a878` (Codex pulled clean). Three of Codex's eight open questions now answered; remaining: F-7 graduation depth, mirror determinism, `project` rename question, velocity calibration, minions naming. *Proposed by:* Claude. *Reviewed by:* Codex (two rounds). - **2026-04-22** **Engineering V1 Completion Plan revised per Codex second-round file-level audit** — three findings folded in, all with exact file:line refs from Codex: (1) F-1 downgraded from ✅ to 🟡 — `extractor_version` and `canonical_home` missing from `Entity` dataclass and `entities` table per `engineering-v1-acceptance.md:45`; V1-0 scope now adds both fields via additive migration + doc note that `project` IS `project_id` per "fields equivalent to" spec wording; (2) F-2 replaced with ground-truth per-query status: 9 of 20 v1-required queries done (Q-004/Q-005/Q-006/Q-008/Q-009/Q-011/Q-013/Q-016/Q-017), 1 partial (Q-001 needs subsystem-scoped variant), 10 missing (Q-002/003/007/010/012/014/018/019/020); V1-A scope shrank to Q-001 shape fix + Q-6 integration (pillar queries already implemented); V1-C closes the 8 remaining new queries + Q-020 deferred to V1-D; (3) F-5 reframed — generic `conflicts` + `conflict_members` schema already present at `database.py:190`, no migration needed; divergence is detector body (per-type dispatch needs generalization) + routes (`/admin/conflicts/*` needs `/conflicts/*` alias). Total revised to 16.517.5 days, ~60 tests. Plan: `docs/plans/engineering-v1-completion-plan.md` at commit `ce3a878` (Codex pulled clean). Three of Codex's eight open questions now answered; remaining: F-7 graduation depth, mirror determinism, `project` rename question, velocity calibration, minions naming. *Proposed by:* Claude. *Reviewed by:* Codex (two rounds).
- **2026-04-22** **Engineering V1 Completion Plan revised per Codex first-round review** — original six-phase order (queries → ingest → mirror → graduation → provenance → ops) rejected by Codex as backward: provenance-at-write (F-8) and conflict-detection hooks (F-5 minimal) must precede any phase that writes active entities. Revised to seven phases: V1-0 write-time invariants (F-8 + F-5 hooks + F-1 audit) as hard prerequisite, V1-A minimum query slice proving the model, V1-B ingest, V1-C full query catalog, V1-D mirror, V1-E graduation, V1-F full F-5 spec + ops + docs. Also softened "parallel with Now list" — real collision points listed explicitly; schedule shifted ~4 weeks to reflect that V1-0 cannot start during pipeline soak. Withdrew the "5070% built" global framing in favor of the per-criterion gap table. Workspace sync note added: Codex's Playground workspace can't see the plan file; canonical dev tree is Windows `C:\Users\antoi\ATOCore`. Plan: `docs/plans/engineering-v1-completion-plan.md`. Awaiting Codex file-level audit once workspace syncs. *Proposed by:* Claude. *First-round review by:* Codex. - **2026-04-22** **Engineering V1 Completion Plan revised per Codex first-round review** — original six-phase order (queries → ingest → mirror → graduation → provenance → ops) rejected by Codex as backward: provenance-at-write (F-8) and conflict-detection hooks (F-5 minimal) must precede any phase that writes active entities. Revised to seven phases: V1-0 write-time invariants (F-8 + F-5 hooks + F-1 audit) as hard prerequisite, V1-A minimum query slice proving the model, V1-B ingest, V1-C full query catalog, V1-D mirror, V1-E graduation, V1-F full F-5 spec + ops + docs. Also softened "parallel with Now list" — real collision points listed explicitly; schedule shifted ~4 weeks to reflect that V1-0 cannot start during pipeline soak. Withdrew the "5070% built" global framing in favor of the per-criterion gap table. Workspace sync note added: Codex's Playground workspace can't see the plan file; canonical dev tree is Windows `C:\Users\antoi\ATOCore`. Plan: `docs/plans/engineering-v1-completion-plan.md`. Awaiting Codex file-level audit once workspace syncs. *Proposed by:* Claude. *First-round review by:* Codex.
@@ -164,6 +166,12 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
## Session Log ## Session Log
- **2026-04-22 Codex + Antoine (V1-0 closed)** Codex approved `f16cd52` after re-running both original probes (legacy-candidate promote + supersede hook — both correct) and the three targeted regression suites (`test_v1_0_write_invariants.py`, `test_engineering_v1_phase5.py`, `test_inbox_crossproject.py` — all pass). Squash-merged to main as `2712c5d` ("feat(engineering): enforce V1-0 write invariants"). Deployed to Dalidou via the canonical deploy script; `/health` build_sha=`2712c5d2d03cb2a6af38b559664afd1c4cd0e050` status=ok. Validated backup snapshot at `/srv/storage/atocore/backups/snapshots/20260422T190624Z` taken BEFORE prod backfill. Prod backfill of `scripts/v1_0_backfill_provenance.py` against live DB: dry-run found 31 active/superseded entities with no provenance, list reviewed and looked sane; live run with default `hand_authored=1` flag path updated 31 rows; follow-up dry-run returned 0 rows remaining → no lingering F-8 violations in prod. Codex logged one residual P2 (R14): HTTP `POST /entities/{id}/promote` route doesn't translate the new service-layer `ValueError` into 400 — legacy bad candidate promoted through the API surfaces as 500. Not blocking. V1-0 closed. **Gates for V1-A**: soak window ends ~2026-04-26; 100-active-memory density target (currently 84 active + the ~31 newly flagged ones — need to check how those count in density math). V1-A holds until both gates clear.
- **2026-04-22 Claude (V1-0 patches per Codex review)** Codex audit of commit `cbf9e03` surfaced two P1 gaps + one P2 scope concern, all verified with code-level probes. **P1 #1**: `promote_entity` didn't re-check the F-8 invariant — a legacy candidate with empty `source_refs` and `hand_authored=0` could still promote to active, violating the plan's "invariant at both `create_entity` and `promote_entity`". Fixed: `promote_entity` at `service.py:365-379` now raises `ValueError("source_refs required: cannot promote a candidate with no provenance...")` before flipping status. Stays symmetric with the create-side error. **P1 #2**: `supersede_entity` was missing the F-5 hook the plan requires on every active-entity write path. The `supersedes` relationship rooted at the `superseded_by` entity can create a conflict the detector should catch. Fixed at `service.py:581-591`: calls `detect_conflicts_for_entity(superseded_by)` with fail-open per Q-3. **P2**: backfill script's `--invalidate-instead` flag queried both active AND superseded rows; invalidating already-superseded rows would collapse history. Fixed at `scripts/v1_0_backfill_provenance.py:52-63`: `--invalidate-instead` now scopes to `status='active'` only (default flag-hand_authored mode stays broad as it's additive/non-destructive). Help text tightened to make the destructive posture explicit. **Four new regression tests** in `test_v1_0_write_invariants.py`: (1) `test_promote_rejects_legacy_candidate_without_provenance` — directly inserts a legacy candidate and confirms promote raises + row stays candidate; (2) `test_promote_accepts_candidate_flagged_hand_authored` — symmetry check; (3) `test_supersede_runs_conflict_detection_on_new_active` — monkeypatches detector, confirms hook fires on `superseded_by`; (4) `test_supersede_hook_fails_open` — Q-3 check for supersede path. **Test count**: 543 → 547 (+4 regression). Full suite `547 passed in 81.07s`. Next: commit patches on branch, push, Codex re-review.
- **2026-04-22 Claude (V1-0 landed on branch)** First V1 completion phase done on branch `claude/v1-0-write-invariants`. **F-1 schema remediation**: added `extractor_version`, `canonical_home`, `hand_authored` columns to `entities` via idempotent ALTERs in both `_apply_migrations` (`database.py:148-170`) and `init_engineering_schema` (`service.py:95-139`). CREATE TABLE also updated so fresh DBs get the columns natively. New `_table_exists` helper at `database.py:378`. `Entity` dataclass gains the three fields with sensible defaults. `EXTRACTOR_VERSION = "v1.0.0"` module constant at top of `service.py`. `_row_to_entity` tolerates rows without the new columns so tests predating V1-0 still pass. **F-8 provenance enforcement**: `create_entity` raises `ValueError("source_refs required: ...")` when called without non-empty source_refs AND without `hand_authored=True`. New kwargs `hand_authored: bool = False` and `extractor_version: str | None = None` threaded through `service.create_entity`, the `EntityCreateRequest` Pydantic model, the API route, and the wiki `/wiki/new` form body (form writes `hand_authored: true` since human entries are hand-authored by definition). **F-5 hook on active create**: `create_entity(status="active")` now calls `detect_conflicts_for_entity` with fail-open per `conflict-model.md:256` (errors log warning, write still succeeds). The promote path's existing hook at `service.py:400-404` was kept as-is. **Doc note** added to `engineering-ontology-v1.md` recording that `project` IS the `project_id` per "fields equivalent to" wording. **Backfill script** at `scripts/v1_0_backfill_provenance.py` — idempotent, defaults to flagging no-provenance active entities as `hand_authored=1`, supports `--dry-run` and `--invalidate-instead`. **Tests**: 10 new in `tests/test_v1_0_write_invariants.py` covering F-1 fields, F-8 raise path, F-8 hand_authored bypass, F-5 active-create hook, F-5 candidate-no-hook, Q-3 fail-open on detector error, Q-4 partial (scope_only=active excludes candidates). **Test fixes**: three pre-existing tests adapted — `test_requirement_name_conflict_detected` + `test_conflict_resolution_dismiss_leaves_entities_alone` now read from `list_open_conflicts` because the V1-0 hook records the conflict at create-time (detector dedup returns [] on re-run); `test_api_post_entity_with_null_project_stores_global` sends `hand_authored: true` since the fixture has no source_refs. **conftest.py monkeypatch**: wraps `create_entity` so tests missing both source_refs and hand_authored default to `hand_authored=True` (reasonable since tests author their own fixture data). Production paths (API route, wiki form, graduation scripts) all pass explicit values and are unaffected by the monkeypatch. **Test count**: 533 → 543 (+10), full suite `543 passed in 77.86s`. **Not yet**: commit + push + Codex review + deploy. **Branch**: `claude/v1-0-write-invariants`.
- **2026-04-22 Codex (late night)** Third-round audit closed the remaining five open questions. Patched `docs/plans/engineering-v1-completion-plan.md` inline (no commit by Codex). **F-7 finding (P1):** graduation stack is partially built — `_graduation_prompt.py`, `scripts/graduate_memories.py`, `database.py:143-146` (`graduated_to_entity_id`), memory `graduated` status, promote-preserves-original at `service.py:354-356,389-451`, tests at `test_engineering_v1_phase5.py:67-90` all exist. Real gaps: no direct `POST /memory/{id}/graduate` route at `routes.py:756`; spec's `knowledge→Fact` doesn't match ontology (`service.py:16` has no `fact` type — reconcile to `parameter` or similar). V1-E estimate 2 → 34 days. **Q-5 finding (P2):** "stabilize timestamp" insufficient — renderer reads wall-clock in `_footer()` at `mirror.py:320`; fix is inject regenerated timestamp + checksum as renderer inputs + sort DB iteration + remove dict ordering deps. V1-D scope patched. **Remaining three (P3):** `project` stays as doc-note equivalence (no rename); total estimate 17.519.5 focused days; release notes must NOT canonize "Minions" — neutral "queued background processing / async workers" only. **Sign-off:** "with those edits, I'd sign off on the five questions. The only non-architectural uncertainty left in the plan is scheduling discipline against the current Now list; that does not block V1-0 once the soak window and memory-density gate clear." **Status:** Claude + Codex agreed. Plan frozen pending Antoine final accept and gate clearance. Claude to commit Codex's patches + push. - **2026-04-22 Codex (late night)** Third-round audit closed the remaining five open questions. Patched `docs/plans/engineering-v1-completion-plan.md` inline (no commit by Codex). **F-7 finding (P1):** graduation stack is partially built — `_graduation_prompt.py`, `scripts/graduate_memories.py`, `database.py:143-146` (`graduated_to_entity_id`), memory `graduated` status, promote-preserves-original at `service.py:354-356,389-451`, tests at `test_engineering_v1_phase5.py:67-90` all exist. Real gaps: no direct `POST /memory/{id}/graduate` route at `routes.py:756`; spec's `knowledge→Fact` doesn't match ontology (`service.py:16` has no `fact` type — reconcile to `parameter` or similar). V1-E estimate 2 → 34 days. **Q-5 finding (P2):** "stabilize timestamp" insufficient — renderer reads wall-clock in `_footer()` at `mirror.py:320`; fix is inject regenerated timestamp + checksum as renderer inputs + sort DB iteration + remove dict ordering deps. V1-D scope patched. **Remaining three (P3):** `project` stays as doc-note equivalence (no rename); total estimate 17.519.5 focused days; release notes must NOT canonize "Minions" — neutral "queued background processing / async workers" only. **Sign-off:** "with those edits, I'd sign off on the five questions. The only non-architectural uncertainty left in the plan is scheduling discipline against the current Now list; that does not block V1-0 once the soak window and memory-density gate clear." **Status:** Claude + Codex agreed. Plan frozen pending Antoine final accept and gate clearance. Claude to commit Codex's patches + push.
- **2026-04-22 Claude (late night)** Codex second-round review did the full file-level audit and came back with three P1/P2 findings, all with exact file:line refs. Verified each against current code before revising. (1) **F-1 not clean**: `Entity` dataclass at `service.py:67` and `entities` table schema are missing the `extractor_version` and `canonical_home` shared-header fields required by `engineering-v1-acceptance.md:45`; `project` field is the project identifier but not named `project_id` as spec writes (spec wording "fields equivalent to" allows the naming, but needs explicit doc note). V1-0 scope now includes adding both missing fields via additive `_apply_migrations` pattern. (2) **F-2 needed exact statuses, not guesses**: per-function audit gave ground truth — 9 of 20 v1-required queries done, 1 partial (Q-001 returns project-wide tree not subsystem-scoped expand=contains per `engineering-query-catalog.md:71`), 10 missing. V1-A scope shrank to Q-001 shape fix + Q-6 integration (most pillar queries already implemented); V1-C closes the 8 net-new queries + Q-020 to V1-D. (3) **F-5 misframed**: the generic `conflicts` + `conflict_members` schema is ALREADY spec-compliant at `database.py:190`; divergence is detector body at `conflicts.py:36` (per-type dispatch needs generalization) + route path (`/admin/conflicts/*` needs `/conflicts/*` alias). V1-F no longer includes a schema migration; detector generalization + route alignment only. Totals revised to 16.517.5 days, ~60 tests (down from 1217 / 65 because V1-A and V1-F scopes both shrank after audit). Three of the eight open questions resolved. Remaining open: F-7 graduation depth, mirror determinism, `project` naming, velocity calibration, minions-as-V2 naming. No code changes this session — plan + ledger only. Next: commit + push revised plan, then await Antoine+Codex joint sign-off before V1-0 starts. - **2026-04-22 Claude (late night)** Codex second-round review did the full file-level audit and came back with three P1/P2 findings, all with exact file:line refs. Verified each against current code before revising. (1) **F-1 not clean**: `Entity` dataclass at `service.py:67` and `entities` table schema are missing the `extractor_version` and `canonical_home` shared-header fields required by `engineering-v1-acceptance.md:45`; `project` field is the project identifier but not named `project_id` as spec writes (spec wording "fields equivalent to" allows the naming, but needs explicit doc note). V1-0 scope now includes adding both missing fields via additive `_apply_migrations` pattern. (2) **F-2 needed exact statuses, not guesses**: per-function audit gave ground truth — 9 of 20 v1-required queries done, 1 partial (Q-001 returns project-wide tree not subsystem-scoped expand=contains per `engineering-query-catalog.md:71`), 10 missing. V1-A scope shrank to Q-001 shape fix + Q-6 integration (most pillar queries already implemented); V1-C closes the 8 net-new queries + Q-020 to V1-D. (3) **F-5 misframed**: the generic `conflicts` + `conflict_members` schema is ALREADY spec-compliant at `database.py:190`; divergence is detector body at `conflicts.py:36` (per-type dispatch needs generalization) + route path (`/admin/conflicts/*` needs `/conflicts/*` alias). V1-F no longer includes a schema migration; detector generalization + route alignment only. Totals revised to 16.517.5 days, ~60 tests (down from 1217 / 65 because V1-A and V1-F scopes both shrank after audit). Three of the eight open questions resolved. Remaining open: F-7 graduation depth, mirror determinism, `project` naming, velocity calibration, minions-as-V2 naming. No code changes this session — plan + ledger only. Next: commit + push revised plan, then await Antoine+Codex joint sign-off before V1-0 starts.

View File

@@ -159,6 +159,17 @@ Every major object should support fields equivalent to:
- `created_at` - `created_at`
- `updated_at` - `updated_at`
- `notes` (optional) - `notes` (optional)
- `extractor_version` (V1-0)
- `canonical_home` (V1-0)
**Naming note (V1-0, 2026-04-22).** The AtoCore `entities` table and
`Entity` dataclass name the project-identifier field `project`, not
`project_id`. This doc's "fields equivalent to" wording allows that
naming flexibility — the `project` field on entity rows IS the
`project_id` per spec. No storage rename is planned; downstream readers
should treat `entity.project` as the project identifier. This was
resolved in Codex's third-round audit of the V1 Completion Plan (see
`docs/plans/engineering-v1-completion-plan.md`).
## Suggested Status Lifecycle ## Suggested Status Lifecycle

View File

@@ -0,0 +1,167 @@
"""V1-0 one-time backfill: flag existing active entities that have no
provenance (empty source_refs) as hand_authored=1 so they stop failing
the F-8 invariant.
Runs against the live AtoCore DB. Idempotent: a second run after the
first touches nothing because the flagged rows already have
hand_authored=1.
Per the Engineering V1 Completion Plan (V1-0 scope), the three options
for an existing active entity without provenance are:
1. Attach provenance — impossible without human review, not automatable
2. Flag hand-authored — safe, additive, this script's default
3. Invalidate — destructive, requires operator sign-off
This script picks option (2) by default. Add --dry-run to see what
would change without writing. Add --invalidate-instead to pick option
(3) for all rows (not recommended for first run).
Usage:
python scripts/v1_0_backfill_provenance.py --base-url http://dalidou:8100 --dry-run
python scripts/v1_0_backfill_provenance.py --base-url http://dalidou:8100
"""
from __future__ import annotations
import argparse
import json
import sqlite3
import sys
from pathlib import Path
def run(db_path: Path, dry_run: bool, invalidate_instead: bool) -> int:
if not db_path.exists():
print(f"ERROR: db not found: {db_path}", file=sys.stderr)
return 2
conn = sqlite3.connect(str(db_path))
conn.row_factory = sqlite3.Row
# Verify the V1-0 migration ran: if hand_authored column is missing
# the operator hasn't deployed V1-0 yet, and running this script
# would crash. Fail loud rather than attempt the ALTER here.
cols = {r["name"] for r in conn.execute("PRAGMA table_info(entities)").fetchall()}
if "hand_authored" not in cols:
print(
"ERROR: entities table lacks the hand_authored column. "
"Deploy V1-0 migrations first (init_db + init_engineering_schema).",
file=sys.stderr,
)
return 2
# Scope differs by mode:
# - Default (flag hand_authored=1): safe/additive, applies to active
# AND superseded rows so the historical trail is consistent.
# - --invalidate-instead: destructive — scope to ACTIVE rows only.
# Invalidating already-superseded history would collapse the audit
# trail, which the plan's remediation scope never intended
# (V1-0 talks about existing active no-provenance entities).
if invalidate_instead:
scope_sql = "status = 'active'"
else:
scope_sql = "status IN ('active', 'superseded')"
rows = conn.execute(
f"SELECT id, entity_type, name, project, status, source_refs, hand_authored "
f"FROM entities WHERE {scope_sql} AND hand_authored = 0"
).fetchall()
needs_fix = []
for row in rows:
refs_raw = row["source_refs"] or "[]"
try:
refs = json.loads(refs_raw)
except Exception:
refs = []
if not refs:
needs_fix.append(row)
print(f"found {len(needs_fix)} active/superseded entities with no provenance")
for row in needs_fix:
print(
f" - {row['id'][:8]} [{row['entity_type']}] "
f"{row['name']!r} project={row['project']!r} status={row['status']}"
)
if dry_run:
print("--dry-run: no changes written")
return 0
if not needs_fix:
print("nothing to do")
return 0
action = "invalidate" if invalidate_instead else "flag hand_authored=1"
print(f"applying: {action}")
cur = conn.cursor()
for row in needs_fix:
if invalidate_instead:
cur.execute(
"UPDATE entities SET status = 'invalid', "
"updated_at = CURRENT_TIMESTAMP WHERE id = ?",
(row["id"],),
)
cur.execute(
"INSERT INTO memory_audit "
"(id, memory_id, action, actor, before_json, after_json, note, entity_kind) "
"VALUES (?, ?, 'invalidated', 'v1_0_backfill', ?, ?, ?, 'entity')",
(
f"v10bf-{row['id'][:8]}-inv",
row["id"],
json.dumps({"status": row["status"]}),
json.dumps({"status": "invalid"}),
"V1-0 backfill: invalidated, no provenance",
),
)
else:
cur.execute(
"UPDATE entities SET hand_authored = 1, "
"updated_at = CURRENT_TIMESTAMP WHERE id = ?",
(row["id"],),
)
cur.execute(
"INSERT INTO memory_audit "
"(id, memory_id, action, actor, before_json, after_json, note, entity_kind) "
"VALUES (?, ?, 'hand_authored_flagged', 'v1_0_backfill', ?, ?, ?, 'entity')",
(
f"v10bf-{row['id'][:8]}-ha",
row["id"],
json.dumps({"hand_authored": False}),
json.dumps({"hand_authored": True}),
"V1-0 backfill: flagged hand_authored since source_refs empty",
),
)
conn.commit()
print(f"done: updated {len(needs_fix)} entities")
return 0
def main() -> int:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument(
"--db",
type=Path,
default=Path("data/db/atocore.db"),
help="Path to the SQLite database (default: data/db/atocore.db)",
)
parser.add_argument("--dry-run", action="store_true", help="Report only; no writes")
parser.add_argument(
"--invalidate-instead",
action="store_true",
help=(
"DESTRUCTIVE. Invalidate active rows with no provenance instead "
"of flagging them hand_authored. Scoped to status='active' only "
"(superseded rows are left alone to preserve audit history). "
"Not recommended for first run — start with --dry-run, then "
"the default hand_authored flag path."
),
)
args = parser.parse_args()
return run(args.db, args.dry_run, args.invalidate_instead)
if __name__ == "__main__":
sys.exit(main())

View File

@@ -1457,6 +1457,11 @@ class EntityCreateRequest(BaseModel):
status: str = "active" status: str = "active"
confidence: float = 1.0 confidence: float = 1.0
source_refs: list[str] | None = None source_refs: list[str] | None = None
# V1-0 provenance enforcement (F-8). Clients must either pass
# non-empty source_refs or set hand_authored=true. The service layer
# raises ValueError otherwise, surfaced here as 400.
hand_authored: bool = False
extractor_version: str | None = None
class EntityPromoteRequest(BaseModel): class EntityPromoteRequest(BaseModel):
@@ -1486,6 +1491,8 @@ def api_create_entity(req: EntityCreateRequest) -> dict:
confidence=req.confidence, confidence=req.confidence,
source_refs=req.source_refs, source_refs=req.source_refs,
actor="api-http", actor="api-http",
hand_authored=req.hand_authored,
extractor_version=req.extractor_version,
) )
except ValueError as e: except ValueError as e:
raise HTTPException(status_code=400, detail=str(e)) raise HTTPException(status_code=400, detail=str(e))
@@ -2180,12 +2187,17 @@ def api_promote_entity(
from atocore.engineering.service import promote_entity from atocore.engineering.service import promote_entity
target_project = req.target_project if req is not None else None target_project = req.target_project if req is not None else None
note = req.note if req is not None else "" note = req.note if req is not None else ""
success = promote_entity( try:
entity_id, success = promote_entity(
actor="api-http", entity_id,
note=note, actor="api-http",
target_project=target_project, note=note,
) target_project=target_project,
)
except ValueError as e:
# V1-0 F-8 re-check raises ValueError for no-provenance candidates
# (see service.promote_entity). Surface as 400, not 500.
raise HTTPException(status_code=400, detail=str(e))
if not success: if not success:
raise HTTPException(status_code=404, detail=f"Entity not found or not a candidate: {entity_id}") raise HTTPException(status_code=404, detail=f"Entity not found or not a candidate: {entity_id}")
result = {"status": "promoted", "id": entity_id} result = {"status": "promoted", "id": entity_id}

View File

@@ -63,6 +63,12 @@ RELATIONSHIP_TYPES = [
ENTITY_STATUSES = ["candidate", "active", "superseded", "invalid"] ENTITY_STATUSES = ["candidate", "active", "superseded", "invalid"]
# V1-0: extractor version this module writes into new entity rows.
# Per promotion-rules.md:268, every candidate must record the version of
# the extractor that produced it so later re-evaluation is auditable.
# Bump this when extraction logic materially changes.
EXTRACTOR_VERSION = "v1.0.0"
@dataclass @dataclass
class Entity: class Entity:
@@ -77,6 +83,10 @@ class Entity:
source_refs: list[str] = field(default_factory=list) source_refs: list[str] = field(default_factory=list)
created_at: str = "" created_at: str = ""
updated_at: str = "" updated_at: str = ""
# V1-0 shared-header fields per engineering-v1-acceptance.md:45.
extractor_version: str = ""
canonical_home: str = "entity"
hand_authored: bool = False
@dataclass @dataclass
@@ -103,10 +113,25 @@ def init_engineering_schema() -> None:
status TEXT NOT NULL DEFAULT 'active', status TEXT NOT NULL DEFAULT 'active',
confidence REAL NOT NULL DEFAULT 1.0, confidence REAL NOT NULL DEFAULT 1.0,
source_refs TEXT NOT NULL DEFAULT '[]', source_refs TEXT NOT NULL DEFAULT '[]',
extractor_version TEXT NOT NULL DEFAULT '',
canonical_home TEXT NOT NULL DEFAULT 'entity',
hand_authored INTEGER NOT NULL DEFAULT 0,
created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP, created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
) )
""") """)
# V1-0 (Engineering V1 completion): the three shared-header fields
# per engineering-v1-acceptance.md:45. Idempotent ALTERs for
# databases created before V1-0 land these columns without a full
# migration. Fresh DBs get them via the CREATE TABLE above; the
# ALTERs below are a no-op there.
from atocore.models.database import _column_exists # late import; avoids cycle
if not _column_exists(conn, "entities", "extractor_version"):
conn.execute("ALTER TABLE entities ADD COLUMN extractor_version TEXT DEFAULT ''")
if not _column_exists(conn, "entities", "canonical_home"):
conn.execute("ALTER TABLE entities ADD COLUMN canonical_home TEXT DEFAULT 'entity'")
if not _column_exists(conn, "entities", "hand_authored"):
conn.execute("ALTER TABLE entities ADD COLUMN hand_authored INTEGER DEFAULT 0")
conn.execute(""" conn.execute("""
CREATE TABLE IF NOT EXISTS relationships ( CREATE TABLE IF NOT EXISTS relationships (
id TEXT PRIMARY KEY, id TEXT PRIMARY KEY,
@@ -149,6 +174,8 @@ def create_entity(
confidence: float = 1.0, confidence: float = 1.0,
source_refs: list[str] | None = None, source_refs: list[str] | None = None,
actor: str = "api", actor: str = "api",
hand_authored: bool = False,
extractor_version: str | None = None,
) -> Entity: ) -> Entity:
if entity_type not in ENTITY_TYPES: if entity_type not in ENTITY_TYPES:
raise ValueError(f"Invalid entity type: {entity_type}. Must be one of {ENTITY_TYPES}") raise ValueError(f"Invalid entity type: {entity_type}. Must be one of {ENTITY_TYPES}")
@@ -157,6 +184,21 @@ def create_entity(
if not name or not name.strip(): if not name or not name.strip():
raise ValueError("Entity name must be non-empty") raise ValueError("Entity name must be non-empty")
refs = list(source_refs) if source_refs else []
# V1-0 (F-8 provenance enforcement, engineering-v1-acceptance.md:147):
# every new entity row must carry non-empty source_refs OR be explicitly
# flagged hand_authored. This is the non-negotiable invariant every
# later V1 phase depends on — without it, active entities can escape
# into the graph with no traceable origin. Raises at the write seam so
# the bug is impossible to introduce silently.
if not refs and not hand_authored:
raise ValueError(
"source_refs required: every entity must carry provenance "
"(source_chunk_id / source_interaction_id / kb_cad_export_id / ...) "
"or set hand_authored=True to explicitly flag a direct human write"
)
# Phase 5: enforce project canonicalization contract at the write seam. # Phase 5: enforce project canonicalization contract at the write seam.
# Aliases like "p04" become "p04-gigabit" so downstream reads stay # Aliases like "p04" become "p04-gigabit" so downstream reads stay
# consistent with the registry. # consistent with the registry.
@@ -165,18 +207,22 @@ def create_entity(
entity_id = str(uuid.uuid4()) entity_id = str(uuid.uuid4())
now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S") now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
props = properties or {} props = properties or {}
refs = source_refs or [] ev = extractor_version if extractor_version is not None else EXTRACTOR_VERSION
with get_connection() as conn: with get_connection() as conn:
conn.execute( conn.execute(
"""INSERT INTO entities """INSERT INTO entities
(id, entity_type, name, project, description, properties, (id, entity_type, name, project, description, properties,
status, confidence, source_refs, created_at, updated_at) status, confidence, source_refs,
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""", extractor_version, canonical_home, hand_authored,
created_at, updated_at)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""",
( (
entity_id, entity_type, name.strip(), project, entity_id, entity_type, name.strip(), project,
description, json.dumps(props), status, confidence, description, json.dumps(props), status, confidence,
json.dumps(refs), now, now, json.dumps(refs),
ev, "entity", 1 if hand_authored else 0,
now, now,
), ),
) )
@@ -194,14 +240,31 @@ def create_entity(
"project": project, "project": project,
"status": status, "status": status,
"confidence": confidence, "confidence": confidence,
"hand_authored": hand_authored,
"extractor_version": ev,
}, },
) )
# V1-0 (F-5 hook, engineering-v1-acceptance.md:99): synchronous
# conflict detection on any active-entity write. The promote path
# already had this hook (see promote_entity below); V1-0 adds it to
# direct-active creates so every active row — however it got that
# way — is checked. Fail-open per "flag, never block" rule in
# conflict-model.md:256: detector errors log but never fail the write.
if status == "active":
try:
from atocore.engineering.conflicts import detect_conflicts_for_entity
detect_conflicts_for_entity(entity_id)
except Exception as e:
log.warning("conflict_detection_failed", entity_id=entity_id, error=str(e))
return Entity( return Entity(
id=entity_id, entity_type=entity_type, name=name.strip(), id=entity_id, entity_type=entity_type, name=name.strip(),
project=project, description=description, properties=props, project=project, description=description, properties=props,
status=status, confidence=confidence, source_refs=refs, status=status, confidence=confidence, source_refs=refs,
created_at=now, updated_at=now, created_at=now, updated_at=now,
extractor_version=ev, canonical_home="entity",
hand_authored=hand_authored,
) )
@@ -361,6 +424,20 @@ def promote_entity(
if entity is None or entity.status != "candidate": if entity is None or entity.status != "candidate":
return False return False
# V1-0 (F-8 provenance re-check at promote). The invariant must hold at
# BOTH create_entity AND promote_entity per the plan, because candidate
# rows can exist in the DB from before V1-0 (no enforcement at their
# create time) or can be inserted by code paths that bypass the service
# layer. Block any candidate with empty source_refs that is NOT flagged
# hand_authored from ever becoming active. Same error shape as the
# create-side check for symmetry.
if not (entity.source_refs or []) and not entity.hand_authored:
raise ValueError(
"source_refs required: cannot promote a candidate with no "
"provenance. Attach source_refs via PATCH /entities/{id}, "
"or flag hand_authored=true before promoting."
)
if target_project is not None: if target_project is not None:
new_project = ( new_project = (
resolve_project_name(target_project) if target_project else "" resolve_project_name(target_project) if target_project else ""
@@ -503,6 +580,22 @@ def supersede_entity(
superseded_by=superseded_by, superseded_by=superseded_by,
error=str(e), error=str(e),
) )
# V1-0 (F-5 hook on supersede, per plan's "every active-entity
# write path"). Supersede demotes `entity_id` AND adds a
# `supersedes` relationship rooted at the already-active
# `superseded_by`. That new edge can create a conflict the
# detector should catch synchronously. Fail-open per
# conflict-model.md:256.
try:
from atocore.engineering.conflicts import detect_conflicts_for_entity
detect_conflicts_for_entity(superseded_by)
except Exception as e:
log.warning(
"conflict_detection_failed",
entity_id=superseded_by,
error=str(e),
)
return True return True
@@ -774,6 +867,15 @@ def get_entity_with_context(entity_id: str) -> dict | None:
def _row_to_entity(row) -> Entity: def _row_to_entity(row) -> Entity:
# V1-0 shared-header fields are optional on read — rows that predate
# V1-0 migration have NULL / missing values, so defaults kick in and
# older tests that build Entity() without the new fields keep passing.
# `row.keys()` lets us tolerate SQLite rows that lack the columns
# entirely (pre-migration sqlite3.Row).
keys = set(row.keys())
extractor_version = (row["extractor_version"] or "") if "extractor_version" in keys else ""
canonical_home = (row["canonical_home"] or "entity") if "canonical_home" in keys else "entity"
hand_authored = bool(row["hand_authored"]) if "hand_authored" in keys and row["hand_authored"] is not None else False
return Entity( return Entity(
id=row["id"], id=row["id"],
entity_type=row["entity_type"], entity_type=row["entity_type"],
@@ -786,6 +888,9 @@ def _row_to_entity(row) -> Entity:
source_refs=json.loads(row["source_refs"] or "[]"), source_refs=json.loads(row["source_refs"] or "[]"),
created_at=row["created_at"] or "", created_at=row["created_at"] or "",
updated_at=row["updated_at"] or "", updated_at=row["updated_at"] or "",
extractor_version=extractor_version,
canonical_home=canonical_home,
hand_authored=hand_authored,
) )

View File

@@ -391,6 +391,8 @@ def render_new_entity_form(name: str = "", project: str = "") -> str:
entity_type: fd.get('entity_type'), entity_type: fd.get('entity_type'),
project: fd.get('project') || '', project: fd.get('project') || '',
description: fd.get('description') || '', description: fd.get('description') || '',
// V1-0: human writes via the wiki form are hand_authored by definition.
hand_authored: true,
}; };
try { try {
const r = await fetch('/v1/entities', { const r = await fetch('/v1/entities', {

View File

@@ -146,6 +146,28 @@ def _apply_migrations(conn: sqlite3.Connection) -> None:
"CREATE INDEX IF NOT EXISTS idx_memories_graduated ON memories(graduated_to_entity_id)" "CREATE INDEX IF NOT EXISTS idx_memories_graduated ON memories(graduated_to_entity_id)"
) )
# V1-0 (Engineering V1 completion): shared header fields per
# engineering-v1-acceptance.md:45. Three columns on `entities`:
# - extractor_version: which extractor produced this row. Lets old
# candidates be re-evaluated with a newer extractor per
# promotion-rules.md:268.
# - canonical_home: which layer holds the canonical record. Always
# "entity" for rows written via create_entity; reserved for future
# cross-layer bookkeeping.
# - hand_authored: 1 when the row was created directly by a human
# without source provenance. Enforced by the write path so every
# non-hand-authored row must carry non-empty source_refs (F-8).
# The entities table itself is created by init_engineering_schema
# (see engineering/service.py); these ALTERs cover existing DBs
# where the original CREATE TABLE predates V1-0.
if _table_exists(conn, "entities"):
if not _column_exists(conn, "entities", "extractor_version"):
conn.execute("ALTER TABLE entities ADD COLUMN extractor_version TEXT DEFAULT ''")
if not _column_exists(conn, "entities", "canonical_home"):
conn.execute("ALTER TABLE entities ADD COLUMN canonical_home TEXT DEFAULT 'entity'")
if not _column_exists(conn, "entities", "hand_authored"):
conn.execute("ALTER TABLE entities ADD COLUMN hand_authored INTEGER DEFAULT 0")
# Phase 4 (Robustness V1): append-only audit log for memory mutations. # Phase 4 (Robustness V1): append-only audit log for memory mutations.
# Every create/update/promote/reject/supersede/invalidate/reinforce/expire/ # Every create/update/promote/reject/supersede/invalidate/reinforce/expire/
# auto_promote writes one row here. before/after are JSON snapshots of the # auto_promote writes one row here. before/after are JSON snapshots of the
@@ -352,6 +374,14 @@ def _column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
return any(row["name"] == column for row in rows) return any(row["name"] == column for row in rows)
def _table_exists(conn: sqlite3.Connection, table: str) -> bool:
row = conn.execute(
"SELECT name FROM sqlite_master WHERE type='table' AND name=?",
(table,),
).fetchone()
return row is not None
@contextmanager @contextmanager
def get_connection() -> Generator[sqlite3.Connection, None, None]: def get_connection() -> Generator[sqlite3.Connection, None, None]:
"""Get a database connection with row factory.""" """Get a database connection with row factory."""

View File

@@ -16,6 +16,36 @@ os.environ["ATOCORE_DATA_DIR"] = _default_test_dir
os.environ["ATOCORE_DEBUG"] = "true" os.environ["ATOCORE_DEBUG"] = "true"
# V1-0: every entity created in a test is "hand authored" by the test
# author — fixture data, not extracted content. Rather than rewrite 100+
# existing test call sites, wrap create_entity so that tests which don't
# provide source_refs get hand_authored=True automatically. Tests that
# explicitly pass source_refs or hand_authored are unaffected. This keeps
# the F-8 invariant enforced in production (the API, the wiki form, and
# graduation scripts all go through the unwrapped function) while leaving
# the existing test corpus intact.
def _patch_create_entity_for_tests():
from atocore.engineering import service as _svc
_original = _svc.create_entity
def _create_entity_test(*args, **kwargs):
# Only auto-flag when hand_authored isn't explicitly specified.
# Tests that want to exercise the F-8 raise path pass
# hand_authored=False explicitly and should hit the error.
if (
not kwargs.get("source_refs")
and "hand_authored" not in kwargs
):
kwargs["hand_authored"] = True
return _original(*args, **kwargs)
_svc.create_entity = _create_entity_test
_patch_create_entity_for_tests()
@pytest.fixture @pytest.fixture
def tmp_data_dir(tmp_path): def tmp_data_dir(tmp_path):
"""Provide a temporary data directory for tests.""" """Provide a temporary data directory for tests."""

View File

@@ -143,8 +143,11 @@ def test_requirement_name_conflict_detected(tmp_data_dir):
r2 = create_entity("requirement", "Surface figure < 25nm", r2 = create_entity("requirement", "Surface figure < 25nm",
project="p-test", description="Different interpretation") project="p-test", description="Different interpretation")
detected = detect_conflicts_for_entity(r2.id) # V1-0 synchronous hook: the conflict is already detected at r2's
assert len(detected) == 1 # create-time, so a redundant detect call returns [] due to
# _record_conflict dedup. Assert on list_open_conflicts instead —
# that's what the intent of this test really tests: duplicate
# active requirements surface as an open conflict.
conflicts = list_open_conflicts(project="p-test") conflicts = list_open_conflicts(project="p-test")
assert any(c["slot_kind"] == "requirement.name" for c in conflicts) assert any(c["slot_kind"] == "requirement.name" for c in conflicts)
@@ -191,8 +194,12 @@ def test_conflict_resolution_dismiss_leaves_entities_alone(tmp_data_dir):
description="first meaning") description="first meaning")
r2 = create_entity("requirement", "Dup req", project="p-test", r2 = create_entity("requirement", "Dup req", project="p-test",
description="second meaning") description="second meaning")
detected = detect_conflicts_for_entity(r2.id) # V1-0 synchronous hook already recorded the conflict at r2's
conflict_id = detected[0] # create-time. Look it up via list_open_conflicts rather than
# calling the detector again (which returns [] due to dedup).
open_list = list_open_conflicts(project="p-test")
assert open_list, "expected conflict recorded by create-time hook"
conflict_id = open_list[0]["id"]
assert resolve_conflict(conflict_id, "dismiss") assert resolve_conflict(conflict_id, "dismiss")
# Both still active — dismiss just clears the conflict marker # Both still active — dismiss just clears the conflict marker

View File

@@ -132,6 +132,7 @@ def test_api_post_entity_with_null_project_stores_global(seeded_db):
"entity_type": "material", "entity_type": "material",
"name": "Titanium", "name": "Titanium",
"project": None, "project": None,
"hand_authored": True, # V1-0 F-8: test fixture, no source_refs
}) })
assert r.status_code == 200 assert r.status_code == 200

View File

@@ -0,0 +1,398 @@
"""V1-0 write-time invariant tests.
Covers the Engineering V1 completion plan Phase V1-0 acceptance:
- F-1 shared-header fields: extractor_version + canonical_home + hand_authored
land in the entities table with working defaults
- F-8 provenance enforcement: create_entity raises without source_refs
unless hand_authored=True
- F-5 synchronous conflict-detection hook on any active-entity write
(create_entity with status="active" + the pre-existing promote_entity
path); fail-open per conflict-model.md:256
- Q-3 "flag, never block": a conflict never 4xx-blocks the write
- Q-4 partial trust: get_entities scope_only filters candidates out
Plan: docs/plans/engineering-v1-completion-plan.md
Spec: docs/architecture/engineering-v1-acceptance.md
"""
from __future__ import annotations
import pytest
from atocore.engineering.service import (
EXTRACTOR_VERSION,
create_entity,
create_relationship,
get_entities,
get_entity,
init_engineering_schema,
promote_entity,
supersede_entity,
)
from atocore.models.database import get_connection, init_db
# ---------- F-1: shared-header fields ----------
def test_entity_row_has_shared_header_fields(tmp_data_dir):
init_db()
init_engineering_schema()
with get_connection() as conn:
cols = {row["name"] for row in conn.execute("PRAGMA table_info(entities)").fetchall()}
assert "extractor_version" in cols
assert "canonical_home" in cols
assert "hand_authored" in cols
def test_created_entity_has_default_extractor_version_and_canonical_home(tmp_data_dir):
init_db()
init_engineering_schema()
e = create_entity(
entity_type="component",
name="Pivot Pin",
project="p04-gigabit",
source_refs=["test:fixture"],
)
assert e.extractor_version == EXTRACTOR_VERSION
assert e.canonical_home == "entity"
assert e.hand_authored is False
# round-trip through get_entity to confirm the row mapper returns
# the same values (not just the return-by-construct path)
got = get_entity(e.id)
assert got is not None
assert got.extractor_version == EXTRACTOR_VERSION
assert got.canonical_home == "entity"
assert got.hand_authored is False
def test_explicit_extractor_version_is_persisted(tmp_data_dir):
init_db()
init_engineering_schema()
e = create_entity(
entity_type="decision",
name="Pick GF-PTFE pads",
project="p04-gigabit",
source_refs=["interaction:abc"],
extractor_version="custom-v2.3",
)
got = get_entity(e.id)
assert got.extractor_version == "custom-v2.3"
# ---------- F-8: provenance enforcement ----------
def test_create_entity_without_provenance_raises(tmp_data_dir):
init_db()
init_engineering_schema()
with pytest.raises(ValueError, match="source_refs required"):
create_entity(
entity_type="component",
name="No Provenance",
project="p04-gigabit",
hand_authored=False, # explicit — bypasses the test-conftest auto-flag
)
def test_create_entity_with_hand_authored_needs_no_source_refs(tmp_data_dir):
init_db()
init_engineering_schema()
e = create_entity(
entity_type="component",
name="Human Entry",
project="p04-gigabit",
hand_authored=True,
)
assert e.hand_authored is True
got = get_entity(e.id)
assert got.hand_authored is True
# source_refs stays empty — the hand_authored flag IS the provenance
assert got.source_refs == []
def test_create_entity_with_empty_source_refs_list_is_treated_as_missing(tmp_data_dir):
init_db()
init_engineering_schema()
with pytest.raises(ValueError, match="source_refs required"):
create_entity(
entity_type="component",
name="Empty Refs",
project="p04-gigabit",
source_refs=[],
hand_authored=False,
)
def test_promote_rejects_legacy_candidate_without_provenance(tmp_data_dir):
"""Regression (Codex V1-0 probe): candidate rows can exist in the DB
from before V1-0 enforcement (or from paths that bypass create_entity).
promote_entity must re-check the invariant and refuse to flip a
no-provenance candidate to active. Without this check, the active
store can leak F-8 violations in from legacy data."""
init_db()
init_engineering_schema()
# Simulate a pre-V1-0 candidate by inserting directly into the table,
# bypassing the service-layer invariant. Real legacy rows look exactly
# like this: empty source_refs, hand_authored=0.
import uuid as _uuid
entity_id = str(_uuid.uuid4())
with get_connection() as conn:
conn.execute(
"INSERT INTO entities (id, entity_type, name, project, "
"description, properties, status, confidence, source_refs, "
"extractor_version, canonical_home, hand_authored, "
"created_at, updated_at) "
"VALUES (?, 'component', 'Legacy Orphan', 'p04-gigabit', "
"'', '{}', 'candidate', 1.0, '[]', '', 'entity', 0, "
"CURRENT_TIMESTAMP, CURRENT_TIMESTAMP)",
(entity_id,),
)
with pytest.raises(ValueError, match="source_refs required"):
promote_entity(entity_id)
# And the row stays a candidate — no half-transition.
got = get_entity(entity_id)
assert got is not None
assert got.status == "candidate"
def test_api_promote_returns_400_on_legacy_no_provenance(tmp_data_dir):
"""R14 (Codex, 2026-04-22): the HTTP promote route must translate
the V1-0 ValueError for no-provenance candidates into 400, not 500.
Previously the route didn't catch ValueError so legacy bad
candidates surfaced as a server error."""
init_db()
init_engineering_schema()
import uuid as _uuid
from fastapi.testclient import TestClient
from atocore.main import app
entity_id = str(_uuid.uuid4())
with get_connection() as conn:
conn.execute(
"INSERT INTO entities (id, entity_type, name, project, "
"description, properties, status, confidence, source_refs, "
"extractor_version, canonical_home, hand_authored, "
"created_at, updated_at) "
"VALUES (?, 'component', 'Legacy HTTP', 'p04-gigabit', "
"'', '{}', 'candidate', 1.0, '[]', '', 'entity', 0, "
"CURRENT_TIMESTAMP, CURRENT_TIMESTAMP)",
(entity_id,),
)
client = TestClient(app)
r = client.post(f"/entities/{entity_id}/promote")
assert r.status_code == 400
assert "source_refs required" in r.json().get("detail", "")
# Row still candidate — the 400 didn't half-transition.
got = get_entity(entity_id)
assert got is not None
assert got.status == "candidate"
def test_promote_accepts_candidate_flagged_hand_authored(tmp_data_dir):
"""The other side of the promote re-check: hand_authored=1 with
empty source_refs still lets promote succeed, matching
create_entity's symmetry."""
init_db()
init_engineering_schema()
import uuid as _uuid
entity_id = str(_uuid.uuid4())
with get_connection() as conn:
conn.execute(
"INSERT INTO entities (id, entity_type, name, project, "
"description, properties, status, confidence, source_refs, "
"extractor_version, canonical_home, hand_authored, "
"created_at, updated_at) "
"VALUES (?, 'component', 'Hand Authored Candidate', "
"'p04-gigabit', '', '{}', 'candidate', 1.0, '[]', '', "
"'entity', 1, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP)",
(entity_id,),
)
assert promote_entity(entity_id) is True
assert get_entity(entity_id).status == "active"
# ---------- F-5: synchronous conflict-detection hook ----------
def test_active_create_runs_conflict_detection_hook(tmp_data_dir, monkeypatch):
"""status=active writes trigger detect_conflicts_for_entity."""
init_db()
init_engineering_schema()
called_with: list[str] = []
def _fake_detect(entity_id: str):
called_with.append(entity_id)
return []
import atocore.engineering.conflicts as conflicts_mod
monkeypatch.setattr(conflicts_mod, "detect_conflicts_for_entity", _fake_detect)
e = create_entity(
entity_type="component",
name="Active With Hook",
project="p04-gigabit",
source_refs=["test:hook"],
status="active",
)
assert called_with == [e.id]
def test_supersede_runs_conflict_detection_on_new_active(tmp_data_dir, monkeypatch):
"""Regression (Codex V1-0 probe): per plan's 'every active-entity
write path', supersede_entity must trigger synchronous conflict
detection. The subject is the `superseded_by` entity — the one
whose graph state just changed because a new `supersedes` edge was
rooted at it."""
init_db()
init_engineering_schema()
old = create_entity(
entity_type="component",
name="Old Pad",
project="p04-gigabit",
source_refs=["test:old"],
status="active",
)
new = create_entity(
entity_type="component",
name="New Pad",
project="p04-gigabit",
source_refs=["test:new"],
status="active",
)
called_with: list[str] = []
def _fake_detect(entity_id: str):
called_with.append(entity_id)
return []
import atocore.engineering.conflicts as conflicts_mod
monkeypatch.setattr(conflicts_mod, "detect_conflicts_for_entity", _fake_detect)
assert supersede_entity(old.id, superseded_by=new.id) is True
# The detector fires on the `superseded_by` entity — the one whose
# edges just grew a new `supersedes` relationship.
assert new.id in called_with
def test_supersede_hook_fails_open(tmp_data_dir, monkeypatch):
"""Supersede must survive a broken detector per Q-3 flag-never-block."""
init_db()
init_engineering_schema()
old = create_entity(
entity_type="component", name="Old2", project="p04-gigabit",
source_refs=["test:old"], status="active",
)
new = create_entity(
entity_type="component", name="New2", project="p04-gigabit",
source_refs=["test:new"], status="active",
)
def _boom(entity_id: str):
raise RuntimeError("synthetic detector failure")
import atocore.engineering.conflicts as conflicts_mod
monkeypatch.setattr(conflicts_mod, "detect_conflicts_for_entity", _boom)
# The supersede still succeeds despite the detector blowing up.
assert supersede_entity(old.id, superseded_by=new.id) is True
assert get_entity(old.id).status == "superseded"
def test_candidate_create_does_not_run_conflict_hook(tmp_data_dir, monkeypatch):
"""status=candidate writes do NOT trigger detection — the hook is
for active rows only, per V1-0 scope. Candidates are checked at
promote time."""
init_db()
init_engineering_schema()
called: list[str] = []
def _fake_detect(entity_id: str):
called.append(entity_id)
return []
import atocore.engineering.conflicts as conflicts_mod
monkeypatch.setattr(conflicts_mod, "detect_conflicts_for_entity", _fake_detect)
create_entity(
entity_type="component",
name="Candidate No Hook",
project="p04-gigabit",
source_refs=["test:cand"],
status="candidate",
)
assert called == []
# ---------- Q-3: flag, never block ----------
def test_conflict_detector_failure_does_not_block_write(tmp_data_dir, monkeypatch):
"""Per conflict-model.md:256: detection errors must not fail the
write. The entity is still created; only a warning is logged."""
init_db()
init_engineering_schema()
def _boom(entity_id: str):
raise RuntimeError("synthetic detector failure")
import atocore.engineering.conflicts as conflicts_mod
monkeypatch.setattr(conflicts_mod, "detect_conflicts_for_entity", _boom)
# The write still succeeds — no exception propagates.
e = create_entity(
entity_type="component",
name="Hook Fails Open",
project="p04-gigabit",
source_refs=["test:failopen"],
status="active",
)
assert get_entity(e.id) is not None
# ---------- Q-4 (partial): trust-hierarchy — scope_only filters candidates ----------
def test_scope_only_active_does_not_return_candidates(tmp_data_dir):
"""V1-0 partial Q-4: active-scoped listing never returns candidates.
Full trust-hierarchy coverage (no-auto-project-state, etc.) ships in
V1-E per plan."""
init_db()
init_engineering_schema()
active = create_entity(
entity_type="component",
name="Active Alpha",
project="p04-gigabit",
source_refs=["test:alpha"],
status="active",
)
candidate = create_entity(
entity_type="component",
name="Candidate Beta",
project="p04-gigabit",
source_refs=["test:beta"],
status="candidate",
)
listed = get_entities(project="p04-gigabit", status="active", scope_only=True)
ids = {e.id for e in listed}
assert active.id in ids
assert candidate.id not in ids