Compare commits
8 Commits
claude/v1-
...
main
| Author | SHA1 | Date | |
|---|---|---|---|
| | c53e61eb67 | | |
| | 2b86543e6a | | |
| | 0989fed9ee | | |
| | 98bc848184 | | |
| | 69176d11c5 | | |
| | 4ca81e9b36 | | |
| | 22a37a7241 | | |
| | 2712c5d2d0 | | |
5
.gitignore
vendored
@@ -14,3 +14,8 @@ venv/
.claude/*
!.claude/commands/
!.claude/commands/**

# Editor / IDE state — user-specific, not project config
.obsidian/
.vscode/
.idea/
@@ -6,23 +6,26 @@

## Orientation

- **live_sha** (Dalidou `/health` build_sha): `775960c` (verified 2026-04-16 via /health, build_time 2026-04-16T17:59:30Z)
- **last_updated**: 2026-04-18 by Claude (Phase 7A — Memory Consolidation "sleep cycle" V1 on branch, not yet deployed)
- **main_tip**: `999788b`
- **test_count**: 533 (prior 521 + 12 new wikilink/redlink tests)
- **harness**: `17/18 PASS` on live Dalidou (p04-constraints expects "Zerodur" — retrieval content gap, not regression)
- **live_sha** (Dalidou `/health` build_sha): `2b86543` (verified 2026-04-23T15:20:53Z post-R14 deploy; status=ok)
- **last_updated**: 2026-04-23 by Claude (R14 squash-merged + deployed; Orientation refreshed)
- **main_tip**: `2b86543`
- **test_count**: 548 (547 + 1 R14 regression test)
- **harness**: `17/18 PASS` on live Dalidou (p04-constraints expects "Zerodur" — known content gap, not regression; consistent since 2026-04-19)
- **vectors**: 33,253
- **active_memories**: 84 (31 project, 23 knowledge, 10 episodic, 8 adaptation, 7 preference, 5 identity)
- **candidate_memories**: 2
- **interactions**: 234 total (192 claude-code, 38 openclaw, 4 test)
- **active_memories**: 784 (up from 84 pre-density-batch — density gate CRUSHED vs V1-A's 100-target)
- **candidate_memories**: 2 (triage queue drained)
- **interactions**: 500+ (limit=2000 query returned 500 — density batch has been running; actual may be higher, confirm via /stats next update)
- **registered_projects**: atocore, p04-gigabit, p05-interferometer, p06-polisher, atomizer-v2, abb-space (aliased p08)
- **project_state_entries**: 110 total (atocore=47, p06=19, p05=18, p04=15, abb=6, atomizer=5)
- **entities**: 35 (engineering knowledge graph, Layer 2)
- **project_state_entries**: 63 (atocore alone; full cross-project count not re-sampled this update)
- **entities**: 66 (up from 35 — V1-0 backfill + ongoing work; 0 open conflicts)
- **off_host_backup**: `papa@192.168.86.39:/home/papa/atocore-backups/` via cron, verified
- **nightly_pipeline**: backup → cleanup → rsync → OpenClaw import → vault refresh → extract → auto-triage → **auto-promote/expire (NEW)** → weekly synth/lint Sundays → **retrieval harness (NEW)** → **pipeline summary (NEW)**
- **capture_clients**: claude-code (Stop hook + cwd project inference), openclaw (before_agent_start + llm_output plugin, verified live)
- **wiki**: http://dalidou:8100/wiki (browse), /wiki/projects/{id}, /wiki/entities/{id}, /wiki/search
- **dashboard**: http://dalidou:8100/admin/dashboard (now shows pipeline health, interaction totals by client, all registered projects)
- **active_track**: Engineering V1 Completion (started 2026-04-22). V1-0 landed (`2712c5d`). V1-A density gate CLEARED (784 active ≫ 100 target as of 2026-04-23). V1-A soak gate at day 5/~7 (F4 first run 2026-04-19; nightly clean 2026-04-19 through 2026-04-23; failures confined to the known p04-constraints content gap). Plan: `docs/plans/engineering-v1-completion-plan.md`. Resume map: `docs/plans/v1-resume-state.md`.
- **last_nightly_pipeline**: `2026-04-23T03:00:20Z` — harness 17/18, triage promoted=3 rejected=7 human=0, dedup 7 clusters (1 tier1 + 6 tier2 auto-merged), graduation 30-skipped 0-graduated 0-errors, auto-triage drained the queue (0 new candidates 2026-04-22T00:52Z run)
- **open_branches**: none — R14 squash-merged as `0989fed` and deployed 2026-04-23T15:20:53Z. V1-A is the next scheduled work

## Active Plan

@@ -143,9 +146,12 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
| R11 | Codex | P2 | src/atocore/api/routes.py:773-845 | `POST /admin/extract-batch` still accepts `mode="llm"` inside the container and returns a successful 0-candidate result instead of surfacing that host-only LLM extraction is unavailable from this runtime. That is a misleading API contract for operators. | fixed | Claude | 2026-04-12 | (pending) |
| R12 | Codex | P2 | scripts/batch_llm_extract_live.py:39-190 | The host-side extractor duplicates the LLM system prompt and JSON parsing logic from `src/atocore/memory/extractor_llm.py`. It works today, but this is now a prompt/parser drift risk across the container and host implementations. | fixed | Claude | 2026-04-12 | (pending) |
| R13 | Codex | P2 | DEV-LEDGER.md:12 | The new `286 passing` test-count claim is not reproducibly auditable from the current audit environments: neither Dalidou nor the clean worktree has `pytest` available. The claim may be true in Claude's dev shell, but it remains unverified in this audit. | fixed | Claude | 2026-04-12 | (pending) |
| R14 | Codex | P2 | src/atocore/api/routes.py (POST /entities/{id}/promote) | The HTTP `POST /entities/{id}/promote` route does not translate the new service-layer `ValueError("source_refs required: cannot promote a candidate with no provenance...")` into a 400. A legacy no-provenance candidate promoted through the API currently surfaces as a 500. Does not block V1-0 acceptance; tidy in a follow-up. | fixed | Claude | 2026-04-22 | 0989fed |

## Recent Decisions

- **2026-04-22** **Wiki reorg reframed as read-only operator orientation.** The earlier `docs/plans/wiki-reorg-plan.md` (5 additions W-1…W-5 adding `/wiki/interactions`, `/wiki/memories`, project-page restructure, recent feed, topnav exposure) is **superseded** and marked as such in-tree. Successor is `docs/plans/operator-orientation-plan.md` in the `ATOCore-clean` workspace: no `/wiki` surface, no Human Mirror implementation, no new API routes, no schema changes, no new source of truth. First deliverables are docs-only: `docs/where-things-live.md` (operator map of source/staged/machine/registry/trusted-state/memories/interactions/logs/backups with explicit edit-vs-don't-edit boundaries) and `docs/operator-home.md` (short daily starting page indexing those docs + a 4-command read-only orientation sequence). README and operations.md link both. Optional future work (`project-overview` and `memory-candidates` CLI helpers over existing endpoints) is gated on the docs proving useful in practice. *Proposed by:* Antoine. *Drafted by:* Claude. *Pending Codex review.*
- **2026-04-22** **V1-0 done: approved, merged, deployed, prod backfilled.** Codex pulled `f16cd52`, re-ran the two original probes (both pass), re-ran the three targeted regression suites (all pass). Squash-merged to main as `2712c5d`. Dalidou deployed via canonical deploy script; `/health` reports build_sha=`2712c5d2d03cb2a6af38b559664afd1c4cd0e050`, status=ok. Validated backup snapshot taken at `/srv/storage/atocore/backups/snapshots/20260422T190624Z` before backfill. Prod backfill: `--dry-run` found 31 active/superseded entities with no provenance; list reviewed and sane; live run updated 31 rows via the default `hand_authored=1` flag path; follow-up dry-run returned 0 rows remaining. Residual logged as R14 (P2): `POST /entities/{id}/promote` HTTP route doesn't translate the new service-layer `ValueError` into a 400 — legacy bad candidate promotes via the API return 500 instead. Does not block V1-0 acceptance. V1-0 closed. Next: V1-A (Q-001 subsystem-scoped variant + Q-6 integration). V1-A holds until the soak window ends ~2026-04-26 and the 100-memory density target is hit. *Approved + landed by:* Codex. *Ratified by:* Antoine.
- **2026-04-22** **Engineering V1 Completion Plan — Codex sign-off (third round)**. Codex's third-round audit closed the remaining five open questions with concrete resolutions, patched inline in `docs/plans/engineering-v1-completion-plan.md`: (1) F-7 row rewritten with ground truth — schema + preserve-original + test coverage already exist (`graduated_to_entity_id` at `database.py:143-146`, `graduated` status in memory service, promote hook at `service.py:354-356,389-451`, tests at `test_engineering_v1_phase5.py:67-90`); **real gaps** are missing direct `POST /memory/{id}/graduate` route and spec's `knowledge→Fact` mismatch (no `fact` entity type exists; reconcile to `parameter` or similar); V1-E 2 → **3–4 days**; (2) Q-5 determinism reframed — don't stabilize the call to `datetime.now()`, inject regenerated timestamp + checksum as renderer inputs, remove DB iteration ordering dependencies; V1-D scope updated; (3) `project` vs `project_id` — doc note only, no rename, resolved; (4) total estimate 16.5–17.5 → **17.5–19.5 focused days** with calendar buffer on top; (5) "Minions" must not be canonized in D-3 release notes — neutral wording ("queued background processing / async workers") only. **Agreement reached**: Claude + Codex + Antoine aligned. V1-0 is ready to start once the current pipeline soak window ends (~2026-04-26) and the 100-memory density target is hit. *Patched by:* Codex. *Signed off by:* Codex ("with those edits, I'd sign off on the five questions"). *Accepted by:* Antoine. *Executor (V1-0 onwards):* Claude.
- **2026-04-22** **Engineering V1 Completion Plan revised per Codex second-round file-level audit** — three findings folded in, all with exact file:line refs from Codex: (1) F-1 downgraded from ✅ to 🟡 — `extractor_version` and `canonical_home` missing from `Entity` dataclass and `entities` table per `engineering-v1-acceptance.md:45`; V1-0 scope now adds both fields via additive migration + doc note that `project` IS `project_id` per "fields equivalent to" spec wording; (2) F-2 replaced with ground-truth per-query status: 9 of 20 v1-required queries done (Q-004/Q-005/Q-006/Q-008/Q-009/Q-011/Q-013/Q-016/Q-017), 1 partial (Q-001 needs subsystem-scoped variant), 10 missing (Q-002/003/007/010/012/014/018/019/020); V1-A scope shrank to Q-001 shape fix + Q-6 integration (pillar queries already implemented); V1-C closes the 8 remaining new queries + Q-020 deferred to V1-D; (3) F-5 reframed — generic `conflicts` + `conflict_members` schema already present at `database.py:190`, no migration needed; divergence is detector body (per-type dispatch needs generalization) + routes (`/admin/conflicts/*` needs `/conflicts/*` alias). Total revised to 16.5–17.5 days, ~60 tests. Plan: `docs/plans/engineering-v1-completion-plan.md` at commit `ce3a878` (Codex pulled clean). Three of Codex's eight open questions now answered; remaining: F-7 graduation depth, mirror determinism, `project` rename question, velocity calibration, minions naming. *Proposed by:* Claude. *Reviewed by:* Codex (two rounds).
- **2026-04-22** **Engineering V1 Completion Plan revised per Codex first-round review** — original six-phase order (queries → ingest → mirror → graduation → provenance → ops) rejected by Codex as backward: provenance-at-write (F-8) and conflict-detection hooks (F-5 minimal) must precede any phase that writes active entities. Revised to seven phases: V1-0 write-time invariants (F-8 + F-5 hooks + F-1 audit) as hard prerequisite, V1-A minimum query slice proving the model, V1-B ingest, V1-C full query catalog, V1-D mirror, V1-E graduation, V1-F full F-5 spec + ops + docs. Also softened "parallel with Now list" — real collision points listed explicitly; schedule shifted ~4 weeks to reflect that V1-0 cannot start during pipeline soak. Withdrew the "50–70% built" global framing in favor of the per-criterion gap table. Workspace sync note added: Codex's Playground workspace can't see the plan file; canonical dev tree is Windows `C:\Users\antoi\ATOCore`. Plan: `docs/plans/engineering-v1-completion-plan.md`. Awaiting Codex file-level audit once workspace syncs. *Proposed by:* Claude. *First-round review by:* Codex.
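
The Q-5 determinism reframing in the third-round sign-off entry above (inject the regenerated timestamp and checksum as renderer inputs, and remove DB iteration-order dependencies) can be illustrated with a minimal sketch. Everything named here is hypothetical: `render_mirror`, `content_checksum`, the entity dict shape, and the header comment format are assumptions for illustration, not the AtoCore renderer.

```python
import hashlib
from datetime import datetime, timezone


def content_checksum(entities: list[dict]) -> str:
    """Checksum over a canonical (id-sorted) view of the entities."""
    canonical = "\n".join(
        f"{e['id']}|{e['name']}" for e in sorted(entities, key=lambda e: e["id"])
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def render_mirror(entities: list[dict], regenerated_at: datetime, checksum: str) -> str:
    """Pure renderer: timestamp and checksum are inputs, never read from the clock."""
    lines = [f"<!-- regenerated_at={regenerated_at.isoformat()} checksum={checksum} -->"]
    # Sort explicitly so the output does not depend on DB iteration order.
    for e in sorted(entities, key=lambda e: e["id"]):
        lines.append(f"- {e['id']}: {e['name']}")
    return "\n".join(lines)


# Hypothetical rows, delivered in two different orders.
rows_a = [{"id": "e2", "name": "Mirror cell"}, {"id": "e1", "name": "Polisher"}]
rows_b = list(reversed(rows_a))
stamp = datetime(2026, 4, 22, 12, 0, tzinfo=timezone.utc)
page_a = render_mirror(rows_a, stamp, content_checksum(rows_a))
page_b = render_mirror(rows_b, stamp, content_checksum(rows_b))
```

Because the clock is never consulted inside the renderer, a golden-file test can pin `stamp` and compare output byte for byte.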
@@ -164,6 +170,10 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha

## Session Log

- **2026-04-23 Codex + Claude (R14 closed)** Codex reviewed `claude/r14-promote-400` at `3888db9`, no findings: "The route change is narrowly scoped: `promote_entity()` still returns False for not-found/not-candidate cases, so the existing 404 behavior remains intact, while caller-fixable validation failures now surface as 400." Ran `pytest tests/test_v1_0_write_invariants.py -q` from an isolated worktree: 15 passed in 1.91s. Claude squash-merged to main as `0989fed`, followed by ledger close-out `2b86543`, then deployed via canonical script. Dalidou `/health` reports build_sha=`2b86543e6ad26011b39a44509cc8df3809725171`, build_time `2026-04-23T15:20:53Z`, status=ok. R14 closed. Orientation refreshed earlier this session also reflected the V1-A gate status: **density gate CLEARED** (784 active memories vs 100 target — density batch-extract ran between 2026-04-22 and 2026-04-23 and more than crushed the gate), **soak gate at day 5 of ~7** (F4 first run 2026-04-19; nightly clean 2026-04-19 through 2026-04-23; only chronic failure is the known p04-constraints "Zerodur" content gap). V1-A branches from a clean V1-0 baseline as soon as the soak is called done.
- **2026-04-22 Codex + Antoine (V1-0 closed)** Codex approved `f16cd52` after re-running both original probes (legacy-candidate promote + supersede hook — both correct) and the three targeted regression suites (`test_v1_0_write_invariants.py`, `test_engineering_v1_phase5.py`, `test_inbox_crossproject.py` — all pass). Squash-merged to main as `2712c5d` ("feat(engineering): enforce V1-0 write invariants"). Deployed to Dalidou via the canonical deploy script; `/health` build_sha=`2712c5d2d03cb2a6af38b559664afd1c4cd0e050` status=ok. Validated backup snapshot at `/srv/storage/atocore/backups/snapshots/20260422T190624Z` taken BEFORE prod backfill. Prod backfill of `scripts/v1_0_backfill_provenance.py` against live DB: dry-run found 31 active/superseded entities with no provenance, list reviewed and looked sane; live run with default `hand_authored=1` flag path updated 31 rows; follow-up dry-run returned 0 rows remaining → no lingering F-8 violations in prod. Codex logged one residual P2 (R14): HTTP `POST /entities/{id}/promote` route doesn't translate the new service-layer `ValueError` into 400 — legacy bad candidate promoted through the API surfaces as 500. Not blocking. V1-0 closed. **Gates for V1-A**: soak window ends ~2026-04-26; 100-active-memory density target (currently 84 active + the ~31 newly flagged ones — need to check how those count in density math). V1-A holds until both gates clear.
- **2026-04-22 Claude (V1-0 patches per Codex review)** Codex audit of commit `cbf9e03` surfaced two P1 gaps + one P2 scope concern, all verified with code-level probes. **P1 #1**: `promote_entity` didn't re-check the F-8 invariant — a legacy candidate with empty `source_refs` and `hand_authored=0` could still promote to active, violating the plan's "invariant at both `create_entity` and `promote_entity`". Fixed: `promote_entity` at `service.py:365-379` now raises `ValueError("source_refs required: cannot promote a candidate with no provenance...")` before flipping status. Stays symmetric with the create-side error. **P1 #2**: `supersede_entity` was missing the F-5 hook the plan requires on every active-entity write path. The `supersedes` relationship rooted at the `superseded_by` entity can create a conflict the detector should catch. Fixed at `service.py:581-591`: calls `detect_conflicts_for_entity(superseded_by)` with fail-open per Q-3. **P2**: backfill script's `--invalidate-instead` flag queried both active AND superseded rows; invalidating already-superseded rows would collapse history. Fixed at `scripts/v1_0_backfill_provenance.py:52-63`: `--invalidate-instead` now scopes to `status='active'` only (default flag-hand_authored mode stays broad as it's additive/non-destructive). Help text tightened to make the destructive posture explicit. **Four new regression tests** in `test_v1_0_write_invariants.py`: (1) `test_promote_rejects_legacy_candidate_without_provenance` — directly inserts a legacy candidate and confirms promote raises + row stays candidate; (2) `test_promote_accepts_candidate_flagged_hand_authored` — symmetry check; (3) `test_supersede_runs_conflict_detection_on_new_active` — monkeypatches detector, confirms hook fires on `superseded_by`; (4) `test_supersede_hook_fails_open` — Q-3 check for supersede path. **Test count**: 543 → 547 (+4 regression). Full suite `547 passed in 81.07s`. Next: commit patches on branch, push, Codex re-review.
- **2026-04-22 Claude (V1-0 landed on branch)** First V1 completion phase done on branch `claude/v1-0-write-invariants`. **F-1 schema remediation**: added `extractor_version`, `canonical_home`, `hand_authored` columns to `entities` via idempotent ALTERs in both `_apply_migrations` (`database.py:148-170`) and `init_engineering_schema` (`service.py:95-139`). CREATE TABLE also updated so fresh DBs get the columns natively. New `_table_exists` helper at `database.py:378`. `Entity` dataclass gains the three fields with sensible defaults. `EXTRACTOR_VERSION = "v1.0.0"` module constant at top of `service.py`. `_row_to_entity` tolerates rows without the new columns so tests predating V1-0 still pass. **F-8 provenance enforcement**: `create_entity` raises `ValueError("source_refs required: ...")` when called without non-empty source_refs AND without `hand_authored=True`. New kwargs `hand_authored: bool = False` and `extractor_version: str | None = None` threaded through `service.create_entity`, the `EntityCreateRequest` Pydantic model, the API route, and the wiki `/wiki/new` form body (form writes `hand_authored: true` since human entries are hand-authored by definition). **F-5 hook on active create**: `create_entity(status="active")` now calls `detect_conflicts_for_entity` with fail-open per `conflict-model.md:256` (errors log warning, write still succeeds). The promote path's existing hook at `service.py:400-404` was kept as-is. **Doc note** added to `engineering-ontology-v1.md` recording that `project` IS the `project_id` per "fields equivalent to" wording. **Backfill script** at `scripts/v1_0_backfill_provenance.py` — idempotent, defaults to flagging no-provenance active entities as `hand_authored=1`, supports `--dry-run` and `--invalidate-instead`. 
**Tests**: 10 new in `tests/test_v1_0_write_invariants.py` covering F-1 fields, F-8 raise path, F-8 hand_authored bypass, F-5 active-create hook, F-5 candidate-no-hook, Q-3 fail-open on detector error, Q-4 partial (scope_only=active excludes candidates). **Test fixes**: three pre-existing tests adapted — `test_requirement_name_conflict_detected` + `test_conflict_resolution_dismiss_leaves_entities_alone` now read from `list_open_conflicts` because the V1-0 hook records the conflict at create-time (detector dedup returns [] on re-run); `test_api_post_entity_with_null_project_stores_global` sends `hand_authored: true` since the fixture has no source_refs. **conftest.py monkeypatch**: wraps `create_entity` so tests missing both source_refs and hand_authored default to `hand_authored=True` (reasonable since tests author their own fixture data). Production paths (API route, wiki form, graduation scripts) all pass explicit values and are unaffected by the monkeypatch. **Test count**: 533 → 543 (+10), full suite `543 passed in 77.86s`. **Not yet**: commit + push + Codex review + deploy. **Branch**: `claude/v1-0-write-invariants`.
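
The idempotent `ALTER` approach described in the V1-0 entry above can be sketched as follows. This assumes a SQLite backend, which the log does not state explicitly, and `add_column_if_missing` is a hypothetical helper rather than the project's `_apply_migrations`. Only the three column names come from the entry; the column types are guesses.

```python
import sqlite3


def add_column_if_missing(conn: sqlite3.Connection, table: str, column: str, ddl: str) -> bool:
    """Add a column only when absent, so re-running the migration is a no-op."""
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl}")
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (id TEXT PRIMARY KEY, name TEXT)")

# Column names from the V1-0 entry; the types are assumptions.
columns = [
    ("extractor_version", "TEXT"),
    ("canonical_home", "TEXT"),
    ("hand_authored", "INTEGER NOT NULL DEFAULT 0"),
]
first_run = [add_column_if_missing(conn, "entities", c, ddl) for c, ddl in columns]
second_run = [add_column_if_missing(conn, "entities", c, ddl) for c, ddl in columns]
```

Running the same migration twice is the property worth testing: the first pass adds every column, the second changes nothing.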
@@ -1,6 +1,31 @@
# AtoCore — Current State (2026-04-19)
# AtoCore — Current State (2026-04-22)

Live deploy: `877b97e` · Dalidou health: ok · Harness: 17/18.
Live deploy: `2712c5d` · Dalidou health: ok · Harness: 17/18 · Tests: 547 passing.

## V1-0 landed 2026-04-22

Engineering V1 completion track has started. **V1-0 write-time invariants**
merged and deployed: F-1 shared-header fields (`extractor_version`,
`canonical_home`, `hand_authored`) added to `entities`, F-8 provenance
enforcement at both `create_entity` and `promote_entity`, F-5 synchronous
conflict-detection hook on every active-entity write path (create, promote,
supersede) with Q-3 fail-open. Prod backfill ran cleanly — 31 legacy
active/superseded entities flagged `hand_authored=1`, follow-up dry-run
returned 0 remaining rows. Test count 533 → 547 (+14).
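
As a rough illustration of the two write-time invariants named above, the sketch below pairs an F-8 style provenance check with an F-5 style hook that fails open. It is a toy model under stated assumptions: the in-memory `STORE`, the always-failing `detect_conflicts`, and this `create_entity` signature are hypothetical stand-ins, not the code in `service.py`.

```python
import logging

logger = logging.getLogger("v1_0_sketch")
STORE: dict[str, dict] = {}  # hypothetical in-memory stand-in for the entities table


def detect_conflicts(entity_id: str) -> list[str]:
    """Stand-in detector that always fails, to exercise the fail-open path."""
    raise RuntimeError("detector unavailable")


def create_entity(entity_id, source_refs=None, hand_authored=False, status="candidate"):
    # F-8: reject writes with no provenance unless explicitly hand-authored.
    if not source_refs and not hand_authored:
        raise ValueError("source_refs required: entity has no provenance")
    STORE[entity_id] = {
        "status": status,
        "source_refs": list(source_refs or []),
        "hand_authored": hand_authored,
    }
    # F-5 hook, fail-open: a detector error is logged and the write still stands.
    if status == "active":
        try:
            detect_conflicts(entity_id)
        except Exception as exc:
            logger.warning("conflict detection failed for %s: %s", entity_id, exc)
    return STORE[entity_id]
```

The ordering matters in this sketch: the provenance check runs before anything is stored, while the conflict hook runs after, so a detector outage can never block or roll back a valid write.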

R14 (P2, non-blocking): `POST /entities/{id}/promote` route fix translates
the new `ValueError` into 400. Branch `claude/r14-promote-400` pending
Codex review + squash-merge.

**Next in the V1 track:** V1-A (minimal query slice + Q-6 killer-correctness
integration). Gated on pipeline soak (~2026-04-26) + 100+ active memory
density target. See `docs/plans/engineering-v1-completion-plan.md` for
the full 7-phase roadmap and `docs/plans/v1-resume-state.md` for the
"you are here" map.

---

## Snapshot from previous update (2026-04-19)

## The numbers

@@ -168,16 +168,40 @@ These are the current practical priorities.
"Zerodur" for p04 constraint queries. Investigate if it's a missing
memory or retrieval ranking issue.

## Active — Engineering V1 Completion Track (started 2026-04-22)

The Engineering V1 sprint moved from **Next** to **Active** on 2026-04-22.
The discovery from the gbrain review was that V1 entity infrastructure
had been built incrementally already; the sprint is a **completion** plan
against `engineering-v1-acceptance.md`, not a greenfield build. Full plan:
`docs/plans/engineering-v1-completion-plan.md`. "You are here" single-page
map: `docs/plans/v1-resume-state.md`.

Seven phases, ~17.5–19.5 focused days, runs in parallel with the Now list
where surfaces are disjoint, pauses when they collide.

| Phase | Scope | Status |
|---|---|---|
| V1-0 | Write-time invariants: F-1 header fields + F-8 provenance enforcement + F-5 hook on every active-entity write + Q-3 flag-never-block | ✅ done 2026-04-22 (`2712c5d`) |
| V1-A | Minimum query slice: Q-001 subsystem-scoped variant + Q-6 killer-correctness integration test on p05-interferometer | 🟡 gated — starts when soak (~2026-04-26) + density (100+ active memories) gates clear |
| V1-B | KB-CAD + KB-FEM ingest (`POST /ingest/kb-cad/export`, `POST /ingest/kb-fem/export`) + D-2 schema docs | pending V1-A |
| V1-C | Close the remaining 8 queries (Q-002/003/007/010/012/014/018/019; Q-020 to V1-D) | pending V1-B |
| V1-D | Full mirror surface (3 spec routes + regenerate + determinism + disputed + curated markers) + Q-5 golden file | pending V1-C |
| V1-E | Memory→entity graduation end-to-end + remaining Q-4 trust tests | pending V1-D (note: collides with memory extractor; pauses for multi-model triage work) |
| V1-F | F-5 detector generalization + route alias + O-1/O-2/O-3 operational + D-1/D-3/D-4 docs | finish line |

R14 (P2, non-blocking): `POST /entities/{id}/promote` route returns 500
on the new V1-0 `ValueError` instead of 400. Fix on branch
`claude/r14-promote-400`, pending Codex review.

## Next

These are the next major layers after the current stabilization pass.
These are the next major layers after V1 and the current stabilization pass.

1. Phase 6 AtoDrive — clarify Google Drive as a trusted operational
   source and ingest from it
2. Phase 13 Hardening — Chroma backup policy, monitoring, alerting,
   failure visibility beyond log files
3. Engineering V1 implementation sprint — once knowledge density is
   sufficient and the pipeline feels boring and dependable

## Later

161
docs/plans/v1-resume-state.md
Normal file
@@ -0,0 +1,161 @@
# V1 Completion — Resume State

**Last updated:** 2026-04-22 (after V1-0 landed + R14 branch pushed)
**Purpose:** single-page "you are here" so any future session can pick up
the V1 completion sprint without re-reading the full plan history.

## State of play

- **V1-0 is DONE.** Merged to main as `2712c5d`, deployed to Dalidou,
  prod backfill ran cleanly (31 legacy entities flagged
  `hand_authored=1`, zero violations remaining).
- **R14 is on a branch.** `claude/r14-promote-400` at `3888db9` —
  HTTP promote route returns 400 instead of 500 on V1-0 `ValueError`.
  Pending Codex review + squash-merge. Non-blocking for V1-A.
- **V1-A is next but GATED.** Doesn't start until both gates clear.
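
The R14 behaviour listed above (a service-layer `ValueError` surfacing as 400 rather than 500, with not-found and not-candidate cases staying 404) can be sketched without any web framework. This is an assumption-laden toy: `promote_route`, the `(status_code, body)` return shape, the response bodies, and the dict-backed store are all hypothetical and do not reproduce the handler in `routes.py`.

```python
def promote_entity(entity_id: str, store: dict) -> bool:
    """Hypothetical service layer: False for not-found/not-candidate, ValueError for bad input."""
    row = store.get(entity_id)
    if row is None or row["status"] != "candidate":
        return False
    if not row["source_refs"] and not row["hand_authored"]:
        raise ValueError("source_refs required: cannot promote a candidate with no provenance")
    row["status"] = "active"
    return True


def promote_route(entity_id: str, store: dict) -> tuple[int, dict]:
    """Hypothetical route handler returning (status_code, body)."""
    try:
        promoted = promote_entity(entity_id, store)
    except ValueError as exc:
        # Caller-fixable validation failure: report 400, not an unhandled 500.
        return 400, {"detail": str(exc)}
    if not promoted:
        return 404, {"detail": "entity not found or not a candidate"}
    return 200, {"status": "promoted", "id": entity_id}
```

Catching only `ValueError` keeps genuinely unexpected errors visible as server faults while giving callers a clear signal for inputs they can fix.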

## Start-gates for V1-A

| Gate | Condition | Status as of 2026-04-22 |
|---|---|---|
| Soak | Four clean nightly cycles since F4 confidence-decay first real run 2026-04-19 | Day 3 of 4 — expected clear around **2026-04-26** |
| Density | 100+ active memories | 84 active as of last ledger update — need +16. Lever: `scripts/batch_llm_extract_live.py` against 234-interaction backlog |

**When both are green, start V1-A.** If only one is green, hold.
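
The two-gate rule above reduces to a simple conjunction. The sketch below is illustrative only: `GateStatus` and `v1a_can_start` are hypothetical names, and the default thresholds are taken from the gate table (four clean nightly cycles, 100 active memories).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateStatus:
    clean_nightly_cycles: int  # consecutive clean nightly runs since the soak began
    active_memories: int       # current active-memory count


def v1a_can_start(status: GateStatus, required_cycles: int = 4, density_target: int = 100) -> bool:
    """Both gates must be green; one green gate alone means hold."""
    soak_green = status.clean_nightly_cycles >= required_cycles
    density_green = status.active_memories >= density_target
    return soak_green and density_green


# Values from the table as of 2026-04-22: day 3 of 4, 84 active memories.
as_of_0422 = GateStatus(clean_nightly_cycles=3, active_memories=84)
```

With the 2026-04-22 figures both gates are still red, so the function returns `False`.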

## Pre-flight checklist when resuming

Before opening the V1-A branch, run through this in order:

1. `git checkout main && git pull` — make sure you're at the tip
2. Check `DEV-LEDGER.md` **Orientation** for current `live_sha`, `test_count`, `active_memories`
3. Check `/health` on Dalidou returns the same `build_sha` as Orientation
4. Check the dashboard for pipeline health: http://dalidou:8100/admin/dashboard
5. Confirm R14 branch status — either merged or explicitly deferred
6. Re-read the two core plan docs:
   - `docs/plans/engineering-v1-completion-plan.md` — the full 7-phase plan
   - `docs/architecture/engineering-v1-acceptance.md` — the acceptance contract
7. Skim the relevant spec docs for the phase you're about to start:
   - V1-A: `engineering-query-catalog.md` (Q-001 + Q-006/Q-009/Q-011 killer queries)
   - V1-B: `tool-handoff-boundaries.md` (KB-CAD/KB-FEM export shapes)
   - V1-C: `engineering-query-catalog.md` (all remaining v1-required queries)
   - V1-D: `human-mirror-rules.md` (mirror spec end-to-end)
   - V1-E: `memory-vs-entities.md` (graduation flow)
   - V1-F: `conflict-model.md` (generic slot-key detector)
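
Step 3 of the checklist above could be scripted along these lines. The ledger records a short SHA while `/health` reports the full 40-character one, so the comparison is a prefix match. This is a sketch under assumptions: it presumes `/health` is served on the same host and port as the dashboard and returns JSON with a `build_sha` key, and both function names are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_build_sha(base_url: str = "http://dalidou:8100") -> str:
    """Read build_sha from /health (assumes a JSON body with a build_sha key)."""
    with urlopen(f"{base_url}/health", timeout=5) as resp:
        return json.load(resp)["build_sha"]


def sha_matches(ledger_sha: str, live_sha: str) -> bool:
    """True when the ledger's short SHA is a prefix of the live full SHA."""
    ledger_sha = ledger_sha.strip().lower()
    live_sha = live_sha.strip().lower()
    # Require at least 7 characters so a trivially short prefix cannot match.
    return len(ledger_sha) >= 7 and live_sha.startswith(ledger_sha)
```

On the deployment host, `sha_matches("2b86543", fetch_build_sha())` would perform the live check; the comparison itself can be exercised offline with the SHAs recorded in the ledger.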

## What V1-A looks like when started

**Branch:** `claude/v1-a-pillar-queries`

**Scope (~1.5 days):**
- **Q-001 shape fix.** Add a subsystem-scoped variant of `system_map()`
  matching `GET /entities/Subsystem/<id>?expand=contains` per
  `engineering-query-catalog.md:71`. The project-wide version stays
  (it serves Q-004).
- **Q-6 integration test.** Seed p05-interferometer with five cases:
  1 satisfying Component, 1 orphan Requirement, 1 Decision on flagged
  Assumption, 1 supported ValidationClaim, 1 unsupported ValidationClaim.
  One test asserting Q-006 / Q-009 / Q-011 return exactly the expected
  members.
- The four "pillar" queries (Q-001, Q-005, Q-006, Q-017) already work
  per Codex's 2026-04-22 audit. V1-A does NOT re-implement them —
  V1-A verifies them on seeded data.

**Acceptance:** Q-001 subsystem-scoped variant + Q-6 integration test both
green. F-2 moves from 🟡 partial to slightly-less-partial.

**Estimated tests added:** ~4 (not ~12 — V1-A scope shrank after Codex
confirmed most queries already work).

## Map of the remaining phases

```
V1-0  ✅  write-time invariants            landed 2026-04-22 (2712c5d)
  ↓
V1-A  🟡  minimum query slice              gated on soak + density (~1.5d when started)
  ↓
V1-B      KB-CAD/KB-FEM ingest + D-2       ~2d
  ↓
V1-C      close 8 remaining queries        ~2d
  ↓
V1-D      full mirror + determinism        ~3-4d (biggest phase)
  ↓
V1-E      graduation + trust tests         ~3-4d (pauses for multi-model triage)
  ↓
V1-F      F-5 generalization + ops + docs  ~3d — V1 done
```

## Parallel work that can run WITHOUT touching V1

These are genuinely disjoint surfaces; pick any of them during the gate
pause or as scheduling allows:

- **Density batch-extract** — *required* to unblock V1-A. Not optional.
- **p04-constraints harness fix** — retrieval-ranking change, fully
  disjoint from entities. Safe to do anywhere in the V1 track.
- **Multi-model triage (Phase 11 entry)** — memory-side work, disjoint
  from V1-A/B/C/D. **Pause before V1-E starts** because V1-E touches
  memory module semantics.

## What NOT to do

- Don't start V1-A until both gates are green.
- Don't touch the memory extractor write path while V1-E is open.
- Don't name the rejected "Minions" plan in any doc — neutral wording
  only ("queued background processing / async workers") per Codex
  sign-off.
- Don't rename the `project` field to `project_id` — Codex + Antoine
  agreed it stays as `project`, with a doc note in
  `engineering-ontology-v1.md` that this IS the project_id per spec.

## Open review findings

| id | severity | summary | status |
|---|---|---|---|
| R14 | P2 | `POST /entities/{id}/promote` returns 500 on V1-0 `ValueError` instead of 400 | fixed on branch `claude/r14-promote-400`, pending Codex review |

Closed V1-0 findings: P1 "promote path allows provenance-less legacy
candidates" (service.py:365-379), P1 "supersede path missing F-5 hook"
(service.py:581-591), P2 "`--invalidate-instead` backfill too broad"
(v1_0_backfill_provenance.py:52-63). All three patched and approved in
the squash-merge to `2712c5d`.

## How agreement between Claude + Codex has worked so far

Three review rounds before V1-0 started + three during implementation:

1. **Rejection round.** Claude drafted a gbrain-inspired "Phase 8
   Minions + typed edges" plan; Codex rejected as wrong-packaging.
   Record: `docs/decisions/2026-04-22-gbrain-plan-rejection.md`.
2. **Completion-plan rewrite.** Claude rewrote against
   `engineering-v1-acceptance.md`. Codex first-round review fixed the
   phase order (provenance-first).
3. **Per-file audit.** Codex's second-round audit found F-1 / F-2 /
   F-5 gaps, all folded in.
4. **Sign-off round.** Codex's third-round review resolved the five
   remaining open questions inline and signed off: *"with those edits,
   I'd sign off on the five questions."*
5. **V1-0 review.** Codex found two P1 gaps (promote re-check missing,
   supersede hook missing) + one P2 (backfill scope too broad). All
   three patched. Codex re-ran probes + regression suites, approved,
   squash-merged.
6. **V1-0 deploy + prod backfill.** Codex deployed + ran backfill,
   logged R14 as P2 residual.

Protocol has been: Claude writes, Codex audits, human Antoine ratifies.
Continue this for V1-A onward.

## References

- `docs/plans/engineering-v1-completion-plan.md` — full 7-phase plan
- `docs/decisions/2026-04-22-gbrain-plan-rejection.md` — prior rejection
- `docs/architecture/engineering-ontology-v1.md` — V1 ontology (18 predicates)
- `docs/architecture/engineering-query-catalog.md` — Q-001 through Q-020 spec
- `docs/architecture/engineering-v1-acceptance.md` — F/Q/O/D acceptance table
- `docs/architecture/promotion-rules.md` — candidate → active flow
- `docs/architecture/conflict-model.md` — F-5 spec
- `docs/architecture/human-mirror-rules.md` — V1-D spec
- `docs/architecture/memory-vs-entities.md` — V1-E spec
- `docs/architecture/tool-handoff-boundaries.md` — V1-B KB-CAD/KB-FEM
- `docs/master-plan-status.md` — Now / Active / Next / Later
- `DEV-LEDGER.md` — Orientation + Open Review Findings + Session Log
216
docs/plans/wiki-reorg-plan.md
Normal file
@@ -0,0 +1,216 @@
# Wiki Reorg Plan — Human-Readable Navigation of AtoCore State

> **SUPERSEDED — 2026-04-22.** Do not implement this plan. It has been
> replaced by the read-only operator orientation work at
> `docs/plans/operator-orientation-plan.md` (in the `ATOCore-clean`
> workspace). The successor reframes the problem as orientation over
> existing APIs and docs rather than a new `/wiki` surface, and drops
> interaction browser / memory index / project-page restructure from
> scope. Nothing below should be picked up as a work item. Kept in tree
> for context only.

**Date:** 2026-04-22
**Author:** Claude (after walking the live wiki at `http://dalidou:8100/wiki`
and reading `src/atocore/engineering/wiki.py` + `/wiki/*` routes in
`src/atocore/api/routes.py`)
**Status:** SUPERSEDED (see banner above). Previously: Draft, pending Codex review
**Scope boundary:** This plan does NOT touch Inbox / Global / Emerging.
Those three surfaces stay exactly as they are today — the registration
flow and scope semantics around them are load-bearing and out of scope
for this reorg.

---

## Position

The wiki is Layer 3 of the engineering architecture (Human Mirror —
see `docs/architecture/engineering-knowledge-hybrid-architecture.md`).
It is a **derived view** over the same database the LLM reads via
`atocore_context` / `atocore_search`. There is no parallel store; if a
fact is in the DB it is reachable from the wiki.

The user-facing problem is not representation, it is **findability and
structure**. Content the operator produced (proposals, decisions,
constraints, captured conversations) is in the DB but is not reachable
along a natural human reading path. Today the only routes to it are:

- typing the right keyword into `/wiki/search`
- remembering which project it lived under and scrolling the auto-
  generated mirror markdown on `/wiki/projects/{id}`
- the homepage "What the brain is doing" strip, which shows counts of
  audit actions but no content

There is no "recent conversations" view, no memory index, no way to
browse by memory type (decision vs preference vs episodic), and the
topnav only exposes Home / Activity / Triage / Dashboard.

## Goal

Make the wiki a surface the operator actually uses to answer three
recurring questions:

1. **"Where did I say / decide / propose X?"** — find a memory or
   interaction by topic, not by exact-string search.
2. **"What is the current state of project Y?"** — read decisions,
   constraints, and open questions as distinct sections, not as one
   wall of auto-generated markdown.
3. **"What happened in the system recently, and what does it mean?"**
   — a human-meaningful recent feed (captures, promotions, decisions
   recorded), not a count of audit-action names.

Success criterion: for each of the three questions above, a first-time
visitor reaches a useful answer in **≤ 2 clicks from `/wiki`**.

## Non-goals

- No schema changes. All data already exists.
- No change to ingestion, extraction, promotion, or the trust rules.
- No change to Inbox / Global / Emerging surfaces or semantics.
- No change to `/admin/triage` or `/admin/dashboard`.
- No change to the `atocore_context` / `atocore_search` LLM-facing APIs.
- No new storage. Every new page is a new render function over existing
  service-layer calls.

## What it will do

Five additions, in priority order. Each is independently mergeable and
independently revertable.

### W-1 — Interaction browser: `/wiki/interactions`

The DB holds 234 captured interactions (per ledger `2026-04-22`). None
are readable in the wiki. Add a paginated list view and a detail view.

- `/wiki/interactions?project=&client=&since=` — list, newest first,
  showing timestamp, client (claude-code / openclaw / test), inferred
  project, and the first ~160 chars of the user prompt.
- `/wiki/interactions/{id}` — full prompt + response, plus any
  candidate / active memories extracted from it (back-link from the
  memory → interaction relation if present, else a "no extractions"
  note).
- Filters: project (multi), client, date range, "has extractions" boolean.

Answers Q1 ("where did I say X") by giving the user a time-ordered
stream of their own conversations, not just extracted summaries.

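A minimal sketch of the list behaviour described in W-1 (filter, newest first, paginate, truncated prompt preview). The `Interaction` record shape, `list_interactions`, and `preview` are hypothetical names for illustration and are not taken from the AtoCore codebase; the sketch works over in-memory rows rather than the real service layer.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Interaction:
    # Hypothetical record shape; field names are illustrative only.
    id: str
    created_at: datetime
    client: str
    project: str
    prompt: str


def preview(text: str, limit: int = 160) -> str:
    """Collapse whitespace and cut the prompt to at most `limit` characters."""
    flat = " ".join(text.split())
    if len(flat) <= limit:
        return flat
    return flat[: limit - 1].rstrip() + "…"


def list_interactions(
    rows: Iterable[Interaction],
    project: Optional[str] = None,
    client: Optional[str] = None,
    since: Optional[datetime] = None,
    page: int = 1,
    per_page: int = 50,
) -> List[Interaction]:
    """Apply the optional filters, sort newest first, return one page."""
    selected = [
        r
        for r in rows
        if (project is None or r.project == project)
        and (client is None or r.client == client)
        and (since is None or r.created_at >= since)
    ]
    selected.sort(key=lambda r: r.created_at, reverse=True)
    start = (page - 1) * per_page
    return selected[start : start + per_page]
```

Treating every filter as optional keeps the unfiltered URL (`/wiki/interactions`) and the filtered ones on a single code path.
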
### W-2 — Memory index: `/wiki/memories`

Today `/wiki/memories/{id}` exists but there is no list view. Add one.

- `/wiki/memories?project=&type=&status=&tag=` — filterable list.
  Status defaults to `active`. Types: decision, preference, episodic,
  knowledge, identity, adaptation.
- Faceted counts in the sidebar (e.g. "decision: 14 · preference: 7").
- Each row links to `/wiki/memories/{id}` and, where present, the
  originating interaction.

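The faceted sidebar counts in W-2 could be computed along these lines. This is a sketch under assumptions: memories are represented as plain mappings with `type` and `status` keys, and `facet_counts` / `facet_label` are hypothetical helpers, not existing AtoCore functions.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping

# Memory types as listed in the plan; the ordering here is only for display.
MEMORY_TYPES = ("decision", "preference", "episodic", "knowledge", "identity", "adaptation")


def facet_counts(memories: Iterable[Mapping[str, str]], status: str = "active") -> Dict[str, int]:
    """Count memories per type, restricted to one status (default: active)."""
    counts = Counter(m["type"] for m in memories if m["status"] == status)
    return {t: counts.get(t, 0) for t in MEMORY_TYPES}


def facet_label(counts: Mapping[str, int]) -> str:
    """Render non-empty facets in the 'decision: 14 · preference: 7' style."""
    return " · ".join(f"{t}: {n}" for t, n in counts.items() if n)
```

Returning a zero for every known type keeps the sidebar stable even when a type has no rows for the current filter.
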
### W-3 — Project page restructure (tabbed, not flat)

`/wiki/projects/{id}` today renders the full mirror markdown in one
scroll. Restructure to tabs (or anchored sections, same thing
semantically) over the data already produced by
`generate_project_overview`:

- **Overview** — stage, client, type, description, headline state.
- **Decisions** — memories of type `decision` for this project.
- **Constraints** — project_state entries in the `constraints` category.
- **Proposals** — memories tagged `proposal` or type `preference` with
  proposal semantics (exact filter TBD during W-3; see Open Questions).
- **Entities** — current list, already rendered inline today.
- **Memories** — all memories, reusing the W-2 list component filtered
  to this project.
- **Timeline** — interactions for this project, reusing W-1 filtered
  to this project.

No new data is produced; this is purely a slicing change.

### W-4 — Recent feed on homepage

Replace the current "What the brain is doing" strip (which shows audit
action counts) with a **content** feed: the last N events that a human
cares about —

- new active memory (not candidate)
- memory promoted candidate → active
- state entry added / changed
- new registered project
- interaction captured (optional, may be noisy — gate behind a toggle)

Each feed row links to the thing it describes. The audit-count strip
can move to `/wiki/activity` where the full audit timeline already lives.

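One way to read the W-4 feed rule is as a filter over a stream of events with an opt-in for the noisy kind. The event kind strings, the `Event` tuple, and `recent_feed` below are all assumptions made for this sketch; the plan itself does not name audit actions.

```python
from datetime import datetime
from typing import Iterable, List, NamedTuple


class Event(NamedTuple):
    at: datetime
    kind: str    # hypothetical kind label, see the sets below
    target: str  # link to the thing the row describes


# Hypothetical labels for the event kinds listed in W-4.
FEED_KINDS = frozenset(
    {"memory_activated", "memory_promoted", "state_changed", "project_registered"}
)
NOISY_KINDS = frozenset({"interaction_captured"})


def recent_feed(
    events: Iterable[Event], limit: int = 20, include_interactions: bool = False
) -> List[Event]:
    """Keep only human-meaningful events, newest first, capped at `limit`."""
    allowed = (FEED_KINDS | NOISY_KINDS) if include_interactions else FEED_KINDS
    kept = [e for e in events if e.kind in allowed]
    kept.sort(key=lambda e: e.at, reverse=True)
    return kept[:limit]
```

Keeping the toggle as a function argument means the default homepage and the "show captures" variant differ only in one flag.
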
### W-5 — Topnav exposure

Surface the new pages in the topnav so they are discoverable:

Current: `🏠 Home · 📡 Activity · 🔀 Triage · 📊 Dashboard`
Proposed: `🏠 Home · 💬 Interactions · 🧠 Memories · 📡 Activity · 🔀 Triage · 📊 Dashboard`

Domain pages (`/wiki/domains/{tag}`) stay reachable via tag chips on
memory rows, as today — no new topnav entry for them.

## Outcome

- Every captured interaction is readable in the wiki, with a clear
  path from interaction → extracted memories and back.
- Memories are browsable without knowing a keyword in advance.
- A project page separates decisions, constraints, and proposals
  instead of interleaving them in auto-generated markdown.
- The homepage tells the operator what **content** is new, not which
  audit action counters incremented.
- Nothing in the Inbox / Global / Emerging flow changes.
- Nothing the LLM reads changes.

## Non-outcomes (explicitly)

- This plan does not improve retrieval quality. It does not touch the
  extractor, ranking, or harness.
- This plan does not change what is captured or when.
- This plan does not replace `/admin/triage`. Candidate review stays there.

## Sequencing

1. **W-1 first.** Interaction browser is the biggest findability win
   and has no coupling to memory code.
2. **W-2 next.** Memory index reuses list/filter infra from W-1.
3. **W-3** depends on W-2 (reuses the memory list component).
4. **W-4** is independent; can land any time after W-1.
5. **W-5** lands last so topnav only exposes pages that exist.

Each step ships with at least one render test (HTML contains expected
anchors / row counts against a seeded fixture), following the pattern
already in `tests/engineering/` for existing wiki renders.

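A render test of the kind described above (expected anchors and row counts against seeded data) might look roughly like this. The existing tests in `tests/engineering/` are not shown in this document, so the structure here is an assumption: `render_rows` is a hypothetical stand-in for a wiki render function, and `summarize` uses only the standard-library `html.parser`.

```python
from html.parser import HTMLParser
from typing import Iterable, List, Tuple


class RowAndLinkCounter(HTMLParser):
    """Count <tr> rows and collect <a href> targets from rendered HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.rows = 0
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def render_rows(ids: Iterable[str]) -> str:
    # Hypothetical stand-in for a render function over seeded fixture ids.
    body = "".join(
        f'<tr><td><a href="/wiki/interactions/{i}">{i}</a></td></tr>' for i in ids
    )
    return f"<table>{body}</table>"


def summarize(html: str) -> Tuple[int, List[str]]:
    """Return (row count, list of link targets) for a rendered page."""
    parser = RowAndLinkCounter()
    parser.feed(html)
    parser.close()
    return parser.rows, parser.hrefs
```

Asserting on parsed structure rather than raw substrings makes the test less sensitive to whitespace and attribute ordering in the rendered HTML.
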
## Risk & reversibility

- All additions are new routes / new render functions. Reverting any
  step is a file delete + a topnav edit.
- Project page restructure (W-3) is the only edit to an existing
  surface. Keep the flat-markdown render behind a query flag
  (`?layout=flat`) for one release so regressions are observable.
- No DB migrations. No service-layer signatures change.

## Open questions for Codex

1. **"Proposal" as a first-class filter** (W-3) — is this a memory type,
   a domain tag, a structural field we should add, or should it stay
   derived by filter? Current DB has no explicit proposal type; we'd be
   inferring from tags/content. If that inference is unreliable, W-3's
   Proposals tab becomes noise.
2. **Interaction → memory back-link** (W-1) — does the current schema
   already record which interaction an extracted memory came from? If
   not, is exposing that link in the wiki worth a schema addition, or
   should W-1 ship without it?
3. **Recent feed noise floor** (W-4) — should every captured
   interaction appear in the feed, or only interactions that produced
   at least one candidate memory? The former is complete but may drown
   out signal at current capture rates (~10/day).
4. **Ordering vs V1 Completion track** — should any of this land before
   V1-A (currently gated on soak ~2026-04-26 + density 100+), or is it
   strictly after V1 closes?

## Workspace note

Canonical dev workspace is `C:\Users\antoi\ATOCore` (per `CLAUDE.md`).
Any Codex audit of this plan should sync from `origin/main` at or after
`2712c5d` before reviewing.
@@ -2187,12 +2187,17 @@ def api_promote_entity(
    from atocore.engineering.service import promote_entity
    target_project = req.target_project if req is not None else None
    note = req.note if req is not None else ""
    try:
        success = promote_entity(
            entity_id,
            actor="api-http",
            note=note,
            target_project=target_project,
        )
    except ValueError as e:
        # V1-0 F-8 re-check raises ValueError for no-provenance candidates
        # (see service.promote_entity). Surface as 400, not 500.
        raise HTTPException(status_code=400, detail=str(e))
    if not success:
        raise HTTPException(status_code=404, detail=f"Entity not found or not a candidate: {entity_id}")
    result = {"status": "promoted", "id": entity_id}

@@ -160,6 +160,42 @@ def test_promote_rejects_legacy_candidate_without_provenance(tmp_data_dir):
    assert got.status == "candidate"


def test_api_promote_returns_400_on_legacy_no_provenance(tmp_data_dir):
    """R14 (Codex, 2026-04-22): the HTTP promote route must translate
    the V1-0 ValueError for no-provenance candidates into 400, not 500.
    Previously the route didn't catch ValueError so legacy bad
    candidates surfaced as a server error."""
    init_db()
    init_engineering_schema()

    import uuid as _uuid
    from fastapi.testclient import TestClient
    from atocore.main import app

    entity_id = str(_uuid.uuid4())
    with get_connection() as conn:
        conn.execute(
            "INSERT INTO entities (id, entity_type, name, project, "
            "description, properties, status, confidence, source_refs, "
            "extractor_version, canonical_home, hand_authored, "
            "created_at, updated_at) "
            "VALUES (?, 'component', 'Legacy HTTP', 'p04-gigabit', "
            "'', '{}', 'candidate', 1.0, '[]', '', 'entity', 0, "
            "CURRENT_TIMESTAMP, CURRENT_TIMESTAMP)",
            (entity_id,),
        )

    client = TestClient(app)
    r = client.post(f"/entities/{entity_id}/promote")
    assert r.status_code == 400
    assert "source_refs required" in r.json().get("detail", "")

    # Row still candidate — the 400 didn't half-transition.
    got = get_entity(entity_id)
    assert got is not None
    assert got.status == "candidate"


def test_promote_accepts_candidate_flagged_hand_authored(tmp_data_dir):
    """The other side of the promote re-check: hand_authored=1 with
    empty source_refs still lets promote succeed, matching