ATOCore

Author	SHA1	Message	Date
Anto01	f16cd5272f	fix(engineering): V1-0 gap closures per Codex review Codex audit of `cbf9e03` surfaced two P1 gaps + one P2 scope concern, all verified with code-level probes. Patches below. P1: promote_entity did not re-check F-8 at status flip. Legacy candidates with source_refs='[]' and hand_authored=0 can exist from before V1-0 enforcement. promote_entity now raises ValueError before flipping status so no F-8 violation can slip into the active store through the promote path. Row stays candidate on rejection. Symmetric error shape with the create side. P1: supersede_entity was missing the F-5 hook. Plan calls for synchronous conflict detection on every active-entity write path. Supersede creates a `supersedes` relationship rooted at the `superseded_by` entity, which can produce a conflict the detector should catch. Added detect_conflicts_for_entity(superseded_by) call with fail-open per conflict-model.md:256. P2: backfill script --invalidate-instead was too broad. Query included both active AND superseded rows; invalidating superseded rows collapses audit history that V1-0 remediation never intended to touch. Now --invalidate-instead scopes to status='active' only. Default hand_authored-flag mode stays broad since it's additive/non-destructive. Help text made the destructive posture explicit. Four new regression tests in test_v1_0_write_invariants.py: - test_promote_rejects_legacy_candidate_without_provenance - test_promote_accepts_candidate_flagged_hand_authored - test_supersede_runs_conflict_detection_on_new_active - test_supersede_hook_fails_open Test count: 543 -> 547 (+4). Full suite green in 81.07s. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 14:49:42 -04:00
Anto01	cbf9e03ab9	feat(engineering): V1-0 write-time invariants (F-1 + F-5 hook + F-8) Phase V1-0 of the Engineering V1 Completion Plan. Establishes the write-time invariants every later phase depends on so no later phase can leak invalid state into the entity store. F-1 shared-header fields per engineering-v1-acceptance.md:45: - entities.extractor_version (default "", EXTRACTOR_VERSION="v1.0.0" written by service.create_entity) - entities.canonical_home (default "entity") - entities.hand_authored (default 0, INTEGER boolean) Idempotent ALTERs in both _apply_migrations (database.py) and init_engineering_schema (service.py). CREATE TABLE also carries the columns for fresh DBs. _row_to_entity tolerates old rows without them so tests that predate V1-0 keep passing. F-8 provenance enforcement per promotion-rules.md:243: create_entity raises ValueError when source_refs is empty and hand_authored is False. New kwargs hand_authored and extractor_version threaded through the API (EntityCreateRequest) and the /wiki/new form body (human wiki writes set hand_authored true by definition). The non-negotiable invariant: every row either carries provenance or is explicitly flagged as hand-authored. F-5 synchronous conflict-detection hook on active create per engineering-v1-acceptance.md:99: create_entity(status="active") now runs detect_conflicts_for_entity with fail-open per conflict-model.md:256. Detector errors log a warning but never 4xx-block the write (Q-3 "flag, never block"). Doc note added to engineering-ontology-v1.md recording that `project` IS the `project_id` per "fields equivalent to" wording. No storage rename. Backfill script scripts/v1_0_backfill_provenance.py reports and optionally flags existing active entities that lack provenance. Idempotent. Supports --dry-run and --invalidate-instead. Tests: 10 new in test_v1_0_write_invariants.py covering F-1 fields, F-8 raise + bypass, F-5 hook on active + no-hook on candidate, Q-3 fail-open, Q-4 partial scope_only=active excludes candidates. Three pre-existing conflict tests adapted to read list_open_conflicts rather than re-run the detector (which now dedups because the hook already fired at create-time). One API test adds hand_authored=true since its fixture has no source_refs. conftest.py wraps create_entity so tests that don't pass source_refs or hand_authored default to hand_authored=True (tests author their own fixture data — reasonable default). Production paths (API route, wiki form, graduation scripts) all pass explicit values and are unaffected. Test count: 533 -> 543 (+10). Full suite green in 77.86s. Pending: Codex review on the branch before squash-merge to main. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 14:39:30 -04:00
Anto01	9ab5b3c9d8	docs(planning): V1 Completion Plan — Codex sign-off (third round) Codex's third-round audit closed the remaining five open questions with concrete file:line resolutions, patched inline in the plan: - F-7 (P1): graduation stack is partially built — graduated_to_entity_id at database.py:143-146, graduated memory status, promote preserves original at service.py:354-356, tests at test_engineering_v1_phase5.py. Gaps: missing direct POST /memory/{id}/graduate route; spec's knowledge -> Fact mismatches ontology (no fact type). Reconcile to parameter or similar. V1-E 2 days -> 3-4 days. - Q-5 / V1-D (P2): renderer reads wall-clock in _footer at mirror.py:320. Fix is injecting regenerated timestamp + checksum as renderer inputs, sorting DB iteration, removing dict ordering deps. Render code must not call wall-clock directly. - project vs project_id (P3): doc note only, no storage rename. - Total estimate: 17.5-19.5 focused days (calendar buffer on top). - Release notes must NOT canonize "Minions" as a V2 name. Use neutral "queued background processing / async workers" wording. Sign-off from Codex: "with those edits, I'd sign off on the five questions. The only non-architectural uncertainty left in the plan is scheduling discipline against the current Now list; that does not block V1-0 once the soak window and memory-density gate clear." Plan frozen. V1-0 starts after pipeline soak (~2026-04-26) and the 100-active-memory density gate clear. Co-Authored-By: Codex <noreply@anthropic.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 14:24:43 -04:00
Anto01	44724c81ab	docs(planning): V1 Completion Plan revised per Codex file-level audit Three findings folded in, all with exact file:line refs from Codex: - F-1 downgraded from done to partial. Entity dataclass at service.py:67 and entities table missing extractor_version and canonical_home fields per engineering-v1-acceptance.md:45. V1-0 scope now adds both via additive migration + doc note that project is the project_id per "fields equivalent to" wording. - F-2 replaced guesses with ground truth per-query status: 9 of 20 v1-required queries done, 1 partial (Q-001 needs subsystem-scoped variant), 10 missing. V1-A scope shrank to Q-001 shape fix + Q-6 integration. V1-C closes the 8 net-new queries; Q-020 deferred to V1-D (mirror). - F-5 reframed. Generic conflicts + conflict_members schema already present at database.py:190, no migration needed. Divergence is detector body (per-type dispatch needs generalization) + routes (/admin/conflicts/* needs /conflicts/* alias). V1-F scope is detector + routes only. Totals revised: 16.5-17.5 days, ~60 tests. Three of Codex's eight open questions now resolved. Remaining: F-7 graduation depth, mirror determinism, project naming, velocity calibration, minions-as-V2 naming. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 14:09:50 -04:00
Anto01	ce3a87857e	docs(planning): V1 Completion Plan + gbrain-plan rejection record - docs/decisions/2026-04-22-gbrain-plan-rejection.md: record of gbrain-inspired "Phase 8 Minions + typed edges" plan rejection. Three high findings from Codex verified against cited architecture docs (ontology V1 predicate set, canonical entity contract, master-plan-status Now list sequencing). - docs/plans/engineering-v1-completion-plan.md: seven-phase plan for finishing Engineering V1 against engineering-v1-acceptance.md. V1-0 (write-time invariants: F-8 provenance + F-5 hooks + F-1 audit) as hard prerequisite per Codex first-round review. Per- criterion gap audit against each F/Q/O/D acceptance item with code:line references. Explicit collision points with the Now list; schedule shifted ~4 weeks to avoid pipeline-soak window. Awaiting Codex file-level audit. - DEV-LEDGER.md: Recent Decisions + Session Log entries covering both the rejection and the revised plan. No code changes. Docs + ledger only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 13:58:10 -04:00
Anto01	e147ab2abd	feat(wiki): [[wikilinks]] with redlinks + cross-project resolver (Issue B) Last P2 from Antoine's "daily-usable" sprint. Entities referenced via [[Name]] in descriptions or mirror markdown now render as: - live wikilink if the name matches an entity in the same project - live cross-project link with "(in project X)" scope indicator if the only match is in another project - red italic redlink pointing at /wiki/new?name=... otherwise Clicking a redlink opens a pre-filled "create this entity" form that POSTs to /v1/entities and redirects to the new entity's page. - engineering/wiki.py: _wikilink_transform + _resolve_wikilink, applied in render_project (pre-markdown) and render_entity (description body). render_new_entity_form for the create page. CSS for .wikilink / .wikilink-cross / .redlink / .new-entity-form - api/routes.py: GET /wiki/new?name&project - tests/test_wikilinks.py: 12 tests including the spec regression (A references [[B]] -> redlink; create B -> link becomes live) - DEV-LEDGER.md: session log + test_count 521 -> 533 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 09:15:14 -04:00
Anto01	b94f9dff56	feat(api): PATCH /entities/{id} + /v1/engineering/* aliases PATCH lets users edit an active entity's description, properties, confidence, and source_refs without cloning — closes the duplicate-trap half-fixed by /invalidate + /supersede. Issue D just adds the /engineering/* query surface to the /v1 allowlist. - engineering/service.py: update_entity supports description replace, properties shallow merge with null-delete semantics, confidence 0..1 bounds check, source_refs dedup-append. Writes audit row - api/routes.py: PATCH /entities/{id} with EntityPatchRequest - main.py: engineering/* query endpoints aliased under /v1 (Issue D) - tests/test_patch_entity.py: 12 tests (merge, null-delete, bounds, dedup, 404, audit, v1 alias) - DEV-LEDGER.md: session log + test_count 509 -> 521 Forbidden fields via PATCH (by design): entity_type, project, name, status. Use supersede+create or the dedicated status endpoints. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 09:02:13 -04:00
Anto01	081c058f77	feat(api): invalidate + supersede for active entities and memories (Issue E) Public retraction path so mistakes can be corrected without SQL. Unblocks the correction workflows that the live AKC p05 session exposed. - engineering/service.py: invalidate_active_entity returns (ok, code) with codes invalidated/already_invalid/not_active/not_found for clean HTTP mapping. supersede_entity gains superseded_by + auto-creates the supersedes relationship (new SUPERSEDES old), rejects self-supersede - memory/service.py: invalidate_memory/supersede_memory accept reason string that lands in audit note - api/routes.py: POST /entities/{id}/invalidate, /supersede; POST /memory/{id}/invalidate, /supersede (all 4 behind /v1 aliases) - tests/test_invalidate_supersede.py: 15 tests (idempotency, 404/409, supersede relationship auto-creation, self-supersede rejection, missing-replacement rejection, v1 alias presence) - DEV-LEDGER.md: session log + test_count 494 -> 509 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 21:56:24 -04:00
Anto01	069d155585	feat(assets): binary asset store + artifact entity + wiki evidence (Issue F) Wires visual evidence into the knowledge graph. Images, PDFs, and CAD exports can now be uploaded, deduped by SHA-256, thumbnailed, linked to entities via EVIDENCED_BY, and rendered inline on wiki pages. Unblocks AKC uploading voice-session screenshots alongside extracted entities. - assets/ module: store_asset (hash dedup + MIME allowlist + 20 MB cap), get_asset_binary, get_thumbnail (Pillow, on-disk cache under .thumbnails/<size>/), list_orphan_assets, invalidate_asset - models/database.py: new `assets` table + indexes - engineering/service.py: `artifact` added to ENTITY_TYPES - api/routes.py: POST /assets (multipart), GET /assets/{id}, /assets/{id}/thumbnail, /assets/{id}/meta, /admin/assets/orphans, DELETE /assets/{id} (409 if still referenced), GET /entities/{id}/evidence (EVIDENCED_BY artifacts with asset meta) - main.py: all new paths aliased under /v1 - engineering/wiki.py: entity pages render EVIDENCED_BY → artifact as a "Visual evidence" thumbnail strip; artifact pages render the full image + caption + capture_context - deploy/dalidou/docker-compose.yml: bind-mount ${ATOCORE_ASSETS_DIR} - config.py: assets_dir + assets_max_upload_bytes settings - requirements.txt + pyproject.toml: python-multipart, Pillow>=10.0.0 - tests/test_assets.py: 16 tests (dedup, cap, thumbnail cache, orphan detection, invalidate gating, API upload/fetch, evidence, v1 aliases, wiki rendering) - DEV-LEDGER.md: session log + cleanup note + test_count 478 -> 494 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 21:46:52 -04:00
Anto01	b1a3dd071e	feat(entities): inbox + cross-project (project="") support (Issue C) Makes `inbox` a reserved pseudo-project and `project=""` a first-class cross-project bucket. Unblocks AKC capturing pre-project leads/quotes and cross-project facts (materials, vendors) that don't fit a single registered project. - projects/registry.py: INBOX_PROJECT/GLOBAL_PROJECT constants, is_reserved_project(), register/update guards, resolve_project_name passthrough for "inbox" - engineering/service.py: get_entities scoping rules (inbox-only, global-only, real+global default, scope_only=true strict). promote_entity accepts target_project to retarget on promote - api/routes.py: GET /entities gains scope_only; POST /entities accepts project=null as ""; POST /entities/{id}/promote accepts {target_project, note} - engineering/wiki.py: homepage shows "Inbox & Global" cards with live counts linking to scoped lists - tests/test_inbox_crossproject.py: 15 tests (reserved enforcement, scoping rules, API shape, promote retargeting) - DEV-LEDGER.md: session log, test_count 463 -> 478 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 20:17:32 -04:00
Anto01	5fbd7e6094	feat(api): /v1 alias router for stable external contract (Issue A) Mounts an explicit allowlist of public handlers under /v1 alongside the existing unversioned paths. External clients (AKC, OpenClaw, future tools) should target /v1; internal callers (hooks, wiki, admin UI) keep working unchanged. Breaking schema changes will bump the prefix to /v2. - src/atocore/main.py: _V1_PUBLIC_PATHS allowlist + second router - tests/test_v1_aliases.py: parity + OpenAPI presence (5 tests) - README.md: API versioning section - DEV-LEDGER.md: session log, test_count 459 -> 463 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 20:04:46 -04:00
Anto01	83b4d78cb7	docs(ledger): session log for Phase 7A.1/7C/7D/7I + UI refresh + capture surface scope	2026-04-19 12:14:14 -04:00
Anto01	028d4c3594	feat: Phase 7A — semantic memory dedup ("sleep cycle" V1) New table memory_merge_candidates + service functions to cluster near-duplicate active memories within (project, memory_type) buckets, draft a unified content via LLM, and merge on human approval. Source memories become superseded (never deleted); merged memory carries union of tags, max of confidence, sum of reference_count. - schema migration for memory_merge_candidates - atocore.memory.similarity: cosine + transitive clustering - atocore.memory._dedup_prompt: stdlib-only LLM prompt preserving every specific - service: merge_memories / create_merge_candidate / get_merge_candidates / reject_merge_candidate - scripts/memory_dedup.py: host-side detector (HTTP-only, idempotent) - 5 API endpoints under /admin/memory/merge-candidates* + /admin/memory/dedup-scan - triage UI: purple "🔗 Merge Candidates" section + "🔗 Scan for duplicates" bar - batch-extract.sh Step B3 (0.90 daily, 0.85 Sundays) - deploy/dalidou/dedup-watcher.sh for UI-triggered scans - 21 new tests (374 → 395) - docs/PHASE-7-MEMORY-CONSOLIDATION.md covering 7A-7H roadmap Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 10:30:49 -04:00
Anto01	ba36a28453	docs: sprint documentation — ledger + master-plan sync Updated DEV-LEDGER orientation with post-sprint state: - live_sha `775960c`, tests 303, harness 17/18 on live - interactions 234 (192 claude-code + 38 openclaw) - project_state_entries 110 across 6 projects - nightly pipeline now includes auto-promote, harness, summary Updated master-plan-status.md "What Is Real Today" to match actual 2026-04-16 state. Phase 10 moved from "Next" to operational. New "Now" priorities: observe pipeline, knowledge density, multi-model triage, fix p04-constraints. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 14:08:19 -04:00
Anto01	999788b790	chore: OpenClaw capture handler (llm_output) + ledger sync - openclaw-plugins/atocore-capture/handler.js: simplified version using before_agent_start + llm_output hooks (survives gateway restarts). The production copy lives on T420 at /tmp/atocore-openclaw-capture-plugin/openclaw-plugins/atocore-capture/ - DEV-LEDGER: updated orientation (live_sha `b687e7f`, capture clients) and session log for 2026-04-16 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 14:04:40 -04:00
Anto01	4d4d5f437a	test(harness): fix p06-tailscale false positive, 18/18 PASS The fixture's expect_absent: "GigaBIT" was catching legitimate semantic overlap, not retrieval bleed. The p06 ARCHITECTURE.md Overview describes the Polisher Suite as built for the GigaBIT M1 mirror — it is what the polisher is for, so the word appears correctly in p06 content. All retrieved sources for this prompt were genuinely p06/shared paths; zero actual p04 chunks leaked. Narrowed the assertion to expect_absent: "[Source: p04-gigabit/", which tests the real invariant (no p04 source chunks retrieved into p06 context) without the false positive. No retrieval/ranking code change. Fixture-only fix. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 11:23:00 -04:00
Anto01	5b114baa87	docs(ledger): deploy `c2e7064` live; close R10 + R13 - R10 fixed: master-plan-status Phase 8 now disclaims "primary integration", reports current narrow surface (14 client shapes vs ~44 routes, read-heavy + project-state/ingest writes). - R13 fixed: added reproducible `pytest --collect-only` recipe to Quick Commands; re-cited test_count=299 against fresh local run. - Orientation bumped: live_sha and main_tip `c2e7064`. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 11:19:55 -04:00
Anto01	c2e7064238	fix(extraction): R11 container 503 + R12 shared prompt module R11: POST /admin/extract-batch with mode=llm now returns 503 when the claude CLI is unavailable (was silently returning success with 0 candidates), with a message pointing at the host-side script. +2 tests. R12: extracted SYSTEM_PROMPT + parse_llm_json_array + normalize_candidate_item + build_user_message into stdlib-only src/atocore/memory/_llm_prompt.py. Both the container extractor and scripts/batch_llm_extract_live.py now import from it, eliminating the prompt/parser drift risk. Tests 297 -> 299. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 10:47:01 -04:00
Anto01	dc9fdd3a38	chore(ledger): end-of-session sync (2026-04-14) Reflects today's massive work: engineering layer + wiki + Karpathy upgrades + OpenClaw importer + auto-detection. Active memories 47 -> 84. Ready for next session to pick up cold. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 11:24:25 -04:00
Anto01	db89978871	docs: full session sync — master plan + ledger + atomizer-v2 ingested Master plan status updated to reflect current reality: - 5 registered projects (atomizer-v2 newly ingested, 33,253 vectors) - 47 active memories across all types - 61 project state entries - Nightly pipeline fully operational (both capture clients) - 7/14 phases baseline complete - "Now" section updated: observe/stabilize, multi-model triage, automated eval, atomizer state entries - "Next" section updated: write-back, AtoDrive, hardening - "Not Yet" items crossed off where applicable (reflection loop, auto-promotion, OpenClaw write-back) DEV-LEDGER orientation fully refreshed with current vectors, projects, pipeline state, and capture clients. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 20:32:47 -04:00
Anto01	a6ae6166a4	feat: add OpenClaw AtoCore capture plugin	2026-04-12 22:06:07 +00:00
Anto01	dbb8f915e2	chore(ledger): Batch 3 close — R9 fixed, before/after documented Before: a model returning 'p04-gigabit' for a p06-polisher interaction would silently override the known scope because the project was registered. After: interaction.project always wins when set. Model project is only a fallback for unscoped captures. Not yet guaranteed: within-project semantic errors (model says the right project but wrong content). That's a content-quality concern, not a trust-hierarchy issue. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 15:38:19 -04:00
Anto01	7650c339a2	audit: verify batch2 claims and findings	2026-04-12 19:06:51 +00:00
Anto01	abc8af5f7e	audit: record extraction pipeline findings	2026-04-12 16:20:42 +00:00
Anto01	b790e7eb30	audit: record final 2026-04-12 findings	2026-04-12 13:03:10 +00:00
Anto01	2b79680167	chore(ledger): Wave 2 ingestion + codex audit response session log Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 07:57:32 -04:00
Anto01	39d73e91b4	fix(R6): fall back to interaction.project when LLM returns empty Codex R6: the LLM extractor accepted the model's project field verbatim. When the model returned empty string, clearly p06 memories got promoted as project='', making them invisible to the p06 project-memory band and explaining the p06-offline-design harness failure. Fix: if model returns empty project but interaction.project is set, inherit the interaction's project. Model-supplied project still takes precedence when non-empty. Two new tests lock the fallback and precedence behaviors. R5 acknowledged (LLM extractor not yet wired into API — next task). Test count: 278 -> 280. Harness re-run pending after deploy. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 07:37:14 -04:00
Anto01	89c7964237	audit: record 2026-04-12 review findings	2026-04-12 11:31:32 +00:00
Anto01	146f2e4a5e	chore: Day 8 — close mini-phase with before/after metrics Mini-phase complete. Before/after deltas (metric: before -> after): Rule extractor recall: 0% -> 0% (unchanged, deprioritized); LLM extractor recall: n/a -> 100% (new, claude -p haiku); LLM candidate yield: n/a -> 2.55/interaction; First triage accept rate: n/a -> 31% (16/51); Active memories: 20 -> 36 (+16); p06-polisher memories: 2 -> 16 (+14); atocore memories: 0 -> 5 (+5); Retrieval harness: 6/6 -> 15/18 (expanded to 18 fixtures); Test count: 264 -> 278 (+14). 3 remaining harness failures are budget-contention on the p06 memory band: the specific memory a fixture targets ranks 4th+ and the 25% budget only holds 2-3 entries. Not a ranking bug — the per-entry 250-char cap was the one justified tweak; a second budget change risks regressing other fixtures per Codex's Day 7 hard gate. Ledger updated: Orientation, Session Log, main_tip, harness line. Next on the roadmap (from DEV-LEDGER Active Plan / docs/next-steps): - Wave 2 trusted operational ingestion (p04/p05/p06 dashboards) - Finish OpenClaw integration (Phase 8) - Auto-triage (multi-model second pass to reduce human review) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 06:41:42 -04:00
Anto01	b98a658831	chore(ledger): Day 4 complete + first triage done Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 06:06:38 -04:00
Anto01	330ecfb6a6	chore(ledger): Day 2 baseline escalated to Day 4 gate early Day 2 extractor eval baseline on a 20-interaction labeled set shows 0% yield / 0% recall / 0% precision. The 5 false negatives span 5 distinct miss classes, matching the pattern Codex's Day 4 hard gate was designed to catch but arriving two days early. No extractor code change on main. Day 1+2 artifacts committed on working branch 'claude/extractor-eval-loop' at `7d8d599`. Day 4 decision (keep rule-expanding vs prototype LLM-assisted mode) is escalated to Antoine for ratification before Day 3 work touches any extractor. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 15:12:58 -04:00
Anto01	d9dc55f841	docs: formalize DEV-LEDGER review protocol	2026-04-11 15:03:33 -04:00
Anto01	81307cec47	chore: ledger session log — wire protocol commit	2026-04-11 14:46:50 -04:00
Anto01	59331e522d	feat: DEV-LEDGER.md as shared operating memory + session protocol The ledger is the one-file source of truth for "what is currently true" across Claude/Codex/human sessions: - Orientation (live SHA, main tip, test count, harness state) - Active Plan (currently Codex's 8-day extractor + harness plan with hard gates and fail-early thresholds) - Open Review Findings (P1/P2, status) - Recent Decisions (bounded to last 20) - Session Log (bounded to last 20) - Working Rules (no parallel work, branching rule, P1 block) Narrative docs under docs/ sometimes lag reality; the ledger does not. Every session MUST read it at start and append a Session Log line before ending. AGENTS.md: added a new "Session protocol" section at the top that points at the ledger. Applies to any agent (Claude, Codex, future). CLAUDE.md (new, project-local): project instructions for Claude Code in this repo. Points at DEV-LEDGER.md and AGENTS.md, spells out the deploy workflow and the Claude/Codex working model. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 14:46:21 -04:00
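
The F-8 provenance invariant described in commits cbf9e03ab9 and f16cd5272f can be sketched roughly as follows. This is an illustration only, not the repository's code: `check_provenance` and `promote` are hypothetical helpers operating on plain dicts. Only the rule itself comes from the commit messages: raise `ValueError` when `source_refs` is empty and `hand_authored` is false, and leave the row as a candidate on rejection.

```python
def check_provenance(source_refs, hand_authored):
    """Hypothetical helper: enforce the F-8 rule on one row.

    A row must either carry provenance (non-empty source_refs)
    or be explicitly flagged as hand-authored.
    """
    if not source_refs and not hand_authored:
        raise ValueError("entity needs source_refs or hand_authored=True (F-8)")


def promote(entity):
    """Hypothetical promote path: re-check F-8 before flipping status.

    The check runs first, so a rejected row keeps status 'candidate'.
    """
    check_provenance(entity["source_refs"], entity["hand_authored"])
    entity["status"] = "active"
    return entity
```

Running the check before the status flip is what keeps a legacy candidate (empty `source_refs`, `hand_authored` unset) from reaching the active store through the promote path.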

34 Commits
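
The PATCH semantics listed in commit b94f9dff56 (description replace, properties shallow merge with null-delete, confidence bounds check, source_refs dedup-append) might look something like the sketch below. It assumes entities are plain dicts and uses a hypothetical `apply_patch` function. The real `update_entity` in engineering/service.py is not shown in the log and may differ, for example in how it writes the audit row.

```python
def apply_patch(entity, description=None, properties=None,
                confidence=None, source_refs=None):
    """Hypothetical sketch: return a patched copy of `entity` (a dict)."""
    updated = dict(entity)

    # description: plain replace
    if description is not None:
        updated["description"] = description

    # properties: shallow merge; a None value deletes the key
    if properties is not None:
        merged = dict(entity.get("properties", {}))
        for key, value in properties.items():
            if value is None:
                merged.pop(key, None)
            else:
                merged[key] = value
        updated["properties"] = merged

    # confidence: must stay within 0..1
    if confidence is not None:
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must be within 0..1")
        updated["confidence"] = confidence

    # source_refs: append new refs, skipping ones already present
    if source_refs is not None:
        refs = list(entity.get("source_refs", []))
        for ref in source_refs:
            if ref not in refs:
                refs.append(ref)
        updated["source_refs"] = refs

    return updated
```

Fields the commit lists as forbidden via PATCH (entity_type, project, name, status) are simply not accepted as parameters in this sketch.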