Files
ATOCore/docs/current-state.md

86 lines
5.2 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# AtoCore - Current State (2026-04-25)
Update 2026-04-25: project-id chunk/vector metadata is deployed and backfilled.
Live Dalidou is on `a87d984`; `/health` is ok with 33,253 vectors and sources
ready. Live retrieval harness is 19/20 with 0 blocking failures and 1 known
content gap (`p04-constraints` missing `Zerodur` / `1.2`). Full local suite:
571 passed.
The project-id backfill was applied per populated project after a
Chroma-inclusive backup at
`/srv/storage/atocore/backups/snapshots/20260424T154358Z`. The immediate
post-apply dry-run reported 33,253 already tagged, 0 updates, 0 missing, and 0
malformed. A later repeat dry-run after the code-only ranking deploy was
aborted because the one-off container ran too long; the earlier post-apply
idempotency result remains the migration acceptance record.
## V1-0 landed 2026-04-22
Engineering V1 completion track has started. **V1-0 write-time invariants**
merged and deployed: F-1 shared-header fields (`extractor_version`,
`canonical_home`, `hand_authored`) added to `entities`, F-8 provenance
enforcement at both `create_entity` and `promote_entity`, F-5 synchronous
conflict-detection hook on every active-entity write path (create, promote,
supersede) with Q-3 fail-open. Prod backfill ran cleanly — 31 legacy
active/superseded entities flagged `hand_authored=1`, follow-up dry-run
returned 0 remaining rows. Test count 533 → 547 (+14).
R14 is closed: `POST /entities/{id}/promote` now translates the new
caller-fixable V1-0 `ValueError` into HTTP 400.
**Next in the V1 track:** V1-A (minimal query slice + Q-6 killer-correctness
integration). Gated on pipeline soak (~2026-04-26) + 100+ active memory
density target. See `docs/plans/engineering-v1-completion-plan.md` for
the full 7-phase roadmap and `docs/plans/v1-resume-state.md` for the
"you are here" map.
---
## Snapshot from previous update (2026-04-19)
## The numbers
| | count |
|---|---|
| Active memories | 266 (180 project, 31 preference, 24 knowledge, 17 adaptation, 11 episodic, 3 identity) |
| Candidates pending | **0** (autonomous triage drained the queue) |
| Interactions captured | 605 (250 claude-code, 351 openclaw) |
| Entities (typed graph) | 50 |
| Vectors in Chroma | 33K+ |
| Projects | 6 registered (p04, p05, p06, abb-space, atomizer-v2, atocore) + apm emerging (2 memories, below auto-register threshold) |
| Unique domain tags | 210 |
| Tests | 440 passing |
## Autonomous pipeline — what runs without me
| When | Job | Does |
|---|---|---|
| every hour | `hourly-extract.sh` | Pulls new interactions → LLM extraction → 3-tier auto-triage (sonnet → opus → discard/human). 0 pending candidates right now = autonomy is working. |
| every 2 min | `dedup-watcher.sh` | Services UI-triggered dedup scans |
| daily 03:00 UTC | Full nightly (`batch-extract.sh`) | Extract · triage · auto-promote reinforced · synthesis · harness · dedup (0.90) · emerging detector · transient→durable · **confidence decay (7D)** · integrity check · alerts |
| Sundays | +Weekly deep pass | Knowledge-base lint · dedup @ 0.85 · **tag canonicalization (7C)** |
Last nightly run (2026-04-19 03:00 UTC): **31 promoted · 39 rejected · 0 needs human**. That's the brain self-organizing.
## Phase 7 — Memory Consolidation status
| Subphase | What | Status |
|---|---|---|
| 7A | Semantic dedup + merge lifecycle | live |
| 7A.1 | Tiered auto-approve (sonnet ≥0.8 + sim ≥0.92 → merge; opus escalation; human only for ambiguous) | live |
| 7B | Memory-to-memory contradiction detection (0.700.88 band, classify duplicate/contradicts/supersedes) | deferred, needs 7A signal |
| 7C | Tag canonicalization (weekly; auto-apply ≥0.8 confidence; protects project tokens) | live (first run: 0 proposals — vocabulary is clean) |
| 7D | Confidence decay (0.97/day on idle unreferenced; auto-supersede below 0.3) | live (first run: 0 decayed — nothing idle+unreferenced yet) |
| 7E | `/wiki/memories/{id}` detail page | pending |
| 7F | `/wiki/domains/{tag}` cross-project view | pending (wants 7C + more usage first) |
| 7G | Re-extraction on prompt version bump | pending |
| 7H | Chroma vector hygiene (delete vectors for superseded memories) | pending |
## Known gaps (honest, refreshed 2026-04-24)
1. **Capture surface is Claude-Code-and-OpenClaw only.** Conversations in Claude Desktop, Claude.ai web, phone, or any other LLM UI are NOT captured. Example: the rotovap/mushroom chat yesterday never reached AtoCore because no hook fired. See Q4 below.
2. **Project-scoped retrieval guard is deployed and passing.** Explicit `project_id` chunk/vector metadata is now present in SQLite and Chroma for the 33,253-vector corpus. Retrieval prefers exact metadata ownership and keeps path/tag matching as a legacy fallback.
3. **Human interface is useful but not yet the V1 Human Mirror.** Wiki/dashboard pages exist, but the spec routes, deterministic mirror files, disputed markers, and curated annotations remain V1-D work.
4. **Harness known issue:** `p04-constraints` wants "Zerodur" and "1.2"; live retrieval surfaces related constraints but not those exact strings. Treat as content/state gap until fixed.
5. **Formal docs lag the ledger during fast work.** Use `DEV-LEDGER.md` and `python scripts/live_status.py` for live truth, then copy verified claims into these docs.