# AtoCore Master Plan Status

## Current Position

AtoCore is currently between **Phase 7** and **Phase 8**.

The platform is no longer just a proof of concept. The local engine exists, the
core correctness pass is complete, Dalidou hosts the canonical runtime and
machine database, and OpenClaw on the T420 can consume AtoCore safely in
read-only additive mode.

## Phase Status

### Completed

- Phase 0 - Foundation
- Phase 0.5 - Proof of Concept
- Phase 1 - Ingestion

### Baseline Complete

- Phase 2 - Memory Core
- Phase 3 - Retrieval
- Phase 5 - Project State
- Phase 7 - Context Builder

- Phase 4 - Identity / Preferences. As of 2026-04-12: 3 identity
  memories (role, projects, infrastructure) and 3 preference memories
  (no API keys, multi-model collab, action-over-discussion) seeded
  on live Dalidou. The identity/preference band surfaces in context packs
  at a 5% budget ratio. Future identity/preference extraction happens
  organically via the nightly LLM extraction pipeline.

- Phase 8 - OpenClaw Integration (baseline only, not the primary surface).
  As of 2026-04-15 the T420 OpenClaw helper (`t420-openclaw/atocore.py`)
  is verified end-to-end against live Dalidou: health check, auto-context
  with project detection, Trusted Project State surfacing, project-memory
  band, and fail-open on an unreachable host. Tested from both the
  development machine and the T420 via SSH. Scope is narrow: **14 request
  shapes against ~44 server routes**, predominantly read-oriented, plus
  `POST/DELETE /project/state` and `POST /ingest/sources`. Memory
  management, interactions capture (covered separately by the OpenClaw
  capture plugin), admin/backup, entities, triage, and extraction write
  paths remain out of this client's surface by design — they are scoped
  to the operator client (`scripts/atocore_client.py`) per the
  read-heavy additive integration model. "Primary integration" would
  therefore be an overclaim; "baseline read + project-state write helper"
  is the accurate framing.

- Phase 9 - Reflection (all three foundation commits landed:
  A capture, B reinforcement, C candidate extraction + review queue).
  As of 2026-04-11 the capture → reinforce half runs automatically on
  every Stop-hook capture (a length-aware token-overlap matcher handles
  paragraph-length memories), and project-scoped memories now reach
  the context pack via a dedicated `--- Project Memories ---` band
  between identity/preference and retrieved chunks. The extract half
  is still a manual / batch flow by design (`scripts/atocore_client.py
  batch-extract` + `triage`). The first live batch-extract run over 42
  captured interactions produced 1 candidate (the rule extractor is
  conservative and keys on structural cues like `## Decision:`
  headings that rarely appear in conversational LLM responses) —
  extractor tuning is a known follow-up.

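The fail-open behavior the Phase 8 helper relies on can be sketched as follows. This is a minimal illustration, not the helper's actual code: the function name, URL, and injectable `opener` parameter are assumptions made for the sketch.

```python
import urllib.error
import urllib.request

def fetch_context(url, timeout=3.0, opener=urllib.request.urlopen):
    """Fail-open context fetch: any transport error degrades to an
    empty context pack instead of raising, so the agent keeps working
    when the AtoCore host is unreachable."""
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8")
    except (urllib.error.URLError, OSError):
        return ""  # fail open: no context, but never block the agent
```

The injectable `opener` keeps the behavior testable without a live host; the real helper presumably hardcodes its transport.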
### Not Yet Complete In The Intended Sense

- Phase 6 - AtoDrive
- Phase 10 - Write-back
- Phase 11 - Multi-model
- Phase 13 - Hardening

### Partial / Operational Baseline

- Phase 12 - Evaluation. The retrieval/context harness exists and runs
  against live Dalidou, but coverage is still intentionally small and
  should grow before this phase is complete in the intended sense.

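To make the harness's shape concrete, here is a hedged sketch of what a fixture and its pass check might look like. The field names and the `retrieve` callable are assumptions for illustration, not the harness's real schema.

```python
# Hypothetical fixture shape; the real fixtures live in the repo.
FIXTURE = {
    "project": "p04-gigabit",
    "query": "p04 material constraints",
    "must_contain": ["Zerodur"],
}

def run_fixture(fixture, retrieve):
    """A fixture passes when every required term appears in the
    retrieved context for its project-scoped query."""
    text = retrieve(fixture["project"], fixture["query"])
    return all(term in text for term in fixture["must_contain"])
```

Growing coverage then means adding fixtures, not changing the check.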
### Engineering Layer Planning Sprint

**Status: complete.** All 8 architecture docs are drafted. The
engineering layer is now ready for V1 implementation against the
active project set.

- [engineering-query-catalog.md](architecture/engineering-query-catalog.md) —
  the 20 v1-required queries the engineering layer must answer
- [memory-vs-entities.md](architecture/memory-vs-entities.md) —
  canonical home split between memory and entity tables
- [promotion-rules.md](architecture/promotion-rules.md) —
  Layer 0 → Layer 2 pipeline, triggers, review queue mechanics
- [conflict-model.md](architecture/conflict-model.md) —
  detection, representation, and resolution of contradictory facts
- [tool-handoff-boundaries.md](architecture/tool-handoff-boundaries.md) —
  KB-CAD / KB-FEM one-way mirror stance, ingest endpoints, drift handling
- [representation-authority.md](architecture/representation-authority.md) —
  canonical home matrix across PKM / KB / repos / AtoCore for 22 fact kinds
- [human-mirror-rules.md](architecture/human-mirror-rules.md) —
  templates, regeneration triggers, edit flow, "do not edit" enforcement
- [engineering-v1-acceptance.md](architecture/engineering-v1-acceptance.md) —
  measurable done definition with 23 acceptance criteria
- [engineering-knowledge-hybrid-architecture.md](architecture/engineering-knowledge-hybrid-architecture.md) —
  the 5-layer model (from the previous planning wave)
- [engineering-ontology-v1.md](architecture/engineering-ontology-v1.md) —
  the initial V1 object and relationship inventory (previous wave)
- [project-identity-canonicalization.md](architecture/project-identity-canonicalization.md) —
  the helper-at-every-service-boundary contract that keeps the
  trust hierarchy dependable across alias and canonical-id callers;
  required reading before adding new project-keyed entity surfaces
  in the V1 implementation sprint

The next concrete step is the V1 implementation sprint, which
should follow engineering-v1-acceptance.md as its checklist and
must apply the project-identity-canonicalization contract at every
new service-layer entry point.

### LLM Client Integration

A separate but related architectural concern: how AtoCore is reachable
from many different LLM client contexts (OpenClaw, Claude Code, future
Codex skills, future MCP server). The layering rule is documented in:

- [llm-client-integration.md](architecture/llm-client-integration.md) —
  three-layer shape: HTTP API → shared operator client
  (`scripts/atocore_client.py`) → per-agent thin frontends; the
  shared client is the canonical backbone every new client should
  shell out to instead of reimplementing HTTP calls

This sits implicitly between Phase 8 (OpenClaw) and Phase 11
(multi-model). Memory-review and engineering-entity commands are
deferred from the shared client until their workflows are exercised.

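Under that layering rule, a new thin frontend builds a command line for the shared client rather than re-implementing HTTP calls. A minimal sketch, assuming the shared client is invoked as a script (the subcommand and flags shown are illustrative, not the client's documented interface):

```python
import subprocess

ATOCORE_CLIENT = "scripts/atocore_client.py"  # canonical shared client

def build_client_command(subcommand, *args):
    """Thin frontends shell out to the shared operator client instead
    of talking to the HTTP API directly."""
    return ["python", ATOCORE_CLIENT, subcommand, *args]

def run_client(subcommand, *args, timeout=30):
    # Illustrative wrapper; a real frontend would also handle failures
    # in a fail-open way, consistent with the Phase 8 helper.
    return subprocess.run(
        build_client_command(subcommand, *args),
        capture_output=True, text=True, timeout=timeout,
    )
```

Keeping command construction separate from execution makes each frontend trivially testable without a live AtoCore host.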
## What Is Real Today (updated 2026-04-25)

- canonical AtoCore runtime on Dalidou (`a87d984`, deploy.sh verified)
- 33,253 vectors across 6 registered projects, with explicit `project_id`
  metadata backfilled into SQLite and Chroma after snapshot
  `/srv/storage/atocore/backups/snapshots/20260424T154358Z`
- 951 captured interactions as of the 2026-04-24 live dashboard; refresh
  exact live counts with `python scripts/live_status.py`
- 6 registered projects:
  - `p04-gigabit` (483 docs, 15 state entries)
  - `p05-interferometer` (109 docs, 18 state entries)
  - `p06-polisher` (564 docs, 19 state entries)
  - `atomizer-v2` (568 docs, 5 state entries)
  - `abb-space` (6 state entries)
  - `atocore` (drive source, 47 state entries)
- 128 Trusted Project State entries across all projects (decisions,
  requirements, facts, contacts, milestones)
- 290 active memories and 0 candidate memories as of the 2026-04-24 live
  dashboard
- context pack assembly with 4 tiers: Trusted Project State >
  identity/preference > project memories > retrieved chunks
- query-relevance memory ranking with overlap-density scoring and widened
  query-time candidate pools, so older exact-intent project memories can
  rank ahead of generic high-confidence notes
- retrieval eval harness: 20 fixtures; the current live run shows 19 passes,
  1 known content gap, and 0 blocking failures after the project-id backfill
  and memory-ranking stabilization deploy
- 571 tests passing on `main`
- nightly pipeline: backup → cleanup → rsync → OpenClaw import → vault
  refresh → extract → triage → **auto-promote/expire** → weekly synth/lint →
  **retrieval harness** → **pipeline summary to project state**
- Phase 10 operational: reinforcement-based auto-promotion (ref_count ≥ 3,
  confidence ≥ 0.7) plus stale-candidate expiry (14 days unreinforced)
- pipeline health visible in the dashboard: interaction totals by client,
  pipeline last_run, harness results, triage stats
- off-host backup to clawdbot (T420) via rsync
- both Claude Code and OpenClaw capture interactions to AtoCore (OpenClaw
  via the `before_agent_start` + `llm_output` plugin, verified live)
- DEV-LEDGER.md as shared operating memory between Claude and Codex
- observability dashboard at `GET /admin/dashboard`

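The 4-tier context pack assembly above can be sketched as a budgeted, trust-ordered concatenation. The band keys and budget mechanics here are illustrative assumptions, not the runtime's actual implementation:

```python
BAND_ORDER = [
    "trusted_project_state",  # highest trust, always first
    "identity_preference",    # the ~5% budget band per the plan
    "project_memories",
    "retrieved_chunks",
]

def assemble_pack(bands, token_budget):
    """Emit bands in trust order, stopping once the budget is spent.
    `bands` maps band name -> list of (text, token_cost) pairs."""
    parts, used = [], 0
    for band in BAND_ORDER:
        for text, cost in bands.get(band, []):
            if used + cost > token_budget:
                return "\n".join(parts)
            parts.append(text)
            used += cost
    return "\n".join(parts)
```

The key property is that lower-trust bands can only ever lose budget to higher-trust bands, never the other way around.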
## Now

These are the current practical priorities.

1. **Observe the enhanced pipeline** — let the nightly pipeline run for a
   week with the new harness + summary + auto-promote steps. Check the
   dashboard daily. Verify the pipeline summary populates correctly.
2. **Knowledge density** — run batch extraction over the full 234
   interactions (`--since 2026-01-01`) to mine the backlog for knowledge.
   Target: 100+ active memories.
3. **Multi-model triage** (Phase 11 entry) — switch auto-triage to a
   different model than the extractor for independent validation.
4. **Fix p04-constraints harness failure** — retrieval doesn't surface
   "Zerodur" for p04 constraint queries. Investigate whether it's a missing
   memory or a retrieval-ranking issue.
5. **Fix Dalidou Git credentials** — the host checkout can fetch but cannot
   push to Gitea over HTTP in non-interactive SSH sessions. Prefer switching
   the deploy checkout to a Gitea SSH key; PAT-backed
   `credential.helper store` is the fallback.

## Active — Engineering V1 Completion Track (started 2026-04-22)

The Engineering V1 sprint moved from **Next** to **Active** on 2026-04-22.
The discovery from the gbrain review was that V1 entity infrastructure
had already been built incrementally; the sprint is a **completion** plan
against `engineering-v1-acceptance.md`, not a greenfield build. Full plan:
`docs/plans/engineering-v1-completion-plan.md`. "You are here" single-page
map: `docs/plans/v1-resume-state.md`.

Seven phases, ~17.5–19.5 focused days. The track runs in parallel with the
Now list where surfaces are disjoint, and pauses when they collide.

| Phase | Scope | Status |
|---|---|---|
| V1-0 | Write-time invariants: F-1 header fields + F-8 provenance enforcement + F-5 hook on every active-entity write + Q-3 flag-never-block | ✅ done 2026-04-22 (`2712c5d`) |
| V1-A | Minimum query slice: Q-001 subsystem-scoped variant + Q-6 killer-correctness integration test on p05-interferometer | 🟡 gated — starts when soak (~2026-04-26) + density (100+ active memories) gates clear |
| V1-B | KB-CAD + KB-FEM ingest (`POST /ingest/kb-cad/export`, `POST /ingest/kb-fem/export`) + D-2 schema docs | pending V1-A |
| V1-C | Close the remaining 8 queries (Q-002/003/007/010/012/014/018/019; Q-020 to V1-D) | pending V1-B |
| V1-D | Full mirror surface (3 spec routes + regenerate + determinism + disputed + curated markers) + Q-5 golden file | pending V1-C |
| V1-E | Memory→entity graduation end-to-end + remaining Q-4 trust tests | pending V1-D (note: collides with memory extractor; pauses for multi-model triage work) |
| V1-F | F-5 detector generalization + route alias + O-1/O-2/O-3 operational + D-1/D-3/D-4 docs | finish line |

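The Q-3 flag-never-block invariant from the V1-0 row can be illustrated with a small sketch. The detector and store shapes are assumptions made for the example, not AtoCore's actual write path:

```python
def guarded_write(record, store, detectors):
    """Run every detector on an active-entity write, record the
    failures as flags, but never reject the write itself
    (Q-3: flag, never block)."""
    flags = [name for name, check in detectors if not check(record)]
    store.append({**record, "flags": flags})  # write always lands
    return flags
```

The invariant is that a failing detector changes what is recorded about the write, never whether the write succeeds.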
R14 is closed: `POST /entities/{id}/promote` now translates
caller-fixable V1-0 provenance validation failures into HTTP 400 instead
of leaking them as HTTP 500.

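The R14 pattern, translating caller-fixable validation failures into a 400 rather than letting them surface as a 500, looks roughly like this. The exception and handler names are illustrative, not the route's actual code:

```python
class ProvenanceValidationError(ValueError):
    """Caller-fixable: the promote request lacked required provenance."""

def promote_handler(entity_id, do_promote):
    """Return (status, body); validation failures become HTTP 400.
    Any other exception still propagates as an internal error (500)."""
    try:
        return 200, do_promote(entity_id)
    except ProvenanceValidationError as exc:
        return 400, {"detail": str(exc)}  # caller can fix and retry
```

The design point is the exception taxonomy: only errors the caller can correct are caught and downgraded.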
## Next

These are the next major layers after V1 and the current stabilization pass.

1. Phase 6 AtoDrive — clarify Google Drive as a trusted operational
   source and ingest from it
2. Phase 13 Hardening — Chroma backup policy, monitoring, alerting,
   failure visibility beyond log files

## Later

These are the deliberate future expansions already supported by the
architecture direction, but not yet ready for immediate implementation.

1. Minimal engineering knowledge layer
   - driven by `docs/architecture/engineering-knowledge-hybrid-architecture.md`
   - guided by `docs/architecture/engineering-ontology-v1.md`
2. Minimal typed objects and relationships
3. Evidence-linking and provenance-rich structured records
4. Human mirror generation from structured state

## Not Yet

These remain intentionally deferred.

- ~~automatic write-back from OpenClaw into AtoCore~~ — the OpenClaw capture
  plugin now exists (`openclaw-plugins/atocore-capture/`) and interactions
  flow. Write-back of promoted memories into OpenClaw's own memory
  system is still deferred.
- ~~automatic memory promotion~~ — Phase 10 complete: auto-triage handles
  extraction candidates, reinforcement-based auto-promotion graduates
  candidates referenced 3+ times to active, and stale candidates expire
  after 14 days unreinforced.
- ~~reflection loop integration~~ — fully operational: capture (both
  clients) → reinforce (automatic) → extract (nightly cron, sonnet) →
  auto-triage (nightly, sonnet) → only needs_human reaches the user.
- replacing OpenClaw's own memory system
- live machine-DB sync between machines
- full ontology / graph expansion before the current baseline is stable

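The Phase 10 promotion and expiry thresholds above reduce to a small decision rule. A sketch under the stated thresholds (the function shape is illustrative, not the pipeline's actual code):

```python
from datetime import datetime, timedelta

PROMOTE_REFS = 3              # ref_count >= 3
PROMOTE_CONF = 0.7            # confidence >= 0.7
EXPIRY = timedelta(days=14)   # unreinforced candidates expire

def triage_candidate(ref_count, confidence, last_reinforced, now):
    """Decide a candidate memory's fate per the Phase 10 rules."""
    if ref_count >= PROMOTE_REFS and confidence >= PROMOTE_CONF:
        return "promote"
    if now - last_reinforced > EXPIRY:
        return "expire"
    return "keep"
```

Promotion is checked before expiry so a well-reinforced candidate is never expired on the same run that qualifies it.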
## Working Rule

The next sensible implementation threshold for the engineering ontology
work is:

- after the current ingestion, retrieval, registry, OpenClaw helper, organic
  routing, and backup baseline feels boring and dependable

Until then, the architecture docs should shape decisions, not force
premature schema work.