ATOCore/docs/master-plan-status.md

# AtoCore Master Plan Status

## Current Position

AtoCore is currently between **Phase 7** and **Phase 8**.

The platform is no longer just a proof of concept. The local engine exists, the
core correctness pass is complete, Dalidou hosts the canonical runtime and
machine database, and OpenClaw on the T420 can consume AtoCore safely in
read-only additive mode.

## Phase Status

### Completed

- Phase 0 - Foundation
- Phase 0.5 - Proof of Concept
- Phase 1 - Ingestion

### Baseline Complete

- Phase 2 - Memory Core
- Phase 3 - Retrieval
- Phase 5 - Project State
- Phase 7 - Context Builder

- Phase 4 - Identity / Preferences. As of 2026-04-12: 3 identity
  memories (role, projects, infrastructure) and 3 preference memories
  (no API keys, multi-model collab, action-over-discussion) seeded
  on live Dalidou. Identity/preference band surfaces in context packs
  at 5% budget ratio. Future identity/preference extraction happens
  organically via the nightly LLM extraction pipeline.

- Phase 8 - OpenClaw Integration (baseline only, not primary surface).
  As of 2026-04-15 the T420 OpenClaw helper (`t420-openclaw/atocore.py`)
  is verified end-to-end against live Dalidou: health check, auto-context
  with project detection, Trusted Project State surfacing, project-memory
  band, fail-open on unreachable host. Tested from both the development
  machine and the T420 via SSH. Scope is narrow: **14 request shapes
  against ~44 server routes**, predominantly read-oriented plus
  `POST/DELETE /project/state` and `POST /ingest/sources`. Memory
  management, interactions capture (covered separately by the OpenClaw
  capture plugin), admin/backup, entities, triage, and extraction write
  paths remain out of this client's surface by design — they are scoped
  to the operator client (`scripts/atocore_client.py`) per the
  read-heavy additive integration model. Calling this the "primary
  integration" would therefore be an overclaim; "baseline read +
  project-state write helper" is the accurate framing.
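
The fail-open behaviour mentioned above can be sketched as follows. This is a minimal illustration, not the actual `t420-openclaw/atocore.py` code: the function names, the `/health` path, and the injected `fetch` parameter are assumptions made for the example.

```python
import json
import urllib.request
from typing import Callable


def http_get_json(url: str, timeout: float = 2.0) -> dict:
    """Fetch a URL and decode its JSON body (raises on network failure)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def health_or_none(base_url: str, fetch: Callable[[str], dict] = http_get_json):
    """Fail open: return None instead of raising when the host is unreachable."""
    try:
        # "/health" is a hypothetical route path used only for illustration.
        return fetch(f"{base_url}/health")
    except (OSError, ValueError):
        # URLError, timeouts, and refused connections are OSError subclasses;
        # ValueError covers a malformed JSON body.
        return None
```

A caller that receives `None` simply continues without AtoCore context, which is what keeps the integration additive rather than a hard dependency.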

- Phase 9 - Reflection (all three foundation commits landed:
  A capture, B reinforcement, C candidate extraction + review queue).
  As of 2026-04-11 the capture → reinforce half runs automatically on
  every Stop-hook capture (length-aware token-overlap matcher handles
  paragraph-length memories), and project-scoped memories now reach
  the context pack via a dedicated `--- Project Memories ---` band
  between identity/preference and retrieved chunks. The extract half
  is still a manual / batch flow by design (`scripts/atocore_client.py
  batch-extract` + `triage`). First live batch-extract run over 42
  captured interactions produced 1 candidate (rule extractor is
  conservative and keys on structural cues like `## Decision:`
  headings that rarely appear in conversational LLM responses) —
  extractor tuning is a known follow-up.
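
The low yield of a cue-based extractor on conversational text can be illustrated with a small sketch. The regular expression and function name below are assumptions for the example, not the project's actual extractor; only the `## Decision:` cue comes from the notes above, and the sample strings are made up.

```python
import re

# Assumed cue pattern: a level-2 markdown heading of the form "## Decision: ...".
DECISION_CUE = re.compile(r"^##\s*Decision:\s*(.+)$", re.MULTILINE)


def extract_decisions(text: str) -> list[str]:
    """Return the text following each '## Decision:' heading."""
    return [match.group(1).strip() for match in DECISION_CUE.finditer(text)]


# Illustrative inputs (not real captured interactions).
structured = "## Decision: use option B\nRationale follows here."
conversational = "I think we decided to go with option B in the end."

print(extract_decisions(structured))      # one candidate
print(extract_decisions(conversational))  # no candidates
```

The second input carries the same decision but no structural cue, so a rule of this kind finds nothing in it.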

### Not Yet Complete In The Intended Sense

- Phase 6 - AtoDrive
- Phase 10 - Write-back
- Phase 11 - Multi-model
- Phase 12 - Evaluation
- Phase 13 - Hardening

### Engineering Layer Planning Sprint

**Status: complete.** All 8 architecture docs are drafted. The
engineering layer is now ready for V1 implementation against the
active project set.

- [engineering-query-catalog.md](architecture/engineering-query-catalog.md) —
  the 20 v1-required queries the engineering layer must answer
- [memory-vs-entities.md](architecture/memory-vs-entities.md) —
  canonical home split between memory and entity tables
- [promotion-rules.md](architecture/promotion-rules.md) —
  Layer 0 → Layer 2 pipeline, triggers, review queue mechanics
- [conflict-model.md](architecture/conflict-model.md) —
  detection, representation, and resolution of contradictory facts
- [tool-handoff-boundaries.md](architecture/tool-handoff-boundaries.md) —
  KB-CAD / KB-FEM one-way mirror stance, ingest endpoints, drift handling
- [representation-authority.md](architecture/representation-authority.md) —
  canonical home matrix across PKM / KB / repos / AtoCore for 22 fact kinds
- [human-mirror-rules.md](architecture/human-mirror-rules.md) —
  templates, regeneration triggers, edit flow, "do not edit" enforcement
- [engineering-v1-acceptance.md](architecture/engineering-v1-acceptance.md) —
  measurable done definition with 23 acceptance criteria
- [engineering-knowledge-hybrid-architecture.md](architecture/engineering-knowledge-hybrid-architecture.md) —
  the 5-layer model (from the previous planning wave)
- [engineering-ontology-v1.md](architecture/engineering-ontology-v1.md) —
  the initial V1 object and relationship inventory (previous wave)
- [project-identity-canonicalization.md](architecture/project-identity-canonicalization.md) —
  the helper-at-every-service-boundary contract that keeps the
  trust hierarchy dependable across alias and canonical-id callers;
  required reading before adding new project-keyed entity surfaces
  in the V1 implementation sprint

The next concrete step is the V1 implementation sprint, which
should follow engineering-v1-acceptance.md as its checklist, and
must apply the project-identity-canonicalization contract at every
new service-layer entry point.
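
As a rough illustration of the helper-at-every-service-boundary idea, the sketch below resolves aliases to a canonical project id before any lookup. The alias table, function names, and in-memory store are hypothetical; the authoritative contract is the one in project-identity-canonicalization.md.

```python
# Canonical ids taken from the registered project list in this document.
REGISTERED = {
    "p04-gigabit", "p05-interferometer", "p06-polisher",
    "atomizer-v2", "abb-space", "atocore",
}

# Hypothetical aliases, invented for this example.
ALIASES = {"p05": "p05-interferometer", "polisher": "p06-polisher"}


def canonical_project_id(name: str) -> str:
    """Map an alias or canonical id to the canonical id, or raise KeyError."""
    key = name.strip().lower()
    if key in REGISTERED:
        return key
    if key in ALIASES:
        return ALIASES[key]
    raise KeyError(f"unknown project: {name!r}")


def list_state(project: str, store: dict[str, list[str]]) -> list[str]:
    """Service-layer entry point: canonicalize first, then read."""
    return store.get(canonical_project_id(project), [])
```

With this shape, `list_state("P05", store)` and `list_state("p05-interferometer", store)` read the same rows, so alias callers and canonical-id callers cannot diverge.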

### LLM Client Integration

A separate but related architectural concern: how AtoCore is reachable
from many different LLM client contexts (OpenClaw, Claude Code, future
Codex skills, future MCP server). The layering rule is documented in:

- [llm-client-integration.md](architecture/llm-client-integration.md) —
  three-layer shape: HTTP API → shared operator client
  (`scripts/atocore_client.py`) → per-agent thin frontends; the
  shared client is the canonical backbone every new client should
  shell out to instead of reimplementing HTTP calls

This sits implicitly between Phase 8 (OpenClaw) and Phase 11
(multi-model). Memory-review and engineering-entity commands are
deferred from the shared client until their workflows are exercised.
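
A thin frontend under this layering rule might look like the sketch below: it builds an argument list and shells out to the shared client rather than issuing HTTP calls itself. The function names and the injectable `runner` are assumptions for the example, and attaching `--since` to `batch-extract` is inferred from this document rather than confirmed against the client's CLI.

```python
import subprocess
import sys

CLIENT = "scripts/atocore_client.py"


def client_argv(subcommand: str, *args: str) -> list[str]:
    """Build the command line for the shared operator client."""
    return [sys.executable, CLIENT, subcommand, *args]


def run_client(subcommand: str, *args: str, runner=subprocess.run) -> str:
    """Shell out to the shared client and return its stdout."""
    result = runner(
        client_argv(subcommand, *args),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


# Example command line (flag placement is an assumption):
print(client_argv("batch-extract", "--since", "2026-01-01")[1:])
```

Keeping the frontend this small means that a change to routes or payloads only has to be made once, in the shared client.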

## What Is Real Today (updated 2026-04-16)

- canonical AtoCore runtime on Dalidou (`775960c`, deploy.sh verified)
- 33,253 vectors across 6 registered projects
- 234 captured interactions (192 claude-code, 38 openclaw, 4 test)
- 6 registered projects:
  - `p04-gigabit` (483 docs, 15 state entries)
  - `p05-interferometer` (109 docs, 18 state entries)
  - `p06-polisher` (564 docs, 19 state entries)
  - `atomizer-v2` (568 docs, 5 state entries)
  - `abb-space` (6 state entries)
  - `atocore` (drive source, 47 state entries)
- 110 Trusted Project State entries across all projects (decisions, requirements, facts, contacts, milestones)
- 84 active memories (31 project, 23 knowledge, 10 episodic, 8 adaptation, 7 preference, 5 identity)
- context pack assembly with 4 tiers: Trusted Project State > identity/preference > project memories > retrieved chunks
- query-relevance memory ranking with overlap-density scoring
- retrieval eval harness: 18 fixtures, 17/18 passing on live
- 303 tests passing
- nightly pipeline: backup → cleanup → rsync → OpenClaw import → vault refresh → extract → triage → **auto-promote/expire** → weekly synth/lint → **retrieval harness** → **pipeline summary to project state**
- Phase 10 operational: reinforcement-based auto-promotion (ref_count ≥ 3, confidence ≥ 0.7) + stale candidate expiry (14 days unreinforced)
- pipeline health visible in dashboard: interaction totals by client, pipeline last_run, harness results, triage stats
- off-host backup to clawdbot (T420) via rsync
- both Claude Code and OpenClaw capture interactions to AtoCore (OpenClaw via `before_agent_start` + `llm_output` plugin, verified live)
- DEV-LEDGER.md as shared operating memory between Claude and Codex
- observability dashboard at GET /admin/dashboard
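
The four-tier ordering in the list above can be sketched as a priority fill. This is a simplified illustration under stated assumptions: a single character budget, hypothetical tier keys, and no per-tier budget ratios (such as the 5% identity/preference band), which the real context builder applies.

```python
# Priority order from this document; the key names are made up for the example.
TIER_ORDER = [
    "trusted_project_state",
    "identity_preference",
    "project_memories",
    "retrieved_chunks",
]


def assemble_pack(items: dict[str, list[str]], budget_chars: int) -> list[str]:
    """Fill the pack tier by tier, skipping any item that would exceed the budget."""
    pack: list[str] = []
    used = 0
    for tier in TIER_ORDER:
        for text in items.get(tier, []):
            if used + len(text) > budget_chars:
                continue  # too large; a later, smaller item may still fit
            pack.append(text)
            used += len(text)
    return pack
```

Because higher tiers are visited first, Trusted Project State is never displaced by retrieved chunks when the budget is tight.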

## Now

These are the current practical priorities.

1. **Observe the enhanced pipeline** — let the nightly pipeline run for a
   week with the new harness + summary + auto-promote steps. Check the
   dashboard daily. Verify pipeline summary populates correctly.
2. **Knowledge density** — run batch extraction over the full 234
   interactions (`--since 2026-01-01`) to mine the backlog for knowledge.
   Target: 100+ active memories.
3. **Multi-model triage** (Phase 11 entry) — switch auto-triage to a
   different model than the extractor for independent validation.
4. **Fix p04-constraints harness failure** — retrieval doesn't surface
   "Zerodur" for p04 constraint queries. Investigate whether the cause is a
   missing memory or a retrieval-ranking issue.

## Active — Engineering V1 Completion Track (started 2026-04-22)

The Engineering V1 sprint moved from **Next** to **Active** on 2026-04-22.
The gbrain review found that V1 entity infrastructure had already been
built incrementally, so the sprint is a **completion** plan against
`engineering-v1-acceptance.md`, not a greenfield build. Full plan:
`docs/plans/engineering-v1-completion-plan.md`. "You are here" single-page
map: `docs/plans/v1-resume-state.md`.

The plan has seven phases and is estimated at ~17.5–19.5 focused days. It
runs in parallel with the Now list where surfaces are disjoint and pauses
when they collide.

| Phase | Scope | Status |
|---|---|---|
| V1-0 | Write-time invariants: F-1 header fields + F-8 provenance enforcement + F-5 hook on every active-entity write + Q-3 flag-never-block | ✅ done 2026-04-22 (`2712c5d`) |
| V1-A | Minimum query slice: Q-001 subsystem-scoped variant + Q-6 killer-correctness integration test on p05-interferometer | 🟡 gated — starts when soak (~2026-04-26) + density (100+ active memories) gates clear |
| V1-B | KB-CAD + KB-FEM ingest (`POST /ingest/kb-cad/export`, `POST /ingest/kb-fem/export`) + D-2 schema docs | pending V1-A |
| V1-C | Close the remaining 8 queries (Q-002/003/007/010/012/014/018/019; Q-020 to V1-D) | pending V1-B |
| V1-D | Full mirror surface (3 spec routes + regenerate + determinism + disputed + curated markers) + Q-5 golden file | pending V1-C |
| V1-E | Memory→entity graduation end-to-end + remaining Q-4 trust tests | pending V1-D (note: collides with memory extractor; pauses for multi-model triage work) |
| V1-F | F-5 detector generalization + route alias + O-1/O-2/O-3 operational + D-1/D-3/D-4 docs | finish line |

R14 (P2, non-blocking): `POST /entities/{id}/promote` route returns 500
on the new V1-0 `ValueError` instead of 400. Fix on branch
`claude/r14-promote-400`, pending Codex review.
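
The shape of the R14 fix can be sketched independently of the server framework, which this document does not name. The handler below is hypothetical: `promote` stands in for the service-layer call, and the `(status, body)` return shape is an assumption for the example.

```python
def handle_promote(entity_id: str, promote) -> tuple[int, dict]:
    """Translate a validation failure into 400 rather than an unhandled 500."""
    try:
        return 200, promote(entity_id)
    except ValueError as exc:
        # The V1-0 invariants raise ValueError on bad input; that is a client
        # error, so it is reported as 400 with the message as detail.
        return 400, {"detail": str(exc)}
```

Other exception types are deliberately left uncaught so that genuine server faults still surface as 500.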

## Next

These are the next major layers after V1 and the current stabilization pass.

1. Phase 6 AtoDrive — clarify Google Drive as a trusted operational
   source and ingest from it
2. Phase 13 Hardening — Chroma backup policy, monitoring, alerting,
   failure visibility beyond log files

## Later

These are the deliberate future expansions already supported by the architecture
direction, but not yet ready for immediate implementation.

1. Minimal engineering knowledge layer
   - driven by `docs/architecture/engineering-knowledge-hybrid-architecture.md`
   - guided by `docs/architecture/engineering-ontology-v1.md`
2. Minimal typed objects and relationships
3. Evidence-linking and provenance-rich structured records
4. Human mirror generation from structured state

## Not Yet

These remain intentionally deferred.

- ~~automatic write-back from OpenClaw into AtoCore~~ — OpenClaw capture
  plugin now exists (`openclaw-plugins/atocore-capture/`), interactions
  flow. Writing promoted memories back to OpenClaw's own memory
  system is still deferred.
- ~~automatic memory promotion~~ — Phase 10 complete: auto-triage handles
  extraction candidates, reinforcement-based auto-promotion graduates
  candidates referenced 3+ times to active, stale candidates expire
  after 14 days unreinforced.
- ~~reflection loop integration~~ — fully operational: capture (both
  clients) → reinforce (automatic) → extract (nightly cron, sonnet) →
  auto-triage (nightly, sonnet) → only needs_human reaches the user.
- replacing OpenClaw's own memory system
- live machine-DB sync between machines
- full ontology / graph expansion before the current baseline is stable
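
For reference, the auto-promotion and expiry thresholds stated in this document (ref_count ≥ 3, confidence ≥ 0.7, 14 days unreinforced) can be expressed as a small decision function. The field names, status strings, and the strict "more than 14 days" comparison are assumptions for the example, not the pipeline's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MIN_REF_COUNT = 3                 # from this document
MIN_CONFIDENCE = 0.7              # from this document
STALE_AFTER = timedelta(days=14)  # from this document


@dataclass
class Candidate:
    ref_count: int
    confidence: float
    last_reinforced: datetime


def next_status(candidate: Candidate, now: datetime) -> str:
    """Decide whether a candidate is promoted, expired, or left pending."""
    if candidate.ref_count >= MIN_REF_COUNT and candidate.confidence >= MIN_CONFIDENCE:
        return "active"
    if now - candidate.last_reinforced > STALE_AFTER:
        return "expired"
    return "candidate"
```

Promotion is checked before expiry in this sketch, so a well-reinforced candidate is never expired by the same pass that would promote it.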

## Working Rule

The next sensible implementation threshold for the engineering ontology work is:

- after the current ingestion, retrieval, registry, OpenClaw helper, organic
  routing, and backup baseline feels boring and dependable

Until then, the architecture docs should shape decisions, not force premature
schema work.