# AtoCore Current State

## Status Summary

AtoCore is no longer just a proof of concept. The local engine exists, the correctness pass is complete, Dalidou now hosts the canonical runtime and machine-storage location, and the T420/OpenClaw side now has a safe read-only path to consume AtoCore. The live corpus is no longer just self-knowledge: it now includes a first curated ingestion batch for the active projects.

## Phase Assessment

- completed
  - Phase 0
  - Phase 0.5
  - Phase 1
- baseline complete
  - Phase 2
  - Phase 3
  - Phase 5
  - Phase 7
- partial
  - Phase 4
- not started
  - Phase 6
  - Phase 8
  - Phase 9
  - Phase 10
  - Phase 11
  - Phase 12
  - Phase 13

## What Exists Today

- ingestion pipeline
- parser and chunker
- SQLite-backed memory and project state
- vector retrieval
- context builder
- API routes for query, context, health, and source status
- project registry and per-project refresh foundation
- env-driven storage and deployment paths
- Dalidou Docker deployment foundation
- initial AtoCore self-knowledge corpus ingested on Dalidou
- T420/OpenClaw read-only AtoCore helper skill
- first curated active-project corpus batch for:
  - `p04-gigabit`
  - `p05-interferometer`
  - `p06-polisher`

## What Is True On Dalidou

- deployed repo location:
  - `/srv/storage/atocore/app`
- canonical machine DB location:
  - `/srv/storage/atocore/data/db/atocore.db`
- canonical vector store location:
  - `/srv/storage/atocore/data/chroma`
- source input locations:
  - `/srv/storage/atocore/sources/vault`
  - `/srv/storage/atocore/sources/drive`

The service and storage foundation are live on Dalidou. The machine-data host is real and canonical. The content corpus is partially populated now.

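
The storage locations listed above all hang off a single root, which fits the env-driven path setup mentioned earlier. The following is a minimal sketch of how such a layout could be resolved and checked, not the actual AtoCore configuration code. The environment variable name `ATOCORE_STORAGE_ROOT` and the helper names are hypothetical; only the paths themselves come from this document.

```python
import os
from dataclasses import dataclass
from pathlib import Path

# Default root taken from the Dalidou paths listed in this document.
DEFAULT_ROOT = "/srv/storage/atocore"


@dataclass(frozen=True)
class StorageLayout:
    """Documented Dalidou locations, derived from one root directory."""

    root: Path

    @property
    def app(self) -> Path:
        return self.root / "app"

    @property
    def db(self) -> Path:
        return self.root / "data" / "db" / "atocore.db"

    @property
    def chroma(self) -> Path:
        return self.root / "data" / "chroma"

    @property
    def sources(self) -> tuple:
        return (self.root / "sources" / "vault", self.root / "sources" / "drive")


def layout_from_env(env=None) -> StorageLayout:
    # ATOCORE_STORAGE_ROOT is a hypothetical variable name, used for illustration only.
    env = os.environ if env is None else env
    return StorageLayout(Path(env.get("ATOCORE_STORAGE_ROOT", DEFAULT_ROOT)))


def missing_paths(layout: StorageLayout) -> list:
    """Return the documented locations that do not exist on this host."""
    candidates = [layout.app, layout.db, layout.chroma, *layout.sources]
    return [p for p in candidates if not p.exists()]
```

Running `missing_paths(layout_from_env())` on a host would list any documented location that is absent there, which is one cheap way to confirm that a deployment matches the layout described here.
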
The Dalidou instance already contains:

- AtoCore ecosystem and hosting docs
- current-state and OpenClaw integration docs
- Master Plan V3
- Build Spec V1
- trusted project-state entries for `atocore`
- curated staged project docs for:
  - `p04-gigabit`
  - `p05-interferometer`
  - `p06-polisher`
- curated repo-context docs for:
  - `p05`: `Fullum-Interferometer`
  - `p06`: `polisher-sim`
- trusted project-state entries for:
  - `p04-gigabit`
  - `p05-interferometer`
  - `p06-polisher`

Current live stats after the latest documentation sync and active-project ingest passes:

- `source_documents`: 34
- `source_chunks`: 550
- `vectors`: 550

The broader long-term corpus is not fully populated yet. Wider project and vault ingestion remains a deliberate next step rather than something already completed, but the corpus is now meaningfully seeded beyond AtoCore's own docs.

For human-readable quality review, the current staged project markdown corpus is primarily visible under:

- `/srv/storage/atocore/sources/vault/incoming/projects`

This staged area is now useful for review because it contains the curated project docs that were actually ingested for the first active-project batch.

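
For a quick review pass, the staged area can be enumerated and grouped per project. This is a rough sketch only: it assumes the staged docs are `.md` files organised in one subdirectory per project, which this document does not confirm, and the function name is hypothetical.

```python
from collections import defaultdict
from pathlib import Path

# Staged review location as documented above.
STAGED_ROOT = Path("/srv/storage/atocore/sources/vault/incoming/projects")


def staged_docs_by_project(root=STAGED_ROOT) -> dict:
    """Group staged markdown files by their top-level directory name.

    Assumes one subdirectory per project; files sitting directly under
    the root are collected under the key "(root)".
    """
    root = Path(root)
    if not root.is_dir():
        return {}
    grouped = defaultdict(list)
    for path in sorted(root.rglob("*.md")):
        rel = path.relative_to(root)
        project = rel.parts[0] if len(rel.parts) > 1 else "(root)"
        grouped[project].append(rel.as_posix())
    return dict(grouped)
```

The result is a plain mapping from directory name to relative file paths, so it can be printed or diffed against the list of projects expected in the first active-project batch.
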
It is important to read this staged area correctly:

- it is a readable ingestion input layer
- it is not the final machine-memory representation itself
- seeing familiar PKM-style notes there is expected
- the machine-processed intelligence lives in the DB, chunks, vectors, memory, trusted project state, and context-builder outputs

## What Is True On The T420

- SSH access is working
- OpenClaw workspace inspected at `/home/papa/clawd`
- OpenClaw's own memory system remains unchanged
- a read-only AtoCore integration skill exists in the workspace:
  - `/home/papa/clawd/skills/atocore-context/`
- the T420 can successfully reach Dalidou AtoCore over network/Tailscale
- fail-open behavior has been verified for the helper path
- OpenClaw can now seed AtoCore in two distinct ways:
  - project-scoped memory entries
  - staged document ingestion into the retrieval corpus

## What Exists In Memory vs Corpus

These remain separate and that is intentional.

In `/memory`:

- project-scoped curated memories now exist for:
  - `p04-gigabit`: 5 memories
  - `p05-interferometer`: 6 memories
  - `p06-polisher`: 8 memories

These are curated summaries and extracted stable project signals.

In `source_documents` / retrieval corpus:

- real project documents are now present for the same active project set
- retrieval is no longer limited to AtoCore self-knowledge only
- the current corpus is still selective rather than exhaustive
- that selectivity is intentional at this stage

The source refresh model now has a concrete foundation in code:

- a project registry file defines known project ids, aliases, and ingest roots
- the API can list registered projects
- the API can refresh one registered project at a time

In `Trusted Project State`:

- each active seeded project now has a conservative trusted-state set
- promoted facts cover:
  - summary
  - core architecture or boundary decision
  - key constraints
  - next focus

This separation is healthy:

- memory stores distilled project facts
- corpus stores the underlying retrievable documents

## Immediate Next Focus

1. Use the new T420-side AtoCore skill in real OpenClaw workflows
2. Tighten retrieval quality for the newly seeded active projects
3. Define the first broader AtoVault/AtoDrive ingestion batches
4. Add backup/export strategy for Dalidou machine state
5. Only later consider deeper automatic OpenClaw integration or write-back

## Guiding Constraints

- bad memory is worse than no memory
- trusted project state must remain highest priority
- human-readable sources and machine storage stay separate
- OpenClaw integration must not degrade OpenClaw baseline behavior

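
The last constraint, together with the fail-open behavior verified on the T420, can be illustrated with a small wrapper. This is a sketch of the general fail-open pattern, not the code of the actual `atocore-context` skill. The `fetch` callable stands in for whatever call reaches AtoCore, and both function names are hypothetical.

```python
from typing import Callable, Optional


def fetch_context_fail_open(fetch: Callable[[str], str], prompt: str) -> Optional[str]:
    """Return AtoCore context for `prompt`, or None if anything goes wrong.

    `fetch` is a hypothetical stand-in for the real call to AtoCore.
    """
    try:
        context = fetch(prompt)
    except Exception:
        # Fail open: an unreachable or broken AtoCore must never block the caller.
        return None
    # Treat an empty response the same as no context at all.
    return context or None


def build_prompt(prompt: str, fetch: Callable[[str], str]) -> str:
    """Prepend context when available; otherwise return the prompt unchanged."""
    context = fetch_context_fail_open(fetch, prompt)
    if context is None:
        return prompt  # baseline behavior, untouched
    return f"{context}\n\n{prompt}"
```

Passing the fetcher in as a parameter keeps the wrapper testable without a network: a stub that raises exercises the degraded path, and the caller's prompt comes back unmodified.
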