AtoCore Current State

Status Summary

AtoCore is no longer just a proof of concept. The local engine exists, the correctness pass is complete, Dalidou hosts the canonical runtime and machine-storage location, and the T420/OpenClaw side has a safe read-only path to consume AtoCore. The live corpus also extends beyond self-knowledge: it now includes a first curated ingestion batch for the active projects.

Phase Assessment

  • completed
    • Phase 0
    • Phase 0.5
    • Phase 1
  • baseline complete
    • Phase 2
    • Phase 3
    • Phase 5
    • Phase 7
  • partial
    • Phase 4
  • not started
    • Phase 6
    • Phase 8
    • Phase 9
    • Phase 10
    • Phase 11
    • Phase 12
    • Phase 13

What Exists Today

  • ingestion pipeline
  • parser and chunker
  • SQLite-backed memory and project state
  • vector retrieval
  • context builder
  • API routes for query, context, health, and source status
  • env-driven storage and deployment paths
  • Dalidou Docker deployment foundation
  • initial AtoCore self-knowledge corpus ingested on Dalidou
  • T420/OpenClaw read-only AtoCore helper skill
  • first curated active-project corpus batch for:
    • p04-gigabit
    • p05-interferometer
    • p06-polisher

What Is True On Dalidou

  • deployed repo location:
    • /srv/storage/atocore/app
  • canonical machine DB location:
    • /srv/storage/atocore/data/db/atocore.db
  • canonical vector store location:
    • /srv/storage/atocore/data/chroma
  • source input locations:
    • /srv/storage/atocore/sources/vault
    • /srv/storage/atocore/sources/drive

The service and storage foundation are live on Dalidou.

The machine-data host is real and canonical.

The content corpus is partially populated now.

The Dalidou instance already contains:

  • AtoCore ecosystem and hosting docs
  • current-state and OpenClaw integration docs
  • Master Plan V3
  • Build Spec V1
  • trusted project-state entries for atocore
  • curated staged project docs for:
    • p04-gigabit
    • p05-interferometer
    • p06-polisher
  • curated repo-context docs for:
    • p05: Fullum-Interferometer
    • p06: polisher-sim
  • trusted project-state entries for:
    • p04-gigabit
    • p05-interferometer
    • p06-polisher

Current live stats after the latest documentation sync and active-project ingest passes:

  • source_documents: 34
  • source_chunks: 550
  • vectors: 550

The broader long-term corpus is not yet fully populated. Wider project and vault ingestion remains a deliberate next step rather than completed work, but the corpus is now meaningfully seeded beyond AtoCore's own docs.

For human-readable quality review, the current staged project markdown corpus is primarily visible under:

  • /srv/storage/atocore/sources/vault/incoming/projects

This staged area is now useful for review because it contains the curated project docs that were actually ingested for the first active-project batch.

It is important to read this staged area correctly:

  • it is a readable ingestion input layer
  • it is not the final machine-memory representation itself
  • seeing familiar PKM-style notes there is expected
  • the machine-processed intelligence lives in the DB, chunks, vectors, memory, trusted project state, and context-builder outputs

What Is True On The T420

  • SSH access is working
  • OpenClaw workspace inspected at /home/papa/clawd
  • OpenClaw's own memory system remains unchanged
  • a read-only AtoCore integration skill exists in the workspace:
    • /home/papa/clawd/skills/atocore-context/
  • the T420 can successfully reach Dalidou AtoCore over network/Tailscale
  • fail-open behavior has been verified for the helper path
  • OpenClaw can now seed AtoCore in two distinct ways:
    • project-scoped memory entries
    • staged document ingestion into the retrieval corpus
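
Fail-open here means that an unreachable or misbehaving AtoCore must never block OpenClaw: the helper returns nothing and the caller carries on without extra context. A minimal sketch of that pattern follows. It is not the actual skill code; the `/context` route path, the `prompt` query parameter, and the JSON response shape are all assumptions made for illustration.

```python
import http.client
import json
import urllib.parse
import urllib.request
from typing import Optional


def fetch_context(base_url: str, prompt: str, timeout: float = 2.0) -> Optional[dict]:
    """Read-only context lookup that fails open.

    Returns the decoded JSON body on success, or None on any network,
    HTTP, or decoding problem so the caller can continue unchanged.
    The route and parameter names are hypothetical.
    """
    try:
        query = urllib.parse.urlencode({"prompt": prompt})
        request = urllib.request.Request(f"{base_url}/context?{query}", method="GET")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # OSError covers URLError, HTTPError, timeouts, and refused
        # connections; ValueError covers malformed URLs and bad JSON.
        return None
```

A caller would treat `None` as "no AtoCore context available" and proceed with OpenClaw's own memory, which keeps baseline behavior intact.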

What Exists In Memory vs Corpus

These remain separate, and that is intentional.

In /memory:

  • project-scoped curated memories now exist for:
    • p04-gigabit: 5 memories
    • p05-interferometer: 6 memories
    • p06-polisher: 8 memories

These are curated summaries and extracted stable project signals.

In source_documents / retrieval corpus:

  • real project documents are now present for the same active project set
  • retrieval is no longer limited to AtoCore self-knowledge
  • the current corpus is still selective rather than exhaustive
  • that selectivity is intentional at this stage

In Trusted Project State:

  • each active seeded project now has a conservative trusted-state set
  • promoted facts cover:
    • summary
    • core architecture or boundary decision
    • key constraints
    • next focus

This separation is healthy:

  • memory stores distilled project facts
  • corpus stores the underlying retrievable documents

Immediate Next Focus

  1. Use the new T420-side AtoCore skill in real OpenClaw workflows
  2. Tighten retrieval quality for the newly seeded active projects
  3. Define the first broader AtoVault/AtoDrive ingestion batches
  4. Add backup/export strategy for Dalidou machine state
  5. Only later consider deeper automatic OpenClaw integration or write-back

Guiding Constraints

  • bad memory is worse than no memory
  • trusted project state must remain highest priority
  • human-readable sources and machine storage stay separate
  • OpenClaw integration must not degrade OpenClaw baseline behavior
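
The constraint that trusted project state must remain highest priority can be illustrated with a small selection routine. This is a sketch under stated assumptions, not the actual context builder: the layer names, the ordering of memory ahead of corpus, and the character budget are all hypothetical.

```python
from dataclasses import dataclass

# Lower number means higher priority. Only "trusted state first" comes
# from the constraints above; ranking memory ahead of corpus is assumed.
LAYER_PRIORITY = {"trusted_state": 0, "memory": 1, "corpus": 2}


@dataclass(frozen=True)
class ContextItem:
    layer: str
    text: str


def build_context(items, budget_chars: int):
    """Select items in layer-priority order until the budget is used.

    Items that would overflow the budget are skipped, so a large
    low-priority item can never displace trusted project state.
    """
    ordered = sorted(items, key=lambda item: LAYER_PRIORITY[item.layer])
    selected = []
    used = 0
    for item in ordered:
        if used + len(item.text) > budget_chars:
            continue
        selected.append(item)
        used += len(item.text)
    return selected
```

Because the sort is stable, items within the same layer keep their incoming order, for example a retrieval ranking.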