Clarify operating model and project corpus state

2026-04-06 07:25:33 -04:00
parent 5069d5b1b6
commit 8a94da4bf4
4 changed files with 220 additions and 20 deletions
--- a/docs/current-state.md
+++ b/docs/current-state.md
@@ -5,7 +5,8 @@
 AtoCore is no longer just a proof of concept. The local engine exists, the
 correctness pass is complete, Dalidou now hosts the canonical runtime and
 machine-storage location, and the T420/OpenClaw side now has a safe read-only
-path to consume AtoCore.
+path to consume AtoCore. The live corpus is no longer just self-knowledge: it
+now includes a first curated ingestion batch for the active projects.

 ## Phase Assessment

@@ -41,6 +42,10 @@ path to consume AtoCore.
 - Dalidou Docker deployment foundation
 - initial AtoCore self-knowledge corpus ingested on Dalidou
 - T420/OpenClaw read-only AtoCore helper skill
+- first curated active-project corpus batch for:
+  - `p04-gigabit`
+  - `p05-interferometer`
+  - `p06-polisher`

 ## What Is True On Dalidou

@@ -67,10 +72,23 @@ The Dalidou instance already contains:
 - Master Plan V3
 - Build Spec V1
 - trusted project-state entries for `atocore`
+- curated staged project docs for:
+  - `p04-gigabit`
+  - `p05-interferometer`
+  - `p06-polisher`
+- curated repo-context docs for:
+  - `p05`: `Fullum-Interferometer`
+  - `p06`: `polisher-sim`
+
+Current live stats after the first active-project ingest pass:
+
+- `source_documents`: 32
+- `source_chunks`: 523
+- `vectors`: 523

 The broader long-term corpus is still not fully populated yet. Wider project and
 vault ingestion remains a deliberate next step rather than something already
-completed.
+completed, but the corpus is now meaningfully seeded beyond AtoCore's own docs.

 ## What Is True On The T420

@@ -81,14 +99,41 @@ completed.
  - `/home/papa/clawd/skills/atocore-context/`
 - the T420 can successfully reach Dalidou AtoCore over network/Tailscale
 - fail-open behavior has been verified for the helper path
+- OpenClaw can now seed AtoCore in two distinct ways:
+  - project-scoped memory entries
+  - staged document ingestion into the retrieval corpus
+
+## What Exists In Memory vs Corpus
+
+These remain separate, and that is intentional.
+
+In `/memory`:
+
+- project-scoped curated memories now exist for:
+  - `p04-gigabit`: 5 memories
+  - `p05-interferometer`: 6 memories
+  - `p06-polisher`: 8 memories
+
+These are curated summaries and extracted stable project signals.
+
+In `source_documents` / retrieval corpus:
+
+- real project documents are now present for the same active project set
+- retrieval is no longer limited to AtoCore self-knowledge
+
+This separation is healthy:
+
+- memory stores distilled project facts
+- corpus stores the underlying retrievable documents

 ## Immediate Next Focus

 1. Use the new T420-side AtoCore skill in real OpenClaw workflows
-2. Ingest selected active project sources in a controlled way
-3. Define the first broader AtoVault/AtoDrive ingestion batches
-4. Add backup/export strategy for Dalidou machine state
-5. Only later consider deeper automatic OpenClaw integration or write-back
+2. Tighten retrieval quality for the newly seeded active projects
+3. Promote only the most stable active-project facts into trusted project state
+4. Define the first broader AtoVault/AtoDrive ingestion batches
+5. Add backup/export strategy for Dalidou machine state
+6. Only later consider deeper automatic OpenClaw integration or write-back

 ## Guiding Constraints