From 8a94da4bf4ab675ae566bf0c84c4236171929201 Mon Sep 17 00:00:00 2001
From: Anto01
Date: Mon, 6 Apr 2026 07:25:33 -0400
Subject: [PATCH] Clarify operating model and project corpus state

---
 docs/atocore-ecosystem-and-hosting.md | 20 ++++++
 docs/current-state.md                 | 57 ++++++++++++++--
 docs/next-steps.md                    | 70 ++++++++++++++++----
 docs/operating-model.md               | 93 +++++++++++++++++++++++++++
 4 files changed, 220 insertions(+), 20 deletions(-)
 create mode 100644 docs/operating-model.md

diff --git a/docs/atocore-ecosystem-and-hosting.md b/docs/atocore-ecosystem-and-hosting.md
index 2829486..29531d7 100644
--- a/docs/atocore-ecosystem-and-hosting.md
+++ b/docs/atocore-ecosystem-and-hosting.md
@@ -86,6 +86,24 @@ file replication of the live machine store.
 - OpenClaw must continue to work if AtoCore is unavailable
 - write-back from OpenClaw into AtoCore is deferred until later phases
 
+## Intended Daily Operating Model
+
+The target workflow is:
+
+- the human continues to work primarily in PKM project notes, Git/Gitea repos,
+  Discord, and normal OpenClaw sessions
+- OpenClaw keeps its own runtime behavior and memory system
+- AtoCore acts as the durable external context layer that compiles trusted
+  project state, retrieval, and long-lived machine-readable context
+- AtoCore improves prompt quality and robustness without replacing direct repo
+  work, direct file reads, or OpenClaw's own memory
+
+In other words:
+
+- PKM and repos remain the human-authoritative project sources
+- OpenClaw remains the active operating environment
+- AtoCore remains the compiled context engine and machine-memory host
+
 ## Current Status
 
 As of the current implementation pass:
@@ -95,6 +113,8 @@ As of the current implementation pass:
 - the service is running from Dalidou
 - the T420/OpenClaw machine can reach AtoCore over network
 - a first read-only OpenClaw-side helper exists
+- the live corpus now includes initial AtoCore self-knowledge and a first
+  curated batch for active projects
 - the long-term content corpus still needs broader project and vault ingestion
 
 This means the platform is hosted on Dalidou now, the first cross-machine
diff --git a/docs/current-state.md b/docs/current-state.md
index 53baaf2..cf25630 100644
--- a/docs/current-state.md
+++ b/docs/current-state.md
@@ -5,7 +5,8 @@
 AtoCore is no longer just a proof of concept. The local engine exists, the
 correctness pass is complete, Dalidou now hosts the canonical runtime and
 machine-storage location, and the T420/OpenClaw side now has a safe read-only
-path to consume AtoCore.
+path to consume AtoCore. The live corpus is no longer just self-knowledge: it
+now includes a first curated ingestion batch for the active projects.
 
 ## Phase Assessment
 
@@ -41,6 +42,10 @@ path to consume AtoCore.
 - Dalidou Docker deployment foundation
 - initial AtoCore self-knowledge corpus ingested on Dalidou
 - T420/OpenClaw read-only AtoCore helper skill
+- first curated active-project corpus batch for:
+  - `p04-gigabit`
+  - `p05-interferometer`
+  - `p06-polisher`
 
 ## What Is True On Dalidou
 
@@ -67,10 +72,23 @@ The Dalidou instance already contains:
 - Master Plan V3
 - Build Spec V1
 - trusted project-state entries for `atocore`
+- curated staged project docs for:
+  - `p04-gigabit`
+  - `p05-interferometer`
+  - `p06-polisher`
+- curated repo-context docs for:
+  - `p05`: `Fullum-Interferometer`
+  - `p06`: `polisher-sim`
+
+Current live stats after the first active-project ingest pass:
+
+- `source_documents`: 32
+- `source_chunks`: 523
+- `vectors`: 523
 
 The broader long-term corpus is still not fully populated yet. Wider project
 and vault ingestion remains a deliberate next step rather than something already
-completed.
+completed, but the corpus is now meaningfully seeded beyond AtoCore's own docs.
 
 ## What Is True On The T420
 
@@ -81,14 +99,41 @@ completed.
 - `/home/papa/clawd/skills/atocore-context/`
 - the T420 can successfully reach Dalidou AtoCore over network/Tailscale
 - fail-open behavior has been verified for the helper path
+- OpenClaw can now seed AtoCore in two distinct ways:
+  - project-scoped memory entries
+  - staged document ingestion into the retrieval corpus
+
+## What Exists In Memory vs Corpus
+
+These remain separate and that is intentional.
+
+In `/memory`:
+
+- project-scoped curated memories now exist for:
+  - `p04-gigabit`: 5 memories
+  - `p05-interferometer`: 6 memories
+  - `p06-polisher`: 8 memories
+
+These are curated summaries and extracted stable project signals.
+
+In `source_documents` / retrieval corpus:
+
+- real project documents are now present for the same active project set
+- retrieval is no longer limited to AtoCore self-knowledge only
+
+This separation is healthy:
+
+- memory stores distilled project facts
+- corpus stores the underlying retrievable documents
 
 ## Immediate Next Focus
 
 1. Use the new T420-side AtoCore skill in real OpenClaw workflows
-2. Ingest selected active project sources in a controlled way
-3. Define the first broader AtoVault/AtoDrive ingestion batches
-4. Add backup/export strategy for Dalidou machine state
-5. Only later consider deeper automatic OpenClaw integration or write-back
+2. Tighten retrieval quality for the newly seeded active projects
+3. Promote only the most stable active-project facts into trusted project state
+4. Define the first broader AtoVault/AtoDrive ingestion batches
+5. Add backup/export strategy for Dalidou machine state
+6. Only later consider deeper automatic OpenClaw integration or write-back
 
 ## Guiding Constraints
 
diff --git a/docs/next-steps.md b/docs/next-steps.md
index b4b5637..e41a044 100644
--- a/docs/next-steps.md
+++ b/docs/next-steps.md
@@ -9,38 +9,67 @@ AtoCore now has:
 - initial self-knowledge ingested into the live instance
 - trusted project-state entries for AtoCore itself
 - a first read-only OpenClaw integration path on the T420
+- a first real active-project corpus batch for:
+  - `p04-gigabit`
+  - `p05-interferometer`
+  - `p06-polisher`
 
 ## Immediate Next Steps
 
 1. Use the T420 `atocore-context` skill in real OpenClaw workflows
    - confirm the ergonomics are good
    - confirm the fail-open behavior remains acceptable in practice
-2. Ingest selected active projects only
-   - start with the current active project set
-   - prefer trusted operational/project sources first
-   - ingest broader PKM sources only after the trusted layer is loaded
-3. Review retrieval quality after the first real project ingestion batch
+2. Review retrieval quality after the first real project ingestion batch
    - check whether the top hits are useful
    - check whether trusted project state remains dominant
-4. Define backup and export procedures for Dalidou
+   - reduce cross-project competition and prompt ambiguity where needed
+3. Promote a small number of stable active-project facts into trusted project
+   state
+   - active architecture
+   - current selected path
+   - key constraints
+   - current next step
+4. Continue controlled project ingestion only where the current corpus is still
+   thin
+   - a few additional anchor docs per active project
+5. Define backup and export procedures for Dalidou
    - SQLite snapshot/backup strategy
    - Chroma backup or rebuild policy
-5. Keep deeper automatic runtime integration deferred until the read-only model
+6. Keep deeper automatic runtime integration deferred until the read-only model
    has proven value
 
-## Recommended Active Project Ingestion Order
+## Recommended Near-Term Project Work
+
+The first curated batch is already in.
+
+The near-term work is now:
+
+1. strengthen retrieval quality
+2. promote the most stable facts into trusted project state
+3. only then add a few more anchor docs where still needed
+
+## Recommended Additional Anchor Docs
 
 1. `p04-gigabit`
 2. `p05-interferometer`
 3. `p06-polisher`
 
-For each project:
+P04:
 
-1. identify the matching AtoDrive/project-operational sources
-2. identify the matching PKM project folder(s)
-3. ingest the trusted/operational material first
-4. ingest broader notes second
-5. review retrieval quality before moving on
+- 1 to 2 more strong study summaries
+- 1 to 2 more meeting notes with actual decisions
+
+P05:
+
+- a couple more architecture docs
+- selected vendor-response notes
+- possibly one or two NX/WAVE consumer docs
+
+P06:
+
+- more explicit interface/schema docs if needed
+- selected operations or UI docs
+- a distilled non-empty operational context doc to replace an empty `_context.md`
 
 ## Deferred On Purpose
 
@@ -56,5 +85,18 @@ The next batch is successful if:
 
 - OpenClaw can use AtoCore naturally when context is needed
 - AtoCore answers correctly for the active project set
+- retrieval surfaces the seeded project docs instead of mostly AtoCore meta-docs
 - project ingestion remains controlled rather than noisy
 - the canonical Dalidou instance stays stable
+
+## Long-Run Goal
+
+The long-run target is:
+
+- continue working normally inside PKM project stacks and Gitea repos
+- let OpenClaw keep its own memory and runtime behavior
+- let AtoCore supplement LLM work with stronger trusted context, retrieval, and
+  context assembly
+
+That means AtoCore should behave like a durable external context engine and
+machine-memory layer, not a replacement for normal repo work or OpenClaw memory.
diff --git a/docs/operating-model.md b/docs/operating-model.md
new file mode 100644
index 0000000..16ece2e
--- /dev/null
+++ b/docs/operating-model.md
@@ -0,0 +1,93 @@
+# AtoCore Operating Model
+
+## Purpose
+
+This document makes the intended day-to-day operating model explicit.
+
+The goal is not to replace how work already happens. The goal is to make that
+existing workflow stronger by adding a durable context engine.
+
+## Core Idea
+
+Normal work continues in:
+
+- PKM project notes
+- Gitea repositories
+- Discord and OpenClaw workflows
+
+OpenClaw keeps:
+
+- its own memory
+- its own runtime and orchestration behavior
+- its own workspace and direct file/repo tooling
+
+AtoCore adds:
+
+- trusted project state
+- retrievable cross-source context
+- durable machine memory
+- context assembly that improves prompt quality and robustness
+
+## Layer Responsibilities
+
+- PKM and repos
+  - human-authoritative project sources
+  - where knowledge is created, edited, reviewed, and maintained
+- OpenClaw
+  - active operating environment
+  - orchestration, direct repo work, messaging, agent workflows, local memory
+- AtoCore
+  - compiled context engine
+  - durable machine-memory host
+  - retrieval and context assembly layer
+
+## Why This Architecture Works
+
+Each layer has different strengths and weaknesses.
+
+- PKM and repos are rich but noisy and manual to search
+- OpenClaw memory is useful but session-shaped and not the whole project record
+- raw LLM repo work is powerful but can miss trusted broader context
+- AtoCore can compile context across sources and provide a better prompt input
+
+The result should be:
+
+- stronger prompts
+- more robust outputs
+- less manual reconstruction
+- better continuity across sessions and models
+
+## What AtoCore Should Not Replace
+
+AtoCore should not replace:
+
+- normal file reads
+- direct repo search
+- direct PKM work
+- OpenClaw's own memory
+- OpenClaw's runtime and tool behavior
+
+It should supplement those systems.
+
+## What Healthy Usage Looks Like
+
+When working on a project:
+
+1. OpenClaw still uses local workspace/repo context
+2. OpenClaw still uses its own memory
+3. AtoCore adds:
+   - trusted current project state
+   - retrieved project documents
+   - cross-source project context
+   - context assembly for more robust model prompts
+
+## Practical Rule
+
+Think of AtoCore as the durable external context hard drive for LLM work:
+
+- fast machine-readable context
+- persistent project understanding
+- stronger prompt inputs
+- no need to replace the normal project workflow
+
+That is the architecture target.