Clarify source staging and refresh model
@@ -49,6 +49,31 @@ separate.
 The machine database is derived operational state, not the primary
 human-readable source of truth.
 
+## Source Snapshot Vs Machine Store
+
+The human-readable files visible under `sources/vault` or `sources/drive` are
+not the final "smart storage" format of AtoCore.
+
+They are source snapshots made visible to the canonical Dalidou instance so
+AtoCore can ingest them.
+
+The actual machine-processed state lives in:
+
+- `source_documents`
+- `source_chunks`
+- vector embeddings and indexes
+- project memories
+- trusted project state
+- context-builder output
+
+This means the staged markdown can still look very similar to the original PKM
+or repo docs. That is normal.
+
+The intelligence does not come from rewriting everything into a new markdown
+vault. It comes from ingesting selected source material into the machine store
+and then using that store for retrieval, trust-aware context assembly, and
+memory.
+
 ## Canonical Hosting Model
 
 Dalidou is the canonical host for the AtoCore service and machine database.
@@ -86,6 +111,14 @@ file replication of the live machine store.
 - OpenClaw must continue to work if AtoCore is unavailable
 - write-back from OpenClaw into AtoCore is deferred until later phases
 
+Current staging behavior:
+
+- selected project docs may be copied into a readable staging area on Dalidou
+- AtoCore ingests from that staging area into the machine store
+- the staging area is not itself the durable intelligence layer
+- changes to the original PKM or repo source do not propagate automatically
+  until a refresh or re-ingest happens
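Because changes do not propagate automatically, it can help to detect when a staged copy has drifted from its original. The following is a minimal sketch, assuming the staging area mirrors the relative layout of the original source tree. `file_digest`, `find_stale_files`, and both directory arguments are hypothetical names for illustration, not part of AtoCore.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_stale_files(original_root: Path, staged_root: Path) -> list[str]:
    """List relative paths of staged markdown files that differ from the original.

    Hypothetical helper: assumes the staging area mirrors the relative
    layout of the original tree. Staged files with no matching original
    are ignored here.
    """
    stale = []
    for staged in sorted(staged_root.rglob("*.md")):
        rel = staged.relative_to(staged_root)
        original = original_root / rel
        if original.is_file() and file_digest(original) != file_digest(staged):
            stale.append(rel.as_posix())
    return stale
```

A non-empty result would suggest that a refresh or re-ingest is due for those files.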
+
 ## Intended Daily Operating Model
 
 The target workflow is:
@@ -84,11 +84,12 @@ The Dalidou instance already contains:
 - `p05-interferometer`
 - `p06-polisher`
 
-Current live stats after the first active-project ingest pass:
+Current live stats after the latest documentation sync and active-project ingest
+passes:
 
-- `source_documents`: 33
-- `source_chunks`: 535
-- `vectors`: 535
+- `source_documents`: 34
+- `source_chunks`: 550
+- `vectors`: 550
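Counts like these could be re-checked after each ingest pass. This is a sketch only, assuming `source_documents` and `source_chunks` are tables in a SQLite database. The function name `corpus_counts` and the `db_path` argument are hypothetical, and the vector count is not covered because it would come from the vector store rather than SQLite.

```python
import sqlite3


def corpus_counts(db_path: str) -> dict[str, int]:
    """Count rows in the two source tables (assumed to be SQLite tables)."""
    counts = {}
    conn = sqlite3.connect(db_path)
    try:
        for table in ("source_documents", "source_chunks"):
            # Table names come from this fixed tuple, never from user input.
            row = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
            counts[table] = row[0]
    finally:
        conn.close()
    return counts
```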
 
 The broader long-term corpus is still not fully populated yet. Wider project and
 vault ingestion remains a deliberate next step rather than something already
@@ -102,6 +103,14 @@ primarily visible under:
 This staged area is now useful for review because it contains the curated
 project docs that were actually ingested for the first active-project batch.
 
+It is important to read this staged area correctly:
+
+- it is a readable ingestion input layer
+- it is not the final machine-memory representation itself
+- seeing familiar PKM-style notes there is expected
+- the machine-processed intelligence lives in the DB, chunks, vectors, memory,
+  trusted project state, and context-builder outputs
+
 ## What Is True On The T420
 
 - SSH access is working
@@ -132,6 +141,8 @@ In `source_documents` / retrieval corpus:
 
 - real project documents are now present for the same active project set
 - retrieval is no longer limited to AtoCore self-knowledge only
+- the current corpus is still selective rather than exhaustive
+- that selectivity is intentional at this stage
 
 In `Trusted Project State`:
 
@@ -26,10 +26,14 @@ AtoCore now has:
 3. Continue controlled project ingestion only where the current corpus is still
    thin
    - a few additional anchor docs per active project
-4. Define backup and export procedures for Dalidou
+4. Define a cleaner source refresh model
+   - make the difference between source truth, staged inputs, and machine store
+     explicit
+   - move toward a project source registry and refresh workflow
+5. Define backup and export procedures for Dalidou
    - SQLite snapshot/backup strategy
    - Chroma backup or rebuild policy
-5. Keep deeper automatic runtime integration deferred until the read-only model
+6. Keep deeper automatic runtime integration deferred until the read-only model
    has proven value
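For the SQLite snapshot item in the list above, one possible starting point is the online backup API in Python's standard `sqlite3` module. This is a sketch under assumptions, not the chosen strategy: `snapshot_sqlite` and both path arguments are hypothetical names, and the Chroma side is not covered.

```python
import sqlite3


def snapshot_sqlite(live_db: str, snapshot_path: str) -> None:
    """Copy a live SQLite database to snapshot_path via the online backup API."""
    source = sqlite3.connect(live_db)
    target = sqlite3.connect(snapshot_path)
    try:
        # Connection.backup copies a consistent snapshot even while other
        # connections are using the source database.
        source.backup(target)
    finally:
        target.close()
        source.close()
```

Unlike a plain file copy, this approach does not require stopping the service that holds the database open.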
 
 ## Trusted State Status
@@ -91,3 +91,52 @@ Think of AtoCore as the durable external context hard drive for LLM work:
 - no need to replace the normal project workflow
 
 That is the architecture target.
+
+## Why The Staged Markdown Exists
+
+The staged markdown on Dalidou is a source-input layer, not the end product of
+the system.
+
+In the current deployment model:
+
+1. selected PKM, AtoDrive, or repo docs are copied or mirrored into a Dalidou
+   source path
+2. AtoCore ingests them
+3. the machine store keeps the processed representation
+4. retrieval and context building operate on that machine store
+
+So if the staged docs look very similar to your original PKM notes, that is
+expected. They are source material, not the compiled context layer itself.
+
+## What Happens When A Source Changes
+
+If you edit a PKM note or repo doc at the original source, AtoCore does not
+magically know yet.
+
+The current model is refresh-based:
+
+1. update the human-authoritative source
+2. refresh or re-stage the relevant project source set on Dalidou
+3. run ingestion again
+4. let AtoCore update the machine representation
+
+This is still an intermediate workflow. The long-run target is a cleaner source
+registry and refresh model so that commands like `refresh p05-interferometer`
+become natural and reliable.
+
+## Current Scope Of Ingestion
+
+The current project corpus is intentionally selective, not exhaustive.
+
+For active projects, the goal right now is to ingest:
+
+- high-value anchor docs
+- strong meeting notes with real decisions
+- architecture and constraints docs
+- selected repo context that explains the system shape
+
+The goal is not to dump the entire PKM or whole repo tree into AtoCore on the
+first pass.
+
+So if a project only has some curated notes and not the full project universe in
+the staged area yet, that is normal for the current phase.
89
docs/source-refresh-model.md
Normal file
@@ -0,0 +1,89 @@
+# AtoCore Source Refresh Model
+
+## Purpose
+
+This document explains how human-authored project material should flow into the
+Dalidou-hosted AtoCore machine store.
+
+It exists to make one distinction explicit:
+
+- source markdown is not the same thing as the machine-memory layer
+- source refresh is how changes in PKM or repos become visible to AtoCore
+
+## Current Model
+
+Today, the flow is:
+
+1. human-authoritative project material exists in PKM, AtoDrive, and repos
+2. selected high-value files are staged into Dalidou source paths
+3. AtoCore ingests those source files
+4. AtoCore stores the processed representation in:
+   - document records
+   - chunks
+   - vectors
+   - project memory
+   - trusted project state
+5. retrieval and context assembly use the machine store, not the staged folder
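To make the difference between a staged file and its processed representation concrete, here is a deliberately naive chunking sketch. It is not AtoCore's actual chunker: the `Chunk` type, `chunk_markdown`, and the `max_chars` limit are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    document_path: str  # staged file the chunk came from
    index: int          # position of the chunk within the document
    text: str


def chunk_markdown(document_path: str, text: str, max_chars: int = 800) -> list[Chunk]:
    """Pack blank-line-separated paragraphs into chunks of up to max_chars.

    Illustrative only: a single paragraph longer than max_chars becomes
    its own oversized chunk rather than being split further.
    """
    chunks: list[Chunk] = []
    buffer = ""
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        candidate = f"{buffer}\n\n{paragraph}" if buffer else paragraph
        if buffer and len(candidate) > max_chars:
            # Adding this paragraph would overflow, so close the current chunk.
            chunks.append(Chunk(document_path, len(chunks), buffer))
            buffer = paragraph
        else:
            buffer = candidate
    if buffer:
        chunks.append(Chunk(document_path, len(chunks), buffer))
    return chunks
```

Each chunk would then be embedded and indexed, which is why the staged markdown can stay unchanged while the machine store holds a very different shape of data.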
+
+## Why This Feels Redundant
+
+The staged source files can look almost identical to the original PKM notes or
+repo docs because they are still source material.
+
+That is expected.
+
+The staged source area exists because the canonical AtoCore instance on Dalidou
+needs a server-visible path to ingest from.
+
+## What Happens When A Project Source Changes
+
+If you edit a note in PKM or a doc in a repo:
+
+- the original source changes immediately
+- the staged Dalidou copy does not change automatically
+- the AtoCore machine store also does not change automatically
+
+To refresh AtoCore:
+
+1. select the updated project source set
+2. copy or mirror the new version into the Dalidou source area
+3. run ingestion again
+4. verify that retrieval and context reflect the new material
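Steps 1 and 2 of the refresh could be scripted along these lines. This is a sketch assuming both the original source and the Dalidou source area are reachable as local paths. `stage_files` and its arguments are hypothetical, and steps 3 and 4 are left out because the source does not specify the ingestion command.

```python
import shutil
from pathlib import Path


def stage_files(source_root: Path, staged_root: Path, relative_paths: list[str]) -> list[Path]:
    """Copy the selected files into the staging area, preserving relative paths."""
    staged = []
    for rel in relative_paths:
        src = source_root / rel
        dst = staged_root / rel
        # Create intermediate folders so nested notes keep their layout.
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves modification times
        staged.append(dst)
    return staged
```

Passing an explicit list of paths keeps the staging step selective, in line with the curated ingestion approach described in this document.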
+
+## Current Intentional Limits
+
+The current active-project ingestion strategy is selective.
+
+That means:
+
+- not every note from a project is staged
+- not every repo file is staged
+- the goal is to start with high-value anchor docs
+- broader ingestion comes later if needed
+
+This is why the staged source area for a project may look partial or uneven at
+this stage.
+
+## Long-Run Target
+
+The long-run workflow should become much more natural:
+
+- each project has a registered source map
+  - PKM root
+  - AtoDrive root
+  - repo root
+  - preferred docs
+  - excluded noisy paths
+- a command like `refresh p06-polisher` resolves the right sources
+- AtoCore refreshes the machine representation cleanly
+- OpenClaw consumes the improved context over API
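A registered source map could be modeled roughly as follows. This is a sketch of the target shape, not an existing AtoCore structure: `ProjectSourceMap`, `resolve_project`, and the field names are hypothetical and simply mirror the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectSourceMap:
    """Hypothetical registry entry mirroring the fields listed above."""
    pkm_root: str
    atodrive_root: str
    repo_root: str
    preferred_docs: tuple[str, ...] = ()
    excluded_paths: tuple[str, ...] = ()

    def is_excluded(self, relative_path: str) -> bool:
        """True if the path equals or sits under one of the excluded prefixes."""
        for prefix in self.excluded_paths:
            base = prefix.rstrip("/")
            if relative_path == base or relative_path.startswith(base + "/"):
                return True
        return False


def resolve_project(registry: dict[str, ProjectSourceMap], project: str) -> ProjectSourceMap:
    """Look up a project's source map, failing loudly for unregistered names."""
    try:
        return registry[project]
    except KeyError:
        raise KeyError(f"no source map registered for project {project!r}") from None
```

A `refresh p06-polisher` command would then start by calling `resolve_project(registry, "p06-polisher")` and only stage paths for which `is_excluded` returns false.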
+
+## Healthy Mental Model
+
+Use this distinction:
+
+- PKM / AtoDrive / repos = human-authoritative sources
+- staged Dalidou markdown = server-visible ingestion inputs
+- AtoCore DB/vector state = compiled machine context layer
+
+That separation is intentional and healthy.