Expand active project wave and serialize refreshes
This commit is contained in:
@@ -52,7 +52,7 @@ now includes a first curated ingestion batch for the active projects.
|
||||
- Dalidou Docker deployment foundation
|
||||
- initial AtoCore self-knowledge corpus ingested on Dalidou
|
||||
- T420/OpenClaw read-only AtoCore helper skill
|
||||
- first curated active-project corpus batch for:
|
||||
- full active-project markdown/text corpus wave for:
|
||||
- `p04-gigabit`
|
||||
- `p05-interferometer`
|
||||
- `p06-polisher`
|
||||
@@ -87,7 +87,7 @@ The Dalidou instance already contains:
|
||||
- Master Plan V3
|
||||
- Build Spec V1
|
||||
- trusted project-state entries for `atocore`
|
||||
- curated staged project docs for:
|
||||
- full staged project markdown/text corpora for:
|
||||
- `p04-gigabit`
|
||||
- `p05-interferometer`
|
||||
- `p06-polisher`
|
||||
@@ -99,12 +99,12 @@ The Dalidou instance already contains:
|
||||
- `p05-interferometer`
|
||||
- `p06-polisher`
|
||||
|
||||
Current live stats after the latest documentation sync and active-project ingest
|
||||
passes:
|
||||
Current live stats after the full active-project wave are now far beyond the
|
||||
initial seed stage:
|
||||
|
||||
- `source_documents`: 36
|
||||
- `source_chunks`: 568
|
||||
- `vectors`: 568
|
||||
- more than `1,100` source documents
|
||||
- more than `20,000` chunks
|
||||
- matching vector count
|
||||
|
||||
The broader long-term corpus is still not fully populated yet. Wider project and
|
||||
vault ingestion remains a deliberate next step rather than something already
|
||||
@@ -115,8 +115,8 @@ primarily visible under:
|
||||
|
||||
- `/srv/storage/atocore/sources/vault/incoming/projects`
|
||||
|
||||
This staged area is now useful for review because it contains the curated
|
||||
project docs that were actually ingested for the first active-project batch.
|
||||
This staged area is now useful for review because it contains the markdown/text
|
||||
project docs that were actually ingested for the full active-project wave.
|
||||
|
||||
It is important to read this staged area correctly:
|
||||
|
||||
@@ -166,10 +166,12 @@ These are curated summaries and extracted stable project signals.
|
||||
|
||||
In `source_documents` / retrieval corpus:
|
||||
|
||||
- real project documents are now present for the same active project set
|
||||
- full project markdown/text corpora are now present for the active project set
|
||||
- retrieval is no longer limited to AtoCore self-knowledge only
|
||||
- the current corpus is still selective rather than exhaustive
|
||||
- that selectivity is intentional at this stage
|
||||
- the current corpus is broad enough that ranking quality matters more than
|
||||
corpus presence alone
|
||||
- underspecified prompts can still pull in historical or archive material, so
|
||||
project-aware routing and better ranking remain important
|
||||
|
||||
The source refresh model now has a concrete foundation in code:
|
||||
|
||||
@@ -223,8 +225,8 @@ This separation is healthy:
|
||||
## Immediate Next Focus
|
||||
|
||||
1. Use the new T420-side organic routing layer in real OpenClaw workflows
|
||||
2. Keep tightening retrieval quality for the newly seeded active projects
|
||||
3. Define the first broader AtoVault/AtoDrive ingestion batches
|
||||
2. Tighten retrieval quality for the now fully ingested active project corpora
|
||||
3. Move to Wave 2 trusted-operational ingestion instead of blindly widening raw corpus further
|
||||
4. Keep the new engineering-knowledge architecture docs as implementation guidance while avoiding premature schema work
|
||||
5. Expand the boring operations baseline:
|
||||
- restore validation
|
||||
@@ -234,6 +236,7 @@ This separation is healthy:
|
||||
|
||||
See also:
|
||||
|
||||
- [ingestion-waves.md](C:/Users/antoi/ATOCore/docs/ingestion-waves.md)
|
||||
- [master-plan-status.md](C:/Users/antoi/ATOCore/docs/master-plan-status.md)
|
||||
|
||||
## Guiding Constraints
|
||||
|
||||
Reference in New Issue
Block a user