Harden runtime and add backup foundation
This commit is contained in:
80
docs/backup-strategy.md
Normal file
80
docs/backup-strategy.md
Normal file
@@ -0,0 +1,80 @@
|
||||
# AtoCore Backup Strategy
|
||||
|
||||
## Purpose
|
||||
|
||||
This document describes the current backup baseline for the Dalidou-hosted
|
||||
AtoCore machine store.
|
||||
|
||||
The immediate goal is not full disaster-proof automation yet. The goal is to
|
||||
have one safe, repeatable way to snapshot the most important writable state.
|
||||
|
||||
## Current Backup Baseline
|
||||
|
||||
Today, the safest hot-backup target is:
|
||||
|
||||
- SQLite machine database
|
||||
- project registry JSON
|
||||
- backup metadata describing what was captured
|
||||
|
||||
This is now supported by:
|
||||
|
||||
- `python -m atocore.ops.backup`
|
||||
|
||||
## What The Script Captures
|
||||
|
||||
The backup command creates a timestamped snapshot under:
|
||||
|
||||
- `ATOCORE_BACKUP_DIR/snapshots/<timestamp>/`
|
||||
|
||||
It currently writes:
|
||||
|
||||
- `db/atocore.db`
|
||||
- created with SQLite's backup API
|
||||
- `config/project-registry.json`
|
||||
- copied if it exists
|
||||
- `backup-metadata.json`
|
||||
- timestamp, paths, and backup notes
|
||||
|
||||
## What It Does Not Yet Capture
|
||||
|
||||
The current script does not hot-backup Chroma.
|
||||
|
||||
That is intentional.
|
||||
|
||||
For now, Chroma should be treated as one of:
|
||||
|
||||
- rebuildable derived state
|
||||
- or something that needs a deliberate cold snapshot/export workflow
|
||||
|
||||
Until that workflow exists, do not rely on ad hoc live file copies of the
|
||||
vector store while the service is actively writing.
|
||||
|
||||
## Dalidou Use
|
||||
|
||||
On Dalidou, the canonical machine paths are:
|
||||
|
||||
- DB:
|
||||
- `/srv/storage/atocore/data/db/atocore.db`
|
||||
- registry:
|
||||
- `/srv/storage/atocore/config/project-registry.json`
|
||||
- backups:
|
||||
- `/srv/storage/atocore/backups`
|
||||
|
||||
So a normal backup run should happen on Dalidou itself, not from another
|
||||
machine.
|
||||
|
||||
## Next Backup Improvements
|
||||
|
||||
1. decide Chroma policy clearly
|
||||
- rebuild vs cold snapshot vs export
|
||||
2. add a simple scheduled backup routine on Dalidou
|
||||
3. add retention policy for old snapshots
|
||||
4. optionally add a restore validation check
|
||||
|
||||
## Healthy Rule
|
||||
|
||||
Do not design around syncing the live machine DB/vector store between machines.
|
||||
|
||||
Back up the canonical Dalidou state.
|
||||
Restore from Dalidou state.
|
||||
Keep OpenClaw as a client of AtoCore, not a storage peer.
|
||||
@@ -39,6 +39,11 @@ now includes a first curated ingestion batch for the active projects.
|
||||
- context builder
|
||||
- API routes for query, context, health, and source status
|
||||
- project registry and per-project refresh foundation
|
||||
- project registration lifecycle:
|
||||
- template
|
||||
- proposal preview
|
||||
- approved registration
|
||||
- refresh
|
||||
- env-driven storage and deployment paths
|
||||
- Dalidou Docker deployment foundation
|
||||
- initial AtoCore self-knowledge corpus ingested on Dalidou
|
||||
@@ -64,6 +69,11 @@ The service and storage foundation are live on Dalidou.
|
||||
|
||||
The machine-data host is real and canonical.
|
||||
|
||||
The project registry is now also persisted in a canonical mounted config path on
|
||||
Dalidou:
|
||||
|
||||
- `/srv/storage/atocore/config/project-registry.json`
|
||||
|
||||
The content corpus is partially populated now.
|
||||
|
||||
The Dalidou instance already contains:
|
||||
@@ -88,9 +98,9 @@ The Dalidou instance already contains:
|
||||
Current live stats after the latest documentation sync and active-project ingest
|
||||
passes:
|
||||
|
||||
- `source_documents`: 34
|
||||
- `source_chunks`: 551
|
||||
- `vectors`: 551
|
||||
- `source_documents`: 35
|
||||
- `source_chunks`: 560
|
||||
- `vectors`: 560
|
||||
|
||||
The broader long-term corpus is still not fully populated yet. Wider project and
|
||||
vault ingestion remains a deliberate next step rather than something already
|
||||
@@ -149,8 +159,28 @@ The source refresh model now has a concrete foundation in code:
|
||||
|
||||
- a project registry file defines known project ids, aliases, and ingest roots
|
||||
- the API can list registered projects
|
||||
- the API can return a registration template
|
||||
- the API can preview a registration without mutating state
|
||||
- the API can persist an approved registration
|
||||
- the API can refresh one registered project at a time
|
||||
|
||||
This lifecycle is now coherent end to end for normal use.
|
||||
|
||||
## Reliability Baseline
|
||||
|
||||
The runtime has now been hardened in a few practical ways:
|
||||
|
||||
- SQLite connections use a configurable busy timeout
|
||||
- SQLite uses WAL mode to reduce transient lock pain under normal concurrent use
|
||||
- project registry writes are atomic file replacements rather than in-place rewrites
|
||||
- a first runtime backup path now exists for:
|
||||
- SQLite
|
||||
- project registry
|
||||
- backup metadata
|
||||
|
||||
This does not eliminate every concurrency edge, but it materially improves the
|
||||
current operational baseline.
|
||||
|
||||
In `Trusted Project State`:
|
||||
|
||||
- each active seeded project now has a conservative trusted-state set
|
||||
@@ -167,7 +197,7 @@ This separation is healthy:
|
||||
|
||||
## Immediate Next Focus
|
||||
|
||||
1. Use the new T420-side AtoCore skill in real OpenClaw workflows
|
||||
1. Use the new T420-side AtoCore skill and registration flow in real OpenClaw workflows
|
||||
2. Tighten retrieval quality for the newly seeded active projects
|
||||
3. Define the first broader AtoVault/AtoDrive ingestion batches
|
||||
4. Add backup/export strategy for Dalidou machine state
|
||||
|
||||
@@ -31,10 +31,12 @@ AtoCore now has:
|
||||
explicit
|
||||
- move toward a project source registry and refresh workflow
|
||||
- foundation now exists via project registry + per-project refresh API
|
||||
- registration policy + template are now the next normal path for new projects
|
||||
- registration policy + template + proposal + approved registration are now
|
||||
the normal path for new projects
|
||||
5. Define backup and export procedures for Dalidou
|
||||
- SQLite snapshot/backup strategy
|
||||
- exercise the new SQLite + registry snapshot path on Dalidou
|
||||
- Chroma backup or rebuild policy
|
||||
- retention and restore validation
|
||||
6. Keep deeper automatic runtime integration deferred until the read-only model
|
||||
has proven value
|
||||
|
||||
@@ -101,6 +103,7 @@ P06:
|
||||
The next batch is successful if:
|
||||
|
||||
- OpenClaw can use AtoCore naturally when context is needed
|
||||
- OpenClaw can also register a new project cleanly before refreshing it
|
||||
- AtoCore answers correctly for the active project set
|
||||
- retrieval surfaces the seeded project docs instead of mostly AtoCore meta-docs
|
||||
- trusted project state remains concise and high confidence
|
||||
|
||||
Reference in New Issue
Block a user