# Dalidou Deployment

## Purpose

Deploy AtoCore on Dalidou as the canonical runtime and machine-memory host.

## Model

- Dalidou hosts the canonical AtoCore service.
- OpenClaw on the T420 consumes AtoCore over network/Tailscale API.
- `sources/vault` and `sources/drive` are read-only inputs by convention.
- SQLite/Chroma machine state stays on Dalidou and is not treated as a sync peer.
- The app and machine-storage host can be live before the long-term content corpus is fully populated.

## Directory layout

```text
/srv/storage/atocore/
  app/        # deployed repo checkout
  data/
    db/
    chroma/
    cache/
    tmp/
  sources/
    vault/
    drive/
  logs/
  backups/
  run/
```

## Compose workflow

The compose definition lives in:

```text
deploy/dalidou/docker-compose.yml
```

The Dalidou environment file should be copied to:

```text
deploy/dalidou/.env
```

starting from:

```text
deploy/dalidou/.env.example
```

## First-time deployment steps

1. Place the repository under `/srv/storage/atocore/app`, ideally as a proper git clone (not a static snapshot) so future updates can be pulled:

   ```bash
   sudo git clone http://dalidou:3000/Antoine/ATOCore.git \
       /srv/storage/atocore/app
   ```

2. Create the canonical directories listed above.
3. Copy `deploy/dalidou/.env.example` to `deploy/dalidou/.env`.
4. Adjust the source paths if your AtoVault/AtoDrive mirrors live elsewhere.
5. Run:

   ```bash
   cd /srv/storage/atocore/app/deploy/dalidou
   docker compose up -d --build
   ```

6. Validate:

   ```bash
   curl http://127.0.0.1:8100/health
   curl http://127.0.0.1:8100/sources
   ```

## Updating a running deployment

**Use `deploy/dalidou/deploy.sh` for every code update.** It is the one-shot sync script that:

- fetches latest main from Gitea into `/srv/storage/atocore/app`
- (if the app dir is not a git checkout) backs it up as `.pre-git-` and re-clones
- rebuilds the container image
- restarts the container
- waits for `/health` to respond
- compares the reported `code_version` against the `__version__` in the freshly-pulled source, and exits non-zero if they don't match (deployment drift detection)

```bash
# Normal update from main
bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh

# Deploy a specific branch or tag
ATOCORE_BRANCH=codex/some-feature \
    bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh

# Dry-run: show what would happen without touching anything
ATOCORE_DEPLOY_DRY_RUN=1 \
    bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh

# Deploy from a remote host (e.g. the laptop) using the Tailscale
# or LAN address instead of loopback
ATOCORE_GIT_REMOTE=http://192.168.86.50:3000/Antoine/ATOCore.git \
    bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh
```

The script is idempotent and safe to re-run. It never touches the database directly: schema migrations are applied automatically at service startup by the lifespan handler in `src/atocore/main.py`, which calls `init_db()` (which in turn runs the ALTER TABLE statements in `_apply_migrations`).

### Troubleshooting hostname resolution

`deploy.sh` defaults `ATOCORE_GIT_REMOTE` to `http://127.0.0.1:3000/Antoine/ATOCore.git` (loopback) because the hostname "dalidou" doesn't reliably resolve on the host itself; the first real Dalidou deploy hit exactly this on 2026-04-08. If you need to override (e.g. running `deploy.sh` from a laptop against the Dalidou LAN), set `ATOCORE_GIT_REMOTE` explicitly.


The same applies to `scripts/atocore_client.py`: its default `ATOCORE_BASE_URL` is `http://dalidou:8100` for remote callers, but when running the client on Dalidou itself (or inside the container via `docker exec`), override to loopback:

```bash
ATOCORE_BASE_URL=http://127.0.0.1:8100 \
    python scripts/atocore_client.py health
```

If you see `{"status": "unavailable", "fail_open": true}` from the client, the first thing to check is whether the base URL resolves from where you're running the client.

### The deploy.sh self-update race

When `deploy.sh` itself changes in the commit being pulled, the first run after the update is still executing the *old* script from the bash process's in-memory copy. `git reset --hard` updates the file on disk, but the running bash has already loaded the instructions. On 2026-04-09 this silently shipped an "unknown" `build_sha` because the old Step 2 (which predated env-var export) ran against fresh source.

`deploy.sh` now detects this: Step 1.5 compares the sha1 of `$0` (the running script) against the sha1 of `$APP_DIR/deploy/dalidou/deploy.sh` (the on-disk copy) after the git reset. If they differ, it sets `ATOCORE_DEPLOY_REEXECED=1` and `exec`s the fresh copy so the rest of the deploy runs under the new script. The sentinel env var prevents infinite recursion.

You'll see this in the logs as:

```text
==> Step 1.5: deploy.sh changed in the pulled commit; re-exec'ing
==> running script hash:
==> on-disk script hash:
==> re-exec -> /srv/storage/atocore/app/deploy/dalidou/deploy.sh
```

To opt out (debugging, for example), pre-set `ATOCORE_DEPLOY_REEXECED=1` before invoking `deploy.sh` and the self-update guard will be skipped.

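
The self-update guard described above can be sketched roughly as follows. This is an illustration only, not the actual contents of `deploy.sh`: the function names `file_sha1`, `reexec_needed`, and `self_update_guard` are hypothetical, and the sketch assumes GNU `sha1sum` is available on the host.

```bash
# Illustrative sketch only; function names are hypothetical and this is
# not copied from deploy.sh. Assumes GNU coreutils (sha1sum).

# file_sha1 PATH: print the SHA-1 hex digest of a file.
file_sha1() {
    sha1sum "$1" | cut -d' ' -f1
}

# reexec_needed RUNNING ON_DISK: succeed (status 0) only when the sentinel
# is not set and the two copies of the script have different digests.
reexec_needed() {
    local running="$1" on_disk="$2"
    # Sentinel set by a previous re-exec, or pre-set by the operator to opt out.
    [ "${ATOCORE_DEPLOY_REEXECED:-}" = "1" ] && return 1
    [ "$(file_sha1 "$running")" != "$(file_sha1 "$on_disk")" ]
}

# self_update_guard ON_DISK [ARGS...]: replace the current process with the
# fresh script when needed; otherwise return so the caller carries on.
self_update_guard() {
    local on_disk="$1"
    shift
    if reexec_needed "$0" "$on_disk"; then
        export ATOCORE_DEPLOY_REEXECED=1
        exec bash "$on_disk" "$@"
    fi
}
```

In a script shaped like this, `self_update_guard "$APP_DIR/deploy/dalidou/deploy.sh" "$@"` would be called right after the git reset. Because `exec` replaces the process rather than spawning a child, the old script never resumes, and the exported sentinel keeps the fresh copy from re-exec'ing itself again.
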
### Deployment drift detection

`/health` reports drift signals at three increasing levels of precision:

| Field | Source | Precision | When to use |
|---|---|---|---|
| `version` / `code_version` | `atocore.__version__` (manual bump) | coarse: same value across many commits | quick smoke check that the right *release* is running |
| `build_sha` | `ATOCORE_BUILD_SHA` env var, set by `deploy.sh` per build | precise: changes per commit | the canonical drift signal |
| `build_time` / `build_branch` | same env var path | per-build | forensics when multiple branches in flight |

The **precise** check (run on the laptop or any host that can curl the live service AND has the source repo at hand):

```bash
# What's actually running on Dalidou
LIVE_SHA=$(curl -fsS http://dalidou:8100/health | grep -o '"build_sha":"[^"]*"' | cut -d'"' -f4)

# What the deployed branch tip should be
EXPECTED_SHA=$(cd /srv/storage/atocore/app && git rev-parse HEAD)

# Compare
if [ "$LIVE_SHA" = "$EXPECTED_SHA" ]; then
  echo "live is current at $LIVE_SHA"
else
  echo "DRIFT: live $LIVE_SHA vs expected $EXPECTED_SHA"
  echo "run deploy.sh to sync"
fi
```

The `deploy.sh` script does exactly this comparison automatically in its post-deploy verification step (Step 6) and exits non-zero on mismatch. So the **simplest drift check** is just to run `deploy.sh`: if there's nothing to deploy, it succeeds quickly; if the live service is stale, it deploys and verifies.

If `/health` reports `build_sha: "unknown"`, the running container was started without `deploy.sh` (probably via `docker compose up` directly), and the build provenance was never recorded. Re-run via `deploy.sh` to fix.

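
For scripted checks, it can help to treat the `unknown` case separately from a real mismatch. The sketch below is one possible shape, not what `deploy.sh` does: `extract_build_sha` and `classify_drift` are hypothetical helper names, and the `sed` pattern assumes `build_sha` appears as a plain string field in the `/health` JSON (it tolerates optional whitespace around the colon).

```bash
# Illustrative helpers; names are hypothetical and not part of deploy.sh.

# extract_build_sha: read a /health JSON body on stdin and print the value
# of its "build_sha" field (prints nothing if the field is absent).
extract_build_sha() {
    sed -n 's/.*"build_sha"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# classify_drift LIVE EXPECTED: print one of
#   unknown  - no provenance recorded (empty or "unknown" build_sha)
#   current  - live build matches the expected commit
#   drift    - live build differs from the expected commit
classify_drift() {
    local live="$1" expected="$2"
    if [ -z "$live" ] || [ "$live" = "unknown" ]; then
        echo "unknown"
    elif [ "$live" = "$expected" ]; then
        echo "current"
    else
        echo "drift"
    fi
}
```

A caller could then run `classify_drift "$(curl -fsS http://dalidou:8100/health | extract_build_sha)" "$(git -C /srv/storage/atocore/app rev-parse HEAD)"` and branch on the printed word. Both `unknown` and `drift` call for a `deploy.sh` run, but the reasons differ.
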

The coarse `code_version` check is still useful as a quick visual sanity check: bumping `__version__` from `0.2.0` to `0.3.0` signals a meaningful release boundary even if the precise `build_sha` is what tools should compare against:

```bash
# Quick sanity check (coarse)
curl -s http://127.0.0.1:8100/health | grep -o '"code_version":"[^"]*"'
grep '__version__' /srv/storage/atocore/app/src/atocore/__init__.py
```

### Schema migrations on redeploy

When updating from an older `__version__`, the first startup after the redeploy runs the idempotent ALTER TABLE migrations in `_apply_migrations`. For a pre-0.2.0 → 0.2.0 upgrade the migrations add these columns to existing tables (all with safe defaults so no data is touched):

- `memories.project TEXT DEFAULT ''`
- `memories.last_referenced_at DATETIME`
- `memories.reference_count INTEGER DEFAULT 0`
- `interactions.response TEXT DEFAULT ''`
- `interactions.memories_used TEXT DEFAULT '[]'`
- `interactions.chunks_used TEXT DEFAULT '[]'`
- `interactions.client TEXT DEFAULT ''`
- `interactions.session_id TEXT DEFAULT ''`
- `interactions.project TEXT DEFAULT ''`

The migrations also create new indexes on the new columns. No row data is modified. The migration is safe to run against a database that already has the columns, because the `_column_exists` check makes each ALTER a no-op in that case.

Back up the database before any redeploy (via `POST /admin/backup`) if you want a pre-upgrade snapshot. The migration is additive and reversible by restoring the snapshot.

## Deferred

- backup automation
- restore/snapshot tooling
- reverse proxy / TLS exposure
- automated source ingestion job
- OpenClaw client wiring

## Current Reality Check

When this deployment is first brought up, the service may be healthy before the real corpus has been ingested.

That means:

- AtoCore the system can already be hosted on Dalidou
- the canonical machine-data location can already be on Dalidou
- but the live knowledge/content corpus may still be empty or only partially loaded until source ingestion is run