Dalidou Claude's validation run against the live service exposed a structural gap: the deployment at /srv/storage/atocore/app has no git connection, the running container was built from pre-Phase-9 source, and /health hardcoded 'version: 0.1.0', so drift was invisible. Weeks of work have been shipping to Gitea but never reaching the live service. This commit fixes both the drift-invisibility problem and the absence of an update workflow, so the next deploy to Dalidou can go live cleanly and future drift surfaces immediately.

Layer 1: deployment drift is now visible via /health
----------------------------------------------------

- src/atocore/__init__.py: __version__ bumped from 0.1.0 to 0.2.0 and
  documented as the source of truth for the deployed code version, with a
  history block explaining when each bump happens (API surface change,
  schema change, user-visible behavior change)
- src/atocore/main.py: FastAPI constructor now uses __version__ instead
  of the hardcoded '0.1.0' string, so the OpenAPI docs reflect the actual
  code version
- src/atocore/api/routes.py: /health now reads from __version__
  dynamically. Both the existing 'version' field and a new 'code_version'
  field report the same value for backwards compat. A new docstring
  explains that comparing this to the main branch's __version__ is the
  fastest way to detect drift.
- pyproject.toml: version bumped to 0.2.0 to stay in sync

The comparison is now:

  curl /health                             -> "code_version": "0.2.0"
  grep __version__ src/atocore/__init__.py -> "0.2.0"

If those differ, the deployment is stale. Concrete, unambiguous.

Layer 2: deploy.sh as the canonical update path
-----------------------------------------------

New file: deploy/dalidou/deploy.sh

One-shot bash script that handles both the first-time deploy (where
/srv/storage/atocore/app may not be a git repo yet) and the ongoing
update case. Steps:

1. If the app dir is not a git checkout, back it up as
   <dir>.pre-git-<utc-stamp> and re-clone from Gitea. If it IS a
   checkout, fetch + reset --hard origin/<branch>.
2. Report the deployable commit SHA.
3. Check that deploy/dalidou/.env exists (hard fail if missing, with a
   clear message pointing at .env.example).
4. docker compose up -d --build — rebuilds the image from current
   source and restarts the container.
5. Poll /health for up to 30 seconds; on failure, print the last 50
   lines of container logs and exit non-zero.
6. Parse code_version from /health and compare it to the __version__ in
   the freshly pulled source. If they differ, exit non-zero with a
   message suggesting docker compose down && up.
7. On success, report commit + code_version + "health: ok".

Configurable via env vars:

- ATOCORE_APP_DIR (default /srv/storage/atocore/app)
- ATOCORE_GIT_REMOTE (default http://dalidou:3000/Antoine/ATOCore.git)
- ATOCORE_BRANCH (default main)
- ATOCORE_HEALTH_URL (default http://127.0.0.1:8100/health)
- ATOCORE_DEPLOY_DRY_RUN=1 for preview-only mode

Explicit non-goals documented in the script header:

- does not manage secrets (.env is the caller's responsibility)
- does not take a pre-deploy backup (call /admin/backup first if you
  want one)
- does not roll back on failure (redeploy a known-good commit to recover)
- does not touch the DB directly — schema migrations run at service
  startup via the lifespan handler, and all existing _apply_migrations
  ALTERs are idempotent ADD COLUMN operations

Layer 3: updated docs/dalidou-deployment.md
-------------------------------------------

- First-time deployment steps now explicitly say "git clone", not
  "place the repository", so future first-time deploys don't end up as
  static snapshots again
- New "Updating a running deployment" section covering deploy.sh usage
  with all three modes (normal / branch override / dry-run)
- New "Deployment drift detection" section with the one-liner comparison
  between /health code_version and the repo's __version__
- New "Schema migrations on redeploy" section enumerating the exact
  ALTER TABLE statements that run on a pre-0.2.0 -> 0.2.0 upgrade,
  confirming they are additive-only and safe, and recommending a backup
  via /admin/backup before any redeploy

Full suite: 215 passing, 1 warning. No test was hardcoded to the old version string, so the version bump was safe without test changes.

What this commit does NOT do
----------------------------

- Does NOT execute the deploy on the live Dalidou instance. That
  requires Dalidou access and is the next step. A ready-to-paste prompt
  for Dalidou Claude will be provided separately.
- Does NOT add CI/CD, webhook-based auto-deploy, or a reverse proxy.
  Those remain in the 'deferred' section of the deployment doc.
- Does NOT change the Dockerfile. The existing 'COPY source at build
  time' pattern is what deploy.sh relies on — rebuilding the image
  picks up new code.
- Does NOT modify the database schema. The Phase 9 migrations that
  Dalidou's DB needs will be applied automatically on next service
  startup via the existing _apply_migrations path.
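As a sketch of the Layer 1 change, a plain-Python stand-in for the updated handler's payload (the real route lives in src/atocore/api/routes.py and is wired into FastAPI; only the field names and values follow the description above, the function body here is illustrative):

```python
# Source of truth for the deployed code version; in the real tree this
# constant lives in src/atocore/__init__.py.
__version__ = "0.2.0"


def health() -> dict:
    """Shape of the /health payload after this commit.

    'version' is kept for backwards compatibility; 'code_version' is the
    new explicit field. Comparing 'code_version' against __version__ on
    the main branch is the fastest way to detect deployment drift.
    """
    return {
        "status": "ok",
        "version": __version__,       # legacy field, same value
        "code_version": __version__,  # new field, read dynamically
    }
```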
# Dalidou Deployment

## Purpose

Deploy AtoCore on Dalidou as the canonical runtime and machine-memory host.
## Model

- Dalidou hosts the canonical AtoCore service.
- OpenClaw on the T420 consumes AtoCore over the network/Tailscale API.
- `sources/vault` and `sources/drive` are read-only inputs by convention.
- SQLite/Chroma machine state stays on Dalidou and is not treated as a sync peer.
- The app and machine-storage host can be live before the long-term content corpus is fully populated.
## Directory layout

```
/srv/storage/atocore/
  app/        # deployed repo checkout
  data/
    db/
    chroma/
    cache/
    tmp/
  sources/
    vault/
    drive/
  logs/
  backups/
  run/
```
## Compose workflow

The compose definition lives in `deploy/dalidou/docker-compose.yml`.

The Dalidou environment file should be copied to `deploy/dalidou/.env`, starting from `deploy/dalidou/.env.example`.
## First-time deployment steps

1. Place the repository under `/srv/storage/atocore/app` — as a proper
   git clone so future updates can be pulled, not as a static snapshot:

   ```bash
   sudo git clone http://dalidou:3000/Antoine/ATOCore.git \
     /srv/storage/atocore/app
   ```

2. Create the canonical directories listed above.

3. Copy `deploy/dalidou/.env.example` to `deploy/dalidou/.env`.

4. Adjust the source paths if your AtoVault/AtoDrive mirrors live elsewhere.

5. Run:

   ```bash
   cd /srv/storage/atocore/app/deploy/dalidou
   docker compose up -d --build
   ```

6. Validate:

   ```bash
   curl http://127.0.0.1:8100/health
   curl http://127.0.0.1:8100/sources
   ```
## Updating a running deployment

Use `deploy/dalidou/deploy.sh` for every code update. It is the one-shot sync script that:

- fetches the latest main from Gitea into `/srv/storage/atocore/app`
- (if the app dir is not a git checkout) backs it up as `<dir>.pre-git-<timestamp>` and re-clones
- rebuilds the container image
- restarts the container
- waits for `/health` to respond
- compares the reported `code_version` against the `__version__` in the freshly pulled source, and exits non-zero if they don't match (deployment drift detection)
```bash
# Normal update from main
bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh

# Deploy a specific branch or tag
ATOCORE_BRANCH=codex/some-feature \
  bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh

# Dry-run: show what would happen without touching anything
ATOCORE_DEPLOY_DRY_RUN=1 \
  bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh
```
The script is idempotent and safe to re-run. It never touches the database directly — schema migrations are applied automatically at service startup by the lifespan handler in `src/atocore/main.py`, which calls `init_db()` (which in turn runs the ALTER TABLE statements in `_apply_migrations`).
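For illustration, the script's wait-for-`/health` step can be sketched in Python (a hand-written stand-in, not the actual `deploy.sh` implementation, which does the equivalent with `curl` in a shell loop; the name `wait_for_health` is hypothetical):

```python
import json
import time
import urllib.request


def wait_for_health(url, timeout_s=30.0, interval_s=1.0):
    """Poll a health URL until it answers, or raise after timeout_s.

    Mirrors deploy.sh's behaviour of waiting up to 30 seconds for
    /health before declaring the deploy failed.
    """
    deadline = time.monotonic() + timeout_s
    last_error = None
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                return json.load(resp)  # parsed /health payload
        except OSError as exc:  # connection refused, timeout, DNS, ...
            last_error = exc
            time.sleep(interval_s)
    raise TimeoutError(f"{url} not healthy after {timeout_s}s ({last_error})")
```

On failure the real script additionally dumps the last 50 lines of container logs before exiting non-zero.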
## Deployment drift detection

`/health` reports both `version` and `code_version` fields, both set from `atocore.__version__` at import time. To check whether the deployed code matches the repo's main branch:

```bash
# What's running
curl -s http://127.0.0.1:8100/health | grep -o '"code_version":"[^"]*"'

# What's in the repo's main branch
grep '__version__' /srv/storage/atocore/app/src/atocore/__init__.py
```

If these differ, the deployment is stale. Run `deploy.sh` to sync.
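The same comparison can be automated; here is a minimal Python sketch (the function names are illustrative, not part of the AtoCore codebase):

```python
import json
import re
import urllib.request


def repo_version(init_py_path):
    """Read __version__ from a checked-out src/atocore/__init__.py."""
    with open(init_py_path, encoding="utf-8") as fh:
        match = re.search(r'__version__\s*=\s*["\']([^"\']+)["\']', fh.read())
    if match is None:
        raise ValueError(f"no __version__ in {init_py_path}")
    return match.group(1)


def running_version(health_url="http://127.0.0.1:8100/health"):
    """Fetch code_version from the live /health endpoint."""
    with urllib.request.urlopen(health_url, timeout=5) as resp:
        return json.load(resp)["code_version"]


def is_stale(init_py_path, health_url="http://127.0.0.1:8100/health"):
    """True when the deployment no longer matches the checked-out source."""
    return running_version(health_url) != repo_version(init_py_path)
```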
## Schema migrations on redeploy

When updating from an older `__version__`, the first startup after the redeploy runs the idempotent ALTER TABLE migrations in `_apply_migrations`. For a pre-0.2.0 → 0.2.0 upgrade the migrations add these columns to existing tables (all with safe defaults, so no data is touched):

- `memories.project TEXT DEFAULT ''`
- `memories.last_referenced_at DATETIME`
- `memories.reference_count INTEGER DEFAULT 0`
- `interactions.response TEXT DEFAULT ''`
- `interactions.memories_used TEXT DEFAULT '[]'`
- `interactions.chunks_used TEXT DEFAULT '[]'`
- `interactions.client TEXT DEFAULT ''`
- `interactions.session_id TEXT DEFAULT ''`
- `interactions.project TEXT DEFAULT ''`

Plus new indexes on the new columns. No row data is modified. The migration is safe to run against a database that already has the columns — the `_column_exists` check makes each ALTER a no-op in that case.

Back up the database before any redeploy (via `POST /admin/backup`) if you want a pre-upgrade snapshot. The migration is additive, and restoring the snapshot reverses it.
## Deferred
- backup automation
- restore/snapshot tooling
- reverse proxy / TLS exposure
- automated source ingestion job
- OpenClaw client wiring
## Current Reality Check

When this deployment is first brought up, the service may be healthy before the real corpus has been ingested. That means:
- AtoCore the system can already be hosted on Dalidou
- the canonical machine-data location can already be on Dalidou
- but the live knowledge/content corpus may still be empty or only partially loaded until source ingestion is run