deploy: version-visible /health + deploy.sh + update runbook

Dalidou Claude's validation run against the live service exposed a
structural gap: the deployment at /srv/storage/atocore/app has no git
connection, the running container was built from pre-Phase-9 source,
and /health hardcoded 'version: 0.1.0' so drift is invisible. Weeks of
work have been shipping to Gitea but never reaching the live service.

This commit fixes both the drift-invisibility problem and the absence
of an update workflow, so the next deploy to Dalidou can go live
cleanly and future drifts surface immediately.

Layer 1: deployment drift is now visible via /health
----------------------------------------------------

- src/atocore/__init__.py: __version__ bumped from 0.1.0 to 0.2.0 and
  documented as the source of truth for the deployed code version,
  with a history block explaining when each bump happens (API surface
  change, schema change, user-visible behavior change)
- src/atocore/main.py: FastAPI constructor now uses __version__
  instead of the hardcoded '0.1.0' string, so the OpenAPI docs reflect
  the actual code version
- src/atocore/api/routes.py: /health now reads from __version__
  dynamically. Both the existing 'version' field and a new
  'code_version' field report the same value for backwards compat. A
  new docstring explains that comparing this to the main branch's
  __version__ is the fastest way to detect drift.
- pyproject.toml: version bumped to 0.2.0 to stay in sync

The comparison is now:

    curl /health                             -> "code_version": "0.2.0"
    grep __version__ src/atocore/__init__.py -> "0.2.0"

If those differ, the deployment is stale. Concrete, unambiguous.

Layer 2: deploy.sh as the canonical update path
-----------------------------------------------

New file: deploy/dalidou/deploy.sh

One-shot bash script that handles both the first-time deploy (where
/srv/storage/atocore/app may not be a git repo yet) and the ongoing
update case. Steps:

1. If app dir is not a git checkout, back it up as
   <dir>.pre-git-<utc-stamp> and re-clone from Gitea. If it IS a
   checkout, fetch + reset --hard origin/<branch>.
2. Report the deployable commit SHA
3. Check that deploy/dalidou/.env exists (hard fail if missing with a
   clear message pointing at .env.example)
4. docker compose up -d --build: rebuilds the image from current
   source, restarts the container
5. Poll /health for up to 30 seconds; on failure, print the last 50
   lines of container logs and exit non-zero
6. Parse /health.code_version and compare to the __version__ in the
   freshly-pulled source. If they differ, exit non-zero with a message
   suggesting docker compose down && up
7. On success, report commit + code_version + "health: ok"

Configurable via env vars:

- ATOCORE_APP_DIR (default /srv/storage/atocore/app)
- ATOCORE_GIT_REMOTE (default http://dalidou:3000/Antoine/ATOCore.git)
- ATOCORE_BRANCH (default main)
- ATOCORE_HEALTH_URL (default http://127.0.0.1:8100/health)
- ATOCORE_DEPLOY_DRY_RUN=1 for preview-only mode

Explicit non-goals documented in the script header:

- does not manage secrets (.env is the caller's responsibility)
- does not take a pre-deploy backup (call /admin/backup first if you
  want one)
- does not roll back on failure (redeploy a known-good commit to
  recover)
- does not touch the DB directly: schema migrations run at service
  startup via the lifespan handler, and all existing _apply_migrations
  ALTERs are idempotent ADD COLUMN operations

Layer 3: updated docs/dalidou-deployment.md
-------------------------------------------

- First-time deployment steps now explicitly say "git clone", not
  "place the repository", so future first-time deploys don't end up as
  static snapshots again
- New "Updating a running deployment" section covering deploy.sh usage
  with all three modes (normal / branch override / dry-run)
- New "Deployment drift detection" section with the one-liner
  comparison between /health code_version and the repo's __version__
- New "Schema migrations on redeploy" section enumerating the exact
  ALTER TABLE statements that run on a pre-0.2.0 -> 0.2.0 upgrade,
  confirming they are additive-only and safe, and recommending a
  backup via /admin/backup before any redeploy

Full suite: 215 passing, 1 warning. No test was hardcoded to the old
version string, so the version bump was safe without test changes.

What this commit does NOT do
----------------------------

- Does NOT execute the deploy on the live Dalidou instance. That
  requires Dalidou access and is the next step. A ready-to-paste
  prompt for Dalidou Claude will be provided separately.
- Does NOT add CI/CD, webhook-based auto-deploy, or reverse proxy.
  Those remain in the 'deferred' section of the deployment doc.
- Does NOT change the Dockerfile. The existing 'COPY source at build
  time' pattern is what deploy.sh relies on: rebuilding the image
  picks up new code.
- Does NOT modify the database schema. The Phase 9 migrations that
  Dalidou's DB needs will be applied automatically on next service
  startup via the existing _apply_migrations path.
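
For illustration only, the Layer 1 comparison (the same check deploy.sh performs in its step 6) might look roughly like this in Python. This is a sketch under assumptions, not the code in this commit: the helper names `source_version`, `deployed_version` and `is_stale` are hypothetical, the sample /health payload is made up, and the fallback from `code_version` to `version` (for containers built before this commit) is an assumption.

```python
import json
import re

# Hypothetical helpers for illustration; none of these names come from AtoCore.
_VERSION_RE = re.compile(r'^__version__\s*=\s*["\']([^"\']+)["\']', re.MULTILINE)


def source_version(init_py_text: str) -> str:
    """Extract the __version__ string from the text of an __init__.py file."""
    match = _VERSION_RE.search(init_py_text)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


def deployed_version(health_body: str) -> str:
    """Read code_version from a /health JSON body.

    Falls back to the older 'version' field (an assumption) so that a
    container built before code_version existed can still be compared.
    """
    payload = json.loads(health_body)
    value = payload.get("code_version") or payload.get("version")
    if not value:
        raise ValueError("health payload has no version field")
    return str(value)


def is_stale(init_py_text: str, health_body: str) -> bool:
    """True when the running service reports a different version than the source."""
    return source_version(init_py_text) != deployed_version(health_body)


if __name__ == "__main__":
    # Made-up inputs: source at 0.2.0, container still reporting 0.1.0.
    init_text = '__version__ = "0.2.0"\n'
    old_health = '{"status": "ok", "version": "0.1.0"}'
    print(is_stale(init_text, old_health))  # prints True
```

In practice the two inputs would come from reading src/atocore/__init__.py in the checkout and from the body returned by the /health URL; keeping the parsing separate from the I/O makes the comparison easy to test without a running service.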
2026-04-08 18:08:49 -04:00
parent fad30d5461
commit e877e5b8ff
6 changed files with 320 additions and 13 deletions
--- a/docs/dalidou-deployment.md
+++ b/docs/dalidou-deployment.md
@@ -50,26 +50,111 @@ starting from:
 deploy/dalidou/.env.example
 ```

-## Deployment steps
+## First-time deployment steps
+
+1. Clone the repository into `/srv/storage/atocore/app` as a proper
+   git checkout, so future updates can be pulled instead of copied in
+   as a static snapshot:
+
+   ```bash
+   sudo git clone http://dalidou:3000/Antoine/ATOCore.git \
+       /srv/storage/atocore/app
+   ```

-1. Place the repository under `/srv/storage/atocore/app`.
 2. Create the canonical directories listed above.
 3. Copy `deploy/dalidou/.env.example` to `deploy/dalidou/.env`.
 4. Adjust the source paths if your AtoVault/AtoDrive mirrors live elsewhere.
 5. Run:

-```bash
-cd /srv/storage/atocore/app/deploy/dalidou
-docker compose up -d --build
-```
+   ```bash
+   cd /srv/storage/atocore/app/deploy/dalidou
+   docker compose up -d --build
+   ```

 6. Validate:

+   ```bash
+   curl http://127.0.0.1:8100/health
+   curl http://127.0.0.1:8100/sources
+   ```
+
+## Updating a running deployment
+
+**Use `deploy/dalidou/deploy.sh` for every code update.** It is the
+one-shot sync script that:
+
+- fetches latest main from Gitea into `/srv/storage/atocore/app`
+- (if the app dir is not a git checkout) backs it up as
+  `<dir>.pre-git-<timestamp>` and re-clones
+- rebuilds the container image
+- restarts the container
+- waits for `/health` to respond
+- compares the reported `code_version` against the
+  `__version__` in the freshly-pulled source, and exits non-zero
+  if they don't match (deployment drift detection)
+
 ```bash
-curl http://127.0.0.1:8100/health
-curl http://127.0.0.1:8100/sources
+# Normal update from main
+bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh
+
+# Deploy a specific branch or tag
+ATOCORE_BRANCH=codex/some-feature \
+    bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh
+
+# Dry-run: show what would happen without touching anything
+ATOCORE_DEPLOY_DRY_RUN=1 \
+    bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh
 ```

+The script is idempotent and safe to re-run. It never touches the
+database directly: schema migrations are applied automatically at
+service startup by the lifespan handler in `src/atocore/main.py`,
+which calls `init_db()`, which in turn runs the ALTER TABLE
+statements in `_apply_migrations`.
+
+### Deployment drift detection
+
+`/health` reports both `version` and `code_version` fields, both set
+from `atocore.__version__` at import time. To check whether the
+deployed code matches the repo's `main` branch:
+
+```bash
+# What's running
+curl -s http://127.0.0.1:8100/health | grep -o '"code_version":"[^"]*"'
+
+# What's in the checked-out source (matches main once deploy.sh has synced it)
+grep '__version__' /srv/storage/atocore/app/src/atocore/__init__.py
+```
+
+If these differ, the deployment is stale. Run `deploy.sh` to sync.
+
+### Schema migrations on redeploy
+
+When updating from an older `__version__`, the first startup after
+the redeploy runs the idempotent ALTER TABLE migrations in
+`_apply_migrations`. For a pre-0.2.0 → 0.2.0 upgrade the migrations
+add these columns to existing tables (each is nullable or has a safe
+default, so no existing row data is touched):
+
+- `memories.project TEXT DEFAULT ''`
+- `memories.last_referenced_at DATETIME`
+- `memories.reference_count INTEGER DEFAULT 0`
+- `interactions.response TEXT DEFAULT ''`
+- `interactions.memories_used TEXT DEFAULT '[]'`
+- `interactions.chunks_used TEXT DEFAULT '[]'`
+- `interactions.client TEXT DEFAULT ''`
+- `interactions.session_id TEXT DEFAULT ''`
+- `interactions.project TEXT DEFAULT ''`
+
+Plus new indexes on the new columns. No row data is modified. The
+migration is safe to run against a database that already has the
+columns — the `_column_exists` check makes each ALTER a no-op in
+that case.
+
+Back up the database before any redeploy (via `POST /admin/backup`)
+if you want a pre-upgrade snapshot. The migration is additive and
+reversible by restoring the snapshot.
+
 ## Deferred

 - backup automation