deploy: version-visible /health + deploy.sh + update runbook

Dalidou Claude's validation run against the live service exposed a
structural gap: the deployment at /srv/storage/atocore/app has no git
connection, the running container was built from pre-Phase-9 source,
and /health hardcoded 'version: 0.1.0' so drift is invisible. Weeks of
work have been shipping to Gitea but never reaching the live service.

This commit fixes both the drift-invisibility problem and the absence
of an update workflow, so the next deploy to Dalidou can go live
cleanly and future drifts surface immediately.

Layer 1: deployment drift is now visible via /health
----------------------------------------------------

- src/atocore/__init__.py: __version__ bumped from 0.1.0 to 0.2.0 and
  documented as the source of truth for the deployed code version,
  with a history block explaining when each bump happens (API surface
  change, schema change, user-visible behavior change)
- src/atocore/main.py: FastAPI constructor now uses __version__
  instead of the hardcoded '0.1.0' string, so the OpenAPI docs reflect
  the actual code version
- src/atocore/api/routes.py: /health now reads from __version__
  dynamically. Both the existing 'version' field and a new
  'code_version' field report the same value for backwards compat.
  A new docstring explains that comparing this to the main branch's
  __version__ is the fastest way to detect drift.
- pyproject.toml: version bumped to 0.2.0 to stay in sync

The comparison is now:

    curl /health                              -> "code_version": "0.2.0"
    grep __version__ src/atocore/__init__.py  -> "0.2.0"

If those differ, the deployment is stale. Concrete, unambiguous.

Layer 2: deploy.sh as the canonical update path
-----------------------------------------------

New file: deploy/dalidou/deploy.sh

One-shot bash script that handles both the first-time deploy (where
/srv/storage/atocore/app may not be a git repo yet) and the ongoing
update case. Steps:

1. If app dir is not a git checkout, back it up as
   <dir>.pre-git-<utc-stamp> and re-clone from Gitea. If it IS a
   checkout, fetch + reset --hard origin/<branch>.
2. Report the deployable commit SHA
3. Check that deploy/dalidou/.env exists (hard fail if missing with a
   clear message pointing at .env.example)
4. docker compose up -d --build: rebuilds the image from current
   source, restarts the container
5. Poll /health for up to 30 seconds; on failure, print the last 50
   lines of container logs and exit non-zero
6. Parse /health.code_version and compare to the __version__ in the
   freshly-pulled source. If they differ, exit non-zero with a message
   suggesting docker compose down && up
7. On success, report commit + code_version + "health: ok"

Configurable via env vars:

- ATOCORE_APP_DIR (default /srv/storage/atocore/app)
- ATOCORE_GIT_REMOTE (default http://dalidou:3000/Antoine/ATOCore.git)
- ATOCORE_BRANCH (default main)
- ATOCORE_HEALTH_URL (default http://127.0.0.1:8100/health)
- ATOCORE_DEPLOY_DRY_RUN=1 for preview-only mode

Explicit non-goals documented in the script header:

- does not manage secrets (.env is the caller's responsibility)
- does not take a pre-deploy backup (call /admin/backup first if you
  want one)
- does not roll back on failure (redeploy a known-good commit to
  recover)
- does not touch the DB directly: schema migrations run at service
  startup via the lifespan handler, and all existing
  _apply_migrations ALTERs are idempotent ADD COLUMN operations

Layer 3: updated docs/dalidou-deployment.md
-------------------------------------------

- First-time deployment steps now explicitly say "git clone", not
  "place the repository", so future first-time deploys don't end up
  as static snapshots again
- New "Updating a running deployment" section covering deploy.sh
  usage with all three modes (normal / branch override / dry-run)
- New "Deployment drift detection" section with the one-liner
  comparison between /health code_version and the repo's __version__
- New "Schema migrations on redeploy" section enumerating the exact
  ALTER TABLE statements that run on a pre-0.2.0 -> 0.2.0 upgrade,
  confirming they are additive-only and safe, and recommending a
  backup via /admin/backup before any redeploy

Full suite: 215 passing, 1 warning. No test was hardcoded to the old
version string, so the version bump was safe without test changes.

What this commit does NOT do
----------------------------

- Does NOT execute the deploy on the live Dalidou instance. That
  requires Dalidou access and is the next step. A ready-to-paste
  prompt for Dalidou Claude will be provided separately.
- Does NOT add CI/CD, webhook-based auto-deploy, or reverse proxy.
  Those remain in the 'deferred' section of the deployment doc.
- Does NOT change the Dockerfile. The existing 'COPY source at build
  time' pattern is what deploy.sh relies on: rebuilding the image
  picks up new code.
- Does NOT modify the database schema. The Phase 9 migrations that
  Dalidou's DB needs will be applied automatically on next service
  startup via the existing _apply_migrations path.
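
For readers who want the Layer 1 comparison outside of deploy.sh, here is a minimal Python sketch of the same idea. It is not code from this commit: the helper names (`source_version`, `deployed_version`, `is_stale`, `fetch_health`) are hypothetical, and it assumes `__version__` is assigned as a plain string literal at the start of a line in `src/atocore/__init__.py`. The health URL default is the ATOCORE_HEALTH_URL default listed above.

```python
import json
import re
from urllib.request import urlopen

# Assumption: the version is a plain string literal, e.g. __version__ = "0.2.0".
VERSION_RE = re.compile(r"""^__version__\s*=\s*["']([^"']+)["']""", re.MULTILINE)


def source_version(init_text: str) -> str:
    """Extract __version__ from the text of src/atocore/__init__.py."""
    match = VERSION_RE.search(init_text)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


def deployed_version(health_payload: dict) -> str:
    """Prefer 'code_version'; fall back to the older 'version' field."""
    value = health_payload.get("code_version", health_payload.get("version"))
    if value is None:
        raise KeyError("health payload reports no version field")
    return str(value)


def is_stale(init_text: str, health_payload: dict) -> bool:
    """True when the running service reports a different version than the source."""
    return source_version(init_text) != deployed_version(health_payload)


def fetch_health(url: str = "http://127.0.0.1:8100/health", timeout: float = 5.0) -> dict:
    """Fetch and decode the /health JSON document (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

A caller would read the checked-out `__init__.py`, call `fetch_health()`, and pass both to `is_stale`. Falling back to `version` lets the check also run against a pre-0.2.0 container, which reports only the hardcoded `0.1.0` and is therefore flagged as stale.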
@@ -742,12 +742,23 @@ def api_validate_backup(stamp: str) -> dict:
 
 @router.get("/health")
 def api_health() -> dict:
-    """Health check."""
+    """Health check.
+
+    The ``version`` and ``code_version`` fields both report the value
+    of ``atocore.__version__`` from the deployed code. Comparing this
+    to the main branch's ``__version__`` is the fastest way to detect
+    deployment drift: if they differ, the running service is behind
+    the repo and needs a redeploy (see
+    ``docs/dalidou-deployment.md`` and ``deploy/dalidou/deploy.sh``).
+    """
+    from atocore import __version__
+
     store = get_vector_store()
     source_status = get_source_status()
     return {
         "status": "ok",
-        "version": "0.1.0",
+        "version": __version__,
+        "code_version": __version__,
         "vectors_count": store.count,
         "env": _config.settings.env,
         "machine_paths": {