deploy: add build_sha visibility for precise drift detection

Make /health report the precise git SHA the container was built from,
so 'is the live service current?' can be answered without ambiguity.
The __version__ string (0.2.0) was too coarse to trust as a 'live is
current' signal: many commits share the same __version__.

Three layers:

1. /health endpoint (src/atocore/api/routes.py)
   - Reads ATOCORE_BUILD_SHA, ATOCORE_BUILD_TIME, ATOCORE_BUILD_BRANCH
     from the environment; each defaults to 'unknown'
   - Reports them alongside the existing code_version field

2. docker-compose.yml
   - Forwards the three env vars from the host into the container
   - Defaults to 'unknown' so direct `docker compose up` runs (without
     deploy.sh) cleanly signal missing build provenance

3. deploy.sh
   - Step 2 captures git SHA + UTC timestamp + branch and exports them
     as env vars before `docker compose up -d --build`
   - Step 6 reads /health post-deploy and compares the reported
     build_sha against the freshly built one. A mismatch exits non-zero
     (exit code 6) with a remediation hint covering the cached-image,
     env-propagation, and concurrent-restart cases

Tests (tests/test_api_storage.py):
- test_health_endpoint_reports_code_version_from_module
- test_health_endpoint_reports_build_metadata_from_env
- test_health_endpoint_reports_unknown_when_build_env_unset

Docs (docs/dalidou-deployment.md):
- Three-level drift detection table (code_version coarse,
  build_sha precise, build_time/branch forensic)
- Canonical drift check script using LIVE_SHA vs EXPECTED_SHA
- Note that running deploy.sh is itself the simplest drift check

219/219 tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
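
The first layer in the commit message (read three `ATOCORE_BUILD_*` variables, fall back to 'unknown') can be illustrated with a minimal Python sketch. This is not the code in `src/atocore/api/routes.py`, which is not shown on this page; the helper name `build_metadata`, the `_BUILD_ENV_VARS` mapping, and passing the environment as a parameter are assumptions made for illustration.

```python
import os
from typing import Mapping

# Hypothetical mapping from /health field names to the environment
# variables named in the commit message.
_BUILD_ENV_VARS = {
    "build_sha": "ATOCORE_BUILD_SHA",
    "build_time": "ATOCORE_BUILD_TIME",
    "build_branch": "ATOCORE_BUILD_BRANCH",
}


def build_metadata(environ: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the build provenance fields, using 'unknown' when a variable is unset."""
    return {
        field: environ.get(var, "unknown")
        for field, var in _BUILD_ENV_VARS.items()
    }
```

Accepting the environment as an argument keeps the helper easy to exercise with both set and unset variables, which is the distinction the two new build-metadata tests in the commit message draw.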
@@ -142,19 +142,55 @@ from where you're running the client.
 
 ### Deployment drift detection
 
-`/health` reports both `version` and `code_version` fields, both set
-from `atocore.__version__` at import time. To check whether the
-deployed code matches the repo's `main` branch:
+`/health` reports drift signals at three increasing levels of
+precision:
+
+| Field | Source | Precision | When to use |
+|---|---|---|---|
+| `version` / `code_version` | `atocore.__version__` (manual bump) | coarse — same value across many commits | quick smoke check that the right *release* is running |
+| `build_sha` | `ATOCORE_BUILD_SHA` env var, set by `deploy.sh` per build | precise — changes per commit | the canonical drift signal |
+| `build_time` / `build_branch` | same env var path | per-build | forensics when multiple branches in flight |
+
+The **precise** check (run on the laptop or any host that can curl
+the live service AND has the source repo at hand):
 
 ```bash
-# What's running
-curl -s http://127.0.0.1:8100/health | grep -o '"code_version":"[^"]*"'
+# What's actually running on Dalidou
+LIVE_SHA=$(curl -fsS http://dalidou:8100/health | grep -o '"build_sha":"[^"]*"' | cut -d'"' -f4)
 
-# What's in the repo's main branch
-grep '__version__' /srv/storage/atocore/app/src/atocore/__init__.py
+# What the deployed branch tip should be
+EXPECTED_SHA=$(cd /srv/storage/atocore/app && git rev-parse HEAD)
+
+# Compare
+if [ "$LIVE_SHA" = "$EXPECTED_SHA" ]; then
+    echo "live is current at $LIVE_SHA"
+else
+    echo "DRIFT: live $LIVE_SHA vs expected $EXPECTED_SHA"
+    echo "run deploy.sh to sync"
+fi
 ```
 
-If these differ, the deployment is stale. Run `deploy.sh` to sync.
+The `deploy.sh` script does exactly this comparison automatically
+in its post-deploy verification step (Step 6) and exits non-zero
+on mismatch. So the **simplest drift check** is just to run
+`deploy.sh` — if there's nothing to deploy, it succeeds quickly;
+if the live service is stale, it deploys and verifies.
+
+If `/health` reports `build_sha: "unknown"`, the running container
+was started without `deploy.sh` (probably via `docker compose up`
+directly), and the build provenance was never recorded. Re-run
+via `deploy.sh` to fix.
+
+The coarse `code_version` check is still useful as a quick visual
+sanity check — bumping `__version__` from `0.2.0` to `0.3.0`
+signals a meaningful release boundary even if the precise
+`build_sha` is what tools should compare against:
+
+```bash
+# Quick sanity check (coarse)
+curl -s http://127.0.0.1:8100/health | grep -o '"code_version":"[^"]*"'
+grep '__version__' /srv/storage/atocore/app/src/atocore/__init__.py
+```
 
 ### Schema migrations on redeploy
 
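
The `LIVE_SHA` vs `EXPECTED_SHA` comparison in the diff extracts `build_sha` with `grep` and `cut`. As a hedged alternative, the same decision can be sketched in Python by parsing the `/health` body as JSON instead of pattern-matching it. It assumes the body is a JSON object, which the `grep` pattern suggests but this page does not confirm; the function names `live_build_sha` and `drift_status` and the three status strings are hypothetical, not part of the repository.

```python
import json


def live_build_sha(health_body: str) -> str:
    """Extract build_sha from a /health response body (assumed to be a JSON object)."""
    payload = json.loads(health_body)
    # A missing field is treated like the documented 'unknown' default.
    return str(payload.get("build_sha", "unknown"))


def drift_status(live_sha: str, expected_sha: str) -> str:
    """Classify the live service as 'current', 'drift', or 'unknown' (illustrative labels)."""
    if live_sha == "unknown":
        # Provenance was never recorded, e.g. a direct `docker compose up`.
        return "unknown"
    if live_sha == expected_sha:
        return "current"
    return "drift"
```

In use, `health_body` would be the text fetched from `http://dalidou:8100/health` and `expected_sha` the output of `git rev-parse HEAD` in the checkout, as in the shell version. Reporting the 'unknown' case separately from a plain mismatch matches the remediation the docs give for it, which is to re-run via `deploy.sh`.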