feat: nightly batch extraction in cron-backup.sh (Day 2)
Step 4 added to the daily cron: POST /admin/extract-batch with mode=llm, persist=true, limit=50. Runs after backup + cleanup + rsync. Fail-open: extraction failure never blocks the backup. Gated on ATOCORE_EXTRACT_BATCH=true (defaults to true).

The endpoint uses the last_extract_batch_run timestamp from project state to auto-resume, so the cron doesn't need to track state.

curl --max-time 600 gives the LLM extractor up to 10 minutes for the batch (50 interactions × ~20s each worst case = ~17 min, but most will be no-ops if already extracted).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
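The timeout budget described in the message can be checked with a few lines of shell arithmetic. This is only a sketch: the 20 s per interaction is the worst-case estimate quoted above, and the variable names are illustrative, not taken from cron-backup.sh.

```shell
#!/usr/bin/env bash
# Worst-case runtime estimate for one batch, using the figures from the commit message.
limit=50            # interactions per batch (the "limit" field in the POST body)
per_item_secs=20    # approximate worst-case seconds per interaction (estimate)
max_time=600        # curl --max-time, in seconds

worst_case=$(( limit * per_item_secs ))

# Round the worst case up to whole minutes for display.
echo "worst case: ${worst_case}s (~$(( (worst_case + 59) / 60 )) min)"
echo "budget:     ${max_time}s ($(( max_time / 60 )) min)"

# How many worst-case interactions fit inside the curl budget.
echo "fits:       $(( max_time / per_item_secs )) of ${limit}"
```

Under these assumptions, only 30 of the 50 interactions fit in the 600 s budget if every one hits the worst case. If curl cuts a batch off, whether the rest is picked up on the next run depends on when the endpoint updates last_extract_batch_run, which this commit does not show.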
@@ -82,4 +82,26 @@ else
     log "Step 3: ATOCORE_BACKUP_RSYNC not set, skipping off-host copy"
 fi
 
+# Step 4: Batch LLM extraction on recent interactions (optional).
+# Runs the LLM extractor (claude -p sonnet) against interactions
+# captured since the last batch run. Candidates land as
+# status=candidate for human or auto-triage review.
+# Fail-open: extraction failure never blocks backup.
+# The endpoint tracks its own last-run timestamp in project state.
+EXTRACT="${ATOCORE_EXTRACT_BATCH:-true}"
+if [[ "$EXTRACT" == "true" ]]; then
+    log "Step 4: running batch LLM extraction"
+    EXTRACT_RESULT=$(curl -sf -X POST \
+        -H "Content-Type: application/json" \
+        -d '{"mode": "llm", "persist": true, "limit": 50}' \
+        --max-time 600 \
+        "$ATOCORE_URL/admin/extract-batch" 2>&1) && {
+        log "Extraction result: $EXTRACT_RESULT"
+    } || {
+        log "WARN: batch extraction failed (this is non-blocking): $EXTRACT_RESULT"
+    }
+else
+    log "Step 4: ATOCORE_EXTRACT_BATCH not set to true, skipping extraction"
+fi
+
 log "=== AtoCore daily backup complete ==="
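The fail-open behaviour of Step 4 can be tried in isolation. The following is a minimal sketch, not the committed code: `fake_extract` is a hypothetical stand-in for the curl call, `log` is a stub for the script's helper (which this hunk does not show), and `run_step` is an illustrative wrapper. It uses `if`/`else` rather than `&& { … } || { … }`, since in the latter form the fallback branch also runs if the last command of the success branch fails.

```shell
#!/usr/bin/env bash
# Stub for the script's log helper (the real one is not shown in this hunk).
log() { printf '%s\n' "$*"; }

# Hypothetical stand-in for the curl call: prints a body, exits with status $1.
fake_extract() {
    echo "simulated response"
    return "$1"
}

run_step() {
    local result
    # The exit status of the assignment is the status of the command substitution.
    if result=$(fake_extract "$1" 2>&1); then
        log "Extraction result: $result"
    else
        log "WARN: batch extraction failed (this is non-blocking): $result"
    fi
    # Fail-open: the step always reports success to the caller.
    return 0
}

run_step 0   # simulated success
run_step 7   # simulated failure, still returns 0
log "=== backup continues ==="
```

One observation on the real call: with `-s`, curl also suppresses its own error messages, so `$EXTRACT_RESULT` may well be empty in the WARN line. Adding `-S` (`--show-error`) alongside `-s` would keep the progress meter off while still letting the error text through.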