Both scripts now:
- Retry up to 3x with 2s/4s exponential backoff on transient
failures (rate limits, capacity spikes)
- Capture claude CLI stderr in the error message (200 char cap)
instead of just the exit code — diagnostics actually useful now
- Sleep 0.5s between calls to avoid bursting the backend
Context: last batch run hit 100% failure in triage (every call
exit 1) after 40% failure in extraction. claude CLI worked fine
immediately after, so the failures were capacity/rate-limit
transients. With retry + pacing these batches should complete
cleanly now. 439 candidates are already in the queue waiting
for triage.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>