Makes human triage sustainable. Before: command-line-only review,
auto-triage stopped after 100 candidates/run. Now:
1. Web UI at /admin/triage
- Lists all pending candidates with inline promote/reject/edit
- Edit content in-place before promoting (PUT /memory/{id})
- Change type via dropdown
- Keyboard shortcuts: Y=promote, N=reject, E=edit, S=scroll-next
- Cards fade out after action, queue count updates live
- Zero JS framework — vanilla fetch + event delegation
2. auto_triage.py drains queue
- Loops up to 20 batches (default) of 100 candidates each
- Tracks seen IDs so needs_human items don't reprocess
- Exits cleanly when queue empty
- Nightly cron naturally drains everything
3. Dashboard + wiki surface triage queue
- Dashboard /admin/dashboard: new "triage" section with pending
count + /admin/triage URL + warning/notice severity levels
- Wiki homepage: prominent callout "N candidates awaiting triage —
review now" linking to /admin/triage, styled with triage-warning
(>50) or triage-notice (>20) CSS
Pattern: human intervenes only when AI can't decide. The UI makes
that intervention fast (20 candidates in 60 seconds). Nightly
auto-triage drains the queue so the human queue stays bounded.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both scripts now:
- Retry up to 3x with 2s/4s exponential backoff on transient
failures (rate limits, capacity spikes)
- Capture claude CLI stderr in the error message (200 char cap)
instead of just the exit code — diagnostics actually useful now
- Sleep 0.5s between calls to avoid bursting the backend
Context: last batch run hit 100% failure in triage (every call
exit 1) after 40% failure in extraction. claude CLI worked fine
immediately after, so the failures were capacity/rate-limit
transients. With retry + pacing these batches should complete
cleanly now. 439 candidates are already in the queue waiting
for triage.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previous commit had the wrong message — the diff was the config
persistence fix, not triage. This properly adds rule 4 to the
triage prompt: when candidate content starts with 'From OpenClaw/',
apply a much lower bar. OpenClaw's SOUL.md, USER.md, MEMORY.md,
MODEL-ROUTING.md, and daily memory/*.md are already curated —
promote unless clearly wrong or duplicate.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three additive upgrades borrowed from Karpathy's LLM Wiki pattern:
1. CONTRADICTION DETECTION: auto-triage now has a fourth verdict —
"contradicts". When a candidate conflicts with an existing memory
(not duplicates, genuine disagreement like "Option A selected"
vs "Option B selected"), the triage model flags it and leaves
it in the queue for human review instead of silently rejecting
or double-storing. Preserves source tension rather than
suppressing it.
2. WEEKLY LINT PASS: scripts/lint_knowledge_base.py checks for:
- Orphan memories (active but zero references after 14 days)
- Stale candidates (>7 days unreviewed)
- Unused entities (no relationships)
- Empty-state projects
- Unregistered projects auto-detected in memories
Runs Sundays via the cron. Outputs a report.
3. WEEKLY SYNTHESIS: scripts/synthesize_projects.py uses sonnet to
generate a 3-5 sentence "current state" paragraph per project
from state + memories + entities. Cached in project_state under
status/synthesis_cache. Wiki project pages now show this at the
top under "Current State (auto-synthesis)". Falls back to a
deterministic summary if no cache exists.
deploy/dalidou/batch-extract.sh: added Step C (synthesis) and
Step D (lint) gated to Sundays via date check.
All additive — nothing existing changes behavior. The database
remains the source of truth; these operations just produce better
synthesized views and catch rot.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
scripts/auto_triage.py: fetches candidate memories, asks a triage
model (claude -p, default sonnet) to classify each as promote /
reject / needs_human, and executes the verdict via the API.
Trust model:
- Auto-promote: model says promote AND confidence >= 0.8 AND
dedup-checked against existing active memories for the project
- Auto-reject: model says reject
- needs_human: everything else stays in queue for manual review
The triage model receives both the candidate content AND a summary
of existing active memories for the same project, so it can detect
duplicates and near-duplicates. The system prompt explicitly lists
the rejection categories learned from the first two manual triage
passes (stale snapshots, impl details, planned-not-implemented,
process rules that belong in ledger not memory).
deploy/dalidou/batch-extract.sh now runs extraction (Step A) then
auto-triage (Step B) in sequence. The nightly cron at 03:00 UTC
will run the full pipeline: backup → cleanup → rsync → extract →
triage. Only needs_human candidates reach the human.
Supports --dry-run for preview without executing.
Supports --model override for multi-model triage (e.g. opus for
higher-quality review, or a future Gemini/Ollama backend).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>