feat: Day 3 — auto-triage via LLM second pass
scripts/auto_triage.py: fetches candidate memories, asks a triage model
(claude -p, default sonnet) to classify each as promote / reject /
needs_human, and executes the verdict via the API.

Trust model:
- Auto-promote: model says promote AND confidence >= 0.8 AND
  dedup-checked against existing active memories for the project
- Auto-reject: model says reject
- needs_human: everything else stays in queue for manual review

The triage model receives both the candidate content AND a summary of
existing active memories for the same project, so it can detect
duplicates and near-duplicates. The system prompt explicitly lists the
rejection categories learned from the first two manual triage passes
(stale snapshots, impl details, planned-not-implemented, process rules
that belong in ledger not memory).

deploy/dalidou/batch-extract.sh now runs extraction (Step A) then
auto-triage (Step B) in sequence. The nightly cron at 03:00 UTC will run
the full pipeline: backup → cleanup → rsync → extract → triage. Only
needs_human candidates reach the human.

Supports --dry-run for preview without executing.
Supports --model override for multi-model triage (e.g. opus for
higher-quality review, or a future Gemini/Ollama backend).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
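The trust model above can be sketched as a small decision function. This is a minimal illustration, not the code in scripts/auto_triage.py: the names `Verdict`, `is_duplicate`, and `decide` are hypothetical, and the dedup check here is an assumed exact match after whitespace and case normalization, whereas the commit relies on the triage model itself to spot near-duplicates.

```python
from __future__ import annotations

from dataclasses import dataclass

# Threshold taken from the commit message: auto-promote needs confidence >= 0.8.
PROMOTE_CONFIDENCE_THRESHOLD = 0.8


@dataclass
class Verdict:
    """Hypothetical container for the triage model's answer."""
    label: str          # "promote", "reject", or "needs_human"
    confidence: float   # model-reported confidence in [0, 1]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def is_duplicate(candidate: str, active_memories: list[str]) -> bool:
    """Assumed dedup rule: exact match after normalization."""
    target = normalize(candidate)
    return any(normalize(memory) == target for memory in active_memories)


def decide(verdict: Verdict, candidate: str, active_memories: list[str]) -> str:
    """Map a model verdict to the action the pipeline would execute."""
    if verdict.label == "reject":
        return "reject"
    if (
        verdict.label == "promote"
        and verdict.confidence >= PROMOTE_CONFIDENCE_THRESHOLD
        and not is_duplicate(candidate, active_memories)
    ):
        return "promote"
    # Everything else stays in the queue for manual review.
    return "needs_human"
```

With this shape, a low-confidence promote or a promote that collides with an active memory falls through to needs_human rather than being rejected, which matches the "everything else stays in queue" rule.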
1
scripts/eval_data/candidate_queue_2026-04-12.json
Normal file
File diff suppressed because one or more lines are too long
29
scripts/eval_data/candidate_queue_2026-04-12.txt
Normal file
@@ -0,0 +1,29 @@
1. [project ] proj=atocore AtoCore extraction must stay off the hot capture path; batch endpoint only
2. [project ] proj=atocore Auto-promote gate: confidence ≥0.8 AND no duplicate in active memories
3. [project ] proj=atocore AtoCore LLM extraction pipeline deployed on Dalidou host, runs via cron at 03:00 UTC via scripts/batch_llm_extract_live.py
4. [project ] proj=atocore LLM extractor runs host-side (not in container) because claude CLI not available in container environment
5. [project ] proj=atocore Host-side extraction script scripts/batch_llm_extract_live.py uses pure stdlib, no atocore imports for deployment simplicity
6. [project ] proj=atocore POST /admin/extract-batch accepts mode: rule|llm, POST /interactions/{id}/extract now mode-aware
7. [knowledge ] proj=atocore claude CLI 2.0.60 removed --no-session-persistence flag, extraction sessions now persist in claude history
8. [adaptation ] proj=atocore Durable memory extraction candidates must be <200 chars, stand-alone, typed as project|knowledge|preference|adaptation
9. [adaptation ] proj=atocore Memory extraction confidence defaults to 0.5, raise to 0.6 only for unambiguous committed claims
10. [project ] proj=atocore Live Dalidou is on commit 39d73e9, not e2895b5
11. [project ] proj=atocore Live harness is reproducible at 16/18 PASS
12. [project ] proj=atocore Live active memories count is 36
13. [project ] proj=atocore Wave 2 project-state entries on live: p04=5, p05=6, p06=6
14. [project ] proj=atocore R6 is fixed by commit 39d73e9
15. [project ] proj=atocore R9: R6 fix only covers empty project fallback; wrong non-empty model project can still override known interaction scope
16. [project ] proj=atocore R10: Phase 8 is baseline-complete but not primary-complete; OpenClaw client covers narrow read-oriented slice of API
17. [project ] proj=atocore Phase 8 is decent baseline integration milestone but not primary-ready yet
18. [project ] proj=atocore 4-step roadmap complete: extractor → harness → Wave 2 → OpenClaw
19. [project ] proj=atocore Codex audit loop proven across two full round-trips in one session
20. [project ] proj=atocore Session end state: 36 active memories, 17 project-state entries, 16/18 harness, 280 tests, main at 54d84b5
21. [project ] proj=atocore AtoCore extraction stays off the hot capture path; LLM extraction runs as scheduled batch, not inline with POST /interactions.
22. [project ] proj=atocore AtoCore auto-triage trust model: auto-promote only when confidence ≥0.8 AND no duplicate active memory; else needs_human.
23. [project ] proj=atocore Multi-model triage: use different model for triage reviewer than extractor (sonnet for extract)
24. [project ] proj=atocore R9 fix: when interaction has known project, prefer it over model's non-matching project unless model's is registered
25. [project ] proj=atocore R7 ranking fix: add overlap-density as secondary signal (overlap_count / memory_token_count)
26. [project ] proj=atocore Extraction pipeline skips interactions with response_chars < 50 to avoid low-signal content
27. [project ] proj=atocore AtoCore triage uses independent model from extractor (extractor: sonnet, triage: different model or different prompt).
28. [project ] proj=atocore AtoCore ranking scorer adds overlap-density (overlap_count / memory_tokens) as secondary signal to fix short-memory ranking.
29. [project ] proj=atocore AtoCore project trust: when interaction has known project and model returns different project, prefer interaction's project unless
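Candidates 25 and 28 describe overlap-density (overlap_count / memory_token_count) as a secondary ranking signal for short memories. A rough sketch of that idea follows, assuming a simple lowercase alphanumeric tokenizer and a tuple sort key in which raw overlap stays primary and density only breaks ties. The function names are hypothetical, and how the real scorer combines the two signals is not stated in this commit.

```python
from __future__ import annotations

import re


def tokenize(text: str) -> list[str]:
    """Assumed tokenizer: lowercase runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query: str, memory: str) -> tuple[int, float]:
    """Return (overlap_count, overlap_density) for one memory.

    overlap_count counts distinct query tokens that appear in the memory;
    overlap_density divides that count by the memory's total token count.
    """
    query_tokens = set(tokenize(query))
    memory_tokens = tokenize(memory)
    if not memory_tokens:
        return (0, 0.0)
    overlap_count = len(query_tokens & set(memory_tokens))
    return (overlap_count, overlap_count / len(memory_tokens))


def rank(query: str, memories: list[str]) -> list[str]:
    """Sort by overlap count first, then by density as the tie-breaker."""
    return sorted(memories, key=lambda m: overlap_score(query, m), reverse=True)
```

Under these assumptions, two memories that share the same number of query tokens are ordered so the shorter, denser one comes first, which is the short-memory case the candidates mention.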