Three additive upgrades borrowed from Karpathy's LLM Wiki pattern:

1. CONTRADICTION DETECTION: auto-triage now has a fourth verdict, "contradicts".
   When a candidate conflicts with an existing memory (not a duplicate, but a
   genuine disagreement like "Option A selected" vs "Option B selected"), the
   triage model flags it and leaves it in the queue for human review instead of
   silently rejecting or double-storing it. This preserves source tension rather
   than suppressing it.

2. WEEKLY LINT PASS: scripts/lint_knowledge_base.py checks for:
   - Orphan memories (active but zero references after 14 days)
   - Stale candidates (>7 days unreviewed)
   - Unused entities (no relationships)
   - Empty-state projects
   - Unregistered projects auto-detected in memories
   Runs on Sundays via cron and outputs a report.

3. WEEKLY SYNTHESIS: scripts/synthesize_projects.py uses sonnet to generate a
   3-5 sentence "current state" paragraph per project from state + memories +
   entities, cached in project_state under status/synthesis_cache. Wiki project
   pages now show this at the top under "Current State (auto-synthesis)",
   falling back to a deterministic summary if no cache exists.

deploy/dalidou/batch-extract.sh: added Step C (synthesis) and Step D (lint),
gated to Sundays via a date check.

All changes are additive; nothing existing changes behavior. The database
remains the source of truth; these operations just produce better synthesized
views and catch rot.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
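The "contradicts" path in item 1 can be sketched roughly as follows. This is a hypothetical illustration, not the actual auto_triage.py code: the other three verdict names (accept, reject, duplicate) and the routing function are assumptions inferred from the description; only "contradicts" and its keep-in-queue behavior come from the commit message.

```python
# Hypothetical sketch of the four-verdict triage routing described above.
# Verdict names other than "contradicts" are assumptions.
from dataclasses import dataclass

VERDICTS = {"accept", "reject", "duplicate", "contradicts"}


@dataclass
class Candidate:
    text: str
    verdict: str  # produced by the triage model


def route(candidate: Candidate) -> str:
    """Return the fate of a candidate after triage."""
    if candidate.verdict not in VERDICTS:
        raise ValueError(f"unknown verdict: {candidate.verdict}")
    if candidate.verdict == "contradicts":
        # Genuine disagreement with an existing memory: neither silently
        # reject nor double-store. Leave it queued for a human.
        return "queued-for-human-review"
    if candidate.verdict == "accept":
        return "stored"
    return "dropped"  # reject and duplicate both leave the queue


print(route(Candidate("Option B selected", verdict="contradicts")))
# → queued-for-human-review
```

The point of the fourth verdict is that contradiction is a terminal state for the machine but not for the record: the candidate stays visible until a human resolves the tension.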
70 lines
2.1 KiB
Bash
#!/usr/bin/env bash
#
# deploy/dalidou/batch-extract.sh
# --------------------------------
# Host-side LLM batch extraction for Dalidou.
#
# The claude CLI is available on the Dalidou HOST but NOT inside the
# Docker container. This script runs on the host, fetches recent
# interactions from the AtoCore API, runs the LLM extractor locally
# (claude -p sonnet), and posts candidates back to the API.
#
# Intended to be called from cron-backup.sh after backup/cleanup/rsync,
# or manually via:
#
#   bash /srv/storage/atocore/app/deploy/dalidou/batch-extract.sh
#
# Environment variables:
#   ATOCORE_URL            default http://127.0.0.1:8100
#   ATOCORE_EXTRACT_LIMIT  default 50

set -euo pipefail

ATOCORE_URL="${ATOCORE_URL:-http://127.0.0.1:8100}"
LIMIT="${ATOCORE_EXTRACT_LIMIT:-50}"
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
APP_DIR="$(cd "$SCRIPT_DIR/../.." && pwd)"

# Compute the timestamp per call so each log line carries its own time,
# not the time the script started.
log() { printf '[%s] %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*"; }

# The Python scripts need the atocore source on PYTHONPATH
export PYTHONPATH="$APP_DIR/src:${PYTHONPATH:-}"

log "=== AtoCore batch extraction + triage starting ==="
log "URL=$ATOCORE_URL LIMIT=$LIMIT"

# Step A: Extract candidates from recent interactions
log "Step A: LLM extraction"
python3 "$APP_DIR/scripts/batch_llm_extract_live.py" \
    --base-url "$ATOCORE_URL" \
    --limit "$LIMIT" \
    2>&1 || {
    log "WARN: batch extraction failed (non-blocking)"
}

# Step B: Auto-triage candidates in the queue
log "Step B: auto-triage"
python3 "$APP_DIR/scripts/auto_triage.py" \
    --base-url "$ATOCORE_URL" \
    2>&1 || {
    log "WARN: auto-triage failed (non-blocking)"
}

# Steps C and D run only on Sundays (ISO weekday 7 from date +%u)
if [[ "$(date -u +%u)" == "7" ]]; then
    log "Step C: weekly project synthesis"
    python3 "$APP_DIR/scripts/synthesize_projects.py" \
        --base-url "$ATOCORE_URL" \
        2>&1 || {
        log "WARN: synthesis failed (non-blocking)"
    }

    log "Step D: weekly lint pass"
    python3 "$APP_DIR/scripts/lint_knowledge_base.py" \
        --base-url "$ATOCORE_URL" \
        2>&1 || true
fi

log "=== AtoCore batch extraction + triage complete ==="
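The wiki page's "Current State" fallback ("Falls back to a deterministic summary if no cache exists") could look roughly like this. A minimal sketch under stated assumptions: the function name and the shape of the memories list are hypothetical; only the status/synthesis_cache location comes from the commit message.

```python
# Hypothetical sketch of the "Current State" lookup: prefer the cached LLM
# synthesis, otherwise build a deterministic summary with no LLM call.
# Field names other than status/synthesis_cache are assumptions.
def current_state(project_state: dict, memories: list[str]) -> str:
    cached = project_state.get("status", {}).get("synthesis_cache")
    if cached:
        return cached
    # Deterministic fallback: just counts plus the most recent memory,
    # so the page never renders empty even before the first Sunday run.
    latest = memories[-1] if memories else "none recorded"
    return f"{len(memories)} memories on record; most recent: {latest}"


print(current_state({"status": {}}, ["migrated DB", "chose Option B"]))
# → 2 memories on record; most recent: chose Option B
```

Keeping the fallback deterministic means the wiki stays useful between weekly synthesis runs and after a cache wipe, at the cost of a blander paragraph.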