Compare commits
31 Commits
775960c8c8
...
main
| Author | SHA1 | Date | |
|---|---|---|---|
| 83b4d78cb7 | |||
| 9c91d778d9 | |||
| 6e43cc7383 | |||
| 877b97ec78 | |||
| e840ef4be3 | |||
| 56d5df0ab4 | |||
| 028d4c3594 | |||
| 9f262a21b0 | |||
| 7863ab3825 | |||
| 3ba49e92a9 | |||
| 02055e8db3 | |||
| cc68839306 | |||
| 45196f352f | |||
| d456d3c86a | |||
| 0dfecb3c14 | |||
| 3ca19724a5 | |||
| 3316ff99f9 | |||
| 53b71639ad | |||
| 07664bd743 | |||
| bb46e21c9b | |||
| 88f2f7c4e1 | |||
| bfa7dba4de | |||
| 271ee25d99 | |||
| d8b370fd0a | |||
| 86637f8eee | |||
| c49363fccc | |||
| 33a6c61ca6 | |||
| 33a106732f | |||
| 3011aa77da | |||
| ba36a28453 | |||
| 999788b790 |
1
.gitignore
vendored
1
.gitignore
vendored
@@ -6,6 +6,7 @@ __pycache__/
|
||||
dist/
|
||||
build/
|
||||
.pytest_cache/
|
||||
.mypy_cache/
|
||||
htmlcov/
|
||||
.coverage
|
||||
venv/
|
||||
|
||||
@@ -6,22 +6,23 @@
|
||||
|
||||
## Orientation
|
||||
|
||||
- **live_sha** (Dalidou `/health` build_sha): `c2e7064` (verified 2026-04-15 via /health, build_time 2026-04-15T15:08:51Z)
|
||||
- **last_updated**: 2026-04-15 by Claude (deploy caught up; R10/R13 closed)
|
||||
- **main_tip**: `c2e7064` (plus one pending doc/ledger commit for this session)
|
||||
- **test_count**: 299 collected via `pytest --collect-only -q` on a clean checkout, 2026-04-15 (reproduction recipe in Quick Commands)
|
||||
- **harness**: `18/18 PASS`
|
||||
- **live_sha** (Dalidou `/health` build_sha): `775960c` (verified 2026-04-16 via /health, build_time 2026-04-16T17:59:30Z)
|
||||
- **last_updated**: 2026-04-18 by Claude (Phase 7A — Memory Consolidation "sleep cycle" V1 on branch, not yet deployed)
|
||||
- **main_tip**: `999788b`
|
||||
- **test_count**: 395 (21 new Phase 7A dedup tests + accumulated Phase 5/6 tests since last ledger refresh)
|
||||
- **harness**: `17/18 PASS` on live Dalidou (p04-constraints expects "Zerodur" — retrieval content gap, not regression)
|
||||
- **vectors**: 33,253
|
||||
- **active_memories**: 84 (31 project, 23 knowledge, 10 episodic, 8 adaptation, 7 preference, 5 identity)
|
||||
- **candidate_memories**: 2
|
||||
- **interactions**: 234 total (192 claude-code, 38 openclaw, 4 test)
|
||||
- **registered_projects**: atocore, p04-gigabit, p05-interferometer, p06-polisher, atomizer-v2, abb-space (aliased p08)
|
||||
- **project_state_entries**: 78 total (p04=9, p05=13, p06=13, atocore=43)
|
||||
- **project_state_entries**: 110 total (atocore=47, p06=19, p05=18, p04=15, abb=6, atomizer=5)
|
||||
- **entities**: 35 (engineering knowledge graph, Layer 2)
|
||||
- **off_host_backup**: `papa@192.168.86.39:/home/papa/atocore-backups/` via cron, verified
|
||||
- **nightly_pipeline**: backup → cleanup → rsync → **OpenClaw import** (NEW) → vault refresh (NEW) → extract → auto-triage → weekly synth/lint Sundays
|
||||
- **capture_clients**: claude-code (Stop hook), openclaw (plugin + file importer)
|
||||
- **nightly_pipeline**: backup → cleanup → rsync → OpenClaw import → vault refresh → extract → auto-triage → **auto-promote/expire (NEW)** → weekly synth/lint Sundays → **retrieval harness (NEW)** → **pipeline summary (NEW)**
|
||||
- **capture_clients**: claude-code (Stop hook + cwd project inference), openclaw (before_agent_start + llm_output plugin, verified live)
|
||||
- **wiki**: http://dalidou:8100/wiki (browse), /wiki/projects/{id}, /wiki/entities/{id}, /wiki/search
|
||||
- **dashboard**: http://dalidou:8100/admin/dashboard
|
||||
- **dashboard**: http://dalidou:8100/admin/dashboard (now shows pipeline health, interaction totals by client, all registered projects)
|
||||
|
||||
## Active Plan
|
||||
|
||||
@@ -159,6 +160,16 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
|
||||
|
||||
## Session Log
|
||||
|
||||
- **2026-04-19 Claude** Shipped Phases 7A.1 (tiered auto-merge), 7C (tag canonicalization), 7D (confidence decay), 7I (OpenClaw context injection), UI refresh (memory/domain/activity pages + topnav), and closed the Claude Code retrieval asymmetry. Builds deployed: `028d4c3` → `56d5df0` → `e840ef4` → `877b97e` → `6e43cc7` → `9c91d77`. New capture-surface scope: Claude Code (Stop + UserPromptSubmit hooks, both installed and verified live) + OpenClaw (v0.2.0 plugin with capture + context injection, verified loaded on T420 gateway). `/wiki/capture` paste form removed from topnav; kept as labeled fallback. Anthropic API polling explicitly out of scope per user. Tests 414 → 459. `docs/capture-surfaces.md` documents the sanctioned scope.
|
||||
|
||||
- **2026-04-18 Claude** **Phase 7A — Memory Consolidation V1 ("sleep cycle") landed on branch.** New `docs/PHASE-7-MEMORY-CONSOLIDATION.md` covers all 8 subphases (7A dedup, 7B contradictions, 7C tag canon, 7D confidence decay, 7E memory detail, 7F domain view, 7F re-extract, 7H vector hygiene). 7A implementation: schema migration `memory_merge_candidates`, `atocore.memory.similarity` (cosine + transitive cluster), stdlib-only `atocore.memory._dedup_prompt` (llm drafts unified content preserving all specifics), `merge_memories()` + `create_merge_candidate()` + `get_merge_candidates()` + `reject_merge_candidate()` in service.py, host-side `scripts/memory_dedup.py` (HTTP + claude -p, idempotent via sorted-id set), 5 new endpoints under `/admin/memory/merge-candidates*` + `/admin/memory/dedup-scan` + `/admin/memory/dedup-status`, purple-themed "🔗 Merge Candidates" section in /admin/triage with editable draft + approve/reject buttons, "🔗 Scan for duplicates" control bar with threshold slider, nightly Step B3 in batch-extract.sh (0.90 daily, 0.85 Sundays deep), `deploy/dalidou/dedup-watcher.sh` host watcher for UI-triggered scans (mirrors graduation-watcher pattern). 21 new tests (similarity, prompt parse, idempotency, merge happy path, override content/tags, audit rows, abort-if-source-tampered, reject leaves sources alone, schema). Tests 374 → 395. Not yet deployed; harness not re-run. Next: push + deploy, install `dedup-watcher.sh` in host cron, trigger first scan, review proposals in UI.
|
||||
|
||||
- **2026-04-16 Claude** `b687e7f..999788b` **"Make It Actually Useful" sprint.** Two-part session: ops fixes then consolidation sprint.
|
||||
|
||||
**Part 1 — Ops fixes:** Deployed `b687e7f` (project inference from cwd). Fixed cron logging (was `/dev/null` — redirected to `~/atocore-logs/`). Fixed OpenClaw gateway crash-loop (`discord.replyToMode: "any"` invalid → `"all"`). Deployed `atocore-capture` plugin on T420 OpenClaw using `before_agent_start` + `llm_output` hooks — verified end-to-end: 38 `client=openclaw` interactions captured. Backfilled project tags on 179/181 unscoped interactions (165 atocore, 8 p06, 6 p04).
|
||||
|
||||
**Part 2 — Sprint (Phase A+C):** Pipeline observability: retrieval harness now runs nightly (Step E), pipeline summary persisted to project state (Step F), dashboard enhanced with interaction totals by client + pipeline health section + dynamic project list. Phase 10 landed: `auto_promote_reinforced()` (candidate→active when reference_count≥3, confidence≥0.7) + `expire_stale_candidates()` (14-day unreinforced→auto-reject), both wired into nightly cron Step B2. Seeding script created (26 entries across 6 projects — all already existed from prior session). Tests 299→303. Harness 17/18 on live Dalidou (p04-constraints expects "Zerodur" — retrieval content gap, not regression). Deployed `775960c`.
|
||||
|
||||
- **2026-04-15 Claude (pm)** Closed the last harness failure honestly. **p06-tailscale fixed: 18/18 PASS.** Root-caused: not a retrieval bug — the p06 `ARCHITECTURE.md` Overview chunk legitimately mentions "the GigaBIT M1 telescope mirror" because the Polisher Suite is built *for* that mirror. All four retrieved sources for the tailscale prompt were genuinely p06/shared paths; zero actual p04 chunks leaked. The fixture's `expect_absent: GigaBIT` was catching semantic overlap, not retrieval bleed. Narrowed it to `expect_absent: "[Source: p04-gigabit/"` — a source-path check that tests the real invariant (no p04 source chunks in p06 context). Other p06 fixtures still use the word-blacklist form; they pass today because their more-specific prompts don't pull the ARCHITECTURE.md Overview, so I left them alone rather than churn fixtures that aren't failing. Did NOT change retrieval/ranking — no code change, fixture-only fix. Tests unchanged at 299.
|
||||
|
||||
- **2026-04-15 Claude** Deploy + doc debt sweep. Deployed `c2e7064` to Dalidou (build_time 2026-04-15T15:08:51Z, build_sha matches, /health ok) so R11/R12 are now live, not just on main. **R11 verified on live**: `POST /admin/extract-batch {"mode":"llm"}` against http://127.0.0.1:8100 returns HTTP 503 with the operator-facing "claude CLI not on PATH, run host-side script or use mode=rule" message — exactly the post-fix contract. **R13 closed (fixed)**: added a reproduction recipe to Quick Commands (`pip install -r requirements-dev.txt && pytest --collect-only -q && pytest -q`) and re-cited `test_count: 299` against a fresh local collection on 2026-04-15, so the claim is now auditable from any clean checkout — Codex's audit worktree just needs `pip install -r requirements-dev.txt`. **R10 closed (fixed)**: rewrote the `docs/master-plan-status.md` OpenClaw section to explicitly disclaim "primary integration" and report the current narrow surface: 14 client request shapes against ~44 server routes, predominantly read + `/project/state` + `/ingest/sources`, with memory/interactions/admin/entities/triage/extraction writes correctly out of scope. Open findings now: none blocking. Next natural move: the last harness failure `p06-tailscale` (chunk bleed).
|
||||
|
||||
108
deploy/dalidou/auto-triage-watcher.sh
Normal file
108
deploy/dalidou/auto-triage-watcher.sh
Normal file
@@ -0,0 +1,108 @@
|
||||
#!/usr/bin/env bash
|
||||
#
|
||||
# deploy/dalidou/auto-triage-watcher.sh
|
||||
# --------------------------------------
|
||||
# Host-side watcher for on-demand auto-triage requests from the web UI.
|
||||
#
|
||||
# The web UI at /admin/triage has an "Auto-process queue" button that
|
||||
# POSTs to /admin/triage/request-drain, which writes a timestamp to
|
||||
# AtoCore project state (atocore/config/auto_triage_requested_at).
|
||||
#
|
||||
# This script runs on the Dalidou HOST (where the claude CLI is
|
||||
# available), polls for the flag, and runs auto_triage.py when seen.
|
||||
#
|
||||
# Installed via cron to run every 2 minutes:
|
||||
# */2 * * * * /srv/storage/atocore/app/deploy/dalidou/auto-triage-watcher.sh
|
||||
#
|
||||
# Safety:
|
||||
# - Lock file prevents concurrent runs
|
||||
# - Flag is cleared after processing so one request = one run
|
||||
# - If auto_triage hangs, the lock prevents pileup until manual cleanup
|
||||
|
||||
set -euo pipefail
|
||||
|
||||
ATOCORE_URL="${ATOCORE_URL:-http://127.0.0.1:8100}"
|
||||
APP_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
|
||||
LOCK_FILE="/tmp/atocore-auto-triage.lock"
|
||||
LOG_DIR="/home/papa/atocore-logs"
|
||||
mkdir -p "$LOG_DIR"
|
||||
|
||||
TS="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
|
||||
log() { printf '[%s] %s\n' "$TS" "$*"; }
|
||||
|
||||
# Fetch the request flag via API (read-only, no lock needed)
|
||||
STATE_JSON=$(curl -sSf --max-time 5 "$ATOCORE_URL/project/state/atocore" 2>/dev/null || echo "{}")
|
||||
REQUESTED=$(echo "$STATE_JSON" | python3 -c "
|
||||
import sys, json
|
||||
try:
|
||||
d = json.load(sys.stdin)
|
||||
for e in d.get('entries', d.get('state', [])):
|
||||
if e.get('category') == 'config' and e.get('key') == 'auto_triage_requested_at':
|
||||
print(e.get('value', ''))
|
||||
break
|
||||
except Exception:
|
||||
pass
|
||||
" 2>/dev/null || echo "")
|
||||
|
||||
if [[ -z "$REQUESTED" ]]; then
|
||||
# No request — silent exit
|
||||
exit 0
|
||||
fi
|
||||
|
||||
# Acquire lock (non-blocking)
|
||||
exec 9>"$LOCK_FILE" || exit 0
|
||||
if ! flock -n 9; then
|
||||
log "auto-triage already running, skipping"
|
||||
exit 0
|
||||
fi
|
||||
|
||||
# Record we're starting
|
||||
curl -sSf -X POST "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"auto_triage_running\",\"value\":\"1\",\"source\":\"host watcher\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
curl -sSf -X POST "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"auto_triage_last_started_at\",\"value\":\"$TS\",\"source\":\"host watcher\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
|
||||
LOG_FILE="$LOG_DIR/auto-triage-ondemand-$(date -u +%Y%m%d-%H%M%S).log"
|
||||
log "Starting auto-triage (request: $REQUESTED, log: $LOG_FILE)"
|
||||
|
||||
# Clear the request flag FIRST so duplicate clicks queue at most one re-run
|
||||
# (the next watcher tick would then see a fresh request, not this one)
|
||||
curl -sSf -X DELETE "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"config\",\"key\":\"auto_triage_requested_at\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
|
||||
# Run the drain
|
||||
cd "$APP_DIR"
|
||||
export PYTHONPATH="$APP_DIR/src:${PYTHONPATH:-}"
|
||||
if python3 scripts/auto_triage.py --base-url "$ATOCORE_URL" >> "$LOG_FILE" 2>&1; then
|
||||
RESULT_LINE=$(tail -5 "$LOG_FILE" | grep "total:" | tail -1 || tail -1 "$LOG_FILE")
|
||||
RESULT="${RESULT_LINE:-completed}"
|
||||
log "auto-triage finished: $RESULT"
|
||||
else
|
||||
RESULT="ERROR — see $LOG_FILE"
|
||||
log "auto-triage FAILED — see $LOG_FILE"
|
||||
fi
|
||||
|
||||
FINISH_TS="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
|
||||
|
||||
# Mark done
|
||||
curl -sSf -X POST "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"auto_triage_running\",\"value\":\"0\",\"source\":\"host watcher\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
curl -sSf -X POST "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"auto_triage_last_finished_at\",\"value\":\"$FINISH_TS\",\"source\":\"host watcher\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
|
||||
# Escape quotes in result for JSON
|
||||
SAFE_RESULT=$(printf '%s' "$RESULT" | python3 -c "import sys,json; print(json.dumps(sys.stdin.read())[1:-1])")
|
||||
curl -sSf -X POST "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"auto_triage_last_result\",\"value\":\"$SAFE_RESULT\",\"source\":\"host watcher\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
@@ -65,15 +65,16 @@ python3 "$APP_DIR/scripts/auto_promote_reinforced.py" \
|
||||
log "WARN: auto-promote/expire failed (non-blocking)"
|
||||
}
|
||||
|
||||
# Step C: Weekly synthesis (Sundays only)
|
||||
if [[ "$(date -u +%u)" == "7" ]]; then
|
||||
log "Step C: weekly project synthesis"
|
||||
# Step C: Daily project synthesis (keeps wiki/mirror pages fresh)
|
||||
log "Step C: project synthesis (daily)"
|
||||
python3 "$APP_DIR/scripts/synthesize_projects.py" \
|
||||
--base-url "$ATOCORE_URL" \
|
||||
2>&1 || {
|
||||
log "WARN: synthesis failed (non-blocking)"
|
||||
}
|
||||
|
||||
# Step D: Weekly lint pass (Sundays only — heavier, not needed daily)
|
||||
if [[ "$(date -u +%u)" == "7" ]]; then
|
||||
log "Step D: weekly lint pass"
|
||||
python3 "$APP_DIR/scripts/lint_knowledge_base.py" \
|
||||
--base-url "$ATOCORE_URL" \
|
||||
@@ -149,4 +150,125 @@ print(f'Pipeline summary persisted: {json.dumps(summary)}')
|
||||
log "WARN: pipeline summary persistence failed (non-blocking)"
|
||||
}
|
||||
|
||||
# Step F2: Emerging-concepts detector (Phase 6 C.1)
|
||||
log "Step F2: emerging-concepts detector"
|
||||
python3 "$APP_DIR/scripts/detect_emerging.py" \
|
||||
--base-url "$ATOCORE_URL" \
|
||||
2>&1 || {
|
||||
log "WARN: emerging detector failed (non-blocking)"
|
||||
}
|
||||
|
||||
# Step F3: Transient-to-durable extension (Phase 6 C.3)
|
||||
log "Step F3: transient-to-durable extension"
|
||||
curl -sSf -X POST "$ATOCORE_URL/admin/memory/extend-reinforced" \
|
||||
-H 'Content-Type: application/json' \
|
||||
2>&1 | tail -5 || {
|
||||
log "WARN: extend-reinforced failed (non-blocking)"
|
||||
}
|
||||
|
||||
# Step F4: Confidence decay on unreferenced cold memories (Phase 7D)
|
||||
# Daily: memories with reference_count=0 AND idle > 30 days → confidence × 0.97.
|
||||
# Below 0.3 → auto-supersede with audit. Reversible via reinforcement.
|
||||
log "Step F4: confidence decay"
|
||||
curl -sSf -X POST "$ATOCORE_URL/admin/memory/decay-run" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"idle_days_threshold": 30, "daily_decay_factor": 0.97, "supersede_confidence_floor": 0.30}' \
|
||||
2>&1 | tail -5 || {
|
||||
log "WARN: decay-run failed (non-blocking)"
|
||||
}
|
||||
|
||||
# Step B3: Memory dedup scan (Phase 7A)
|
||||
# Nightly at 0.90 (tight — only near-duplicates). Sundays run a deeper
|
||||
# pass at 0.85 to catch semantically-similar-but-differently-worded memories.
|
||||
if [[ "$(date -u +%u)" == "7" ]]; then
|
||||
DEDUP_THRESHOLD="0.85"
|
||||
DEDUP_BATCH="80"
|
||||
log "Step B3: memory dedup (Sunday deep pass, threshold $DEDUP_THRESHOLD)"
|
||||
else
|
||||
DEDUP_THRESHOLD="0.90"
|
||||
DEDUP_BATCH="50"
|
||||
log "Step B3: memory dedup (daily, threshold $DEDUP_THRESHOLD)"
|
||||
fi
|
||||
python3 "$APP_DIR/scripts/memory_dedup.py" \
|
||||
--base-url "$ATOCORE_URL" \
|
||||
--similarity-threshold "$DEDUP_THRESHOLD" \
|
||||
--max-batch "$DEDUP_BATCH" \
|
||||
2>&1 || {
|
||||
log "WARN: memory dedup failed (non-blocking)"
|
||||
}
|
||||
|
||||
# Step B4: Tag canonicalization (Phase 7C, weekly Sundays)
|
||||
# Autonomous: LLM proposes alias→canonical maps, auto-applies confidence >= 0.8.
|
||||
# Projects tokens are protected (skipped on both sides). Borderline proposals
|
||||
# land in /admin/tags/aliases for human review.
|
||||
if [[ "$(date -u +%u)" == "7" ]]; then
|
||||
log "Step B4: tag canonicalization (Sunday)"
|
||||
python3 "$APP_DIR/scripts/canonicalize_tags.py" \
|
||||
--base-url "$ATOCORE_URL" \
|
||||
2>&1 || {
|
||||
log "WARN: tag canonicalization failed (non-blocking)"
|
||||
}
|
||||
fi
|
||||
|
||||
# Step G: Integrity check (Phase 4 V1)
|
||||
log "Step G: integrity check"
|
||||
python3 "$APP_DIR/scripts/integrity_check.py" \
|
||||
--base-url "$ATOCORE_URL" \
|
||||
2>&1 || {
|
||||
log "WARN: integrity check failed (non-blocking)"
|
||||
}
|
||||
|
||||
# Step H: Pipeline-level alerts — detect conditions that warrant attention
|
||||
log "Step H: pipeline alerts"
|
||||
python3 -c "
|
||||
import json, os, sys, urllib.request
|
||||
sys.path.insert(0, '$APP_DIR/src')
|
||||
from atocore.observability.alerts import emit_alert
|
||||
|
||||
base = '$ATOCORE_URL'
|
||||
|
||||
def get_state(project='atocore'):
|
||||
try:
|
||||
req = urllib.request.Request(f'{base}/project/state/{project}')
|
||||
resp = urllib.request.urlopen(req, timeout=10)
|
||||
return json.loads(resp.read()).get('entries', [])
|
||||
except Exception:
|
||||
return []
|
||||
|
||||
def get_dashboard():
|
||||
try:
|
||||
req = urllib.request.Request(f'{base}/admin/dashboard')
|
||||
resp = urllib.request.urlopen(req, timeout=10)
|
||||
return json.loads(resp.read())
|
||||
except Exception:
|
||||
return {}
|
||||
|
||||
state = {(e['category'], e['key']): e['value'] for e in get_state()}
|
||||
dash = get_dashboard()
|
||||
|
||||
# Harness regression check
|
||||
harness_raw = state.get(('status', 'retrieval_harness_result'))
|
||||
if harness_raw:
|
||||
try:
|
||||
h = json.loads(harness_raw)
|
||||
passed, total = h.get('passed', 0), h.get('total', 0)
|
||||
if total > 0:
|
||||
rate = passed / total
|
||||
if rate < 0.85:
|
||||
emit_alert('warning', 'Retrieval harness below 85%',
|
||||
f'Only {passed}/{total} fixtures passing ({rate:.0%}). Failures: {h.get(\"failures\", [])[:5]}',
|
||||
context={'pass_rate': rate})
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Candidate queue pileup
|
||||
candidates = dash.get('memories', {}).get('candidates', 0)
|
||||
if candidates > 200:
|
||||
emit_alert('warning', 'Candidate queue not draining',
|
||||
f'{candidates} candidates pending. Auto-triage may be stuck or rate-limited.',
|
||||
context={'candidates': candidates})
|
||||
|
||||
print('pipeline alerts check complete')
|
||||
" 2>&1 || true
|
||||
|
||||
log "=== AtoCore batch extraction + triage complete ==="
|
||||
|
||||
110
deploy/dalidou/dedup-watcher.sh
Normal file
110
deploy/dalidou/dedup-watcher.sh
Normal file
@@ -0,0 +1,110 @@
|
||||
#!/usr/bin/env bash
|
||||
#
|
||||
# deploy/dalidou/dedup-watcher.sh
|
||||
# -------------------------------
|
||||
# Host-side watcher for on-demand memory dedup scans (Phase 7A).
|
||||
#
|
||||
# The /admin/triage page has a "🔗 Scan for duplicates" button that POSTs
|
||||
# to /admin/memory/dedup-scan with {project, similarity_threshold, max_batch}.
|
||||
# The container writes this to project_state (atocore/config/dedup_requested_at).
|
||||
#
|
||||
# This script runs on the Dalidou HOST (where claude CLI lives), polls
|
||||
# for the flag, and runs memory_dedup.py when seen.
|
||||
#
|
||||
# Installed via cron every 2 minutes:
|
||||
# */2 * * * * /srv/storage/atocore/app/deploy/dalidou/dedup-watcher.sh \
|
||||
# >> /home/papa/atocore-logs/dedup-watcher.log 2>&1
|
||||
#
|
||||
# Mirrors deploy/dalidou/graduation-watcher.sh exactly.
|
||||
|
||||
set -euo pipefail
|
||||
|
||||
ATOCORE_URL="${ATOCORE_URL:-http://127.0.0.1:8100}"
|
||||
APP_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
|
||||
LOCK_FILE="/tmp/atocore-dedup.lock"
|
||||
LOG_DIR="/home/papa/atocore-logs"
|
||||
mkdir -p "$LOG_DIR"
|
||||
|
||||
TS="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
|
||||
log() { printf '[%s] %s\n' "$TS" "$*"; }
|
||||
|
||||
# Fetch the flag via API
|
||||
STATE_JSON=$(curl -sSf --max-time 5 "$ATOCORE_URL/project/state/atocore" 2>/dev/null || echo "{}")
|
||||
REQUESTED=$(echo "$STATE_JSON" | python3 -c "
|
||||
import sys, json
|
||||
try:
|
||||
d = json.load(sys.stdin)
|
||||
for e in d.get('entries', d.get('state', [])):
|
||||
if e.get('category') == 'config' and e.get('key') == 'dedup_requested_at':
|
||||
print(e.get('value', ''))
|
||||
break
|
||||
except Exception:
|
||||
pass
|
||||
" 2>/dev/null || echo "")
|
||||
|
||||
if [[ -z "$REQUESTED" ]]; then
|
||||
exit 0
|
||||
fi
|
||||
|
||||
PROJECT=$(echo "$REQUESTED" | python3 -c "import sys,json; print(json.loads(sys.stdin.read() or '{}').get('project',''))" 2>/dev/null || echo "")
|
||||
THRESHOLD=$(echo "$REQUESTED" | python3 -c "import sys,json; print(json.loads(sys.stdin.read() or '{}').get('similarity_threshold',0.88))" 2>/dev/null || echo "0.88")
|
||||
MAX_BATCH=$(echo "$REQUESTED" | python3 -c "import sys,json; print(json.loads(sys.stdin.read() or '{}').get('max_batch',50))" 2>/dev/null || echo "50")
|
||||
|
||||
# Acquire lock
|
||||
exec 9>"$LOCK_FILE" || exit 0
|
||||
if ! flock -n 9; then
|
||||
log "dedup already running, skipping"
|
||||
exit 0
|
||||
fi
|
||||
|
||||
# Mark running
|
||||
curl -sSf -X POST "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"dedup_running\",\"value\":\"1\",\"source\":\"dedup watcher\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
curl -sSf -X POST "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"dedup_last_started_at\",\"value\":\"$TS\",\"source\":\"dedup watcher\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
|
||||
LOG_FILE="$LOG_DIR/dedup-ondemand-$(date -u +%Y%m%d-%H%M%S).log"
|
||||
log "Starting dedup (project='$PROJECT' threshold=$THRESHOLD max_batch=$MAX_BATCH, log: $LOG_FILE)"
|
||||
|
||||
# Clear the flag BEFORE running so duplicate clicks queue at most one
|
||||
curl -sSf -X DELETE "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"config\",\"key\":\"dedup_requested_at\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
|
||||
cd "$APP_DIR"
|
||||
export PYTHONPATH="$APP_DIR/src:${PYTHONPATH:-}"
|
||||
ARGS=(--base-url "$ATOCORE_URL" --similarity-threshold "$THRESHOLD" --max-batch "$MAX_BATCH")
|
||||
if [[ -n "$PROJECT" ]]; then
|
||||
ARGS+=(--project "$PROJECT")
|
||||
fi
|
||||
|
||||
if python3 scripts/memory_dedup.py "${ARGS[@]}" >> "$LOG_FILE" 2>&1; then
|
||||
RESULT=$(grep "^summary:" "$LOG_FILE" | tail -1 || tail -1 "$LOG_FILE")
|
||||
RESULT="${RESULT:-completed}"
|
||||
log "dedup finished: $RESULT"
|
||||
else
|
||||
RESULT="ERROR — see $LOG_FILE"
|
||||
log "dedup FAILED"
|
||||
fi
|
||||
|
||||
FINISH_TS="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
|
||||
|
||||
curl -sSf -X POST "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"dedup_running\",\"value\":\"0\",\"source\":\"dedup watcher\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
curl -sSf -X POST "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"dedup_last_finished_at\",\"value\":\"$FINISH_TS\",\"source\":\"dedup watcher\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
|
||||
SAFE_RESULT=$(printf '%s' "$RESULT" | python3 -c "import sys,json; print(json.dumps(sys.stdin.read())[1:-1])")
|
||||
curl -sSf -X POST "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"dedup_last_result\",\"value\":\"$SAFE_RESULT\",\"source\":\"dedup watcher\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
117
deploy/dalidou/graduation-watcher.sh
Normal file
117
deploy/dalidou/graduation-watcher.sh
Normal file
@@ -0,0 +1,117 @@
|
||||
#!/usr/bin/env bash
|
||||
#
|
||||
# deploy/dalidou/graduation-watcher.sh
|
||||
# ------------------------------------
|
||||
# Host-side watcher for on-demand memory→entity graduation from the web UI.
|
||||
#
|
||||
# The /admin/triage page has a "🎓 Graduate memories" button that POSTs
|
||||
# to /admin/graduation/request with {project, limit}. The container
|
||||
# writes this to project_state (atocore/config/graduation_requested_at).
|
||||
#
|
||||
# This script runs on the Dalidou HOST (where claude CLI lives), polls
|
||||
# for the flag, and runs graduate_memories.py when seen.
|
||||
#
|
||||
# Installed via cron every 2 minutes:
|
||||
# */2 * * * * /srv/storage/atocore/app/deploy/dalidou/graduation-watcher.sh \
|
||||
# >> /home/papa/atocore-logs/graduation-watcher.log 2>&1
|
||||
#
|
||||
# Safety:
|
||||
# - Lock file prevents concurrent runs
|
||||
# - Flag cleared before processing so duplicate clicks queue at most one re-run
|
||||
# - Fail-open: any error logs but doesn't break the host
|
||||
|
||||
set -euo pipefail
|
||||
|
||||
ATOCORE_URL="${ATOCORE_URL:-http://127.0.0.1:8100}"
|
||||
APP_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
|
||||
LOCK_FILE="/tmp/atocore-graduation.lock"
|
||||
LOG_DIR="/home/papa/atocore-logs"
|
||||
mkdir -p "$LOG_DIR"
|
||||
|
||||
TS="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
|
||||
log() { printf '[%s] %s\n' "$TS" "$*"; }
|
||||
|
||||
# Fetch the flag via API
|
||||
STATE_JSON=$(curl -sSf --max-time 5 "$ATOCORE_URL/project/state/atocore" 2>/dev/null || echo "{}")
|
||||
REQUESTED=$(echo "$STATE_JSON" | python3 -c "
|
||||
import sys, json
|
||||
try:
|
||||
d = json.load(sys.stdin)
|
||||
for e in d.get('entries', d.get('state', [])):
|
||||
if e.get('category') == 'config' and e.get('key') == 'graduation_requested_at':
|
||||
print(e.get('value', ''))
|
||||
break
|
||||
except Exception:
|
||||
pass
|
||||
" 2>/dev/null || echo "")
|
||||
|
||||
if [[ -z "$REQUESTED" ]]; then
|
||||
exit 0
|
||||
fi
|
||||
|
||||
# Parse JSON: {project, limit, requested_at}
|
||||
PROJECT=$(echo "$REQUESTED" | python3 -c "import sys,json; d=json.load(sys.stdin) if '{' in sys.stdin.buffer.peek().decode(errors='ignore') else None; print((d or {}).get('project',''))" 2>/dev/null || echo "")
|
||||
# Fallback: python inline above can be flaky; just re-parse
|
||||
PROJECT=$(echo "$REQUESTED" | python3 -c "import sys,json; print(json.loads(sys.stdin.read() or '{}').get('project',''))" 2>/dev/null || echo "")
|
||||
LIMIT=$(echo "$REQUESTED" | python3 -c "import sys,json; print(json.loads(sys.stdin.read() or '{}').get('limit',30))" 2>/dev/null || echo "30")
|
||||
|
||||
# Acquire lock
|
||||
exec 9>"$LOCK_FILE" || exit 0
|
||||
if ! flock -n 9; then
|
||||
log "graduation already running, skipping"
|
||||
exit 0
|
||||
fi
|
||||
|
||||
# Mark running
|
||||
curl -sSf -X POST "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"graduation_running\",\"value\":\"1\",\"source\":\"host watcher\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
curl -sSf -X POST "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"graduation_last_started_at\",\"value\":\"$TS\",\"source\":\"host watcher\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
|
||||
LOG_FILE="$LOG_DIR/graduation-ondemand-$(date -u +%Y%m%d-%H%M%S).log"
|
||||
log "Starting graduation (project='$PROJECT' limit=$LIMIT, log: $LOG_FILE)"
|
||||
|
||||
# Clear the flag BEFORE running so duplicate clicks queue at most one
|
||||
curl -sSf -X DELETE "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"config\",\"key\":\"graduation_requested_at\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
|
||||
# Build script args
|
||||
cd "$APP_DIR"
|
||||
export PYTHONPATH="$APP_DIR/src:${PYTHONPATH:-}"
|
||||
ARGS=(--base-url "$ATOCORE_URL" --limit "$LIMIT")
|
||||
if [[ -n "$PROJECT" ]]; then
|
||||
ARGS+=(--project "$PROJECT")
|
||||
fi
|
||||
|
||||
if python3 scripts/graduate_memories.py "${ARGS[@]}" >> "$LOG_FILE" 2>&1; then
|
||||
RESULT=$(tail -3 "$LOG_FILE" | grep "^total:" | tail -1 || tail -1 "$LOG_FILE")
|
||||
RESULT="${RESULT:-completed}"
|
||||
log "graduation finished: $RESULT"
|
||||
else
|
||||
RESULT="ERROR — see $LOG_FILE"
|
||||
log "graduation FAILED"
|
||||
fi
|
||||
|
||||
FINISH_TS="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
|
||||
|
||||
# Mark done
|
||||
curl -sSf -X POST "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"graduation_running\",\"value\":\"0\",\"source\":\"host watcher\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
curl -sSf -X POST "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"graduation_last_finished_at\",\"value\":\"$FINISH_TS\",\"source\":\"host watcher\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
|
||||
SAFE_RESULT=$(printf '%s' "$RESULT" | python3 -c "import sys,json; print(json.dumps(sys.stdin.read())[1:-1])")
|
||||
curl -sSf -X POST "$ATOCORE_URL/project/state" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "{\"project\":\"atocore\",\"category\":\"status\",\"key\":\"graduation_last_result\",\"value\":\"$SAFE_RESULT\",\"source\":\"host watcher\"}" \
|
||||
>/dev/null 2>&1 || true
|
||||
64
deploy/dalidou/hourly-extract.sh
Normal file
64
deploy/dalidou/hourly-extract.sh
Normal file
@@ -0,0 +1,64 @@
|
||||
#!/usr/bin/env bash
|
||||
#
|
||||
# deploy/dalidou/hourly-extract.sh
|
||||
# ---------------------------------
|
||||
# Lightweight hourly extraction + triage so autonomous capture stays
|
||||
# current (not a 24h-latency nightly-only affair).
|
||||
#
|
||||
# Does ONLY:
|
||||
# Step A: LLM extraction over recent interactions (last 2h window)
|
||||
# Step B: 3-tier auto-triage on the resulting candidates
|
||||
#
|
||||
# Skips the heavy nightly stuff (backup, rsync, OpenClaw import,
|
||||
# synthesis, harness, integrity check, emerging detector). Those stay
|
||||
# in cron-backup.sh at 03:00 UTC.
|
||||
#
|
||||
# Runs every hour via cron:
|
||||
# 0 * * * * /srv/storage/atocore/app/deploy/dalidou/hourly-extract.sh \
|
||||
# >> /home/papa/atocore-logs/hourly-extract.log 2>&1
|
||||
#
|
||||
# Lock file prevents overlap if a previous run is still going (which
|
||||
# can happen if claude CLI rate-limits and retries).
|
||||
|
||||
set -euo pipefail
|
||||
|
||||
ATOCORE_URL="${ATOCORE_URL:-http://127.0.0.1:8100}"
|
||||
# 50 recent interactions is enough for an hour — typical usage is under 20/h.
|
||||
LIMIT="${ATOCORE_HOURLY_EXTRACT_LIMIT:-50}"
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
APP_DIR="$(cd "$SCRIPT_DIR/../.." && pwd)"
|
||||
TIMESTAMP="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
|
||||
LOCK_FILE="/tmp/atocore-hourly-extract.lock"
|
||||
|
||||
log() { printf '[%s] %s\n' "$TIMESTAMP" "$*"; }
|
||||
|
||||
# Acquire lock (non-blocking)
|
||||
exec 9>"$LOCK_FILE" || exit 0
|
||||
if ! flock -n 9; then
|
||||
log "hourly extract already running, skipping"
|
||||
exit 0
|
||||
fi
|
||||
|
||||
export PYTHONPATH="$APP_DIR/src:${PYTHONPATH:-}"
|
||||
|
||||
log "=== hourly extract+triage starting ==="
|
||||
|
||||
# Step A — Extract candidates from recent interactions
|
||||
log "Step A: LLM extraction (since last run)"
|
||||
python3 "$APP_DIR/scripts/batch_llm_extract_live.py" \
|
||||
--base-url "$ATOCORE_URL" \
|
||||
--limit "$LIMIT" \
|
||||
2>&1 || {
|
||||
log "WARN: batch extraction failed (non-blocking)"
|
||||
}
|
||||
|
||||
# Step B — 3-tier auto-triage (sonnet → opus → discard)
|
||||
log "Step B: auto-triage (3-tier)"
|
||||
python3 "$APP_DIR/scripts/auto_triage.py" \
|
||||
--base-url "$ATOCORE_URL" \
|
||||
--max-batches 3 \
|
||||
2>&1 || {
|
||||
log "WARN: auto-triage failed (non-blocking)"
|
||||
}
|
||||
|
||||
log "=== hourly extract+triage complete ==="
|
||||
174
deploy/hooks/inject_context.py
Normal file
174
deploy/hooks/inject_context.py
Normal file
@@ -0,0 +1,174 @@
|
||||
#!/usr/bin/env python3
|
||||
"""Claude Code UserPromptSubmit hook: inject AtoCore context.
|
||||
|
||||
Mirrors the OpenClaw 7I pattern on the Claude Code side. Every user
|
||||
prompt submitted to Claude Code is (a) sent to /context/build on the
|
||||
AtoCore API, and (b) the returned context pack is prepended to the
|
||||
prompt the LLM sees — so Claude Code answers grounded in what AtoCore
|
||||
already knows, same as OpenClaw now does.
|
||||
|
||||
Contract per Claude Code hooks spec:
|
||||
stdin: JSON with `prompt`, `session_id`, `transcript_path`, `cwd`,
|
||||
`hook_event_name`, etc.
|
||||
stdout on success: JSON
|
||||
{"hookSpecificOutput":
|
||||
{"hookEventName": "UserPromptSubmit",
|
||||
"additionalContext": "<pack>"}}
|
||||
exit 0 always — fail open. An unreachable AtoCore must never block
|
||||
the user's prompt.
|
||||
|
||||
Environment variables:
|
||||
ATOCORE_URL base URL (default http://dalidou:8100)
|
||||
ATOCORE_CONTEXT_DISABLED set to "1" to disable injection
|
||||
ATOCORE_CONTEXT_BUDGET max chars of injected pack (default 4000)
|
||||
ATOCORE_CONTEXT_TIMEOUT HTTP timeout in seconds (default 5)
|
||||
|
||||
Usage in ~/.claude/settings.json:
|
||||
"UserPromptSubmit": [{
|
||||
"matcher": "",
|
||||
"hooks": [{
|
||||
"type": "command",
|
||||
"command": "python /path/to/inject_context.py",
|
||||
"timeout": 10
|
||||
}]
|
||||
}]
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import os
|
||||
import sys
|
||||
import urllib.error
|
||||
import urllib.request
|
||||
|
||||
ATOCORE_URL = os.environ.get("ATOCORE_URL", "http://dalidou:8100")
|
||||
CONTEXT_TIMEOUT = float(os.environ.get("ATOCORE_CONTEXT_TIMEOUT", "5"))
|
||||
CONTEXT_BUDGET = int(os.environ.get("ATOCORE_CONTEXT_BUDGET", "4000"))
|
||||
|
||||
# Don't spend an API call on trivial acks or slash commands.
|
||||
MIN_PROMPT_LENGTH = 15
|
||||
|
||||
|
||||
# Project inference table — kept in sync with capture_stop.py so both
|
||||
# hooks agree on what project a Claude Code session belongs to.
|
||||
_VAULT = "C:\\Users\\antoi\\antoine\\My Libraries\\Antoine Brain Extension"
|
||||
_PROJECT_PATH_MAP: dict[str, str] = {
|
||||
f"{_VAULT}\\2-Projects\\P04-GigaBIT-M1": "p04-gigabit",
|
||||
f"{_VAULT}\\2-Projects\\P10-Interferometer": "p05-interferometer",
|
||||
f"{_VAULT}\\2-Projects\\P11-Polisher-Fullum": "p06-polisher",
|
||||
f"{_VAULT}\\2-Projects\\P08-ABB-Space-Mirror": "abb-space",
|
||||
f"{_VAULT}\\2-Projects\\I01-Atomizer": "atomizer-v2",
|
||||
f"{_VAULT}\\2-Projects\\I02-AtoCore": "atocore",
|
||||
"C:\\Users\\antoi\\ATOCore": "atocore",
|
||||
"C:\\Users\\antoi\\Polisher-Sim": "p06-polisher",
|
||||
"C:\\Users\\antoi\\Fullum-Interferometer": "p05-interferometer",
|
||||
"C:\\Users\\antoi\\Atomizer-V2": "atomizer-v2",
|
||||
}
|
||||
|
||||
|
||||
def _infer_project(cwd: str) -> str:
|
||||
if not cwd:
|
||||
return ""
|
||||
norm = os.path.normpath(cwd).lower()
|
||||
for path_prefix, project_id in _PROJECT_PATH_MAP.items():
|
||||
if norm.startswith(os.path.normpath(path_prefix).lower()):
|
||||
return project_id
|
||||
return ""
|
||||
|
||||
|
||||
def _emit_empty() -> None:
|
||||
"""Exit 0 with no additionalContext — equivalent to no-op."""
|
||||
sys.exit(0)
|
||||
|
||||
|
||||
def _emit_context(pack: str) -> None:
|
||||
"""Write the hook output JSON and exit 0."""
|
||||
out = {
|
||||
"hookSpecificOutput": {
|
||||
"hookEventName": "UserPromptSubmit",
|
||||
"additionalContext": pack,
|
||||
}
|
||||
}
|
||||
sys.stdout.write(json.dumps(out))
|
||||
sys.exit(0)
|
||||
|
||||
|
||||
def main() -> None:
|
||||
if os.environ.get("ATOCORE_CONTEXT_DISABLED") == "1":
|
||||
_emit_empty()
|
||||
|
||||
try:
|
||||
raw = sys.stdin.read()
|
||||
if not raw.strip():
|
||||
_emit_empty()
|
||||
hook_data = json.loads(raw)
|
||||
except Exception as exc:
|
||||
# Bad stdin → nothing to do
|
||||
print(f"inject_context: bad stdin: {exc}", file=sys.stderr)
|
||||
_emit_empty()
|
||||
|
||||
prompt = (hook_data.get("prompt") or "").strip()
|
||||
cwd = hook_data.get("cwd", "")
|
||||
|
||||
if len(prompt) < MIN_PROMPT_LENGTH:
|
||||
_emit_empty()
|
||||
|
||||
# Skip meta / system prompts that start with '<' (XML tags etc.)
|
||||
if prompt.startswith("<"):
|
||||
_emit_empty()
|
||||
|
||||
project = _infer_project(cwd)
|
||||
|
||||
body = json.dumps({
|
||||
"prompt": prompt,
|
||||
"project": project,
|
||||
"char_budget": CONTEXT_BUDGET,
|
||||
}).encode("utf-8")
|
||||
|
||||
req = urllib.request.Request(
|
||||
f"{ATOCORE_URL}/context/build",
|
||||
data=body,
|
||||
headers={"Content-Type": "application/json"},
|
||||
method="POST",
|
||||
)
|
||||
|
||||
try:
|
||||
resp = urllib.request.urlopen(req, timeout=CONTEXT_TIMEOUT)
|
||||
data = json.loads(resp.read().decode("utf-8"))
|
||||
except urllib.error.URLError as exc:
|
||||
# AtoCore unreachable — fail open
|
||||
print(f"inject_context: atocore unreachable: {exc}", file=sys.stderr)
|
||||
_emit_empty()
|
||||
except Exception as exc:
|
||||
print(f"inject_context: request failed: {exc}", file=sys.stderr)
|
||||
_emit_empty()
|
||||
|
||||
pack = (data.get("formatted_context") or "").strip()
|
||||
if not pack:
|
||||
_emit_empty()
|
||||
|
||||
# Safety truncate. /context/build respects the budget we sent, but
|
||||
# be defensive in case of a regression.
|
||||
if len(pack) > CONTEXT_BUDGET + 500:
|
||||
pack = pack[:CONTEXT_BUDGET] + "\n\n[context truncated]"
|
||||
|
||||
# Wrap so the LLM knows this is injected grounding, not user text.
|
||||
wrapped = (
|
||||
"---\n"
|
||||
"AtoCore-injected context for this prompt "
|
||||
f"(project={project or '(none)'}):\n\n"
|
||||
f"{pack}\n"
|
||||
"---"
|
||||
)
|
||||
|
||||
print(
|
||||
f"inject_context: injected {len(pack)} chars "
|
||||
f"(project={project or 'none'}, prompt_chars={len(prompt)})",
|
||||
file=sys.stderr,
|
||||
)
|
||||
_emit_context(wrapped)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
284
docs/MASTER-BRAIN-PLAN.md
Normal file
284
docs/MASTER-BRAIN-PLAN.md
Normal file
@@ -0,0 +1,284 @@
|
||||
# AtoCore Master Brain Plan
|
||||
|
||||
> Vision: AtoCore becomes the **single source of truth** that grounds every LLM
|
||||
> interaction across the entire ecosystem (Claude, OpenClaw, Codex, Ollama, future
|
||||
> agents). Every prompt is automatically enriched with full project context. The
|
||||
> brain self-grows from daily work, auto-organizes its metadata, and stays
|
||||
> flawlessly reliable.
|
||||
|
||||
## The Core Insight
|
||||
|
||||
AtoCore today is a **well-architected capture + curation system with a critical
|
||||
gap on the consumption side**. We pour water into the bucket (capture from
|
||||
Claude Code Stop hook + OpenClaw message hooks) but nothing is drinking from it
|
||||
at prompt time. Fixing that gap is the single highest-leverage move.
|
||||
|
||||
**Once every LLM call is AtoCore-grounded automatically, the feedback loop
|
||||
closes**: LLMs use the context → produce better responses → those responses
|
||||
reference the injected memories → reinforcement fires → knowledge curates
|
||||
itself. The capture side is already working. The pull side is what's missing.
|
||||
|
||||
## Universal Consumption Strategy
|
||||
|
||||
MCP is great for Claude (Claude Desktop, Claude Code, Cursor, Zed, Windsurf) but
|
||||
is **not universal**. OpenClaw has its own plugin SDK. Codex, Ollama, and GPT
|
||||
don't natively support MCP. The right strategy:
|
||||
|
||||
**HTTP API is the truth; every client gets the thinnest possible adapter.**
|
||||
|
||||
```
|
||||
┌─────────────────────┐
|
||||
│ AtoCore HTTP API │ ← canonical interface
|
||||
│ /context/build │
|
||||
│ /query │
|
||||
│ /memory │
|
||||
│ /project/state │
|
||||
└──────────┬──────────┘
|
||||
│
|
||||
┌────────────┬───────────┼──────────┬────────────┐
|
||||
│ │ │ │ │
|
||||
┌──┴───┐ ┌────┴────┐ ┌───┴───┐ ┌───┴────┐ ┌───┴────┐
|
||||
│ MCP │ │OpenClaw │ │Claude │ │ Codex │ │ Ollama │
|
||||
│server│ │ plugin │ │ Code │ │ skill │ │ proxy │
|
||||
│ │ │ (pull) │ │ hook │ │ │ │ │
|
||||
└──┬───┘ └────┬────┘ └───┬───┘ └────┬───┘ └────┬───┘
|
||||
│ │ │ │ │
|
||||
Claude OpenClaw Claude Code Codex CLI Ollama
|
||||
Desktop, agent local
|
||||
Cursor, models
|
||||
Zed,
|
||||
Windsurf
|
||||
```
|
||||
|
||||
Each adapter's only job: accept a prompt, call AtoCore HTTP, prepend the
|
||||
returned context pack. The adapter itself carries no logic.
|
||||
|
||||
## Three Integration Tiers
|
||||
|
||||
### Tier 1: MCP-native clients (Claude ecosystem)
|
||||
Build **atocore-mcp** — a standalone MCP server that wraps the HTTP API. Exposes:
|
||||
- `context(query, project)` → context pack
|
||||
- `search(query)` → raw retrieval
|
||||
- `remember(type, content, project)` → create candidate memory
|
||||
- `recall(project, key)` → project state lookup
|
||||
- `list_projects()` → registered projects
|
||||
|
||||
Works with Claude Desktop, Claude Code (via `claude mcp add atocore`), Cursor,
|
||||
Zed, Windsurf without any per-client work beyond config.
|
||||
|
||||
### Tier 2: Custom plugin ecosystems (OpenClaw)
|
||||
Extend the existing `atocore-capture` plugin on T420 to also register a
|
||||
**`before_prompt_build`** hook that pulls context from AtoCore and injects it
|
||||
into the agent's system prompt. The plugin already has the HTTP client, the
|
||||
authentication, the fail-open pattern. This is ~30 lines of added code.
|
||||
|
||||
### Tier 3: Everything else (Codex, Ollama, custom agents)
|
||||
For clients without plugin/hook systems, ship a **thin proxy/middleware** the
|
||||
user configures as the LLM endpoint:
|
||||
- `atocore-proxy` listens on `localhost:PORT`
|
||||
- Intercepts OpenAI-compatible chat/completion calls
|
||||
- Pulls context from AtoCore, injects into system prompt
|
||||
- Forwards to the real model endpoint (OpenAI, Ollama, Anthropic, etc.)
|
||||
- Returns the response, then captures the interaction back to AtoCore
|
||||
|
||||
This makes AtoCore a "drop-in" layer for anything that speaks
|
||||
OpenAI-compatible HTTP — which is nearly every modern LLM runtime.
|
||||
|
||||
## Knowledge Density Plan
|
||||
|
||||
The brain is only as smart as what it knows. Current state: 80 active memories
|
||||
across 6 projects, 324 candidates in the queue being processed. Target:
|
||||
**1,000+ curated memories** to become a real master brain.
|
||||
|
||||
Mechanisms:
|
||||
1. **Finish the current triage pass** (324 → ~80 more promotions expected).
|
||||
2. **Re-extract with stronger prompt on existing 236 interactions** — tune the
|
||||
LLM extractor system prompt to pull more durable facts and fewer ephemeral
|
||||
snapshots.
|
||||
3. **Ingest all drive/vault documents as memory candidates** (not just chunks).
|
||||
Every structured markdown section with a decision/fact/requirement header
|
||||
becomes a candidate memory.
|
||||
4. **Multi-source triangulation**: same fact in 3+ sources = auto-promote to
|
||||
confidence 0.95.
|
||||
5. **Cross-project synthesis**: facts appearing in multiple project contexts
|
||||
get promoted to global domain knowledge.
|
||||
|
||||
## Auto-Organization of Metadata
|
||||
|
||||
Currently: `type`, `project`, `confidence`, `status`, `reference_count`. For
|
||||
master brain we need more structure, inferred automatically:
|
||||
|
||||
| Addition | Purpose | Mechanism |
|
||||
|---|---|---|
|
||||
| **Domain tags** (optics, mechanics, firmware, business…) | Cross-cutting retrieval | LLM inference during triage |
|
||||
| **Temporal scope** (permanent, valid_until_X, transient) | Avoid stale truth | LLM classifies during triage |
|
||||
| **Source refs** (chunk_id[], interaction_id[]) | Provenance for every fact | Enforced at creation time |
|
||||
| **Relationships** (contradicts, updates, depends_on) | Memory graph | Triage infers during review |
|
||||
| **Semantic clusters** | Detect duplicates, find gaps | Weekly HDBSCAN pass on embeddings |
|
||||
|
||||
Layer these in progressively — none of them require schema rewrites, just
|
||||
additional fields and batch jobs.
|
||||
|
||||
## Self-Growth Mechanisms
|
||||
|
||||
Four loops that make AtoCore grow autonomously:
|
||||
|
||||
### 1. Drift detection (nightly)
|
||||
Compare new chunk embeddings to existing vector distribution. Centroids >X
|
||||
cosine distance from any existing centroid = new knowledge area. Log to
|
||||
dashboard; human decides if it's noise or a domain worth curating.
|
||||
|
||||
### 2. Gap identification (continuous)
|
||||
Every `/context/build` logs `query + chunks_returned + memories_returned`.
|
||||
Weekly report: "top 10 queries with weak coverage." Those are targeted
|
||||
curation opportunities.
|
||||
|
||||
### 3. Multi-source triangulation (weekly)
|
||||
Scan memory content similarity across sources. When a fact appears in 3+
|
||||
independent sources (vault doc + drive doc + interaction), auto-promote to
|
||||
high confidence and mark as "triangulated."
|
||||
|
||||
### 4. Active learning prompts (monthly)
|
||||
Surface "you have 200 p06 memories but only 15 p04 memories. Spend 30 min
|
||||
curating p04?" via dashboard digest.
|
||||
|
||||
## Robustness Strategy (Flawless Operation Bar)
|
||||
|
||||
Current: nightly backup, off-host rsync, health endpoint, 303 tests, harness,
|
||||
enhanced dashboard with pipeline health (this session).
|
||||
|
||||
To reach "flawless":
|
||||
|
||||
| Gap | Fix | Priority |
|
||||
|---|---|---|
|
||||
| Silent pipeline failures | Alerting webhook on harness drop / pipeline skip | P1 |
|
||||
| Memory mutations untracked | Append-only audit log table | P1 |
|
||||
| Integrity drift | Nightly FK + vector-chunk parity checks | P1 |
|
||||
| Schema migrations ad-hoc | Formal migration framework with rollback | P2 |
|
||||
| Single point of failure | Daily backup to user's main computer (new) | P1 |
|
||||
| No hot standby | Second instance following primary via WAL | P3 |
|
||||
| No temporal history | Memory audit + valid_until fields | P2 |
|
||||
|
||||
### Daily Backup to Main Computer
|
||||
|
||||
Currently: Dalidou → T420 (192.168.86.39) via rsync.
|
||||
|
||||
Add: Dalidou → main computer via a pull (main computer runs the rsync,
|
||||
pulls from Dalidou). Pull-based is simpler than push — no need for SSH
|
||||
keys on Dalidou to reach the Windows machine.
|
||||
|
||||
```bash
|
||||
# On main computer, daily scheduled task:
|
||||
rsync -a papa@dalidou:/srv/storage/atocore/backups/snapshots/ \
|
||||
/path/to/local/atocore-backups/
|
||||
```
|
||||
|
||||
Configure via Windows Task Scheduler or a cron-like runner. Verify weekly
|
||||
that the latest snapshot is present.
|
||||
|
||||
## Human Interface Auto-Evolution
|
||||
|
||||
Current: wiki at `/wiki`, regenerates on every request from DB. Synthesis
|
||||
(the "current state" paragraph at top of project pages) runs **weekly on
|
||||
Sundays only**. That's why it feels stalled.
|
||||
|
||||
Fixes:
|
||||
1. **Run synthesis daily, not weekly.** It's cheap (one claude call per
|
||||
project) and keeps the human-readable overview fresh.
|
||||
2. **Trigger synthesis on major events** — when 5+ new memories land for a
|
||||
project, regenerate its synthesis.
|
||||
3. **Add "What's New" feed** — wiki homepage shows recent additions across all
|
||||
projects (last 7 days of memory promotions, state entries, entities).
|
||||
4. **Memory timeline view** — project page gets a chronological list of what
|
||||
we learned when.
|
||||
|
||||
## Phased Roadmap (8-10 weeks)
|
||||
|
||||
### Phase 1 (week 1-2): Universal Consumption
|
||||
**Goal: every LLM call is AtoCore-grounded automatically.**
|
||||
|
||||
- [ ] Build `atocore-mcp` server (wraps HTTP API, stdio transport)
|
||||
- [ ] Publish to npm / or run via `pipx` / stdlib HTTP
|
||||
- [ ] Configure in Claude Desktop (`~/.claude/mcp_servers.json`)
|
||||
- [ ] Configure in Claude Code (`claude mcp add atocore …`)
|
||||
- [ ] Extend OpenClaw plugin with `before_prompt_build` PULL
|
||||
- [ ] Write `atocore-proxy` middleware for Codex/Ollama/generic clients
|
||||
- [ ] Document configuration for each client
|
||||
|
||||
**Success:** open a fresh Claude Code session, ask a project question, verify
|
||||
the response references AtoCore memories without manual context commands.
|
||||
|
||||
### Phase 2 (week 2-3): Knowledge Density + Wiki Evolution
|
||||
- [ ] Finish current triage pass (324 candidates → active)
|
||||
- [ ] Tune extractor prompt for higher promotion rate on durable facts
|
||||
- [ ] Daily synthesis in cron (not just Sundays)
|
||||
- [ ] Event-triggered synthesis on significant project changes
|
||||
- [ ] Wiki "What's New" feed
|
||||
- [ ] Memory timeline per project
|
||||
|
||||
**Target:** 300+ active memories, wiki feels alive daily.
|
||||
|
||||
### Phase 3 (week 3-4): Auto-Organization
|
||||
- [ ] Schema: add `domain_tags`, `valid_until`, `source_refs`, `triangulated_count`
|
||||
- [ ] Triage prompt upgraded: infer tags + temporal scope + relationships
|
||||
- [ ] Weekly HDBSCAN clustering of embeddings → dup detection + gap reports
|
||||
- [ ] Relationship edges in a new `memory_relationships` table
|
||||
|
||||
### Phase 4 (week 4-5): Robustness Hardening
|
||||
- [ ] Append-only `memory_audit` table + retrofit mutations
|
||||
- [ ] Nightly integrity checks (FK validation, orphan detection, parity)
|
||||
- [ ] Alerting webhook (Discord/email) on pipeline anomalies
|
||||
- [ ] Daily backup to user's main computer (pull-based)
|
||||
- [ ] Formal migration framework
|
||||
|
||||
### Phase 5 (week 6-7): Engineering V1 Implementation
|
||||
Execute the 23 acceptance criteria in `docs/architecture/engineering-v1-acceptance.md`
|
||||
against p06-polisher as the test bed. The ontology and queries are designed;
|
||||
this phase implements them.
|
||||
|
||||
### Phase 6 (week 8-9): Self-Growth Loops
|
||||
- [ ] Drift detection (nightly)
|
||||
- [ ] Gap identification from `/context/build` logs
|
||||
- [ ] Multi-source triangulation
|
||||
- [ ] Active learning digest (monthly)
|
||||
- [ ] Cross-project synthesis
|
||||
|
||||
### Phase 7 (ongoing): Scale & Polish
|
||||
- [ ] Multi-model validation (sonnet triages, opus cross-checks on disagreements)
|
||||
- [ ] AtoDrive integration (Google Drive as trusted source)
|
||||
- [ ] Hot standby when real production dependence materializes
|
||||
- [ ] More MCP tools (write-back, memory search, entity queries)
|
||||
|
||||
## Success Criteria
|
||||
|
||||
AtoCore is a master brain when:
|
||||
|
||||
1. **Zero manual context commands.** A fresh Claude/OpenClaw session answering
|
||||
a project question without being told "use AtoCore context."
|
||||
2. **1,000+ active memories** with >90% provenance coverage (every fact
|
||||
traceable to a source).
|
||||
3. **Every project has a current, human-readable overview** updated within 24h
|
||||
of significant changes.
|
||||
4. **Harness stays >95%** across 20+ fixtures covering all active projects.
|
||||
5. **Zero silent pipeline failures** for 30 consecutive days (all failures
|
||||
surface via alert within the hour).
|
||||
6. **Claude on any task knows what we know** — user asks "what did we decide
|
||||
about X?" and the answer is grounded in AtoCore, not reconstructed from
|
||||
scratch.
|
||||
|
||||
## Where We Are Now (2026-04-16)
|
||||
|
||||
- ✅ Core infrastructure: HTTP API, SQLite, Chroma, deploy pipeline
|
||||
- ✅ Capture pipes: Claude Code Stop hook, OpenClaw message hooks
|
||||
- ✅ Nightly pipeline: backup, extract, triage, synthesis, lint, harness, summary
|
||||
- ✅ Phase 10: auto-promotion from reinforcement + candidate expiry
|
||||
- ✅ Dashboard shows pipeline health + interaction totals + all projects
|
||||
- ⚡ 324 candidates being triaged (down from 439), ~80 active memories, growing
|
||||
- ❌ No consumption at prompt time (capture-only)
|
||||
- ❌ Wiki auto-evolves only on Sundays (synthesis cadence)
|
||||
- ❌ No MCP adapter
|
||||
- ❌ No daily backup to main computer
|
||||
- ❌ Engineering V1 not implemented
|
||||
- ❌ No alerting on pipeline failures
|
||||
|
||||
The path is clear. Phase 1 is the keystone.
|
||||
96
docs/PHASE-7-MEMORY-CONSOLIDATION.md
Normal file
96
docs/PHASE-7-MEMORY-CONSOLIDATION.md
Normal file
@@ -0,0 +1,96 @@
|
||||
# Phase 7 — Memory Consolidation (the "Sleep Cycle")
|
||||
|
||||
**Status**: 7A in progress · 7B-H scoped, deferred
|
||||
**Design principle**: *"Like human memory while sleeping, but more robotic — never discard relevant details. Consolidate, update, supersede — don't delete."*
|
||||
|
||||
## Why
|
||||
|
||||
Phases 1–6 built capture + triage + graduation + emerging-project detection. What they don't solve:
|
||||
|
||||
| # | Problem | Fix |
|
||||
|---|---|---|
|
||||
| 1 | Redundancy — "APM uses NX" said 5 different ways across 5 memories | **7A** Semantic dedup |
|
||||
| 2 | Latent contradictions — "chose Zygo" + "switched from Zygo" both active | **7B** Pair contradiction detection |
|
||||
| 3 | Tag drift — `firmware`, `fw`, `firmware-control` fragment retrieval | **7C** Tag canonicalization |
|
||||
| 4 | Confidence staleness — 6-month unreferenced memory ranks as fresh | **7D** Confidence decay |
|
||||
| 5 | No memory drill-down page | **7E** `/wiki/memories/{id}` |
|
||||
| 6 | Domain knowledge siloed per project | **7F** `/wiki/domains/{tag}` |
|
||||
| 7 | Prompt upgrades (llm-0.5 → 0.6) don't re-process old interactions | **7G** Re-extraction on version bump |
|
||||
| 8 | Superseded memory vectors still in Chroma polluting retrieval | **7H** Vector hygiene |
|
||||
|
||||
Collectively: the brain needs a nightly pass that looks at what it already knows and tidies up — dedup, resolve contradictions, canonicalize tags, decay stale facts — **without losing information**.
|
||||
|
||||
## Subphases
|
||||
|
||||
### 7A — Semantic dedup + consolidation *(this sprint)*
|
||||
|
||||
Compute embeddings on active memories, find pairs within `(project, memory_type)` bucket above similarity threshold (default 0.88), cluster, draft a unified memory via LLM, human approves in triage UI. On approve: sources become `superseded`, new merged memory created with union of `source_refs`, sum of `reference_count`, max of `confidence`. **Ships first** because redundancy compounds — every new memory potentially duplicates an old one.
|
||||
|
||||
Detailed spec lives in the working plan (`dapper-cooking-tower.md`) and across the files listed under "Files touched" below. Key decisions:
|
||||
|
||||
- LLM drafts, human approves — no silent auto-merge.
|
||||
- Same `(project, memory_type)` bucket only. Cross-project merges are rare + risky → separate flow in 7B.
|
||||
- Recompute embeddings each scan (~2s / 335 memories). Persist only if scan time becomes a problem.
|
||||
- Cluster-based proposals (A~B~C → one merge), not pair-based.
|
||||
- `status=superseded` never deleted — still queryable with filter.
|
||||
|
||||
**Schema**: new table `memory_merge_candidates` (pending | approved | rejected).
|
||||
**Cron**: nightly at threshold 0.90 (tight); weekly (Sundays) at 0.85 (deeper cleanup).
|
||||
**UI**: new "🔗 Merge Candidates" section in `/admin/triage`.
|
||||
|
||||
**Files touched in 7A**:
|
||||
- `src/atocore/models/database.py` — migration
|
||||
- `src/atocore/memory/similarity.py` — new, `compute_memory_similarity()`
|
||||
- `src/atocore/memory/_dedup_prompt.py` — new, shared LLM prompt
|
||||
- `src/atocore/memory/service.py` — `merge_memories()`
|
||||
- `scripts/memory_dedup.py` — new, host-side detector (HTTP-only)
|
||||
- `src/atocore/api/routes.py` — 5 new endpoints under `/admin/memory/`
|
||||
- `src/atocore/engineering/triage_ui.py` — merge cards section
|
||||
- `deploy/dalidou/batch-extract.sh` — Step B3
|
||||
- `deploy/dalidou/dedup-watcher.sh` — new, UI-triggered scans
|
||||
- `tests/test_memory_dedup.py` — ~10-15 new tests
|
||||
|
||||
### 7B — Memory-to-memory contradiction detection
|
||||
|
||||
Same embedding-pair machinery as 7A but within a *different* band (similarity 0.70–0.88 — semantically related but different wording). LLM classifies each pair: `duplicate | complementary | contradicts | supersedes-older`. Contradictions write a `memory_conflicts` row + surface a triage badge. Clear supersessions (both tier 1 sonnet and tier 2 opus agree) auto-mark the older as `superseded`.
|
||||
|
||||
### 7C — Tag canonicalization
|
||||
|
||||
Weekly LLM pass over `domain_tags` distribution, proposes `alias → canonical` map (e.g. `fw → firmware`). Human approves via UI (one-click pattern, same as emerging-project registration). Bulk-rewrites `domain_tags` atomically across all memories.
|
||||
|
||||
### 7D — Confidence decay
|
||||
|
||||
Daily lightweight job. For memories with `reference_count=0` AND `last_referenced_at` older than 30 days: multiply confidence by 0.97/day (~2-month half-life). Reinforcement already bumps confidence. Below 0.3 → auto-supersede with reason `decayed, no references`. Reversible (tune half-life), non-destructive (still searchable with status filter).
|
||||
|
||||
### 7E — Memory detail page `/wiki/memories/{id}`
|
||||
|
||||
Provenance chain: source_chunk → interaction → graduated_to_entity. Audit trail (Phase 4 has the data). Related memories (same project + tag + semantic neighbors). Decay trajectory plot (if 7D ships). Link target from every memory surfaced anywhere in the wiki.
|
||||
|
||||
### 7F — Cross-project domain view `/wiki/domains/{tag}`
|
||||
|
||||
One page per `domain_tag` showing all memories + graduated entities with that tag, grouped by project. "Optics across p04+p05+p06" becomes a real navigable page. Answers the long-standing question the tag system was meant to enable.
|
||||
|
||||
### 7G — Re-extraction on prompt upgrade
|
||||
|
||||
`batch_llm_extract_live.py --force-reextract --since DATE`. Dedupe key: `(interaction_id, extractor_version)` — same run on same interaction doesn't double-create. Triggered manually when `LLM_EXTRACTOR_VERSION` bumps. Not automatic (destructive).
|
||||
|
||||
### 7H — Vector store hygiene
|
||||
|
||||
Nightly: scan `source_chunks` and `memory_embeddings` (added in 7A V2) for `status=superseded|invalid`. Delete matching vectors from Chroma. Fail-open — the retrieval harness catches any real regression.
|
||||
|
||||
## Verification & ship order
|
||||
|
||||
1. **7A** — ship + observe 1 week → validate merge proposals are high-signal, rejection rate acceptable
|
||||
2. **7D** — decay is low-risk + high-compounding value; ship second
|
||||
3. **7C** — clean up tag fragmentation before 7F depends on canonical tags
|
||||
4. **7E** + **7F** — UX surfaces; ship together once data is clean
|
||||
5. **7B** — contradictions flow (pairs harder than duplicates to classify; wait for 7A data to tune threshold)
|
||||
6. **7G** — on-demand; no ship until we actually bump the extractor prompt
|
||||
7. **7H** — housekeeping; after 7A + 7B + 7D have generated enough `superseded` rows to matter
|
||||
|
||||
## Scope NOT in Phase 7
|
||||
|
||||
- Graduated memories (entity-descended) are **frozen** — exempt from dedup/decay. Entity consolidation is a separate Phase (8+).
|
||||
- Auto-merging without human approval (always human-in-the-loop in V1).
|
||||
- Summarization / compression — a different problem (reducing the number of chunks per memory, not the number of memories).
|
||||
- Forgetting policies — there's no user-facing "delete this" flow in Phase 7. Supersede + filter covers the need.
|
||||
45
docs/capture-surfaces.md
Normal file
45
docs/capture-surfaces.md
Normal file
@@ -0,0 +1,45 @@
|
||||
# AtoCore — sanctioned capture surfaces
|
||||
|
||||
**Scope statement**: AtoCore captures conversations from **two surfaces only**. Everything else is intentionally out of scope.
|
||||
|
||||
| Surface | Hooks | Status |
|
||||
|---|---|---|
|
||||
| **Claude Code** (local CLI) | `Stop` (capture) + `UserPromptSubmit` (context injection) | both installed |
|
||||
| **OpenClaw** (agent framework on T420) | `before_agent_start` (context injection) + `llm_output` (capture) | both installed (v0.2.0 plugin, Phase 7I) |
|
||||
|
||||
Both surfaces are **symmetric** — push (capture) and pull (context injection on prompt submit) — so AtoCore learns from every turn AND every turn is grounded in what AtoCore already knows.
|
||||
|
||||
## Why these two?
|
||||
|
||||
- **Stable hook APIs.** Claude Code exposes `Stop` and `UserPromptSubmit` lifecycle hooks with documented JSON contracts. OpenClaw exposes `before_agent_start` and `llm_output`. Both run locally where we control the process.
|
||||
- **Passive from the user's perspective.** No paste, no manual capture command, no "remember this" prompt. You just use the tool and AtoCore absorbs everything durable.
|
||||
- **Failure is graceful.** If AtoCore is down, hooks exit 0 with no output — the user's turn proceeds uninterrupted.
|
||||
|
||||
## Why not Claude Desktop / Claude.ai web / Claude mobile / ChatGPT / …?
|
||||
|
||||
- Claude Desktop has MCP but no `Stop`-equivalent hook for auto-capture; auto-capture would require system-prompt coercion ("call atocore_remember every turn"), which is fragile.
|
||||
- Claude.ai web has no hook surface — would need a browser extension (real project, not shipped).
|
||||
- Claude mobile app has neither hooks nor MCP — nothing to wire into.
|
||||
- ChatGPT etc. — same as above.
|
||||
|
||||
**Anthropic API log polling is explicitly prohibited.**
|
||||
|
||||
If you find yourself wanting to capture from one of these, the real answer is: use Claude Code or OpenClaw for the work that matters. Don't paste chat transcripts into AtoCore — that contradicts the whole design principle of passive capture.
|
||||
|
||||
A `/wiki/capture` fallback form still exists (the endpoint `/interactions` is public) but it is **not promoted in the UI** and is documented as a last-resort escape hatch. If you're reaching for it, something is wrong with your workflow, not with AtoCore.
|
||||
|
||||
## Hook files
|
||||
|
||||
- `deploy/hooks/capture_stop.py` — Claude Code Stop → POSTs `/interactions`
|
||||
- `deploy/hooks/inject_context.py` — Claude Code UserPromptSubmit → POSTs `/context/build`, returns pack via `hookSpecificOutput.additionalContext`
|
||||
- `openclaw-plugins/atocore-capture/index.js` — OpenClaw plugin v0.2.0: capture + context injection
|
||||
|
||||
Both Claude Code hooks share a `_infer_project` table mapping cwd to project slug. Keep them in sync when adding a new project path.
|
||||
|
||||
## Kill switches
|
||||
|
||||
- `ATOCORE_CAPTURE_DISABLED=1` → skip Stop capture
|
||||
- `ATOCORE_CONTEXT_DISABLED=1` → skip UserPromptSubmit injection
|
||||
- OpenClaw plugin config `injectContext: false` → skip context injection (capture still fires)
|
||||
|
||||
All three are documented in the respective hook/plugin files.
|
||||
@@ -1,275 +1,49 @@
|
||||
# AtoCore Current State
|
||||
# AtoCore — Current State (2026-04-19)
|
||||
|
||||
## Status Summary
|
||||
Live deploy: `877b97e` · Dalidou health: ok · Harness: 17/18.
|
||||
|
||||
AtoCore is no longer just a proof of concept. The local engine exists, the
|
||||
correctness pass is complete, Dalidou now hosts the canonical runtime and
|
||||
machine-storage location, and the T420/OpenClaw side now has a safe read-only
|
||||
path to consume AtoCore. The live corpus is no longer just self-knowledge: it
|
||||
now includes a first curated ingestion batch for the active projects.
|
||||
## The numbers
|
||||
|
||||
## Phase Assessment
|
||||
| | count |
|
||||
|---|---|
|
||||
| Active memories | 266 (180 project, 31 preference, 24 knowledge, 17 adaptation, 11 episodic, 3 identity) |
|
||||
| Candidates pending | **0** (autonomous triage drained the queue) |
|
||||
| Interactions captured | 605 (250 claude-code, 351 openclaw) |
|
||||
| Entities (typed graph) | 50 |
|
||||
| Vectors in Chroma | 33K+ |
|
||||
| Projects | 6 registered (p04, p05, p06, abb-space, atomizer-v2, atocore) + apm emerging (2 memories, below auto-register threshold) |
|
||||
| Unique domain tags | 210 |
|
||||
| Tests | 440 passing |
|
||||
|
||||
- completed
|
||||
- Phase 0
|
||||
- Phase 0.5
|
||||
- Phase 1
|
||||
- baseline complete
|
||||
- Phase 2
|
||||
- Phase 3
|
||||
- Phase 5
|
||||
- Phase 7
|
||||
- Phase 9 (Commits A/B/C: capture, reinforcement, extractor + review queue)
|
||||
- partial
|
||||
- Phase 4
|
||||
- Phase 8
|
||||
- not started
|
||||
- Phase 6
|
||||
- Phase 10
|
||||
- Phase 11
|
||||
- Phase 12
|
||||
- Phase 13
|
||||
## Autonomous pipeline — what runs without me
|
||||
|
||||
## What Exists Today
|
||||
| When | Job | Does |
|
||||
|---|---|---|
|
||||
| every hour | `hourly-extract.sh` | Pulls new interactions → LLM extraction → 3-tier auto-triage (sonnet → opus → discard/human). 0 pending candidates right now = autonomy is working. |
|
||||
| every 2 min | `dedup-watcher.sh` | Services UI-triggered dedup scans |
|
||||
| daily 03:00 UTC | Full nightly (`batch-extract.sh`) | Extract · triage · auto-promote reinforced · synthesis · harness · dedup (0.90) · emerging detector · transient→durable · **confidence decay (7D)** · integrity check · alerts |
|
||||
| Sundays | +Weekly deep pass | Knowledge-base lint · dedup @ 0.85 · **tag canonicalization (7C)** |
|
||||
|
||||
- ingestion pipeline
|
||||
- parser and chunker
|
||||
- SQLite-backed memory and project state
|
||||
- vector retrieval
|
||||
- context builder
|
||||
- API routes for query, context, health, and source status
|
||||
- project registry and per-project refresh foundation
|
||||
- project registration lifecycle:
|
||||
- template
|
||||
- proposal preview
|
||||
- approved registration
|
||||
- safe update of existing project registrations
|
||||
- refresh
|
||||
- implementation-facing architecture notes for:
|
||||
- engineering knowledge hybrid architecture
|
||||
- engineering ontology v1
|
||||
- env-driven storage and deployment paths
|
||||
- Dalidou Docker deployment foundation
|
||||
- initial AtoCore self-knowledge corpus ingested on Dalidou
|
||||
- T420/OpenClaw read-only AtoCore helper skill
|
||||
- full active-project markdown/text corpus wave for:
|
||||
- `p04-gigabit`
|
||||
- `p05-interferometer`
|
||||
- `p06-polisher`
|
||||
Last nightly run (2026-04-19 03:00 UTC): **31 promoted · 39 rejected · 0 needs human**. That's the brain self-organizing.
|
||||
|
||||
## What Is True On Dalidou
|
||||
## Phase 7 — Memory Consolidation status
|
||||
|
||||
- deployed repo location:
|
||||
- `/srv/storage/atocore/app`
|
||||
- canonical machine DB location:
|
||||
- `/srv/storage/atocore/data/db/atocore.db`
|
||||
- canonical vector store location:
|
||||
- `/srv/storage/atocore/data/chroma`
|
||||
- source input locations:
|
||||
- `/srv/storage/atocore/sources/vault`
|
||||
- `/srv/storage/atocore/sources/drive`
|
||||
| Subphase | What | Status |
|
||||
|---|---|---|
|
||||
| 7A | Semantic dedup + merge lifecycle | live |
|
||||
| 7A.1 | Tiered auto-approve (sonnet ≥0.8 + sim ≥0.92 → merge; opus escalation; human only for ambiguous) | live |
|
||||
| 7B | Memory-to-memory contradiction detection (0.70–0.88 band, classify duplicate/contradicts/supersedes) | deferred, needs 7A signal |
|
||||
| 7C | Tag canonicalization (weekly; auto-apply ≥0.8 confidence; protects project tokens) | live (first run: 0 proposals — vocabulary is clean) |
|
||||
| 7D | Confidence decay (0.97/day on idle unreferenced; auto-supersede below 0.3) | live (first run: 0 decayed — nothing idle+unreferenced yet) |
|
||||
| 7E | `/wiki/memories/{id}` detail page | pending |
|
||||
| 7F | `/wiki/domains/{tag}` cross-project view | pending (wants 7C + more usage first) |
|
||||
| 7G | Re-extraction on prompt version bump | pending |
|
||||
| 7H | Chroma vector hygiene (delete vectors for superseded memories) | pending |
|
||||
|
||||
The service and storage foundation are live on Dalidou.
|
||||
## Known gaps (honest)
|
||||
|
||||
The machine-data host is real and canonical.
|
||||
|
||||
The project registry is now also persisted in a canonical mounted config path on
|
||||
Dalidou:
|
||||
|
||||
- `/srv/storage/atocore/config/project-registry.json`
|
||||
|
||||
The content corpus is partially populated now.
|
||||
|
||||
The Dalidou instance already contains:
|
||||
|
||||
- AtoCore ecosystem and hosting docs
|
||||
- current-state and OpenClaw integration docs
|
||||
- Master Plan V3
|
||||
- Build Spec V1
|
||||
- trusted project-state entries for `atocore`
|
||||
- full staged project markdown/text corpora for:
|
||||
- `p04-gigabit`
|
||||
- `p05-interferometer`
|
||||
- `p06-polisher`
|
||||
- curated repo-context docs for:
|
||||
- `p05`: `Fullum-Interferometer`
|
||||
- `p06`: `polisher-sim`
|
||||
- trusted project-state entries for:
|
||||
- `p04-gigabit`
|
||||
- `p05-interferometer`
|
||||
- `p06-polisher`
|
||||
|
||||
Current live stats after the full active-project wave are now far beyond the
|
||||
initial seed stage:
|
||||
|
||||
- more than `1,100` source documents
|
||||
- more than `20,000` chunks
|
||||
- matching vector count
|
||||
|
||||
The broader long-term corpus is still not fully populated yet. Wider project and
|
||||
vault ingestion remains a deliberate next step rather than something already
|
||||
completed, but the corpus is now meaningfully seeded beyond AtoCore's own docs.
|
||||
|
||||
For human-readable quality review, the current staged project markdown corpus is
|
||||
primarily visible under:
|
||||
|
||||
- `/srv/storage/atocore/sources/vault/incoming/projects`
|
||||
|
||||
This staged area is now useful for review because it contains the markdown/text
|
||||
project docs that were actually ingested for the full active-project wave.
|
||||
|
||||
It is important to read this staged area correctly:
|
||||
|
||||
- it is a readable ingestion input layer
|
||||
- it is not the final machine-memory representation itself
|
||||
- seeing familiar PKM-style notes there is expected
|
||||
- the machine-processed intelligence lives in the DB, chunks, vectors, memory,
|
||||
trusted project state, and context-builder outputs
|
||||
|
||||
## What Is True On The T420
|
||||
|
||||
- SSH access is working
|
||||
- OpenClaw workspace inspected at `/home/papa/clawd`
|
||||
- OpenClaw's own memory system remains unchanged
|
||||
- a read-only AtoCore integration skill exists in the workspace:
|
||||
- `/home/papa/clawd/skills/atocore-context/`
|
||||
- the T420 can successfully reach Dalidou AtoCore over network/Tailscale
|
||||
- fail-open behavior has been verified for the helper path
|
||||
- OpenClaw can now seed AtoCore in two distinct ways:
|
||||
- project-scoped memory entries
|
||||
- staged document ingestion into the retrieval corpus
|
||||
- the helper now supports the practical registered-project lifecycle:
|
||||
- projects
|
||||
- project-template
|
||||
- propose-project
|
||||
- register-project
|
||||
- update-project
|
||||
- refresh-project
|
||||
- the helper now also supports the first organic routing layer:
|
||||
- `detect-project "<prompt>"`
|
||||
- `auto-context "<prompt>" [budget] [project]`
|
||||
- OpenClaw can now default to AtoCore for project-knowledge questions without
|
||||
requiring explicit helper commands from the human every time
|
||||
|
||||
## What Exists In Memory vs Corpus
|
||||
|
||||
These remain separate and that is intentional.
|
||||
|
||||
In `/memory`:
|
||||
|
||||
- project-scoped curated memories now exist for:
|
||||
- `p04-gigabit`: 5 memories
|
||||
- `p05-interferometer`: 6 memories
|
||||
- `p06-polisher`: 8 memories
|
||||
|
||||
These are curated summaries and extracted stable project signals.
|
||||
|
||||
In `source_documents` / retrieval corpus:
|
||||
|
||||
- full project markdown/text corpora are now present for the active project set
|
||||
- retrieval is no longer limited to AtoCore self-knowledge only
|
||||
- the current corpus is broad enough that ranking quality matters more than
|
||||
corpus presence alone
|
||||
- underspecified prompts can still pull in historical or archive material, so
|
||||
project-aware routing and better ranking remain important
|
||||
|
||||
The source refresh model now has a concrete foundation in code:
|
||||
|
||||
- a project registry file defines known project ids, aliases, and ingest roots
|
||||
- the API can list registered projects
|
||||
- the API can return a registration template
|
||||
- the API can preview a registration without mutating state
|
||||
- the API can persist an approved registration
|
||||
- the API can update an existing registered project without changing its canonical id
|
||||
- the API can refresh one registered project at a time
|
||||
|
||||
This lifecycle is now coherent end to end for normal use.
|
||||
|
||||
The first live update passes on existing registered projects have now been
|
||||
verified against `p04-gigabit` and `p05-interferometer`:
|
||||
|
||||
- the registration description can be updated safely
|
||||
- the canonical project id remains unchanged
|
||||
- refresh still behaves cleanly after the update
|
||||
- `context/build` still returns useful project-specific context afterward
|
||||
|
||||
## Reliability Baseline
|
||||
|
||||
The runtime has now been hardened in a few practical ways:
|
||||
|
||||
- SQLite connections use a configurable busy timeout
|
||||
- SQLite uses WAL mode to reduce transient lock pain under normal concurrent use
|
||||
- project registry writes are atomic file replacements rather than in-place rewrites
|
||||
- a full runtime backup and restore path now exists and has been exercised on
|
||||
live Dalidou:
|
||||
- SQLite (hot online backup via `conn.backup()`)
|
||||
- project registry (file copy)
|
||||
- Chroma vector store (cold directory copy under `exclusive_ingestion()`)
|
||||
- backup metadata
|
||||
- `restore_runtime_backup()` with CLI entry point
|
||||
(`python -m atocore.ops.backup restore <STAMP>
|
||||
--confirm-service-stopped`), pre-restore safety snapshot for
|
||||
rollback, WAL/SHM sidecar cleanup, `PRAGMA integrity_check`
|
||||
on the restored file
|
||||
- the first live drill on 2026-04-09 surfaced and fixed a Chroma
|
||||
restore bug on Docker bind-mounted volumes (`shutil.rmtree`
|
||||
on a mount point); a regression test now asserts the
|
||||
destination inode is stable across restore
|
||||
- deploy provenance is visible end-to-end:
|
||||
- `/health` reports `build_sha`, `build_time`, `build_branch`
|
||||
from env vars wired by `deploy.sh`
|
||||
- `deploy.sh` Step 6 verifies the live `build_sha` matches the
|
||||
just-built commit (exit code 6 on drift) so "live is current?"
|
||||
can be answered precisely, not just by `__version__`
|
||||
- `deploy.sh` Step 1.5 detects that the script itself changed
|
||||
in the pulled commit and re-execs into the fresh copy, so
|
||||
the deploy never silently runs the old script against new source
|
||||
|
||||
This does not eliminate every concurrency edge, but it materially improves the
|
||||
current operational baseline.
|
||||
|
||||
In `Trusted Project State`:
|
||||
|
||||
- each active seeded project now has a conservative trusted-state set
|
||||
- promoted facts cover:
|
||||
- summary
|
||||
- core architecture or boundary decision
|
||||
- key constraints
|
||||
- next focus
|
||||
|
||||
This separation is healthy:
|
||||
|
||||
- memory stores distilled project facts
|
||||
- corpus stores the underlying retrievable documents
|
||||
|
||||
## Immediate Next Focus
|
||||
|
||||
1. ~~Re-run the full backup/restore drill~~ — DONE 2026-04-11,
|
||||
full pass (db, registry, chroma, integrity all true)
|
||||
2. ~~Turn on auto-capture of Claude Code sessions in conservative
|
||||
mode~~ — DONE 2026-04-11, Stop hook wired via
|
||||
`deploy/hooks/capture_stop.py` → `POST /interactions`
|
||||
with `reinforce=false`; kill switch via
|
||||
`ATOCORE_CAPTURE_DISABLED=1`
|
||||
3. Run a short real-use pilot with auto-capture on, verify
|
||||
interactions are landing in Dalidou, review quality
|
||||
4. Use the new T420-side organic routing layer in real OpenClaw workflows
|
||||
4. Tighten retrieval quality for the now fully ingested active project corpora
|
||||
5. Move to Wave 2 trusted-operational ingestion instead of blindly widening raw corpus further
|
||||
6. Keep the new engineering-knowledge architecture docs as implementation guidance while avoiding premature schema work
|
||||
7. Expand the remaining boring operations baseline:
|
||||
- retention policy cleanup script
|
||||
- off-Dalidou backup target (rsync or similar)
|
||||
8. Only later consider write-back, reflection, or deeper autonomous behaviors
|
||||
|
||||
See also:
|
||||
|
||||
- [ingestion-waves.md](C:/Users/antoi/ATOCore/docs/ingestion-waves.md)
|
||||
- [master-plan-status.md](C:/Users/antoi/ATOCore/docs/master-plan-status.md)
|
||||
|
||||
## Guiding Constraints
|
||||
|
||||
- bad memory is worse than no memory
|
||||
- trusted project state must remain highest priority
|
||||
- human-readable sources and machine storage stay separate
|
||||
- OpenClaw integration must not degrade OpenClaw baseline behavior
|
||||
1. **Capture surface is Claude-Code-and-OpenClaw only.** Conversations in Claude Desktop, Claude.ai web, phone, or any other LLM UI are NOT captured. Example: the rotovap/mushroom chat yesterday never reached AtoCore because no hook fired. See Q4 below.
|
||||
2. **OpenClaw is capture-only, not context-grounded.** The plugin POSTs `/interactions` on `llm_output` but does NOT call `/context/build` on `before_agent_start`. OpenClaw's underlying agent runs blind. See Q2 below.
|
||||
3. **Human interface (wiki) is thin and static.** 5 project cards + a "System" line. No dashboard for the autonomous activity. No per-memory detail page. See Q3/Q5.
|
||||
4. **Harness 17/18** — the `p04-constraints` fixture wants "Zerodur" but retrieval surfaces related-not-exact terms. Content gap, not a retrieval regression.
|
||||
5. **Two projects under-populated**: p05-interferometer (4 memories, 18 state) and atomizer-v2 (1 memory, 6 state). Batch re-extract with the new llm-0.6.0 prompt would help.
|
||||
|
||||
@@ -126,25 +126,29 @@ This sits implicitly between Phase 8 (OpenClaw) and Phase 11
|
||||
(multi-model). Memory-review and engineering-entity commands are
|
||||
deferred from the shared client until their workflows are exercised.
|
||||
|
||||
## What Is Real Today (updated 2026-04-12)
|
||||
## What Is Real Today (updated 2026-04-16)
|
||||
|
||||
- canonical AtoCore runtime on Dalidou (build_sha tracked, deploy.sh verified)
|
||||
- 33,253 vectors across 5 registered projects
|
||||
- project registry with template, proposal, register, update, refresh
|
||||
- 5 registered projects:
|
||||
- `p04-gigabit` (483 docs, 5 state entries)
|
||||
- `p05-interferometer` (109 docs, 9 state entries)
|
||||
- `p06-polisher` (564 docs, 9 state entries)
|
||||
- `atomizer-v2` (568 docs, newly ingested 2026-04-12)
|
||||
- `atocore` (drive source, 38 state entries)
|
||||
- 47 active memories (16 project, 16 knowledge, 6 adaptation, 3 identity, 3 preference, 3 episodic)
|
||||
- canonical AtoCore runtime on Dalidou (`775960c`, deploy.sh verified)
|
||||
- 33,253 vectors across 6 registered projects
|
||||
- 234 captured interactions (192 claude-code, 38 openclaw, 4 test)
|
||||
- 6 registered projects:
|
||||
- `p04-gigabit` (483 docs, 15 state entries)
|
||||
- `p05-interferometer` (109 docs, 18 state entries)
|
||||
- `p06-polisher` (564 docs, 19 state entries)
|
||||
- `atomizer-v2` (568 docs, 5 state entries)
|
||||
- `abb-space` (6 state entries)
|
||||
- `atocore` (drive source, 47 state entries)
|
||||
- 110 Trusted Project State entries across all projects (decisions, requirements, facts, contacts, milestones)
|
||||
- 84 active memories (31 project, 23 knowledge, 10 episodic, 8 adaptation, 7 preference, 5 identity)
|
||||
- context pack assembly with 4 tiers: Trusted Project State > identity/preference > project memories > retrieved chunks
|
||||
- query-relevance memory ranking with overlap-density scoring
|
||||
- retrieval eval harness: 18 fixtures, 17/18 passing
|
||||
- 290 tests passing
|
||||
- nightly pipeline: backup → cleanup → rsync → LLM extraction (sonnet) → auto-triage
|
||||
- retrieval eval harness: 18 fixtures, 17/18 passing on live
|
||||
- 303 tests passing
|
||||
- nightly pipeline: backup → cleanup → rsync → OpenClaw import → vault refresh → extract → triage → **auto-promote/expire** → weekly synth/lint → **retrieval harness** → **pipeline summary to project state**
|
||||
- Phase 10 operational: reinforcement-based auto-promotion (ref_count ≥ 3, confidence ≥ 0.7) + stale candidate expiry (14 days unreinforced)
|
||||
- pipeline health visible in dashboard: interaction totals by client, pipeline last_run, harness results, triage stats
|
||||
- off-host backup to clawdbot (T420) via rsync
|
||||
- both Claude Code and OpenClaw capture interactions to AtoCore
|
||||
- both Claude Code and OpenClaw capture interactions to AtoCore (OpenClaw via `before_agent_start` + `llm_output` plugin, verified live)
|
||||
- DEV-LEDGER.md as shared operating memory between Claude and Codex
|
||||
- observability dashboard at GET /admin/dashboard
|
||||
|
||||
@@ -152,26 +156,28 @@ deferred from the shared client until their workflows are exercised.
|
||||
|
||||
These are the current practical priorities.
|
||||
|
||||
1. **Observe and stabilize** — let the nightly pipeline run for a week,
|
||||
check the dashboard daily, verify memories accumulate correctly
|
||||
from organic Claude Code and OpenClaw use
|
||||
2. **Multi-model triage** (Phase 11 entry) — switch auto-triage to a
|
||||
1. **Observe the enhanced pipeline** — let the nightly pipeline run for a
|
||||
week with the new harness + summary + auto-promote steps. Check the
|
||||
dashboard daily. Verify pipeline summary populates correctly.
|
||||
2. **Knowledge density** — run batch extraction over the full 234
|
||||
interactions (`--since 2026-01-01`) to mine the backlog for knowledge.
|
||||
Target: 100+ active memories.
|
||||
3. **Multi-model triage** (Phase 11 entry) — switch auto-triage to a
|
||||
different model than the extractor for independent validation
|
||||
3. **Automated eval in cron** (Phase 12 entry) — add retrieval harness
|
||||
to the nightly cron so regressions are caught automatically
|
||||
4. **Atomizer-v2 state entries** — curate Trusted Project State for the
|
||||
newly ingested Atomizer knowledge base
|
||||
4. **Fix p04-constraints harness failure** — retrieval doesn't surface
|
||||
"Zerodur" for p04 constraint queries. Investigate if it's a missing
|
||||
memory or retrieval ranking issue.
|
||||
|
||||
## Next
|
||||
|
||||
These are the next major layers after the current stabilization pass.
|
||||
|
||||
1. Phase 10 Write-back — confidence-based auto-promotion from
|
||||
reinforcement signal (a memory reinforced N times auto-promotes)
|
||||
2. Phase 6 AtoDrive — clarify Google Drive as a trusted operational
|
||||
1. Phase 6 AtoDrive — clarify Google Drive as a trusted operational
|
||||
source and ingest from it
|
||||
3. Phase 13 Hardening — Chroma backup policy, monitoring, alerting,
|
||||
2. Phase 13 Hardening — Chroma backup policy, monitoring, alerting,
|
||||
failure visibility beyond log files
|
||||
3. Engineering V1 implementation sprint — once knowledge density is
|
||||
sufficient and the pipeline feels boring and dependable
|
||||
|
||||
## Later
|
||||
|
||||
@@ -193,9 +199,10 @@ These remain intentionally deferred.
|
||||
plugin now exists (`openclaw-plugins/atocore-capture/`), interactions
|
||||
flow. Write-back of promoted memories back to OpenClaw's own memory
|
||||
system is still deferred.
|
||||
- ~~automatic memory promotion~~ — auto-triage now handles promote/reject
|
||||
for extraction candidates. Reinforcement-based auto-promotion
|
||||
(Phase 10) is the remaining piece.
|
||||
- ~~automatic memory promotion~~ — Phase 10 complete: auto-triage handles
|
||||
extraction candidates, reinforcement-based auto-promotion graduates
|
||||
candidates referenced 3+ times to active, stale candidates expire
|
||||
after 14 days unreinforced.
|
||||
- ~~reflection loop integration~~ — fully operational: capture (both
|
||||
clients) → reinforce (automatic) → extract (nightly cron, sonnet) →
|
||||
auto-triage (nightly, sonnet) → only needs_human reaches the user.
|
||||
|
||||
274
docs/universal-consumption.md
Normal file
274
docs/universal-consumption.md
Normal file
@@ -0,0 +1,274 @@
|
||||
# Universal Consumption — Connecting LLM Clients to AtoCore
|
||||
|
||||
Phase 1 of the Master Brain plan. Every LLM interaction across the ecosystem
|
||||
pulls context from AtoCore automatically, without the user or agent having
|
||||
to remember to ask for it.
|
||||
|
||||
## Architecture
|
||||
|
||||
```
|
||||
┌─────────────────────┐
|
||||
│ AtoCore HTTP API │ ← single source of truth
|
||||
│ http://dalidou:8100│
|
||||
└──────────┬──────────┘
|
||||
│
|
||||
┌────────────────────┼────────────────────┐
|
||||
│ │ │
|
||||
┌───┴────┐ ┌─────┴────┐ ┌────┴────┐
|
||||
│ MCP │ │ OpenClaw │ │ HTTP │
|
||||
│ server │ │ plugin │ │ proxy │
|
||||
└───┬────┘ └──────┬───┘ └────┬────┘
|
||||
│ │ │
|
||||
Claude/Cursor/ OpenClaw Codex/Ollama/
|
||||
Zed/Windsurf any OpenAI-compat client
|
||||
```
|
||||
|
||||
Three adapters, one HTTP backend. Each adapter is a thin passthrough — no
|
||||
business logic duplicated.
|
||||
|
||||
---
|
||||
|
||||
## Adapter 1: MCP Server (Claude Desktop, Claude Code, Cursor, Zed, Windsurf)
|
||||
|
||||
The MCP server is `scripts/atocore_mcp.py` — stdlib-only Python, stdio
|
||||
transport, wraps the HTTP API. Claude-family clients see AtoCore as built-in
|
||||
tools just like `Read` or `Bash`.
|
||||
|
||||
### Tools exposed
|
||||
|
||||
- **`atocore_context`** (most important): Full context pack for a query —
|
||||
Trusted Project State + memories + retrieved chunks. Use at the start of
|
||||
any project-related conversation to ground it.
|
||||
- **`atocore_search`**: Semantic search over ingested documents (top-K chunks).
|
||||
- **`atocore_memory_list`**: List active memories, filterable by project + type.
|
||||
- **`atocore_memory_create`**: Propose a candidate memory (enters triage queue).
|
||||
- **`atocore_project_state`**: Get Trusted Project State entries by category.
|
||||
- **`atocore_projects`**: List registered projects + aliases.
|
||||
- **`atocore_health`**: Service status check.
|
||||
|
||||
### Registration
|
||||
|
||||
#### Claude Code (CLI)
|
||||
```bash
|
||||
claude mcp add atocore -- python C:/Users/antoi/ATOCore/scripts/atocore_mcp.py
|
||||
claude mcp list # verify: "atocore ... ✓ Connected"
|
||||
```
|
||||
|
||||
#### Claude Desktop (GUI)
|
||||
Edit `~/Library/Application Support/Claude/claude_desktop_config.json`
|
||||
(macOS) or `%APPDATA%\Claude\claude_desktop_config.json` (Windows):
|
||||
|
||||
```json
|
||||
{
|
||||
"mcpServers": {
|
||||
"atocore": {
|
||||
"command": "python",
|
||||
"args": ["C:/Users/antoi/ATOCore/scripts/atocore_mcp.py"],
|
||||
"env": {
|
||||
"ATOCORE_URL": "http://dalidou:8100"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
Restart Claude Desktop.
|
||||
|
||||
#### Cursor / Zed / Windsurf
|
||||
Similar JSON config in each tool's MCP settings. Consult their docs —
|
||||
the config schema is standard MCP.
|
||||
|
||||
### Configuration
|
||||
|
||||
Environment variables the MCP server honors:
|
||||
|
||||
| Var | Default | Purpose |
|
||||
|---|---|---|
|
||||
| `ATOCORE_URL` | `http://dalidou:8100` | Where to reach AtoCore |
|
||||
| `ATOCORE_TIMEOUT` | `10` | Per-request HTTP timeout (seconds) |
|
||||
|
||||
### Behavior
|
||||
|
||||
- Fail-open: if Dalidou is unreachable, tools return "AtoCore unavailable"
|
||||
error messages but don't crash the client.
|
||||
- Zero business logic: every tool is a direct HTTP passthrough.
|
||||
- stdlib only: no MCP SDK dependency.
|
||||
|
||||
---
|
||||
|
||||
## Adapter 2: OpenClaw Plugin (`openclaw-plugins/atocore-capture/handler.js`)
|
||||
|
||||
The plugin on T420 OpenClaw has two responsibilities:
|
||||
|
||||
1. **CAPTURE**: On `before_agent_start` + `llm_output`, POST completed turns
|
||||
to AtoCore `/interactions` (existing).
|
||||
2. **PULL**: On `before_prompt_build`, call `/context/build` and inject the
|
||||
context pack via `prependContext` so the agent's system prompt includes
|
||||
AtoCore knowledge.
|
||||
|
||||
### Deployment
|
||||
|
||||
The plugin is loaded from
|
||||
`/tmp/atocore-openclaw-capture-plugin/openclaw-plugins/atocore-capture/`
|
||||
on the T420 (per OpenClaw's plugin config at `~/.openclaw/openclaw.json`).
|
||||
|
||||
To update:
|
||||
```bash
|
||||
scp openclaw-plugins/atocore-capture/handler.js \
|
||||
papa@192.168.86.39:/tmp/atocore-openclaw-capture-plugin/openclaw-plugins/atocore-capture/index.js
|
||||
ssh papa@192.168.86.39 'systemctl --user restart openclaw-gateway'
|
||||
```
|
||||
|
||||
Verify in gateway logs: look for "ready (7 plugins: acpx, atocore-capture, ...)"
|
||||
|
||||
### Configuration (env vars set on T420)
|
||||
|
||||
| Var | Default | Purpose |
|
||||
|---|---|---|
|
||||
| `ATOCORE_BASE_URL` | `http://dalidou:8100` | AtoCore HTTP endpoint |
|
||||
| `ATOCORE_PULL_DISABLED` | (unset) | Set to `1` to disable context pull |
|
||||
|
||||
### Behavior
|
||||
|
||||
- Fail-open: AtoCore unreachable = no injection, no capture, agent runs
|
||||
normally.
|
||||
- 6s timeout on context pull, 10s on capture — won't stall the agent.
|
||||
- Context pack prepended as a clearly-bracketed block so the agent can see
|
||||
it's auto-injected grounding info.
|
||||
|
||||
---
|
||||
|
||||
## Adapter 3: HTTP Proxy (`scripts/atocore_proxy.py`)
|
||||
|
||||
A stdlib-only OpenAI-compatible HTTP proxy. Sits between any
|
||||
OpenAI-API-speaking client and the real provider, enriches every
|
||||
`/chat/completions` request with AtoCore context.
|
||||
|
||||
Works with:
|
||||
- **Codex CLI** (OpenAI-compatible endpoint)
|
||||
- **Ollama** (has OpenAI-compatible `/v1` endpoint since 0.1.24)
|
||||
- **LiteLLM**, **llama.cpp server**, custom agents
|
||||
- Anything that can be pointed at a custom base URL
|
||||
|
||||
### Start it
|
||||
|
||||
```bash
|
||||
# For Ollama (local models):
|
||||
ATOCORE_UPSTREAM=http://localhost:11434/v1 \
|
||||
python scripts/atocore_proxy.py
|
||||
|
||||
# For OpenAI cloud:
|
||||
ATOCORE_UPSTREAM=https://api.openai.com/v1 \
|
||||
ATOCORE_CLIENT_LABEL=codex \
|
||||
python scripts/atocore_proxy.py
|
||||
|
||||
# Test:
|
||||
curl http://127.0.0.1:11435/healthz
|
||||
```
|
||||
|
||||
### Point a client at it
|
||||
|
||||
Set the client's OpenAI base URL to `http://127.0.0.1:11435/v1`.
|
||||
|
||||
#### Ollama example:
|
||||
```bash
|
||||
OPENAI_BASE_URL=http://127.0.0.1:11435/v1 \
|
||||
some-openai-client --model llama3:8b
|
||||
```
|
||||
|
||||
#### Codex CLI:
|
||||
Set `OPENAI_BASE_URL=http://127.0.0.1:11435/v1` in your codex config.
|
||||
|
||||
### Configuration
|
||||
|
||||
| Var | Default | Purpose |
|
||||
|---|---|---|
|
||||
| `ATOCORE_URL` | `http://dalidou:8100` | AtoCore HTTP endpoint |
|
||||
| `ATOCORE_UPSTREAM` | (required) | Real provider base URL |
|
||||
| `ATOCORE_PROXY_PORT` | `11435` | Proxy listen port |
|
||||
| `ATOCORE_PROXY_HOST` | `127.0.0.1` | Proxy bind address |
|
||||
| `ATOCORE_CLIENT_LABEL` | `proxy` | Client id in captures |
|
||||
| `ATOCORE_INJECT` | `1` | Inject context (set `0` to disable) |
|
||||
| `ATOCORE_CAPTURE` | `1` | Capture interactions (set `0` to disable) |
|
||||
|
||||
### Behavior
|
||||
|
||||
- GET requests (model listing etc) pass through unchanged
|
||||
- POST to `/chat/completions` (or `/v1/chat/completions`) gets enriched:
|
||||
1. Last user message extracted as query
|
||||
2. AtoCore `/context/build` called with 6s timeout
|
||||
3. Pack injected as system message (or prepended to existing system)
|
||||
4. Enriched body forwarded to upstream
|
||||
5. After success, interaction POSTed to `/interactions` in background
|
||||
- Fail-open: AtoCore unreachable = pass through without injection
|
||||
- Streaming responses: currently buffered (not true stream). Good enough for
|
||||
most cases; can be upgraded later if needed.
|
||||
|
||||
### Running as a service
|
||||
|
||||
On Linux, create `~/.config/systemd/user/atocore-proxy.service`:
|
||||
```ini
|
||||
[Unit]
|
||||
Description=AtoCore HTTP proxy
|
||||
|
||||
[Service]
|
||||
Environment=ATOCORE_UPSTREAM=http://localhost:11434/v1
|
||||
Environment=ATOCORE_CLIENT_LABEL=ollama
|
||||
ExecStart=/usr/bin/python3 /path/to/scripts/atocore_proxy.py
|
||||
Restart=on-failure
|
||||
|
||||
[Install]
|
||||
WantedBy=default.target
|
||||
```
|
||||
Then: `systemctl --user enable --now atocore-proxy`
|
||||
|
||||
On Windows, register via Task Scheduler (similar pattern to backup task)
|
||||
or use NSSM to install as a service.
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
Fresh end-to-end test to confirm Phase 1 is working:
|
||||
|
||||
### For Claude Code (MCP)
|
||||
1. Open a new Claude Code session (not this one).
|
||||
2. Ask: "what do we know about p06 polisher's control architecture?"
|
||||
3. Claude should invoke `atocore_context` or `atocore_project_state`
|
||||
on its own and answer grounded in AtoCore data.
|
||||
|
||||
### For OpenClaw (plugin pull)
|
||||
1. Send a Discord message to OpenClaw: "what's the status on p04?"
|
||||
2. Check T420 logs: `journalctl --user -u openclaw-gateway --since "1 min ago" | grep atocore-pull`
|
||||
3. Expect: `atocore-pull:injected project=p04-gigabit chars=NNN`
|
||||
|
||||
### For proxy (any OpenAI-compat client)
|
||||
1. Start proxy with appropriate upstream
|
||||
2. Run a client query through it
|
||||
3. Check stderr: `[atocore-proxy] inject: project=... chars=...`
|
||||
4. Check `curl http://127.0.0.1:8100/interactions?client=proxy` — should
|
||||
show the captured turn
|
||||
|
||||
---
|
||||
|
||||
## Why not just MCP everywhere?
|
||||
|
||||
MCP is great for Claude-family clients but:
|
||||
- Not supported natively by Codex CLI, Ollama, or OpenAI's own API
|
||||
- No universal "attach MCP" mechanism in all LLM runtimes
|
||||
- HTTP APIs are truly universal
|
||||
|
||||
HTTP API is the truth, each adapter is the thinnest possible shim for its
|
||||
ecosystem. When new adapters are needed (Gemini CLI, Claude Code plugin
|
||||
system, etc.), they follow the same pattern.
|
||||
|
||||
---
|
||||
|
||||
## Future enhancements
|
||||
|
||||
- **Streaming passthrough** in the proxy (currently buffered for simplicity)
|
||||
- **Response grounding check**: parse assistant output for references to
|
||||
injected context, count reinforcement events
|
||||
- **Per-client metrics** in the dashboard: how often each client pulls,
|
||||
context pack size, injection rate
|
||||
- **Smart project detection**: today we use keyword matching; could use
|
||||
AtoCore's own project resolver endpoint
|
||||
140
docs/windows-backup-setup.md
Normal file
140
docs/windows-backup-setup.md
Normal file
@@ -0,0 +1,140 @@
|
||||
# Windows Main-Computer Backup Setup
|
||||
|
||||
The AtoCore backup pipeline runs nightly on Dalidou and already pushes snapshots
|
||||
off-host to the T420 (`papa@192.168.86.39`). This doc sets up a **second**,
|
||||
pull-based daily backup to your Windows main computer at
|
||||
`C:\Users\antoi\Documents\ATOCore_Backups\`.
|
||||
|
||||
Pull-based means the Windows machine pulls from Dalidou. This is simpler than
|
||||
push because Dalidou doesn't need SSH keys to reach Windows, and the backup
|
||||
only runs when the Windows machine is powered on and can reach Dalidou.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- Windows 10/11 with OpenSSH client (built-in since Win10 1809)
|
||||
- SSH key-based auth to `papa@dalidou` already working (you're using it today)
|
||||
- `C:\Users\antoi\ATOCore\scripts\windows\atocore-backup-pull.ps1` present
|
||||
|
||||
## Test the script manually
|
||||
|
||||
```powershell
|
||||
powershell.exe -ExecutionPolicy Bypass -File `
|
||||
C:\Users\antoi\ATOCore\scripts\windows\atocore-backup-pull.ps1
|
||||
```
|
||||
|
||||
Expected output:
|
||||
```
|
||||
[timestamp] === AtoCore backup pull starting ===
|
||||
[timestamp] Dalidou reachable.
|
||||
[timestamp] Pulling snapshots via scp...
|
||||
[timestamp] Pulled N snapshots successfully (total X MB, latest: ...)
|
||||
[timestamp] === backup complete ===
|
||||
```
|
||||
|
||||
Target directory: `C:\Users\antoi\Documents\ATOCore_Backups\snapshots\`
|
||||
Logs: `C:\Users\antoi\Documents\ATOCore_Backups\_logs\backup-*.log`
|
||||
|
||||
## Register the Task Scheduler task
|
||||
|
||||
### Option A — automatic registration (recommended)
|
||||
|
||||
Run this PowerShell command **as your user** (no admin needed — uses HKCU task):
|
||||
|
||||
```powershell
|
||||
$action = New-ScheduledTaskAction -Execute 'powershell.exe' `
|
||||
-Argument '-ExecutionPolicy Bypass -NonInteractive -WindowStyle Hidden -File C:\Users\antoi\ATOCore\scripts\windows\atocore-backup-pull.ps1'
|
||||
|
||||
# Run daily at 10:00 local time; if missed (computer off), run at next logon
|
||||
$trigger = New-ScheduledTaskTrigger -Daily -At 10:00AM
|
||||
$trigger.StartBoundary = (Get-Date -Format 'yyyy-MM-ddTHH:mm:ss')
|
||||
|
||||
$settings = New-ScheduledTaskSettingsSet `
|
||||
-AllowStartIfOnBatteries `
|
||||
-DontStopIfGoingOnBatteries `
|
||||
-StartWhenAvailable `
|
||||
-ExecutionTimeLimit (New-TimeSpan -Minutes 10) `
|
||||
-RestartCount 2 `
|
||||
-RestartInterval (New-TimeSpan -Minutes 30)
|
||||
|
||||
Register-ScheduledTask -TaskName 'AtoCore Backup Pull' `
|
||||
-Description 'Daily pull of AtoCore backup snapshots from Dalidou' `
|
||||
-Action $action -Trigger $trigger -Settings $settings `
|
||||
-User $env:USERNAME
|
||||
```
|
||||
|
||||
Key settings:
|
||||
- `-StartWhenAvailable`: if the computer was off at 10:00, run as soon as it
|
||||
comes online
|
||||
- `-AllowStartIfOnBatteries`: works on laptop battery too
|
||||
- `-ExecutionTimeLimit 10min`: kill hung tasks
|
||||
- `-RestartCount 2`: retry twice if it fails (Dalidou temporarily unreachable)
|
||||
|
||||
### Option B -- Task Scheduler GUI
|
||||
|
||||
1. Open Task Scheduler (`taskschd.msc`)
|
||||
2. Create Basic Task -> name: `AtoCore Backup Pull`
|
||||
3. Trigger: Daily, 10:00 AM, recur every 1 day
|
||||
4. Action: Start a program
|
||||
- Program: `powershell.exe`
|
||||
- Arguments: `-ExecutionPolicy Bypass -NonInteractive -WindowStyle Hidden -File "C:\Users\antoi\ATOCore\scripts\windows\atocore-backup-pull.ps1"`
|
||||
5. Finish, then edit the task:
|
||||
- Settings tab: check "Run task as soon as possible after a scheduled start is missed"
|
||||
- Settings tab: "If the task fails, restart every 30 minutes, up to 2 times"
|
||||
- Conditions tab: uncheck "Start only if computer is on AC power" (if you want it on battery)
|
||||
|
||||
## Verify
|
||||
|
||||
After the first scheduled run:
|
||||
|
||||
```powershell
|
||||
# Most recent log
|
||||
Get-ChildItem C:\Users\antoi\Documents\ATOCore_Backups\_logs\ |
|
||||
Sort-Object Name -Descending |
|
||||
Select-Object -First 1 |
|
||||
Get-Content
|
||||
|
||||
# Latest snapshot present?
|
||||
Get-ChildItem C:\Users\antoi\Documents\ATOCore_Backups\snapshots\ |
|
||||
Sort-Object Name -Descending |
|
||||
Select-Object -First 3
|
||||
```
|
||||
|
||||
## Unregister (if needed)
|
||||
|
||||
```powershell
|
||||
Unregister-ScheduledTask -TaskName 'AtoCore Backup Pull' -Confirm:$false
|
||||
```
|
||||
|
||||
## How it behaves
|
||||
|
||||
- **Computer on, Dalidou reachable**: pulls latest snapshots silently in ~15s
|
||||
- **Computer on, Dalidou unreachable** (remote work, network down): fail-open,
|
||||
exits without error, logs "Dalidou unreachable"
|
||||
- **Computer off at scheduled time**: Task Scheduler runs it as soon as the
|
||||
computer wakes up
|
||||
- **Many days off**: one run catches up; scp only transfers files not already
|
||||
present (snapshots are date-stamped directories, idempotent overwrites)
|
||||
|
||||
## What gets backed up
|
||||
|
||||
The snapshots tree contains:
|
||||
- `YYYYMMDDTHHMMSSZ/config/` — project registry, AtoCore config
|
||||
- `YYYYMMDDTHHMMSSZ/db/` — SQLite snapshot of all memory, state, interactions
|
||||
- `YYYYMMDDTHHMMSSZ/backup-metadata.json` — SHA, timestamp, source info
|
||||
|
||||
Chroma vectors are **not** in the snapshot by default
|
||||
(`ATOCORE_BACKUP_CHROMA=false` on Dalidou). They can be rebuilt from the
|
||||
source documents if lost. To include them, set `ATOCORE_BACKUP_CHROMA=true`
|
||||
in the Dalidou cron environment.
|
||||
|
||||
## Three-tier backup summary
|
||||
|
||||
After this setup:
|
||||
|
||||
| Tier | Location | Cadence | Purpose |
|
||||
|---|---|---|---|
|
||||
| Live | Dalidou `/srv/storage/atocore/backups/snapshots/` | Nightly 03:00 UTC | Fast restore |
|
||||
| Off-host | T420 `papa@192.168.86.39:/home/papa/atocore-backups/` | Nightly after Dalidou | Dalidou dies |
|
||||
| User machine | `C:\Users\antoi\Documents\ATOCore_Backups\` | Daily 10:00 local | Full home-network failure |
|
||||
|
||||
Three independent copies. Any two can be lost simultaneously without data loss.
|
||||
@@ -1,29 +1,40 @@
|
||||
# AtoCore Capture Plugin for OpenClaw
|
||||
# AtoCore Capture + Context Plugin for OpenClaw
|
||||
|
||||
Minimal OpenClaw plugin that mirrors Claude Code's `capture_stop.py` behavior:
|
||||
Two-way bridge between OpenClaw agents and AtoCore:
|
||||
|
||||
**Capture (since v1)**
|
||||
- watches user-triggered assistant turns
|
||||
- POSTs `prompt` + `response` to `POST /interactions`
|
||||
- sets `client="openclaw"`
|
||||
- sets `reinforce=true`
|
||||
- sets `client="openclaw"`, `reinforce=true`
|
||||
- fails open on network or API errors
|
||||
|
||||
## Config
|
||||
**Context injection (Phase 7I, v2+)**
|
||||
- on `before_agent_start`, fetches a context pack from `POST /context/build`
|
||||
- prepends the pack to the agent's prompt so whatever LLM runs underneath
|
||||
(sonnet, opus, codex, local model — whichever OpenClaw delegates to)
|
||||
answers grounded in what AtoCore already knows
|
||||
- original user prompt is still what gets captured later (no recursion)
|
||||
- fails open: context unreachable → agent runs as before
|
||||
|
||||
Optional plugin config:
|
||||
## Config
|
||||
|
||||
```json
|
||||
{
|
||||
"baseUrl": "http://dalidou:8100",
|
||||
"minPromptLength": 15,
|
||||
"maxResponseLength": 50000
|
||||
"maxResponseLength": 50000,
|
||||
"injectContext": true,
|
||||
"contextCharBudget": 4000
|
||||
}
|
||||
```
|
||||
|
||||
If `baseUrl` is omitted, the plugin uses `ATOCORE_BASE_URL` or defaults to `http://dalidou:8100`.
|
||||
- `baseUrl` — defaults to `ATOCORE_BASE_URL` env or `http://dalidou:8100`
|
||||
- `injectContext` — set to `false` to disable the Phase 7I context injection and make this a pure one-way capture plugin again
|
||||
- `contextCharBudget` — cap on injected context size. `/context/build` respects it too; this is a client-side safety net. Default 4000 chars (~1000 tokens).
|
||||
|
||||
## Notes
|
||||
|
||||
- Project detection is intentionally left empty for now. Unscoped capture is acceptable because AtoCore's extraction pipeline handles unscoped interactions.
|
||||
- Extraction is **not** part of the capture path. This plugin only records interactions and lets AtoCore reinforcement run automatically.
|
||||
- The plugin captures only user-triggered turns, not heartbeats or system-only runs.
|
||||
- Project detection is intentionally left empty — AtoCore's extraction pipeline handles unscoped interactions and infers the project from content.
|
||||
- Extraction is **not** part of this plugin. Interactions are captured; batch extraction runs via cron on the AtoCore host.
|
||||
- Context injection only fires for user-triggered turns (not heartbeats or system-only runs).
|
||||
- Timeouts: context fetch is 5s (short so a slow AtoCore never blocks a user turn); capture post is 10s.
|
||||
|
||||
154
openclaw-plugins/atocore-capture/handler.js
Normal file
154
openclaw-plugins/atocore-capture/handler.js
Normal file
@@ -0,0 +1,154 @@
|
||||
/**
|
||||
* AtoCore OpenClaw plugin — capture + pull.
|
||||
*
|
||||
* Two responsibilities:
|
||||
*
|
||||
* 1. CAPTURE (existing): On before_agent_start, buffer the user prompt.
|
||||
* On llm_output, POST prompt+response to AtoCore /interactions.
|
||||
* This is the "write" side — OpenClaw turns feed AtoCore's memory.
|
||||
*
|
||||
* 2. PULL (Phase 1 master brain): On before_prompt_build, call AtoCore
|
||||
* /context/build and inject the returned context via prependContext.
|
||||
* Every OpenClaw response is automatically grounded in what AtoCore
|
||||
* knows (project state, memories, relevant chunks).
|
||||
*
|
||||
* Fail-open throughout: AtoCore unreachable = no injection, no capture,
|
||||
* never blocks the agent.
|
||||
*/
|
||||
|
||||
import { definePluginEntry } from "openclaw/plugin-sdk/core";
|
||||
|
||||
const BASE_URL = process.env.ATOCORE_BASE_URL || "http://dalidou:8100";
|
||||
const MIN_LEN = 15;
|
||||
const MAX_RESP = 50000;
|
||||
const CONTEXT_TIMEOUT_MS = 6000;
|
||||
const CAPTURE_TIMEOUT_MS = 10000;
|
||||
|
||||
function trim(v) { return typeof v === "string" ? v.trim() : ""; }
|
||||
function trunc(t, m) { return !t || t.length <= m ? t : t.slice(0, m) + "\n\n[truncated]"; }
|
||||
|
||||
function detectProject(prompt) {
|
||||
const lower = (prompt || "").toLowerCase();
|
||||
const hints = [
|
||||
["p04", "p04-gigabit"],
|
||||
["gigabit", "p04-gigabit"],
|
||||
["p05", "p05-interferometer"],
|
||||
["interferometer", "p05-interferometer"],
|
||||
["p06", "p06-polisher"],
|
||||
["polisher", "p06-polisher"],
|
||||
["fullum", "p06-polisher"],
|
||||
["abb", "abb-space"],
|
||||
["atomizer", "atomizer-v2"],
|
||||
["atocore", "atocore"],
|
||||
];
|
||||
for (const [token, proj] of hints) {
|
||||
if (lower.includes(token)) return proj;
|
||||
}
|
||||
return "";
|
||||
}
|
||||
|
||||
export default definePluginEntry({
|
||||
register(api) {
|
||||
const log = api.logger;
|
||||
let lastPrompt = null;
|
||||
|
||||
// --- PULL: inject AtoCore context into every prompt ---
|
||||
api.on("before_prompt_build", async (event, ctx) => {
|
||||
if (process.env.ATOCORE_PULL_DISABLED === "1") return;
|
||||
const prompt = trim(event?.prompt || "");
|
||||
if (prompt.length < MIN_LEN) return;
|
||||
|
||||
const project = detectProject(prompt);
|
||||
|
||||
try {
|
||||
const res = await fetch(BASE_URL.replace(/\/$/, "") + "/context/build", {
|
||||
method: "POST",
|
||||
headers: { "Content-Type": "application/json" },
|
||||
body: JSON.stringify({ prompt, project }),
|
||||
signal: AbortSignal.timeout(CONTEXT_TIMEOUT_MS),
|
||||
});
|
||||
if (!res.ok) {
|
||||
log.info("atocore-pull:http_error", { status: res.status });
|
||||
return;
|
||||
}
|
||||
const data = await res.json();
|
||||
const contextPack = data.formatted_context || "";
|
||||
if (!contextPack.trim()) return;
|
||||
|
||||
log.info("atocore-pull:injected", {
|
||||
project: project || "(none)",
|
||||
chars: contextPack.length,
|
||||
});
|
||||
|
||||
return {
|
||||
prependContext:
|
||||
"--- AtoCore Context (auto-injected) ---\n" +
|
||||
contextPack +
|
||||
"\n--- End AtoCore Context ---\n",
|
||||
};
|
||||
} catch (err) {
|
||||
log.info("atocore-pull:error", { error: String(err).slice(0, 200) });
|
||||
}
|
||||
});
|
||||
|
||||
// --- CAPTURE: buffer user prompts on agent start ---
|
||||
api.on("before_agent_start", async (event, ctx) => {
|
||||
const prompt = trim(event?.prompt || event?.cleanedBody || "");
|
||||
if (prompt.length < MIN_LEN || prompt.startsWith("<")) {
|
||||
lastPrompt = null;
|
||||
return;
|
||||
}
|
||||
// Filter cron-initiated agent runs. OpenClaw's scheduled tasks fire
|
||||
// agent sessions with prompts that begin "[cron:<id> ...]". These are
|
||||
// automated polls (DXF email watcher, calendar reminders, etc.), not
|
||||
// real user turns — they're pure noise in the AtoCore capture stream.
|
||||
if (prompt.startsWith("[cron:")) {
|
||||
lastPrompt = null;
|
||||
return;
|
||||
}
|
||||
lastPrompt = { text: prompt, sessionKey: ctx?.sessionKey || "", ts: Date.now() };
|
||||
log.info("atocore-capture:prompt_buffered", { len: prompt.length });
|
||||
});
|
||||
|
||||
// --- CAPTURE: send completed turns to AtoCore ---
|
||||
api.on("llm_output", async (event, ctx) => {
|
||||
if (!lastPrompt) return;
|
||||
const texts = Array.isArray(event?.assistantTexts) ? event.assistantTexts : [];
|
||||
const response = trunc(trim(texts.join("\n\n")), MAX_RESP);
|
||||
if (!response) return;
|
||||
|
||||
const prompt = lastPrompt.text;
|
||||
const sessionKey = lastPrompt.sessionKey || ctx?.sessionKey || "";
|
||||
const project = detectProject(prompt);
|
||||
lastPrompt = null;
|
||||
|
||||
log.info("atocore-capture:posting", {
|
||||
promptLen: prompt.length,
|
||||
responseLen: response.length,
|
||||
project: project || "(none)",
|
||||
});
|
||||
|
||||
fetch(BASE_URL.replace(/\/$/, "") + "/interactions", {
|
||||
method: "POST",
|
||||
headers: { "Content-Type": "application/json" },
|
||||
body: JSON.stringify({
|
||||
prompt,
|
||||
response,
|
||||
client: "openclaw",
|
||||
session_id: sessionKey,
|
||||
project,
|
||||
reinforce: true,
|
||||
}),
|
||||
signal: AbortSignal.timeout(CAPTURE_TIMEOUT_MS),
|
||||
}).then(res => {
|
||||
log.info("atocore-capture:posted", { status: res.status });
|
||||
}).catch(err => {
|
||||
log.warn("atocore-capture:post_error", { error: String(err).slice(0, 200) });
|
||||
});
|
||||
});
|
||||
|
||||
api.on("session_end", async () => {
|
||||
lastPrompt = null;
|
||||
});
|
||||
}
|
||||
});
|
||||
@@ -3,6 +3,11 @@ import { definePluginEntry } from "openclaw/plugin-sdk/core";
|
||||
const DEFAULT_BASE_URL = process.env.ATOCORE_BASE_URL || "http://dalidou:8100";
|
||||
const DEFAULT_MIN_PROMPT_LENGTH = 15;
|
||||
const DEFAULT_MAX_RESPONSE_LENGTH = 50_000;
|
||||
// Phase 7I — context injection: cap how much AtoCore context we stuff
|
||||
// back into the prompt. The /context/build endpoint respects a budget
|
||||
// parameter too, but we keep a client-side safety net.
|
||||
const DEFAULT_CONTEXT_CHAR_BUDGET = 4_000;
|
||||
const DEFAULT_INJECT_CONTEXT = true;
|
||||
|
||||
function trimText(value) {
|
||||
return typeof value === "string" ? value.trim() : "";
|
||||
@@ -41,6 +46,37 @@ async function postInteraction(baseUrl, payload, logger) {
|
||||
}
|
||||
}
|
||||
|
||||
// Phase 7I — fetch a context pack for the incoming prompt so the agent
|
||||
// answers grounded in what AtoCore already knows. Fail-open: if the
|
||||
// request times out or errors, we just don't inject; the agent runs as
|
||||
// before. Never block the user's turn on AtoCore availability.
|
||||
async function fetchContextPack(baseUrl, prompt, project, charBudget, logger) {
|
||||
try {
|
||||
const res = await fetch(`${baseUrl.replace(/\/$/, "")}/context/build`, {
|
||||
method: "POST",
|
||||
headers: { "Content-Type": "application/json" },
|
||||
body: JSON.stringify({
|
||||
prompt,
|
||||
project: project || "",
|
||||
char_budget: charBudget
|
||||
}),
|
||||
signal: AbortSignal.timeout(5_000)
|
||||
});
|
||||
if (!res.ok) {
|
||||
logger?.debug?.("atocore_context_fetch_failed", { status: res.status });
|
||||
return null;
|
||||
}
|
||||
const data = await res.json();
|
||||
const pack = trimText(data?.formatted_context || "");
|
||||
return pack || null;
|
||||
} catch (error) {
|
||||
logger?.debug?.("atocore_context_fetch_error", {
|
||||
error: error instanceof Error ? error.message : String(error)
|
||||
});
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
export default definePluginEntry({
|
||||
register(api) {
|
||||
const logger = api.logger;
|
||||
@@ -55,6 +91,28 @@ export default definePluginEntry({
|
||||
pendingBySession.delete(ctx.sessionId);
|
||||
return;
|
||||
}
|
||||
|
||||
// Phase 7I — inject AtoCore context into the agent's prompt so it
|
||||
// answers grounded in what the brain already knows. Config-gated
|
||||
// (injectContext: false disables). Fail-open.
|
||||
const baseUrl = trimText(config.baseUrl) || DEFAULT_BASE_URL;
|
||||
const injectContext = config.injectContext !== false && DEFAULT_INJECT_CONTEXT;
|
||||
const charBudget = Number(config.contextCharBudget || DEFAULT_CONTEXT_CHAR_BUDGET);
|
||||
if (injectContext && event && typeof event === "object") {
|
||||
const pack = await fetchContextPack(baseUrl, prompt, "", charBudget, logger);
|
||||
if (pack) {
|
||||
// Prepend to the event's prompt so the agent sees grounded info
|
||||
// before the user's question. OpenClaw's agent receives
|
||||
// event.prompt as its primary input; modifying it here grounds
|
||||
// whatever LLM the agent delegates to (sonnet, opus, codex,
|
||||
// local model — doesn't matter).
|
||||
event.prompt = `${pack}\n\n---\n\n${prompt}`;
|
||||
logger?.debug?.("atocore_context_injected", { chars: pack.length });
|
||||
}
|
||||
}
|
||||
|
||||
// Record the ORIGINAL user prompt (not the injected version) so
|
||||
// captured interactions stay clean for later extraction.
|
||||
pendingBySession.set(ctx.sessionId, {
|
||||
prompt,
|
||||
sessionId: ctx.sessionId,
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
{
|
||||
"name": "@atomaste/atocore-openclaw-capture",
|
||||
"private": true,
|
||||
"version": "0.0.0",
|
||||
"version": "0.2.0",
|
||||
"type": "module",
|
||||
"description": "OpenClaw plugin that captures assistant turns to AtoCore interactions"
|
||||
"description": "OpenClaw plugin: captures assistant turns to AtoCore interactions AND injects AtoCore context into agent prompts before they run (Phase 7I two-way bridge)"
|
||||
}
|
||||
|
||||
914
scripts/atocore_mcp.py
Normal file
914
scripts/atocore_mcp.py
Normal file
@@ -0,0 +1,914 @@
|
||||
#!/usr/bin/env python3
|
||||
"""AtoCore MCP server — stdio transport, stdlib-only.
|
||||
|
||||
Exposes the AtoCore HTTP API as MCP tools so any MCP-aware client
|
||||
(Claude Desktop, Claude Code, Cursor, Zed, Windsurf) can pull
|
||||
context + memories automatically at prompt time.
|
||||
|
||||
Design:
|
||||
- stdlib only (no mcp SDK dep) — MCP protocol is simple JSON-RPC
|
||||
over stdio, and AtoCore's philosophy prefers stdlib.
|
||||
- Thin wrapper: every tool is a direct pass-through to an HTTP
|
||||
endpoint. Zero business logic here — the AtoCore server is
|
||||
the single source of truth.
|
||||
- Fail-open: if AtoCore is unreachable, tools return a graceful
|
||||
"unavailable" message rather than crashing the client.
|
||||
|
||||
Protocol: MCP 2024-11-05 / 2025-03-26 compatible
|
||||
https://spec.modelcontextprotocol.io/specification/
|
||||
|
||||
Usage (standalone test):
|
||||
echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"0"}}}' | python atocore_mcp.py
|
||||
|
||||
Register with Claude Code:
|
||||
claude mcp add atocore -- python /path/to/atocore_mcp.py
|
||||
|
||||
Environment:
|
||||
ATOCORE_URL base URL of the AtoCore HTTP API (default http://dalidou:8100)
|
||||
ATOCORE_TIMEOUT per-request HTTP timeout seconds (default 10)
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import os
|
||||
import sys
|
||||
import urllib.error
|
||||
import urllib.parse
|
||||
import urllib.request
|
||||
|
||||
# Force UTF-8 on stdio — MCP protocol expects UTF-8 but Windows Python
|
||||
# defaults stdout to cp1252, which crashes on any non-ASCII char (emojis,
|
||||
# ≥, →, etc.) in tool responses. This call is a no-op on Linux/macOS
|
||||
# where UTF-8 is already the default.
|
||||
try:
|
||||
sys.stdin.reconfigure(encoding="utf-8")
|
||||
sys.stdout.reconfigure(encoding="utf-8")
|
||||
sys.stderr.reconfigure(encoding="utf-8")
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# --- Configuration ---
|
||||
|
||||
ATOCORE_URL = os.environ.get("ATOCORE_URL", "http://dalidou:8100").rstrip("/")
|
||||
HTTP_TIMEOUT = float(os.environ.get("ATOCORE_TIMEOUT", "10"))
|
||||
SERVER_NAME = "atocore"
|
||||
SERVER_VERSION = "0.1.0"
|
||||
PROTOCOL_VERSION = "2024-11-05"
|
||||
|
||||
|
||||
# --- stderr logging (stdout is reserved for JSON-RPC) ---
|
||||
|
||||
def log(msg: str) -> None:
|
||||
print(f"[atocore-mcp] {msg}", file=sys.stderr, flush=True)
|
||||
|
||||
|
||||
# --- HTTP helpers ---
|
||||
|
||||
def http_get(path: str, params: dict | None = None) -> dict:
|
||||
"""GET a JSON response from AtoCore. Raises on HTTP error."""
|
||||
url = ATOCORE_URL + path
|
||||
if params:
|
||||
# Drop empty params so the URL stays clean
|
||||
clean = {k: v for k, v in params.items() if v not in (None, "", [], {})}
|
||||
if clean:
|
||||
url += "?" + urllib.parse.urlencode(clean)
|
||||
req = urllib.request.Request(url, headers={"Accept": "application/json"})
|
||||
with urllib.request.urlopen(req, timeout=HTTP_TIMEOUT) as resp:
|
||||
return json.loads(resp.read().decode("utf-8"))
|
||||
|
||||
|
||||
def http_post(path: str, body: dict) -> dict:
|
||||
url = ATOCORE_URL + path
|
||||
data = json.dumps(body).encode("utf-8")
|
||||
req = urllib.request.Request(
|
||||
url, data=data, method="POST",
|
||||
headers={"Content-Type": "application/json", "Accept": "application/json"},
|
||||
)
|
||||
with urllib.request.urlopen(req, timeout=HTTP_TIMEOUT) as resp:
|
||||
return json.loads(resp.read().decode("utf-8"))
|
||||
|
||||
|
||||
def safe_call(fn, *args, **kwargs) -> tuple[dict | None, str | None]:
|
||||
"""Run an HTTP call, return (result, error_message_or_None)."""
|
||||
try:
|
||||
return fn(*args, **kwargs), None
|
||||
except urllib.error.HTTPError as e:
|
||||
try:
|
||||
body = e.read().decode("utf-8", errors="replace")
|
||||
except Exception:
|
||||
body = ""
|
||||
return None, f"AtoCore HTTP {e.code}: {body[:200]}"
|
||||
except urllib.error.URLError as e:
|
||||
return None, f"AtoCore unreachable at {ATOCORE_URL}: {e.reason}"
|
||||
except Exception as e:
|
||||
return None, f"AtoCore error: {type(e).__name__}: {str(e)[:200]}"
|
||||
|
||||
|
||||
# --- Tool definitions ---
|
||||
# Each tool: name, description, inputSchema (JSON Schema), handler
|
||||
|
||||
def _tool_context(args: dict) -> str:
|
||||
"""Build a full context pack for a query — state + memories + retrieved chunks."""
|
||||
query = (args.get("query") or "").strip()
|
||||
project = args.get("project") or ""
|
||||
if not query:
|
||||
return "Error: 'query' is required."
|
||||
result, err = safe_call(http_post, "/context/build", {
|
||||
"prompt": query, "project": project,
|
||||
})
|
||||
if err:
|
||||
return f"AtoCore context unavailable: {err}"
|
||||
pack = result.get("formatted_context", "") or ""
|
||||
if not pack.strip():
|
||||
return "(AtoCore returned an empty context pack — no matching state, memories, or chunks.)"
|
||||
return pack
|
||||
|
||||
|
||||
def _tool_search(args: dict) -> str:
|
||||
"""Retrieval only — raw chunks ranked by semantic similarity."""
|
||||
query = (args.get("query") or "").strip()
|
||||
project = args.get("project") or ""
|
||||
top_k = int(args.get("top_k") or 5)
|
||||
if not query:
|
||||
return "Error: 'query' is required."
|
||||
result, err = safe_call(http_post, "/query", {
|
||||
"prompt": query, "project": project, "top_k": top_k,
|
||||
})
|
||||
if err:
|
||||
return f"AtoCore search unavailable: {err}"
|
||||
chunks = result.get("results", []) or []
|
||||
if not chunks:
|
||||
return "No results."
|
||||
lines = []
|
||||
for i, c in enumerate(chunks, 1):
|
||||
src = c.get("source_file") or c.get("title") or "unknown"
|
||||
heading = c.get("heading_path") or ""
|
||||
snippet = (c.get("content") or "")[:300]
|
||||
score = c.get("score", 0.0)
|
||||
head_str = f" ({heading})" if heading else ""
|
||||
lines.append(f"[{i}] score={score:.3f} source={src}{head_str}\n{snippet}")
|
||||
return "\n\n".join(lines)
|
||||
|
||||
|
||||
def _tool_memory_list(args: dict) -> str:
|
||||
"""List active memories, optionally filtered by project and type."""
|
||||
params = {
|
||||
"status": "active",
|
||||
"limit": int(args.get("limit") or 20),
|
||||
}
|
||||
if args.get("project"):
|
||||
params["project"] = args["project"]
|
||||
if args.get("memory_type"):
|
||||
params["memory_type"] = args["memory_type"]
|
||||
result, err = safe_call(http_get, "/memory", params=params)
|
||||
if err:
|
||||
return f"AtoCore memory list unavailable: {err}"
|
||||
memories = result.get("memories", []) or []
|
||||
if not memories:
|
||||
return "No memories match."
|
||||
lines = []
|
||||
for m in memories:
|
||||
mt = m.get("memory_type", "?")
|
||||
proj = m.get("project") or "(global)"
|
||||
conf = m.get("confidence", 0.0)
|
||||
refs = m.get("reference_count", 0)
|
||||
content = (m.get("content") or "")[:250]
|
||||
lines.append(f"[{mt}/{proj}] conf={conf:.2f} refs={refs}\n {content}")
|
||||
return "\n\n".join(lines)
|
||||
|
||||
|
||||
def _tool_memory_create(args: dict) -> str:
|
||||
"""Create a candidate memory (enters the triage queue)."""
|
||||
memory_type = (args.get("memory_type") or "").strip()
|
||||
content = (args.get("content") or "").strip()
|
||||
project = args.get("project") or ""
|
||||
confidence = float(args.get("confidence") or 0.5)
|
||||
if not memory_type or not content:
|
||||
return "Error: 'memory_type' and 'content' are required."
|
||||
valid_types = ["identity", "preference", "project", "episodic", "knowledge", "adaptation"]
|
||||
if memory_type not in valid_types:
|
||||
return f"Error: memory_type must be one of {valid_types}."
|
||||
result, err = safe_call(http_post, "/memory", {
|
||||
"memory_type": memory_type,
|
||||
"content": content,
|
||||
"project": project,
|
||||
"confidence": confidence,
|
||||
"status": "candidate",
|
||||
})
|
||||
if err:
|
||||
return f"AtoCore memory create failed: {err}"
|
||||
mid = result.get("id", "?")
|
||||
return f"Candidate memory created: id={mid} type={memory_type} project={project or '(global)'}"
|
||||
|
||||
|
||||
def _tool_project_state(args: dict) -> str:
|
||||
"""Get Trusted Project State entries for a project."""
|
||||
project = (args.get("project") or "").strip()
|
||||
category = args.get("category") or ""
|
||||
if not project:
|
||||
return "Error: 'project' is required."
|
||||
path = f"/project/state/{urllib.parse.quote(project)}"
|
||||
params = {"category": category} if category else None
|
||||
result, err = safe_call(http_get, path, params=params)
|
||||
if err:
|
||||
return f"AtoCore project state unavailable: {err}"
|
||||
entries = result.get("entries", []) or result.get("state", []) or []
|
||||
if not entries:
|
||||
return f"No state entries for project '{project}'."
|
||||
lines = []
|
||||
for e in entries:
|
||||
cat = e.get("category", "?")
|
||||
key = e.get("key", "?")
|
||||
value = (e.get("value") or "")[:300]
|
||||
src = e.get("source") or ""
|
||||
lines.append(f"[{cat}/{key}] (source: {src})\n {value}")
|
||||
return "\n\n".join(lines)
|
||||
|
||||
|
||||
def _tool_projects(args: dict) -> str:
|
||||
"""List registered AtoCore projects."""
|
||||
result, err = safe_call(http_get, "/projects")
|
||||
if err:
|
||||
return f"AtoCore projects unavailable: {err}"
|
||||
projects = result.get("projects", []) or []
|
||||
if not projects:
|
||||
return "No projects registered."
|
||||
lines = []
|
||||
for p in projects:
|
||||
pid = p.get("project_id") or p.get("id") or p.get("name") or "?"
|
||||
aliases = p.get("aliases", []) or []
|
||||
alias_str = f" (aliases: {', '.join(aliases)})" if aliases else ""
|
||||
lines.append(f"- {pid}{alias_str}")
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
def _tool_remember(args: dict) -> str:
|
||||
"""Phase 6 Part B — universal capture from any Claude session.
|
||||
|
||||
Wraps POST /memory to create a candidate memory tagged with
|
||||
source='mcp-remember'. The existing 3-tier triage is the quality
|
||||
gate: nothing becomes active until sonnet (+ opus if borderline)
|
||||
approves it. Returns the memory id so the caller can reference it
|
||||
in the same session.
|
||||
"""
|
||||
content = (args.get("content") or "").strip()
|
||||
if not content:
|
||||
return "Error: 'content' is required."
|
||||
|
||||
memory_type = (args.get("memory_type") or "knowledge").strip()
|
||||
valid_types = ["identity", "preference", "project", "episodic", "knowledge", "adaptation"]
|
||||
if memory_type not in valid_types:
|
||||
return f"Error: memory_type must be one of {valid_types}."
|
||||
|
||||
project = (args.get("project") or "").strip()
|
||||
try:
|
||||
confidence = float(args.get("confidence") or 0.6)
|
||||
except (TypeError, ValueError):
|
||||
confidence = 0.6
|
||||
confidence = max(0.0, min(1.0, confidence))
|
||||
|
||||
valid_until = (args.get("valid_until") or "").strip()
|
||||
tags = args.get("domain_tags") or []
|
||||
if not isinstance(tags, list):
|
||||
tags = []
|
||||
# Normalize tags: lowercase, dedupe, cap at 10
|
||||
clean_tags: list[str] = []
|
||||
for t in tags[:10]:
|
||||
if not isinstance(t, str):
|
||||
continue
|
||||
t = t.strip().lower()
|
||||
if t and t not in clean_tags:
|
||||
clean_tags.append(t)
|
||||
|
||||
payload = {
|
||||
"memory_type": memory_type,
|
||||
"content": content,
|
||||
"project": project,
|
||||
"confidence": confidence,
|
||||
"status": "candidate",
|
||||
}
|
||||
if valid_until:
|
||||
payload["valid_until"] = valid_until
|
||||
if clean_tags:
|
||||
payload["domain_tags"] = clean_tags
|
||||
|
||||
result, err = safe_call(http_post, "/memory", payload)
|
||||
if err:
|
||||
return f"AtoCore remember failed: {err}"
|
||||
|
||||
mid = result.get("id", "?")
|
||||
scope = project if project else "(global)"
|
||||
tag_str = f" tags=[{', '.join(clean_tags)}]" if clean_tags else ""
|
||||
expires = f" valid_until={valid_until}" if valid_until else ""
|
||||
return (
|
||||
f"Remembered as candidate: id={mid}\n"
|
||||
f" type={memory_type} project={scope} confidence={confidence:.2f}{tag_str}{expires}\n"
|
||||
f"Will flow through the standard triage pipeline within 24h "
|
||||
f"(or on next auto-process button click at /admin/triage)."
|
||||
)
|
||||
|
||||
|
||||
def _tool_health(args: dict) -> str:
|
||||
"""Check AtoCore service health."""
|
||||
result, err = safe_call(http_get, "/health")
|
||||
if err:
|
||||
return f"AtoCore unreachable: {err}"
|
||||
sha = result.get("build_sha", "?")[:8]
|
||||
vectors = result.get("vectors_count", "?")
|
||||
env = result.get("env", "?")
|
||||
return f"AtoCore healthy: sha={sha} vectors={vectors} env={env}"
|
||||
|
||||
|
||||
# --- Phase 5H: Engineering query tools ---
|
||||
|
||||
|
||||
def _tool_system_map(args: dict) -> str:
|
||||
"""Q-001 + Q-004: subsystem/component tree for a project."""
|
||||
project = (args.get("project") or "").strip()
|
||||
if not project:
|
||||
return "Error: 'project' is required."
|
||||
result, err = safe_call(
|
||||
http_get, f"/engineering/projects/{urllib.parse.quote(project)}/systems"
|
||||
)
|
||||
if err:
|
||||
return f"Engineering query failed: {err}"
|
||||
subs = result.get("subsystems", []) or []
|
||||
orphans = result.get("orphan_components", []) or []
|
||||
if not subs and not orphans:
|
||||
return f"No subsystems or components registered for {project}."
|
||||
lines = [f"System map for {project}:"]
|
||||
for s in subs:
|
||||
lines.append(f"\n[{s['name']}] — {s.get('description') or '(no description)'}")
|
||||
for c in s.get("components", []):
|
||||
mats = ", ".join(c.get("materials", [])) or "-"
|
||||
lines.append(f" • {c['name']} (materials: {mats})")
|
||||
if orphans:
|
||||
lines.append(f"\nOrphan components (not attached to any subsystem):")
|
||||
for c in orphans:
|
||||
lines.append(f" • {c['name']}")
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
def _tool_gaps(args: dict) -> str:
|
||||
"""Q-006 + Q-009 + Q-011: find coverage gaps. Director's most-used query."""
|
||||
project = (args.get("project") or "").strip()
|
||||
if not project:
|
||||
return "Error: 'project' is required."
|
||||
result, err = safe_call(
|
||||
http_get, f"/engineering/gaps",
|
||||
params={"project": project},
|
||||
)
|
||||
if err:
|
||||
return f"Gap query failed: {err}"
|
||||
|
||||
orphan = result.get("orphan_requirements", {})
|
||||
risky = result.get("risky_decisions", {})
|
||||
unsup = result.get("unsupported_claims", {})
|
||||
|
||||
counts = f"{orphan.get('count',0)}/{risky.get('count',0)}/{unsup.get('count',0)}"
|
||||
lines = [f"Coverage gaps for {project} (orphan reqs / risky decisions / unsupported claims: {counts}):\n"]
|
||||
|
||||
if orphan.get("count", 0):
|
||||
lines.append(f"ORPHAN REQUIREMENTS ({orphan['count']}) — no component claims to satisfy:")
|
||||
for g in orphan.get("gaps", [])[:10]:
|
||||
lines.append(f" • {g['name']}: {(g.get('description') or '')[:120]}")
|
||||
lines.append("")
|
||||
if risky.get("count", 0):
|
||||
lines.append(f"RISKY DECISIONS ({risky['count']}) — based on flagged assumptions:")
|
||||
for g in risky.get("gaps", [])[:10]:
|
||||
lines.append(f" • {g['decision_name']} (assumption: {g['assumption_name']} — {g['assumption_status']})")
|
||||
lines.append("")
|
||||
if unsup.get("count", 0):
|
||||
lines.append(f"UNSUPPORTED CLAIMS ({unsup['count']}) — no Result entity backs them:")
|
||||
for g in unsup.get("gaps", [])[:10]:
|
||||
lines.append(f" • {g['name']}: {(g.get('description') or '')[:120]}")
|
||||
|
||||
if orphan.get("count", 0) == 0 and risky.get("count", 0) == 0 and unsup.get("count", 0) == 0:
|
||||
lines.append("✓ No gaps detected — every requirement satisfied, no flagged assumptions, all claims have evidence.")
|
||||
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
def _tool_requirements_for(args: dict) -> str:
|
||||
"""Q-005: requirements that a component satisfies."""
|
||||
component_id = (args.get("component_id") or "").strip()
|
||||
if not component_id:
|
||||
return "Error: 'component_id' is required."
|
||||
result, err = safe_call(
|
||||
http_get, f"/engineering/components/{urllib.parse.quote(component_id)}/requirements"
|
||||
)
|
||||
if err:
|
||||
return f"Query failed: {err}"
|
||||
reqs = result.get("requirements", []) or []
|
||||
if not reqs:
|
||||
return "No requirements associated with this component."
|
||||
lines = [f"Component satisfies {len(reqs)} requirement(s):"]
|
||||
for r in reqs:
|
||||
lines.append(f" • {r['name']}: {(r.get('description') or '')[:150]}")
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
def _tool_decisions_affecting(args: dict) -> str:
|
||||
"""Q-008: decisions affecting a project or subsystem."""
|
||||
project = (args.get("project") or "").strip()
|
||||
subsystem = args.get("subsystem_id") or args.get("subsystem") or ""
|
||||
if not project:
|
||||
return "Error: 'project' is required."
|
||||
params = {"project": project}
|
||||
if subsystem:
|
||||
params["subsystem"] = subsystem
|
||||
result, err = safe_call(http_get, "/engineering/decisions", params=params)
|
||||
if err:
|
||||
return f"Query failed: {err}"
|
||||
decisions = result.get("decisions", []) or []
|
||||
if not decisions:
|
||||
scope = f"subsystem {subsystem}" if subsystem else f"project {project}"
|
||||
return f"No decisions recorded for {scope}."
|
||||
scope = f"subsystem {subsystem}" if subsystem else project
|
||||
lines = [f"{len(decisions)} decision(s) affecting {scope}:"]
|
||||
for d in decisions:
|
||||
lines.append(f" • {d['name']}: {(d.get('description') or '')[:150]}")
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
def _tool_recent_changes(args: dict) -> str:
|
||||
"""Q-013: what changed recently in the engineering graph."""
|
||||
project = (args.get("project") or "").strip()
|
||||
since = args.get("since") or ""
|
||||
limit = int(args.get("limit") or 20)
|
||||
if not project:
|
||||
return "Error: 'project' is required."
|
||||
params = {"project": project, "limit": limit}
|
||||
if since:
|
||||
params["since"] = since
|
||||
result, err = safe_call(http_get, "/engineering/changes", params=params)
|
||||
if err:
|
||||
return f"Query failed: {err}"
|
||||
changes = result.get("changes", []) or []
|
||||
if not changes:
|
||||
return f"No entity changes in {project} since {since or '(all time)'}."
|
||||
lines = [f"Recent changes in {project} ({len(changes)}):"]
|
||||
for c in changes:
|
||||
lines.append(
|
||||
f" [{c['timestamp'][:16]}] {c['action']:10s} "
|
||||
f"[{c.get('entity_type','?')}] {c.get('entity_name','?')} "
|
||||
f"by {c.get('actor','?')}"
|
||||
)
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
def _tool_impact(args: dict) -> str:
|
||||
"""Q-016: impact of changing an entity (downstream BFS)."""
|
||||
entity = (args.get("entity_id") or args.get("entity") or "").strip()
|
||||
if not entity:
|
||||
return "Error: 'entity_id' is required."
|
||||
max_depth = int(args.get("max_depth") or 3)
|
||||
result, err = safe_call(
|
||||
http_get, "/engineering/impact",
|
||||
params={"entity": entity, "max_depth": max_depth},
|
||||
)
|
||||
if err:
|
||||
return f"Query failed: {err}"
|
||||
root = result.get("root") or {}
|
||||
impacted = result.get("impacted", []) or []
|
||||
if not impacted:
|
||||
return f"Nothing downstream of [{root.get('entity_type','?')}] {root.get('name','?')}."
|
||||
lines = [
|
||||
f"Changing [{root.get('entity_type')}] {root.get('name')} "
|
||||
f"would affect {len(impacted)} entity(ies) (max depth {max_depth}):"
|
||||
]
|
||||
for i in impacted[:25]:
|
||||
indent = " " * i.get("depth", 1)
|
||||
lines.append(f"{indent}→ [{i['entity_type']}] {i['name']} (via {i['relationship']})")
|
||||
if len(impacted) > 25:
|
||||
lines.append(f" ... and {len(impacted)-25} more")
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
def _tool_evidence(args: dict) -> str:
|
||||
"""Q-017: evidence chain for an entity."""
|
||||
entity = (args.get("entity_id") or args.get("entity") or "").strip()
|
||||
if not entity:
|
||||
return "Error: 'entity_id' is required."
|
||||
result, err = safe_call(http_get, "/engineering/evidence", params={"entity": entity})
|
||||
if err:
|
||||
return f"Query failed: {err}"
|
||||
root = result.get("root") or {}
|
||||
chain = result.get("evidence_chain", []) or []
|
||||
lines = [f"Evidence for [{root.get('entity_type','?')}] {root.get('name','?')}:"]
|
||||
if not chain:
|
||||
lines.append(" (no inbound provenance edges)")
|
||||
else:
|
||||
for e in chain:
|
||||
lines.append(
|
||||
f" {e['via']} ← [{e['source_type']}] {e['source_name']}: "
|
||||
f"{(e.get('source_description') or '')[:100]}"
|
||||
)
|
||||
refs = result.get("direct_source_refs") or []
|
||||
if refs:
|
||||
lines.append(f"\nDirect source_refs: {refs[:5]}")
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
TOOLS = [
|
||||
{
|
||||
"name": "atocore_context",
|
||||
"description": (
|
||||
"Get the full AtoCore context pack for a user query. Returns "
|
||||
"Trusted Project State (high trust), relevant memories, and "
|
||||
"retrieved source chunks formatted for prompt injection. "
|
||||
"Use this FIRST on any project-related query to ground the "
|
||||
"conversation in what AtoCore already knows."
|
||||
),
|
||||
"inputSchema": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"query": {"type": "string", "description": "The user's question or task"},
|
||||
"project": {"type": "string", "description": "Project hint (e.g. 'p04-gigabit'); optional"},
|
||||
},
|
||||
"required": ["query"],
|
||||
},
|
||||
"handler": _tool_context,
|
||||
},
|
||||
{
|
||||
"name": "atocore_search",
|
||||
"description": (
|
||||
"Semantic search over AtoCore's ingested source documents. "
|
||||
"Returns top-K ranked chunks. Use this when you need raw "
|
||||
"references rather than a full context pack."
|
||||
),
|
||||
"inputSchema": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"query": {"type": "string"},
|
||||
"project": {"type": "string", "description": "optional project filter"},
|
||||
"top_k": {"type": "integer", "minimum": 1, "maximum": 20, "default": 5},
|
||||
},
|
||||
"required": ["query"],
|
||||
},
|
||||
"handler": _tool_search,
|
||||
},
|
||||
{
|
||||
"name": "atocore_memory_list",
|
||||
"description": (
|
||||
"List active memories (curated facts, decisions, preferences). "
|
||||
"Filter by project and/or memory_type. Use this to inspect what "
|
||||
"AtoCore currently remembers about a topic."
|
||||
),
|
||||
"inputSchema": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"project": {"type": "string"},
|
||||
"memory_type": {
|
||||
"type": "string",
|
||||
"enum": ["identity", "preference", "project", "episodic", "knowledge", "adaptation"],
|
||||
},
|
||||
"limit": {"type": "integer", "minimum": 1, "maximum": 100, "default": 20},
|
||||
},
|
||||
},
|
||||
"handler": _tool_memory_list,
|
||||
},
|
||||
{
|
||||
"name": "atocore_memory_create",
|
||||
"description": (
|
||||
"Propose a new memory for AtoCore. Creates a CANDIDATE that "
|
||||
"enters the triage queue for human/auto review — not immediately "
|
||||
"active. Use this to capture durable facts/decisions that "
|
||||
"should persist across sessions. Do NOT use for transient state "
|
||||
"or session-specific notes."
|
||||
),
|
||||
"inputSchema": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"memory_type": {
|
||||
"type": "string",
|
||||
"enum": ["identity", "preference", "project", "episodic", "knowledge", "adaptation"],
|
||||
},
|
||||
"content": {"type": "string", "description": "The fact/decision/preference to remember"},
|
||||
"project": {"type": "string", "description": "project id if project-scoped; empty for global"},
|
||||
"confidence": {"type": "number", "minimum": 0, "maximum": 1, "default": 0.5},
|
||||
},
|
||||
"required": ["memory_type", "content"],
|
||||
},
|
||||
"handler": _tool_memory_create,
|
||||
},
|
||||
{
|
||||
"name": "atocore_remember",
|
||||
"description": (
|
||||
"Save a durable fact to AtoCore's memory layer from any conversation. "
|
||||
"Use when the user says 'remember this', 'save that for later', "
|
||||
"'don't lose this fact', or when you identify a decision/insight/"
|
||||
"preference worth persisting across future sessions. The fact "
|
||||
"goes through quality review before being consulted in future "
|
||||
"context packs (so durable facts get kept, noise gets rejected). "
|
||||
"Call multiple times if one conversation has multiple distinct "
|
||||
"facts worth remembering — one tool call per atomic fact. "
|
||||
"Prefer 'knowledge' type for cross-project engineering insights, "
|
||||
"'project' for facts specific to one project, 'preference' for "
|
||||
"user work-style notes, 'adaptation' for standing behavioral rules."
|
||||
),
|
||||
"inputSchema": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"content": {
|
||||
"type": "string",
|
||||
"description": "The atomic fact to remember. Under 250 chars. Should stand alone without session context.",
|
||||
},
|
||||
"memory_type": {
|
||||
"type": "string",
|
||||
"enum": ["identity", "preference", "project", "episodic", "knowledge", "adaptation"],
|
||||
"default": "knowledge",
|
||||
},
|
||||
"project": {
|
||||
"type": "string",
|
||||
"description": "Project id if scoped. Empty for cross-project. Unregistered names flagged by triage as 'emerging project' proposals.",
|
||||
},
|
||||
"confidence": {
|
||||
"type": "number",
|
||||
"minimum": 0,
|
||||
"maximum": 1,
|
||||
"default": 0.6,
|
||||
"description": "0.5-0.7 typical. 0.8+ only for ratified/committed claims.",
|
||||
},
|
||||
"valid_until": {
|
||||
"type": "string",
|
||||
"description": "ISO date YYYY-MM-DD if time-bounded (e.g. current state, scheduled event, quote expiry). Empty for permanent facts.",
|
||||
},
|
||||
"domain_tags": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": "Lowercase topical tags (optics, thermal, firmware, procurement, etc.) for cross-project retrieval. 2-5 tags typical.",
|
||||
},
|
||||
},
|
||||
"required": ["content"],
|
||||
},
|
||||
"handler": _tool_remember,
|
||||
},
|
||||
{
|
||||
"name": "atocore_project_state",
|
||||
"description": (
|
||||
"Get Trusted Project State entries for a given project — the "
|
||||
"highest-trust tier with curated decisions, requirements, "
|
||||
"facts, contacts, milestones. Use this to look up authoritative "
|
||||
"project info."
|
||||
),
|
||||
"inputSchema": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"project": {"type": "string"},
|
||||
"category": {
|
||||
"type": "string",
|
||||
"enum": ["status", "decision", "requirement", "contact", "milestone", "fact", "config"],
|
||||
},
|
||||
},
|
||||
"required": ["project"],
|
||||
},
|
||||
"handler": _tool_project_state,
|
||||
},
|
||||
{
|
||||
"name": "atocore_projects",
|
||||
"description": "List all registered AtoCore projects (id + aliases).",
|
||||
"inputSchema": {"type": "object", "properties": {}},
|
||||
"handler": _tool_projects,
|
||||
},
|
||||
{
|
||||
"name": "atocore_health",
|
||||
"description": "Check AtoCore service health (build SHA, vector count, env).",
|
||||
"inputSchema": {"type": "object", "properties": {}},
|
||||
"handler": _tool_health,
|
||||
},
|
||||
# --- Phase 5H: Engineering knowledge graph tools ---
|
||||
{
|
||||
"name": "atocore_engineering_map",
|
||||
"description": (
|
||||
"Get the subsystem/component tree for an engineering project. "
|
||||
"Returns the full system architecture: subsystems, their components, "
|
||||
"materials, and any orphan components not attached to a subsystem. "
|
||||
"Use when the user asks about project structure or system design."
|
||||
),
|
||||
"inputSchema": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"project": {"type": "string", "description": "Project id (e.g. p04-gigabit)"},
|
||||
},
|
||||
"required": ["project"],
|
||||
},
|
||||
"handler": _tool_system_map,
|
||||
},
|
||||
{
|
||||
"name": "atocore_engineering_gaps",
|
||||
"description": (
|
||||
"Find coverage gaps in a project's engineering graph: orphan "
|
||||
"requirements (no component satisfies them), risky decisions "
|
||||
"(based on flagged assumptions), and unsupported claims (no "
|
||||
"Result evidence). This is the director's most useful query — "
|
||||
"answers 'what am I forgetting?' in seconds."
|
||||
),
|
||||
"inputSchema": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"project": {"type": "string"},
|
||||
},
|
||||
"required": ["project"],
|
||||
},
|
||||
"handler": _tool_gaps,
|
||||
},
|
||||
{
|
||||
"name": "atocore_engineering_requirements_for_component",
|
||||
"description": "List the requirements a specific component claims to satisfy (Q-005).",
|
||||
"inputSchema": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"component_id": {"type": "string"},
|
||||
},
|
||||
"required": ["component_id"],
|
||||
},
|
||||
"handler": _tool_requirements_for,
|
||||
},
|
||||
{
|
||||
"name": "atocore_engineering_decisions",
|
||||
"description": (
|
||||
"Decisions that affect a project, optionally scoped to a specific "
|
||||
"subsystem. Use when the user asks 'what did we decide about X?'"
|
||||
),
|
||||
"inputSchema": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"project": {"type": "string"},
|
||||
"subsystem_id": {"type": "string", "description": "optional subsystem entity id"},
|
||||
},
|
||||
"required": ["project"],
|
||||
},
|
||||
"handler": _tool_decisions_affecting,
|
||||
},
|
||||
{
|
||||
"name": "atocore_engineering_changes",
|
||||
"description": (
|
||||
"Recent changes to the engineering graph for a project: which "
|
||||
"entities were created/promoted/rejected/updated, by whom, when. "
|
||||
"Use for 'what changed recently?' type questions."
|
||||
),
|
||||
"inputSchema": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"project": {"type": "string"},
|
||||
"since": {"type": "string", "description": "ISO timestamp; optional"},
|
||||
"limit": {"type": "integer", "minimum": 1, "maximum": 200, "default": 20},
|
||||
},
|
||||
"required": ["project"],
|
||||
},
|
||||
"handler": _tool_recent_changes,
|
||||
},
|
||||
{
|
||||
"name": "atocore_engineering_impact",
|
||||
"description": (
|
||||
"Impact analysis: what's downstream of a given entity. BFS over "
|
||||
"outbound relationships up to max_depth. Use to answer 'what would "
|
||||
"break if I change X?'"
|
||||
),
|
||||
"inputSchema": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"entity_id": {"type": "string"},
|
||||
"max_depth": {"type": "integer", "minimum": 1, "maximum": 5, "default": 3},
|
||||
},
|
||||
"required": ["entity_id"],
|
||||
},
|
||||
"handler": _tool_impact,
|
||||
},
|
||||
{
|
||||
"name": "atocore_engineering_evidence",
|
||||
"description": (
|
||||
"Evidence chain for an entity: what supports it? Walks inbound "
|
||||
"SUPPORTS / EVIDENCED_BY / DESCRIBED_BY / VALIDATED_BY / ANALYZED_BY "
|
||||
"edges. Use for 'how do we know X is true?' type questions."
|
||||
),
|
||||
"inputSchema": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"entity_id": {"type": "string"},
|
||||
},
|
||||
"required": ["entity_id"],
|
||||
},
|
||||
"handler": _tool_evidence,
|
||||
},
|
||||
]
|
||||
|
||||
|
||||
# --- JSON-RPC handlers ---
|
||||
|
||||
def handle_initialize(params: dict) -> dict:
|
||||
return {
|
||||
"protocolVersion": PROTOCOL_VERSION,
|
||||
"capabilities": {
|
||||
"tools": {"listChanged": False},
|
||||
},
|
||||
"serverInfo": {"name": SERVER_NAME, "version": SERVER_VERSION},
|
||||
}
|
||||
|
||||
|
||||
def handle_tools_list(params: dict) -> dict:
|
||||
return {
|
||||
"tools": [
|
||||
{"name": t["name"], "description": t["description"], "inputSchema": t["inputSchema"]}
|
||||
for t in TOOLS
|
||||
]
|
||||
}
|
||||
|
||||
|
||||
def handle_tools_call(params: dict) -> dict:
|
||||
tool_name = params.get("name", "")
|
||||
args = params.get("arguments", {}) or {}
|
||||
tool = next((t for t in TOOLS if t["name"] == tool_name), None)
|
||||
if tool is None:
|
||||
return {
|
||||
"content": [{"type": "text", "text": f"Unknown tool: {tool_name}"}],
|
||||
"isError": True,
|
||||
}
|
||||
try:
|
||||
text = tool["handler"](args)
|
||||
except Exception as e:
|
||||
log(f"tool {tool_name} raised: {e}")
|
||||
return {
|
||||
"content": [{"type": "text", "text": f"Tool error: {type(e).__name__}: {e}"}],
|
||||
"isError": True,
|
||||
}
|
||||
return {"content": [{"type": "text", "text": text}]}
|
||||
|
||||
|
||||
def handle_ping(params: dict) -> dict:
|
||||
return {}
|
||||
|
||||
|
||||
METHODS = {
|
||||
"initialize": handle_initialize,
|
||||
"tools/list": handle_tools_list,
|
||||
"tools/call": handle_tools_call,
|
||||
"ping": handle_ping,
|
||||
}
|
||||
|
||||
|
||||
# --- stdio main loop ---
|
||||
|
||||
def send(obj: dict) -> None:
|
||||
"""Write a single-line JSON message to stdout and flush."""
|
||||
sys.stdout.write(json.dumps(obj, ensure_ascii=False) + "\n")
|
||||
sys.stdout.flush()
|
||||
|
||||
|
||||
def make_response(req_id, result=None, error=None) -> dict:
|
||||
resp = {"jsonrpc": "2.0", "id": req_id}
|
||||
if error is not None:
|
||||
resp["error"] = error
|
||||
else:
|
||||
resp["result"] = result if result is not None else {}
|
||||
return resp
|
||||
|
||||
|
||||
def main() -> int:
|
||||
log(f"starting (AtoCore at {ATOCORE_URL})")
|
||||
for line in sys.stdin:
|
||||
line = line.strip()
|
||||
if not line:
|
||||
continue
|
||||
try:
|
||||
msg = json.loads(line)
|
||||
except json.JSONDecodeError as e:
|
||||
log(f"parse error: {e}")
|
||||
continue
|
||||
|
||||
method = msg.get("method", "")
|
||||
req_id = msg.get("id")
|
||||
params = msg.get("params", {}) or {}
|
||||
|
||||
# Notifications (no id) don't need a response
|
||||
if req_id is None:
|
||||
if method == "notifications/initialized":
|
||||
log("client initialized")
|
||||
continue
|
||||
|
||||
handler = METHODS.get(method)
|
||||
if handler is None:
|
||||
send(make_response(req_id, error={
|
||||
"code": -32601,
|
||||
"message": f"Method not found: {method}",
|
||||
}))
|
||||
continue
|
||||
|
||||
try:
|
||||
result = handler(params)
|
||||
send(make_response(req_id, result=result))
|
||||
except Exception as e:
|
||||
log(f"handler {method} raised: {e}")
|
||||
send(make_response(req_id, error={
|
||||
"code": -32603,
|
||||
"message": f"Internal error: {type(e).__name__}: {e}",
|
||||
}))
|
||||
|
||||
log("stdin closed, exiting")
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
sys.exit(main())
|
||||
321
scripts/atocore_proxy.py
Normal file
321
scripts/atocore_proxy.py
Normal file
@@ -0,0 +1,321 @@
|
||||
#!/usr/bin/env python3
|
||||
"""AtoCore Proxy — OpenAI-compatible HTTP middleware.
|
||||
|
||||
Acts as a drop-in layer for any client that speaks the OpenAI Chat
|
||||
Completions API (Codex, Ollama, LiteLLM, custom agents). Sits between
|
||||
the client and the real model provider:
|
||||
|
||||
client -> atocore_proxy -> real_provider (OpenAI, Ollama, Anthropic, ...)
|
||||
|
||||
For each chat completion request:
|
||||
1. Extract the user's last message as the "query"
|
||||
2. Call AtoCore /context/build to get a context pack
|
||||
3. Inject the pack as a system message (or prepend to existing system)
|
||||
4. Forward the enriched request to the real provider
|
||||
5. Capture the full interaction back to AtoCore /interactions
|
||||
|
||||
Fail-open: if AtoCore is unreachable, the request passes through
|
||||
unchanged. If the real provider fails, the error is propagated to the
|
||||
client as-is.
|
||||
|
||||
Configuration (env vars):
|
||||
ATOCORE_URL AtoCore base URL (default http://dalidou:8100)
|
||||
ATOCORE_UPSTREAM real provider base URL (e.g. http://localhost:11434/v1 for Ollama)
|
||||
ATOCORE_PROXY_PORT port to listen on (default 11435)
|
||||
ATOCORE_PROXY_HOST bind address (default 127.0.0.1)
|
||||
ATOCORE_CLIENT_LABEL client id recorded in captures (default "proxy")
|
||||
ATOCORE_CAPTURE "1" to capture interactions back (default "1")
|
||||
ATOCORE_INJECT "1" to inject context (default "1")
|
||||
|
||||
Usage:
|
||||
# Proxy for Ollama:
|
||||
ATOCORE_UPSTREAM=http://localhost:11434/v1 python atocore_proxy.py
|
||||
|
||||
# Then point your client at http://localhost:11435/v1 instead of the
|
||||
# real provider.
|
||||
|
||||
Stdlib only — deliberate to keep the dependency footprint at zero.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import http.server
|
||||
import json
|
||||
import os
|
||||
import socketserver
|
||||
import sys
|
||||
import threading
|
||||
import urllib.error
|
||||
import urllib.parse
|
||||
import urllib.request
|
||||
from typing import Any
|
||||
|
||||
ATOCORE_URL = os.environ.get("ATOCORE_URL", "http://dalidou:8100").rstrip("/")
|
||||
UPSTREAM_URL = os.environ.get("ATOCORE_UPSTREAM", "").rstrip("/")
|
||||
PROXY_PORT = int(os.environ.get("ATOCORE_PROXY_PORT", "11435"))
|
||||
PROXY_HOST = os.environ.get("ATOCORE_PROXY_HOST", "127.0.0.1")
|
||||
CLIENT_LABEL = os.environ.get("ATOCORE_CLIENT_LABEL", "proxy")
|
||||
CAPTURE_ENABLED = os.environ.get("ATOCORE_CAPTURE", "1") == "1"
|
||||
INJECT_ENABLED = os.environ.get("ATOCORE_INJECT", "1") == "1"
|
||||
ATOCORE_TIMEOUT = float(os.environ.get("ATOCORE_TIMEOUT", "6"))
|
||||
UPSTREAM_TIMEOUT = float(os.environ.get("ATOCORE_UPSTREAM_TIMEOUT", "300"))
|
||||
|
||||
PROJECT_HINTS = [
|
||||
("p04-gigabit", ["p04", "gigabit"]),
|
||||
("p05-interferometer", ["p05", "interferometer"]),
|
||||
("p06-polisher", ["p06", "polisher", "fullum"]),
|
||||
("abb-space", ["abb"]),
|
||||
("atomizer-v2", ["atomizer"]),
|
||||
("atocore", ["atocore", "dalidou"]),
|
||||
]
|
||||
|
||||
|
||||
def log(msg: str) -> None:
|
||||
print(f"[atocore-proxy] {msg}", file=sys.stderr, flush=True)
|
||||
|
||||
|
||||
def detect_project(text: str) -> str:
|
||||
lower = (text or "").lower()
|
||||
for proj, tokens in PROJECT_HINTS:
|
||||
if any(t in lower for t in tokens):
|
||||
return proj
|
||||
return ""
|
||||
|
||||
|
||||
def get_last_user_message(body: dict) -> str:
|
||||
messages = body.get("messages", []) or []
|
||||
for m in reversed(messages):
|
||||
if m.get("role") == "user":
|
||||
content = m.get("content", "")
|
||||
if isinstance(content, list):
|
||||
# OpenAI multi-part content: extract text parts
|
||||
parts = [p.get("text", "") for p in content if p.get("type") == "text"]
|
||||
return "\n".join(parts)
|
||||
return str(content)
|
||||
return ""
|
||||
|
||||
|
||||
def get_assistant_text(response: dict) -> str:
|
||||
"""Extract assistant text from an OpenAI-style completion response."""
|
||||
choices = response.get("choices", []) or []
|
||||
if not choices:
|
||||
return ""
|
||||
msg = choices[0].get("message", {}) or {}
|
||||
content = msg.get("content", "")
|
||||
if isinstance(content, list):
|
||||
parts = [p.get("text", "") for p in content if p.get("type") == "text"]
|
||||
return "\n".join(parts)
|
||||
return str(content)
|
||||
|
||||
|
||||
def fetch_context(query: str, project: str) -> str:
|
||||
"""Pull a context pack from AtoCore. Returns '' on any failure."""
|
||||
if not INJECT_ENABLED or not query:
|
||||
return ""
|
||||
try:
|
||||
data = json.dumps({"prompt": query, "project": project}).encode("utf-8")
|
||||
req = urllib.request.Request(
|
||||
ATOCORE_URL + "/context/build",
|
||||
data=data,
|
||||
method="POST",
|
||||
headers={"Content-Type": "application/json"},
|
||||
)
|
||||
with urllib.request.urlopen(req, timeout=ATOCORE_TIMEOUT) as resp:
|
||||
result = json.loads(resp.read().decode("utf-8"))
|
||||
return result.get("formatted_context", "") or ""
|
||||
except Exception as e:
|
||||
log(f"context fetch failed: {type(e).__name__}: {e}")
|
||||
return ""
|
||||
|
||||
|
||||
def capture_interaction(prompt: str, response: str, project: str) -> None:
|
||||
"""POST the completed turn back to AtoCore. Fire-and-forget."""
|
||||
if not CAPTURE_ENABLED or not prompt or not response:
|
||||
return
|
||||
|
||||
def _post():
|
||||
try:
|
||||
data = json.dumps({
|
||||
"prompt": prompt,
|
||||
"response": response,
|
||||
"client": CLIENT_LABEL,
|
||||
"project": project,
|
||||
"reinforce": True,
|
||||
}).encode("utf-8")
|
||||
req = urllib.request.Request(
|
||||
ATOCORE_URL + "/interactions",
|
||||
data=data,
|
||||
method="POST",
|
||||
headers={"Content-Type": "application/json"},
|
||||
)
|
||||
urllib.request.urlopen(req, timeout=ATOCORE_TIMEOUT)
|
||||
except Exception as e:
|
||||
log(f"capture failed: {type(e).__name__}: {e}")
|
||||
|
||||
threading.Thread(target=_post, daemon=True).start()
|
||||
|
||||
|
||||
def inject_context(body: dict, context_pack: str) -> dict:
|
||||
"""Prepend the AtoCore context as a system message, or augment existing."""
|
||||
if not context_pack.strip():
|
||||
return body
|
||||
header = "--- AtoCore Context (auto-injected) ---\n"
|
||||
footer = "\n--- End AtoCore Context ---\n"
|
||||
injection = header + context_pack + footer
|
||||
|
||||
messages = list(body.get("messages", []) or [])
|
||||
if messages and messages[0].get("role") == "system":
|
||||
# Augment existing system message
|
||||
existing = messages[0].get("content", "") or ""
|
||||
if isinstance(existing, list):
|
||||
# multi-part: prepend a text part
|
||||
messages[0]["content"] = [{"type": "text", "text": injection}] + existing
|
||||
else:
|
||||
messages[0]["content"] = injection + "\n" + str(existing)
|
||||
else:
|
||||
messages.insert(0, {"role": "system", "content": injection})
|
||||
|
||||
body["messages"] = messages
|
||||
return body
|
||||
|
||||
|
||||
def forward_to_upstream(body: dict, headers: dict[str, str], path: str) -> tuple[int, dict]:
|
||||
"""Forward the enriched body to the upstream provider. Returns (status, response_dict)."""
|
||||
if not UPSTREAM_URL:
|
||||
return 503, {"error": {"message": "ATOCORE_UPSTREAM not configured"}}
|
||||
url = UPSTREAM_URL + path
|
||||
data = json.dumps(body).encode("utf-8")
|
||||
# Strip hop-by-hop / host-specific headers
|
||||
fwd_headers = {"Content-Type": "application/json"}
|
||||
for k, v in headers.items():
|
||||
lk = k.lower()
|
||||
if lk in ("authorization", "x-api-key", "anthropic-version"):
|
||||
fwd_headers[k] = v
|
||||
req = urllib.request.Request(url, data=data, method="POST", headers=fwd_headers)
|
||||
try:
|
||||
with urllib.request.urlopen(req, timeout=UPSTREAM_TIMEOUT) as resp:
|
||||
return resp.status, json.loads(resp.read().decode("utf-8"))
|
||||
except urllib.error.HTTPError as e:
|
||||
try:
|
||||
body_bytes = e.read()
|
||||
payload = json.loads(body_bytes.decode("utf-8"))
|
||||
except Exception:
|
||||
payload = {"error": {"message": f"upstream HTTP {e.code}"}}
|
||||
return e.code, payload
|
||||
except Exception as e:
|
||||
log(f"upstream error: {e}")
|
||||
return 502, {"error": {"message": f"upstream unreachable: {e}"}}
|
||||
|
||||
|
||||
class ProxyHandler(http.server.BaseHTTPRequestHandler):
|
||||
# Silence default request logging (we log what matters ourselves)
|
||||
def log_message(self, format: str, *args: Any) -> None:
|
||||
pass
|
||||
|
||||
def _read_body(self) -> dict:
|
||||
length = int(self.headers.get("Content-Length", "0") or "0")
|
||||
if length <= 0:
|
||||
return {}
|
||||
raw = self.rfile.read(length)
|
||||
try:
|
||||
return json.loads(raw.decode("utf-8"))
|
||||
except Exception:
|
||||
return {}
|
||||
|
||||
def _send_json(self, status: int, payload: dict) -> None:
|
||||
body = json.dumps(payload).encode("utf-8")
|
||||
self.send_response(status)
|
||||
self.send_header("Content-Type", "application/json")
|
||||
self.send_header("Content-Length", str(len(body)))
|
||||
self.send_header("Access-Control-Allow-Origin", "*")
|
||||
self.end_headers()
|
||||
self.wfile.write(body)
|
||||
|
||||
def do_OPTIONS(self) -> None: # CORS preflight
|
||||
self.send_response(204)
|
||||
self.send_header("Access-Control-Allow-Origin", "*")
|
||||
self.send_header("Access-Control-Allow-Methods", "POST, GET, OPTIONS")
|
||||
self.send_header("Access-Control-Allow-Headers", "Content-Type, Authorization, X-API-Key")
|
||||
self.end_headers()
|
||||
|
||||
def do_GET(self) -> None:
|
||||
parsed = urllib.parse.urlparse(self.path)
|
||||
if parsed.path == "/healthz":
|
||||
self._send_json(200, {
|
||||
"status": "ok",
|
||||
"atocore": ATOCORE_URL,
|
||||
"upstream": UPSTREAM_URL or "(not configured)",
|
||||
"inject": INJECT_ENABLED,
|
||||
"capture": CAPTURE_ENABLED,
|
||||
})
|
||||
return
|
||||
# Pass through GET to upstream (model listing etc)
|
||||
if not UPSTREAM_URL:
|
||||
self._send_json(503, {"error": {"message": "ATOCORE_UPSTREAM not configured"}})
|
||||
return
|
||||
try:
|
||||
req = urllib.request.Request(UPSTREAM_URL + parsed.path + (f"?{parsed.query}" if parsed.query else ""))
|
||||
for k in ("Authorization", "X-API-Key"):
|
||||
v = self.headers.get(k)
|
||||
if v:
|
||||
req.add_header(k, v)
|
||||
with urllib.request.urlopen(req, timeout=UPSTREAM_TIMEOUT) as resp:
|
||||
data = resp.read()
|
||||
self.send_response(resp.status)
|
||||
self.send_header("Content-Type", resp.headers.get("Content-Type", "application/json"))
|
||||
self.send_header("Content-Length", str(len(data)))
|
||||
self.end_headers()
|
||||
self.wfile.write(data)
|
||||
except Exception as e:
|
||||
self._send_json(502, {"error": {"message": f"upstream error: {e}"}})
|
||||
|
||||
def do_POST(self) -> None:
|
||||
parsed = urllib.parse.urlparse(self.path)
|
||||
body = self._read_body()
|
||||
|
||||
# Only enrich chat completions; other endpoints pass through
|
||||
if parsed.path.endswith("/chat/completions") or parsed.path == "/v1/chat/completions":
|
||||
prompt = get_last_user_message(body)
|
||||
project = detect_project(prompt)
|
||||
context = fetch_context(prompt, project) if prompt else ""
|
||||
if context:
|
||||
log(f"inject: project={project or '(none)'} chars={len(context)}")
|
||||
body = inject_context(body, context)
|
||||
|
||||
status, response = forward_to_upstream(body, dict(self.headers), parsed.path)
|
||||
self._send_json(status, response)
|
||||
|
||||
if status == 200:
|
||||
assistant_text = get_assistant_text(response)
|
||||
capture_interaction(prompt, assistant_text, project)
|
||||
else:
|
||||
# Non-chat endpoints (embeddings, completions, etc.) — pure passthrough
|
||||
status, response = forward_to_upstream(body, dict(self.headers), parsed.path)
|
||||
self._send_json(status, response)
|
||||
|
||||
|
||||
class ThreadedServer(socketserver.ThreadingMixIn, http.server.HTTPServer):
|
||||
daemon_threads = True
|
||||
allow_reuse_address = True
|
||||
|
||||
|
||||
def main() -> int:
|
||||
if not UPSTREAM_URL:
|
||||
log("WARNING: ATOCORE_UPSTREAM not set. Chat completions will fail.")
|
||||
log("Example: ATOCORE_UPSTREAM=http://localhost:11434/v1 for Ollama")
|
||||
server = ThreadedServer((PROXY_HOST, PROXY_PORT), ProxyHandler)
|
||||
log(f"listening on {PROXY_HOST}:{PROXY_PORT}")
|
||||
log(f"AtoCore: {ATOCORE_URL} inject={INJECT_ENABLED} capture={CAPTURE_ENABLED}")
|
||||
log(f"Upstream: {UPSTREAM_URL or '(not configured)'}")
|
||||
log(f"Client label: {CLIENT_LABEL}")
|
||||
log("Ready. Point your OpenAI-compatible client at /v1/chat/completions")
|
||||
try:
|
||||
server.serve_forever()
|
||||
except KeyboardInterrupt:
|
||||
log("stopping")
|
||||
server.server_close()
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
sys.exit(main())
|
||||
@@ -29,25 +29,61 @@ import os
|
||||
import shutil
|
||||
import subprocess
|
||||
import sys
|
||||
import time
|
||||
import tempfile
|
||||
import urllib.error
|
||||
import urllib.parse
|
||||
import urllib.request
|
||||
|
||||
DEFAULT_BASE_URL = os.environ.get("ATOCORE_BASE_URL", "http://localhost:8100")
|
||||
DEFAULT_MODEL = os.environ.get("ATOCORE_TRIAGE_MODEL", "sonnet")
|
||||
|
||||
# 3-tier escalation config (Phase "Triage Quality")
|
||||
TIER1_MODEL = os.environ.get("ATOCORE_TRIAGE_MODEL_TIER1",
|
||||
os.environ.get("ATOCORE_TRIAGE_MODEL", "sonnet"))
|
||||
TIER2_MODEL = os.environ.get("ATOCORE_TRIAGE_MODEL_TIER2", "opus")
|
||||
# Tier 3: default "discard" (auto-reject uncertain after opus disagrees/wavers),
|
||||
# alternative "human" routes them to /admin/triage.
|
||||
TIER3_ACTION = os.environ.get("ATOCORE_TRIAGE_TIER3", "discard").lower()
|
||||
DEFAULT_TIMEOUT_S = float(os.environ.get("ATOCORE_TRIAGE_TIMEOUT_S", "60"))
|
||||
TIER2_TIMEOUT_S = float(os.environ.get("ATOCORE_TRIAGE_TIER2_TIMEOUT_S", "120"))
|
||||
AUTO_PROMOTE_MIN_CONFIDENCE = 0.8
|
||||
# Below this, tier 1 decision is "not confident enough" and we escalate
|
||||
ESCALATION_CONFIDENCE_THRESHOLD = float(
|
||||
os.environ.get("ATOCORE_TRIAGE_ESCALATION_THRESHOLD", "0.75")
|
||||
)
|
||||
|
||||
# Kept for legacy callers that reference DEFAULT_MODEL
|
||||
DEFAULT_MODEL = TIER1_MODEL
|
||||
|
||||
TRIAGE_SYSTEM_PROMPT = """You are a memory triage reviewer for a personal context engine called AtoCore. You review candidate memories extracted from LLM conversations and decide whether each should be promoted to active status, rejected, or flagged for human review.
|
||||
|
||||
You will receive:
|
||||
- The candidate memory content and type
|
||||
- A list of existing active memories for the same project (to check for duplicates)
|
||||
- The candidate memory content, type, and claimed project
|
||||
- A list of existing active memories for the same project (to check for duplicates + contradictions)
|
||||
- Trusted project state entries (curated ground truth — higher trust than memories)
|
||||
- Known project ids so you can flag misattribution
|
||||
|
||||
For each candidate, output exactly one JSON object:
|
||||
|
||||
{"verdict": "promote|reject|needs_human|contradicts", "confidence": 0.0-1.0, "reason": "one sentence", "conflicts_with": "id of existing memory if contradicts"}
|
||||
{"verdict": "promote|reject|needs_human|contradicts", "confidence": 0.0-1.0, "reason": "one sentence", "conflicts_with": "id of existing memory if contradicts", "domain_tags": ["tag1","tag2"], "valid_until": null}
|
||||
|
||||
DOMAIN TAGS (Phase 3): A lowercase list of 2-5 topical keywords describing
|
||||
the SUBJECT matter (not the project). This enables cross-project retrieval:
|
||||
a query about "optics" can pull matches from p04 + p05 + p06.
|
||||
|
||||
Good tags are single lowercase words or hyphenated terms. Mix:
|
||||
- domain keywords (optics, thermal, firmware, materials, controls)
|
||||
- project tokens when clearly scoped (p04, p05, p06, abb)
|
||||
- lifecycle/activity words (procurement, design, validation, vendor)
|
||||
|
||||
Always emit domain_tags on a promote. For reject, empty list is fine.
|
||||
|
||||
VALID_UNTIL (Phase 3): ISO date "YYYY-MM-DD" OR null (permanent).
|
||||
Set to a near-future date when the candidate is time-bounded:
|
||||
- Status snapshots ("current blocker is X") → ~2 weeks out
|
||||
- Scheduled events ("meeting Friday") → event date
|
||||
- Quotes with expiry → quote expiry date
|
||||
Leave null for durable decisions, engineering insights, ratified requirements.
|
||||
|
||||
Rules:
|
||||
|
||||
@@ -65,9 +101,24 @@ Rules:
|
||||
|
||||
4. OPENCLAW-CURATED content (candidate content starts with "From OpenClaw/"): apply a MUCH LOWER bar. OpenClaw's SOUL.md, USER.md, MEMORY.md, MODEL-ROUTING.md, and dated memory/*.md files are ALREADY curated by OpenClaw as canonical continuity. Promote unless clearly wrong or a genuine duplicate. Do NOT reject OpenClaw content as "process rule belongs elsewhere" or "session log" — that's exactly what AtoCore wants to absorb. Session events, project updates, stakeholder notes, and decisions from OpenClaw daily memory files ARE valuable context and should promote.
|
||||
|
||||
5. NEEDS_HUMAN when you're genuinely unsure — the candidate might be valuable but you can't tell without domain knowledge. This should be rare (< 20% of candidates).
|
||||
5. NEEDS_HUMAN when you're genuinely unsure — the candidate might be valuable but you can't tell without domain knowledge. This should be rare (< 20% of candidates). If this is just noise/filler, prefer REJECT with low confidence.
|
||||
|
||||
6. Output ONLY the JSON object. No prose, no markdown, no explanation outside the reason field."""
|
||||
6. PROJECT VALIDATION: The candidate has a "claimed project". You'll see the list of registered project ids. If the claimed project doesn't match any registered id AND the content clearly belongs to a registered project, include "suggested_project": "<correct_id>" in your output so the caller can auto-fix the attribution. If the content is genuinely cross-project or global, leave project empty (suggested_project=""). Misattribution is the #1 pollution source — flag it.
|
||||
|
||||
7. TEMPORAL SENSITIVITY: Be aggressive with valid_until for anything that reads like "current state", "right now", "this week", "as of". Stale facts pollute context. When in doubt, set a 2-4 week expiry rather than null.
|
||||
|
||||
8. CONFIDENCE GRADING:
|
||||
- 0.9+: crystal clear durable fact or clear noise
|
||||
- 0.75-0.9: confident but not cryptographic-certain
|
||||
- 0.6-0.75: borderline — will escalate to opus for second opinion
|
||||
- <0.6: genuinely ambiguous — needs human or will be discarded
|
||||
|
||||
9. Output ONLY the JSON object. No prose, no markdown, no explanation outside the reason field. Include optional "suggested_project" field when misattribution detected."""
|
||||
|
||||
|
||||
TIER2_SECOND_OPINION_PROMPT = TRIAGE_SYSTEM_PROMPT + """
|
||||
|
||||
ESCALATED REVIEW: You are seeing this candidate because the tier-1 (sonnet) reviewer could not decide confidently. You will be shown tier-1's verdict + reason as additional context. Your job is to resolve the uncertainty with more careful thinking. Use your full context window to cross-reference the existing memories. If you ALSO cannot decide with confidence >= 0.8, output verdict="needs_human" with a clear explanation of what information would break the tie. That signal will route to a human (or auto-discard, depending on config)."""
|
||||
|
||||
_sandbox_cwd = None
|
||||
|
||||
@@ -104,33 +155,78 @@ def fetch_active_memories_for_project(base_url, project):
|
||||
return result.get("memories", [])
|
||||
|
||||
|
||||
def triage_one(candidate, active_memories, model, timeout_s):
|
||||
"""Ask the triage model to classify one candidate."""
|
||||
if not shutil.which("claude"):
|
||||
return {"verdict": "needs_human", "confidence": 0.0, "reason": "claude CLI not available"}
|
||||
def fetch_project_state(base_url, project):
|
||||
"""Fetch trusted project state for ground-truth context."""
|
||||
if not project:
|
||||
return []
|
||||
try:
|
||||
result = api_get(base_url, f"/project/state/{urllib.parse.quote(project)}")
|
||||
return result.get("entries", result.get("state", []))
|
||||
except Exception:
|
||||
return []
|
||||
|
||||
|
||||
def fetch_registered_projects(base_url):
|
||||
"""Return list of registered project ids + aliases for misattribution check."""
|
||||
try:
|
||||
result = api_get(base_url, "/projects")
|
||||
projects = result.get("projects", [])
|
||||
out = {}
|
||||
for p in projects:
|
||||
pid = p.get("project_id") or p.get("id") or p.get("name")
|
||||
if pid:
|
||||
out[pid] = p.get("aliases", []) or []
|
||||
return out
|
||||
except Exception:
|
||||
return {}
|
||||
|
||||
|
||||
def build_triage_user_message(candidate, active_memories, project_state, known_projects):
|
||||
"""Richer context for the triage model: memories + state + project registry."""
|
||||
active_summary = "\n".join(
|
||||
f"- [{m['memory_type']}] {m['content'][:150]}"
|
||||
for m in active_memories[:20]
|
||||
f"- [{m['memory_type']}] {m['content'][:200]}"
|
||||
for m in active_memories[:30]
|
||||
) or "(no active memories for this project)"
|
||||
|
||||
user_message = (
|
||||
state_summary = ""
|
||||
if project_state:
|
||||
lines = []
|
||||
for e in project_state[:20]:
|
||||
cat = e.get("category", "?")
|
||||
key = e.get("key", "?")
|
||||
val = (e.get("value") or "")[:200]
|
||||
lines.append(f"- [{cat}/{key}] {val}")
|
||||
state_summary = "\n".join(lines)
|
||||
else:
|
||||
state_summary = "(no trusted state entries for this project)"
|
||||
|
||||
projects_line = ", ".join(sorted(known_projects.keys())) if known_projects else "(none)"
|
||||
|
||||
return (
|
||||
f"CANDIDATE TO TRIAGE:\n"
|
||||
f" type: {candidate['memory_type']}\n"
|
||||
f" project: {candidate.get('project') or '(none)'}\n"
|
||||
f" claimed project: {candidate.get('project') or '(none)'}\n"
|
||||
f" content: {candidate['content']}\n\n"
|
||||
f"REGISTERED PROJECT IDS: {projects_line}\n\n"
|
||||
f"TRUSTED PROJECT STATE (ground truth, higher trust than memories):\n{state_summary}\n\n"
|
||||
f"EXISTING ACTIVE MEMORIES FOR THIS PROJECT:\n{active_summary}\n\n"
|
||||
f"Return the JSON verdict now."
|
||||
)
|
||||
|
||||
|
||||
def _call_claude(system_prompt, user_message, model, timeout_s):
|
||||
"""Shared CLI caller with retry + stderr capture."""
|
||||
args = [
|
||||
"claude", "-p",
|
||||
"--model", model,
|
||||
"--append-system-prompt", TRIAGE_SYSTEM_PROMPT,
|
||||
"--append-system-prompt", system_prompt,
|
||||
"--disable-slash-commands",
|
||||
user_message,
|
||||
]
|
||||
|
||||
last_error = ""
|
||||
for attempt in range(3):
|
||||
if attempt > 0:
|
||||
time.sleep(2 ** attempt)
|
||||
try:
|
||||
completed = subprocess.run(
|
||||
args, capture_output=True, text=True,
|
||||
@@ -138,14 +234,50 @@ def triage_one(candidate, active_memories, model, timeout_s):
|
||||
encoding="utf-8", errors="replace",
|
||||
)
|
||||
except subprocess.TimeoutExpired:
|
||||
return {"verdict": "needs_human", "confidence": 0.0, "reason": "triage model timed out"}
|
||||
last_error = f"{model} timed out"
|
||||
continue
|
||||
except Exception as exc:
|
||||
return {"verdict": "needs_human", "confidence": 0.0, "reason": f"subprocess error: {exc}"}
|
||||
last_error = f"subprocess error: {exc}"
|
||||
continue
|
||||
|
||||
if completed.returncode != 0:
|
||||
return {"verdict": "needs_human", "confidence": 0.0, "reason": f"claude exit {completed.returncode}"}
|
||||
if completed.returncode == 0:
|
||||
return (completed.stdout or "").strip(), None
|
||||
|
||||
raw = (completed.stdout or "").strip()
|
||||
stderr = (completed.stderr or "").strip()[:200]
|
||||
last_error = f"{model} exit {completed.returncode}: {stderr}" if stderr else f"{model} exit {completed.returncode}"
|
||||
return None, last_error
|
||||
|
||||
|
||||
def triage_one(candidate, active_memories, project_state, known_projects, model, timeout_s):
|
||||
"""Tier-1 triage: ask the cheap model for a verdict."""
|
||||
if not shutil.which("claude"):
|
||||
return {"verdict": "needs_human", "confidence": 0.0, "reason": "claude CLI not available"}
|
||||
|
||||
user_message = build_triage_user_message(candidate, active_memories, project_state, known_projects)
|
||||
raw, err = _call_claude(TRIAGE_SYSTEM_PROMPT, user_message, model, timeout_s)
|
||||
if err:
|
||||
return {"verdict": "needs_human", "confidence": 0.0, "reason": err}
|
||||
return parse_verdict(raw)
|
||||
|
||||
|
||||
def triage_escalation(candidate, tier1_verdict, active_memories, project_state, known_projects, model, timeout_s):
|
||||
"""Tier-2 escalation: opus sees tier-1's verdict + reasoning, tries again."""
|
||||
if not shutil.which("claude"):
|
||||
return {"verdict": "needs_human", "confidence": 0.0, "reason": "claude CLI not available"}
|
||||
|
||||
base_msg = build_triage_user_message(candidate, active_memories, project_state, known_projects)
|
||||
tier1_context = (
|
||||
f"\nTIER-1 REVIEW (sonnet, for your reference):\n"
|
||||
f" verdict: {tier1_verdict.get('verdict')}\n"
|
||||
f" confidence: {tier1_verdict.get('confidence', 0.0):.2f}\n"
|
||||
f" reason: {tier1_verdict.get('reason', '')[:300]}\n\n"
|
||||
f"Resolve the uncertainty. If you also can't decide with confidence ≥ 0.8, "
|
||||
f"return verdict='needs_human' with a specific explanation of what information "
|
||||
f"would break the tie.\n\nReturn the JSON verdict now."
|
||||
)
|
||||
raw, err = _call_claude(TIER2_SECOND_OPINION_PROMPT, base_msg + tier1_context, model, timeout_s)
|
||||
if err:
|
||||
return {"verdict": "needs_human", "confidence": 0.0, "reason": f"tier2: {err}"}
|
||||
return parse_verdict(raw)
|
||||
|
||||
|
||||
@@ -184,81 +316,235 @@ def parse_verdict(raw):
|
||||
|
||||
reason = str(parsed.get("reason", "")).strip()[:200]
|
||||
conflicts_with = str(parsed.get("conflicts_with", "")).strip()
|
||||
|
||||
# Phase 3: domain tags + expiry
|
||||
raw_tags = parsed.get("domain_tags") or []
|
||||
if isinstance(raw_tags, str):
|
||||
raw_tags = [t.strip() for t in raw_tags.split(",") if t.strip()]
|
||||
if not isinstance(raw_tags, list):
|
||||
raw_tags = []
|
||||
domain_tags = []
|
||||
for t in raw_tags[:10]:
|
||||
if not isinstance(t, str):
|
||||
continue
|
||||
tag = t.strip().lower()
|
||||
if tag and tag not in domain_tags:
|
||||
domain_tags.append(tag)
|
||||
|
||||
valid_until = parsed.get("valid_until")
|
||||
if valid_until is None:
|
||||
valid_until = ""
|
||||
else:
|
||||
valid_until = str(valid_until).strip()
|
||||
if valid_until.lower() in ("", "null", "none", "permanent"):
|
||||
valid_until = ""
|
||||
|
||||
# Triage Quality: project misattribution flag
|
||||
suggested_project = str(parsed.get("suggested_project", "")).strip()
|
||||
|
||||
return {
|
||||
"verdict": verdict,
|
||||
"confidence": confidence,
|
||||
"reason": reason,
|
||||
"conflicts_with": conflicts_with,
|
||||
"domain_tags": domain_tags,
|
||||
"valid_until": valid_until,
|
||||
"suggested_project": suggested_project,
|
||||
}
|
||||
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser(description="Auto-triage candidate memories")
|
||||
parser.add_argument("--base-url", default=DEFAULT_BASE_URL)
|
||||
parser.add_argument("--model", default=DEFAULT_MODEL)
|
||||
parser.add_argument("--dry-run", action="store_true", help="preview without executing")
|
||||
args = parser.parse_args()
|
||||
def _apply_metadata_update(base_url, mid, verdict_obj):
|
||||
"""Persist tags + valid_until + suggested_project before the promote call."""
|
||||
tags = verdict_obj.get("domain_tags") or []
|
||||
valid_until = verdict_obj.get("valid_until") or ""
|
||||
suggested = verdict_obj.get("suggested_project") or ""
|
||||
|
||||
# Fetch candidates
|
||||
result = api_get(args.base_url, "/memory?status=candidate&limit=100")
|
||||
candidates = result.get("memories", [])
|
||||
print(f"candidates: {len(candidates)} model: {args.model} dry_run: {args.dry_run}")
|
||||
|
||||
if not candidates:
|
||||
print("queue empty, nothing to triage")
|
||||
body = {}
|
||||
if tags:
|
||||
body["domain_tags"] = tags
|
||||
if valid_until:
|
||||
body["valid_until"] = valid_until
|
||||
if not body and not suggested:
|
||||
return
|
||||
|
||||
# Cache active memories per project for dedup
|
||||
active_cache = {}
|
||||
promoted = rejected = needs_human = errors = 0
|
||||
if body:
|
||||
try:
|
||||
import urllib.request as _ur
|
||||
req = _ur.Request(
|
||||
f"{base_url}/memory/{mid}", method="PUT",
|
||||
headers={"Content-Type": "application/json"},
|
||||
data=json.dumps(body).encode("utf-8"),
|
||||
)
|
||||
_ur.urlopen(req, timeout=10).read()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
for i, cand in enumerate(candidates, 1):
|
||||
# Project auto-fix via direct SQLite update would bypass audit; use PUT if supported.
|
||||
# For now we log the suggestion — operator script can apply it in batch.
|
||||
if suggested:
|
||||
# noop here — handled by caller which tracks suggested_project_fixes
|
||||
pass
|
||||
|
||||
|
||||
def process_candidate(cand, base_url, active_cache, state_cache, known_projects, dry_run):
|
||||
"""Run the 3-tier triage and apply the resulting action.
|
||||
|
||||
Returns (action, note) where action in {promote, reject, discard, human, error}.
|
||||
"""
|
||||
mid = cand["id"]
|
||||
project = cand.get("project") or ""
|
||||
if project not in active_cache:
|
||||
active_cache[project] = fetch_active_memories_for_project(args.base_url, project)
|
||||
active_cache[project] = fetch_active_memories_for_project(base_url, project)
|
||||
if project not in state_cache:
|
||||
state_cache[project] = fetch_project_state(base_url, project)
|
||||
|
||||
verdict_obj = triage_one(cand, active_cache[project], args.model, DEFAULT_TIMEOUT_S)
|
||||
# === Tier 1 ===
|
||||
v1 = triage_one(
|
||||
cand, active_cache[project], state_cache[project],
|
||||
known_projects, TIER1_MODEL, DEFAULT_TIMEOUT_S,
|
||||
)
|
||||
|
||||
# Project misattribution fix: suggested_project surfaces from tier 1
|
||||
suggested = (v1.get("suggested_project") or "").strip()
|
||||
if suggested and suggested != project and suggested in known_projects:
|
||||
# Try to re-canonicalize the memory's project
|
||||
if not dry_run:
|
||||
try:
|
||||
import urllib.request as _ur
|
||||
req = _ur.Request(
|
||||
f"{base_url}/memory/{mid}", method="PUT",
|
||||
headers={"Content-Type": "application/json"},
|
||||
data=json.dumps({"content": cand["content"]}).encode("utf-8"),
|
||||
)
|
||||
_ur.urlopen(req, timeout=10).read() # triggers canonicalization via update
|
||||
except Exception:
|
||||
pass
|
||||
print(f" ↺ misattribution flagged: {project!r} → {suggested!r}")
|
||||
|
||||
# High-confidence tier 1 decision → act
|
||||
if v1["verdict"] in ("promote", "reject") and v1["confidence"] >= AUTO_PROMOTE_MIN_CONFIDENCE:
|
||||
return _apply_verdict(v1, cand, base_url, active_cache, dry_run, tier="sonnet")
|
||||
|
||||
# Borderline or uncertain → escalate to tier 2 (opus)
|
||||
print(f" ↑ escalating (tier1 verdict={v1['verdict']} conf={v1['confidence']:.2f})")
|
||||
v2 = triage_escalation(
|
||||
cand, v1, active_cache[project], state_cache[project],
|
||||
known_projects, TIER2_MODEL, TIER2_TIMEOUT_S,
|
||||
)
|
||||
|
||||
# Tier 2 is confident → act
|
||||
if v2["verdict"] in ("promote", "reject") and v2["confidence"] >= AUTO_PROMOTE_MIN_CONFIDENCE:
|
||||
return _apply_verdict(v2, cand, base_url, active_cache, dry_run, tier="opus")
|
||||
|
||||
# Tier 3: still uncertain — route per config
|
||||
if TIER3_ACTION == "discard":
|
||||
reason = f"tier1+tier2 uncertain: {v2.get('reason', '')[:150]}"
|
||||
if dry_run:
|
||||
return ("discard", reason)
|
||||
try:
|
||||
api_post(base_url, f"/memory/{mid}/reject")
|
||||
except Exception:
|
||||
return ("error", reason)
|
||||
return ("discard", reason)
|
||||
else:
|
||||
# "human" — leave in queue for /admin/triage review
|
||||
return ("human", v2.get("reason", "no reason")[:200])
|
||||
|
||||
|
||||
def _apply_verdict(verdict_obj, cand, base_url, active_cache, dry_run, tier):
|
||||
"""Execute the promote/reject action and update metadata."""
|
||||
mid = cand["id"]
|
||||
verdict = verdict_obj["verdict"]
|
||||
conf = verdict_obj["confidence"]
|
||||
reason = verdict_obj["reason"]
|
||||
conflicts_with = verdict_obj.get("conflicts_with", "")
|
||||
reason = f"[{tier}] {verdict_obj['reason']}"
|
||||
|
||||
if verdict == "promote":
|
||||
if dry_run:
|
||||
return ("promote", reason)
|
||||
_apply_metadata_update(base_url, mid, verdict_obj)
|
||||
try:
|
||||
api_post(base_url, f"/memory/{mid}/promote")
|
||||
project = cand.get("project") or ""
|
||||
if project in active_cache:
|
||||
active_cache[project].append(cand)
|
||||
return ("promote", reason)
|
||||
except Exception as e:
|
||||
return ("error", f"promote failed: {e}")
|
||||
else:
|
||||
if dry_run:
|
||||
return ("reject", reason)
|
||||
try:
|
||||
api_post(base_url, f"/memory/{mid}/reject")
|
||||
return ("reject", reason)
|
||||
except Exception as e:
|
||||
return ("error", f"reject failed: {e}")
|
||||
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser(description="Auto-triage candidate memories (3-tier escalation)")
|
||||
parser.add_argument("--base-url", default=DEFAULT_BASE_URL)
|
||||
parser.add_argument("--dry-run", action="store_true", help="preview without executing")
|
||||
parser.add_argument("--max-batches", type=int, default=20,
|
||||
help="Max batches of 100 to process per run")
|
||||
parser.add_argument("--no-escalation", action="store_true",
|
||||
help="Disable tier-2 escalation (legacy single-model behavior)")
|
||||
args = parser.parse_args()
|
||||
|
||||
seen_ids: set[str] = set()
|
||||
active_cache: dict[str, list] = {}
|
||||
state_cache: dict[str, list] = {}
|
||||
|
||||
known_projects = fetch_registered_projects(args.base_url)
|
||||
print(f"Registered projects: {sorted(known_projects.keys())}")
|
||||
print(f"Tier1: {TIER1_MODEL} Tier2: {TIER2_MODEL} Tier3: {TIER3_ACTION} "
|
||||
f"escalation_threshold: {ESCALATION_CONFIDENCE_THRESHOLD}")
|
||||
|
||||
counts = {"promote": 0, "reject": 0, "discard": 0, "human": 0, "error": 0}
|
||||
batch_num = 0
|
||||
|
||||
while batch_num < args.max_batches:
|
||||
batch_num += 1
|
||||
result = api_get(args.base_url, "/memory?status=candidate&limit=100")
|
||||
all_candidates = result.get("memories", [])
|
||||
candidates = [c for c in all_candidates if c["id"] not in seen_ids]
|
||||
|
||||
if not candidates:
|
||||
if batch_num == 1:
|
||||
print("queue empty, nothing to triage")
|
||||
else:
|
||||
print(f"\nQueue drained after batch {batch_num-1}.")
|
||||
break
|
||||
|
||||
print(f"\n=== batch {batch_num}: {len(candidates)} candidates dry_run: {args.dry_run} ===")
|
||||
|
||||
for i, cand in enumerate(candidates, 1):
|
||||
if i > 1:
|
||||
time.sleep(0.5)
|
||||
seen_ids.add(cand["id"])
|
||||
mid = cand["id"]
|
||||
label = f"[{i:2d}/{len(candidates)}] {mid[:8]} [{cand['memory_type']}]"
|
||||
|
||||
if verdict == "promote" and conf >= AUTO_PROMOTE_MIN_CONFIDENCE:
|
||||
if args.dry_run:
|
||||
print(f" WOULD PROMOTE {label} conf={conf:.2f} {reason}")
|
||||
else:
|
||||
try:
|
||||
api_post(args.base_url, f"/memory/{mid}/promote")
|
||||
print(f" PROMOTED {label} conf={conf:.2f} {reason}")
|
||||
active_cache[project].append(cand)
|
||||
except Exception:
|
||||
errors += 1
|
||||
promoted += 1
|
||||
elif verdict == "reject":
|
||||
if args.dry_run:
|
||||
print(f" WOULD REJECT {label} conf={conf:.2f} {reason}")
|
||||
else:
|
||||
try:
|
||||
api_post(args.base_url, f"/memory/{mid}/reject")
|
||||
print(f" REJECTED {label} conf={conf:.2f} {reason}")
|
||||
except Exception:
|
||||
errors += 1
|
||||
rejected += 1
|
||||
elif verdict == "contradicts":
|
||||
# Leave candidate in queue but flag the conflict in content
|
||||
# so the wiki/triage shows it. This is conservative: we
|
||||
# don't silently merge or reject when sources disagree.
|
||||
print(f" CONTRADICTS {label} vs {conflicts_with[:8] if conflicts_with else '?'} {reason}")
|
||||
contradicts_count = locals().get('contradicts_count', 0) + 1
|
||||
needs_human += 1
|
||||
else:
|
||||
print(f" NEEDS_HUMAN {label} conf={conf:.2f} {reason}")
|
||||
needs_human += 1
|
||||
action, note = process_candidate(
|
||||
cand, args.base_url, active_cache, state_cache,
|
||||
known_projects, args.dry_run,
|
||||
)
|
||||
except Exception as e:
|
||||
action, note = ("error", f"exception: {e}")
|
||||
|
||||
print(f"\npromoted={promoted} rejected={rejected} needs_human={needs_human} errors={errors}")
|
||||
counts[action] = counts.get(action, 0) + 1
|
||||
verb = {"promote": "PROMOTED ", "reject": "REJECTED ",
|
||||
"discard": "DISCARDED ", "human": "NEEDS_HUM ",
|
||||
"error": "ERROR "}.get(action, action.upper())
|
||||
if args.dry_run and action in ("promote", "reject", "discard"):
|
||||
verb = "WOULD " + verb.strip()
|
||||
print(f" {verb} {label} {note[:120]}")
|
||||
|
||||
print(
|
||||
f"\ntotal: promoted={counts['promote']} rejected={counts['reject']} "
|
||||
f"discarded={counts['discard']} human={counts['human']} errors={counts['error']} "
|
||||
f"batches={batch_num}"
|
||||
)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
|
||||
@@ -126,6 +126,12 @@ def extract_one(prompt, response, project, model, timeout_s):
|
||||
user_message,
|
||||
]
|
||||
|
||||
# Retry with exponential backoff on transient failures (rate limits etc)
|
||||
import time as _time
|
||||
last_error = ""
|
||||
for attempt in range(3):
|
||||
if attempt > 0:
|
||||
_time.sleep(2 ** attempt) # 2s, 4s
|
||||
try:
|
||||
completed = subprocess.run(
|
||||
args, capture_output=True, text=True,
|
||||
@@ -133,16 +139,22 @@ def extract_one(prompt, response, project, model, timeout_s):
|
||||
encoding="utf-8", errors="replace",
|
||||
)
|
||||
except subprocess.TimeoutExpired:
|
||||
return [], "timeout"
|
||||
last_error = "timeout"
|
||||
continue
|
||||
except Exception as exc:
|
||||
return [], f"subprocess_error: {exc}"
|
||||
|
||||
if completed.returncode != 0:
|
||||
return [], f"exit_{completed.returncode}"
|
||||
last_error = f"subprocess_error: {exc}"
|
||||
continue
|
||||
|
||||
if completed.returncode == 0:
|
||||
raw = (completed.stdout or "").strip()
|
||||
return parse_candidates(raw, project), ""
|
||||
|
||||
# Capture stderr for diagnostics (truncate to 200 chars)
|
||||
stderr = (completed.stderr or "").strip()[:200]
|
||||
last_error = f"exit_{completed.returncode}: {stderr}" if stderr else f"exit_{completed.returncode}"
|
||||
|
||||
return [], last_error
|
||||
|
||||
|
||||
def parse_candidates(raw, interaction_project):
|
||||
"""Parse model JSON output into candidate dicts.
|
||||
@@ -164,6 +176,8 @@ def parse_candidates(raw, interaction_project):
|
||||
"content": normalized["content"],
|
||||
"project": project,
|
||||
"confidence": normalized["confidence"],
|
||||
"domain_tags": normalized.get("domain_tags") or [],
|
||||
"valid_until": normalized.get("valid_until") or "",
|
||||
})
|
||||
return results
|
||||
|
||||
@@ -192,10 +206,14 @@ def main():
|
||||
total_persisted = 0
|
||||
errors = 0
|
||||
|
||||
for summary in interaction_summaries:
|
||||
import time as _time
|
||||
for ix, summary in enumerate(interaction_summaries):
|
||||
resp_chars = summary.get("response_chars", 0) or 0
|
||||
if resp_chars < 50:
|
||||
continue
|
||||
# Light pacing between calls to avoid bursting the claude CLI
|
||||
if ix > 0:
|
||||
_time.sleep(0.5)
|
||||
iid = summary["id"]
|
||||
try:
|
||||
raw = api_get(
|
||||
@@ -234,6 +252,8 @@ def main():
|
||||
"project": c["project"],
|
||||
"confidence": c["confidence"],
|
||||
"status": "candidate",
|
||||
"domain_tags": c.get("domain_tags") or [],
|
||||
"valid_until": c.get("valid_until") or "",
|
||||
})
|
||||
total_persisted += 1
|
||||
except urllib.error.HTTPError as exc:
|
||||
|
||||
254
scripts/canonicalize_tags.py
Normal file
254
scripts/canonicalize_tags.py
Normal file
@@ -0,0 +1,254 @@
|
||||
#!/usr/bin/env python3
|
||||
"""Phase 7C — tag canonicalization detector.
|
||||
|
||||
Weekly (or on-demand) LLM pass that:
|
||||
1. Fetches the tag distribution across all active memories via HTTP
|
||||
2. Asks claude-p to propose alias→canonical mappings
|
||||
3. AUTO-APPLIES aliases with confidence >= AUTO_APPROVE_CONF (0.8)
|
||||
4. Submits lower-confidence proposals as pending for human review
|
||||
|
||||
Autonomous by default — matches the Phase 7A.1 pattern. Set
|
||||
--no-auto-approve to force every proposal into human review.
|
||||
|
||||
Host-side because claude CLI lives on Dalidou, not the container.
|
||||
Reuses the PYTHONPATH=src pattern from scripts/memory_dedup.py.
|
||||
|
||||
Usage:
|
||||
python3 scripts/canonicalize_tags.py [--base-url URL] [--dry-run] [--no-auto-approve]
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import os
|
||||
import shutil
|
||||
import subprocess
|
||||
import sys
|
||||
import tempfile
|
||||
import time
|
||||
import urllib.error
|
||||
import urllib.request
|
||||
|
||||
_SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
|
||||
_SRC_DIR = os.path.abspath(os.path.join(_SCRIPT_DIR, "..", "src"))
|
||||
if _SRC_DIR not in sys.path:
|
||||
sys.path.insert(0, _SRC_DIR)
|
||||
|
||||
from atocore.memory._tag_canon_prompt import ( # noqa: E402
|
||||
PROTECTED_PROJECT_TOKENS,
|
||||
SYSTEM_PROMPT,
|
||||
TAG_CANON_PROMPT_VERSION,
|
||||
build_user_message,
|
||||
normalize_alias_item,
|
||||
parse_canon_output,
|
||||
)
|
||||
|
||||
DEFAULT_BASE_URL = os.environ.get("ATOCORE_BASE_URL", "http://127.0.0.1:8100")
|
||||
DEFAULT_MODEL = os.environ.get("ATOCORE_TAG_CANON_MODEL", "sonnet")
|
||||
DEFAULT_TIMEOUT_S = float(os.environ.get("ATOCORE_TAG_CANON_TIMEOUT_S", "90"))
|
||||
|
||||
AUTO_APPROVE_CONF = float(os.environ.get("ATOCORE_TAG_CANON_AUTO_APPROVE_CONF", "0.8"))
|
||||
MIN_ALIAS_COUNT = int(os.environ.get("ATOCORE_TAG_CANON_MIN_ALIAS_COUNT", "1"))
|
||||
|
||||
_sandbox_cwd = None
|
||||
|
||||
|
||||
def get_sandbox_cwd() -> str:
|
||||
global _sandbox_cwd
|
||||
if _sandbox_cwd is None:
|
||||
_sandbox_cwd = tempfile.mkdtemp(prefix="ato-tagcanon-")
|
||||
return _sandbox_cwd
|
||||
|
||||
|
||||
def api_get(base_url: str, path: str) -> dict:
|
||||
req = urllib.request.Request(f"{base_url}{path}")
|
||||
with urllib.request.urlopen(req, timeout=30) as resp:
|
||||
return json.loads(resp.read().decode("utf-8"))
|
||||
|
||||
|
||||
def api_post(base_url: str, path: str, body: dict | None = None) -> dict:
|
||||
data = json.dumps(body or {}).encode("utf-8")
|
||||
req = urllib.request.Request(
|
||||
f"{base_url}{path}", method="POST",
|
||||
headers={"Content-Type": "application/json"}, data=data,
|
||||
)
|
||||
with urllib.request.urlopen(req, timeout=30) as resp:
|
||||
return json.loads(resp.read().decode("utf-8"))
|
||||
|
||||
|
||||
def call_claude(user_message: str, model: str, timeout_s: float) -> tuple[str | None, str | None]:
|
||||
if not shutil.which("claude"):
|
||||
return None, "claude CLI not available"
|
||||
args = [
|
||||
"claude", "-p",
|
||||
"--model", model,
|
||||
"--append-system-prompt", SYSTEM_PROMPT,
|
||||
"--disable-slash-commands",
|
||||
user_message,
|
||||
]
|
||||
last_error = ""
|
||||
for attempt in range(3):
|
||||
if attempt > 0:
|
||||
time.sleep(2 ** attempt)
|
||||
try:
|
||||
completed = subprocess.run(
|
||||
args, capture_output=True, text=True,
|
||||
timeout=timeout_s, cwd=get_sandbox_cwd(),
|
||||
encoding="utf-8", errors="replace",
|
||||
)
|
||||
except subprocess.TimeoutExpired:
|
||||
last_error = f"{model} timed out"
|
||||
continue
|
||||
except Exception as exc:
|
||||
last_error = f"subprocess error: {exc}"
|
||||
continue
|
||||
if completed.returncode == 0:
|
||||
return (completed.stdout or "").strip(), None
|
||||
stderr = (completed.stderr or "").strip()[:200]
|
||||
last_error = f"{model} exit {completed.returncode}: {stderr}"
|
||||
return None, last_error
|
||||
|
||||
|
||||
def fetch_tag_distribution(base_url: str) -> dict[str, int]:
|
||||
"""Count tag occurrences across active memories (client-side)."""
|
||||
try:
|
||||
result = api_get(base_url, "/memory?active_only=true&limit=2000")
|
||||
except Exception as e:
|
||||
print(f"ERROR: could not fetch memories: {e}", file=sys.stderr)
|
||||
return {}
|
||||
mems = result.get("memories", [])
|
||||
counts: dict[str, int] = {}
|
||||
for m in mems:
|
||||
tags = m.get("domain_tags") or []
|
||||
if isinstance(tags, str):
|
||||
try:
|
||||
tags = json.loads(tags)
|
||||
except Exception:
|
||||
tags = []
|
||||
if not isinstance(tags, list):
|
||||
continue
|
||||
for t in tags:
|
||||
if not isinstance(t, str):
|
||||
continue
|
||||
key = t.strip().lower()
|
||||
if key:
|
||||
counts[key] = counts.get(key, 0) + 1
|
||||
return counts
|
||||
|
||||
|
||||
def main() -> None:
|
||||
parser = argparse.ArgumentParser(description="Phase 7C tag canonicalization detector")
|
||||
parser.add_argument("--base-url", default=DEFAULT_BASE_URL)
|
||||
parser.add_argument("--model", default=DEFAULT_MODEL)
|
||||
parser.add_argument("--timeout-s", type=float, default=DEFAULT_TIMEOUT_S)
|
||||
parser.add_argument("--no-auto-approve", action="store_true",
|
||||
help="Disable autonomous apply; all proposals → human queue")
|
||||
parser.add_argument("--dry-run", action="store_true",
|
||||
help="Print decisions without touching state")
|
||||
args = parser.parse_args()
|
||||
|
||||
base = args.base_url.rstrip("/")
|
||||
autonomous = not args.no_auto_approve
|
||||
|
||||
print(
|
||||
f"canonicalize_tags {TAG_CANON_PROMPT_VERSION} | model={args.model} | "
|
||||
f"autonomous={autonomous} | auto-approve conf>={AUTO_APPROVE_CONF}"
|
||||
)
|
||||
|
||||
dist = fetch_tag_distribution(base)
|
||||
print(f"tag distribution: {len(dist)} unique tags, "
|
||||
f"{sum(dist.values())} total references")
|
||||
if not dist:
|
||||
print("no tags found — nothing to canonicalize")
|
||||
return
|
||||
|
||||
user_msg = build_user_message(dist)
|
||||
raw, err = call_claude(user_msg, args.model, args.timeout_s)
|
||||
if err or raw is None:
|
||||
print(f"ERROR: LLM call failed: {err}", file=sys.stderr)
|
||||
return
|
||||
|
||||
aliases_raw = parse_canon_output(raw)
|
||||
print(f"LLM returned {len(aliases_raw)} raw alias proposals")
|
||||
|
||||
auto_applied = 0
|
||||
auto_skipped_missing_canonical = 0
|
||||
proposals_created = 0
|
||||
duplicates_skipped = 0
|
||||
|
||||
for item in aliases_raw:
|
||||
norm = normalize_alias_item(item)
|
||||
if norm is None:
|
||||
continue
|
||||
alias = norm["alias"]
|
||||
canonical = norm["canonical"]
|
||||
confidence = norm["confidence"]
|
||||
|
||||
alias_count = dist.get(alias, 0)
|
||||
canonical_count = dist.get(canonical, 0)
|
||||
|
||||
# Sanity: alias must actually exist in the current distribution
|
||||
if alias_count < MIN_ALIAS_COUNT:
|
||||
print(f" SKIP {alias!r} → {canonical!r}: alias not in distribution")
|
||||
continue
|
||||
if canonical_count == 0:
|
||||
auto_skipped_missing_canonical += 1
|
||||
print(f" SKIP {alias!r} → {canonical!r}: canonical missing from distribution")
|
||||
continue
|
||||
|
||||
label = f"{alias!r} ({alias_count}) → {canonical!r} ({canonical_count}) conf={confidence:.2f}"
|
||||
|
||||
auto_apply = autonomous and confidence >= AUTO_APPROVE_CONF
|
||||
if auto_apply:
|
||||
if args.dry_run:
|
||||
auto_applied += 1
|
||||
print(f" [dry-run] would auto-apply: {label}")
|
||||
continue
|
||||
try:
|
||||
result = api_post(base, "/admin/tags/aliases/apply", {
|
||||
"alias": alias, "canonical": canonical,
|
||||
"confidence": confidence, "reason": norm["reason"],
|
||||
"alias_count": alias_count, "canonical_count": canonical_count,
|
||||
"actor": "auto-tag-canon",
|
||||
})
|
||||
touched = result.get("memories_touched", 0)
|
||||
auto_applied += 1
|
||||
print(f" ✅ auto-applied: {label} ({touched} memories)")
|
||||
except Exception as e:
|
||||
print(f" ⚠️ auto-apply failed: {label} — {e}", file=sys.stderr)
|
||||
time.sleep(0.2)
|
||||
continue
|
||||
|
||||
# Lower confidence → human review
|
||||
if args.dry_run:
|
||||
proposals_created += 1
|
||||
print(f" [dry-run] would propose for review: {label}")
|
||||
continue
|
||||
try:
|
||||
result = api_post(base, "/admin/tags/aliases/propose", {
|
||||
"alias": alias, "canonical": canonical,
|
||||
"confidence": confidence, "reason": norm["reason"],
|
||||
"alias_count": alias_count, "canonical_count": canonical_count,
|
||||
})
|
||||
if result.get("proposal_id"):
|
||||
proposals_created += 1
|
||||
print(f" → pending proposal: {label}")
|
||||
else:
|
||||
duplicates_skipped += 1
|
||||
print(f" (duplicate pending proposal): {label}")
|
||||
except Exception as e:
|
||||
print(f" ⚠️ propose failed: {label} — {e}", file=sys.stderr)
|
||||
time.sleep(0.2)
|
||||
|
||||
print(
|
||||
f"\nsummary: proposals_seen={len(aliases_raw)} "
|
||||
f"auto_applied={auto_applied} "
|
||||
f"proposals_created={proposals_created} "
|
||||
f"duplicates_skipped={duplicates_skipped} "
|
||||
f"skipped_missing_canonical={auto_skipped_missing_canonical}"
|
||||
)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
223
scripts/detect_emerging.py
Normal file
223
scripts/detect_emerging.py
Normal file
@@ -0,0 +1,223 @@
|
||||
#!/usr/bin/env python3
|
||||
"""Phase 6 C.1 — Emerging-concepts detector (HTTP-only).
|
||||
|
||||
Scans active + candidate memories via the HTTP API to surface:
|
||||
1. Unregistered projects — project strings appearing on 3+ memories
|
||||
that aren't in the project registry. Surface for one-click
|
||||
registration.
|
||||
2. Emerging categories — top 20 domain_tags by frequency, for
|
||||
"what themes are emerging in my work?" intelligence.
|
||||
3. Reinforced transients — active memories with reference_count >= 5
|
||||
AND valid_until set. These "were temporary but now durable"; a
|
||||
sibling endpoint (/admin/memory/extend-reinforced) actually
|
||||
performs the extension.
|
||||
|
||||
Writes results to project_state under atocore/proposals/* via the API.
|
||||
Runs host-side (cron calls it) so uses stdlib only — no atocore deps.
|
||||
|
||||
Usage:
|
||||
python3 scripts/detect_emerging.py [--base-url URL] [--dry-run]
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import os
|
||||
import sys
|
||||
import urllib.error
|
||||
import urllib.request
|
||||
from collections import Counter, defaultdict
|
||||
|
||||
PROJECT_MIN_MEMORIES = int(os.environ.get("ATOCORE_EMERGING_PROJECT_MIN", "3"))
|
||||
PROJECT_ALERT_THRESHOLD = int(os.environ.get("ATOCORE_EMERGING_ALERT_THRESHOLD", "5"))
|
||||
TOP_TAGS_LIMIT = int(os.environ.get("ATOCORE_EMERGING_TOP_TAGS", "20"))
|
||||
|
||||
|
||||
def api_get(base_url: str, path: str, timeout: int = 30) -> dict:
|
||||
req = urllib.request.Request(f"{base_url}{path}")
|
||||
with urllib.request.urlopen(req, timeout=timeout) as resp:
|
||||
return json.loads(resp.read().decode("utf-8"))
|
||||
|
||||
|
||||
def api_post(base_url: str, path: str, body: dict, timeout: int = 10) -> dict:
|
||||
data = json.dumps(body).encode("utf-8")
|
||||
req = urllib.request.Request(
|
||||
f"{base_url}{path}", method="POST",
|
||||
headers={"Content-Type": "application/json"}, data=data,
|
||||
)
|
||||
with urllib.request.urlopen(req, timeout=timeout) as resp:
|
||||
return json.loads(resp.read().decode("utf-8"))
|
||||
|
||||
|
||||
def fetch_registered_project_names(base_url: str) -> set[str]:
|
||||
"""Set of all registered project ids + aliases, lowercased."""
|
||||
try:
|
||||
result = api_get(base_url, "/projects")
|
||||
except Exception as e:
|
||||
print(f"WARN: could not load project registry: {e}", file=sys.stderr)
|
||||
return set()
|
||||
registered = set()
|
||||
for p in result.get("projects", []):
|
||||
pid = (p.get("project_id") or p.get("id") or p.get("name") or "").strip()
|
||||
if pid:
|
||||
registered.add(pid.lower())
|
||||
for alias in p.get("aliases", []) or []:
|
||||
if isinstance(alias, str) and alias.strip():
|
||||
registered.add(alias.strip().lower())
|
||||
return registered
|
||||
|
||||
|
||||
def fetch_memories(base_url: str, status: str, limit: int = 500) -> list[dict]:
|
||||
try:
|
||||
params = f"limit={limit}"
|
||||
if status == "active":
|
||||
params += "&active_only=true"
|
||||
else:
|
||||
params += f"&status={status}"
|
||||
result = api_get(base_url, f"/memory?{params}")
|
||||
return result.get("memories", [])
|
||||
except Exception as e:
|
||||
print(f"WARN: could not fetch {status} memories: {e}", file=sys.stderr)
|
||||
return []
|
||||
|
||||
|
||||
def fetch_previous_proposals(base_url: str) -> list[dict]:
|
||||
"""Read last run's unregistered_projects to diff against this run."""
|
||||
try:
|
||||
result = api_get(base_url, "/project/state/atocore")
|
||||
entries = result.get("entries", result.get("state", []))
|
||||
for e in entries:
|
||||
if e.get("category") == "proposals" and e.get("key") == "unregistered_projects_prev":
|
||||
try:
|
||||
return json.loads(e.get("value") or "[]")
|
||||
except Exception:
|
||||
return []
|
||||
except Exception:
|
||||
pass
|
||||
return []
|
||||
|
||||
|
||||
def set_state(base_url: str, category: str, key: str, value: str, source: str = "emerging detector") -> None:
|
||||
api_post(base_url, "/project/state", {
|
||||
"project": "atocore",
|
||||
"category": category,
|
||||
"key": key,
|
||||
"value": value,
|
||||
"source": source,
|
||||
})
|
||||
|
||||
|
||||
def main() -> None:
|
||||
parser = argparse.ArgumentParser(description="Detect emerging projects + categories")
|
||||
parser.add_argument("--base-url", default=os.environ.get("ATOCORE_BASE_URL", "http://127.0.0.1:8100"))
|
||||
parser.add_argument("--dry-run", action="store_true", help="Report without writing to project state")
|
||||
args = parser.parse_args()
|
||||
|
||||
base = args.base_url.rstrip("/")
|
||||
|
||||
registered = fetch_registered_project_names(base)
|
||||
active = fetch_memories(base, "active")
|
||||
candidates = fetch_memories(base, "candidate")
|
||||
all_mems = active + candidates
|
||||
|
||||
# --- Unregistered projects ---
|
||||
project_mems: dict[str, list] = defaultdict(list)
|
||||
for m in all_mems:
|
||||
proj = (m.get("project") or "").strip().lower()
|
||||
if not proj or proj in registered:
|
||||
continue
|
||||
project_mems[proj].append(m)
|
||||
|
||||
unregistered = []
|
||||
for proj, mems in sorted(project_mems.items()):
|
||||
if len(mems) < PROJECT_MIN_MEMORIES:
|
||||
continue
|
||||
unregistered.append({
|
||||
"project": proj,
|
||||
"count": len(mems),
|
||||
"sample_memory_ids": [m.get("id") for m in mems[:3]],
|
||||
"sample_contents": [(m.get("content") or "")[:150] for m in mems[:3]],
|
||||
})
|
||||
|
||||
# --- Emerging domain_tags (active only) ---
|
||||
tag_counter: Counter = Counter()
|
||||
for m in active:
|
||||
for t in (m.get("domain_tags") or []):
|
||||
if isinstance(t, str) and t.strip():
|
||||
tag_counter[t.strip().lower()] += 1
|
||||
emerging_tags = [{"tag": tag, "count": cnt} for tag, cnt in tag_counter.most_common(TOP_TAGS_LIMIT)]
|
||||
|
||||
# --- Reinforced transients (active, high refs, has expiry) ---
|
||||
reinforced = []
|
||||
for m in active:
|
||||
ref_count = int(m.get("reference_count") or 0)
|
||||
vu = (m.get("valid_until") or "").strip()
|
||||
if ref_count >= 5 and vu:
|
||||
reinforced.append({
|
||||
"memory_id": m.get("id"),
|
||||
"reference_count": ref_count,
|
||||
"valid_until": vu,
|
||||
"content_preview": (m.get("content") or "")[:150],
|
||||
"project": m.get("project") or "",
|
||||
})
|
||||
|
||||
result = {
|
||||
"unregistered_projects": unregistered,
|
||||
"emerging_categories": emerging_tags,
|
||||
"reinforced_transients": reinforced,
|
||||
"counts": {
|
||||
"active_memories": len(active),
|
||||
"candidate_memories": len(candidates),
|
||||
"unregistered_project_count": len(unregistered),
|
||||
"emerging_tag_count": len(emerging_tags),
|
||||
"reinforced_transient_count": len(reinforced),
|
||||
},
|
||||
}
|
||||
|
||||
print(json.dumps(result, indent=2))
|
||||
|
||||
if args.dry_run:
|
||||
return
|
||||
|
||||
# --- Persist to project state via HTTP ---
|
||||
try:
|
||||
set_state(base, "proposals", "unregistered_projects", json.dumps(unregistered))
|
||||
set_state(base, "proposals", "emerging_categories", json.dumps(emerging_tags))
|
||||
set_state(base, "proposals", "reinforced_transients", json.dumps(reinforced))
|
||||
except Exception as e:
|
||||
print(f"WARN: failed to persist proposals: {e}", file=sys.stderr)
|
||||
|
||||
# --- Alert on NEW projects crossing the threshold ---
|
||||
try:
|
||||
prev = fetch_previous_proposals(base)
|
||||
prev_names = {p.get("project") for p in prev if isinstance(p, dict)}
|
||||
newly_crossed = [
|
||||
p for p in unregistered
|
||||
if p["count"] >= PROJECT_ALERT_THRESHOLD
|
||||
and p["project"] not in prev_names
|
||||
]
|
||||
if newly_crossed:
|
||||
names = ", ".join(p["project"] for p in newly_crossed)
|
||||
# Use existing alert mechanism via state (Phase 4 infra)
|
||||
try:
|
||||
set_state(base, "alert", "last_warning", json.dumps({
|
||||
"title": f"Emerging project(s) detected: {names}",
|
||||
"message": (
|
||||
f"{len(newly_crossed)} unregistered project(s) crossed "
|
||||
f"the {PROJECT_ALERT_THRESHOLD}-memory threshold. "
|
||||
f"Review at /wiki or /admin/dashboard."
|
||||
),
|
||||
"timestamp": "",
|
||||
}))
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Snapshot for next run's diff
|
||||
set_state(base, "proposals", "unregistered_projects_prev", json.dumps(unregistered))
|
||||
except Exception as e:
|
||||
print(f"WARN: alert/state write failed: {e}", file=sys.stderr)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
237
scripts/graduate_memories.py
Normal file
237
scripts/graduate_memories.py
Normal file
@@ -0,0 +1,237 @@
|
||||
#!/usr/bin/env python3
|
||||
"""Phase 5F — Memory → Entity graduation batch pass.
|
||||
|
||||
Takes active memories, asks claude-p whether each describes a typed
|
||||
engineering entity, and creates entity candidates for the ones that do.
|
||||
Each candidate carries source_refs back to its source memory so human
|
||||
review can trace provenance.
|
||||
|
||||
Human reviews the entity candidates via /admin/triage (same UI as memory
|
||||
triage). When a candidate is promoted, a post-promote hook marks the source
|
||||
memory as `graduated` and sets `graduated_to_entity_id` for traceability.
|
||||
|
||||
This is THE population move: without it, the engineering graph stays sparse
|
||||
and the killer queries (Q-006/009/011) have nothing to find gaps in.
|
||||
|
||||
Usage:
|
||||
python3 scripts/graduate_memories.py --base-url http://127.0.0.1:8100 \\
|
||||
--project p05-interferometer --limit 20
|
||||
|
||||
# Dry run (don't create entities, just show decisions):
|
||||
python3 scripts/graduate_memories.py --project p05-interferometer --dry-run
|
||||
|
||||
# Process all active memories across all projects (big run):
|
||||
python3 scripts/graduate_memories.py --limit 200
|
||||
|
||||
Host-side because claude CLI lives on Dalidou, not in the container.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import os
|
||||
import shutil
|
||||
import subprocess
|
||||
import sys
|
||||
import tempfile
|
||||
import time
|
||||
import urllib.error
|
||||
import urllib.request
|
||||
from typing import Any
|
||||
|
||||
# Make src/ importable so we can reuse the stdlib-only prompt module
|
||||
_SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
|
||||
_SRC_DIR = os.path.abspath(os.path.join(_SCRIPT_DIR, "..", "src"))
|
||||
if _SRC_DIR not in sys.path:
|
||||
sys.path.insert(0, _SRC_DIR)
|
||||
|
||||
from atocore.engineering._graduation_prompt import ( # noqa: E402
|
||||
GRADUATION_PROMPT_VERSION,
|
||||
SYSTEM_PROMPT,
|
||||
build_user_message,
|
||||
parse_graduation_output,
|
||||
)
|
||||
|
||||
|
||||
DEFAULT_BASE_URL = os.environ.get("ATOCORE_BASE_URL", "http://127.0.0.1:8100")
|
||||
DEFAULT_MODEL = os.environ.get("ATOCORE_LLM_EXTRACTOR_MODEL", "sonnet")
|
||||
DEFAULT_TIMEOUT_S = float(os.environ.get("ATOCORE_GRADUATION_TIMEOUT_S", "90"))
|
||||
|
||||
_sandbox_cwd = None
|
||||
|
||||
|
||||
def get_sandbox_cwd() -> str:
|
||||
"""Temp cwd so claude CLI doesn't auto-discover project CLAUDE.md files."""
|
||||
global _sandbox_cwd
|
||||
if _sandbox_cwd is None:
|
||||
_sandbox_cwd = tempfile.mkdtemp(prefix="ato-graduate-")
|
||||
return _sandbox_cwd
|
||||
|
||||
|
||||
def api_get(base_url: str, path: str) -> dict:
|
||||
req = urllib.request.Request(f"{base_url}{path}")
|
||||
with urllib.request.urlopen(req, timeout=15) as resp:
|
||||
return json.loads(resp.read().decode("utf-8"))
|
||||
|
||||
|
||||
def api_post(base_url: str, path: str, body: dict | None = None) -> dict:
|
||||
data = json.dumps(body or {}).encode("utf-8")
|
||||
req = urllib.request.Request(
|
||||
f"{base_url}{path}", method="POST",
|
||||
headers={"Content-Type": "application/json"}, data=data,
|
||||
)
|
||||
with urllib.request.urlopen(req, timeout=15) as resp:
|
||||
return json.loads(resp.read().decode("utf-8"))
|
||||
|
||||
|
||||
def graduate_one(memory: dict, model: str, timeout_s: float) -> dict[str, Any] | None:
|
||||
"""Ask claude whether this memory describes a typed entity.
|
||||
|
||||
Returns None on any failure (parse error, timeout, exit!=0).
|
||||
Applies retry+pacing to match the pattern in auto_triage/batch_extract.
|
||||
"""
|
||||
if not shutil.which("claude"):
|
||||
return None
|
||||
|
||||
user_msg = build_user_message(
|
||||
memory_content=memory.get("content", "") or "",
|
||||
memory_project=memory.get("project", "") or "",
|
||||
memory_type=memory.get("memory_type", "") or "",
|
||||
)
|
||||
|
||||
args = [
|
||||
"claude", "-p",
|
||||
"--model", model,
|
||||
"--append-system-prompt", SYSTEM_PROMPT,
|
||||
"--disable-slash-commands",
|
||||
user_msg,
|
||||
]
|
||||
|
||||
last_error = ""
|
||||
for attempt in range(3):
|
||||
if attempt > 0:
|
||||
time.sleep(2 ** attempt)
|
||||
try:
|
||||
completed = subprocess.run(
|
||||
args, capture_output=True, text=True,
|
||||
timeout=timeout_s, cwd=get_sandbox_cwd(),
|
||||
encoding="utf-8", errors="replace",
|
||||
)
|
||||
except subprocess.TimeoutExpired:
|
||||
last_error = "timeout"
|
||||
continue
|
||||
except Exception as exc:
|
||||
last_error = f"subprocess error: {exc}"
|
||||
continue
|
||||
|
||||
if completed.returncode == 0:
|
||||
return parse_graduation_output(completed.stdout or "")
|
||||
|
||||
stderr = (completed.stderr or "").strip()[:200]
|
||||
last_error = f"exit_{completed.returncode}: {stderr}" if stderr else f"exit_{completed.returncode}"
|
||||
|
||||
print(f" ! claude failed after 3 tries: {last_error}", file=sys.stderr)
|
||||
return None
|
||||
|
||||
|
||||
def create_entity_candidate(
|
||||
base_url: str,
|
||||
decision: dict,
|
||||
memory: dict,
|
||||
) -> str | None:
|
||||
"""Create an entity candidate with source_refs pointing at the memory."""
|
||||
try:
|
||||
result = api_post(base_url, "/entities", {
|
||||
"entity_type": decision["entity_type"],
|
||||
"name": decision["name"],
|
||||
"project": memory.get("project", "") or "",
|
||||
"description": decision["description"],
|
||||
"properties": {
|
||||
"graduated_from_memory": memory["id"],
|
||||
"proposed_relationships": decision["relationships"],
|
||||
"prompt_version": GRADUATION_PROMPT_VERSION,
|
||||
},
|
||||
"status": "candidate",
|
||||
"confidence": decision["confidence"],
|
||||
"source_refs": [f"memory:{memory['id']}"],
|
||||
})
|
||||
return result.get("id")
|
||||
except Exception as e:
|
||||
print(f" ! entity create failed: {e}", file=sys.stderr)
|
||||
return None
|
||||
|
||||
|
||||
def main() -> None:
|
||||
parser = argparse.ArgumentParser(description="Graduate active memories into entity candidates")
|
||||
parser.add_argument("--base-url", default=DEFAULT_BASE_URL)
|
||||
parser.add_argument("--model", default=DEFAULT_MODEL)
|
||||
parser.add_argument("--project", default=None, help="Only graduate memories in this project")
|
||||
parser.add_argument("--limit", type=int, default=50, help="Max memories to process")
|
||||
parser.add_argument("--min-confidence", type=float, default=0.3,
|
||||
help="Skip memories with confidence below this (they're probably noise)")
|
||||
parser.add_argument("--dry-run", action="store_true", help="Show decisions without creating entities")
|
||||
args = parser.parse_args()
|
||||
|
||||
# Fetch active memories
|
||||
query = "status=active"
|
||||
query += f"&limit={args.limit}"
|
||||
if args.project:
|
||||
query += f"&project={args.project}"
|
||||
result = api_get(args.base_url, f"/memory?{query}")
|
||||
memories = result.get("memories", [])
|
||||
|
||||
# Filter by min_confidence + skip already-graduated
|
||||
memories = [m for m in memories
|
||||
if m.get("confidence", 0) >= args.min_confidence
|
||||
and m.get("status") != "graduated"]
|
||||
|
||||
print(f"graduating: {len(memories)} memories project={args.project or '(all)'} "
|
||||
f"model={args.model} dry_run={args.dry_run}")
|
||||
|
||||
graduated = 0
|
||||
skipped = 0
|
||||
errors = 0
|
||||
entities_created: list[str] = []
|
||||
|
||||
for i, mem in enumerate(memories, 1):
|
||||
if i > 1:
|
||||
time.sleep(0.5) # light pacing, matches auto_triage
|
||||
mid = mem["id"]
|
||||
label = f"[{i:3d}/{len(memories)}] {mid[:8]} [{mem.get('memory_type','?')}]"
|
||||
|
||||
decision = graduate_one(mem, args.model, DEFAULT_TIMEOUT_S)
|
||||
if decision is None:
|
||||
print(f" ERROR {label} (graduate_one returned None)")
|
||||
errors += 1
|
||||
continue
|
||||
|
||||
if not decision.get("graduate"):
|
||||
reason = decision.get("reason", "(no reason)")
|
||||
print(f" skip {label} {reason}")
|
||||
skipped += 1
|
||||
continue
|
||||
|
||||
etype = decision["entity_type"]
|
||||
ename = decision["name"]
|
||||
nrel = len(decision.get("relationships", []))
|
||||
|
||||
if args.dry_run:
|
||||
print(f" WOULD {label} → [{etype}] {ename!r} ({nrel} rels)")
|
||||
graduated += 1
|
||||
else:
|
||||
entity_id = create_entity_candidate(args.base_url, decision, mem)
|
||||
if entity_id:
|
||||
print(f" CREATE {label} → [{etype}] {ename!r} ({nrel} rels) entity={entity_id[:8]}")
|
||||
graduated += 1
|
||||
entities_created.append(entity_id)
|
||||
else:
|
||||
errors += 1
|
||||
|
||||
print(f"\ntotal: graduated={graduated} skipped={skipped} errors={errors}")
|
||||
if entities_created:
|
||||
print(f"Review at /admin/triage ({len(entities_created)} entity candidates created)")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
49
scripts/integrity_check.py
Normal file
49
scripts/integrity_check.py
Normal file
@@ -0,0 +1,49 @@
|
||||
#!/usr/bin/env python3
|
||||
"""Trigger the integrity check inside the AtoCore container.
|
||||
|
||||
The scan itself lives in the container (needs direct DB access via the
|
||||
already-loaded sqlite connection). This host-side wrapper just POSTs to
|
||||
/admin/integrity-check so the nightly cron can kick it off from bash
|
||||
without needing the container's Python deps on the host.
|
||||
|
||||
Usage:
|
||||
python3 scripts/integrity_check.py [--base-url URL] [--dry-run]
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import os
|
||||
import sys
|
||||
import urllib.parse
|
||||
import urllib.request
|
||||
|
||||
|
||||
def main() -> None:
|
||||
parser = argparse.ArgumentParser()
|
||||
parser.add_argument("--base-url", default=os.environ.get("ATOCORE_BASE_URL", "http://127.0.0.1:8100"))
|
||||
parser.add_argument("--dry-run", action="store_true",
|
||||
help="Report without persisting findings to state")
|
||||
args = parser.parse_args()
|
||||
|
||||
url = args.base_url.rstrip("/") + "/admin/integrity-check"
|
||||
if args.dry_run:
|
||||
url += "?persist=false"
|
||||
|
||||
req = urllib.request.Request(url, method="POST")
|
||||
try:
|
||||
with urllib.request.urlopen(req, timeout=30) as resp:
|
||||
result = json.loads(resp.read().decode("utf-8"))
|
||||
except Exception as e:
|
||||
print(f"ERROR: could not reach {url}: {e}", file=sys.stderr)
|
||||
sys.exit(1)
|
||||
|
||||
print(json.dumps(result, indent=2))
|
||||
if not result.get("ok", True):
|
||||
# Non-zero exit so cron logs flag it
|
||||
sys.exit(2)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
463
scripts/memory_dedup.py
Normal file
463
scripts/memory_dedup.py
Normal file
@@ -0,0 +1,463 @@
|
||||
#!/usr/bin/env python3
|
||||
"""Phase 7A — semantic memory dedup detector.
|
||||
|
||||
Finds clusters of near-duplicate active memories and writes merge-
|
||||
candidate proposals for human review in the triage UI.
|
||||
|
||||
Algorithm:
|
||||
1. Fetch active memories via HTTP
|
||||
2. Group by (project, memory_type) — cross-bucket merges are deferred
|
||||
to Phase 7B contradiction flow
|
||||
3. Within each group, embed contents via atocore.retrieval.embeddings
|
||||
4. Greedy transitive cluster at similarity >= threshold
|
||||
5. For each cluster of size >= 2, ask claude-p to draft unified content
|
||||
6. POST the proposal to /admin/memory/merge-candidates/create (server-
|
||||
side dedupes by the sorted memory-id set, so re-runs don't double-
|
||||
create)
|
||||
|
||||
Host-side because claude CLI lives on Dalidou, not the container. Reuses
|
||||
the same PYTHONPATH=src pattern as scripts/graduate_memories.py for
|
||||
atocore imports (embeddings, similarity, prompt module).
|
||||
|
||||
Usage:
|
||||
python3 scripts/memory_dedup.py --base-url http://127.0.0.1:8100 \\
|
||||
--similarity-threshold 0.88 --max-batch 50
|
||||
|
||||
Threshold conventions (see Phase 7 doc):
|
||||
0.88 interactive / default — balanced precision/recall
|
||||
0.90 nightly cron — tight, only near-duplicates
|
||||
0.85 weekly cron — deeper cleanup
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import os
|
||||
import shutil
|
||||
import subprocess
|
||||
import sys
|
||||
import tempfile
|
||||
import time
|
||||
import urllib.error
|
||||
import urllib.request
|
||||
from collections import defaultdict
|
||||
from typing import Any
|
||||
|
||||
# Make src/ importable — same pattern as graduate_memories.py
|
||||
_SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
|
||||
_SRC_DIR = os.path.abspath(os.path.join(_SCRIPT_DIR, "..", "src"))
|
||||
if _SRC_DIR not in sys.path:
|
||||
sys.path.insert(0, _SRC_DIR)
|
||||
|
||||
from atocore.memory._dedup_prompt import ( # noqa: E402
|
||||
DEDUP_PROMPT_VERSION,
|
||||
SYSTEM_PROMPT,
|
||||
TIER2_SYSTEM_PROMPT,
|
||||
build_tier2_user_message,
|
||||
build_user_message,
|
||||
normalize_merge_verdict,
|
||||
parse_merge_verdict,
|
||||
)
|
||||
from atocore.memory.similarity import cluster_by_threshold # noqa: E402
|
||||
|
||||
DEFAULT_BASE_URL = os.environ.get("ATOCORE_BASE_URL", "http://127.0.0.1:8100")
|
||||
DEFAULT_MODEL = os.environ.get("ATOCORE_DEDUP_MODEL", "sonnet")
|
||||
DEFAULT_TIER2_MODEL = os.environ.get("ATOCORE_DEDUP_TIER2_MODEL", "opus")
|
||||
DEFAULT_TIMEOUT_S = float(os.environ.get("ATOCORE_DEDUP_TIMEOUT_S", "60"))
|
||||
|
||||
# Phase 7A.1 — auto-merge tiering thresholds.
|
||||
# TIER-1 auto-approve: if sonnet confidence >= this AND min pairwise
|
||||
# similarity >= AUTO_APPROVE_SIM AND all sources share project+type → merge silently.
|
||||
AUTO_APPROVE_CONF = float(os.environ.get("ATOCORE_DEDUP_AUTO_APPROVE_CONF", "0.8"))
|
||||
AUTO_APPROVE_SIM = float(os.environ.get("ATOCORE_DEDUP_AUTO_APPROVE_SIM", "0.92"))
|
||||
# TIER-2 escalation band: sonnet uncertain but pair is similar enough to be worth opus time.
|
||||
TIER2_MIN_CONF = float(os.environ.get("ATOCORE_DEDUP_TIER2_MIN_CONF", "0.5"))
|
||||
TIER2_MIN_SIM = float(os.environ.get("ATOCORE_DEDUP_TIER2_MIN_SIM", "0.85"))
|
||||
|
||||
_sandbox_cwd = None
|
||||
|
||||
|
||||
def get_sandbox_cwd() -> str:
|
||||
global _sandbox_cwd
|
||||
if _sandbox_cwd is None:
|
||||
_sandbox_cwd = tempfile.mkdtemp(prefix="ato-dedup-")
|
||||
return _sandbox_cwd
|
||||
|
||||
|
||||
def api_get(base_url: str, path: str) -> dict:
|
||||
req = urllib.request.Request(f"{base_url}{path}")
|
||||
with urllib.request.urlopen(req, timeout=30) as resp:
|
||||
return json.loads(resp.read().decode("utf-8"))
|
||||
|
||||
|
||||
def api_post(base_url: str, path: str, body: dict | None = None) -> dict:
|
||||
data = json.dumps(body or {}).encode("utf-8")
|
||||
req = urllib.request.Request(
|
||||
f"{base_url}{path}", method="POST",
|
||||
headers={"Content-Type": "application/json"}, data=data,
|
||||
)
|
||||
with urllib.request.urlopen(req, timeout=30) as resp:
|
||||
return json.loads(resp.read().decode("utf-8"))
|
||||
|
||||
|
||||
def call_claude(system_prompt: str, user_message: str, model: str, timeout_s: float) -> tuple[str | None, str | None]:
|
||||
"""Shared CLI caller with retry + stderr capture (mirrors auto_triage)."""
|
||||
if not shutil.which("claude"):
|
||||
return None, "claude CLI not available"
|
||||
args = [
|
||||
"claude", "-p",
|
||||
"--model", model,
|
||||
"--append-system-prompt", system_prompt,
|
||||
"--disable-slash-commands",
|
||||
user_message,
|
||||
]
|
||||
last_error = ""
|
||||
for attempt in range(3):
|
||||
if attempt > 0:
|
||||
time.sleep(2 ** attempt)
|
||||
try:
|
||||
completed = subprocess.run(
|
||||
args, capture_output=True, text=True,
|
||||
timeout=timeout_s, cwd=get_sandbox_cwd(),
|
||||
encoding="utf-8", errors="replace",
|
||||
)
|
||||
except subprocess.TimeoutExpired:
|
||||
last_error = f"{model} timed out"
|
||||
continue
|
||||
except Exception as exc:
|
||||
last_error = f"subprocess error: {exc}"
|
||||
continue
|
||||
if completed.returncode == 0:
|
||||
return (completed.stdout or "").strip(), None
|
||||
stderr = (completed.stderr or "").strip()[:200]
|
||||
last_error = f"{model} exit {completed.returncode}: {stderr}" if stderr else f"{model} exit {completed.returncode}"
|
||||
return None, last_error
|
||||
|
||||
|
||||
def fetch_active_memories(base_url: str, project: str | None) -> list[dict]:
|
||||
# The /memory endpoint with active_only=true returns active memories.
|
||||
# Graduated memories are exempt from dedup — they're frozen pointers
|
||||
# to entities. Filter them out on the client side.
|
||||
params = "active_only=true&limit=2000"
|
||||
if project:
|
||||
params += f"&project={urllib.request.quote(project)}"
|
||||
try:
|
||||
result = api_get(base_url, f"/memory?{params}")
|
||||
except Exception as e:
|
||||
print(f"ERROR: could not fetch memories: {e}", file=sys.stderr)
|
||||
return []
|
||||
mems = result.get("memories", [])
|
||||
return [m for m in mems if (m.get("status") or "active") == "active"]
|
||||
|
||||
|
||||
def group_memories(mems: list[dict]) -> dict[tuple[str, str], list[dict]]:
|
||||
"""Bucket by (project, memory_type). Empty project is its own bucket."""
|
||||
buckets: dict[tuple[str, str], list[dict]] = defaultdict(list)
|
||||
for m in mems:
|
||||
key = ((m.get("project") or "").strip().lower(), (m.get("memory_type") or "").strip().lower())
|
||||
buckets[key].append(m)
|
||||
return buckets
|
||||
|
||||
|
||||
def draft_merge(sources: list[dict], model: str, timeout_s: float) -> dict[str, Any] | None:
|
||||
"""Tier-1 draft: cheap sonnet call proposes the unified content."""
|
||||
user_msg = build_user_message(sources)
|
||||
raw, err = call_claude(SYSTEM_PROMPT, user_msg, model, timeout_s)
|
||||
if err:
|
||||
print(f" WARN: claude tier-1 failed: {err}", file=sys.stderr)
|
||||
return None
|
||||
parsed = parse_merge_verdict(raw or "")
|
||||
if parsed is None:
|
||||
print(f" WARN: could not parse tier-1 verdict: {(raw or '')[:200]}", file=sys.stderr)
|
||||
return None
|
||||
return normalize_merge_verdict(parsed)
|
||||
|
||||
|
||||
def tier2_review(
|
||||
sources: list[dict],
|
||||
tier1_verdict: dict[str, Any],
|
||||
model: str,
|
||||
timeout_s: float,
|
||||
) -> dict[str, Any] | None:
|
||||
"""Tier-2 second opinion: opus confirms or overrides the tier-1 draft."""
|
||||
user_msg = build_tier2_user_message(sources, tier1_verdict)
|
||||
raw, err = call_claude(TIER2_SYSTEM_PROMPT, user_msg, model, timeout_s)
|
||||
if err:
|
||||
print(f" WARN: claude tier-2 failed: {err}", file=sys.stderr)
|
||||
return None
|
||||
parsed = parse_merge_verdict(raw or "")
|
||||
if parsed is None:
|
||||
print(f" WARN: could not parse tier-2 verdict: {(raw or '')[:200]}", file=sys.stderr)
|
||||
return None
|
||||
return normalize_merge_verdict(parsed)
|
||||
|
||||
|
||||
def min_pairwise_similarity(texts: list[str]) -> float:
|
||||
"""Return the minimum pairwise cosine similarity across N texts.
|
||||
|
||||
Used to sanity-check a transitive cluster: A~B~C doesn't guarantee
|
||||
A~C, so the auto-approve threshold should be met by the WEAKEST
|
||||
pair, not just by the strongest. If the cluster has N=2 this is just
|
||||
the one pairwise similarity.
|
||||
"""
|
||||
if len(texts) < 2:
|
||||
return 0.0
|
||||
# Reuse similarity_matrix rather than computing it ourselves
|
||||
from atocore.memory.similarity import similarity_matrix
|
||||
m = similarity_matrix(texts)
|
||||
min_sim = 1.0
|
||||
for i in range(len(texts)):
|
||||
for j in range(i + 1, len(texts)):
|
||||
if m[i][j] < min_sim:
|
||||
min_sim = m[i][j]
|
||||
return min_sim
|
||||
|
||||
|
||||
def submit_candidate(
|
||||
base_url: str,
|
||||
memory_ids: list[str],
|
||||
similarity: float,
|
||||
verdict: dict[str, Any],
|
||||
dry_run: bool,
|
||||
) -> str | None:
|
||||
body = {
|
||||
"memory_ids": memory_ids,
|
||||
"similarity": similarity,
|
||||
"proposed_content": verdict["content"],
|
||||
"proposed_memory_type": verdict["memory_type"],
|
||||
"proposed_project": verdict["project"],
|
||||
"proposed_tags": verdict["domain_tags"],
|
||||
"proposed_confidence": verdict["confidence"],
|
||||
"reason": verdict["reason"],
|
||||
}
|
||||
if dry_run:
|
||||
print(f" [dry-run] would POST: {json.dumps(body)[:200]}...")
|
||||
return "dry-run"
|
||||
try:
|
||||
result = api_post(base_url, "/admin/memory/merge-candidates/create", body)
|
||||
return result.get("candidate_id")
|
||||
except urllib.error.HTTPError as e:
|
||||
print(f" ERROR: submit failed: {e.code} {e.read().decode()[:200]}", file=sys.stderr)
|
||||
return None
|
||||
except Exception as e:
|
||||
print(f" ERROR: submit failed: {e}", file=sys.stderr)
|
||||
return None
|
||||
|
||||
|
||||
def auto_approve(base_url: str, candidate_id: str, actor: str, dry_run: bool) -> str | None:
|
||||
"""POST /admin/memory/merge-candidates/{id}/approve. Returns result_memory_id."""
|
||||
if dry_run:
|
||||
return "dry-run"
|
||||
try:
|
||||
result = api_post(
|
||||
base_url,
|
||||
f"/admin/memory/merge-candidates/{candidate_id}/approve",
|
||||
{"actor": actor},
|
||||
)
|
||||
return result.get("result_memory_id")
|
||||
except Exception as e:
|
||||
print(f" ERROR: auto-approve failed: {e}", file=sys.stderr)
|
||||
return None
|
||||
|
||||
|
||||
def same_bucket(sources: list[dict]) -> bool:
|
||||
"""All sources share the same (project, memory_type)."""
|
||||
if not sources:
|
||||
return False
|
||||
proj = (sources[0].get("project") or "").strip().lower()
|
||||
mtype = (sources[0].get("memory_type") or "").strip().lower()
|
||||
for s in sources[1:]:
|
||||
if (s.get("project") or "").strip().lower() != proj:
|
||||
return False
|
||||
if (s.get("memory_type") or "").strip().lower() != mtype:
|
||||
return False
|
||||
return True
|
||||
|
||||
|
||||
def main() -> None:
|
||||
parser = argparse.ArgumentParser(description="Phase 7A semantic dedup detector (tiered)")
|
||||
parser.add_argument("--base-url", default=DEFAULT_BASE_URL)
|
||||
parser.add_argument("--project", default="", help="Only scan this project (empty = all)")
|
||||
parser.add_argument("--similarity-threshold", type=float, default=0.88)
|
||||
parser.add_argument("--max-batch", type=int, default=50,
|
||||
help="Max clusters to process per run")
|
||||
parser.add_argument("--model", default=DEFAULT_MODEL, help="Tier-1 model (default: sonnet)")
|
||||
parser.add_argument("--tier2-model", default=DEFAULT_TIER2_MODEL, help="Tier-2 model (default: opus)")
|
||||
parser.add_argument("--timeout-s", type=float, default=DEFAULT_TIMEOUT_S)
|
||||
parser.add_argument("--no-auto-approve", action="store_true",
|
||||
help="Disable autonomous merging; all merges land in human triage queue")
|
||||
parser.add_argument("--dry-run", action="store_true")
|
||||
args = parser.parse_args()
|
||||
|
||||
base = args.base_url.rstrip("/")
|
||||
autonomous = not args.no_auto_approve
|
||||
|
||||
print(
|
||||
f"memory_dedup {DEDUP_PROMPT_VERSION} | threshold={args.similarity_threshold} | "
|
||||
f"tier1={args.model} tier2={args.tier2_model} | "
|
||||
f"autonomous={autonomous} | "
|
||||
f"auto-approve: conf>={AUTO_APPROVE_CONF} sim>={AUTO_APPROVE_SIM}"
|
||||
)
|
||||
mems = fetch_active_memories(base, args.project or None)
|
||||
print(f"fetched {len(mems)} active memories")
|
||||
if not mems:
|
||||
return
|
||||
|
||||
buckets = group_memories(mems)
|
||||
print(f"grouped into {len(buckets)} (project, memory_type) buckets")
|
||||
|
||||
clusters_found = 0
|
||||
auto_merged_tier1 = 0
|
||||
auto_merged_tier2 = 0
|
||||
human_candidates = 0
|
||||
tier1_rejections = 0
|
||||
tier2_overrides = 0 # opus disagreed with sonnet
|
||||
skipped_low_sim = 0
|
||||
skipped_existing = 0
|
||||
processed = 0
|
||||
|
||||
for (proj, mtype), group in sorted(buckets.items()):
|
||||
if len(group) < 2:
|
||||
continue
|
||||
if processed >= args.max_batch:
|
||||
print(f"reached max-batch={args.max_batch}, stopping")
|
||||
break
|
||||
|
||||
texts = [(m.get("content") or "") for m in group]
|
||||
clusters = cluster_by_threshold(texts, args.similarity_threshold)
|
||||
clusters = [c for c in clusters if len(c) >= 2]
|
||||
if not clusters:
|
||||
continue
|
||||
|
||||
print(f"\n[{proj or '(global)'}/{mtype}] {len(group)} mems → {len(clusters)} cluster(s)")
|
||||
for cluster in clusters:
|
||||
if processed >= args.max_batch:
|
||||
break
|
||||
clusters_found += 1
|
||||
sources = [group[i] for i in cluster]
|
||||
ids = [s["id"] for s in sources]
|
||||
cluster_texts = [texts[i] for i in cluster]
|
||||
min_sim = min_pairwise_similarity(cluster_texts)
|
||||
print(f" cluster of {len(cluster)} (min_sim={min_sim:.3f}): {[s['id'][:8] for s in sources]}")
|
||||
|
||||
# Tier-1 draft
|
||||
tier1 = draft_merge(sources, args.model, args.timeout_s)
|
||||
processed += 1
|
||||
if tier1 is None:
|
||||
continue
|
||||
if tier1["action"] == "reject":
|
||||
tier1_rejections += 1
|
||||
print(f" TIER-1 rejected: {tier1['reason'][:100]}")
|
||||
continue
|
||||
|
||||
# --- Tiering decision ---
|
||||
bucket_ok = same_bucket(sources)
|
||||
tier1_ok = (
|
||||
tier1["confidence"] >= AUTO_APPROVE_CONF
|
||||
and min_sim >= AUTO_APPROVE_SIM
|
||||
and bucket_ok
|
||||
)
|
||||
|
||||
if autonomous and tier1_ok:
|
||||
cid = submit_candidate(base, ids, min_sim, tier1, args.dry_run)
|
||||
if cid == "dry-run":
|
||||
auto_merged_tier1 += 1
|
||||
print(" [dry-run] would auto-merge (tier-1)")
|
||||
elif cid:
|
||||
new_id = auto_approve(base, cid, actor="auto-dedup-tier1", dry_run=args.dry_run)
|
||||
if new_id:
|
||||
auto_merged_tier1 += 1
|
||||
print(f" ✅ auto-merged (tier-1) → {str(new_id)[:8]}")
|
||||
else:
|
||||
print(f" ⚠️ tier-1 approve failed; candidate {cid[:8]} left pending")
|
||||
human_candidates += 1
|
||||
else:
|
||||
skipped_existing += 1
|
||||
time.sleep(0.3)
|
||||
continue
|
||||
|
||||
# Not tier-1 auto-approve. Decide if it's worth tier-2 escalation.
|
||||
tier2_eligible = (
|
||||
autonomous
|
||||
and min_sim >= TIER2_MIN_SIM
|
||||
and tier1["confidence"] >= TIER2_MIN_CONF
|
||||
and bucket_ok
|
||||
)
|
||||
|
||||
if tier2_eligible:
|
||||
print(" → escalating to tier-2 (opus)…")
|
||||
tier2 = tier2_review(sources, tier1, args.tier2_model, args.timeout_s)
|
||||
if tier2 is None:
|
||||
# Opus errored. Fall back to human triage.
|
||||
cid = submit_candidate(base, ids, min_sim, tier1, args.dry_run)
|
||||
if cid and cid != "dry-run":
|
||||
human_candidates += 1
|
||||
print(f" → candidate {cid[:8]} (tier-2 errored, human review)")
|
||||
time.sleep(0.5)
|
||||
continue
|
||||
|
||||
if tier2["action"] == "reject":
|
||||
tier2_overrides += 1
|
||||
print(f" ❌ TIER-2 override (reject): {tier2['reason'][:100]}")
|
||||
time.sleep(0.5)
|
||||
continue
|
||||
|
||||
if tier2["confidence"] >= AUTO_APPROVE_CONF:
|
||||
# Opus confirms. Auto-merge using opus's (possibly refined) content.
|
||||
cid = submit_candidate(base, ids, min_sim, tier2, args.dry_run)
|
||||
if cid == "dry-run":
|
||||
auto_merged_tier2 += 1
|
||||
print(" [dry-run] would auto-merge (tier-2)")
|
||||
elif cid:
|
||||
new_id = auto_approve(base, cid, actor="auto-dedup-tier2", dry_run=args.dry_run)
|
||||
if new_id:
|
||||
auto_merged_tier2 += 1
|
||||
print(f" ✅ auto-merged (tier-2) → {str(new_id)[:8]}")
|
||||
else:
|
||||
human_candidates += 1
|
||||
print(f" ⚠️ tier-2 approve failed; candidate {cid[:8]} left pending")
|
||||
else:
|
||||
skipped_existing += 1
|
||||
time.sleep(0.5)
|
||||
continue
|
||||
|
||||
# Opus confirmed but low confidence → human review with opus's draft
|
||||
cid = submit_candidate(base, ids, min_sim, tier2, args.dry_run)
|
||||
if cid and cid != "dry-run":
|
||||
human_candidates += 1
|
||||
print(f" → candidate {cid[:8]} (tier-2 low-confidence, human review)")
|
||||
time.sleep(0.5)
|
||||
continue
|
||||
|
||||
# Below tier-2 eligibility (either non-autonomous mode, or
|
||||
# similarity too low / cross-bucket). Always human review.
|
||||
if min_sim < TIER2_MIN_SIM or not bucket_ok:
|
||||
skipped_low_sim += 1
|
||||
# Still create a human candidate so it's visible, but log why
|
||||
print(f" → below auto-tier thresholds (min_sim={min_sim:.3f}, bucket_ok={bucket_ok})")
|
||||
|
||||
cid = submit_candidate(base, ids, min_sim, tier1, args.dry_run)
|
||||
if cid == "dry-run":
|
||||
human_candidates += 1
|
||||
elif cid:
|
||||
human_candidates += 1
|
||||
print(f" → candidate {cid[:8]} (human review)")
|
||||
else:
|
||||
skipped_existing += 1
|
||||
time.sleep(0.3)
|
||||
|
||||
print(
|
||||
f"\nsummary: clusters_found={clusters_found} "
|
||||
f"auto_merged_tier1={auto_merged_tier1} "
|
||||
f"auto_merged_tier2={auto_merged_tier2} "
|
||||
f"human_candidates={human_candidates} "
|
||||
f"tier1_rejections={tier1_rejections} "
|
||||
f"tier2_overrides={tier2_overrides} "
|
||||
f"skipped_low_sim={skipped_low_sim} "
|
||||
f"skipped_existing={skipped_existing}"
|
||||
)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
87
scripts/windows/atocore-backup-pull.ps1
Normal file
87
scripts/windows/atocore-backup-pull.ps1
Normal file
@@ -0,0 +1,87 @@
|
||||
# atocore-backup-pull.ps1
|
||||
#
|
||||
# Pull the latest AtoCore backup snapshot from Dalidou to this Windows machine.
|
||||
# Designed to be run by Windows Task Scheduler. Fail-open by design -- if
|
||||
# Dalidou is unreachable (laptop on the road, etc.), exit cleanly without error.
|
||||
#
|
||||
# Usage (manual test):
|
||||
# powershell.exe -ExecutionPolicy Bypass -File atocore-backup-pull.ps1
|
||||
#
|
||||
# Scheduled task: see docs/windows-backup-setup.md for Task Scheduler config.
|
||||
|
||||
$ErrorActionPreference = "Continue"
|
||||
|
||||
# --- Configuration ---
|
||||
$Remote = "papa@dalidou"
|
||||
$RemoteSnapshots = "/srv/storage/atocore/backups/snapshots"
|
||||
$LocalBackupDir = "$env:USERPROFILE\Documents\ATOCore_Backups"
|
||||
$LogDir = "$LocalBackupDir\_logs"
|
||||
$ReachabilityTest = 5 # seconds timeout for SSH probe
|
||||
|
||||
# --- Setup ---
|
||||
if (-not (Test-Path $LocalBackupDir)) {
|
||||
New-Item -ItemType Directory -Path $LocalBackupDir -Force | Out-Null
|
||||
}
|
||||
if (-not (Test-Path $LogDir)) {
|
||||
New-Item -ItemType Directory -Path $LogDir -Force | Out-Null
|
||||
}
|
||||
|
||||
$Timestamp = Get-Date -Format "yyyy-MM-dd_HHmmss"
|
||||
$LogFile = "$LogDir\backup-$Timestamp.log"
|
||||
|
||||
function Log($msg) {
|
||||
$line = "[{0}] {1}" -f (Get-Date -Format "yyyy-MM-dd HH:mm:ss"), $msg
|
||||
Write-Host $line
|
||||
Add-Content -Path $LogFile -Value $line
|
||||
}
|
||||
|
||||
Log "=== AtoCore backup pull starting ==="
|
||||
Log "Remote: $Remote"
|
||||
Log "Local target: $LocalBackupDir"
|
||||
|
||||
# --- Reachability check: fail open if Dalidou is offline ---
|
||||
Log "Checking Dalidou reachability..."
|
||||
$probe = & ssh -o ConnectTimeout=$ReachabilityTest -o BatchMode=yes `
|
||||
-o StrictHostKeyChecking=accept-new `
|
||||
$Remote "echo ok" 2>&1
|
||||
if ($LASTEXITCODE -ne 0 -or $probe -ne "ok") {
|
||||
Log "Dalidou unreachable ($probe) -- fail-open exit"
|
||||
exit 0
|
||||
}
|
||||
Log "Dalidou reachable."
|
||||
|
||||
# --- Pull the entire snapshots directory ---
|
||||
# Dalidou's retention policy (7 daily + 4 weekly + 6 monthly) already caps
|
||||
# the snapshot count, so pulling the whole dir is bounded and simple. scp
|
||||
# will overwrite local files -- we rely on this to pick up new snapshots.
|
||||
Log "Pulling snapshots via scp..."
|
||||
$LocalSnapshotsDir = Join-Path $LocalBackupDir "snapshots"
|
||||
if (-not (Test-Path $LocalSnapshotsDir)) {
|
||||
New-Item -ItemType Directory -Path $LocalSnapshotsDir -Force | Out-Null
|
||||
}
|
||||
|
||||
& scp -o BatchMode=yes -r "${Remote}:${RemoteSnapshots}/*" "$LocalSnapshotsDir\" 2>&1 |
|
||||
ForEach-Object { Add-Content -Path $LogFile -Value $_ }
|
||||
|
||||
if ($LASTEXITCODE -ne 0) {
|
||||
Log "scp failed with exit $LASTEXITCODE"
|
||||
exit 0 # fail-open
|
||||
}
|
||||
|
||||
# --- Stats ---
|
||||
$snapshots = Get-ChildItem -Path $LocalSnapshotsDir -Directory |
|
||||
Where-Object { $_.Name -match "^\d{8}T\d{6}Z$" } |
|
||||
Sort-Object Name -Descending
|
||||
|
||||
$totalSize = (Get-ChildItem $LocalSnapshotsDir -Recurse -File | Measure-Object -Property Length -Sum).Sum
|
||||
$SizeMB = [math]::Round($totalSize / 1MB, 2)
|
||||
$latest = if ($snapshots.Count -gt 0) { $snapshots[0].Name } else { "(none)" }
|
||||
|
||||
Log ("Pulled {0} snapshots successfully (total {1} MB, latest: {2})" -f $snapshots.Count, $SizeMB, $latest)
|
||||
Log "=== backup complete ==="
|
||||
|
||||
# --- Log retention: keep last 30 log files ---
|
||||
Get-ChildItem -Path $LogDir -Filter "backup-*.log" |
|
||||
Sort-Object Name -Descending |
|
||||
Select-Object -Skip 30 |
|
||||
ForEach-Object { Remove-Item $_.FullName -Force -ErrorAction SilentlyContinue }
|
||||
File diff suppressed because it is too large
Load Diff
@@ -508,6 +508,23 @@ def _build_engineering_context(
|
||||
f" {direction} {rel.relationship_type} [{other.entity_type}] {other.name}"
|
||||
)
|
||||
|
||||
# Phase 5H: append a compact gaps summary so the LLM always sees
|
||||
# "what we're currently missing" alongside the entity neighborhood.
|
||||
# This is the director's most-used insight — orphan requirements,
|
||||
# risky decisions, unsupported claims — surfaced in every context pack
|
||||
# for project-scoped queries.
|
||||
try:
|
||||
from atocore.engineering.queries import all_gaps as _all_gaps
|
||||
gaps = _all_gaps(project)
|
||||
orphan_n = gaps["orphan_requirements"]["count"]
|
||||
risky_n = gaps["risky_decisions"]["count"]
|
||||
unsup_n = gaps["unsupported_claims"]["count"]
|
||||
if orphan_n or risky_n or unsup_n:
|
||||
lines.append("")
|
||||
lines.append(f"Gaps: {orphan_n} orphan reqs, {risky_n} risky decisions, {unsup_n} unsupported claims")
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
lines.append("--- End Engineering Context ---")
|
||||
text = "\n".join(lines)
|
||||
|
||||
|
||||
194
src/atocore/engineering/_graduation_prompt.py
Normal file
194
src/atocore/engineering/_graduation_prompt.py
Normal file
@@ -0,0 +1,194 @@
|
||||
"""Shared LLM prompt for memory → entity graduation (Phase 5F).
|
||||
|
||||
Mirrors the pattern of ``atocore.memory._llm_prompt``: stdlib-only so both
|
||||
the container extractor path and the host-side graduate_memories.py script
|
||||
use the same system prompt and parser, eliminating drift.
|
||||
|
||||
Graduation asks: "does this active memory describe a TYPED engineering entity
|
||||
that belongs in the knowledge graph?" If yes, produce an entity candidate
|
||||
with type + name + description + zero-or-more relationship hints. If no,
|
||||
return null so the memory stays as-is.
|
||||
|
||||
Design note: we DON'T ask the LLM to resolve targets of relationships (e.g.,
|
||||
"connect to Subsystem 'Optics'"). That's done in a second pass after human
|
||||
review — partly to keep this prompt cheap, partly because name-matching
|
||||
targets across projects is a hard problem worth its own pass.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
from typing import Any
|
||||
|
||||
GRADUATION_PROMPT_VERSION = "graduate-0.1.0"
|
||||
MAX_CONTENT_CHARS = 1500
|
||||
|
||||
ENTITY_TYPES = {
|
||||
"project",
|
||||
"system",
|
||||
"subsystem",
|
||||
"component",
|
||||
"interface",
|
||||
"requirement",
|
||||
"constraint",
|
||||
"decision",
|
||||
"material",
|
||||
"parameter",
|
||||
"analysis_model",
|
||||
"result",
|
||||
"validation_claim",
|
||||
"vendor",
|
||||
"process",
|
||||
}
|
||||
|
||||
SYSTEM_PROMPT = """You are a knowledge-graph curator for an engineering firm's context system (AtoCore).
|
||||
|
||||
Your job: given one active MEMORY (a curated fact about an engineering project), decide whether it describes a TYPED engineering entity that belongs in the structured graph. If yes, emit the entity candidate. If no, return null.
|
||||
|
||||
A memory gets graduated when its content names a specific thing that has lifecycle, relationships, or cross-references in engineering work. A memory stays as-is when it's a general observation, preference, or loose context.
|
||||
|
||||
ENTITY TYPES (choose the best fit):
|
||||
|
||||
- project — a named project (usually already registered; rare to emit)
|
||||
- subsystem — a named chunk of a system with defined boundaries (e.g., "Primary Optics", "Cable Tensioning", "Motion Control")
|
||||
- component — a discrete physical or logical part (e.g., "Primary Mirror", "Pivot Pin", "Z-axis Servo Drive")
|
||||
- interface — a named boundary between two subsystems/components (e.g., "Mirror-to-Cell mounting interface")
|
||||
- requirement — a "must" or "shall" statement (e.g., "Surface figure < 25nm RMS")
|
||||
- constraint — a non-negotiable limit (e.g., "Thermal operating range 0-40°C")
|
||||
- decision — a committed design direction (e.g., "Selected Zerodur over ULE for primary blank")
|
||||
- material — a named material used in a component (e.g., "Zerodur", "Invar 36")
|
||||
- parameter — a specific named value or assumption (e.g., "Ambient temperature 22°C", "Lead time 6 weeks")
|
||||
- analysis_model — a named FEA / optical / thermal model (e.g., "Preston wear model v2")
|
||||
- result — a named measurement or simulation output (e.g., "FEA thermal sweep 2026-03")
|
||||
- validation_claim — an asserted claim to be backed by evidence (e.g., "Margin is adequate for full envelope")
|
||||
- vendor — a supplier / partner entity (e.g., "Schott AG", "ABB Space", "Nabeel")
|
||||
- process — a named workflow step (e.g., "Ion beam figuring pass", "Incoming inspection")
|
||||
- system — whole project's system envelope (rare; usually project handles this)
|
||||
|
||||
WHEN TO GRADUATE:
|
||||
|
||||
GRADUATE if the memory clearly names one of these entities with enough detail to be useful. Examples:
|
||||
- "Selected Zerodur for the p04 primary mirror blank" → 2 entities: decision(name="Select Zerodur for primary blank") + material(name="Zerodur")
|
||||
- "ABB Space (INO) is the polishing vendor for p04" → vendor(name="ABB Space")
|
||||
- "Surface figure target is < 25nm RMS after IBF" → requirement(name="Surface figure < 25nm RMS after IBF")
|
||||
- "The Preston model assumes 5N min contact pressure" → parameter(name="Preston min contact pressure = 5N")
|
||||
|
||||
DON'T GRADUATE if the memory is:
|
||||
- A preference or work-style note (those stay as memories)
|
||||
- A session observation ("we tested X today") — no durable typed thing
|
||||
- A general insight / rule of thumb ("Always calibrate before measuring")
|
||||
- An OpenClaw MEMORY.md import of conversational history
|
||||
- Something where you can't pick a clear entity type with confidence
|
||||
|
||||
OUTPUT FORMAT — exactly one JSON object:
|
||||
|
||||
If graduating, emit:
|
||||
{
|
||||
"graduate": true,
|
||||
"entity_type": "component|requirement|decision|...",
|
||||
"name": "short noun phrase, <60 chars",
|
||||
"description": "one-sentence description that adds context beyond the name",
|
||||
"confidence": 0.0-1.0,
|
||||
"relationships": [
|
||||
{"rel_type": "part_of|satisfies|uses_material|based_on_assumption|constrained_by|affected_by_decision|supports|evidenced_by|described_by", "target_hint": "name of the target entity (human will resolve)"}
|
||||
]
|
||||
}
|
||||
|
||||
If not graduating, emit:
|
||||
{"graduate": false, "reason": "one-sentence reason"}
|
||||
|
||||
Rules:
|
||||
- Output ONLY the JSON object, no markdown, no prose
|
||||
- name MUST be <60 chars and specific; reject vague names like "the system"
|
||||
- confidence: 0.6-0.7 is typical. Raise to 0.8+ only if the memory is very specific and unambiguous.
|
||||
- relationships array can be empty
|
||||
- target_hint is a free-text name; the human-review stage will resolve it to an actual entity id (or reject if the target doesn't exist yet)
|
||||
- If the memory describes MULTIPLE entities, pick the single most important one; a second pass can catch the others
|
||||
"""
|
||||
|
||||
|
||||
def build_user_message(memory_content: str, memory_project: str, memory_type: str) -> str:
|
||||
return (
|
||||
f"MEMORY PROJECT: {memory_project or '(unscoped)'}\n"
|
||||
f"MEMORY TYPE: {memory_type}\n\n"
|
||||
f"MEMORY CONTENT:\n{memory_content[:MAX_CONTENT_CHARS]}\n\n"
|
||||
"Return the JSON decision now."
|
||||
)
|
||||
|
||||
|
||||
def parse_graduation_output(raw: str) -> dict[str, Any] | None:
|
||||
"""Parse the LLM's graduation decision. Return None on any parse error.
|
||||
|
||||
On success returns the normalized decision dict with keys:
|
||||
graduate (bool), entity_type (str), name (str), description (str),
|
||||
confidence (float), relationships (list of {rel_type, target_hint})
|
||||
OR {"graduate": false, "reason": "..."}
|
||||
"""
|
||||
text = (raw or "").strip()
|
||||
if not text:
|
||||
return None
|
||||
if text.startswith("```"):
|
||||
text = text.strip("`")
|
||||
nl = text.find("\n")
|
||||
if nl >= 0:
|
||||
text = text[nl + 1:]
|
||||
if text.endswith("```"):
|
||||
text = text[:-3]
|
||||
text = text.strip()
|
||||
|
||||
# Tolerate leading prose
|
||||
if not text.lstrip().startswith("{"):
|
||||
start = text.find("{")
|
||||
end = text.rfind("}")
|
||||
if start >= 0 and end > start:
|
||||
text = text[start:end + 1]
|
||||
|
||||
try:
|
||||
parsed = json.loads(text)
|
||||
except json.JSONDecodeError:
|
||||
return None
|
||||
|
||||
if not isinstance(parsed, dict):
|
||||
return None
|
||||
|
||||
graduate = bool(parsed.get("graduate", False))
|
||||
if not graduate:
|
||||
return {"graduate": False, "reason": str(parsed.get("reason", ""))[:200]}
|
||||
|
||||
entity_type = str(parsed.get("entity_type") or "").strip().lower()
|
||||
if entity_type not in ENTITY_TYPES:
|
||||
return None
|
||||
|
||||
name = str(parsed.get("name") or "").strip()
|
||||
if not name or len(name) > 120:
|
||||
return None
|
||||
|
||||
description = str(parsed.get("description") or "").strip()[:500]
|
||||
|
||||
try:
|
||||
confidence = float(parsed.get("confidence", 0.6))
|
||||
except (TypeError, ValueError):
|
||||
confidence = 0.6
|
||||
confidence = max(0.0, min(1.0, confidence))
|
||||
|
||||
raw_rels = parsed.get("relationships") or []
|
||||
if not isinstance(raw_rels, list):
|
||||
raw_rels = []
|
||||
relationships: list[dict] = []
|
||||
for r in raw_rels[:10]:
|
||||
if not isinstance(r, dict):
|
||||
continue
|
||||
rtype = str(r.get("rel_type") or "").strip().lower()
|
||||
target = str(r.get("target_hint") or "").strip()
|
||||
if not rtype or not target:
|
||||
continue
|
||||
relationships.append({"rel_type": rtype, "target_hint": target[:120]})
|
||||
|
||||
return {
|
||||
"graduate": True,
|
||||
"entity_type": entity_type,
|
||||
"name": name,
|
||||
"description": description,
|
||||
"confidence": confidence,
|
||||
"relationships": relationships,
|
||||
}
|
||||
291
src/atocore/engineering/conflicts.py
Normal file
291
src/atocore/engineering/conflicts.py
Normal file
@@ -0,0 +1,291 @@
|
||||
"""Phase 5G — Conflict detection on entity promote.
|
||||
|
||||
When a candidate entity is promoted to active, we check whether another
|
||||
active entity is already claiming the "same slot" with an incompatible
|
||||
value. If so, we emit a conflicts row + conflict_members rows so the
|
||||
human can resolve.
|
||||
|
||||
Slot keys are per-entity-type (from ``conflict-model.md``). V1 starts
|
||||
narrow with 3 slot kinds to avoid false positives:
|
||||
|
||||
1. **component.material** — a component should normally have ONE
|
||||
dominant material (via USES_MATERIAL edge). Two active USES_MATERIAL
|
||||
edges from the same component pointing at different materials =
|
||||
conflict.
|
||||
2. **component.part_of** — a component should belong to AT MOST one
|
||||
subsystem (via PART_OF). Two active PART_OF edges = conflict.
|
||||
3. **requirement.value** — two active Requirements with the same name in
|
||||
the same project but different descriptions = conflict.
|
||||
|
||||
Rule: **flag, never block**. The promote succeeds; the conflict row is
|
||||
just a flag for the human. Users see conflicts in the dashboard and on
|
||||
wiki entity pages with a "⚠️ Disputed" badge.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import uuid
|
||||
from datetime import datetime, timezone
|
||||
|
||||
from atocore.models.database import get_connection
|
||||
from atocore.observability.logger import get_logger
|
||||
|
||||
log = get_logger("conflicts")
|
||||
|
||||
|
||||
def detect_conflicts_for_entity(entity_id: str) -> list[str]:
|
||||
"""Run conflict detection for a newly-promoted active entity.
|
||||
|
||||
Returns a list of conflict_ids created. Fail-open: any detection error
|
||||
is logged and returns an empty list; the promote itself is not affected.
|
||||
"""
|
||||
try:
|
||||
with get_connection() as conn:
|
||||
row = conn.execute(
|
||||
"SELECT * FROM entities WHERE id = ? AND status = 'active'",
|
||||
(entity_id,),
|
||||
).fetchone()
|
||||
if row is None:
|
||||
return []
|
||||
|
||||
created: list[str] = []
|
||||
etype = row["entity_type"]
|
||||
project = row["project"] or ""
|
||||
|
||||
if etype == "component":
|
||||
created.extend(_check_component_conflicts(entity_id, project))
|
||||
elif etype == "requirement":
|
||||
created.extend(_check_requirement_conflicts(entity_id, row["name"], project))
|
||||
|
||||
return created
|
||||
except Exception as e:
|
||||
log.warning("conflict_detection_failed", entity_id=entity_id, error=str(e))
|
||||
return []
|
||||
|
||||
|
||||
def _check_component_conflicts(component_id: str, project: str) -> list[str]:
|
||||
"""Check material + part_of slot uniqueness for a component."""
|
||||
created: list[str] = []
|
||||
with get_connection() as conn:
|
||||
# component.material conflicts
|
||||
mat_edges = conn.execute(
|
||||
"SELECT r.id AS rel_id, r.target_entity_id, e.name "
|
||||
"FROM relationships r "
|
||||
"JOIN entities e ON e.id = r.target_entity_id "
|
||||
"WHERE r.source_entity_id = ? AND r.relationship_type = 'uses_material' "
|
||||
"AND e.status = 'active'",
|
||||
(component_id,),
|
||||
).fetchall()
|
||||
if len(mat_edges) > 1:
|
||||
cid = _record_conflict(
|
||||
slot_kind="component.material",
|
||||
slot_key=component_id,
|
||||
project=project,
|
||||
note=f"component has {len(mat_edges)} active material edges",
|
||||
members=[
|
||||
{
|
||||
"kind": "entity",
|
||||
"id": m["target_entity_id"],
|
||||
"snapshot": m["name"],
|
||||
}
|
||||
for m in mat_edges
|
||||
],
|
||||
)
|
||||
if cid:
|
||||
created.append(cid)
|
||||
|
||||
# component.part_of conflicts
|
||||
pof_edges = conn.execute(
|
||||
"SELECT r.id AS rel_id, r.target_entity_id, e.name "
|
||||
"FROM relationships r "
|
||||
"JOIN entities e ON e.id = r.target_entity_id "
|
||||
"WHERE r.source_entity_id = ? AND r.relationship_type = 'part_of' "
|
||||
"AND e.status = 'active'",
|
||||
(component_id,),
|
||||
).fetchall()
|
||||
if len(pof_edges) > 1:
|
||||
cid = _record_conflict(
|
||||
slot_kind="component.part_of",
|
||||
slot_key=component_id,
|
||||
project=project,
|
||||
note=f"component is part_of {len(pof_edges)} subsystems",
|
||||
members=[
|
||||
{
|
||||
"kind": "entity",
|
||||
"id": p["target_entity_id"],
|
||||
"snapshot": p["name"],
|
||||
}
|
||||
for p in pof_edges
|
||||
],
|
||||
)
|
||||
if cid:
|
||||
created.append(cid)
|
||||
|
||||
return created
|
||||
|
||||
|
||||
def _check_requirement_conflicts(requirement_id: str, name: str, project: str) -> list[str]:
|
||||
"""Two active Requirements with the same name in the same project."""
|
||||
with get_connection() as conn:
|
||||
peers = conn.execute(
|
||||
"SELECT id, description FROM entities "
|
||||
"WHERE entity_type = 'requirement' AND status = 'active' "
|
||||
"AND project = ? AND LOWER(name) = LOWER(?) AND id != ?",
|
||||
(project, name, requirement_id),
|
||||
).fetchall()
|
||||
if not peers:
|
||||
return []
|
||||
|
||||
members = [{"kind": "entity", "id": requirement_id, "snapshot": name}]
|
||||
for p in peers:
|
||||
members.append({"kind": "entity", "id": p["id"],
|
||||
"snapshot": (p["description"] or "")[:200]})
|
||||
|
||||
cid = _record_conflict(
|
||||
slot_kind="requirement.name",
|
||||
slot_key=f"{project}|{name.lower()}",
|
||||
project=project,
|
||||
note=f"{len(peers)+1} active requirements share the name '{name}'",
|
||||
members=members,
|
||||
)
|
||||
return [cid] if cid else []
|
||||
|
||||
|
||||
def _record_conflict(
|
||||
slot_kind: str,
|
||||
slot_key: str,
|
||||
project: str,
|
||||
note: str,
|
||||
members: list[dict],
|
||||
) -> str | None:
|
||||
"""Persist a conflict + its members; skip if an open conflict already
|
||||
exists for the same (slot_kind, slot_key)."""
|
||||
try:
|
||||
with get_connection() as conn:
|
||||
existing = conn.execute(
|
||||
"SELECT id FROM conflicts WHERE slot_kind = ? AND slot_key = ? "
|
||||
"AND status = 'open'",
|
||||
(slot_kind, slot_key),
|
||||
).fetchone()
|
||||
if existing:
|
||||
return None # don't dup
|
||||
|
||||
conflict_id = str(uuid.uuid4())
|
||||
conn.execute(
|
||||
"INSERT INTO conflicts (id, slot_kind, slot_key, project, "
|
||||
"status, note) VALUES (?, ?, ?, ?, 'open', ?)",
|
||||
(conflict_id, slot_kind, slot_key, project, note[:500]),
|
||||
)
|
||||
for m in members:
|
||||
conn.execute(
|
||||
"INSERT INTO conflict_members (id, conflict_id, member_kind, "
|
||||
"member_id, value_snapshot) VALUES (?, ?, ?, ?, ?)",
|
||||
(str(uuid.uuid4()), conflict_id,
|
||||
m.get("kind", "entity"), m.get("id", ""),
|
||||
(m.get("snapshot") or "")[:500]),
|
||||
)
|
||||
|
||||
log.info("conflict_detected", conflict_id=conflict_id,
|
||||
slot_kind=slot_kind, project=project)
|
||||
|
||||
# Emit a warning alert so the operator sees it
|
||||
try:
|
||||
from atocore.observability.alerts import emit_alert
|
||||
emit_alert(
|
||||
severity="warning",
|
||||
title=f"Entity conflict: {slot_kind}",
|
||||
message=note,
|
||||
context={"project": project, "slot_key": slot_key,
|
||||
"member_count": len(members)},
|
||||
)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
return conflict_id
|
||||
except Exception as e:
|
||||
log.warning("conflict_record_failed", error=str(e))
|
||||
return None
|
||||
|
||||
|
||||
def list_open_conflicts(project: str | None = None) -> list[dict]:
|
||||
"""Return open conflicts with their members."""
|
||||
with get_connection() as conn:
|
||||
query = "SELECT * FROM conflicts WHERE status = 'open'"
|
||||
params: list = []
|
||||
if project:
|
||||
query += " AND project = ?"
|
||||
params.append(project)
|
||||
query += " ORDER BY detected_at DESC"
|
||||
rows = conn.execute(query, params).fetchall()
|
||||
|
||||
conflicts = []
|
||||
for r in rows:
|
||||
member_rows = conn.execute(
|
||||
"SELECT * FROM conflict_members WHERE conflict_id = ?",
|
||||
(r["id"],),
|
||||
).fetchall()
|
||||
conflicts.append({
|
||||
"id": r["id"],
|
||||
"slot_kind": r["slot_kind"],
|
||||
"slot_key": r["slot_key"],
|
||||
"project": r["project"] or "",
|
||||
"status": r["status"],
|
||||
"note": r["note"] or "",
|
||||
"detected_at": r["detected_at"],
|
||||
"members": [
|
||||
{
|
||||
"id": m["id"],
|
||||
"member_kind": m["member_kind"],
|
||||
"member_id": m["member_id"],
|
||||
"snapshot": m["value_snapshot"] or "",
|
||||
}
|
||||
for m in member_rows
|
||||
],
|
||||
})
|
||||
return conflicts
|
||||
|
||||
|
||||
def resolve_conflict(
|
||||
conflict_id: str,
|
||||
action: str, # "dismiss", "supersede_others", "no_action"
|
||||
winner_id: str | None = None,
|
||||
actor: str = "api",
|
||||
) -> bool:
|
||||
"""Resolve a conflict. Optionally marks non-winner members as superseded."""
|
||||
if action not in ("dismiss", "supersede_others", "no_action"):
|
||||
raise ValueError(f"Invalid action: {action}")
|
||||
|
||||
now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
|
||||
|
||||
with get_connection() as conn:
|
||||
row = conn.execute(
|
||||
"SELECT * FROM conflicts WHERE id = ?", (conflict_id,)
|
||||
).fetchone()
|
||||
if row is None or row["status"] != "open":
|
||||
return False
|
||||
|
||||
if action == "supersede_others":
|
||||
if not winner_id:
|
||||
raise ValueError("winner_id required for supersede_others")
|
||||
# Mark non-winner member entities as superseded
|
||||
member_rows = conn.execute(
|
||||
"SELECT member_id FROM conflict_members WHERE conflict_id = ?",
|
||||
(conflict_id,),
|
||||
).fetchall()
|
||||
for m in member_rows:
|
||||
if m["member_id"] != winner_id:
|
||||
conn.execute(
|
||||
"UPDATE entities SET status = 'superseded', updated_at = ? "
|
||||
"WHERE id = ? AND status = 'active'",
|
||||
(now, m["member_id"]),
|
||||
)
|
||||
|
||||
conn.execute(
|
||||
"UPDATE conflicts SET status = 'resolved', resolution = ?, "
|
||||
"resolved_at = ? WHERE id = ?",
|
||||
(action, now, conflict_id),
|
||||
)
|
||||
|
||||
log.info("conflict_resolved", conflict_id=conflict_id,
|
||||
action=action, actor=actor)
|
||||
return True
|
||||
@@ -29,6 +29,7 @@ def generate_project_overview(project: str) -> str:
|
||||
sections = [
|
||||
_header(project),
|
||||
_synthesis_section(project),
|
||||
_gaps_section(project), # Phase 5: killer queries surface here
|
||||
_state_section(project),
|
||||
_system_architecture(project),
|
||||
_decisions_section(project),
|
||||
@@ -41,6 +42,66 @@ def generate_project_overview(project: str) -> str:
|
||||
return "\n\n".join(s for s in sections if s)
|
||||
|
||||
|
||||
def _gaps_section(project: str) -> str:
|
||||
"""Phase 5: surface the 3 killer-query gaps on every project page.
|
||||
|
||||
If any gap is non-empty, it appears near the top so the director
|
||||
sees "what am I forgetting?" before the rest of the report.
|
||||
"""
|
||||
try:
|
||||
from atocore.engineering.queries import all_gaps
|
||||
result = all_gaps(project)
|
||||
except Exception:
|
||||
return ""
|
||||
|
||||
orphan = result["orphan_requirements"]["count"]
|
||||
risky = result["risky_decisions"]["count"]
|
||||
unsup = result["unsupported_claims"]["count"]
|
||||
|
||||
if orphan == 0 and risky == 0 and unsup == 0:
|
||||
return (
|
||||
"## Coverage Gaps\n\n"
|
||||
"> ✅ No gaps detected: every requirement is satisfied, "
|
||||
"no decisions rest on flagged assumptions, every claim has evidence.\n"
|
||||
)
|
||||
|
||||
lines = ["## Coverage Gaps", ""]
|
||||
lines.append(
|
||||
"> ⚠️ Items below need attention — gaps in the engineering graph.\n"
|
||||
)
|
||||
|
||||
if orphan:
|
||||
lines.append(f"### {orphan} Orphan Requirement(s)")
|
||||
lines.append("*Requirements with no component claiming to satisfy them:*")
|
||||
lines.append("")
|
||||
for r in result["orphan_requirements"]["gaps"][:10]:
|
||||
lines.append(f"- **{r['name']}** — {(r['description'] or '')[:120]}")
|
||||
if orphan > 10:
|
||||
lines.append(f"- _...and {orphan - 10} more_")
|
||||
lines.append("")
|
||||
|
||||
if risky:
|
||||
lines.append(f"### {risky} Risky Decision(s)")
|
||||
lines.append("*Decisions based on assumptions that are flagged, superseded, or invalid:*")
|
||||
lines.append("")
|
||||
for d in result["risky_decisions"]["gaps"][:10]:
|
||||
lines.append(
|
||||
f"- **{d['decision_name']}** — based on flagged assumption "
|
||||
f"_{d['assumption_name']}_ ({d['assumption_status']})"
|
||||
)
|
||||
lines.append("")
|
||||
|
||||
if unsup:
|
||||
lines.append(f"### {unsup} Unsupported Claim(s)")
|
||||
lines.append("*Validation claims with no supporting Result entity:*")
|
||||
lines.append("")
|
||||
for c in result["unsupported_claims"]["gaps"][:10]:
|
||||
lines.append(f"- **{c['name']}** — {(c['description'] or '')[:120]}")
|
||||
lines.append("")
|
||||
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
def _synthesis_section(project: str) -> str:
|
||||
"""Generate a short LLM synthesis of the current project state.
|
||||
|
||||
|
||||
467
src/atocore/engineering/queries.py
Normal file
467
src/atocore/engineering/queries.py
Normal file
@@ -0,0 +1,467 @@
|
||||
"""Phase 5 Engineering V1 — The 10 canonical queries.
|
||||
|
||||
Each function maps to one or more catalog IDs in
|
||||
``docs/architecture/engineering-query-catalog.md``. Return values are plain
|
||||
dicts so API and wiki renderers can consume them without importing dataclasses.
|
||||
|
||||
Design principles:
|
||||
- All queries filter to status='active' unless the caller asks otherwise
|
||||
- All project filters go through ``resolve_project_name`` (canonicalization)
|
||||
- Graph traversals are bounded (depth <= 3 for impact, limit 200 for lists)
|
||||
- The 3 "killer" queries (gaps) accept project as required — gaps are always
|
||||
scoped to one project in V1
|
||||
|
||||
These queries are the *useful surface* of the entity graph. Before this module,
|
||||
the graph was data with no narrative; after this module, the director can ask
|
||||
real questions about coverage, risk, and evidence.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from datetime import datetime, timezone
|
||||
|
||||
from atocore.engineering.service import (
|
||||
Entity,
|
||||
_row_to_entity,
|
||||
get_entity,
|
||||
get_relationships,
|
||||
)
|
||||
from atocore.models.database import get_connection
|
||||
from atocore.projects.registry import resolve_project_name
|
||||
|
||||
|
||||
# ============================================================
|
||||
# Structure queries (Q-001, Q-004, Q-005, Q-008)
|
||||
# ============================================================
|
||||
|
||||
|
||||
def system_map(project: str) -> dict:
|
||||
"""Q-001 + Q-004: return the full subsystem/component tree for a project.
|
||||
|
||||
Shape:
|
||||
{
|
||||
"project": "p05-interferometer",
|
||||
"subsystems": [
|
||||
{
|
||||
"id": ..., "name": ..., "description": ...,
|
||||
"components": [{id, name, description, materials: [...]}],
|
||||
},
|
||||
...
|
||||
],
|
||||
"orphan_components": [...], # components with no PART_OF edge
|
||||
}
|
||||
"""
|
||||
project = resolve_project_name(project) if project else ""
|
||||
out: dict = {"project": project, "subsystems": [], "orphan_components": []}
|
||||
|
||||
with get_connection() as conn:
|
||||
# All subsystems in project
|
||||
subsys_rows = conn.execute(
|
||||
"SELECT * FROM entities WHERE status = 'active' "
|
||||
"AND project = ? AND entity_type = 'subsystem' "
|
||||
"ORDER BY name",
|
||||
(project,),
|
||||
).fetchall()
|
||||
|
||||
# All components in project
|
||||
comp_rows = conn.execute(
|
||||
"SELECT * FROM entities WHERE status = 'active' "
|
||||
"AND project = ? AND entity_type = 'component'",
|
||||
(project,),
|
||||
).fetchall()
|
||||
|
||||
# PART_OF edges: component → subsystem
|
||||
part_of_rows = conn.execute(
|
||||
"SELECT source_entity_id, target_entity_id FROM relationships "
|
||||
"WHERE relationship_type = 'part_of'"
|
||||
).fetchall()
|
||||
part_of_map: dict[str, str] = {
|
||||
r["source_entity_id"]: r["target_entity_id"] for r in part_of_rows
|
||||
}
|
||||
|
||||
# uses_material edges for components
|
||||
mat_rows = conn.execute(
|
||||
"SELECT r.source_entity_id, e.name FROM relationships r "
|
||||
"JOIN entities e ON e.id = r.target_entity_id "
|
||||
"WHERE r.relationship_type = 'uses_material' AND e.status = 'active'"
|
||||
).fetchall()
|
||||
materials_by_comp: dict[str, list[str]] = {}
|
||||
for r in mat_rows:
|
||||
materials_by_comp.setdefault(r["source_entity_id"], []).append(r["name"])
|
||||
|
||||
# Build: subsystems → their components
|
||||
subsys_comps: dict[str, list[dict]] = {s["id"]: [] for s in subsys_rows}
|
||||
orphans: list[dict] = []
|
||||
for c in comp_rows:
|
||||
parent = part_of_map.get(c["id"])
|
||||
comp_dict = {
|
||||
"id": c["id"],
|
||||
"name": c["name"],
|
||||
"description": c["description"] or "",
|
||||
"materials": materials_by_comp.get(c["id"], []),
|
||||
}
|
||||
if parent and parent in subsys_comps:
|
||||
subsys_comps[parent].append(comp_dict)
|
||||
else:
|
||||
orphans.append(comp_dict)
|
||||
|
||||
out["subsystems"] = [
|
||||
{
|
||||
"id": s["id"],
|
||||
"name": s["name"],
|
||||
"description": s["description"] or "",
|
||||
"components": subsys_comps.get(s["id"], []),
|
||||
}
|
||||
for s in subsys_rows
|
||||
]
|
||||
out["orphan_components"] = orphans
|
||||
return out
|
||||
|
||||
|
||||
def decisions_affecting(project: str, subsystem_id: str | None = None) -> dict:
|
||||
"""Q-008: decisions that affect a subsystem (or whole project).
|
||||
|
||||
Walks AFFECTED_BY_DECISION edges. If subsystem_id is given, returns
|
||||
decisions linked to that subsystem or any of its components. Otherwise,
|
||||
all decisions in the project.
|
||||
"""
|
||||
project = resolve_project_name(project) if project else ""
|
||||
|
||||
target_ids: set[str] = set()
|
||||
if subsystem_id:
|
||||
target_ids.add(subsystem_id)
|
||||
# Include components PART_OF the subsystem
|
||||
with get_connection() as conn:
|
||||
rows = conn.execute(
|
||||
"SELECT source_entity_id FROM relationships "
|
||||
"WHERE relationship_type = 'part_of' AND target_entity_id = ?",
|
||||
(subsystem_id,),
|
||||
).fetchall()
|
||||
for r in rows:
|
||||
target_ids.add(r["source_entity_id"])
|
||||
|
||||
with get_connection() as conn:
|
||||
if target_ids:
|
||||
placeholders = ",".join("?" * len(target_ids))
|
||||
rows = conn.execute(
|
||||
f"SELECT DISTINCT e.* FROM entities e "
|
||||
f"JOIN relationships r ON r.source_entity_id = e.id "
|
||||
f"WHERE e.status = 'active' AND e.entity_type = 'decision' "
|
||||
f"AND e.project = ? AND r.relationship_type = 'affected_by_decision' "
|
||||
f"AND r.target_entity_id IN ({placeholders}) "
|
||||
f"ORDER BY e.updated_at DESC",
|
||||
(project, *target_ids),
|
||||
).fetchall()
|
||||
else:
|
||||
rows = conn.execute(
|
||||
"SELECT * FROM entities WHERE status = 'active' "
|
||||
"AND entity_type = 'decision' AND project = ? "
|
||||
"ORDER BY updated_at DESC LIMIT 200",
|
||||
(project,),
|
||||
).fetchall()
|
||||
|
||||
decisions = [_entity_dict(_row_to_entity(r)) for r in rows]
|
||||
return {
|
||||
"project": project,
|
||||
"subsystem_id": subsystem_id or "",
|
||||
"decisions": decisions,
|
||||
"count": len(decisions),
|
||||
}
|
||||
|
||||
|
||||
def requirements_for(component_id: str) -> dict:
|
||||
"""Q-005: requirements that a component satisfies."""
|
||||
with get_connection() as conn:
|
||||
# Component → SATISFIES → Requirement
|
||||
rows = conn.execute(
|
||||
"SELECT e.* FROM entities e "
|
||||
"JOIN relationships r ON r.target_entity_id = e.id "
|
||||
"WHERE r.source_entity_id = ? AND r.relationship_type = 'satisfies' "
|
||||
"AND e.entity_type = 'requirement' AND e.status = 'active' "
|
||||
"ORDER BY e.name",
|
||||
(component_id,),
|
||||
).fetchall()
|
||||
requirements = [_entity_dict(_row_to_entity(r)) for r in rows]
|
||||
return {
|
||||
"component_id": component_id,
|
||||
"requirements": requirements,
|
||||
"count": len(requirements),
|
||||
}
|
||||
|
||||
|
||||
def recent_changes(project: str, since: str | None = None, limit: int = 50) -> dict:
|
||||
"""Q-013: what changed recently in the project (entity audit log).
|
||||
|
||||
Uses the shared memory_audit table filtered by entity_kind='entity' and
|
||||
joins back to entities for the project scope.
|
||||
"""
|
||||
project = resolve_project_name(project) if project else ""
|
||||
since = since or "2020-01-01"
|
||||
|
||||
with get_connection() as conn:
|
||||
rows = conn.execute(
|
||||
"SELECT a.id, a.memory_id AS entity_id, a.action, a.actor, "
|
||||
"a.timestamp, a.note, e.entity_type, e.name, e.project "
|
||||
"FROM memory_audit a "
|
||||
"LEFT JOIN entities e ON e.id = a.memory_id "
|
||||
"WHERE a.entity_kind = 'entity' AND a.timestamp >= ? "
|
||||
"AND (e.project = ? OR e.project IS NULL) "
|
||||
"ORDER BY a.timestamp DESC LIMIT ?",
|
||||
(since, project, limit),
|
||||
).fetchall()
|
||||
|
||||
changes = []
|
||||
for r in rows:
|
||||
changes.append({
|
||||
"audit_id": r["id"],
|
||||
"entity_id": r["entity_id"],
|
||||
"entity_type": r["entity_type"] or "?",
|
||||
"entity_name": r["name"] or "(deleted)",
|
||||
"action": r["action"],
|
||||
"actor": r["actor"] or "api",
|
||||
"note": r["note"] or "",
|
||||
"timestamp": r["timestamp"],
|
||||
})
|
||||
return {"project": project, "since": since, "changes": changes, "count": len(changes)}
|
||||
|
||||
|
||||
# ============================================================
|
||||
# Killer queries (Q-006, Q-009, Q-011) — the "what am I forgetting?" queries
|
||||
# ============================================================
|
||||
|
||||
|
||||
def orphan_requirements(project: str) -> dict:
|
||||
"""Q-006: requirements in project with NO inbound SATISFIES edge.
|
||||
|
||||
These are "something we said must be true" with nothing actually
|
||||
satisfying them. The single highest-value query for an engineering
|
||||
director: shows what's unclaimed by design.
|
||||
"""
|
||||
project = resolve_project_name(project) if project else ""
|
||||
|
||||
with get_connection() as conn:
|
||||
rows = conn.execute(
|
||||
"SELECT * FROM entities WHERE status = 'active' "
|
||||
"AND project = ? AND entity_type = 'requirement' "
|
||||
"AND NOT EXISTS ("
|
||||
" SELECT 1 FROM relationships r "
|
||||
" WHERE r.relationship_type = 'satisfies' "
|
||||
" AND r.target_entity_id = entities.id"
|
||||
") "
|
||||
"ORDER BY updated_at DESC",
|
||||
(project,),
|
||||
).fetchall()
|
||||
|
||||
orphans = [_entity_dict(_row_to_entity(r)) for r in rows]
|
||||
return {
|
||||
"project": project,
|
||||
"query": "Q-006 orphan requirements",
|
||||
"description": "Requirements with no SATISFIES relationship — nothing claims to meet them.",
|
||||
"gaps": orphans,
|
||||
"count": len(orphans),
|
||||
}
|
||||
|
||||
|
||||
def risky_decisions(project: str) -> dict:
|
||||
"""Q-009: decisions linked to assumptions flagged as unresolved.
|
||||
|
||||
Walks BASED_ON_ASSUMPTION edges. An assumption is "flagged" if its
|
||||
properties.flagged=True OR status='superseded' OR status='invalid'.
|
||||
"""
|
||||
project = resolve_project_name(project) if project else ""
|
||||
|
||||
with get_connection() as conn:
|
||||
rows = conn.execute(
|
||||
"SELECT DISTINCT d.*, a.name AS assumption_name, a.id AS assumption_id, "
|
||||
"a.status AS assumption_status, a.properties AS assumption_props "
|
||||
"FROM entities d "
|
||||
"JOIN relationships r ON r.source_entity_id = d.id "
|
||||
"JOIN entities a ON a.id = r.target_entity_id "
|
||||
"WHERE d.status = 'active' AND d.entity_type = 'decision' "
|
||||
"AND d.project = ? "
|
||||
"AND r.relationship_type = 'based_on_assumption' "
|
||||
"AND ("
|
||||
" a.status IN ('superseded', 'invalid') OR "
|
||||
" a.properties LIKE '%\"flagged\": true%' OR "
|
||||
" a.properties LIKE '%\"flagged\":true%'"
|
||||
") "
|
||||
"ORDER BY d.updated_at DESC",
|
||||
(project,),
|
||||
).fetchall()
|
||||
|
||||
risky = []
|
||||
for r in rows:
|
||||
risky.append({
|
||||
"decision_id": r["id"],
|
||||
"decision_name": r["name"],
|
||||
"decision_description": r["description"] or "",
|
||||
"assumption_id": r["assumption_id"],
|
||||
"assumption_name": r["assumption_name"],
|
||||
"assumption_status": r["assumption_status"],
|
||||
})
|
||||
return {
|
||||
"project": project,
|
||||
"query": "Q-009 risky decisions",
|
||||
"description": "Decisions based on assumptions that are flagged, superseded, or invalid.",
|
||||
"gaps": risky,
|
||||
"count": len(risky),
|
||||
}
|
||||
|
||||
|
||||
def unsupported_claims(project: str) -> dict:
|
||||
"""Q-011: validation claims with NO inbound SUPPORTS edge.
|
||||
|
||||
These are asserted claims (e.g., "margin is adequate") with no
|
||||
Result entity actually supporting them. High-risk: the engineer
|
||||
believes it, but there's no evidence on file.
|
||||
"""
|
||||
project = resolve_project_name(project) if project else ""
|
||||
|
||||
with get_connection() as conn:
|
||||
rows = conn.execute(
|
||||
"SELECT * FROM entities WHERE status = 'active' "
|
||||
"AND project = ? AND entity_type = 'validation_claim' "
|
||||
"AND NOT EXISTS ("
|
||||
" SELECT 1 FROM relationships r "
|
||||
" WHERE r.relationship_type = 'supports' "
|
||||
" AND r.target_entity_id = entities.id"
|
||||
") "
|
||||
"ORDER BY updated_at DESC",
|
||||
(project,),
|
||||
).fetchall()
|
||||
|
||||
claims = [_entity_dict(_row_to_entity(r)) for r in rows]
|
||||
return {
|
||||
"project": project,
|
||||
"query": "Q-011 unsupported claims",
|
||||
"description": "Validation claims with no supporting Result — asserted but not evidenced.",
|
||||
"gaps": claims,
|
||||
"count": len(claims),
|
||||
}
|
||||
|
||||
|
||||
def all_gaps(project: str) -> dict:
|
||||
"""Combined: run Q-006, Q-009, Q-011 for a project in one go."""
|
||||
return {
|
||||
"project": resolve_project_name(project) if project else "",
|
||||
"generated_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
|
||||
"orphan_requirements": orphan_requirements(project),
|
||||
"risky_decisions": risky_decisions(project),
|
||||
"unsupported_claims": unsupported_claims(project),
|
||||
}
|
||||
|
||||
|
||||
# ============================================================
|
||||
# History + impact (Q-016, Q-017)
|
||||
# ============================================================
|
||||
|
||||
|
||||
def impact_analysis(entity_id: str, max_depth: int = 3) -> dict:
|
||||
"""Q-016: transitive outbound reach of an entity.
|
||||
|
||||
Walks outbound edges breadth-first to max_depth. Answers "what would
|
||||
be affected if I changed component X?" by finding everything downstream.
|
||||
"""
|
||||
visited: set[str] = {entity_id}
|
||||
impacted: list[dict] = []
|
||||
frontier = [(entity_id, 0)]
|
||||
|
||||
while frontier:
|
||||
current_id, depth = frontier.pop(0)
|
||||
if depth >= max_depth:
|
||||
continue
|
||||
with get_connection() as conn:
|
||||
rows = conn.execute(
|
||||
"SELECT r.relationship_type, r.target_entity_id, "
|
||||
"e.entity_type, e.name, e.status "
|
||||
"FROM relationships r "
|
||||
"JOIN entities e ON e.id = r.target_entity_id "
|
||||
"WHERE r.source_entity_id = ? AND e.status = 'active'",
|
||||
(current_id,),
|
||||
).fetchall()
|
||||
for r in rows:
|
||||
tid = r["target_entity_id"]
|
||||
if tid in visited:
|
||||
continue
|
||||
visited.add(tid)
|
||||
impacted.append({
|
||||
"entity_id": tid,
|
||||
"entity_type": r["entity_type"],
|
||||
"name": r["name"],
|
||||
"relationship": r["relationship_type"],
|
||||
"depth": depth + 1,
|
||||
})
|
||||
frontier.append((tid, depth + 1))
|
||||
|
||||
root = get_entity(entity_id)
|
||||
return {
|
||||
"root": _entity_dict(root) if root else None,
|
||||
"impacted_count": len(impacted),
|
||||
"impacted": impacted,
|
||||
"max_depth": max_depth,
|
||||
}
|
||||
|
||||
|
||||
def evidence_chain(entity_id: str) -> dict:
|
||||
"""Q-017: what evidence supports this entity?
|
||||
|
||||
Walks inbound SUPPORTS / EVIDENCED_BY / DESCRIBED_BY edges to surface
|
||||
the provenance chain: "this claim is supported by that result, which
|
||||
was produced by that analysis model, which was described by that doc."
|
||||
"""
|
||||
provenance_edges = ("supports", "evidenced_by", "described_by",
|
||||
"validated_by", "analyzed_by")
|
||||
placeholders = ",".join("?" * len(provenance_edges))
|
||||
|
||||
with get_connection() as conn:
|
||||
# Inbound edges of the provenance family
|
||||
inbound_rows = conn.execute(
|
||||
f"SELECT r.relationship_type, r.source_entity_id, "
|
||||
f"e.entity_type, e.name, e.description, e.status "
|
||||
f"FROM relationships r "
|
||||
f"JOIN entities e ON e.id = r.source_entity_id "
|
||||
f"WHERE r.target_entity_id = ? AND e.status = 'active' "
|
||||
f"AND r.relationship_type IN ({placeholders})",
|
||||
(entity_id, *provenance_edges),
|
||||
).fetchall()
|
||||
|
||||
# Also look at source_refs on the entity itself
|
||||
root = get_entity(entity_id)
|
||||
|
||||
chain = []
|
||||
for r in inbound_rows:
|
||||
chain.append({
|
||||
"via": r["relationship_type"],
|
||||
"source_id": r["source_entity_id"],
|
||||
"source_type": r["entity_type"],
|
||||
"source_name": r["name"],
|
||||
"source_description": (r["description"] or "")[:200],
|
||||
})
|
||||
|
||||
return {
|
||||
"root": _entity_dict(root) if root else None,
|
||||
"direct_source_refs": root.source_refs if root else [],
|
||||
"evidence_chain": chain,
|
||||
"count": len(chain),
|
||||
}
|
||||
|
||||
|
||||
# ============================================================
|
||||
# Helpers
|
||||
# ============================================================
|
||||
|
||||
|
||||
def _entity_dict(e: Entity) -> dict:
|
||||
"""Flatten an Entity to a public-API dict."""
|
||||
return {
|
||||
"id": e.id,
|
||||
"entity_type": e.entity_type,
|
||||
"name": e.name,
|
||||
"project": e.project,
|
||||
"description": e.description,
|
||||
"properties": e.properties,
|
||||
"status": e.status,
|
||||
"confidence": e.confidence,
|
||||
"source_refs": e.source_refs,
|
||||
"updated_at": e.updated_at,
|
||||
}
|
||||
@@ -9,6 +9,7 @@ from datetime import datetime, timezone
|
||||
|
||||
from atocore.models.database import get_connection
|
||||
from atocore.observability.logger import get_logger
|
||||
from atocore.projects.registry import resolve_project_name
|
||||
|
||||
log = get_logger("engineering")
|
||||
|
||||
@@ -31,18 +32,29 @@ ENTITY_TYPES = [
|
||||
]
|
||||
|
||||
RELATIONSHIP_TYPES = [
|
||||
# Structural family
|
||||
"contains",
|
||||
"part_of",
|
||||
"interfaces_with",
|
||||
# Intent family
|
||||
"satisfies",
|
||||
"constrained_by",
|
||||
"affected_by_decision",
|
||||
"based_on_assumption", # Phase 5 — Q-009 killer query
|
||||
"supersedes",
|
||||
# Validation family
|
||||
"analyzed_by",
|
||||
"validated_by",
|
||||
"supports", # Phase 5 — Q-011 killer query
|
||||
"conflicts_with", # Phase 5 — Q-012 future
|
||||
"depends_on",
|
||||
"uses_material",
|
||||
# Provenance family
|
||||
"described_by",
|
||||
"supersedes",
|
||||
"updated_by_session", # Phase 5 — session→entity provenance
|
||||
"evidenced_by", # Phase 5 — Q-017 evidence trace
|
||||
"summarized_in", # Phase 5 — mirror caches
|
||||
# Domain-specific (pre-existing, retained)
|
||||
"uses_material",
|
||||
]
|
||||
|
||||
ENTITY_STATUSES = ["candidate", "active", "superseded", "invalid"]
|
||||
@@ -132,6 +144,7 @@ def create_entity(
|
||||
status: str = "active",
|
||||
confidence: float = 1.0,
|
||||
source_refs: list[str] | None = None,
|
||||
actor: str = "api",
|
||||
) -> Entity:
|
||||
if entity_type not in ENTITY_TYPES:
|
||||
raise ValueError(f"Invalid entity type: {entity_type}. Must be one of {ENTITY_TYPES}")
|
||||
@@ -140,6 +153,11 @@ def create_entity(
|
||||
if not name or not name.strip():
|
||||
raise ValueError("Entity name must be non-empty")
|
||||
|
||||
# Phase 5: enforce project canonicalization contract at the write seam.
|
||||
# Aliases like "p04" become "p04-gigabit" so downstream reads stay
|
||||
# consistent with the registry.
|
||||
project = resolve_project_name(project) if project else ""
|
||||
|
||||
entity_id = str(uuid.uuid4())
|
||||
now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
|
||||
props = properties or {}
|
||||
@@ -159,6 +177,22 @@ def create_entity(
|
||||
)
|
||||
|
||||
log.info("entity_created", entity_id=entity_id, entity_type=entity_type, name=name)
|
||||
|
||||
# Phase 5: entity audit rows share the memory_audit table via
|
||||
# entity_kind="entity" discriminator. Same infrastructure, unified history.
|
||||
_audit_entity(
|
||||
entity_id=entity_id,
|
||||
action="created",
|
||||
actor=actor,
|
||||
after={
|
||||
"entity_type": entity_type,
|
||||
"name": name.strip(),
|
||||
"project": project,
|
||||
"status": status,
|
||||
"confidence": confidence,
|
||||
},
|
||||
)
|
||||
|
||||
return Entity(
|
||||
id=entity_id, entity_type=entity_type, name=name.strip(),
|
||||
project=project, description=description, properties=props,
|
||||
@@ -167,6 +201,35 @@ def create_entity(
|
||||
)
|
||||
|
||||
|
||||
def _audit_entity(
|
||||
entity_id: str,
|
||||
action: str,
|
||||
actor: str = "api",
|
||||
before: dict | None = None,
|
||||
after: dict | None = None,
|
||||
note: str = "",
|
||||
) -> None:
|
||||
"""Append an entity mutation row to the shared memory_audit table."""
|
||||
try:
|
||||
with get_connection() as conn:
|
||||
conn.execute(
|
||||
"INSERT INTO memory_audit (id, memory_id, action, actor, "
|
||||
"before_json, after_json, note, entity_kind) "
|
||||
"VALUES (?, ?, ?, ?, ?, ?, ?, 'entity')",
|
||||
(
|
||||
str(uuid.uuid4()),
|
||||
entity_id,
|
||||
action,
|
||||
actor or "api",
|
||||
json.dumps(before or {}),
|
||||
json.dumps(after or {}),
|
||||
(note or "")[:500],
|
||||
),
|
||||
)
|
||||
except Exception as e:
|
||||
log.warning("entity_audit_failed", entity_id=entity_id, action=action, error=str(e))
|
||||
|
||||
|
||||
def create_relationship(
|
||||
source_entity_id: str,
|
||||
target_entity_id: str,
|
||||
@@ -198,6 +261,17 @@ def create_relationship(
|
||||
target=target_entity_id,
|
||||
rel_type=relationship_type,
|
||||
)
|
||||
# Phase 5: relationship audit as an entity action on the source
|
||||
_audit_entity(
|
||||
entity_id=source_entity_id,
|
||||
action="relationship_added",
|
||||
actor="api",
|
||||
after={
|
||||
"rel_id": rel_id,
|
||||
"rel_type": relationship_type,
|
||||
"target": target_entity_id,
|
||||
},
|
||||
)
|
||||
return Relationship(
|
||||
id=rel_id, source_entity_id=source_entity_id,
|
||||
target_entity_id=target_entity_id,
|
||||
@@ -206,6 +280,190 @@ def create_relationship(
|
||||
)
|
||||
|
||||
|
||||
# --- Phase 5: Entity promote/reject lifecycle ---
|
||||
|
||||
|
||||
def _set_entity_status(
|
||||
entity_id: str,
|
||||
new_status: str,
|
||||
actor: str = "api",
|
||||
note: str = "",
|
||||
) -> bool:
|
||||
"""Transition an entity's status with audit."""
|
||||
if new_status not in ENTITY_STATUSES:
|
||||
raise ValueError(f"Invalid status: {new_status}")
|
||||
|
||||
with get_connection() as conn:
|
||||
row = conn.execute(
|
||||
"SELECT status FROM entities WHERE id = ?", (entity_id,)
|
||||
).fetchone()
|
||||
if row is None:
|
||||
return False
|
||||
old_status = row["status"]
|
||||
if old_status == new_status:
|
||||
return False
|
||||
now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
|
||||
conn.execute(
|
||||
"UPDATE entities SET status = ?, updated_at = ? WHERE id = ?",
|
||||
(new_status, now, entity_id),
|
||||
)
|
||||
|
||||
# Action verb mirrors memory pattern
|
||||
if new_status == "active" and old_status == "candidate":
|
||||
action = "promoted"
|
||||
elif new_status == "invalid" and old_status == "candidate":
|
||||
action = "rejected"
|
||||
elif new_status == "invalid":
|
||||
action = "invalidated"
|
||||
elif new_status == "superseded":
|
||||
action = "superseded"
|
||||
else:
|
||||
action = "status_changed"
|
||||
|
||||
_audit_entity(
|
||||
entity_id=entity_id,
|
||||
action=action,
|
||||
actor=actor,
|
||||
before={"status": old_status},
|
||||
after={"status": new_status},
|
||||
note=note,
|
||||
)
|
||||
log.info("entity_status_changed", entity_id=entity_id,
|
||||
old=old_status, new=new_status, action=action)
|
||||
return True
|
||||
|
||||
|
||||
def promote_entity(entity_id: str, actor: str = "api", note: str = "") -> bool:
|
||||
"""Promote a candidate entity to active.
|
||||
|
||||
Phase 5F graduation hook: if this entity has source_refs pointing at
|
||||
memories (format "memory:<uuid>"), mark those source memories as
|
||||
``status=graduated`` and set their ``graduated_to_entity_id`` forward
|
||||
pointer. This preserves the memory as an immutable historical record
|
||||
while signalling that it's been absorbed into the typed graph.
|
||||
"""
|
||||
entity = get_entity(entity_id)
|
||||
if entity is None or entity.status != "candidate":
|
||||
return False
|
||||
|
||||
ok = _set_entity_status(entity_id, "active", actor=actor, note=note)
|
||||
if not ok:
|
||||
return False
|
||||
|
||||
# Phase 5F: mark source memories as graduated
|
||||
memory_ids = [
|
||||
ref.split(":", 1)[1]
|
||||
for ref in (entity.source_refs or [])
|
||||
if isinstance(ref, str) and ref.startswith("memory:")
|
||||
]
|
||||
if memory_ids:
|
||||
_graduate_source_memories(memory_ids, entity_id, actor=actor)
|
||||
|
||||
# Phase 5G: sync conflict detection on promote. Fail-open — detection
|
||||
# errors log but never undo the successful promote.
|
||||
try:
|
||||
from atocore.engineering.conflicts import detect_conflicts_for_entity
|
||||
detect_conflicts_for_entity(entity_id)
|
||||
except Exception as e:
|
||||
log.warning("conflict_detection_failed", entity_id=entity_id, error=str(e))
|
||||
|
||||
return True
|
||||
|
||||
|
||||
def _graduate_source_memories(memory_ids: list[str], entity_id: str, actor: str) -> None:
|
||||
"""Mark source memories as graduated and set forward pointer."""
|
||||
if not memory_ids:
|
||||
return
|
||||
now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
|
||||
with get_connection() as conn:
|
||||
for mid in memory_ids:
|
||||
try:
|
||||
row = conn.execute(
|
||||
"SELECT status FROM memories WHERE id = ?", (mid,)
|
||||
).fetchone()
|
||||
if row is None:
|
||||
continue
|
||||
old_status = row["status"]
|
||||
if old_status == "graduated":
|
||||
continue # already graduated — maybe by a different entity
|
||||
conn.execute(
|
||||
"UPDATE memories SET status = 'graduated', "
|
||||
"graduated_to_entity_id = ?, updated_at = ? WHERE id = ?",
|
||||
(entity_id, now, mid),
|
||||
)
|
||||
# Write a memory_audit row for the graduation
|
||||
conn.execute(
|
||||
"INSERT INTO memory_audit (id, memory_id, action, actor, "
|
||||
"before_json, after_json, note, entity_kind) "
|
||||
"VALUES (?, ?, 'graduated', ?, ?, ?, ?, 'memory')",
|
||||
(
|
||||
str(uuid.uuid4()),
|
||||
mid,
|
||||
actor or "api",
|
||||
json.dumps({"status": old_status}),
|
||||
json.dumps({
|
||||
"status": "graduated",
|
||||
"graduated_to_entity_id": entity_id,
|
||||
}),
|
||||
f"graduated to entity {entity_id[:8]}",
|
||||
),
|
||||
)
|
||||
log.info("memory_graduated", memory_id=mid,
|
||||
entity_id=entity_id, old_status=old_status)
|
||||
except Exception as e:
|
||||
log.warning("memory_graduation_failed",
|
||||
memory_id=mid, entity_id=entity_id, error=str(e))
|
||||
|
||||
|
||||
def reject_entity_candidate(entity_id: str, actor: str = "api", note: str = "") -> bool:
|
||||
"""Reject a candidate entity (status → invalid)."""
|
||||
with get_connection() as conn:
|
||||
row = conn.execute(
|
||||
"SELECT status FROM entities WHERE id = ?", (entity_id,)
|
||||
).fetchone()
|
||||
if row is None or row["status"] != "candidate":
|
||||
return False
|
||||
return _set_entity_status(entity_id, "invalid", actor=actor, note=note)
|
||||
|
||||
|
||||
def supersede_entity(entity_id: str, actor: str = "api", note: str = "") -> bool:
|
||||
"""Mark an active entity as superseded by a newer one."""
|
||||
return _set_entity_status(entity_id, "superseded", actor=actor, note=note)
|
||||
|
||||
|
||||
def get_entity_audit(entity_id: str, limit: int = 100) -> list[dict]:
|
||||
"""Fetch audit entries for an entity from the shared audit table."""
|
||||
with get_connection() as conn:
|
||||
rows = conn.execute(
|
||||
"SELECT id, memory_id AS entity_id, action, actor, before_json, "
|
||||
"after_json, note, timestamp FROM memory_audit "
|
||||
"WHERE entity_kind = 'entity' AND memory_id = ? "
|
||||
"ORDER BY timestamp DESC LIMIT ?",
|
||||
(entity_id, limit),
|
||||
).fetchall()
|
||||
out = []
|
||||
for r in rows:
|
||||
try:
|
||||
before = json.loads(r["before_json"] or "{}")
|
||||
except Exception:
|
||||
before = {}
|
||||
try:
|
||||
after = json.loads(r["after_json"] or "{}")
|
||||
except Exception:
|
||||
after = {}
|
||||
out.append({
|
||||
"id": r["id"],
|
||||
"entity_id": r["entity_id"],
|
||||
"action": r["action"],
|
||||
"actor": r["actor"] or "api",
|
||||
"before": before,
|
||||
"after": after,
|
||||
"note": r["note"] or "",
|
||||
"timestamp": r["timestamp"],
|
||||
})
|
||||
return out
|
||||
|
||||
|
||||
def get_entities(
|
||||
entity_type: str | None = None,
|
||||
project: str | None = None,
|
||||
|
||||
747
src/atocore/engineering/triage_ui.py
Normal file
747
src/atocore/engineering/triage_ui.py
Normal file
@@ -0,0 +1,747 @@
|
||||
"""Human triage UI for AtoCore candidate memories.
|
||||
|
||||
Renders a lightweight HTML page at /admin/triage with all pending
|
||||
candidate memories, each with inline Promote / Reject / Edit buttons.
|
||||
No framework, no JS build, no database — reads candidates from the
|
||||
AtoCore DB and posts back to the existing REST endpoints.
|
||||
|
||||
Design principle: the user should be able to triage 20 candidates in
|
||||
60 seconds from any browser. Keyboard shortcuts (y/n/e/s) make it
|
||||
feel like email triage (archive/delete).
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import html as _html
|
||||
|
||||
from atocore.engineering.wiki import render_html
|
||||
from atocore.memory.service import get_memories
|
||||
|
||||
|
||||
VALID_TYPES = ["identity", "preference", "project", "episodic", "knowledge", "adaptation"]
|
||||
|
||||
|
||||
def _escape(s: str | None) -> str:
|
||||
return _html.escape(s or "", quote=True)
|
||||
|
||||
|
||||
def _render_candidate_card(cand) -> str:
|
||||
"""One candidate row with inline forms for promote/reject/edit."""
|
||||
mid = _escape(cand.id)
|
||||
content = _escape(cand.content)
|
||||
memory_type = _escape(cand.memory_type)
|
||||
project = _escape(cand.project or "")
|
||||
project_display = project or "(global)"
|
||||
confidence = f"{cand.confidence:.2f}"
|
||||
refs = cand.reference_count or 0
|
||||
created = _escape(str(cand.created_at or ""))
|
||||
tags = cand.domain_tags or []
|
||||
tags_str = _escape(", ".join(tags))
|
||||
valid_until = _escape(cand.valid_until or "")
|
||||
# Strip time portion for HTML date input
|
||||
valid_until_date = valid_until[:10] if valid_until else ""
|
||||
|
||||
type_options = "".join(
|
||||
f'<option value="{t}"{" selected" if t == cand.memory_type else ""}>{t}</option>'
|
||||
for t in VALID_TYPES
|
||||
)
|
||||
|
||||
# Tag badges rendered from current tags
|
||||
badges_html = ""
|
||||
if tags:
|
||||
badges_html = '<div class="cand-tags-display">' + "".join(
|
||||
f'<span class="tag-badge">{_escape(t)}</span>' for t in tags
|
||||
) + '</div>'
|
||||
|
||||
return f"""
|
||||
<div class="cand" id="cand-{mid}" data-id="{mid}">
|
||||
<div class="cand-head">
|
||||
<span class="cand-type">[{memory_type}]</span>
|
||||
<span class="cand-project">{project_display}</span>
|
||||
<span class="cand-meta">conf {confidence} · refs {refs} · {created[:16]}</span>
|
||||
</div>
|
||||
<div class="cand-body">
|
||||
<textarea class="cand-content" id="content-{mid}">{content}</textarea>
|
||||
</div>
|
||||
{badges_html}
|
||||
<div class="cand-meta-row">
|
||||
<label class="cand-field-label">Tags:
|
||||
<input type="text" class="cand-tags-input" id="tags-{mid}"
|
||||
value="{tags_str}" placeholder="optics, thermal, p04" />
|
||||
</label>
|
||||
<label class="cand-field-label">Valid until:
|
||||
<input type="date" class="cand-valid-until" id="valid-until-{mid}"
|
||||
value="{valid_until_date}" />
|
||||
</label>
|
||||
</div>
|
||||
<div class="cand-actions">
|
||||
<button class="btn-promote" data-id="{mid}" title="Promote (Y)">✅ Promote</button>
|
||||
<button class="btn-reject" data-id="{mid}" title="Reject (N)">❌ Reject</button>
|
||||
<button class="btn-save-promote" data-id="{mid}" title="Save edits + promote (E)">✏️ Save&Promote</button>
|
||||
<label class="cand-type-label">Type:
|
||||
<select class="cand-type-select" id="type-{mid}">{type_options}</select>
|
||||
</label>
|
||||
</div>
|
||||
<div class="cand-status" id="status-{mid}"></div>
|
||||
</div>
|
||||
"""
|
||||
|
||||
|
||||
_TRIAGE_SCRIPT = """
|
||||
<script>
|
||||
async function apiCall(url, method, body) {
|
||||
try {
|
||||
const opts = { method };
|
||||
if (body) {
|
||||
opts.headers = { 'Content-Type': 'application/json' };
|
||||
opts.body = JSON.stringify(body);
|
||||
}
|
||||
const res = await fetch(url, opts);
|
||||
return { ok: res.ok, status: res.status, json: res.ok ? await res.json().catch(()=>null) : null };
|
||||
} catch (e) { return { ok: false, status: 0, error: String(e) }; }
|
||||
}
|
||||
|
||||
async function requestAutoTriage() {
|
||||
const btn = document.getElementById('auto-triage-btn');
|
||||
const status = document.getElementById('auto-triage-status');
|
||||
if (!btn) return;
|
||||
btn.disabled = true;
|
||||
btn.textContent = '⏳ Requesting...';
|
||||
const r = await apiCall('/admin/triage/request-drain', 'POST');
|
||||
if (r.ok) {
|
||||
status.textContent = '✓ Requested. Host watcher runs every 2 min. Refresh this page in a minute to check progress.';
|
||||
status.className = 'auto-triage-msg ok';
|
||||
btn.textContent = '✓ Requested';
|
||||
pollDrainStatus();
|
||||
} else {
|
||||
status.textContent = '❌ Request failed: ' + r.status;
|
||||
status.className = 'auto-triage-msg err';
|
||||
btn.disabled = false;
|
||||
btn.textContent = '🤖 Auto-process queue';
|
||||
}
|
||||
}
|
||||
|
||||
async function pollDrainStatus() {
|
||||
const status = document.getElementById('auto-triage-status');
|
||||
const btn = document.getElementById('auto-triage-btn');
|
||||
let polls = 0;
|
||||
const timer = setInterval(async () => {
|
||||
polls++;
|
||||
const r = await apiCall('/admin/triage/drain-status', 'GET');
|
||||
if (!r.ok || !r.json) return;
|
||||
const s = r.json;
|
||||
if (s.is_running) {
|
||||
status.textContent = '⚙️ Auto-triage running on host... (started ' + (s.last_started_at || '?') + ')';
|
||||
status.className = 'auto-triage-msg ok';
|
||||
} else if (s.last_finished_at && !s.requested_at) {
|
||||
status.textContent = '✅ Last run finished: ' + s.last_finished_at + ' → ' + (s.last_result || 'complete');
|
||||
status.className = 'auto-triage-msg ok';
|
||||
if (btn) { btn.disabled = false; btn.textContent = '🤖 Auto-process queue'; }
|
||||
clearInterval(timer);
|
||||
// Reload page to pick up new queue state
|
||||
setTimeout(() => window.location.reload(), 3000);
|
||||
}
|
||||
if (polls > 60) { clearInterval(timer); } // stop after ~10 min of polling
|
||||
}, 10000); // poll every 10s
|
||||
}
|
||||
|
||||
function setStatus(id, msg, ok) {
|
||||
const el = document.getElementById('status-' + id);
|
||||
if (!el) return;
|
||||
el.textContent = msg;
|
||||
el.className = 'cand-status ' + (ok ? 'ok' : 'err');
|
||||
}
|
||||
|
||||
function removeCard(id) {
|
||||
setTimeout(() => {
|
||||
const card = document.getElementById('cand-' + id);
|
||||
if (card) {
|
||||
card.style.opacity = '0';
|
||||
setTimeout(() => card.remove(), 300);
|
||||
}
|
||||
updateCount();
|
||||
}, 400);
|
||||
}
|
||||
|
||||
function updateCount() {
|
||||
const n = document.querySelectorAll('.cand').length;
|
||||
const el = document.getElementById('cand-count');
|
||||
if (el) el.textContent = n;
|
||||
const next = document.querySelector('.cand');
|
||||
if (next) next.scrollIntoView({ behavior: 'smooth', block: 'start' });
|
||||
}
|
||||
|
||||
async function promote(id) {
|
||||
setStatus(id, 'Promoting…', true);
|
||||
const r = await apiCall('/memory/' + encodeURIComponent(id) + '/promote', 'POST');
|
||||
if (r.ok) { setStatus(id, '✅ Promoted', true); removeCard(id); }
|
||||
else setStatus(id, '❌ Failed: ' + r.status, false);
|
||||
}
|
||||
|
||||
async function reject(id) {
|
||||
setStatus(id, 'Rejecting…', true);
|
||||
const r = await apiCall('/memory/' + encodeURIComponent(id) + '/reject', 'POST');
|
||||
if (r.ok) { setStatus(id, '❌ Rejected', true); removeCard(id); }
|
||||
else setStatus(id, '❌ Failed: ' + r.status, false);
|
||||
}
|
||||
|
||||
function parseTags(str) {
|
||||
return (str || '').split(/[,;]/).map(s => s.trim().toLowerCase()).filter(Boolean);
|
||||
}
|
||||
|
||||
async function savePromote(id) {
|
||||
const content = document.getElementById('content-' + id).value.trim();
|
||||
const mtype = document.getElementById('type-' + id).value;
|
||||
const tagsStr = document.getElementById('tags-' + id)?.value || '';
|
||||
const validUntil = document.getElementById('valid-until-' + id)?.value || '';
|
||||
if (!content) { setStatus(id, 'Content is empty', false); return; }
|
||||
setStatus(id, 'Saving…', true);
|
||||
const body = {
|
||||
content: content,
|
||||
memory_type: mtype,
|
||||
domain_tags: parseTags(tagsStr),
|
||||
valid_until: validUntil,
|
||||
};
|
||||
const r1 = await apiCall('/memory/' + encodeURIComponent(id), 'PUT', body);
|
||||
if (!r1.ok) { setStatus(id, '❌ Save failed: ' + r1.status, false); return; }
|
||||
const r2 = await apiCall('/memory/' + encodeURIComponent(id) + '/promote', 'POST');
|
||||
if (r2.ok) { setStatus(id, '✅ Saved & Promoted', true); removeCard(id); }
|
||||
else setStatus(id, '❌ Promote failed: ' + r2.status, false);
|
||||
}
|
||||
|
||||
// Also save tag/expiry edits when plain "Promote" is clicked if fields changed
|
||||
async function promoteWithMeta(id) {
|
||||
const tagsStr = document.getElementById('tags-' + id)?.value || '';
|
||||
const validUntil = document.getElementById('valid-until-' + id)?.value || '';
|
||||
if (tagsStr.trim() || validUntil) {
|
||||
await apiCall('/memory/' + encodeURIComponent(id), 'PUT', {
|
||||
domain_tags: parseTags(tagsStr),
|
||||
valid_until: validUntil,
|
||||
});
|
||||
}
|
||||
return promote(id);
|
||||
}
|
||||
|
||||
document.addEventListener('click', (e) => {
|
||||
const id = e.target.dataset?.id;
|
||||
if (!id) return;
|
||||
if (e.target.classList.contains('btn-promote')) promoteWithMeta(id);
|
||||
else if (e.target.classList.contains('btn-reject')) reject(id);
|
||||
else if (e.target.classList.contains('btn-save-promote')) savePromote(id);
|
||||
});
|
||||
|
||||
// Keyboard shortcuts on the currently-focused card
|
||||
document.addEventListener('keydown', (e) => {
|
||||
// Don't intercept if user is typing in textarea/select/input
|
||||
const t = e.target.tagName;
|
||||
if (t === 'TEXTAREA' || t === 'INPUT' || t === 'SELECT') return;
|
||||
const first = document.querySelector('.cand');
|
||||
if (!first) return;
|
||||
const id = first.dataset.id;
|
||||
if (e.key === 'y' || e.key === 'Y') { e.preventDefault(); promoteWithMeta(id); }
|
||||
else if (e.key === 'n' || e.key === 'N') { e.preventDefault(); reject(id); }
|
||||
else if (e.key === 'e' || e.key === 'E') {
|
||||
e.preventDefault();
|
||||
document.getElementById('content-' + id)?.focus();
|
||||
}
|
||||
else if (e.key === 's' || e.key === 'S') { e.preventDefault(); first.scrollIntoView({behavior:'smooth'}); }
|
||||
});
|
||||
</script>
|
||||
"""
|
||||
|
||||
|
||||
_TRIAGE_CSS = """
|
||||
<style>
|
||||
.triage-header { display:flex; justify-content:space-between; align-items:baseline; margin-bottom:1rem; }
|
||||
.triage-header .count { font-size:1.4rem; font-weight:600; color:var(--accent); }
|
||||
.triage-help { background:var(--card); border-left:4px solid var(--accent); padding:0.8rem 1rem; margin-bottom:1.5rem; border-radius:4px; font-size:0.9rem; }
|
||||
.triage-help kbd { background:var(--hover); padding:2px 6px; border-radius:3px; font-family:monospace; font-size:0.85em; border:1px solid var(--border); }
|
||||
.cand { background:var(--card); border:1px solid var(--border); border-radius:6px; padding:1rem; margin-bottom:1rem; transition:opacity 0.3s; }
|
||||
.cand-head { display:flex; gap:0.8rem; align-items:center; margin-bottom:0.6rem; font-size:0.9rem; }
|
||||
.cand-type { font-weight:600; color:var(--accent); font-family:monospace; }
|
||||
.cand-project { color:var(--text); opacity:0.8; font-family:monospace; }
|
||||
.cand-meta { color:var(--text); opacity:0.55; font-size:0.8rem; margin-left:auto; }
|
||||
.cand-content { width:100%; min-height:80px; font-family:inherit; font-size:0.95rem; padding:0.5rem; background:var(--bg); color:var(--text); border:1px solid var(--border); border-radius:4px; resize:vertical; box-sizing:border-box; }
|
||||
.cand-content:focus { outline:none; border-color:var(--accent); }
|
||||
.cand-actions { display:flex; gap:0.5rem; margin-top:0.8rem; align-items:center; flex-wrap:wrap; }
|
||||
.cand-actions button { padding:0.4rem 0.9rem; border:1px solid var(--border); background:var(--card); color:var(--text); border-radius:4px; cursor:pointer; font-size:0.88rem; }
|
||||
.cand-actions button:hover { background:var(--hover); }
|
||||
.btn-promote:hover { background:#059669; color:white; border-color:#059669; }
|
||||
.btn-reject:hover { background:#dc2626; color:white; border-color:#dc2626; }
|
||||
.btn-save-promote:hover { background:var(--accent); color:white; border-color:var(--accent); }
|
||||
.cand-type-label { font-size:0.85rem; margin-left:auto; opacity:0.7; }
|
||||
.cand-type-select { padding:0.25rem; background:var(--bg); color:var(--text); border:1px solid var(--border); border-radius:3px; font-family:monospace; }
|
||||
.cand-status { margin-top:0.5rem; font-size:0.85rem; min-height:1.2em; }
|
||||
.cand-status.ok { color:#059669; }
|
||||
.cand-status.err { color:#dc2626; }
|
||||
.empty { text-align:center; padding:3rem; opacity:0.6; }
|
||||
.auto-triage-bar { display:flex; gap:0.8rem; align-items:center; background:var(--card); border:1px solid var(--border); border-radius:6px; padding:0.7rem 1rem; margin-bottom:1.2rem; flex-wrap:wrap; }
|
||||
.auto-triage-bar button { padding:0.55rem 1.1rem; border:1px solid var(--accent); background:var(--accent); color:white; border-radius:4px; cursor:pointer; font-weight:600; font-size:0.95rem; }
|
||||
.auto-triage-bar button:hover:not(:disabled) { opacity:0.9; }
|
||||
.auto-triage-bar button:disabled { opacity:0.5; cursor:not-allowed; }
|
||||
.auto-triage-msg { flex:1; min-width:200px; font-size:0.85rem; opacity:0.75; }
|
||||
.auto-triage-msg.ok { color:var(--accent); opacity:1; font-weight:500; }
|
||||
.auto-triage-msg.err { color:#dc2626; opacity:1; font-weight:500; }
|
||||
.cand-tags-display { margin-top:0.5rem; display:flex; gap:0.35rem; flex-wrap:wrap; }
|
||||
.tag-badge { background:var(--accent); color:white; padding:0.15rem 0.55rem; border-radius:10px; font-size:0.72rem; font-family:monospace; font-weight:500; }
|
||||
.cand-meta-row { display:flex; gap:0.8rem; margin-top:0.6rem; align-items:center; flex-wrap:wrap; }
|
||||
.cand-field-label { display:flex; gap:0.3rem; align-items:center; font-size:0.85rem; opacity:0.75; }
|
||||
.cand-tags-input { flex:1; min-width:200px; padding:0.3rem 0.5rem; background:var(--bg); color:var(--text); border:1px solid var(--border); border-radius:3px; font-family:monospace; font-size:0.85rem; }
|
||||
.cand-tags-input:focus { outline:none; border-color:var(--accent); }
|
||||
.cand-valid-until { padding:0.3rem; background:var(--bg); color:var(--text); border:1px solid var(--border); border-radius:3px; font-family:inherit; font-size:0.85rem; }
|
||||
</style>
|
||||
"""
|
||||
|
||||
|
||||
def _render_entity_card(entity) -> str:
|
||||
"""Phase 5: entity candidate card with promote/reject."""
|
||||
eid = _escape(entity.id)
|
||||
name = _escape(entity.name)
|
||||
etype = _escape(entity.entity_type)
|
||||
project = _escape(entity.project or "(global)")
|
||||
desc = _escape(entity.description or "")
|
||||
conf = f"{entity.confidence:.2f}"
|
||||
src_refs = entity.source_refs or []
|
||||
source_display = _escape(", ".join(src_refs[:3])) if src_refs else "(no provenance)"
|
||||
|
||||
return f"""
|
||||
<div class="cand cand-entity" id="ecand-{eid}" data-entity-id="{eid}">
|
||||
<div class="cand-head">
|
||||
<span class="cand-type entity-type">[entity · {etype}]</span>
|
||||
<span class="cand-project">{project}</span>
|
||||
<span class="cand-meta">conf {conf} · src: {source_display}</span>
|
||||
</div>
|
||||
<div class="cand-body">
|
||||
<div class="entity-name">{name}</div>
|
||||
<div class="entity-desc">{desc}</div>
|
||||
</div>
|
||||
<div class="cand-actions">
|
||||
<button class="btn-entity-promote" data-entity-id="{eid}" title="Promote entity (Y)">✅ Promote Entity</button>
|
||||
<button class="btn-entity-reject" data-entity-id="{eid}" title="Reject entity (N)">❌ Reject</button>
|
||||
<a class="btn-link" href="/wiki/entities/{eid}">View in wiki →</a>
|
||||
</div>
|
||||
<div class="cand-status" id="estatus-{eid}"></div>
|
||||
</div>
|
||||
"""
|
||||
|
||||
|
||||
_ENTITY_TRIAGE_SCRIPT = """
|
||||
<script>
|
||||
async function entityPromote(id) {
|
||||
const st = document.getElementById('estatus-' + id);
|
||||
st.textContent = 'Promoting…';
|
||||
st.className = 'cand-status ok';
|
||||
const r = await fetch('/entities/' + encodeURIComponent(id) + '/promote', {method:'POST'});
|
||||
if (r.ok) {
|
||||
st.textContent = '✅ Entity promoted';
|
||||
setTimeout(() => {
|
||||
const card = document.getElementById('ecand-' + id);
|
||||
if (card) { card.style.opacity = '0'; setTimeout(() => card.remove(), 300); }
|
||||
}, 400);
|
||||
} else st.textContent = '❌ ' + r.status;
|
||||
}
|
||||
async function entityReject(id) {
|
||||
const st = document.getElementById('estatus-' + id);
|
||||
st.textContent = 'Rejecting…';
|
||||
st.className = 'cand-status ok';
|
||||
const r = await fetch('/entities/' + encodeURIComponent(id) + '/reject', {method:'POST'});
|
||||
if (r.ok) {
|
||||
st.textContent = '❌ Entity rejected';
|
||||
setTimeout(() => {
|
||||
const card = document.getElementById('ecand-' + id);
|
||||
if (card) { card.style.opacity = '0'; setTimeout(() => card.remove(), 300); }
|
||||
}, 400);
|
||||
} else st.textContent = '❌ ' + r.status;
|
||||
}
|
||||
document.addEventListener('click', (e) => {
|
||||
const eid = e.target.dataset?.entityId;
|
||||
if (!eid) return;
|
||||
if (e.target.classList.contains('btn-entity-promote')) entityPromote(eid);
|
||||
else if (e.target.classList.contains('btn-entity-reject')) entityReject(eid);
|
||||
});
|
||||
</script>
|
||||
"""
|
||||
|
||||
_ENTITY_TRIAGE_CSS = """
|
||||
<style>
|
||||
.cand-entity { border-left: 3px solid #059669; }
|
||||
.entity-type { background: #059669; color: white; padding: 0.1rem 0.5rem; border-radius: 3px; font-size: 0.75rem; }
|
||||
.entity-name { font-size: 1.15rem; font-weight: 600; margin-bottom: 0.3rem; }
|
||||
.entity-desc { opacity: 0.85; font-size: 0.95rem; }
|
||||
.btn-entity-promote { background: #059669; color: white; border-color: #059669; }
|
||||
.btn-entity-reject:hover { background: #dc2626; color: white; border-color: #dc2626; }
|
||||
.btn-link { padding: 0.4rem 0.9rem; text-decoration: none; color: var(--accent); border: 1px solid var(--border); border-radius: 4px; font-size: 0.88rem; }
|
||||
.btn-link:hover { background: var(--hover); }
|
||||
.section-break { border-top: 2px solid var(--border); margin: 2rem 0 1rem 0; padding-top: 1rem; }
|
||||
</style>
|
||||
"""
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------
|
||||
# Phase 7A — Merge candidates (semantic dedup)
|
||||
# ---------------------------------------------------------------------
|
||||
|
||||
_MERGE_TRIAGE_CSS = """
|
||||
<style>
|
||||
.cand-merge { border-left: 3px solid #8b5cf6; }
|
||||
.merge-type { background: #8b5cf6; color: white; padding: 0.1rem 0.5rem; border-radius: 3px; font-size: 0.75rem; }
|
||||
.merge-sources { margin: 0.5rem 0 0.8rem 0; display: flex; flex-direction: column; gap: 0.35rem; }
|
||||
.merge-source { background: var(--bg); border: 1px dashed var(--border); border-radius: 4px; padding: 0.4rem 0.6rem; font-size: 0.85rem; }
|
||||
.merge-source-meta { font-family: monospace; font-size: 0.72rem; opacity: 0.7; margin-bottom: 0.2rem; }
|
||||
.merge-arrow { text-align: center; font-size: 1.1rem; opacity: 0.5; margin: 0.3rem 0; }
|
||||
.merge-proposed { background: var(--card); border: 1px solid #8b5cf6; border-radius: 4px; padding: 0.5rem; }
|
||||
.btn-merge-approve { background: #8b5cf6; color: white; border-color: #8b5cf6; }
|
||||
.btn-merge-approve:hover { background: #7c3aed; }
|
||||
</style>
|
||||
"""
|
||||
|
||||
|
||||
def _render_merge_card(cand: dict) -> str:
|
||||
import json as _json
|
||||
cid = _escape(cand.get("id", ""))
|
||||
sim = cand.get("similarity") or 0.0
|
||||
sources = cand.get("sources") or []
|
||||
proposed_content = cand.get("proposed_content") or ""
|
||||
proposed_tags = cand.get("proposed_tags") or []
|
||||
proposed_project = cand.get("proposed_project") or ""
|
||||
reason = cand.get("reason") or ""
|
||||
|
||||
src_html = "".join(
|
||||
f"""
|
||||
<div class="merge-source">
|
||||
<div class="merge-source-meta">
|
||||
{_escape(s.get('id','')[:8])} · [{_escape(s.get('memory_type',''))}]
|
||||
· {_escape(s.get('project','') or '(global)')}
|
||||
· conf {float(s.get('confidence',0)):.2f}
|
||||
· refs {int(s.get('reference_count',0))}
|
||||
</div>
|
||||
<div>{_escape((s.get('content') or '')[:300])}</div>
|
||||
</div>
|
||||
"""
|
||||
for s in sources
|
||||
)
|
||||
tags_str = ", ".join(proposed_tags)
|
||||
return f"""
|
||||
<div class="cand cand-merge" id="mcand-{cid}" data-merge-id="{cid}">
|
||||
<div class="cand-head">
|
||||
<span class="cand-type merge-type">[merge · {len(sources)} sources]</span>
|
||||
<span class="cand-project">{_escape(proposed_project or '(global)')}</span>
|
||||
<span class="cand-meta">sim ≥ {sim:.2f}</span>
|
||||
</div>
|
||||
<div class="merge-sources">{src_html}</div>
|
||||
<div class="merge-arrow">↓ merged into ↓</div>
|
||||
<div class="merge-proposed">
|
||||
<textarea class="cand-content" id="mcontent-{cid}">{_escape(proposed_content)}</textarea>
|
||||
<div class="cand-meta-row">
|
||||
<label class="cand-field-label">Tags:
|
||||
<input type="text" class="cand-tags-input" id="mtags-{cid}" value="{_escape(tags_str)}" placeholder="tag1, tag2">
|
||||
</label>
|
||||
</div>
|
||||
{f'<div class="auto-triage-msg" style="margin-top:0.4rem;">💡 {_escape(reason)}</div>' if reason else ''}
|
||||
</div>
|
||||
<div class="cand-actions">
|
||||
<button class="btn-merge-approve" data-merge-id="{cid}" title="Approve merge">✅ Approve Merge</button>
|
||||
<button class="btn-reject" data-merge-id="{cid}" data-merge-reject="1" title="Keep separate">❌ Keep Separate</button>
|
||||
</div>
|
||||
<div class="cand-status" id="mstatus-{cid}"></div>
|
||||
</div>
|
||||
"""
|
||||
|
||||
|
||||
_MERGE_TRIAGE_SCRIPT = """
|
||||
<script>
|
||||
async function mergeApprove(id) {
|
||||
const st = document.getElementById('mstatus-' + id);
|
||||
st.textContent = 'Merging…';
|
||||
st.className = 'cand-status ok';
|
||||
const content = document.getElementById('mcontent-' + id).value;
|
||||
const tagsRaw = document.getElementById('mtags-' + id).value;
|
||||
const tags = tagsRaw.split(',').map(t => t.trim()).filter(Boolean);
|
||||
const r = await fetch('/admin/memory/merge-candidates/' + encodeURIComponent(id) + '/approve', {
|
||||
method: 'POST',
|
||||
headers: {'Content-Type': 'application/json'},
|
||||
body: JSON.stringify({actor: 'human-triage', content: content, domain_tags: tags}),
|
||||
});
|
||||
if (r.ok) {
|
||||
const data = await r.json();
|
||||
st.textContent = '✅ Merged → ' + (data.result_memory_id || '').slice(0, 8);
|
||||
setTimeout(() => {
|
||||
const card = document.getElementById('mcand-' + id);
|
||||
if (card) { card.style.opacity = '0'; setTimeout(() => card.remove(), 300); }
|
||||
}, 600);
|
||||
} else {
|
||||
const err = await r.text();
|
||||
st.textContent = '❌ ' + r.status + ': ' + err.slice(0, 120);
|
||||
st.className = 'cand-status err';
|
||||
}
|
||||
}
|
||||
|
||||
async function mergeReject(id) {
|
||||
const st = document.getElementById('mstatus-' + id);
|
||||
st.textContent = 'Rejecting…';
|
||||
st.className = 'cand-status ok';
|
||||
const r = await fetch('/admin/memory/merge-candidates/' + encodeURIComponent(id) + '/reject', {
|
||||
method: 'POST',
|
||||
headers: {'Content-Type': 'application/json'},
|
||||
body: JSON.stringify({actor: 'human-triage'}),
|
||||
});
|
||||
if (r.ok) {
|
||||
st.textContent = '❌ Kept separate';
|
||||
setTimeout(() => {
|
||||
const card = document.getElementById('mcand-' + id);
|
||||
if (card) { card.style.opacity = '0'; setTimeout(() => card.remove(), 300); }
|
||||
}, 400);
|
||||
} else st.textContent = '❌ ' + r.status;
|
||||
}
|
||||
|
||||
document.addEventListener('click', (e) => {
|
||||
const mid = e.target.dataset?.mergeId;
|
||||
if (!mid) return;
|
||||
if (e.target.classList.contains('btn-merge-approve')) mergeApprove(mid);
|
||||
else if (e.target.dataset?.mergeReject) mergeReject(mid);
|
||||
});
|
||||
|
||||
async function requestDedupScan() {
|
||||
const btn = document.getElementById('dedup-btn');
|
||||
const status = document.getElementById('dedup-status');
|
||||
btn.disabled = true;
|
||||
btn.textContent = 'Queuing…';
|
||||
status.textContent = '';
|
||||
status.className = 'auto-triage-msg';
|
||||
const threshold = parseFloat(document.getElementById('dedup-threshold').value || '0.88');
|
||||
const r = await fetch('/admin/memory/dedup-scan', {
|
||||
method: 'POST',
|
||||
headers: {'Content-Type': 'application/json'},
|
||||
body: JSON.stringify({project: '', similarity_threshold: threshold, max_batch: 50}),
|
||||
});
|
||||
if (r.ok) {
|
||||
status.textContent = `✓ Queued dedup scan at threshold ${threshold}. Host watcher runs every 2 min; refresh in ~3 min to see merge candidates.`;
|
||||
status.className = 'auto-triage-msg ok';
|
||||
} else {
|
||||
status.textContent = '✗ ' + r.status;
|
||||
status.className = 'auto-triage-msg err';
|
||||
}
|
||||
setTimeout(() => {
|
||||
btn.disabled = false;
|
||||
btn.textContent = '🔗 Scan for duplicates';
|
||||
}, 2000);
|
||||
}
|
||||
</script>
|
||||
"""
|
||||
|
||||
|
||||
def _render_dedup_bar() -> str:
|
||||
return """
|
||||
<div class="auto-triage-bar">
|
||||
<button id="dedup-btn" onclick="requestDedupScan()" title="Run semantic dedup scan on Dalidou host">
|
||||
🔗 Scan for duplicates
|
||||
</button>
|
||||
<label class="cand-field-label" style="margin:0 0.5rem;">
|
||||
Threshold:
|
||||
<input id="dedup-threshold" type="number" min="0.70" max="0.99" step="0.01" value="0.88"
|
||||
style="width:70px; padding:0.25rem; background:var(--bg); color:var(--text); border:1px solid var(--border); border-radius:3px;">
|
||||
</label>
|
||||
<span id="dedup-status" class="auto-triage-msg">
|
||||
Finds semantically near-duplicate active memories and proposes LLM-drafted merges for review. Source memories become <code>superseded</code> on approve; nothing is deleted.
|
||||
</span>
|
||||
</div>
|
||||
"""
|
||||
|
||||
|
||||
def _render_graduation_bar() -> str:
|
||||
"""The 'Graduate memories → entity candidates' control bar."""
|
||||
from atocore.projects.registry import load_project_registry
|
||||
try:
|
||||
projects = load_project_registry()
|
||||
options = '<option value="">(all projects)</option>' + "".join(
|
||||
f'<option value="{_escape(p.project_id)}">{_escape(p.project_id)}</option>'
|
||||
for p in projects
|
||||
)
|
||||
except Exception:
|
||||
options = '<option value="">(all projects)</option>'
|
||||
|
||||
return f"""
|
||||
<div class="auto-triage-bar graduation-bar">
|
||||
<button id="grad-btn" onclick="requestGraduation()" title="Run memory→entity graduation on Dalidou host">
|
||||
🎓 Graduate memories
|
||||
</button>
|
||||
<label class="cand-field-label">Project:
|
||||
<select id="grad-project" class="cand-type-select">{options}</select>
|
||||
</label>
|
||||
<label class="cand-field-label">Limit:
|
||||
<input id="grad-limit" type="number" class="cand-tags-input" style="max-width:80px"
|
||||
value="30" min="1" max="200" />
|
||||
</label>
|
||||
<span id="grad-status" class="auto-triage-msg">
|
||||
Scans active memories, asks the LLM "does this describe a typed entity?",
|
||||
and creates entity candidates. Review them in the Entity section below.
|
||||
</span>
|
||||
</div>
|
||||
"""
|
||||
|
||||
|
||||
_GRADUATION_SCRIPT = """
|
||||
<script>
|
||||
async function requestGraduation() {
|
||||
const btn = document.getElementById('grad-btn');
|
||||
const status = document.getElementById('grad-status');
|
||||
const project = document.getElementById('grad-project').value;
|
||||
const limit = parseInt(document.getElementById('grad-limit').value || '30', 10);
|
||||
btn.disabled = true;
|
||||
btn.textContent = '⏳ Requesting...';
|
||||
const r = await fetch('/admin/graduation/request', {
|
||||
method: 'POST',
|
||||
headers: {'Content-Type': 'application/json'},
|
||||
body: JSON.stringify({project, limit}),
|
||||
});
|
||||
if (r.ok) {
|
||||
const scope = project || 'all projects';
|
||||
status.textContent = `✓ Queued graduation for ${scope} (limit ${limit}). Host watcher runs every 2 min; refresh this page in ~3 min to see candidates.`;
|
||||
status.className = 'auto-triage-msg ok';
|
||||
btn.textContent = '✓ Requested';
|
||||
pollGraduationStatus();
|
||||
} else {
|
||||
status.textContent = '❌ Request failed: ' + r.status;
|
||||
status.className = 'auto-triage-msg err';
|
||||
btn.disabled = false;
|
||||
btn.textContent = '🎓 Graduate memories';
|
||||
}
|
||||
}
|
||||
|
||||
async function pollGraduationStatus() {
|
||||
const status = document.getElementById('grad-status');
|
||||
const btn = document.getElementById('grad-btn');
|
||||
let polls = 0;
|
||||
const timer = setInterval(async () => {
|
||||
polls++;
|
||||
const r = await fetch('/admin/graduation/status');
|
||||
if (!r.ok) return;
|
||||
const s = await r.json();
|
||||
if (s.is_running) {
|
||||
status.textContent = '⚙️ Graduation running... (started ' + (s.last_started_at || '?') + ')';
|
||||
status.className = 'auto-triage-msg ok';
|
||||
} else if (s.last_finished_at && !s.requested) {
|
||||
status.textContent = '✅ Finished: ' + s.last_finished_at + ' → ' + (s.last_result || 'complete');
|
||||
status.className = 'auto-triage-msg ok';
|
||||
if (btn) { btn.disabled = false; btn.textContent = '🎓 Graduate memories'; }
|
||||
clearInterval(timer);
|
||||
setTimeout(() => window.location.reload(), 3000);
|
||||
}
|
||||
if (polls > 120) { clearInterval(timer); } // ~20 min cap
|
||||
}, 10000);
|
||||
}
|
||||
</script>
|
||||
"""
|
||||
|
||||
|
||||
def render_triage_page(limit: int = 100) -> str:
|
||||
"""Render the full triage page with pending memory + entity candidates."""
|
||||
from atocore.engineering.service import get_entities
|
||||
|
||||
try:
|
||||
mem_candidates = get_memories(status="candidate", limit=limit)
|
||||
except Exception as e:
|
||||
body = f"<p>Error loading memory candidates: {_escape(str(e))}</p>"
|
||||
return render_html("Triage — AtoCore", body, breadcrumbs=[("Wiki", "/wiki"), ("Triage", "")])
|
||||
|
||||
try:
|
||||
entity_candidates = get_entities(status="candidate", limit=limit)
|
||||
except Exception as e:
|
||||
entity_candidates = []
|
||||
|
||||
try:
|
||||
from atocore.memory.service import get_merge_candidates
|
||||
merge_candidates = get_merge_candidates(status="pending", limit=limit)
|
||||
except Exception:
|
||||
merge_candidates = []
|
||||
|
||||
total = len(mem_candidates) + len(entity_candidates) + len(merge_candidates)
|
||||
graduation_bar = _render_graduation_bar()
|
||||
dedup_bar = _render_dedup_bar()
|
||||
|
||||
if total == 0:
|
||||
body = _TRIAGE_CSS + _ENTITY_TRIAGE_CSS + _MERGE_TRIAGE_CSS + f"""
|
||||
<div class="triage-header">
|
||||
<h1>Triage Queue</h1>
|
||||
</div>
|
||||
{graduation_bar}
|
||||
{dedup_bar}
|
||||
<div class="empty">
|
||||
<p>🎉 No candidates to review.</p>
|
||||
<p>The auto-triage pipeline keeps this queue empty unless something needs your judgment.</p>
|
||||
<p>Use 🎓 Graduate memories to propose entity candidates, or 🔗 Scan for duplicates to find near-duplicate memories to merge.</p>
|
||||
</div>
|
||||
""" + _GRADUATION_SCRIPT + _MERGE_TRIAGE_SCRIPT
|
||||
return render_html("Triage — AtoCore", body, breadcrumbs=[("Wiki", "/wiki"), ("Triage", "")])
|
||||
|
||||
# Memory cards
|
||||
mem_cards = "".join(_render_candidate_card(c) for c in mem_candidates)
|
||||
|
||||
# Merge cards (Phase 7A)
|
||||
merge_cards_html = ""
|
||||
if merge_candidates:
|
||||
merge_cards = "".join(_render_merge_card(c) for c in merge_candidates)
|
||||
merge_cards_html = f"""
|
||||
<div class="section-break">
|
||||
<h2>🔗 Merge Candidates ({len(merge_candidates)})</h2>
|
||||
<p class="auto-triage-msg">
|
||||
Semantically near-duplicate active memories. Approving merges the sources
|
||||
into the proposed unified memory; sources become <code>superseded</code>
|
||||
(not deleted — still queryable). You can edit the draft content and tags
|
||||
before approving.
|
||||
</p>
|
||||
</div>
|
||||
{merge_cards}
|
||||
"""
|
||||
|
||||
# Entity cards
|
||||
ent_cards_html = ""
|
||||
if entity_candidates:
|
||||
ent_cards = "".join(_render_entity_card(e) for e in entity_candidates)
|
||||
ent_cards_html = f"""
|
||||
<div class="section-break">
|
||||
<h2>🔧 Entity Candidates ({len(entity_candidates)})</h2>
|
||||
<p class="auto-triage-msg">
|
||||
Typed graph entries awaiting review. Promoting an entity connects it to
|
||||
the engineering knowledge graph (subsystems, requirements, decisions, etc.).
|
||||
</p>
|
||||
</div>
|
||||
{ent_cards}
|
||||
"""
|
||||
|
||||
body = _TRIAGE_CSS + _ENTITY_TRIAGE_CSS + _MERGE_TRIAGE_CSS + f"""
|
||||
<div class="triage-header">
|
||||
<h1>Triage Queue</h1>
|
||||
<span class="count">
|
||||
<span id="cand-count">{len(mem_candidates)}</span> memory ·
|
||||
{len(merge_candidates)} merge ·
|
||||
{len(entity_candidates)} entity
|
||||
</span>
|
||||
</div>
|
||||
<div class="triage-help">
|
||||
Review candidates the auto-triage wasn't sure about. Edit the content
|
||||
if needed, then promote or reject. Shortcuts: <kbd>Y</kbd> promote · <kbd>N</kbd>
|
||||
reject · <kbd>E</kbd> edit · <kbd>S</kbd> scroll to next.
|
||||
</div>
|
||||
<div class="auto-triage-bar">
|
||||
<button id="auto-triage-btn" onclick="requestAutoTriage()" title="Run auto_triage on Dalidou host">
|
||||
🤖 Auto-process queue
|
||||
</button>
|
||||
<span id="auto-triage-status" class="auto-triage-msg">
|
||||
Sends the full memory queue through 3-tier LLM triage on the host.
|
||||
Sonnet → Opus → auto-discard. Only genuinely ambiguous items land here.
|
||||
</span>
|
||||
</div>
|
||||
{graduation_bar}
|
||||
{dedup_bar}
|
||||
<h2>📝 Memory Candidates ({len(mem_candidates)})</h2>
|
||||
{mem_cards}
|
||||
{merge_cards_html}
|
||||
{ent_cards_html}
|
||||
""" + _TRIAGE_SCRIPT + _ENTITY_TRIAGE_SCRIPT + _GRADUATION_SCRIPT + _MERGE_TRIAGE_SCRIPT
|
||||
|
||||
return render_html(
|
||||
"Triage — AtoCore",
|
||||
body,
|
||||
breadcrumbs=[("Wiki", "/wiki"), ("Triage", "")],
|
||||
)
|
||||
@@ -26,8 +26,25 @@ from atocore.memory.service import get_memories
|
||||
from atocore.projects.registry import load_project_registry
|
||||
|
||||
|
||||
def render_html(title: str, body_html: str, breadcrumbs: list[tuple[str, str]] | None = None) -> str:
|
||||
nav = ""
|
||||
_TOP_NAV_LINKS = [
|
||||
("🏠 Home", "/wiki"),
|
||||
("📡 Activity", "/wiki/activity"),
|
||||
("🔀 Triage", "/admin/triage"),
|
||||
("📊 Dashboard", "/admin/dashboard"),
|
||||
]
|
||||
|
||||
|
||||
def _render_topnav(active_path: str = "") -> str:
|
||||
items = []
|
||||
for label, href in _TOP_NAV_LINKS:
|
||||
cls = "topnav-item active" if href == active_path else "topnav-item"
|
||||
items.append(f'<a href="{href}" class="{cls}">{label}</a>')
|
||||
return f'<nav class="topnav">{" ".join(items)}</nav>'
|
||||
|
||||
|
||||
def render_html(title: str, body_html: str, breadcrumbs: list[tuple[str, str]] | None = None, active_path: str = "") -> str:
|
||||
topnav = _render_topnav(active_path)
|
||||
crumbs = ""
|
||||
if breadcrumbs:
|
||||
parts = []
|
||||
for label, href in breadcrumbs:
|
||||
@@ -35,8 +52,9 @@ def render_html(title: str, body_html: str, breadcrumbs: list[tuple[str, str]] |
|
||||
parts.append(f'<a href="{href}">{label}</a>')
|
||||
else:
|
||||
parts.append(f"<span>{label}</span>")
|
||||
nav = f'<nav class="breadcrumbs">{" / ".join(parts)}</nav>'
|
||||
crumbs = f'<nav class="breadcrumbs">{" / ".join(parts)}</nav>'
|
||||
|
||||
nav = topnav + crumbs
|
||||
return _TEMPLATE.replace("{{title}}", title).replace("{{nav}}", nav).replace("{{body}}", body_html)
|
||||
|
||||
|
||||
@@ -100,6 +118,35 @@ def render_homepage() -> str:
|
||||
lines.append('<button type="submit">Search</button>')
|
||||
lines.append('</form>')
|
||||
|
||||
# What's happening — autonomous activity snippet
|
||||
try:
|
||||
from atocore.memory.service import get_recent_audit
|
||||
recent = get_recent_audit(limit=30)
|
||||
by_action: dict[str, int] = {}
|
||||
by_actor: dict[str, int] = {}
|
||||
for a in recent:
|
||||
by_action[a["action"]] = by_action.get(a["action"], 0) + 1
|
||||
by_actor[a["actor"]] = by_actor.get(a["actor"], 0) + 1
|
||||
# Surface autonomous actors specifically
|
||||
auto_actors = {k: v for k, v in by_actor.items()
|
||||
if k.startswith("auto-") or k == "confidence-decay"
|
||||
or k == "phase10-auto-promote" or k == "transient-to-durable"}
|
||||
if recent:
|
||||
lines.append('<div class="activity-snippet">')
|
||||
lines.append('<h3>📡 What the brain is doing</h3>')
|
||||
top_actions = sorted(by_action.items(), key=lambda x: -x[1])[:6]
|
||||
lines.append('<div class="stat-row">' +
|
||||
"".join(f'<span>{a}: {n}</span>' for a, n in top_actions) +
|
||||
'</div>')
|
||||
if auto_actors:
|
||||
lines.append(f'<p style="font-size:0.9rem; margin:0.3rem 0;">Autonomous actors: ' +
|
||||
" · ".join(f'<code>{k}</code> ({v})' for k, v in auto_actors.items()) +
|
||||
'</p>')
|
||||
lines.append('<p style="font-size:0.85rem; margin:0;"><a href="/wiki/activity">Full timeline →</a></p>')
|
||||
lines.append('</div>')
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
for bucket_name, items in buckets.items():
|
||||
if not items:
|
||||
continue
|
||||
@@ -116,14 +163,58 @@ def render_homepage() -> str:
|
||||
lines.append('</a>')
|
||||
lines.append('</div>')
|
||||
|
||||
# Phase 6 C.2: Emerging projects section
|
||||
try:
|
||||
import json as _json
|
||||
emerging_projects = []
|
||||
state_entries = get_state("atocore")
|
||||
for e in state_entries:
|
||||
if e.category == "proposals" and e.key == "unregistered_projects":
|
||||
try:
|
||||
emerging_projects = _json.loads(e.value)
|
||||
except Exception:
|
||||
emerging_projects = []
|
||||
break
|
||||
if emerging_projects:
|
||||
lines.append('<h2>📋 Emerging</h2>')
|
||||
lines.append('<p class="emerging-intro">Projects that appear in memories but aren\'t yet registered. '
|
||||
'One click to promote them to first-class projects.</p>')
|
||||
lines.append('<div class="emerging-grid">')
|
||||
for ep in emerging_projects[:10]:
|
||||
name = ep.get("project", "?")
|
||||
count = ep.get("count", 0)
|
||||
samples = ep.get("sample_contents", [])
|
||||
samples_html = "".join(f'<li>{s[:120]}</li>' for s in samples[:2])
|
||||
lines.append(
|
||||
f'<div class="emerging-card">'
|
||||
f'<h3>{name}</h3>'
|
||||
f'<div class="emerging-count">{count} memories</div>'
|
||||
f'<ul class="emerging-samples">{samples_html}</ul>'
|
||||
f'<button class="btn-register-emerging" onclick="registerEmerging({name!r})">📌 Register as project</button>'
|
||||
f'</div>'
|
||||
)
|
||||
lines.append('</div>')
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Quick stats
|
||||
all_entities = get_entities(limit=500)
|
||||
all_memories = get_memories(active_only=True, limit=500)
|
||||
pending = get_memories(status="candidate", limit=500)
|
||||
lines.append('<h2>System</h2>')
|
||||
lines.append(f'<p>{len(all_entities)} entities · {len(all_memories)} active memories · {len(projects)} projects</p>')
|
||||
lines.append(f'<p><a href="/admin/dashboard">API Dashboard (JSON)</a> · <a href="/health">Health Check</a></p>')
|
||||
|
||||
return render_html("AtoCore Wiki", "\n".join(lines))
|
||||
# Triage queue prompt — surfaced prominently if non-empty
|
||||
if pending:
|
||||
tone = "triage-warning" if len(pending) > 50 else "triage-notice"
|
||||
lines.append(
|
||||
f'<p class="{tone}">🗂️ <strong>{len(pending)} candidates</strong> awaiting triage — '
|
||||
f'<a href="/admin/triage">review now →</a></p>'
|
||||
)
|
||||
|
||||
lines.append(f'<p><a href="/admin/triage">Triage Queue</a> · <a href="/admin/dashboard">API Dashboard (JSON)</a> · <a href="/health">Health Check</a></p>')
|
||||
|
||||
return render_html("AtoCore Wiki", "\n".join(lines), active_path="/wiki")
|
||||
|
||||
|
||||
def render_project(project: str) -> str:
|
||||
@@ -201,15 +292,32 @@ def render_search(query: str) -> str:
|
||||
)
|
||||
lines.append('</ul>')
|
||||
|
||||
# Search memories
|
||||
# Search memories — match on content OR domain_tags (Phase 3)
|
||||
all_memories = get_memories(active_only=True, limit=200)
|
||||
query_lower = query.lower()
|
||||
matching_mems = [m for m in all_memories if query_lower in m.content.lower()][:10]
|
||||
matching_mems = [
|
||||
m for m in all_memories
|
||||
if query_lower in m.content.lower()
|
||||
or any(query_lower in (t or "").lower() for t in (m.domain_tags or []))
|
||||
][:20]
|
||||
if matching_mems:
|
||||
lines.append(f'<h2>Memories ({len(matching_mems)})</h2><ul>')
|
||||
for m in matching_mems:
|
||||
proj = f' <span class="tag">{m.project}</span>' if m.project else ''
|
||||
lines.append(f'<li>[{m.memory_type}]{proj} {m.content[:200]}</li>')
|
||||
tags_html = ""
|
||||
if m.domain_tags:
|
||||
tag_links = " ".join(
|
||||
f'<a href="/wiki/search?q={t}" class="tag-badge">{t}</a>'
|
||||
for t in m.domain_tags[:5]
|
||||
)
|
||||
tags_html = f' <span class="mem-tags">{tag_links}</span>'
|
||||
expiry_html = ""
|
||||
if m.valid_until:
|
||||
expiry_html = f' <span class="mem-expiry">valid until {m.valid_until[:10]}</span>'
|
||||
lines.append(
|
||||
f'<li>[{m.memory_type}]{proj}{tags_html}{expiry_html} '
|
||||
f'{m.content[:200]}</li>'
|
||||
)
|
||||
lines.append('</ul>')
|
||||
|
||||
if not entities and not matching_mems:
|
||||
@@ -227,6 +335,381 @@ def render_search(query: str) -> str:
|
||||
)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------
|
||||
# /wiki/capture — DEPRECATED emergency paste-in form.
|
||||
# Kept as an endpoint because POST /interactions is public anyway, but
|
||||
# REMOVED from the topnav so it's not promoted as the capture path.
|
||||
# The sanctioned surfaces are Claude Code (Stop + UserPromptSubmit
|
||||
# hooks) and OpenClaw (capture plugin with 7I context injection).
|
||||
# This form is explicitly a last-resort for when someone has to feed
|
||||
# in an external log and can't get the normal hooks to reach it.
|
||||
# ---------------------------------------------------------------------
|
||||
|
||||
|
||||
def render_capture() -> str:
|
||||
lines = ['<h1>📥 Manual capture (fallback only)</h1>']
|
||||
lines.append(
|
||||
'<div class="triage-warning"><strong>This is not the capture path.</strong> '
|
||||
'The sanctioned capture surfaces are Claude Code (Stop hook auto-captures every turn) '
|
||||
'and OpenClaw (plugin auto-captures + injects AtoCore context on every agent turn). '
|
||||
'This form exists only as a last resort for external logs you can\'t get into the normal pipeline.</div>'
|
||||
)
|
||||
lines.append(
|
||||
'<p>If you\'re reaching for this page because you had a chat somewhere AtoCore didn\'t see, '
|
||||
'fix the capture surface instead — don\'t paste. The deliberate scope is Claude Code + OpenClaw.</p>'
|
||||
)
|
||||
lines.append('<p class="meta">Your prompt + the assistant\'s response. Project is optional — '
|
||||
'the extractor infers it from content.</p>')
|
||||
lines.append("""
|
||||
<form id="capture-form" style="display:flex; flex-direction:column; gap:0.8rem; margin-top:1rem;">
|
||||
<label><strong>Your prompt / question</strong>
|
||||
<textarea id="cap-prompt" required rows="4"
|
||||
style="width:100%; padding:0.6rem; background:var(--bg); color:var(--text); border:1px solid var(--border); border-radius:6px; font-family:inherit; font-size:0.95rem;"
|
||||
placeholder="Paste what you asked…"></textarea>
|
||||
</label>
|
||||
<label><strong>Assistant response</strong>
|
||||
<textarea id="cap-response" required rows="10"
|
||||
style="width:100%; padding:0.6rem; background:var(--bg); color:var(--text); border:1px solid var(--border); border-radius:6px; font-family:inherit; font-size:0.95rem;"
|
||||
placeholder="Paste the full assistant response…"></textarea>
|
||||
</label>
|
||||
<div style="display:flex; gap:0.5rem; align-items:center; flex-wrap:wrap;">
|
||||
<label style="display:flex; gap:0.35rem; align-items:center;">Project (optional):
|
||||
<input type="text" id="cap-project" placeholder="auto-detect"
|
||||
style="padding:0.35rem 0.6rem; background:var(--bg); color:var(--text); border:1px solid var(--border); border-radius:4px; font-family:monospace; width:180px;">
|
||||
</label>
|
||||
<label style="display:flex; gap:0.35rem; align-items:center;">Source:
|
||||
<select id="cap-source" style="padding:0.35rem; background:var(--bg); color:var(--text); border:1px solid var(--border); border-radius:4px;">
|
||||
<option value="claude-desktop">Claude Desktop</option>
|
||||
<option value="claude-web">Claude.ai web</option>
|
||||
<option value="claude-mobile">Claude mobile</option>
|
||||
<option value="chatgpt">ChatGPT</option>
|
||||
<option value="other">Other</option>
|
||||
</select>
|
||||
</label>
|
||||
</div>
|
||||
<button type="submit"
|
||||
style="padding:0.6rem 1.2rem; background:var(--accent); color:white; border:none; border-radius:6px; cursor:pointer; font-size:1rem; font-weight:600; align-self:flex-start;">
|
||||
Save to AtoCore
|
||||
</button>
|
||||
</form>
|
||||
<div id="cap-status" style="margin-top:1rem; font-size:0.9rem; min-height:1.5em;"></div>
|
||||
|
||||
<script>
|
||||
document.getElementById('capture-form').addEventListener('submit', async (e) => {
|
||||
e.preventDefault();
|
||||
const prompt = document.getElementById('cap-prompt').value.trim();
|
||||
const response = document.getElementById('cap-response').value.trim();
|
||||
const project = document.getElementById('cap-project').value.trim();
|
||||
const source = document.getElementById('cap-source').value;
|
||||
const status = document.getElementById('cap-status');
|
||||
if (!prompt || !response) { status.textContent = 'Need both prompt and response.'; return; }
|
||||
status.textContent = 'Saving…';
|
||||
try {
|
||||
const r = await fetch('/interactions', {
|
||||
method: 'POST',
|
||||
headers: {'Content-Type': 'application/json'},
|
||||
body: JSON.stringify({
|
||||
prompt: prompt, response: response,
|
||||
client: source, project: project, reinforce: true
|
||||
})
|
||||
});
|
||||
if (r.ok) {
|
||||
const data = await r.json();
|
||||
status.innerHTML = '✅ Saved — interaction ' + (data.interaction_id || '?').slice(0,8) +
|
||||
'. Runs through extraction + triage within the hour.<br>' +
|
||||
'<a href="/interactions/' + (data.interaction_id || '') + '">view</a>';
|
||||
document.getElementById('capture-form').reset();
|
||||
} else {
|
||||
status.textContent = '❌ ' + r.status + ': ' + (await r.text()).slice(0, 200);
|
||||
}
|
||||
} catch (err) { status.textContent = '❌ ' + err.message; }
|
||||
});
|
||||
</script>
|
||||
""")
|
||||
lines.append(
|
||||
'<h2>How this works</h2>'
|
||||
'<ul>'
|
||||
'<li><strong>Claude Code</strong> → auto-captured via Stop hook</li>'
|
||||
'<li><strong>OpenClaw</strong> → auto-captured + gets AtoCore context injected on prompt start (Phase 7I)</li>'
|
||||
'<li><strong>Anything else</strong> (Claude Desktop, mobile, web, ChatGPT) → paste here</li>'
|
||||
'</ul>'
|
||||
'<p>The extractor is aggressive about capturing signal — don\'t hand-filter. '
|
||||
'If the conversation had nothing durable, triage will auto-reject.</p>'
|
||||
)
|
||||
|
||||
return render_html(
|
||||
"Capture — AtoCore",
|
||||
"\n".join(lines),
|
||||
breadcrumbs=[("Wiki", "/wiki"), ("Capture", "")],
|
||||
active_path="/wiki/capture",
|
||||
)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------
|
||||
# Phase 7E — /wiki/memories/{id}: memory detail page
|
||||
# ---------------------------------------------------------------------
|
||||
|
||||
|
||||
def render_memory_detail(memory_id: str) -> str | None:
|
||||
"""Full view of a single memory: content, audit trail, source refs,
|
||||
neighbors, graduation status. Fills the drill-down gap the list
|
||||
views can't."""
|
||||
from atocore.memory.service import get_memory_audit
|
||||
from atocore.models.database import get_connection
|
||||
|
||||
with get_connection() as conn:
|
||||
row = conn.execute("SELECT * FROM memories WHERE id = ?", (memory_id,)).fetchone()
|
||||
if row is None:
|
||||
return None
|
||||
|
||||
import json as _json
|
||||
mem = dict(row)
|
||||
try:
|
||||
tags = _json.loads(mem.get("domain_tags") or "[]") or []
|
||||
except Exception:
|
||||
tags = []
|
||||
|
||||
lines = [f'<h1>{mem["memory_type"]}: <span style="color:var(--text);">{mem["content"][:80]}</span></h1>']
|
||||
if len(mem["content"]) > 80:
|
||||
lines.append(f'<blockquote><p>{mem["content"]}</p></blockquote>')
|
||||
|
||||
# Metadata row
|
||||
meta_items = [
|
||||
f'<span class="tag">{mem["status"]}</span>',
|
||||
f'<strong>{mem["memory_type"]}</strong>',
|
||||
]
|
||||
if mem.get("project"):
|
||||
meta_items.append(f'<a href="/wiki/projects/{mem["project"]}">{mem["project"]}</a>')
|
||||
meta_items.append(f'confidence: <strong>{float(mem.get("confidence") or 0):.2f}</strong>')
|
||||
meta_items.append(f'refs: <strong>{int(mem.get("reference_count") or 0)}</strong>')
|
||||
if mem.get("valid_until"):
|
||||
meta_items.append(f'<span class="mem-expiry">valid until {str(mem["valid_until"])[:10]}</span>')
|
||||
lines.append(f'<p>{" · ".join(meta_items)}</p>')
|
||||
|
||||
if tags:
|
||||
tag_links = " ".join(f'<a href="/wiki/domains/{t}" class="tag-badge">{t}</a>' for t in tags)
|
||||
lines.append(f'<p><span class="mem-tags">{tag_links}</span></p>')
|
||||
|
||||
lines.append(f'<p class="meta">id: <code>{mem["id"]}</code> · created: {mem["created_at"]}'
|
||||
f' · updated: {mem.get("updated_at", "?")}'
|
||||
+ (f' · last referenced: {mem["last_referenced_at"]}' if mem.get("last_referenced_at") else '')
|
||||
+ '</p>')
|
||||
|
||||
# Graduation
|
||||
if mem.get("graduated_to_entity_id"):
|
||||
eid = mem["graduated_to_entity_id"]
|
||||
lines.append(
|
||||
f'<h2>🎓 Graduated</h2>'
|
||||
f'<p>This memory was promoted to a typed entity: '
|
||||
f'<a href="/wiki/entities/{eid}">{eid[:8]}</a></p>'
|
||||
)
|
||||
|
||||
# Source chunk
|
||||
if mem.get("source_chunk_id"):
|
||||
lines.append(f'<h2>Source chunk</h2><p><code>{mem["source_chunk_id"]}</code></p>')
|
||||
|
||||
# Audit trail
|
||||
audit = get_memory_audit(memory_id, limit=50)
|
||||
if audit:
|
||||
lines.append(f'<h2>Audit trail ({len(audit)} events)</h2><ul>')
|
||||
for a in audit:
|
||||
note = f' — {a["note"]}' if a.get("note") else ""
|
||||
lines.append(
|
||||
f'<li><code>{a["timestamp"]}</code> '
|
||||
f'<strong>{a["action"]}</strong> '
|
||||
f'<em>{a["actor"]}</em>{note}</li>'
|
||||
)
|
||||
lines.append('</ul>')
|
||||
|
||||
# Neighbors by shared tag
|
||||
if tags:
|
||||
from atocore.memory.service import get_memories as _get_memories
|
||||
neighbors = []
|
||||
for t in tags[:3]:
|
||||
for other in _get_memories(active_only=True, limit=30):
|
||||
if other.id == memory_id:
|
||||
continue
|
||||
if any(ot == t for ot in (other.domain_tags or [])):
|
||||
neighbors.append(other)
|
||||
# Dedupe
|
||||
seen = set()
|
||||
uniq = []
|
||||
for n in neighbors:
|
||||
if n.id in seen:
|
||||
continue
|
||||
seen.add(n.id)
|
||||
uniq.append(n)
|
||||
if uniq:
|
||||
lines.append(f'<h2>Related (by tag)</h2><ul>')
|
||||
for n in uniq[:10]:
|
||||
lines.append(
|
||||
f'<li><a href="/wiki/memories/{n.id}">[{n.memory_type}] '
|
||||
f'{n.content[:120]}</a>'
|
||||
+ (f' <span class="tag">{n.project}</span>' if n.project else '')
|
||||
+ '</li>'
|
||||
)
|
||||
lines.append('</ul>')
|
||||
|
||||
return render_html(
|
||||
f"Memory {memory_id[:8]}",
|
||||
"\n".join(lines),
|
||||
breadcrumbs=[("Wiki", "/wiki"), ("Memory", "")],
|
||||
)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------
|
||||
# Phase 7F — /wiki/domains/{tag}: cross-project domain view
|
||||
# ---------------------------------------------------------------------
|
||||
|
||||
|
||||
def render_domain(tag: str) -> str:
|
||||
"""All memories + entities carrying a given domain_tag, grouped by project.
|
||||
Answers 'what does the brain know about optics, across all projects?'"""
|
||||
tag = (tag or "").strip().lower()
|
||||
if not tag:
|
||||
return render_html("Domain", "<p>No tag specified.</p>",
|
||||
breadcrumbs=[("Wiki", "/wiki"), ("Domains", "")])
|
||||
|
||||
all_mems = get_memories(active_only=True, limit=500)
|
||||
matching = [m for m in all_mems
|
||||
if any((t or "").lower() == tag for t in (m.domain_tags or []))]
|
||||
|
||||
# Group by project
|
||||
by_project: dict[str, list] = {}
|
||||
for m in matching:
|
||||
by_project.setdefault(m.project or "(global)", []).append(m)
|
||||
|
||||
lines = [f'<h1>Domain: <code>{tag}</code></h1>']
|
||||
lines.append(f'<p class="meta">{len(matching)} active memories across {len(by_project)} projects</p>')
|
||||
|
||||
if not matching:
|
||||
lines.append(
|
||||
f'<p>No memories currently carry the tag <code>{tag}</code>.</p>'
|
||||
'<p>Domain tags are assigned by the extractor when it identifies '
|
||||
'the topical scope of a memory. They update over time.</p>'
|
||||
)
|
||||
return render_html(
|
||||
f"Domain: {tag}",
|
||||
"\n".join(lines),
|
||||
breadcrumbs=[("Wiki", "/wiki"), ("Domains", ""), (tag, "")],
|
||||
)
|
||||
|
||||
# Sort projects by count descending, (global) last
|
||||
def sort_key(item: tuple[str, list]) -> tuple[int, int]:
|
||||
proj, mems = item
|
||||
return (1 if proj == "(global)" else 0, -len(mems))
|
||||
|
||||
for proj, mems in sorted(by_project.items(), key=sort_key):
|
||||
proj_link = proj if proj == "(global)" else f'<a href="/wiki/projects/{proj}">{proj}</a>'
|
||||
lines.append(f'<h2>{proj_link} ({len(mems)})</h2><ul>')
|
||||
for m in mems:
|
||||
other_tags = [t for t in (m.domain_tags or []) if t != tag][:3]
|
||||
other_tags_html = ""
|
||||
if other_tags:
|
||||
other_tags_html = ' <span class="mem-tags">' + " ".join(
|
||||
f'<a href="/wiki/domains/{t}" class="tag-badge">{t}</a>' for t in other_tags
|
||||
) + '</span>'
|
||||
lines.append(
|
||||
f'<li><a href="/wiki/memories/{m.id}">[{m.memory_type}] '
|
||||
f'{m.content[:200]}</a>'
|
||||
f' <span class="meta">conf {m.confidence:.2f} · refs {m.reference_count}</span>'
|
||||
f'{other_tags_html}</li>'
|
||||
)
|
||||
lines.append('</ul>')
|
||||
|
||||
# Entities with this tag (if any have tags — currently they might not)
|
||||
try:
|
||||
all_entities = get_entities(limit=500)
|
||||
ent_matching = []
|
||||
for e in all_entities:
|
||||
tags = e.properties.get("domain_tags") if e.properties else []
|
||||
if isinstance(tags, list) and tag in [str(t).lower() for t in tags]:
|
||||
ent_matching.append(e)
|
||||
if ent_matching:
|
||||
lines.append(f'<h2>🔧 Entities ({len(ent_matching)})</h2><ul>')
|
||||
for e in ent_matching:
|
||||
lines.append(
|
||||
f'<li><a href="/wiki/entities/{e.id}">[{e.entity_type}] {e.name}</a>'
|
||||
+ (f' <span class="tag">{e.project}</span>' if e.project else '')
|
||||
+ '</li>'
|
||||
)
|
||||
lines.append('</ul>')
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
return render_html(
|
||||
f"Domain: {tag}",
|
||||
"\n".join(lines),
|
||||
breadcrumbs=[("Wiki", "/wiki"), ("Domains", ""), (tag, "")],
|
||||
)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------
|
||||
# /wiki/activity — autonomous-activity feed
|
||||
# ---------------------------------------------------------------------
|
||||
|
||||
|
||||
def render_activity(hours: int = 48, limit: int = 100) -> str:
|
||||
"""Timeline of what the autonomous pipeline did recently. Answers
|
||||
'what has the brain been doing while I was away?'"""
|
||||
from atocore.memory.service import get_recent_audit
|
||||
|
||||
audit = get_recent_audit(limit=limit)
|
||||
|
||||
# Group events by category for summary
|
||||
by_action: dict[str, int] = {}
|
||||
by_actor: dict[str, int] = {}
|
||||
for a in audit:
|
||||
by_action[a["action"]] = by_action.get(a["action"], 0) + 1
|
||||
by_actor[a["actor"]] = by_actor.get(a["actor"], 0) + 1
|
||||
|
||||
lines = [f'<h1>📡 Activity Feed</h1>']
|
||||
lines.append(f'<p class="meta">Last {len(audit)} events in the memory audit log</p>')
|
||||
|
||||
# Summary chips
|
||||
if by_action or by_actor:
|
||||
lines.append('<h2>Summary</h2>')
|
||||
lines.append('<p><strong>By action:</strong> ' +
|
||||
" · ".join(f'{k}: {v}' for k, v in sorted(by_action.items(), key=lambda x: -x[1])) +
|
||||
'</p>')
|
||||
lines.append('<p><strong>By actor:</strong> ' +
|
||||
" · ".join(f'<code>{k}</code>: {v}' for k, v in sorted(by_actor.items(), key=lambda x: -x[1])) +
|
||||
'</p>')
|
||||
|
||||
# Action-type color/emoji
|
||||
action_emoji = {
|
||||
"created": "➕", "promoted": "✅", "rejected": "❌", "invalidated": "🚫",
|
||||
"superseded": "🔀", "reinforced": "🔁", "updated": "✏️",
|
||||
"auto_promoted": "⚡", "created_via_merge": "🔗",
|
||||
"valid_until_extended": "⏳", "tag_canonicalized": "🏷️",
|
||||
}
|
||||
|
||||
lines.append('<h2>Timeline</h2><ul>')
|
||||
for a in audit:
|
||||
emoji = action_emoji.get(a["action"], "•")
|
||||
preview = a.get("content_preview") or ""
|
||||
ts_short = a["timestamp"][:16] if a.get("timestamp") else "?"
|
||||
mid_short = (a.get("memory_id") or "")[:8]
|
||||
note = f' — <em>{a["note"]}</em>' if a.get("note") else ""
|
||||
lines.append(
|
||||
f'<li>{emoji} <code>{ts_short}</code> '
|
||||
f'<strong>{a["action"]}</strong> '
|
||||
f'<em>{a["actor"]}</em> '
|
||||
f'<a href="/wiki/memories/{a["memory_id"]}">{mid_short}</a>'
|
||||
f'{note}'
|
||||
+ (f'<br><span style="opacity:0.6; font-size:0.85rem; margin-left:1.5rem;">{preview[:140]}</span>' if preview else '')
|
||||
+ '</li>'
|
||||
)
|
||||
lines.append('</ul>')
|
||||
|
||||
return render_html(
|
||||
"Activity — AtoCore",
|
||||
"\n".join(lines),
|
||||
breadcrumbs=[("Wiki", "/wiki"), ("Activity", "")],
|
||||
active_path="/wiki/activity",
|
||||
)
|
||||
|
||||
|
||||
_TEMPLATE = """<!DOCTYPE html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
@@ -263,6 +746,17 @@ _TEMPLATE = """<!DOCTYPE html>
|
||||
hr { border: none; border-top: 1px solid var(--border); margin: 2rem 0; }
|
||||
.breadcrumbs { margin-bottom: 1.5rem; font-size: 0.85em; opacity: 0.7; }
|
||||
.breadcrumbs a { opacity: 0.8; }
|
||||
.topnav { display: flex; gap: 0.25rem; flex-wrap: wrap; margin-bottom: 1rem; padding-bottom: 0.8rem; border-bottom: 1px solid var(--border); }
|
||||
.topnav-item { padding: 0.35rem 0.8rem; background: var(--card); border: 1px solid var(--border); border-radius: 6px; font-size: 0.88rem; color: var(--text); opacity: 0.75; text-decoration: none; }
|
||||
.topnav-item:hover { opacity: 1; background: var(--hover); text-decoration: none; }
|
||||
.topnav-item.active { background: var(--accent); color: white; border-color: var(--accent); opacity: 1; }
|
||||
.topnav-item.active:hover { background: var(--accent); }
|
||||
.activity-snippet { background: var(--card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; margin: 1rem 0; }
|
||||
.activity-snippet h3 { color: var(--accent); margin-bottom: 0.4rem; }
|
||||
.activity-snippet ul { margin: 0.3rem 0 0 1.2rem; font-size: 0.9rem; }
|
||||
.activity-snippet li { margin-bottom: 0.2rem; }
|
||||
.stat-row { display: flex; gap: 1rem; flex-wrap: wrap; font-size: 0.9rem; margin: 0.4rem 0; }
|
||||
.stat-row span { padding: 0.1rem 0.4rem; background: var(--hover); border-radius: 4px; }
|
||||
.meta { font-size: 0.8em; opacity: 0.5; margin-top: 0.5rem; }
|
||||
.tag { background: var(--accent); color: var(--bg); padding: 0.1rem 0.4rem; border-radius: 3px; font-size: 0.75em; margin-left: 0.3rem; }
|
||||
.search-box { display: flex; gap: 0.5rem; margin: 1.5rem 0; }
|
||||
@@ -289,7 +783,49 @@ _TEMPLATE = """<!DOCTYPE html>
|
||||
.card .stats { font-size: 0.8em; margin-top: 0.5rem; opacity: 0.5; }
|
||||
.card .client { font-size: 0.85em; opacity: 0.65; margin-bottom: 0.3rem; font-style: italic; }
|
||||
.card h3 .tag { font-size: 0.65em; vertical-align: middle; margin-left: 0.4rem; }
|
||||
.triage-notice { background: var(--card); border-left: 4px solid var(--accent); padding: 0.6rem 1rem; border-radius: 4px; margin: 0.8rem 0; }
|
||||
.triage-warning { background: #fef3c7; color: #78350f; border-left: 4px solid #d97706; padding: 0.6rem 1rem; border-radius: 4px; margin: 0.8rem 0; }
|
||||
@media (prefers-color-scheme: dark) { .triage-warning { background: #451a03; color: #fde68a; } }
|
||||
.mem-tags { display: inline-flex; gap: 0.25rem; flex-wrap: wrap; vertical-align: middle; }
|
||||
.tag-badge { background: var(--accent); color: white; padding: 0.1rem 0.5rem; border-radius: 10px; font-size: 0.7rem; font-family: monospace; text-decoration: none; font-weight: 500; }
|
||||
.tag-badge:hover { opacity: 0.85; text-decoration: none; }
|
||||
.mem-expiry { font-size: 0.75rem; color: #d97706; font-style: italic; margin-left: 0.4rem; }
|
||||
@media (prefers-color-scheme: dark) { .mem-expiry { color: #fbbf24; } }
|
||||
/* Phase 6 C.2 — Emerging projects section */
|
||||
.emerging-intro { font-size: 0.9rem; opacity: 0.75; margin-bottom: 0.8rem; }
|
||||
.emerging-grid { display: grid; grid-template-columns: repeat(auto-fill, minmax(280px, 1fr)); gap: 1rem; margin-bottom: 1rem; }
|
||||
.emerging-card { background: var(--card); border: 1px dashed var(--accent); border-radius: 8px; padding: 1rem; }
|
||||
.emerging-card h3 { margin: 0 0 0.3rem 0; color: var(--accent); font-family: monospace; font-size: 1rem; }
|
||||
.emerging-count { font-size: 0.8rem; opacity: 0.6; margin-bottom: 0.5rem; }
|
||||
.emerging-samples { font-size: 0.85rem; margin: 0.5rem 0; padding-left: 1.2rem; opacity: 0.8; }
|
||||
.emerging-samples li { margin-bottom: 0.25rem; }
|
||||
.btn-register-emerging { width: 100%; padding: 0.45rem 0.9rem; background: var(--accent); color: white; border: 1px solid var(--accent); border-radius: 4px; cursor: pointer; font-size: 0.88rem; font-weight: 500; margin-top: 0.5rem; }
|
||||
.btn-register-emerging:hover { opacity: 0.9; }
|
||||
</style>
|
||||
<script>
|
||||
async function registerEmerging(projectId) {
|
||||
if (!confirm(`Register "${projectId}" as a first-class project?\n\nThis creates:\n• /wiki/projects/${projectId} page\n• System map + gaps + killer queries\n• Triage + graduation support\n\nIngest root defaults to vault:incoming/projects/${projectId}/`)) {
|
||||
return;
|
||||
}
|
||||
try {
|
||||
const r = await fetch('/admin/projects/register-emerging', {
|
||||
method: 'POST',
|
||||
headers: {'Content-Type': 'application/json'},
|
||||
body: JSON.stringify({project_id: projectId}),
|
||||
});
|
||||
if (r.ok) {
|
||||
const data = await r.json();
|
||||
alert(data.message || `Registered ${projectId}`);
|
||||
window.location.reload();
|
||||
} else {
|
||||
const err = await r.text();
|
||||
alert(`Registration failed: ${r.status}\n${err.substring(0, 300)}`);
|
||||
}
|
||||
} catch (e) {
|
||||
alert(`Network error: ${e.message}`);
|
||||
}
|
||||
}
|
||||
</script>
|
||||
</head>
|
||||
<body>
|
||||
{{nav}}
|
||||
|
||||
200
src/atocore/memory/_dedup_prompt.py
Normal file
200
src/atocore/memory/_dedup_prompt.py
Normal file
@@ -0,0 +1,200 @@
|
||||
"""Shared LLM prompt + parser for memory dedup (Phase 7A).
|
||||
|
||||
Stdlib-only — must be importable from both the in-container service
|
||||
layer (when a user clicks "scan for duplicates" in the UI) and the
|
||||
host-side batch script (``scripts/memory_dedup.py``), which runs on
|
||||
Dalidou where the container's Python deps are not available.
|
||||
|
||||
The prompt instructs the model to draft a UNIFIED memory that
|
||||
preserves every specific detail from the sources. We never want a
|
||||
merge to lose information — if two memories disagree on a number, the
|
||||
merged content should surface both with context.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
from typing import Any
|
||||
|
||||
DEDUP_PROMPT_VERSION = "dedup-0.1.0"
|
||||
MAX_CONTENT_CHARS = 1000
|
||||
MAX_SOURCES = 8 # cluster size cap — bigger clusters are suspicious
|
||||
|
||||
SYSTEM_PROMPT = """You consolidate near-duplicate memories for AtoCore, a personal context engine.
|
||||
|
||||
Given 2-8 memories that a semantic-similarity scan flagged as likely duplicates, draft a UNIFIED replacement that preserves every specific detail from every source.
|
||||
|
||||
CORE PRINCIPLE: information never gets lost. If the sources disagree on a number, date, vendor, or spec, surface BOTH with attribution (e.g., "quoted at $3.2k on 2026-03-01, revised to $3.8k on 2026-04-10"). If one source is more specific than another, keep the specificity. If they say the same thing differently, pick the clearer wording.
|
||||
|
||||
YOU MUST:
|
||||
- Produce content under 500 characters that reads as a single coherent statement
|
||||
- Keep all project/vendor/person/part names that appear in any source
|
||||
- Keep all numbers, dates, and identifiers
|
||||
- Keep the strongest claim wording ("ratified", "decided", "committed") if any source has it
|
||||
- Propose domain_tags as a UNION of the sources' tags (lowercase, deduped, cap 6)
|
||||
- Return valid_until = latest non-null valid_until across sources, or null if any source has null (permanent beats transient)
|
||||
|
||||
REFUSE TO MERGE (return action="reject") if:
|
||||
- The memories are actually about DIFFERENT subjects that just share vocabulary (e.g., "p04 mirror" and "p05 mirror" — same project bucket means same project, but different components)
|
||||
- One memory CONTRADICTS another and you cannot reconcile them — flag for contradiction review instead
|
||||
- The sources span different time snapshots of a changing state that should stay as a timeline, not be collapsed
|
||||
|
||||
OUTPUT — raw JSON, no prose, no markdown fences:
|
||||
{
|
||||
"action": "merge" | "reject",
|
||||
"content": "the unified memory content",
|
||||
"memory_type": "knowledge|project|preference|adaptation|episodic|identity",
|
||||
"project": "project-slug or empty",
|
||||
"domain_tags": ["tag1", "tag2"],
|
||||
"confidence": 0.5,
|
||||
"reason": "one sentence explaining the merge (or the rejection)"
|
||||
}
|
||||
|
||||
On action=reject, still fill content with a short explanation and set confidence=0."""
|
||||
|
||||
|
||||
TIER2_SYSTEM_PROMPT = """You are the second-opinion reviewer for AtoCore's memory-consolidation pipeline.
|
||||
|
||||
A tier-1 model (cheaper, faster) already drafted a unified memory from N near-duplicate source memories. Your job is to either CONFIRM the merge (refining the content if you see a clearer phrasing) or OVERRIDE with action="reject" if the tier-1 missed something important.
|
||||
|
||||
You must be STRICTER than tier-1. Specifically, REJECT if:
|
||||
- The sources are about different subjects that share vocabulary (e.g., different components within the same project)
|
||||
- The tier-1 draft dropped specifics that existed in the sources (numbers, dates, vendors, people, part IDs)
|
||||
- One source contradicts another and the draft glossed over it
|
||||
- The sources span a timeline of a changing state (should be preserved as a sequence, not collapsed)
|
||||
|
||||
If you CONFIRM, you may polish the content — but preserve every specific from every source.
|
||||
|
||||
Same output schema as tier-1:
|
||||
{
|
||||
"action": "merge" | "reject",
|
||||
"content": "the unified memory content",
|
||||
"memory_type": "knowledge|project|preference|adaptation|episodic|identity",
|
||||
"project": "project-slug or empty",
|
||||
"domain_tags": ["tag1", "tag2"],
|
||||
"confidence": 0.5,
|
||||
"reason": "one sentence — what you confirmed or why you overrode"
|
||||
}
|
||||
|
||||
Raw JSON only, no prose, no markdown fences."""
|
||||
|
||||
|
||||
def build_tier2_user_message(sources: list[dict[str, Any]], tier1_verdict: dict[str, Any]) -> str:
|
||||
"""Format tier-2 review payload: same sources + tier-1's draft."""
|
||||
base = build_user_message(sources)
|
||||
draft_summary = (
|
||||
f"\n\n--- TIER-1 DRAFT (for your review) ---\n"
|
||||
f"action: {tier1_verdict.get('action')}\n"
|
||||
f"confidence: {tier1_verdict.get('confidence', 0):.2f}\n"
|
||||
f"proposed content: {(tier1_verdict.get('content') or '')[:600]}\n"
|
||||
f"proposed memory_type: {tier1_verdict.get('memory_type', '')}\n"
|
||||
f"proposed project: {tier1_verdict.get('project', '')}\n"
|
||||
f"proposed tags: {tier1_verdict.get('domain_tags', [])}\n"
|
||||
f"tier-1 reason: {tier1_verdict.get('reason', '')[:300]}\n"
|
||||
f"---\n\n"
|
||||
f"Return your JSON verdict now. Confirm or override."
|
||||
)
|
||||
return base.replace("Return the JSON object now.", "").rstrip() + draft_summary
|
||||
|
||||
|
||||
def build_user_message(sources: list[dict[str, Any]]) -> str:
|
||||
"""Format N source memories for the model to consolidate.
|
||||
|
||||
Each source dict should carry id, content, project, memory_type,
|
||||
domain_tags, confidence, valid_until, reference_count.
|
||||
"""
|
||||
lines = [f"You have {len(sources)} source memories in the same (project, memory_type) bucket:\n"]
|
||||
for i, src in enumerate(sources[:MAX_SOURCES], start=1):
|
||||
tags = src.get("domain_tags") or []
|
||||
if isinstance(tags, str):
|
||||
try:
|
||||
tags = json.loads(tags)
|
||||
except Exception:
|
||||
tags = []
|
||||
lines.append(
|
||||
f"--- Source {i} (id={src.get('id','?')[:8]}, "
|
||||
f"refs={src.get('reference_count',0)}, "
|
||||
f"conf={src.get('confidence',0):.2f}, "
|
||||
f"valid_until={src.get('valid_until') or 'permanent'}) ---"
|
||||
)
|
||||
lines.append(f"project: {src.get('project','')}")
|
||||
lines.append(f"type: {src.get('memory_type','')}")
|
||||
lines.append(f"tags: {tags}")
|
||||
lines.append(f"content: {(src.get('content') or '')[:MAX_CONTENT_CHARS]}")
|
||||
lines.append("")
|
||||
lines.append("Return the JSON object now.")
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
def parse_merge_verdict(raw_output: str) -> dict[str, Any] | None:
|
||||
"""Strip markdown fences / leading prose and return the parsed JSON
|
||||
object. Returns None on parse failure."""
|
||||
text = (raw_output or "").strip()
|
||||
if text.startswith("```"):
|
||||
text = text.strip("`")
|
||||
nl = text.find("\n")
|
||||
if nl >= 0:
|
||||
text = text[nl + 1:]
|
||||
if text.endswith("```"):
|
||||
text = text[:-3]
|
||||
text = text.strip()
|
||||
|
||||
if not text.lstrip().startswith("{"):
|
||||
start = text.find("{")
|
||||
end = text.rfind("}")
|
||||
if start >= 0 and end > start:
|
||||
text = text[start:end + 1]
|
||||
|
||||
try:
|
||||
parsed = json.loads(text)
|
||||
except json.JSONDecodeError:
|
||||
return None
|
||||
if not isinstance(parsed, dict):
|
||||
return None
|
||||
return parsed
|
||||
|
||||
|
||||
def normalize_merge_verdict(verdict: dict[str, Any]) -> dict[str, Any] | None:
|
||||
"""Validate + normalize a raw merge verdict. Returns None if the
|
||||
verdict is unusable (no content, unknown action)."""
|
||||
action = str(verdict.get("action") or "").strip().lower()
|
||||
if action not in ("merge", "reject"):
|
||||
return None
|
||||
|
||||
content = str(verdict.get("content") or "").strip()
|
||||
if not content:
|
||||
return None
|
||||
|
||||
memory_type = str(verdict.get("memory_type") or "knowledge").strip().lower()
|
||||
project = str(verdict.get("project") or "").strip()
|
||||
|
||||
raw_tags = verdict.get("domain_tags") or []
|
||||
if isinstance(raw_tags, str):
|
||||
raw_tags = [t.strip() for t in raw_tags.split(",") if t.strip()]
|
||||
if not isinstance(raw_tags, list):
|
||||
raw_tags = []
|
||||
tags: list[str] = []
|
||||
for t in raw_tags[:6]:
|
||||
if not isinstance(t, str):
|
||||
continue
|
||||
tt = t.strip().lower()
|
||||
if tt and tt not in tags:
|
||||
tags.append(tt)
|
||||
|
||||
try:
|
||||
confidence = float(verdict.get("confidence", 0.5))
|
||||
except (TypeError, ValueError):
|
||||
confidence = 0.5
|
||||
confidence = max(0.0, min(1.0, confidence))
|
||||
|
||||
reason = str(verdict.get("reason") or "").strip()[:500]
|
||||
|
||||
return {
|
||||
"action": action,
|
||||
"content": content[:1000],
|
||||
"memory_type": memory_type,
|
||||
"project": project,
|
||||
"domain_tags": tags,
|
||||
"confidence": confidence,
|
||||
"reason": reason,
|
||||
}
|
||||
@@ -21,7 +21,7 @@ from __future__ import annotations
|
||||
import json
|
||||
from typing import Any
|
||||
|
||||
LLM_EXTRACTOR_VERSION = "llm-0.4.0"
|
||||
LLM_EXTRACTOR_VERSION = "llm-0.6.0" # bolder unknown-project tagging
|
||||
MAX_RESPONSE_CHARS = 8000
|
||||
MAX_PROMPT_CHARS = 2000
|
||||
MEMORY_TYPES = {"identity", "preference", "project", "episodic", "knowledge", "adaptation"}
|
||||
@@ -30,7 +30,24 @@ SYSTEM_PROMPT = """You extract memory candidates from LLM conversation turns for
|
||||
|
||||
AtoCore is the brain for Atomaste's engineering work. Known projects:
|
||||
p04-gigabit, p05-interferometer, p06-polisher, atomizer-v2, atocore,
|
||||
abb-space. Unknown project names — still tag them, the system auto-detects.
|
||||
abb-space.
|
||||
|
||||
UNKNOWN PROJECT/TOOL DETECTION (important): when a memory is clearly
|
||||
about a named tool, product, project, or system that is NOT in the
|
||||
known list above, use a slugified version of that name as the project
|
||||
tag (e.g., "apm" for "Atomaste Part Manager", "foo-bar" for "Foo Bar
|
||||
System"). DO NOT default to a nearest registered match just because
|
||||
APM isn't listed — that's misattribution. The system's Living
|
||||
Taxonomy detector scans for these unregistered tags and surfaces them
|
||||
for one-click registration once they appear in ≥3 memories. Your job
|
||||
is to be honest about scope, not to squeeze everything into existing
|
||||
buckets.
|
||||
|
||||
Exception: if the memory is about a registered project that merely
|
||||
uses or integrates with an unknown tool (e.g., "p04 parts are missing
|
||||
materials in APM"), tag with the registered project (p04-gigabit) and
|
||||
mention the tool in content. Only use an unknown tool as the project
|
||||
tag when the tool itself is the primary subject.
|
||||
|
||||
Your job is to emit SIGNALS that matter for future context. Be aggressive:
|
||||
err on the side of capturing useful signal. Triage filters noise downstream.
|
||||
@@ -84,6 +101,36 @@ DOMAINS for knowledge candidates (required when type=knowledge and project is em
|
||||
physics, materials, optics, mechanics, manufacturing, metrology,
|
||||
controls, software, math, finance, business
|
||||
|
||||
DOMAIN TAGS (Phase 3):
|
||||
Every candidate gets domain_tags — a lowercase list of topical keywords
|
||||
that describe the SUBJECT matter regardless of project. This is how
|
||||
cross-project retrieval works: a query about "optics" surfaces matches
|
||||
from p04 + p05 + p06 without naming each project.
|
||||
|
||||
Good tags: single lowercase words or hyphenated terms.
|
||||
Examples:
|
||||
- "ABB quote received for P04" → ["abb", "p04", "procurement", "optics"]
|
||||
- "USB SSD mandatory on polisher" → ["p06", "firmware", "storage"]
|
||||
- "CTE dominates WFE at F/1.2" → ["optics", "materials", "thermal"]
|
||||
- "Antoine prefers OAuth over API keys" → ["security", "auth", "preference"]
|
||||
|
||||
Tag 2-5 items. Use domain keywords (optics, thermal, firmware), project
|
||||
tokens when relevant (p04, abb), and lifecycle words (procurement, design,
|
||||
validation) as appropriate.
|
||||
|
||||
VALID_UNTIL (Phase 3):
|
||||
A memory can have an expiry date if it describes time-bounded truth.
|
||||
Use valid_until for:
|
||||
- Status snapshots: "current blocker is X" → valid_until = ~2 weeks out
|
||||
- Scheduled events: "meeting with vendor Friday" → valid_until = meeting date
|
||||
- Quotes with expiry: "quote valid until May 31"
|
||||
- Interim decisions pending ratification
|
||||
Leave empty (null) for:
|
||||
- Durable design decisions ("Option B selected")
|
||||
- Engineering insights ("CTE dominates at F/1.2")
|
||||
- Ratified requirements, architectural commitments
|
||||
Default = null (permanent). Format: ISO date "YYYY-MM-DD" or empty.
|
||||
|
||||
TRUST HIERARCHY:
|
||||
|
||||
- project-specific: set project to the project id, leave domain empty
|
||||
@@ -99,7 +146,7 @@ OUTPUT RULES:
|
||||
- Empty array [] is fine when the conversation has no durable signal
|
||||
|
||||
Each element:
|
||||
{"type": "project|knowledge|preference|adaptation|episodic", "content": "...", "project": "...", "domain": "", "confidence": 0.5}"""
|
||||
{"type": "project|knowledge|preference|adaptation|episodic", "content": "...", "project": "...", "domain": "", "confidence": 0.5, "domain_tags": ["tag1","tag2"], "valid_until": null}"""
|
||||
|
||||
|
||||
def build_user_message(prompt: str, response: str, project_hint: str) -> str:
|
||||
@@ -174,10 +221,36 @@ def normalize_candidate_item(item: dict[str, Any]) -> dict[str, Any] | None:
|
||||
if domain and not model_project:
|
||||
content = f"[{domain}] {content}"
|
||||
|
||||
# Phase 3: domain_tags + valid_until
|
||||
raw_tags = item.get("domain_tags") or []
|
||||
if isinstance(raw_tags, str):
|
||||
# Tolerate comma-separated string fallback
|
||||
raw_tags = [t.strip() for t in raw_tags.split(",") if t.strip()]
|
||||
if not isinstance(raw_tags, list):
|
||||
raw_tags = []
|
||||
domain_tags = []
|
||||
for t in raw_tags[:10]: # cap at 10
|
||||
if not isinstance(t, str):
|
||||
continue
|
||||
tag = t.strip().lower()
|
||||
if tag and tag not in domain_tags:
|
||||
domain_tags.append(tag)
|
||||
|
||||
valid_until = item.get("valid_until")
|
||||
if valid_until is not None:
|
||||
valid_until = str(valid_until).strip()
|
||||
# Accept ISO date "YYYY-MM-DD" or full timestamp; empty/"null" → none
|
||||
if valid_until.lower() in ("", "null", "none", "permanent"):
|
||||
valid_until = ""
|
||||
else:
|
||||
valid_until = ""
|
||||
|
||||
return {
|
||||
"type": mem_type,
|
||||
"content": content[:1000],
|
||||
"project": model_project,
|
||||
"domain": domain,
|
||||
"confidence": confidence,
|
||||
"domain_tags": domain_tags,
|
||||
"valid_until": valid_until,
|
||||
}
|
||||
|
||||
158
src/atocore/memory/_tag_canon_prompt.py
Normal file
158
src/atocore/memory/_tag_canon_prompt.py
Normal file
@@ -0,0 +1,158 @@
|
||||
"""Shared LLM prompt + parser for tag canonicalization (Phase 7C).
|
||||
|
||||
Stdlib-only, importable from both the in-container service layer and the
|
||||
host-side batch script that shells out to ``claude -p``.
|
||||
|
||||
The prompt instructs the model to propose a map of domain_tag aliases
|
||||
to their canonical form. Confidence is key here — we AUTO-APPLY high-
|
||||
confidence aliases; low-confidence go to human review. Over-merging
|
||||
distinct concepts ("optics" vs "optical" — sometimes equivalent,
|
||||
sometimes not) destroys cross-cutting retrieval, so the model is
|
||||
instructed to err conservative.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
from typing import Any
|
||||
|
||||
TAG_CANON_PROMPT_VERSION = "tagcanon-0.1.0"
|
||||
MAX_TAGS_IN_PROMPT = 100
|
||||
|
||||
SYSTEM_PROMPT = """You canonicalize domain tags for AtoCore's memory layer.
|
||||
|
||||
Input: a distribution of lowercase domain tags (keyword → usage count across active memories). Examples: "firmware: 23", "fw: 5", "firmware-control: 3", "optics: 18", "optical: 2".
|
||||
|
||||
Your job: identify aliases — distinct strings that refer to the SAME concept — and map them to a single canonical form. The canonical should be the clearest / most-used / most-descriptive variant.
|
||||
|
||||
STRICT RULES:
|
||||
|
||||
1. ONLY propose aliases that are UNAMBIGUOUSLY equivalent. Examples:
|
||||
- "fw" → "firmware" (abbreviation)
|
||||
- "firmware-control" → "firmware" (compound narrowing — only if usage context makes it clear the narrower one is never used to DISTINGUISH from firmware-in-general)
|
||||
- "py" → "python"
|
||||
- "ml" → "machine-learning"
|
||||
Do NOT merge:
|
||||
- "optics" vs "optical" — these CAN diverge ("optics" = subsystem/product domain; "optical" = adjective used in non-optics contexts)
|
||||
- "p04" vs "p04-gigabit" — project ids are their own namespace, never canonicalize
|
||||
- "thermal" vs "temperature" — related but distinct
|
||||
- Anything where you're not sure — skip it, human review will catch real aliases next week
|
||||
|
||||
2. Confidence scale:
|
||||
0.9+ obvious abbreviation, very high usage disparity, no plausible alternative meaning
|
||||
0.7-0.9 likely alias, one-word-diff or standard contraction
|
||||
0.5-0.7 plausible but requires context — low count on alias side
|
||||
<0.5 DO NOT PROPOSE — if you're under 0.5, skip the pair entirely
|
||||
AtoCore auto-applies aliases at confidence >= 0.8; anything below goes to human review.
|
||||
|
||||
3. The CANONICAL must actually appear in the input list (don't invent a new term).
|
||||
|
||||
4. Never propose `alias == canonical`. Never propose circular mappings.
|
||||
|
||||
5. Project tags (p04, p05, p06, abb-space, atomizer-v2, atocore, apm) are OFF LIMITS — they are project identifiers, not concepts. Leave them alone entirely.
|
||||
|
||||
OUTPUT — raw JSON, no prose, no markdown fences:
|
||||
{
|
||||
"aliases": [
|
||||
{"alias": "fw", "canonical": "firmware", "confidence": 0.95, "reason": "fw is a standard abbreviation of firmware; 5 uses vs 23"},
|
||||
{"alias": "ml", "canonical": "machine-learning", "confidence": 0.90, "reason": "ml is the universal abbreviation"}
|
||||
]
|
||||
}
|
||||
|
||||
Empty aliases list is fine if nothing in the distribution is a clear alias. Err conservative — one false merge can pollute retrieval for hundreds of memories."""
|
||||
|
||||
|
||||
def build_user_message(tag_distribution: dict[str, int]) -> str:
|
||||
"""Format the tag distribution for the model.
|
||||
|
||||
Limited to MAX_TAGS_IN_PROMPT entries, sorted by count descending
|
||||
so high-usage tags appear first (the LLM uses them as anchor points
|
||||
for canonical selection).
|
||||
"""
|
||||
if not tag_distribution:
|
||||
return "Empty tag distribution — return {\"aliases\": []}."
|
||||
|
||||
sorted_tags = sorted(tag_distribution.items(), key=lambda x: x[1], reverse=True)
|
||||
top = sorted_tags[:MAX_TAGS_IN_PROMPT]
|
||||
lines = [f"{tag}: {count}" for tag, count in top]
|
||||
return (
|
||||
f"Tag distribution across {sum(tag_distribution.values())} total tag references "
|
||||
f"(showing top {len(top)} of {len(tag_distribution)} unique tags):\n\n"
|
||||
+ "\n".join(lines)
|
||||
+ "\n\nReturn the JSON aliases map now. Only propose UNAMBIGUOUS equivalents."
|
||||
)
|
||||
|
||||
|
||||
def parse_canon_output(raw_output: str) -> list[dict[str, Any]]:
|
||||
"""Strip markdown fences / prose and return the parsed aliases list."""
|
||||
text = (raw_output or "").strip()
|
||||
if text.startswith("```"):
|
||||
text = text.strip("`")
|
||||
nl = text.find("\n")
|
||||
if nl >= 0:
|
||||
text = text[nl + 1:]
|
||||
if text.endswith("```"):
|
||||
text = text[:-3]
|
||||
text = text.strip()
|
||||
|
||||
if not text.lstrip().startswith("{"):
|
||||
start = text.find("{")
|
||||
end = text.rfind("}")
|
||||
if start >= 0 and end > start:
|
||||
text = text[start:end + 1]
|
||||
|
||||
try:
|
||||
parsed = json.loads(text)
|
||||
except json.JSONDecodeError:
|
||||
return []
|
||||
|
||||
if not isinstance(parsed, dict):
|
||||
return []
|
||||
aliases = parsed.get("aliases") or []
|
||||
if not isinstance(aliases, list):
|
||||
return []
|
||||
return [a for a in aliases if isinstance(a, dict)]
|
||||
|
||||
|
||||
# Project tokens that must never be canonicalized — they're project ids,
|
||||
# not concepts. Keep this list in sync with the registered projects.
|
||||
# Safe to be over-inclusive; extra entries just skip canonicalization.
|
||||
PROTECTED_PROJECT_TOKENS = frozenset({
|
||||
"p04", "p04-gigabit",
|
||||
"p05", "p05-interferometer",
|
||||
"p06", "p06-polisher",
|
||||
"p08", "abb-space",
|
||||
"atomizer", "atomizer-v2",
|
||||
"atocore", "apm",
|
||||
})
|
||||
|
||||
|
||||
def normalize_alias_item(item: dict[str, Any]) -> dict[str, Any] | None:
|
||||
"""Validate one raw alias proposal. Returns None if unusable.
|
||||
|
||||
Filters: non-strings, empty strings, identity mappings, protected
|
||||
project tokens on either side.
|
||||
"""
|
||||
alias = str(item.get("alias") or "").strip().lower()
|
||||
canonical = str(item.get("canonical") or "").strip().lower()
|
||||
if not alias or not canonical:
|
||||
return None
|
||||
if alias == canonical:
|
||||
return None
|
||||
if alias in PROTECTED_PROJECT_TOKENS or canonical in PROTECTED_PROJECT_TOKENS:
|
||||
return None
|
||||
|
||||
try:
|
||||
confidence = float(item.get("confidence", 0.0))
|
||||
except (TypeError, ValueError):
|
||||
confidence = 0.0
|
||||
confidence = max(0.0, min(1.0, confidence))
|
||||
|
||||
reason = str(item.get("reason") or "").strip()[:300]
|
||||
|
||||
return {
|
||||
"alias": alias,
|
||||
"canonical": canonical,
|
||||
"confidence": confidence,
|
||||
"reason": reason,
|
||||
}
|
||||
File diff suppressed because it is too large
Load Diff
88
src/atocore/memory/similarity.py
Normal file
88
src/atocore/memory/similarity.py
Normal file
@@ -0,0 +1,88 @@
|
||||
"""Phase 7A (Memory Consolidation): semantic similarity helpers.
|
||||
|
||||
Thin wrapper over ``atocore.retrieval.embeddings`` that exposes
|
||||
pairwise + batch cosine similarity on normalized embeddings. Used by
|
||||
the dedup detector to cluster near-duplicate active memories.
|
||||
|
||||
Embeddings from ``embed_texts()`` are already L2-normalized, so cosine
|
||||
similarity reduces to a dot product — no extra normalization needed.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from atocore.retrieval.embeddings import embed_texts
|
||||
|
||||
|
||||
def _dot(a: list[float], b: list[float]) -> float:
|
||||
return sum(x * y for x, y in zip(a, b))
|
||||
|
||||
|
||||
def cosine(a: list[float], b: list[float]) -> float:
|
||||
"""Cosine similarity on already-normalized vectors. Clamped to [0,1]
|
||||
(embeddings use paraphrase-multilingual-MiniLM which is unit-norm,
|
||||
and we never want negative values leaking into thresholds)."""
|
||||
return max(0.0, min(1.0, _dot(a, b)))
|
||||
|
||||
|
||||
def compute_memory_similarity(text_a: str, text_b: str) -> float:
|
||||
"""Return cosine similarity of two memory contents in [0,1].
|
||||
|
||||
Convenience helper for one-off checks + tests. For batch work (the
|
||||
dedup detector), use ``embed_texts()`` directly and compute the
|
||||
similarity matrix yourself to avoid re-embedding shared texts.
|
||||
"""
|
||||
if not text_a or not text_b:
|
||||
return 0.0
|
||||
vecs = embed_texts([text_a, text_b])
|
||||
return cosine(vecs[0], vecs[1])
|
||||
|
||||
|
||||
def similarity_matrix(texts: list[str]) -> list[list[float]]:
|
||||
"""N×N cosine similarity matrix. Diagonal is 1.0, symmetric."""
|
||||
if not texts:
|
||||
return []
|
||||
vecs = embed_texts(texts)
|
||||
n = len(vecs)
|
||||
matrix = [[0.0] * n for _ in range(n)]
|
||||
for i in range(n):
|
||||
matrix[i][i] = 1.0
|
||||
for j in range(i + 1, n):
|
||||
s = cosine(vecs[i], vecs[j])
|
||||
matrix[i][j] = s
|
||||
matrix[j][i] = s
|
||||
return matrix
|
||||
|
||||
|
||||
def cluster_by_threshold(texts: list[str], threshold: float) -> list[list[int]]:
|
||||
"""Greedy transitive clustering: if sim(i,j) >= threshold, merge.
|
||||
|
||||
Returns a list of clusters, each a list of indices into ``texts``.
|
||||
Singletons are included. Used by the dedup detector to collapse
|
||||
A~B~C into one merge proposal rather than three pair proposals.
|
||||
"""
|
||||
if not texts:
|
||||
return []
|
||||
matrix = similarity_matrix(texts)
|
||||
n = len(texts)
|
||||
parent = list(range(n))
|
||||
|
||||
def find(x: int) -> int:
|
||||
while parent[x] != x:
|
||||
parent[x] = parent[parent[x]]
|
||||
x = parent[x]
|
||||
return x
|
||||
|
||||
def union(x: int, y: int) -> None:
|
||||
rx, ry = find(x), find(y)
|
||||
if rx != ry:
|
||||
parent[rx] = ry
|
||||
|
||||
for i in range(n):
|
||||
for j in range(i + 1, n):
|
||||
if matrix[i][j] >= threshold:
|
||||
union(i, j)
|
||||
|
||||
groups: dict[int, list[int]] = {}
|
||||
for i in range(n):
|
||||
groups.setdefault(find(i), []).append(i)
|
||||
return list(groups.values())
|
||||
@@ -119,6 +119,111 @@ def _apply_migrations(conn: sqlite3.Connection) -> None:
|
||||
"CREATE INDEX IF NOT EXISTS idx_memories_last_referenced ON memories(last_referenced_at)"
|
||||
)
|
||||
|
||||
# Phase 3 (Auto-Organization V1): domain tags + expiry.
|
||||
# domain_tags is a JSON array of lowercase strings (optics, mechanics,
|
||||
# firmware, business, etc.) inferred by the LLM during triage. Used for
|
||||
# cross-project retrieval: a query about "optics" can surface matches from
|
||||
# p04 + p05 + p06 without knowing all the project names.
|
||||
# valid_until is an ISO UTC timestamp beyond which the memory is
|
||||
# considered stale. get_memories_for_context filters these out of context
|
||||
# packs automatically so ephemeral facts (status snapshots, weekly counts)
|
||||
# don't pollute grounding once they've aged out.
|
||||
if not _column_exists(conn, "memories", "domain_tags"):
|
||||
conn.execute("ALTER TABLE memories ADD COLUMN domain_tags TEXT DEFAULT '[]'")
|
||||
if not _column_exists(conn, "memories", "valid_until"):
|
||||
conn.execute("ALTER TABLE memories ADD COLUMN valid_until DATETIME")
|
||||
conn.execute(
|
||||
"CREATE INDEX IF NOT EXISTS idx_memories_valid_until ON memories(valid_until)"
|
||||
)
|
||||
|
||||
# Phase 5 (Engineering V1): when a memory graduates to an entity, we
|
||||
# keep the memory row as an immutable historical pointer. The forward
|
||||
# pointer lets downstream code follow "what did this memory become?"
|
||||
# without having to join through source_refs.
|
||||
if not _column_exists(conn, "memories", "graduated_to_entity_id"):
|
||||
conn.execute("ALTER TABLE memories ADD COLUMN graduated_to_entity_id TEXT")
|
||||
conn.execute(
|
||||
"CREATE INDEX IF NOT EXISTS idx_memories_graduated ON memories(graduated_to_entity_id)"
|
||||
)
|
||||
|
||||
# Phase 4 (Robustness V1): append-only audit log for memory mutations.
|
||||
# Every create/update/promote/reject/supersede/invalidate/reinforce/expire/
|
||||
# auto_promote writes one row here. before/after are JSON snapshots of the
|
||||
# relevant fields. actor lets us distinguish auto-triage vs human-triage vs
|
||||
# api vs cron. This is the "how did this memory get to its current state"
|
||||
# trail — essential once the brain starts auto-organizing itself.
|
||||
conn.execute(
|
||||
"""
|
||||
CREATE TABLE IF NOT EXISTS memory_audit (
|
||||
id TEXT PRIMARY KEY,
|
||||
memory_id TEXT NOT NULL,
|
||||
action TEXT NOT NULL,
|
||||
actor TEXT DEFAULT 'api',
|
||||
before_json TEXT DEFAULT '{}',
|
||||
after_json TEXT DEFAULT '{}',
|
||||
note TEXT DEFAULT '',
|
||||
timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
|
||||
)
|
||||
"""
|
||||
)
|
||||
conn.execute("CREATE INDEX IF NOT EXISTS idx_memory_audit_memory ON memory_audit(memory_id)")
|
||||
conn.execute("CREATE INDEX IF NOT EXISTS idx_memory_audit_timestamp ON memory_audit(timestamp)")
|
||||
conn.execute("CREATE INDEX IF NOT EXISTS idx_memory_audit_action ON memory_audit(action)")
|
||||
|
||||
# Phase 5 (Engineering V1): entity_kind discriminator lets one audit
|
||||
# table serve both memories AND entities. Default "memory" keeps existing
|
||||
# rows correct; entity mutations write entity_kind="entity".
|
||||
if not _column_exists(conn, "memory_audit", "entity_kind"):
|
||||
conn.execute("ALTER TABLE memory_audit ADD COLUMN entity_kind TEXT DEFAULT 'memory'")
|
||||
conn.execute(
|
||||
"CREATE INDEX IF NOT EXISTS idx_memory_audit_entity_kind ON memory_audit(entity_kind)"
|
||||
)
|
||||
|
||||
# Phase 5: conflicts + conflict_members tables per conflict-model.md.
|
||||
# A conflict is "two or more active rows claiming the same slot with
|
||||
# incompatible values". slot_kind + slot_key identify the logical slot
|
||||
# (e.g., "component.material" for some component id). Members point
|
||||
# back to the conflicting rows (memory or entity) with layer trust so
|
||||
# resolution can pick the highest-trust winner.
|
||||
conn.execute(
|
||||
"""
|
||||
CREATE TABLE IF NOT EXISTS conflicts (
|
||||
id TEXT PRIMARY KEY,
|
||||
slot_kind TEXT NOT NULL,
|
||||
slot_key TEXT NOT NULL,
|
||||
project TEXT DEFAULT '',
|
||||
status TEXT DEFAULT 'open',
|
||||
resolution TEXT DEFAULT '',
|
||||
resolved_at DATETIME,
|
||||
detected_at DATETIME DEFAULT CURRENT_TIMESTAMP,
|
||||
note TEXT DEFAULT ''
|
||||
)
|
||||
"""
|
||||
)
|
||||
conn.execute(
|
||||
"""
|
||||
CREATE TABLE IF NOT EXISTS conflict_members (
|
||||
id TEXT PRIMARY KEY,
|
||||
conflict_id TEXT NOT NULL REFERENCES conflicts(id) ON DELETE CASCADE,
|
||||
member_kind TEXT NOT NULL,
|
||||
member_id TEXT NOT NULL,
|
||||
member_layer_trust INTEGER DEFAULT 0,
|
||||
value_snapshot TEXT DEFAULT ''
|
||||
)
|
||||
"""
|
||||
)
|
||||
conn.execute("CREATE INDEX IF NOT EXISTS idx_conflicts_status ON conflicts(status)")
|
||||
conn.execute("CREATE INDEX IF NOT EXISTS idx_conflicts_project ON conflicts(project)")
|
||||
conn.execute(
|
||||
"CREATE INDEX IF NOT EXISTS idx_conflicts_slot ON conflicts(slot_kind, slot_key)"
|
||||
)
|
||||
conn.execute(
|
||||
"CREATE INDEX IF NOT EXISTS idx_conflict_members_conflict ON conflict_members(conflict_id)"
|
||||
)
|
||||
conn.execute(
|
||||
"CREATE INDEX IF NOT EXISTS idx_conflict_members_member ON conflict_members(member_kind, member_id)"
|
||||
)
|
||||
|
||||
# Phase 9 Commit A: capture loop columns on the interactions table.
|
||||
# The original schema only carried prompt + project_id + a context_pack
|
||||
# JSON blob. To make interactions a real audit trail of what AtoCore fed
|
||||
@@ -146,6 +251,69 @@ def _apply_migrations(conn: sqlite3.Connection) -> None:
|
||||
"CREATE INDEX IF NOT EXISTS idx_interactions_created_at ON interactions(created_at)"
|
||||
)
|
||||
|
||||
# Phase 7A (Memory Consolidation — "sleep cycle"): merge candidates.
|
||||
# When the dedup detector finds a cluster of semantically similar active
|
||||
# memories within the same (project, memory_type) bucket, it drafts a
|
||||
# unified content via LLM and writes a proposal here. The triage UI
|
||||
# surfaces these for human approval. On approve, source memories become
|
||||
# status=superseded and a new merged memory is created.
|
||||
# memory_ids is a JSON array (length >= 2) of the source memory ids.
|
||||
# proposed_* hold the LLM's draft; a human can edit before approve.
|
||||
# result_memory_id is filled on approve with the new merged memory's id.
|
||||
conn.execute(
|
||||
"""
|
||||
CREATE TABLE IF NOT EXISTS memory_merge_candidates (
|
||||
id TEXT PRIMARY KEY,
|
||||
status TEXT DEFAULT 'pending',
|
||||
memory_ids TEXT NOT NULL,
|
||||
similarity REAL,
|
||||
proposed_content TEXT,
|
||||
proposed_memory_type TEXT,
|
||||
proposed_project TEXT,
|
||||
proposed_tags TEXT DEFAULT '[]',
|
||||
proposed_confidence REAL,
|
||||
reason TEXT DEFAULT '',
|
||||
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
|
||||
resolved_at DATETIME,
|
||||
resolved_by TEXT,
|
||||
result_memory_id TEXT
|
||||
)
|
||||
"""
|
||||
)
|
||||
conn.execute(
|
||||
"CREATE INDEX IF NOT EXISTS idx_mmc_status ON memory_merge_candidates(status)"
|
||||
)
|
||||
conn.execute(
|
||||
"CREATE INDEX IF NOT EXISTS idx_mmc_created_at ON memory_merge_candidates(created_at)"
|
||||
)
|
||||
|
||||
# Phase 7C (Memory Consolidation — tag canonicalization): alias → canonical
|
||||
# map for domain_tags. A weekly LLM pass proposes rows here; high-confidence
|
||||
# ones auto-apply (rewrite domain_tags across all memories), low-confidence
|
||||
# ones stay pending for human approval. Immutable history: resolved rows
|
||||
# keep status=approved/rejected; the same alias can re-appear with a new
|
||||
# id if the tag reaches a different canonical later.
|
||||
conn.execute(
|
||||
"""
|
||||
CREATE TABLE IF NOT EXISTS tag_aliases (
|
||||
id TEXT PRIMARY KEY,
|
||||
alias TEXT NOT NULL,
|
||||
canonical TEXT NOT NULL,
|
||||
status TEXT DEFAULT 'pending',
|
||||
confidence REAL DEFAULT 0.0,
|
||||
alias_count INTEGER DEFAULT 0,
|
||||
canonical_count INTEGER DEFAULT 0,
|
||||
reason TEXT DEFAULT '',
|
||||
applied_to_memories INTEGER DEFAULT 0,
|
||||
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
|
||||
resolved_at DATETIME,
|
||||
resolved_by TEXT
|
||||
)
|
||||
"""
|
||||
)
|
||||
conn.execute("CREATE INDEX IF NOT EXISTS idx_tag_aliases_status ON tag_aliases(status)")
|
||||
conn.execute("CREATE INDEX IF NOT EXISTS idx_tag_aliases_alias ON tag_aliases(alias)")
|
||||
|
||||
|
||||
def _column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
|
||||
rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
|
||||
|
||||
170
src/atocore/observability/alerts.py
Normal file
170
src/atocore/observability/alerts.py
Normal file
@@ -0,0 +1,170 @@
|
||||
"""Alert emission framework (Phase 4 Robustness V1).
|
||||
|
||||
One-stop helper to raise operational alerts from any AtoCore code
|
||||
path. An alert is a structured message about something the operator
|
||||
should see — harness regression, queue pileup, integrity drift,
|
||||
pipeline skipped, etc.
|
||||
|
||||
Emission fans out to multiple sinks so a single call touches every
|
||||
observability channel:
|
||||
|
||||
1. structlog logger (always)
|
||||
2. Append to ``$ATOCORE_ALERT_LOG`` (default ~/atocore-logs/alerts.log)
|
||||
3. Write the last alert of each severity to AtoCore project state
|
||||
(atocore/alert/last_{severity}) so the dashboard can surface it
|
||||
4. POST to ``$ATOCORE_ALERT_WEBHOOK`` if set (Discord/Slack/generic)
|
||||
|
||||
All sinks are fail-open — if one fails the others still fire.
|
||||
|
||||
Severity levels (inspired by syslog but simpler):
|
||||
- ``info`` operational event worth noting
|
||||
- ``warning`` degraded state, service still works
|
||||
- ``critical`` something is broken and needs attention
|
||||
|
||||
Environment variables:
|
||||
ATOCORE_ALERT_LOG override the alerts log file path
|
||||
ATOCORE_ALERT_WEBHOOK POST JSON alerts here (Discord webhook, etc.)
|
||||
ATOCORE_BASE_URL AtoCore API for project-state write (default localhost:8100)
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import os
|
||||
import threading
|
||||
import urllib.error
|
||||
import urllib.request
|
||||
from datetime import datetime, timezone
|
||||
from pathlib import Path
|
||||
|
||||
from atocore.observability.logger import get_logger
|
||||
|
||||
log = get_logger("alerts")
|
||||
|
||||
SEVERITIES = {"info", "warning", "critical"}
|
||||
|
||||
|
||||
def _default_alert_log() -> Path:
|
||||
explicit = os.environ.get("ATOCORE_ALERT_LOG")
|
||||
if explicit:
|
||||
return Path(explicit)
|
||||
return Path.home() / "atocore-logs" / "alerts.log"
|
||||
|
||||
|
||||
def _append_log(severity: str, title: str, message: str, context: dict | None) -> None:
|
||||
path = _default_alert_log()
|
||||
try:
|
||||
path.parent.mkdir(parents=True, exist_ok=True)
|
||||
ts = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
|
||||
line = f"[{ts}] [{severity.upper()}] {title}: {message}"
|
||||
if context:
|
||||
line += f" {json.dumps(context, ensure_ascii=True)[:500]}"
|
||||
line += "\n"
|
||||
with open(path, "a", encoding="utf-8") as f:
|
||||
f.write(line)
|
||||
except Exception as e:
|
||||
log.warning("alert_log_write_failed", error=str(e))
|
||||
|
||||
|
||||
def _write_state(severity: str, title: str, message: str, ts: str) -> None:
|
||||
"""Record the most-recent alert per severity into project_state.
|
||||
|
||||
Uses the internal ``set_state`` helper directly so we work even
|
||||
when the HTTP API isn't available (e.g. called from cron scripts
|
||||
that import atocore as a library).
|
||||
"""
|
||||
try:
|
||||
from atocore.context.project_state import set_state
|
||||
|
||||
set_state(
|
||||
project_name="atocore",
|
||||
category="alert",
|
||||
key=f"last_{severity}",
|
||||
value=json.dumps({"title": title, "message": message[:400], "timestamp": ts}),
|
||||
source="alert framework",
|
||||
)
|
||||
except Exception as e:
|
||||
log.warning("alert_state_write_failed", error=str(e))
|
||||
|
||||
|
||||
def _post_webhook(severity: str, title: str, message: str, context: dict | None, ts: str) -> None:
|
||||
url = os.environ.get("ATOCORE_ALERT_WEBHOOK")
|
||||
if not url:
|
||||
return
|
||||
|
||||
# Auto-detect Discord webhook shape for nicer formatting
|
||||
if "discord.com/api/webhooks" in url or "discordapp.com/api/webhooks" in url:
|
||||
emoji = {"info": ":information_source:", "warning": ":warning:", "critical": ":rotating_light:"}.get(severity, "")
|
||||
body = {
|
||||
"content": f"{emoji} **AtoCore {severity}**: {title}",
|
||||
"embeds": [{
|
||||
"description": message[:1800],
|
||||
"timestamp": ts,
|
||||
"fields": [
|
||||
{"name": k, "value": str(v)[:200], "inline": True}
|
||||
for k, v in (context or {}).items()
|
||||
][:10],
|
||||
}],
|
||||
}
|
||||
else:
|
||||
body = {
|
||||
"severity": severity,
|
||||
"title": title,
|
||||
"message": message,
|
||||
"context": context or {},
|
||||
"timestamp": ts,
|
||||
}
|
||||
|
||||
def _fire():
|
||||
try:
|
||||
req = urllib.request.Request(
|
||||
url,
|
||||
data=json.dumps(body).encode("utf-8"),
|
||||
method="POST",
|
||||
headers={"Content-Type": "application/json"},
|
||||
)
|
||||
urllib.request.urlopen(req, timeout=8)
|
||||
except Exception as e:
|
||||
log.warning("alert_webhook_failed", error=str(e))
|
||||
|
||||
threading.Thread(target=_fire, daemon=True).start()
|
||||
|
||||
|
||||
def emit_alert(
|
||||
severity: str,
|
||||
title: str,
|
||||
message: str,
|
||||
context: dict | None = None,
|
||||
) -> None:
|
||||
"""Emit an alert to all configured sinks.
|
||||
|
||||
Fail-open: any single sink failure is logged but does not prevent
|
||||
other sinks from firing.
|
||||
"""
|
||||
severity = (severity or "info").lower()
|
||||
if severity not in SEVERITIES:
|
||||
severity = "info"
|
||||
|
||||
ts = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
|
||||
|
||||
# Sink 1: structlog — always
|
||||
logger_fn = {
|
||||
"info": log.info,
|
||||
"warning": log.warning,
|
||||
"critical": log.error,
|
||||
}[severity]
|
||||
logger_fn("alert", title=title, message=message[:500], **(context or {}))
|
||||
|
||||
# Sinks 2-4: fail-open, each wrapped
|
||||
try:
|
||||
_append_log(severity, title, message, context)
|
||||
except Exception:
|
||||
pass
|
||||
try:
|
||||
_write_state(severity, title, message, ts)
|
||||
except Exception:
|
||||
pass
|
||||
try:
|
||||
_post_webhook(severity, title, message, context, ts)
|
||||
except Exception:
|
||||
pass
|
||||
58
tests/test_alerts.py
Normal file
58
tests/test_alerts.py
Normal file
@@ -0,0 +1,58 @@
|
||||
"""Tests for the Phase 4 alerts framework."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
import tempfile
|
||||
from pathlib import Path
|
||||
|
||||
import pytest
|
||||
|
||||
import atocore.config as _config
|
||||
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def isolated_env(monkeypatch):
|
||||
"""Isolate alerts sinks per test."""
|
||||
tmpdir = tempfile.mkdtemp()
|
||||
log_file = Path(tmpdir) / "alerts.log"
|
||||
monkeypatch.setenv("ATOCORE_ALERT_LOG", str(log_file))
|
||||
monkeypatch.delenv("ATOCORE_ALERT_WEBHOOK", raising=False)
|
||||
|
||||
# Data dir for any state writes
|
||||
monkeypatch.setenv("ATOCORE_DATA_DIR", tmpdir)
|
||||
_config.settings = _config.Settings()
|
||||
|
||||
from atocore.models.database import init_db
|
||||
init_db()
|
||||
|
||||
yield {"tmpdir": tmpdir, "log_file": log_file}
|
||||
|
||||
|
||||
def test_emit_alert_writes_log_file(isolated_env):
|
||||
from atocore.observability.alerts import emit_alert
|
||||
|
||||
emit_alert("warning", "test title", "test message body", context={"count": 5})
|
||||
|
||||
content = isolated_env["log_file"].read_text(encoding="utf-8")
|
||||
assert "test title" in content
|
||||
assert "test message body" in content
|
||||
assert "WARNING" in content
|
||||
assert '"count": 5' in content
|
||||
|
||||
|
||||
def test_emit_alert_invalid_severity_falls_back_to_info(isolated_env):
|
||||
from atocore.observability.alerts import emit_alert
|
||||
|
||||
emit_alert("made-up-severity", "t", "m")
|
||||
content = isolated_env["log_file"].read_text(encoding="utf-8")
|
||||
assert "INFO" in content
|
||||
|
||||
|
||||
def test_emit_alert_fails_open_on_log_write_error(monkeypatch, isolated_env):
|
||||
"""An unwritable log path should not crash the emit."""
|
||||
from atocore.observability.alerts import emit_alert
|
||||
|
||||
monkeypatch.setenv("ATOCORE_ALERT_LOG", "/nonexistent/path/that/definitely/is/not/writable/alerts.log")
|
||||
# Must not raise
|
||||
emit_alert("info", "t", "m")
|
||||
251
tests/test_confidence_decay.py
Normal file
251
tests/test_confidence_decay.py
Normal file
@@ -0,0 +1,251 @@
|
||||
"""Phase 7D — confidence decay tests.
|
||||
|
||||
Covers:
|
||||
- idle unreferenced memories decay at the expected rate
|
||||
- fresh / reinforced memories are untouched
|
||||
- below floor → auto-supersede with audit
|
||||
- graduated memories exempt
|
||||
- reinforcement reverses decay (integration with Phase 9 Commit B)
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from datetime import datetime, timedelta, timezone
|
||||
|
||||
import pytest
|
||||
|
||||
from atocore.memory.service import (
|
||||
create_memory,
|
||||
decay_unreferenced_memories,
|
||||
get_memory_audit,
|
||||
reinforce_memory,
|
||||
)
|
||||
from atocore.models.database import get_connection, init_db
|
||||
|
||||
|
||||
def _force_old(mem_id: str, days_ago: int) -> None:
|
||||
"""Force last_referenced_at and created_at to N days in the past."""
|
||||
ts = (datetime.now(timezone.utc) - timedelta(days=days_ago)).strftime("%Y-%m-%d %H:%M:%S")
|
||||
with get_connection() as conn:
|
||||
conn.execute(
|
||||
"UPDATE memories SET last_referenced_at = ?, created_at = ? WHERE id = ?",
|
||||
(ts, ts, mem_id),
|
||||
)
|
||||
|
||||
|
||||
def _set_confidence(mem_id: str, c: float) -> None:
|
||||
with get_connection() as conn:
|
||||
conn.execute("UPDATE memories SET confidence = ? WHERE id = ?", (c, mem_id))
|
||||
|
||||
|
||||
def _set_reference_count(mem_id: str, n: int) -> None:
|
||||
with get_connection() as conn:
|
||||
conn.execute("UPDATE memories SET reference_count = ? WHERE id = ?", (n, mem_id))
|
||||
|
||||
|
||||
def _get(mem_id: str) -> dict:
|
||||
with get_connection() as conn:
|
||||
row = conn.execute("SELECT * FROM memories WHERE id = ?", (mem_id,)).fetchone()
|
||||
return dict(row) if row else {}
|
||||
|
||||
|
||||
def _set_status(mem_id: str, status: str) -> None:
|
||||
with get_connection() as conn:
|
||||
conn.execute("UPDATE memories SET status = ? WHERE id = ?", (status, mem_id))
|
||||
|
||||
|
||||
# --- Basic decay mechanics ---
|
||||
|
||||
|
||||
def test_decay_applies_to_idle_unreferenced(tmp_data_dir):
|
||||
init_db()
|
||||
m = create_memory("knowledge", "cold fact", confidence=0.8)
|
||||
_force_old(m.id, days_ago=60)
|
||||
_set_reference_count(m.id, 0)
|
||||
|
||||
result = decay_unreferenced_memories()
|
||||
assert len(result["decayed"]) == 1
|
||||
assert result["decayed"][0]["memory_id"] == m.id
|
||||
|
||||
row = _get(m.id)
|
||||
# 0.8 * 0.97 = 0.776
|
||||
assert row["confidence"] == pytest.approx(0.776)
|
||||
assert row["status"] == "active" # still above floor
|
||||
|
||||
|
||||
def test_decay_skips_fresh_memory(tmp_data_dir):
|
||||
"""A memory created today shouldn't decay even if reference_count=0."""
|
||||
init_db()
|
||||
m = create_memory("knowledge", "just-created fact", confidence=0.8)
|
||||
# Don't force old — it's fresh
|
||||
result = decay_unreferenced_memories()
|
||||
assert not any(e["memory_id"] == m.id for e in result["decayed"])
|
||||
assert not any(e["memory_id"] == m.id for e in result["superseded"])
|
||||
|
||||
row = _get(m.id)
|
||||
assert row["confidence"] == pytest.approx(0.8)
|
||||
|
||||
|
||||
def test_decay_skips_reinforced_memory(tmp_data_dir):
|
||||
"""Any reinforcement protects the memory from decay."""
|
||||
init_db()
|
||||
m = create_memory("knowledge", "referenced fact", confidence=0.8)
|
||||
_force_old(m.id, days_ago=90)
|
||||
_set_reference_count(m.id, 1) # just one reference is enough
|
||||
|
||||
result = decay_unreferenced_memories()
|
||||
assert not any(e["memory_id"] == m.id for e in result["decayed"])
|
||||
|
||||
row = _get(m.id)
|
||||
assert row["confidence"] == pytest.approx(0.8)
|
||||
|
||||
|
||||
# --- Auto-supersede at floor ---
|
||||
|
||||
|
||||
def test_decay_supersedes_below_floor(tmp_data_dir):
|
||||
init_db()
|
||||
m = create_memory("knowledge", "very cold fact", confidence=0.31)
|
||||
_force_old(m.id, days_ago=60)
|
||||
_set_reference_count(m.id, 0)
|
||||
|
||||
# 0.31 * 0.97 = 0.3007 which is still above the default floor 0.30.
|
||||
# Drop it a hair lower to cross the floor in one step.
|
||||
_set_confidence(m.id, 0.305)
|
||||
|
||||
result = decay_unreferenced_memories(supersede_confidence_floor=0.30)
|
||||
# 0.305 * 0.97 = 0.29585 → below 0.30, supersede
|
||||
assert len(result["superseded"]) == 1
|
||||
assert result["superseded"][0]["memory_id"] == m.id
|
||||
|
||||
row = _get(m.id)
|
||||
assert row["status"] == "superseded"
|
||||
assert row["confidence"] < 0.30
|
||||
|
||||
|
||||
def test_supersede_writes_audit_row(tmp_data_dir):
|
||||
init_db()
|
||||
m = create_memory("knowledge", "will decay out", confidence=0.305)
|
||||
_force_old(m.id, days_ago=60)
|
||||
_set_reference_count(m.id, 0)
|
||||
|
||||
decay_unreferenced_memories(supersede_confidence_floor=0.30)
|
||||
|
||||
audit = get_memory_audit(m.id)
|
||||
actions = [a["action"] for a in audit]
|
||||
assert "superseded" in actions
|
||||
entry = next(a for a in audit if a["action"] == "superseded")
|
||||
assert entry["actor"] == "confidence-decay"
|
||||
assert "decayed below floor" in entry["note"]
|
||||
|
||||
|
||||
# --- Exemptions ---
|
||||
|
||||
|
||||
def test_decay_skips_graduated_memory(tmp_data_dir):
|
||||
"""Graduated memories are frozen pointers to entities — never decay."""
|
||||
init_db()
|
||||
m = create_memory("knowledge", "graduated fact", confidence=0.8)
|
||||
_force_old(m.id, days_ago=90)
|
||||
_set_reference_count(m.id, 0)
|
||||
_set_status(m.id, "graduated")
|
||||
|
||||
result = decay_unreferenced_memories()
|
||||
assert not any(e["memory_id"] == m.id for e in result["decayed"])
|
||||
|
||||
row = _get(m.id)
|
||||
assert row["confidence"] == pytest.approx(0.8) # unchanged
|
||||
|
||||
|
||||
def test_decay_skips_superseded_memory(tmp_data_dir):
|
||||
"""Already superseded memories don't decay further."""
|
||||
init_db()
|
||||
m = create_memory("knowledge", "old news", confidence=0.5)
|
||||
_force_old(m.id, days_ago=90)
|
||||
_set_reference_count(m.id, 0)
|
||||
_set_status(m.id, "superseded")
|
||||
|
||||
result = decay_unreferenced_memories()
|
||||
assert not any(e["memory_id"] == m.id for e in result["decayed"])
|
||||
|
||||
|
||||
# --- Reversibility ---
|
||||
|
||||
|
||||
def test_reinforcement_reverses_decay(tmp_data_dir):
|
||||
"""A memory that decayed then got reinforced comes back up."""
|
||||
init_db()
|
||||
m = create_memory("knowledge", "will come back", confidence=0.8)
|
||||
_force_old(m.id, days_ago=60)
|
||||
_set_reference_count(m.id, 0)
|
||||
|
||||
decay_unreferenced_memories()
|
||||
# Now at 0.776
|
||||
reinforce_memory(m.id, confidence_delta=0.05)
|
||||
row = _get(m.id)
|
||||
assert row["confidence"] == pytest.approx(0.826)
|
||||
assert row["reference_count"] >= 1
|
||||
|
||||
|
||||
def test_reinforced_memory_no_longer_decays(tmp_data_dir):
|
||||
"""Once reinforce_memory bumps reference_count, decay skips it."""
|
||||
init_db()
|
||||
m = create_memory("knowledge", "protected", confidence=0.8)
|
||||
_force_old(m.id, days_ago=90)
|
||||
# Simulate reinforcement
|
||||
reinforce_memory(m.id)
|
||||
|
||||
result = decay_unreferenced_memories()
|
||||
assert not any(e["memory_id"] == m.id for e in result["decayed"])
|
||||
|
||||
|
||||
# --- Parameter validation ---
|
||||
|
||||
|
||||
def test_decay_rejects_invalid_factor(tmp_data_dir):
|
||||
init_db()
|
||||
with pytest.raises(ValueError):
|
||||
decay_unreferenced_memories(daily_decay_factor=1.0)
|
||||
with pytest.raises(ValueError):
|
||||
decay_unreferenced_memories(daily_decay_factor=0.0)
|
||||
with pytest.raises(ValueError):
|
||||
decay_unreferenced_memories(daily_decay_factor=-0.5)
|
||||
|
||||
|
||||
def test_decay_rejects_invalid_floor(tmp_data_dir):
|
||||
init_db()
|
||||
with pytest.raises(ValueError):
|
||||
decay_unreferenced_memories(supersede_confidence_floor=1.5)
|
||||
with pytest.raises(ValueError):
|
||||
decay_unreferenced_memories(supersede_confidence_floor=-0.1)
|
||||
|
||||
|
||||
# --- Threshold tuning ---
|
||||
|
||||
|
||||
def test_decay_threshold_tight_excludes_newer(tmp_data_dir):
|
||||
"""With idle_days_threshold=90, a 60-day-old memory should NOT decay."""
|
||||
init_db()
|
||||
m = create_memory("knowledge", "60-day-old", confidence=0.8)
|
||||
_force_old(m.id, days_ago=60)
|
||||
_set_reference_count(m.id, 0)
|
||||
|
||||
result = decay_unreferenced_memories(idle_days_threshold=90)
|
||||
assert not any(e["memory_id"] == m.id for e in result["decayed"])
|
||||
|
||||
|
||||
# --- Idempotency-ish (multiple runs apply additional decay) ---
|
||||
|
||||
|
||||
def test_decay_stacks_across_runs(tmp_data_dir):
|
||||
"""Running decay twice (simulating two days) compounds the factor."""
|
||||
init_db()
|
||||
m = create_memory("knowledge", "aging fact", confidence=0.8)
|
||||
_force_old(m.id, days_ago=60)
|
||||
_set_reference_count(m.id, 0)
|
||||
|
||||
decay_unreferenced_memories()
|
||||
decay_unreferenced_memories()
|
||||
row = _get(m.id)
|
||||
# 0.8 * 0.97 * 0.97 = 0.75272
|
||||
assert row["confidence"] == pytest.approx(0.75272, rel=1e-4)
|
||||
@@ -116,3 +116,108 @@ def test_entity_name_search(tmp_data_dir):
|
||||
|
||||
results = get_entities(name_contains="Support")
|
||||
assert len(results) == 2
|
||||
|
||||
|
||||
# --- Phase 5: Entity promote/reject lifecycle + audit + canonicalization ---
|
||||
|
||||
|
||||
def test_entity_project_canonicalization(tmp_data_dir):
|
||||
"""Aliases resolve to canonical project_id on write (Phase 5)."""
|
||||
init_db()
|
||||
init_engineering_schema()
|
||||
# "p04" is a registered alias for p04-gigabit
|
||||
e = create_entity("component", "Test Component", project="p04")
|
||||
assert e.project == "p04-gigabit"
|
||||
|
||||
|
||||
def test_promote_entity_candidate_to_active(tmp_data_dir):
|
||||
from atocore.engineering.service import promote_entity, get_entity
|
||||
|
||||
init_db()
|
||||
init_engineering_schema()
|
||||
e = create_entity("requirement", "CTE tolerance", status="candidate")
|
||||
assert e.status == "candidate"
|
||||
|
||||
assert promote_entity(e.id, actor="test-triage")
|
||||
e2 = get_entity(e.id)
|
||||
assert e2.status == "active"
|
||||
|
||||
|
||||
def test_reject_entity_candidate(tmp_data_dir):
|
||||
from atocore.engineering.service import reject_entity_candidate, get_entity
|
||||
|
||||
init_db()
|
||||
init_engineering_schema()
|
||||
e = create_entity("decision", "pick vendor Y", status="candidate")
|
||||
|
||||
assert reject_entity_candidate(e.id, actor="test-triage", note="duplicate")
|
||||
e2 = get_entity(e.id)
|
||||
assert e2.status == "invalid"
|
||||
|
||||
|
||||
def test_promote_active_entity_noop(tmp_data_dir):
|
||||
from atocore.engineering.service import promote_entity
|
||||
|
||||
init_db()
|
||||
init_engineering_schema()
|
||||
e = create_entity("component", "Already Active") # default status=active
|
||||
assert not promote_entity(e.id) # only candidates can promote
|
||||
|
||||
|
||||
def test_entity_audit_log_captures_lifecycle(tmp_data_dir):
|
||||
from atocore.engineering.service import (
|
||||
promote_entity,
|
||||
get_entity_audit,
|
||||
)
|
||||
|
||||
init_db()
|
||||
init_engineering_schema()
|
||||
e = create_entity("requirement", "test req", status="candidate", actor="test")
|
||||
promote_entity(e.id, actor="test-triage", note="looks good")
|
||||
|
||||
audit = get_entity_audit(e.id)
|
||||
actions = [a["action"] for a in audit]
|
||||
assert "created" in actions
|
||||
assert "promoted" in actions
|
||||
|
||||
promote_entry = next(a for a in audit if a["action"] == "promoted")
|
||||
assert promote_entry["actor"] == "test-triage"
|
||||
assert promote_entry["note"] == "looks good"
|
||||
assert promote_entry["before"]["status"] == "candidate"
|
||||
assert promote_entry["after"]["status"] == "active"
|
||||
|
||||
|
||||
def test_new_relationship_types_available(tmp_data_dir):
|
||||
"""Phase 5 added 6 missing relationship types."""
|
||||
for rel in ["based_on_assumption", "supports", "conflicts_with",
|
||||
"updated_by_session", "evidenced_by", "summarized_in"]:
|
||||
assert rel in RELATIONSHIP_TYPES, f"{rel} missing from RELATIONSHIP_TYPES"
|
||||
|
||||
|
||||
def test_conflicts_tables_exist(tmp_data_dir):
|
||||
"""Phase 5 conflict-model tables."""
|
||||
from atocore.models.database import get_connection
|
||||
|
||||
init_db()
|
||||
with get_connection() as conn:
|
||||
tables = {r[0] for r in conn.execute(
|
||||
"SELECT name FROM sqlite_master WHERE type='table'"
|
||||
).fetchall()}
|
||||
assert "conflicts" in tables
|
||||
assert "conflict_members" in tables
|
||||
|
||||
|
||||
def test_memory_audit_has_entity_kind(tmp_data_dir):
|
||||
"""Phase 5 added entity_kind discriminator."""
|
||||
from atocore.models.database import get_connection
|
||||
|
||||
init_db()
|
||||
with get_connection() as conn:
|
||||
cols = {r["name"] for r in conn.execute("PRAGMA table_info(memory_audit)").fetchall()}
|
||||
assert "entity_kind" in cols
|
||||
|
||||
|
||||
def test_graduated_status_accepted(tmp_data_dir):
|
||||
"""Phase 5 added 'graduated' memory status for memory→entity transitions."""
|
||||
from atocore.memory.service import MEMORY_STATUSES
|
||||
assert "graduated" in MEMORY_STATUSES
|
||||
|
||||
212
tests/test_engineering_queries.py
Normal file
212
tests/test_engineering_queries.py
Normal file
@@ -0,0 +1,212 @@
|
||||
"""Phase 5 tests — the 10 canonical engineering queries.
|
||||
|
||||
Test fixtures seed a small p-test graph and exercise each query. The 3 killer
|
||||
queries (Q-006/009/011) get dedicated tests that verify they surface real gaps
|
||||
and DON'T false-positive on well-formed data.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import pytest
|
||||
|
||||
from atocore.engineering.queries import (
|
||||
all_gaps,
|
||||
decisions_affecting,
|
||||
evidence_chain,
|
||||
impact_analysis,
|
||||
orphan_requirements,
|
||||
recent_changes,
|
||||
requirements_for,
|
||||
risky_decisions,
|
||||
system_map,
|
||||
unsupported_claims,
|
||||
)
|
||||
from atocore.engineering.service import (
|
||||
create_entity,
|
||||
create_relationship,
|
||||
init_engineering_schema,
|
||||
)
|
||||
from atocore.models.database import init_db
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def seeded_graph(tmp_data_dir):
|
||||
"""Build a small engineering graph for query tests."""
|
||||
init_db()
|
||||
init_engineering_schema()
|
||||
|
||||
# Subsystem + components
|
||||
ss = create_entity("subsystem", "Optics", project="p-test")
|
||||
c1 = create_entity("component", "Primary Mirror", project="p-test")
|
||||
c2 = create_entity("component", "Diverger Lens", project="p-test")
|
||||
c_orphan = create_entity("component", "Unparented", project="p-test")
|
||||
create_relationship(c1.id, ss.id, "part_of")
|
||||
create_relationship(c2.id, ss.id, "part_of")
|
||||
|
||||
# Requirements — one satisfied, one orphan
|
||||
r_ok = create_entity("requirement", "Surface figure < 25nm RMS", project="p-test")
|
||||
r_orphan = create_entity("requirement", "Measurement lambda/20", project="p-test")
|
||||
create_relationship(c1.id, r_ok.id, "satisfies")
|
||||
|
||||
# Decisions
|
||||
d_ok = create_entity("decision", "Use Zerodur blank", project="p-test")
|
||||
d_risky = create_entity("decision", "Use external CGH", project="p-test")
|
||||
create_relationship(d_ok.id, ss.id, "affected_by_decision")
|
||||
|
||||
# Assumption (flagged) — d_risky depends on it
|
||||
a_flagged = create_entity(
|
||||
"parameter", "Vendor lead time 6 weeks",
|
||||
project="p-test",
|
||||
properties={"flagged": True},
|
||||
)
|
||||
create_relationship(d_risky.id, a_flagged.id, "based_on_assumption")
|
||||
|
||||
# Validation claim — one supported, one not
|
||||
v_ok = create_entity("validation_claim", "Margin is adequate", project="p-test")
|
||||
v_orphan = create_entity("validation_claim", "Thermal stability OK", project="p-test")
|
||||
result = create_entity("result", "FEA thermal sweep 2026-03", project="p-test")
|
||||
create_relationship(result.id, v_ok.id, "supports")
|
||||
|
||||
# Material
|
||||
mat = create_entity("material", "Zerodur", project="p-test")
|
||||
create_relationship(c1.id, mat.id, "uses_material")
|
||||
|
||||
return {
|
||||
"subsystem": ss, "component_1": c1, "component_2": c2,
|
||||
"orphan_component": c_orphan,
|
||||
"req_ok": r_ok, "req_orphan": r_orphan,
|
||||
"decision_ok": d_ok, "decision_risky": d_risky,
|
||||
"assumption_flagged": a_flagged,
|
||||
"claim_supported": v_ok, "claim_orphan": v_orphan,
|
||||
"result": result, "material": mat,
|
||||
}
|
||||
|
||||
|
||||
# --- Structure queries ---
|
||||
|
||||
|
||||
def test_system_map_returns_subsystem_with_components(seeded_graph):
|
||||
result = system_map("p-test")
|
||||
assert result["project"] == "p-test"
|
||||
assert len(result["subsystems"]) == 1
|
||||
optics = result["subsystems"][0]
|
||||
assert optics["name"] == "Optics"
|
||||
comp_names = {c["name"] for c in optics["components"]}
|
||||
assert "Primary Mirror" in comp_names
|
||||
assert "Diverger Lens" in comp_names
|
||||
|
||||
|
||||
def test_system_map_reports_orphan_components(seeded_graph):
|
||||
result = system_map("p-test")
|
||||
names = {c["name"] for c in result["orphan_components"]}
|
||||
assert "Unparented" in names
|
||||
|
||||
|
||||
def test_system_map_includes_materials(seeded_graph):
|
||||
result = system_map("p-test")
|
||||
primary = next(
|
||||
c for s in result["subsystems"] for c in s["components"] if c["name"] == "Primary Mirror"
|
||||
)
|
||||
assert "Zerodur" in primary["materials"]
|
||||
|
||||
|
||||
def test_decisions_affecting_whole_project(seeded_graph):
|
||||
result = decisions_affecting("p-test")
|
||||
names = {d["name"] for d in result["decisions"]}
|
||||
assert "Use Zerodur blank" in names
|
||||
assert "Use external CGH" in names
|
||||
|
||||
|
||||
def test_decisions_affecting_specific_subsystem(seeded_graph):
|
||||
ss_id = seeded_graph["subsystem"].id
|
||||
result = decisions_affecting("p-test", subsystem_id=ss_id)
|
||||
names = {d["name"] for d in result["decisions"]}
|
||||
# d_ok has edge to subsystem directly
|
||||
assert "Use Zerodur blank" in names
|
||||
|
||||
|
||||
def test_requirements_for_component(seeded_graph):
|
||||
c_id = seeded_graph["component_1"].id
|
||||
result = requirements_for(c_id)
|
||||
assert result["count"] == 1
|
||||
assert result["requirements"][0]["name"] == "Surface figure < 25nm RMS"
|
||||
|
||||
|
||||
def test_recent_changes_includes_created_entities(seeded_graph):
|
||||
result = recent_changes("p-test", limit=100)
|
||||
actions = [c["action"] for c in result["changes"]]
|
||||
assert "created" in actions
|
||||
assert result["count"] > 0
|
||||
|
||||
|
||||
# --- Killer queries ---
|
||||
|
||||
|
||||
def test_orphan_requirements_finds_unsatisfied(seeded_graph):
|
||||
result = orphan_requirements("p-test")
|
||||
names = {r["name"] for r in result["gaps"]}
|
||||
assert "Measurement lambda/20" in names # orphan
|
||||
assert "Surface figure < 25nm RMS" not in names # has SATISFIES edge
|
||||
|
||||
|
||||
def test_orphan_requirements_empty_when_all_satisfied(tmp_data_dir):
|
||||
init_db()
|
||||
init_engineering_schema()
|
||||
c = create_entity("component", "C", project="p-clean")
|
||||
r = create_entity("requirement", "R", project="p-clean")
|
||||
create_relationship(c.id, r.id, "satisfies")
|
||||
result = orphan_requirements("p-clean")
|
||||
assert result["count"] == 0
|
||||
|
||||
|
||||
def test_risky_decisions_finds_flagged_assumptions(seeded_graph):
|
||||
result = risky_decisions("p-test")
|
||||
names = {d["decision_name"] for d in result["gaps"]}
|
||||
assert "Use external CGH" in names
|
||||
assert "Use Zerodur blank" not in names # has no flagged assumption
|
||||
|
||||
|
||||
def test_unsupported_claims_finds_orphan_claims(seeded_graph):
|
||||
result = unsupported_claims("p-test")
|
||||
names = {c["name"] for c in result["gaps"]}
|
||||
assert "Thermal stability OK" in names
|
||||
assert "Margin is adequate" not in names # has SUPPORTS edge
|
||||
|
||||
|
||||
def test_all_gaps_combines_the_three_killers(seeded_graph):
|
||||
result = all_gaps("p-test")
|
||||
assert result["orphan_requirements"]["count"] == 1
|
||||
assert result["risky_decisions"]["count"] == 1
|
||||
assert result["unsupported_claims"]["count"] == 1
|
||||
|
||||
|
||||
def test_all_gaps_clean_project_reports_zero(tmp_data_dir):
|
||||
init_db()
|
||||
init_engineering_schema()
|
||||
create_entity("component", "alone", project="p-empty")
|
||||
result = all_gaps("p-empty")
|
||||
assert result["orphan_requirements"]["count"] == 0
|
||||
assert result["risky_decisions"]["count"] == 0
|
||||
assert result["unsupported_claims"]["count"] == 0
|
||||
|
||||
|
||||
# --- Impact + evidence ---
|
||||
|
||||
|
||||
def test_impact_analysis_walks_outbound_edges(seeded_graph):
|
||||
c_id = seeded_graph["component_1"].id
|
||||
result = impact_analysis(c_id, max_depth=2)
|
||||
# Primary Mirror → SATISFIES → Requirement, → USES_MATERIAL → Material
|
||||
rel_types = {i["relationship"] for i in result["impacted"]}
|
||||
assert "satisfies" in rel_types
|
||||
assert "uses_material" in rel_types
|
||||
|
||||
|
||||
def test_evidence_chain_walks_inbound_provenance(seeded_graph):
|
||||
v_ok_id = seeded_graph["claim_supported"].id
|
||||
result = evidence_chain(v_ok_id)
|
||||
# The Result entity supports the claim
|
||||
via_types = {e["via"] for e in result["evidence_chain"]}
|
||||
assert "supports" in via_types
|
||||
source_names = {e["source_name"] for e in result["evidence_chain"]}
|
||||
assert "FEA thermal sweep 2026-03" in source_names
|
||||
246
tests/test_engineering_v1_phase5.py
Normal file
246
tests/test_engineering_v1_phase5.py
Normal file
@@ -0,0 +1,246 @@
|
||||
"""Phase 5F + 5G + 5H tests — graduation, conflicts, MCP tools."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import pytest
|
||||
|
||||
from atocore.engineering.conflicts import (
|
||||
detect_conflicts_for_entity,
|
||||
list_open_conflicts,
|
||||
resolve_conflict,
|
||||
)
|
||||
from atocore.engineering._graduation_prompt import (
|
||||
build_user_message,
|
||||
parse_graduation_output,
|
||||
)
|
||||
from atocore.engineering.service import (
|
||||
create_entity,
|
||||
create_relationship,
|
||||
get_entity,
|
||||
init_engineering_schema,
|
||||
promote_entity,
|
||||
)
|
||||
from atocore.memory.service import create_memory
|
||||
from atocore.models.database import get_connection, init_db
|
||||
|
||||
|
||||
# --- 5F Memory graduation ---
|
||||
|
||||
|
||||
def test_graduation_prompt_parses_positive_decision():
|
||||
raw = """
|
||||
{"graduate": true, "entity_type": "component", "name": "Primary Mirror",
|
||||
"description": "The 1.2m primary mirror for p04", "confidence": 0.85,
|
||||
"relationships": [{"rel_type": "part_of", "target_hint": "Optics Subsystem"}]}
|
||||
"""
|
||||
decision = parse_graduation_output(raw)
|
||||
assert decision is not None
|
||||
assert decision["graduate"] is True
|
||||
assert decision["entity_type"] == "component"
|
||||
assert decision["name"] == "Primary Mirror"
|
||||
assert decision["confidence"] == 0.85
|
||||
assert decision["relationships"] == [
|
||||
{"rel_type": "part_of", "target_hint": "Optics Subsystem"}
|
||||
]
|
||||
|
||||
|
||||
def test_graduation_prompt_parses_negative_decision():
|
||||
raw = '{"graduate": false, "reason": "conversational filler, no typed entity"}'
|
||||
decision = parse_graduation_output(raw)
|
||||
assert decision is not None
|
||||
assert decision["graduate"] is False
|
||||
assert "filler" in decision["reason"]
|
||||
|
||||
|
||||
def test_graduation_prompt_rejects_unknown_entity_type():
|
||||
raw = '{"graduate": true, "entity_type": "quantum_thing", "name": "x"}'
|
||||
assert parse_graduation_output(raw) is None
|
||||
|
||||
|
||||
def test_graduation_prompt_tolerates_markdown_fences():
|
||||
raw = '```json\n{"graduate": false, "reason": "ok"}\n```'
|
||||
d = parse_graduation_output(raw)
|
||||
assert d is not None
|
||||
assert d["graduate"] is False
|
||||
|
||||
|
||||
def test_promote_entity_marks_source_memory_graduated(tmp_data_dir):
|
||||
init_db()
|
||||
init_engineering_schema()
|
||||
mem = create_memory("knowledge", "The Primary Mirror is 1.2m Zerodur",
|
||||
project="p-test", status="active")
|
||||
# Create entity candidate pointing back to the memory
|
||||
ent = create_entity(
|
||||
"component",
|
||||
"Primary Mirror",
|
||||
project="p-test",
|
||||
status="candidate",
|
||||
source_refs=[f"memory:{mem.id}"],
|
||||
)
|
||||
# Promote
|
||||
assert promote_entity(ent.id, actor="test-triage")
|
||||
|
||||
# Memory should now be graduated with forward pointer
|
||||
with get_connection() as conn:
|
||||
row = conn.execute(
|
||||
"SELECT status, graduated_to_entity_id FROM memories WHERE id = ?",
|
||||
(mem.id,),
|
||||
).fetchone()
|
||||
assert row["status"] == "graduated"
|
||||
assert row["graduated_to_entity_id"] == ent.id
|
||||
|
||||
|
||||
def test_promote_entity_without_memory_refs_no_graduation(tmp_data_dir):
|
||||
"""Entity not backed by any memory — promote still works, no graduation."""
|
||||
init_db()
|
||||
init_engineering_schema()
|
||||
ent = create_entity("component", "Orphan", project="p-test", status="candidate")
|
||||
assert promote_entity(ent.id)
|
||||
assert get_entity(ent.id).status == "active"
|
||||
|
||||
|
||||
# --- 5G Conflict detection ---
|
||||
|
||||
|
||||
def test_component_material_conflict_detected(tmp_data_dir):
|
||||
init_db()
|
||||
init_engineering_schema()
|
||||
c = create_entity("component", "Mirror", project="p-test")
|
||||
m1 = create_entity("material", "Zerodur", project="p-test")
|
||||
m2 = create_entity("material", "ULE", project="p-test")
|
||||
create_relationship(c.id, m1.id, "uses_material")
|
||||
create_relationship(c.id, m2.id, "uses_material")
|
||||
|
||||
detected = detect_conflicts_for_entity(c.id)
|
||||
assert len(detected) == 1
|
||||
|
||||
conflicts = list_open_conflicts(project="p-test")
|
||||
assert any(c["slot_kind"] == "component.material" for c in conflicts)
|
||||
conflict = next(c for c in conflicts if c["slot_kind"] == "component.material")
|
||||
assert len(conflict["members"]) == 2
|
||||
|
||||
|
||||
def test_component_part_of_conflict_detected(tmp_data_dir):
|
||||
init_db()
|
||||
init_engineering_schema()
|
||||
c = create_entity("component", "MultiPart", project="p-test")
|
||||
s1 = create_entity("subsystem", "Mechanical", project="p-test")
|
||||
s2 = create_entity("subsystem", "Optical", project="p-test")
|
||||
create_relationship(c.id, s1.id, "part_of")
|
||||
create_relationship(c.id, s2.id, "part_of")
|
||||
|
||||
detected = detect_conflicts_for_entity(c.id)
|
||||
assert len(detected) == 1
|
||||
conflicts = list_open_conflicts(project="p-test")
|
||||
assert any(c["slot_kind"] == "component.part_of" for c in conflicts)
|
||||
|
||||
|
||||
def test_requirement_name_conflict_detected(tmp_data_dir):
|
||||
init_db()
|
||||
init_engineering_schema()
|
||||
r1 = create_entity("requirement", "Surface figure < 25nm",
|
||||
project="p-test", description="Primary mirror spec")
|
||||
r2 = create_entity("requirement", "Surface figure < 25nm",
|
||||
project="p-test", description="Different interpretation")
|
||||
|
||||
detected = detect_conflicts_for_entity(r2.id)
|
||||
assert len(detected) == 1
|
||||
conflicts = list_open_conflicts(project="p-test")
|
||||
assert any(c["slot_kind"] == "requirement.name" for c in conflicts)
|
||||
|
||||
|
||||
def test_conflict_not_detected_for_clean_component(tmp_data_dir):
|
||||
init_db()
|
||||
init_engineering_schema()
|
||||
c = create_entity("component", "Clean", project="p-test")
|
||||
m = create_entity("material", "Zerodur", project="p-test")
|
||||
create_relationship(c.id, m.id, "uses_material")
|
||||
|
||||
detected = detect_conflicts_for_entity(c.id)
|
||||
assert detected == []
|
||||
|
||||
|
||||
def test_conflict_resolution_supersedes_losers(tmp_data_dir):
|
||||
init_db()
|
||||
init_engineering_schema()
|
||||
c = create_entity("component", "Mirror2", project="p-test")
|
||||
m1 = create_entity("material", "Zerodur2", project="p-test")
|
||||
m2 = create_entity("material", "ULE2", project="p-test")
|
||||
create_relationship(c.id, m1.id, "uses_material")
|
||||
create_relationship(c.id, m2.id, "uses_material")
|
||||
|
||||
detected = detect_conflicts_for_entity(c.id)
|
||||
conflict_id = detected[0]
|
||||
|
||||
# Resolve by picking m1 as the winner
|
||||
assert resolve_conflict(conflict_id, "supersede_others", winner_id=m1.id)
|
||||
|
||||
# m2 should now be superseded; m1 stays active
|
||||
assert get_entity(m1.id).status == "active"
|
||||
assert get_entity(m2.id).status == "superseded"
|
||||
|
||||
# Conflict should be marked resolved
|
||||
open_conflicts = list_open_conflicts(project="p-test")
|
||||
assert not any(c["id"] == conflict_id for c in open_conflicts)
|
||||
|
||||
|
||||
def test_conflict_resolution_dismiss_leaves_entities_alone(tmp_data_dir):
|
||||
init_db()
|
||||
init_engineering_schema()
|
||||
r1 = create_entity("requirement", "Dup req", project="p-test",
|
||||
description="first meaning")
|
||||
r2 = create_entity("requirement", "Dup req", project="p-test",
|
||||
description="second meaning")
|
||||
detected = detect_conflicts_for_entity(r2.id)
|
||||
conflict_id = detected[0]
|
||||
|
||||
assert resolve_conflict(conflict_id, "dismiss")
|
||||
# Both still active — dismiss just clears the conflict marker
|
||||
assert get_entity(r1.id).status == "active"
|
||||
assert get_entity(r2.id).status == "active"
|
||||
|
||||
|
||||
def test_deduplicate_conflicts_for_same_slot(tmp_data_dir):
|
||||
"""Running detection twice on the same entity shouldn't dup the conflict row."""
|
||||
init_db()
|
||||
init_engineering_schema()
|
||||
c = create_entity("component", "Dup", project="p-test")
|
||||
m1 = create_entity("material", "A", project="p-test")
|
||||
m2 = create_entity("material", "B", project="p-test")
|
||||
create_relationship(c.id, m1.id, "uses_material")
|
||||
create_relationship(c.id, m2.id, "uses_material")
|
||||
|
||||
detect_conflicts_for_entity(c.id)
|
||||
detect_conflicts_for_entity(c.id) # should be a no-op
|
||||
|
||||
conflicts = list_open_conflicts(project="p-test")
|
||||
mat_conflicts = [c for c in conflicts if c["slot_kind"] == "component.material"]
|
||||
assert len(mat_conflicts) == 1
|
||||
|
||||
|
||||
def test_promote_triggers_conflict_detection(tmp_data_dir):
|
||||
"""End-to-end: promoting a candidate component with 2 active material edges
|
||||
triggers conflict detection."""
|
||||
init_db()
|
||||
init_engineering_schema()
|
||||
|
||||
c = create_entity("component", "AutoFlag", project="p-test", status="candidate")
|
||||
m1 = create_entity("material", "X1", project="p-test")
|
||||
m2 = create_entity("material", "X2", project="p-test")
|
||||
create_relationship(c.id, m1.id, "uses_material")
|
||||
create_relationship(c.id, m2.id, "uses_material")
|
||||
|
||||
promote_entity(c.id, actor="test")
|
||||
|
||||
conflicts = list_open_conflicts(project="p-test")
|
||||
assert any(c["slot_kind"] == "component.material" for c in conflicts)
|
||||
|
||||
|
||||
# --- 5H MCP tool shape checks (via build_user_message) ---
|
||||
|
||||
|
||||
def test_graduation_user_message_includes_project_and_type():
|
||||
msg = build_user_message("some content", "p04-gigabit", "project")
|
||||
assert "p04-gigabit" in msg
|
||||
assert "project" in msg
|
||||
assert "some content" in msg
|
||||
198
tests/test_inject_context_hook.py
Normal file
198
tests/test_inject_context_hook.py
Normal file
@@ -0,0 +1,198 @@
|
||||
"""Tests for deploy/hooks/inject_context.py — Claude Code UserPromptSubmit hook.
|
||||
|
||||
These are process-level tests: we run the actual script with subprocess,
|
||||
feed it stdin, and check the exit code + stdout shape. The hook must:
|
||||
- always exit 0 (never block a user prompt)
|
||||
- emit valid hookSpecificOutput JSON on success
|
||||
- fail open (empty output) on network errors, bad stdin, kill-switch
|
||||
- respect the short-prompt filter
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import os
|
||||
import subprocess
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
import pytest
|
||||
|
||||
HOOK = Path(__file__).resolve().parent.parent / "deploy" / "hooks" / "inject_context.py"
|
||||
|
||||
|
||||
def _run_hook(stdin_json: dict | str, env_overrides: dict | None = None, timeout: float = 10) -> tuple[int, str, str]:
|
||||
env = os.environ.copy()
|
||||
# Force kill switch off unless the test overrides
|
||||
env.pop("ATOCORE_CONTEXT_DISABLED", None)
|
||||
if env_overrides:
|
||||
env.update(env_overrides)
|
||||
stdin = stdin_json if isinstance(stdin_json, str) else json.dumps(stdin_json)
|
||||
proc = subprocess.run(
|
||||
[sys.executable, str(HOOK)],
|
||||
input=stdin, text=True,
|
||||
capture_output=True, timeout=timeout,
|
||||
env=env,
|
||||
)
|
||||
return proc.returncode, proc.stdout, proc.stderr
|
||||
|
||||
|
||||
def test_hook_exit_0_on_success_or_failure():
|
||||
"""Canonical contract: the hook never blocks a prompt. Even with a
|
||||
bogus URL we must exit 0 with empty stdout (fail-open)."""
|
||||
code, stdout, stderr = _run_hook(
|
||||
{
|
||||
"prompt": "What's the p04-gigabit current status?",
|
||||
"cwd": "/tmp",
|
||||
"session_id": "t",
|
||||
"hook_event_name": "UserPromptSubmit",
|
||||
},
|
||||
env_overrides={"ATOCORE_URL": "http://127.0.0.1:1", # unreachable
|
||||
"ATOCORE_CONTEXT_TIMEOUT": "1"},
|
||||
)
|
||||
assert code == 0
|
||||
# stdout is empty (fail-open) — no hookSpecificOutput emitted
|
||||
assert stdout.strip() == ""
|
||||
assert "atocore unreachable" in stderr or "request failed" in stderr
|
||||
|
||||
|
||||
def test_hook_kill_switch():
|
||||
code, stdout, stderr = _run_hook(
|
||||
{"prompt": "hello world is this a thing", "cwd": "", "session_id": "t"},
|
||||
env_overrides={"ATOCORE_CONTEXT_DISABLED": "1"},
|
||||
)
|
||||
assert code == 0
|
||||
assert stdout.strip() == ""
|
||||
|
||||
|
||||
def test_hook_ignores_short_prompt():
|
||||
code, stdout, _ = _run_hook(
|
||||
{"prompt": "ok", "cwd": "", "session_id": "t"},
|
||||
env_overrides={"ATOCORE_URL": "http://127.0.0.1:1"},
|
||||
)
|
||||
assert code == 0
|
||||
# No network call attempted; empty output
|
||||
assert stdout.strip() == ""
|
||||
|
||||
|
||||
def test_hook_ignores_xml_prompt():
|
||||
"""System/meta prompts starting with '<' should be skipped."""
|
||||
code, stdout, _ = _run_hook(
|
||||
{"prompt": "<system>do something</system>", "cwd": "", "session_id": "t"},
|
||||
env_overrides={"ATOCORE_URL": "http://127.0.0.1:1"},
|
||||
)
|
||||
assert code == 0
|
||||
assert stdout.strip() == ""
|
||||
|
||||
|
||||
def test_hook_handles_bad_stdin():
|
||||
code, stdout, stderr = _run_hook("not-json-at-all")
|
||||
assert code == 0
|
||||
assert stdout.strip() == ""
|
||||
assert "bad stdin" in stderr
|
||||
|
||||
|
||||
def test_hook_handles_empty_stdin():
|
||||
code, stdout, _ = _run_hook("")
|
||||
assert code == 0
|
||||
assert stdout.strip() == ""
|
||||
|
||||
|
||||
def test_hook_success_shape_with_mock_server(monkeypatch, tmp_path):
|
||||
"""When the API returns a pack, the hook emits valid
|
||||
hookSpecificOutput JSON wrapping it."""
|
||||
# Start a tiny HTTP server on localhost that returns a fake pack
|
||||
import http.server
|
||||
import json as _json
|
||||
import threading
|
||||
|
||||
pack = "Trusted State: foo=bar"
|
||||
|
||||
class Handler(http.server.BaseHTTPRequestHandler):
|
||||
def do_POST(self): # noqa: N802
|
||||
self.rfile.read(int(self.headers.get("Content-Length", 0)))
|
||||
body = _json.dumps({"formatted_context": pack}).encode()
|
||||
self.send_response(200)
|
||||
self.send_header("Content-Type", "application/json")
|
||||
self.send_header("Content-Length", str(len(body)))
|
||||
self.end_headers()
|
||||
self.wfile.write(body)
|
||||
|
||||
def log_message(self, *a, **kw):
|
||||
pass
|
||||
|
||||
server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
|
||||
port = server.server_address[1]
|
||||
t = threading.Thread(target=server.serve_forever, daemon=True)
|
||||
t.start()
|
||||
try:
|
||||
code, stdout, stderr = _run_hook(
|
||||
{
|
||||
"prompt": "What do we know about p04?",
|
||||
"cwd": "",
|
||||
"session_id": "t",
|
||||
"hook_event_name": "UserPromptSubmit",
|
||||
},
|
||||
env_overrides={
|
||||
"ATOCORE_URL": f"http://127.0.0.1:{port}",
|
||||
"ATOCORE_CONTEXT_TIMEOUT": "5",
|
||||
},
|
||||
timeout=15,
|
||||
)
|
||||
finally:
|
||||
server.shutdown()
|
||||
|
||||
assert code == 0, stderr
|
||||
assert stdout.strip(), "expected JSON output with context"
|
||||
out = json.loads(stdout)
|
||||
hso = out.get("hookSpecificOutput", {})
|
||||
assert hso.get("hookEventName") == "UserPromptSubmit"
|
||||
assert pack in hso.get("additionalContext", "")
|
||||
assert "AtoCore-injected context" in hso.get("additionalContext", "")
|
||||
|
||||
|
||||
def test_hook_project_inference_from_cwd(monkeypatch):
|
||||
"""The hook should map a known cwd to a project slug and send it in
|
||||
the /context/build payload."""
|
||||
import http.server
|
||||
import json as _json
|
||||
import threading
|
||||
|
||||
captured_body: dict = {}
|
||||
|
||||
class Handler(http.server.BaseHTTPRequestHandler):
|
||||
def do_POST(self): # noqa: N802
|
||||
n = int(self.headers.get("Content-Length", 0))
|
||||
body = self.rfile.read(n)
|
||||
captured_body.update(_json.loads(body.decode()))
|
||||
out = _json.dumps({"formatted_context": "ok"}).encode()
|
||||
self.send_response(200)
|
||||
self.send_header("Content-Length", str(len(out)))
|
||||
self.end_headers()
|
||||
self.wfile.write(out)
|
||||
|
||||
def log_message(self, *a, **kw):
|
||||
pass
|
||||
|
||||
server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
|
||||
port = server.server_address[1]
|
||||
t = threading.Thread(target=server.serve_forever, daemon=True)
|
||||
t.start()
|
||||
try:
|
||||
_run_hook(
|
||||
{
|
||||
"prompt": "Is this being tested properly",
|
||||
"cwd": "C:\\Users\\antoi\\ATOCore",
|
||||
"session_id": "t",
|
||||
},
|
||||
env_overrides={
|
||||
"ATOCORE_URL": f"http://127.0.0.1:{port}",
|
||||
"ATOCORE_CONTEXT_TIMEOUT": "5",
|
||||
},
|
||||
)
|
||||
finally:
|
||||
server.shutdown()
|
||||
|
||||
# Hook should have inferred project="atocore" from the ATOCore cwd
|
||||
assert captured_body.get("project") == "atocore"
|
||||
assert captured_body.get("prompt", "").startswith("Is this being tested")
|
||||
@@ -264,6 +264,170 @@ def test_expire_stale_candidates(isolated_db):
|
||||
assert mem["status"] == "invalid"
|
||||
|
||||
|
||||
# --- Phase 4: memory_audit log ---
|
||||
|
||||
|
||||
def test_audit_create_logs_entry(isolated_db):
|
||||
from atocore.memory.service import create_memory, get_memory_audit
|
||||
|
||||
mem = create_memory("knowledge", "test content for audit", actor="test-harness")
|
||||
audit = get_memory_audit(mem.id)
|
||||
assert len(audit) >= 1
|
||||
latest = audit[0]
|
||||
assert latest["action"] == "created"
|
||||
assert latest["actor"] == "test-harness"
|
||||
assert latest["after"]["content"] == "test content for audit"
|
||||
|
||||
|
||||
def test_audit_promote_logs_entry(isolated_db):
|
||||
from atocore.memory.service import create_memory, get_memory_audit, promote_memory
|
||||
|
||||
mem = create_memory("knowledge", "candidate for promote", status="candidate")
|
||||
promote_memory(mem.id, actor="test-triage")
|
||||
audit = get_memory_audit(mem.id)
|
||||
actions = [a["action"] for a in audit]
|
||||
assert "promoted" in actions
|
||||
promote_entry = next(a for a in audit if a["action"] == "promoted")
|
||||
assert promote_entry["actor"] == "test-triage"
|
||||
assert promote_entry["before"]["status"] == "candidate"
|
||||
assert promote_entry["after"]["status"] == "active"
|
||||
|
||||
|
||||
def test_audit_reject_logs_entry(isolated_db):
|
||||
from atocore.memory.service import create_memory, get_memory_audit, reject_candidate_memory
|
||||
|
||||
mem = create_memory("knowledge", "candidate for reject", status="candidate")
|
||||
reject_candidate_memory(mem.id, actor="test-triage", note="stale")
|
||||
audit = get_memory_audit(mem.id)
|
||||
actions = [a["action"] for a in audit]
|
||||
assert "rejected" in actions
|
||||
reject_entry = next(a for a in audit if a["action"] == "rejected")
|
||||
assert reject_entry["note"] == "stale"
|
||||
|
||||
|
||||
def test_audit_update_captures_before_after(isolated_db):
|
||||
from atocore.memory.service import create_memory, get_memory_audit, update_memory
|
||||
|
||||
mem = create_memory("knowledge", "original content", confidence=0.5)
|
||||
update_memory(mem.id, content="updated content", confidence=0.9, actor="human-edit")
|
||||
audit = get_memory_audit(mem.id)
|
||||
update_entries = [a for a in audit if a["action"] == "updated"]
|
||||
assert len(update_entries) >= 1
|
||||
u = update_entries[0]
|
||||
assert u["before"]["content"] == "original content"
|
||||
assert u["after"]["content"] == "updated content"
|
||||
assert u["before"]["confidence"] == 0.5
|
||||
assert u["after"]["confidence"] == 0.9
|
||||
|
||||
|
||||
def test_audit_reinforce_logs_entry(isolated_db):
|
||||
from atocore.memory.service import create_memory, get_memory_audit, reinforce_memory
|
||||
|
||||
mem = create_memory("knowledge", "reinforced mem", confidence=0.5)
|
||||
reinforce_memory(mem.id, confidence_delta=0.02)
|
||||
audit = get_memory_audit(mem.id)
|
||||
actions = [a["action"] for a in audit]
|
||||
assert "reinforced" in actions
|
||||
|
||||
|
||||
def test_recent_audit_returns_cross_memory_entries(isolated_db):
|
||||
from atocore.memory.service import create_memory, get_recent_audit
|
||||
|
||||
m1 = create_memory("knowledge", "mem one content", actor="harness")
|
||||
m2 = create_memory("knowledge", "mem two content", actor="harness")
|
||||
recent = get_recent_audit(limit=10)
|
||||
ids = {e["memory_id"] for e in recent}
|
||||
assert m1.id in ids and m2.id in ids
|
||||
|
||||
|
||||
# --- Phase 3: domain_tags + valid_until ---
|
||||
|
||||
|
||||
def test_create_memory_with_tags_and_valid_until(isolated_db):
|
||||
from atocore.memory.service import create_memory
|
||||
|
||||
mem = create_memory(
|
||||
"knowledge",
|
||||
"CTE gradient dominates WFE at F/1.2",
|
||||
domain_tags=["optics", "thermal", "materials"],
|
||||
valid_until="2027-01-01",
|
||||
)
|
||||
assert mem.domain_tags == ["optics", "thermal", "materials"]
|
||||
assert mem.valid_until == "2027-01-01"
|
||||
|
||||
|
||||
def test_create_memory_normalizes_tags(isolated_db):
|
||||
from atocore.memory.service import create_memory
|
||||
|
||||
mem = create_memory(
|
||||
"knowledge",
|
||||
"some content here",
|
||||
domain_tags=[" Optics ", "OPTICS", "Thermal", ""],
|
||||
)
|
||||
# Duplicates and empty removed; lowercased; stripped
|
||||
assert mem.domain_tags == ["optics", "thermal"]
|
||||
|
||||
|
||||
def test_update_memory_sets_tags_and_valid_until(isolated_db):
|
||||
from atocore.memory.service import create_memory, update_memory
|
||||
from atocore.models.database import get_connection
|
||||
|
||||
mem = create_memory("knowledge", "some content for update test")
|
||||
assert update_memory(
|
||||
mem.id,
|
||||
domain_tags=["controls", "firmware"],
|
||||
valid_until="2026-12-31",
|
||||
)
|
||||
with get_connection() as conn:
|
||||
row = conn.execute("SELECT domain_tags, valid_until FROM memories WHERE id = ?", (mem.id,)).fetchone()
|
||||
import json as _json
|
||||
assert _json.loads(row["domain_tags"]) == ["controls", "firmware"]
|
||||
assert row["valid_until"] == "2026-12-31"
|
||||
|
||||
|
||||
def test_get_memories_for_context_excludes_expired(isolated_db):
|
||||
"""Expired active memories must not land in context packs."""
|
||||
from atocore.memory.service import create_memory, get_memories_for_context
|
||||
|
||||
# Active but expired
|
||||
create_memory(
|
||||
"knowledge",
|
||||
"stale snapshot from long ago period",
|
||||
valid_until="2020-01-01",
|
||||
confidence=1.0,
|
||||
)
|
||||
# Active and valid
|
||||
create_memory(
|
||||
"knowledge",
|
||||
"durable engineering insight stays valid forever",
|
||||
confidence=1.0,
|
||||
)
|
||||
|
||||
text, _ = get_memories_for_context(memory_types=["knowledge"], budget=600)
|
||||
assert "durable engineering" in text
|
||||
assert "stale snapshot" not in text
|
||||
|
||||
|
||||
def test_context_builder_tag_boost_orders_results(isolated_db):
|
||||
"""Memories with tags matching query should rank higher."""
|
||||
from atocore.memory.service import create_memory, get_memories_for_context
|
||||
|
||||
create_memory("knowledge", "generic content has no obvious overlap with topic", confidence=0.8, domain_tags=[])
|
||||
create_memory("knowledge", "generic content has no obvious overlap topic here", confidence=0.8, domain_tags=["optics"])
|
||||
|
||||
text, _ = get_memories_for_context(
|
||||
memory_types=["knowledge"],
|
||||
budget=2000,
|
||||
query="tell me about optics",
|
||||
)
|
||||
# Tagged memory should appear before the untagged one
|
||||
idx_tagged = text.find("overlap topic here")
|
||||
idx_untagged = text.find("overlap with topic")
|
||||
assert idx_tagged != -1
|
||||
assert idx_untagged != -1
|
||||
assert idx_tagged < idx_untagged
|
||||
|
||||
|
||||
def test_expire_stale_candidates_keeps_reinforced(isolated_db):
|
||||
from atocore.memory.service import create_memory, expire_stale_candidates
|
||||
from atocore.models.database import get_connection
|
||||
|
||||
496
tests/test_memory_dedup.py
Normal file
496
tests/test_memory_dedup.py
Normal file
@@ -0,0 +1,496 @@
|
||||
"""Phase 7A — memory consolidation tests.
|
||||
|
||||
Covers:
|
||||
- similarity helpers (cosine bounds, matrix symmetry, clustering)
|
||||
- _dedup_prompt parser / normalizer robustness
|
||||
- create_merge_candidate idempotency
|
||||
- get_merge_candidates inlines source memories
|
||||
- merge_memories end-to-end happy path (sources → superseded,
|
||||
new merged memory active, audit rows, result_memory_id)
|
||||
- reject_merge_candidate leaves sources untouched
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import pytest
|
||||
|
||||
from atocore.memory._dedup_prompt import (
|
||||
TIER2_SYSTEM_PROMPT,
|
||||
build_tier2_user_message,
|
||||
normalize_merge_verdict,
|
||||
parse_merge_verdict,
|
||||
)
|
||||
from atocore.memory.service import (
|
||||
create_memory,
|
||||
create_merge_candidate,
|
||||
get_memory_audit,
|
||||
get_merge_candidates,
|
||||
merge_memories,
|
||||
reject_merge_candidate,
|
||||
)
|
||||
from atocore.memory.similarity import (
|
||||
cluster_by_threshold,
|
||||
cosine,
|
||||
compute_memory_similarity,
|
||||
similarity_matrix,
|
||||
)
|
||||
from atocore.models.database import get_connection, init_db
|
||||
|
||||
|
||||
# --- Similarity helpers ---
|
||||
|
||||
|
||||
def test_cosine_bounds():
|
||||
assert cosine([1.0, 0.0], [1.0, 0.0]) == pytest.approx(1.0)
|
||||
assert cosine([1.0, 0.0], [0.0, 1.0]) == pytest.approx(0.0)
|
||||
# Negative dot product clamped to 0
|
||||
assert cosine([1.0, 0.0], [-1.0, 0.0]) == 0.0
|
||||
|
||||
|
||||
def test_compute_memory_similarity_identical_high():
|
||||
s = compute_memory_similarity("the sky is blue", "the sky is blue")
|
||||
assert 0.99 <= s <= 1.0
|
||||
|
||||
|
||||
def test_compute_memory_similarity_unrelated_low():
|
||||
s = compute_memory_similarity(
|
||||
"APM integrates with NX via a Python bridge",
|
||||
"the polisher firmware must use USB SSD not SD card",
|
||||
)
|
||||
assert 0.0 <= s < 0.7
|
||||
|
||||
|
||||
def test_similarity_matrix_symmetric():
|
||||
texts = ["alpha beta gamma", "alpha beta gamma", "completely unrelated text"]
|
||||
m = similarity_matrix(texts)
|
||||
assert len(m) == 3 and all(len(r) == 3 for r in m)
|
||||
for i in range(3):
|
||||
assert m[i][i] == pytest.approx(1.0)
|
||||
for i in range(3):
|
||||
for j in range(3):
|
||||
assert m[i][j] == pytest.approx(m[j][i])
|
||||
|
||||
|
||||
def test_cluster_by_threshold_transitive():
|
||||
# Three near-paraphrases should land in one cluster
|
||||
texts = [
|
||||
"Antoine prefers OAuth over API keys",
|
||||
"Antoine's preference is OAuth, not API keys",
|
||||
"the polisher firmware uses USB SSD storage",
|
||||
]
|
||||
clusters = cluster_by_threshold(texts, threshold=0.7)
|
||||
# At least one cluster of size 2+ containing the paraphrases
|
||||
big = [c for c in clusters if len(c) >= 2]
|
||||
assert big, f"expected at least one multi-member cluster, got {clusters}"
|
||||
assert 0 in big[0] and 1 in big[0]
|
||||
|
||||
|
||||
# --- Prompt parser robustness ---
|
||||
|
||||
|
||||
def test_parse_merge_verdict_strips_fences():
|
||||
raw = "```json\n{\"action\":\"merge\",\"content\":\"x\"}\n```"
|
||||
parsed = parse_merge_verdict(raw)
|
||||
assert parsed == {"action": "merge", "content": "x"}
|
||||
|
||||
|
||||
def test_parse_merge_verdict_handles_prose_prefix():
|
||||
raw = "Sure! Here's the result:\n{\"action\":\"reject\",\"content\":\"no\"}"
|
||||
parsed = parse_merge_verdict(raw)
|
||||
assert parsed is not None
|
||||
assert parsed["action"] == "reject"
|
||||
|
||||
|
||||
def test_normalize_merge_verdict_fills_defaults():
|
||||
v = normalize_merge_verdict({
|
||||
"action": "merge",
|
||||
"content": "unified text",
|
||||
})
|
||||
assert v is not None
|
||||
assert v["memory_type"] == "knowledge"
|
||||
assert v["project"] == ""
|
||||
assert v["domain_tags"] == []
|
||||
assert v["confidence"] == 0.5
|
||||
|
||||
|
||||
def test_normalize_merge_verdict_rejects_empty_content():
|
||||
assert normalize_merge_verdict({"action": "merge", "content": ""}) is None
|
||||
|
||||
|
||||
def test_normalize_merge_verdict_rejects_unknown_action():
|
||||
assert normalize_merge_verdict({"action": "?", "content": "x"}) is None
|
||||
|
||||
|
||||
# --- Tier-2 (Phase 7A.1) ---
|
||||
|
||||
|
||||
def test_tier2_prompt_is_stricter():
|
||||
# The tier-2 system prompt must explicitly instruct the model to be
|
||||
# stricter than tier-1 — that's the whole point of escalation.
|
||||
assert "STRICTER" in TIER2_SYSTEM_PROMPT
|
||||
assert "REJECT" in TIER2_SYSTEM_PROMPT
|
||||
|
||||
|
||||
def test_build_tier2_user_message_includes_tier1_draft():
|
||||
sources = [{
|
||||
"id": "abc12345", "content": "source text A",
|
||||
"memory_type": "knowledge", "project": "p04",
|
||||
"domain_tags": ["optics"], "confidence": 0.6,
|
||||
"valid_until": "", "reference_count": 2,
|
||||
}, {
|
||||
"id": "def67890", "content": "source text B",
|
||||
"memory_type": "knowledge", "project": "p04",
|
||||
"domain_tags": ["optics"], "confidence": 0.7,
|
||||
"valid_until": "", "reference_count": 1,
|
||||
}]
|
||||
tier1 = {
|
||||
"action": "merge",
|
||||
"content": "unified draft by tier1",
|
||||
"memory_type": "knowledge",
|
||||
"project": "p04",
|
||||
"domain_tags": ["optics"],
|
||||
"confidence": 0.65,
|
||||
"reason": "near-paraphrase",
|
||||
}
|
||||
msg = build_tier2_user_message(sources, tier1)
|
||||
assert "source text A" in msg
|
||||
assert "source text B" in msg
|
||||
assert "TIER-1 DRAFT" in msg
|
||||
assert "unified draft by tier1" in msg
|
||||
assert "near-paraphrase" in msg
|
||||
# Should end asking for a verdict
|
||||
assert "verdict" in msg.lower()
|
||||
|
||||
|
||||
# --- Tiering helpers (min_pairwise_similarity, same_bucket) ---
|
||||
|
||||
|
||||
def test_same_bucket_true_for_matching():
|
||||
import importlib.util
|
||||
spec = importlib.util.spec_from_file_location(
|
||||
"memory_dedup_for_test",
|
||||
"scripts/memory_dedup.py",
|
||||
)
|
||||
mod = importlib.util.module_from_spec(spec)
|
||||
spec.loader.exec_module(mod)
|
||||
|
||||
sources = [
|
||||
{"memory_type": "knowledge", "project": "p04"},
|
||||
{"memory_type": "knowledge", "project": "p04"},
|
||||
]
|
||||
assert mod.same_bucket(sources) is True
|
||||
|
||||
|
||||
def test_same_bucket_false_for_mixed():
|
||||
import importlib.util
|
||||
spec = importlib.util.spec_from_file_location(
|
||||
"memory_dedup_for_test",
|
||||
"scripts/memory_dedup.py",
|
||||
)
|
||||
mod = importlib.util.module_from_spec(spec)
|
||||
spec.loader.exec_module(mod)
|
||||
|
||||
# Different project
|
||||
assert mod.same_bucket([
|
||||
{"memory_type": "knowledge", "project": "p04"},
|
||||
{"memory_type": "knowledge", "project": "p05"},
|
||||
]) is False
|
||||
# Different memory_type
|
||||
assert mod.same_bucket([
|
||||
{"memory_type": "knowledge", "project": "p04"},
|
||||
{"memory_type": "project", "project": "p04"},
|
||||
]) is False
|
||||
|
||||
|
||||
def test_min_pairwise_similarity_identical_texts():
|
||||
import importlib.util
|
||||
spec = importlib.util.spec_from_file_location(
|
||||
"memory_dedup_for_test",
|
||||
"scripts/memory_dedup.py",
|
||||
)
|
||||
mod = importlib.util.module_from_spec(spec)
|
||||
spec.loader.exec_module(mod)
|
||||
|
||||
# Three identical texts — min should be ~1.0
|
||||
ms = mod.min_pairwise_similarity(["hello world"] * 3)
|
||||
assert 0.99 <= ms <= 1.0
|
||||
|
||||
|
||||
def test_min_pairwise_similarity_mixed_cluster():
|
||||
"""Transitive cluster A~B~C with A and C actually quite different
|
||||
should expose a low min even though A~B and B~C are high."""
|
||||
import importlib.util
|
||||
spec = importlib.util.spec_from_file_location(
|
||||
"memory_dedup_for_test",
|
||||
"scripts/memory_dedup.py",
|
||||
)
|
||||
mod = importlib.util.module_from_spec(spec)
|
||||
spec.loader.exec_module(mod)
|
||||
|
||||
ms = mod.min_pairwise_similarity([
|
||||
"Antoine prefers OAuth over API keys",
|
||||
"Antoine's OAuth preference",
|
||||
"USB SSD mandatory for polisher firmware",
|
||||
])
|
||||
assert ms < 0.6 # Third is unrelated; min is far below threshold
|
||||
|
||||
|
||||
# --- create_merge_candidate idempotency ---
|
||||
|
||||
|
||||
def test_create_merge_candidate_inserts_row(tmp_data_dir):
|
||||
init_db()
|
||||
m1 = create_memory("knowledge", "APM uses NX for DXF conversion")
|
||||
m2 = create_memory("knowledge", "APM uses NX for DXF-to-STL")
|
||||
|
||||
cid = create_merge_candidate(
|
||||
memory_ids=[m1.id, m2.id],
|
||||
similarity=0.92,
|
||||
proposed_content="APM uses NX for DXF→STL conversion",
|
||||
proposed_memory_type="knowledge",
|
||||
proposed_project="",
|
||||
proposed_tags=["apm", "nx"],
|
||||
proposed_confidence=0.6,
|
||||
reason="near-paraphrase",
|
||||
)
|
||||
assert cid is not None
|
||||
|
||||
pending = get_merge_candidates(status="pending")
|
||||
assert len(pending) == 1
|
||||
assert pending[0]["id"] == cid
|
||||
assert pending[0]["similarity"] == pytest.approx(0.92)
|
||||
assert len(pending[0]["sources"]) == 2
|
||||
|
||||
|
||||
def test_create_merge_candidate_idempotent(tmp_data_dir):
|
||||
init_db()
|
||||
m1 = create_memory("knowledge", "Fact A")
|
||||
m2 = create_memory("knowledge", "Fact A slightly reworded")
|
||||
|
||||
first = create_merge_candidate(
|
||||
memory_ids=[m1.id, m2.id],
|
||||
similarity=0.9,
|
||||
proposed_content="merged",
|
||||
proposed_memory_type="knowledge",
|
||||
proposed_project="",
|
||||
)
|
||||
# Same id set, different order → dedupe skips
|
||||
second = create_merge_candidate(
|
||||
memory_ids=[m2.id, m1.id],
|
||||
similarity=0.9,
|
||||
proposed_content="merged (again)",
|
||||
proposed_memory_type="knowledge",
|
||||
proposed_project="",
|
||||
)
|
||||
assert first is not None
|
||||
assert second is None
|
||||
|
||||
|
||||
def test_create_merge_candidate_requires_two_ids(tmp_data_dir):
|
||||
init_db()
|
||||
m1 = create_memory("knowledge", "lonely")
|
||||
with pytest.raises(ValueError):
|
||||
create_merge_candidate(
|
||||
memory_ids=[m1.id],
|
||||
similarity=1.0,
|
||||
proposed_content="x",
|
||||
proposed_memory_type="knowledge",
|
||||
proposed_project="",
|
||||
)
|
||||
|
||||
|
||||
# --- merge_memories end-to-end ---
|
||||
|
||||
|
||||
def test_merge_memories_happy_path(tmp_data_dir):
|
||||
init_db()
|
||||
m1 = create_memory(
|
||||
"knowledge", "APM uses NX for DXF conversion",
|
||||
project="apm", confidence=0.6, domain_tags=["apm", "nx"],
|
||||
)
|
||||
m2 = create_memory(
|
||||
"knowledge", "APM does DXF to STL via NX bridge",
|
||||
project="apm", confidence=0.8, domain_tags=["apm", "bridge"],
|
||||
)
|
||||
# Bump reference counts so sum is meaningful
|
||||
with get_connection() as conn:
|
||||
conn.execute("UPDATE memories SET reference_count = 3 WHERE id = ?", (m1.id,))
|
||||
conn.execute("UPDATE memories SET reference_count = 5 WHERE id = ?", (m2.id,))
|
||||
|
||||
cid = create_merge_candidate(
|
||||
memory_ids=[m1.id, m2.id],
|
||||
similarity=0.92,
|
||||
proposed_content="APM uses NX bridge for DXF→STL conversion",
|
||||
proposed_memory_type="knowledge",
|
||||
proposed_project="apm",
|
||||
proposed_tags=["apm", "nx", "bridge"],
|
||||
proposed_confidence=0.7,
|
||||
reason="duplicates",
|
||||
)
|
||||
new_id = merge_memories(candidate_id=cid, actor="human-triage")
|
||||
assert new_id is not None
|
||||
|
||||
# Sources superseded
|
||||
with get_connection() as conn:
|
||||
s1 = conn.execute("SELECT status FROM memories WHERE id = ?", (m1.id,)).fetchone()
|
||||
s2 = conn.execute("SELECT status FROM memories WHERE id = ?", (m2.id,)).fetchone()
|
||||
merged = conn.execute(
|
||||
"SELECT content, status, confidence, reference_count, project "
|
||||
"FROM memories WHERE id = ?", (new_id,)
|
||||
).fetchone()
|
||||
cand = conn.execute(
|
||||
"SELECT status, result_memory_id FROM memory_merge_candidates WHERE id = ?",
|
||||
(cid,),
|
||||
).fetchone()
|
||||
assert s1["status"] == "superseded"
|
||||
assert s2["status"] == "superseded"
|
||||
assert merged["status"] == "active"
|
||||
assert merged["project"] == "apm"
|
||||
# confidence = max of sources (0.8), not the proposed 0.7 (proposed is hint;
|
||||
# merge_memories picks max of actual source confidences — verify).
|
||||
assert merged["confidence"] == pytest.approx(0.8)
|
||||
# reference_count = sum (3 + 5 = 8)
|
||||
assert int(merged["reference_count"]) == 8
|
||||
assert cand["status"] == "approved"
|
||||
assert cand["result_memory_id"] == new_id
|
||||
|
||||
|
||||
def test_merge_memories_content_override(tmp_data_dir):
|
||||
init_db()
|
||||
m1 = create_memory("knowledge", "draft A", project="p05-interferometer")
|
||||
m2 = create_memory("knowledge", "draft B", project="p05-interferometer")
|
||||
|
||||
cid = create_merge_candidate(
|
||||
memory_ids=[m1.id, m2.id],
|
||||
similarity=0.9,
|
||||
proposed_content="AI draft",
|
||||
proposed_memory_type="knowledge",
|
||||
proposed_project="p05-interferometer",
|
||||
)
|
||||
new_id = merge_memories(
|
||||
candidate_id=cid,
|
||||
actor="human-triage",
|
||||
override_content="human-edited final text",
|
||||
override_tags=["optics", "custom"],
|
||||
)
|
||||
assert new_id is not None
|
||||
with get_connection() as conn:
|
||||
row = conn.execute(
|
||||
"SELECT content, domain_tags FROM memories WHERE id = ?", (new_id,)
|
||||
).fetchone()
|
||||
assert row["content"] == "human-edited final text"
|
||||
# domain_tags JSON should contain the override
|
||||
assert "optics" in row["domain_tags"]
|
||||
assert "custom" in row["domain_tags"]
|
||||
|
||||
|
||||
def test_merge_memories_writes_audit(tmp_data_dir):
|
||||
init_db()
|
||||
m1 = create_memory("knowledge", "alpha")
|
||||
m2 = create_memory("knowledge", "alpha variant")
|
||||
cid = create_merge_candidate(
|
||||
memory_ids=[m1.id, m2.id], similarity=0.9,
|
||||
proposed_content="alpha merged",
|
||||
proposed_memory_type="knowledge", proposed_project="",
|
||||
)
|
||||
new_id = merge_memories(candidate_id=cid)
|
||||
assert new_id
|
||||
|
||||
audit_new = get_memory_audit(new_id)
|
||||
actions_new = {a["action"] for a in audit_new}
|
||||
assert "created_via_merge" in actions_new
|
||||
|
||||
audit_m1 = get_memory_audit(m1.id)
|
||||
actions_m1 = {a["action"] for a in audit_m1}
|
||||
assert "superseded" in actions_m1
|
||||
|
||||
|
||||
def test_merge_memories_aborts_if_source_not_active(tmp_data_dir):
|
||||
init_db()
|
||||
m1 = create_memory("knowledge", "one")
|
||||
m2 = create_memory("knowledge", "two")
|
||||
cid = create_merge_candidate(
|
||||
memory_ids=[m1.id, m2.id], similarity=0.9,
|
||||
proposed_content="merged",
|
||||
proposed_memory_type="knowledge", proposed_project="",
|
||||
)
|
||||
# Tamper: supersede one source before the merge runs
|
||||
with get_connection() as conn:
|
||||
conn.execute("UPDATE memories SET status = 'superseded' WHERE id = ?", (m1.id,))
|
||||
result = merge_memories(candidate_id=cid)
|
||||
assert result is None
|
||||
|
||||
# Candidate still pending
|
||||
pending = get_merge_candidates(status="pending")
|
||||
assert any(c["id"] == cid for c in pending)
|
||||
|
||||
|
||||
def test_merge_memories_rejects_already_resolved(tmp_data_dir):
|
||||
init_db()
|
||||
m1 = create_memory("knowledge", "x")
|
||||
m2 = create_memory("knowledge", "y")
|
||||
cid = create_merge_candidate(
|
||||
memory_ids=[m1.id, m2.id], similarity=0.9,
|
||||
proposed_content="xy",
|
||||
proposed_memory_type="knowledge", proposed_project="",
|
||||
)
|
||||
first = merge_memories(candidate_id=cid)
|
||||
assert first is not None
|
||||
# second call — already approved, should return None
|
||||
second = merge_memories(candidate_id=cid)
|
||||
assert second is None
|
||||
|
||||
|
||||
# --- reject_merge_candidate ---
|
||||
|
||||
|
||||
def test_reject_merge_candidate_leaves_sources_untouched(tmp_data_dir):
|
||||
init_db()
|
||||
m1 = create_memory("knowledge", "a")
|
||||
m2 = create_memory("knowledge", "b")
|
||||
cid = create_merge_candidate(
|
||||
memory_ids=[m1.id, m2.id], similarity=0.9,
|
||||
proposed_content="a+b",
|
||||
proposed_memory_type="knowledge", proposed_project="",
|
||||
)
|
||||
ok = reject_merge_candidate(cid, actor="human-triage", note="false positive")
|
||||
assert ok
|
||||
|
||||
# Sources still active
|
||||
with get_connection() as conn:
|
||||
s1 = conn.execute("SELECT status FROM memories WHERE id = ?", (m1.id,)).fetchone()
|
||||
s2 = conn.execute("SELECT status FROM memories WHERE id = ?", (m2.id,)).fetchone()
|
||||
cand = conn.execute(
|
||||
"SELECT status FROM memory_merge_candidates WHERE id = ?", (cid,)
|
||||
).fetchone()
|
||||
assert s1["status"] == "active"
|
||||
assert s2["status"] == "active"
|
||||
assert cand["status"] == "rejected"
|
||||
|
||||
|
||||
def test_reject_merge_candidate_idempotent(tmp_data_dir):
|
||||
init_db()
|
||||
m1 = create_memory("knowledge", "p")
|
||||
m2 = create_memory("knowledge", "q")
|
||||
cid = create_merge_candidate(
|
||||
memory_ids=[m1.id, m2.id], similarity=0.9,
|
||||
proposed_content="pq",
|
||||
proposed_memory_type="knowledge", proposed_project="",
|
||||
)
|
||||
assert reject_merge_candidate(cid) is True
|
||||
# second reject — already rejected, returns False
|
||||
assert reject_merge_candidate(cid) is False
|
||||
|
||||
|
||||
# --- Schema sanity ---
|
||||
|
||||
|
||||
def test_merge_candidates_table_exists(tmp_data_dir):
|
||||
init_db()
|
||||
with get_connection() as conn:
|
||||
cols = [r["name"] for r in conn.execute("PRAGMA table_info(memory_merge_candidates)").fetchall()]
|
||||
expected = {"id", "status", "memory_ids", "similarity", "proposed_content",
|
||||
"proposed_memory_type", "proposed_project", "proposed_tags",
|
||||
"proposed_confidence", "reason", "created_at", "resolved_at",
|
||||
"resolved_by", "result_memory_id"}
|
||||
assert expected.issubset(set(cols))
|
||||
148
tests/test_phase6_living_taxonomy.py
Normal file
148
tests/test_phase6_living_taxonomy.py
Normal file
@@ -0,0 +1,148 @@
|
||||
"""Phase 6 tests — Living Taxonomy: detector + transient-to-durable extension."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from datetime import datetime, timedelta, timezone
|
||||
|
||||
import pytest
|
||||
|
||||
from atocore.memory.service import (
|
||||
create_memory,
|
||||
extend_reinforced_valid_until,
|
||||
)
|
||||
from atocore.models.database import get_connection, init_db
|
||||
|
||||
|
||||
def _set_memory_fields(mem_id, reference_count=None, valid_until=None):
|
||||
"""Helper to force memory state for tests."""
|
||||
with get_connection() as conn:
|
||||
fields, params = [], []
|
||||
if reference_count is not None:
|
||||
fields.append("reference_count = ?")
|
||||
params.append(reference_count)
|
||||
if valid_until is not None:
|
||||
fields.append("valid_until = ?")
|
||||
params.append(valid_until)
|
||||
params.append(mem_id)
|
||||
conn.execute(
|
||||
f"UPDATE memories SET {', '.join(fields)} WHERE id = ?",
|
||||
params,
|
||||
)
|
||||
|
||||
|
||||
# --- Transient-to-durable extension (C.3) ---
|
||||
|
||||
|
||||
def test_extend_extends_imminent_valid_until(tmp_data_dir):
|
||||
init_db()
|
||||
mem = create_memory("knowledge", "Reinforced content for extension")
|
||||
soon = (datetime.now(timezone.utc) + timedelta(days=7)).strftime("%Y-%m-%d")
|
||||
_set_memory_fields(mem.id, reference_count=6, valid_until=soon)
|
||||
|
||||
result = extend_reinforced_valid_until()
|
||||
assert len(result) == 1
|
||||
assert result[0]["memory_id"] == mem.id
|
||||
assert result[0]["action"] == "extended"
|
||||
# New expiry should be ~90 days out
|
||||
new_date = datetime.strptime(result[0]["new_valid_until"], "%Y-%m-%d")
|
||||
days_out = (new_date - datetime.now(timezone.utc).replace(tzinfo=None)).days
|
||||
assert 85 <= days_out <= 92 # ~90 days, some slop for test timing
|
||||
|
||||
|
||||
def test_extend_makes_permanent_at_high_reference_count(tmp_data_dir):
|
||||
init_db()
|
||||
mem = create_memory("knowledge", "Heavy-referenced content")
|
||||
soon = (datetime.now(timezone.utc) + timedelta(days=7)).strftime("%Y-%m-%d")
|
||||
_set_memory_fields(mem.id, reference_count=15, valid_until=soon)
|
||||
|
||||
result = extend_reinforced_valid_until()
|
||||
assert len(result) == 1
|
||||
assert result[0]["action"] == "made_permanent"
|
||||
assert result[0]["new_valid_until"] is None
|
||||
|
||||
# Verify the DB reflects the cleared expiry
|
||||
with get_connection() as conn:
|
||||
row = conn.execute(
|
||||
"SELECT valid_until FROM memories WHERE id = ?", (mem.id,)
|
||||
).fetchone()
|
||||
assert row["valid_until"] is None
|
||||
|
||||
|
||||
def test_extend_skips_not_expiring_soon(tmp_data_dir):
|
||||
init_db()
|
||||
mem = create_memory("knowledge", "Far-future expiry")
|
||||
far = (datetime.now(timezone.utc) + timedelta(days=365)).strftime("%Y-%m-%d")
|
||||
_set_memory_fields(mem.id, reference_count=6, valid_until=far)
|
||||
|
||||
result = extend_reinforced_valid_until(imminent_expiry_days=30)
|
||||
assert result == []
|
||||
|
||||
|
||||
def test_extend_skips_low_reference_count(tmp_data_dir):
|
||||
init_db()
|
||||
mem = create_memory("knowledge", "Not reinforced enough")
|
||||
soon = (datetime.now(timezone.utc) + timedelta(days=7)).strftime("%Y-%m-%d")
|
||||
_set_memory_fields(mem.id, reference_count=2, valid_until=soon)
|
||||
|
||||
result = extend_reinforced_valid_until(min_reference_count=5)
|
||||
assert result == []
|
||||
|
||||
|
||||
def test_extend_skips_permanent_memory(tmp_data_dir):
|
||||
"""Memory with no valid_until is already permanent — shouldn't touch."""
|
||||
init_db()
|
||||
mem = create_memory("knowledge", "Already permanent")
|
||||
_set_memory_fields(mem.id, reference_count=20)
|
||||
# no valid_until
|
||||
|
||||
result = extend_reinforced_valid_until()
|
||||
assert result == []
|
||||
|
||||
|
||||
def test_extend_writes_audit_row(tmp_data_dir):
|
||||
init_db()
|
||||
mem = create_memory("knowledge", "Audited extension")
|
||||
soon = (datetime.now(timezone.utc) + timedelta(days=7)).strftime("%Y-%m-%d")
|
||||
_set_memory_fields(mem.id, reference_count=6, valid_until=soon)
|
||||
|
||||
extend_reinforced_valid_until()
|
||||
|
||||
from atocore.memory.service import get_memory_audit
|
||||
audit = get_memory_audit(mem.id)
|
||||
actions = [a["action"] for a in audit]
|
||||
assert "valid_until_extended" in actions
|
||||
entry = next(a for a in audit if a["action"] == "valid_until_extended")
|
||||
assert entry["actor"] == "transient-to-durable"
|
||||
|
||||
|
||||
# --- Emerging detector (smoke tests — detector runs against live DB state
|
||||
# so we test the shape of results rather than full integration here) ---
|
||||
|
||||
|
||||
def test_detector_imports_cleanly():
|
||||
"""Detector module must import without errors (it's called from nightly cron)."""
|
||||
import importlib.util
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
# Load the detector script as a module
|
||||
script = Path(__file__).resolve().parent.parent / "scripts" / "detect_emerging.py"
|
||||
assert script.exists()
|
||||
spec = importlib.util.spec_from_file_location("detect_emerging", script)
|
||||
mod = importlib.util.module_from_spec(spec)
|
||||
# Don't actually run main() — just verify it parses and defines expected names
|
||||
spec.loader.exec_module(mod)
|
||||
assert hasattr(mod, "main")
|
||||
assert hasattr(mod, "PROJECT_MIN_MEMORIES")
|
||||
assert hasattr(mod, "PROJECT_ALERT_THRESHOLD")
|
||||
|
||||
|
||||
def test_detector_handles_empty_db(tmp_data_dir):
|
||||
"""Detector should handle zero memories without crashing."""
|
||||
init_db()
|
||||
# Don't create any memories. Just verify the queries work via the service layer.
|
||||
from atocore.memory.service import get_memories
|
||||
active = get_memories(active_only=True, limit=500)
|
||||
candidates = get_memories(status="candidate", limit=500)
|
||||
assert active == []
|
||||
assert candidates == []
|
||||
296
tests/test_tag_canon.py
Normal file
296
tests/test_tag_canon.py
Normal file
@@ -0,0 +1,296 @@
|
||||
"""Phase 7C — tag canonicalization tests.
|
||||
|
||||
Covers:
|
||||
- prompt parser (fences, prose, empty)
|
||||
- normalizer (identity, protected tokens, empty)
|
||||
- get_tag_distribution counts across active memories
|
||||
- apply_tag_alias rewrites + dedupes + audits
|
||||
- create / approve / reject lifecycle
|
||||
- idempotency (dup proposals skipped)
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import pytest
|
||||
|
||||
from atocore.memory._tag_canon_prompt import (
|
||||
PROTECTED_PROJECT_TOKENS,
|
||||
build_user_message,
|
||||
normalize_alias_item,
|
||||
parse_canon_output,
|
||||
)
|
||||
from atocore.memory.service import (
|
||||
apply_tag_alias,
|
||||
approve_tag_alias,
|
||||
create_memory,
|
||||
create_tag_alias_proposal,
|
||||
get_memory_audit,
|
||||
get_tag_alias_proposals,
|
||||
get_tag_distribution,
|
||||
reject_tag_alias,
|
||||
)
|
||||
from atocore.models.database import get_connection, init_db
|
||||
|
||||
|
||||
# --- Prompt parser ---
|
||||
|
||||
|
||||
def test_parse_canon_output_handles_fences():
|
||||
raw = "```json\n{\"aliases\": [{\"alias\": \"fw\", \"canonical\": \"firmware\", \"confidence\": 0.9}]}\n```"
|
||||
items = parse_canon_output(raw)
|
||||
assert len(items) == 1
|
||||
assert items[0]["alias"] == "fw"
|
||||
|
||||
|
||||
def test_parse_canon_output_handles_prose_prefix():
|
||||
raw = "Here you go:\n{\"aliases\": [{\"alias\": \"ml\", \"canonical\": \"machine-learning\", \"confidence\": 0.9}]}"
|
||||
items = parse_canon_output(raw)
|
||||
assert len(items) == 1
|
||||
|
||||
|
||||
def test_parse_canon_output_empty_list():
|
||||
assert parse_canon_output("{\"aliases\": []}") == []
|
||||
|
||||
|
||||
def test_parse_canon_output_malformed():
|
||||
assert parse_canon_output("not json at all") == []
|
||||
assert parse_canon_output("") == []
|
||||
|
||||
|
||||
# --- Normalizer ---
|
||||
|
||||
|
||||
def test_normalize_alias_strips_and_lowercases():
|
||||
n = normalize_alias_item({"alias": " FW ", "canonical": "Firmware", "confidence": 0.95, "reason": "abbrev"})
|
||||
assert n == {"alias": "fw", "canonical": "firmware", "confidence": 0.95, "reason": "abbrev"}
|
||||
|
||||
|
||||
def test_normalize_rejects_identity():
|
||||
assert normalize_alias_item({"alias": "foo", "canonical": "foo", "confidence": 0.9}) is None
|
||||
|
||||
|
||||
def test_normalize_rejects_empty():
|
||||
assert normalize_alias_item({"alias": "", "canonical": "foo", "confidence": 0.9}) is None
|
||||
assert normalize_alias_item({"alias": "foo", "canonical": "", "confidence": 0.9}) is None
|
||||
|
||||
|
||||
def test_normalize_protects_project_tokens():
|
||||
# Project ids must not be canonicalized — they're their own namespace
|
||||
assert "p04" in PROTECTED_PROJECT_TOKENS
|
||||
assert normalize_alias_item({"alias": "p04", "canonical": "p04-gigabit", "confidence": 1.0}) is None
|
||||
assert normalize_alias_item({"alias": "p04-gigabit", "canonical": "p04", "confidence": 1.0}) is None
|
||||
assert normalize_alias_item({"alias": "apm", "canonical": "part-manager", "confidence": 1.0}) is None
|
||||
|
||||
|
||||
def test_normalize_clamps_confidence():
|
||||
hi = normalize_alias_item({"alias": "a", "canonical": "b", "confidence": 2.5})
|
||||
assert hi["confidence"] == 1.0
|
||||
lo = normalize_alias_item({"alias": "a", "canonical": "b", "confidence": -0.5})
|
||||
assert lo["confidence"] == 0.0
|
||||
|
||||
|
||||
def test_normalize_handles_non_numeric_confidence():
|
||||
n = normalize_alias_item({"alias": "a", "canonical": "b", "confidence": "not a number"})
|
||||
assert n is not None and n["confidence"] == 0.0
|
||||
|
||||
|
||||
# --- build_user_message ---
|
||||
|
||||
|
||||
def test_build_user_message_includes_top_tags():
|
||||
dist = {"firmware": 23, "fw": 5, "optics": 18, "optical": 2}
|
||||
msg = build_user_message(dist)
|
||||
assert "firmware: 23" in msg
|
||||
assert "optics: 18" in msg
|
||||
assert "aliases" in msg.lower() or "JSON" in msg
|
||||
|
||||
|
||||
def test_build_user_message_empty():
|
||||
msg = build_user_message({})
|
||||
assert "Empty" in msg or "empty" in msg
|
||||
|
||||
|
||||
# --- get_tag_distribution ---
|
||||
|
||||
|
||||
def test_tag_distribution_counts_active_only(tmp_data_dir):
|
||||
init_db()
|
||||
create_memory("knowledge", "a", domain_tags=["firmware", "p06"])
|
||||
create_memory("knowledge", "b", domain_tags=["firmware"])
|
||||
create_memory("knowledge", "c", domain_tags=["optics"])
|
||||
|
||||
# Add an invalid memory — should NOT be counted
|
||||
m_invalid = create_memory("knowledge", "d", domain_tags=["firmware", "ignored"])
|
||||
with get_connection() as conn:
|
||||
conn.execute("UPDATE memories SET status = 'invalid' WHERE id = ?", (m_invalid.id,))
|
||||
|
||||
dist = get_tag_distribution()
|
||||
assert dist.get("firmware") == 2 # two active memories
|
||||
assert dist.get("optics") == 1
|
||||
assert dist.get("p06") == 1
|
||||
assert "ignored" not in dist
|
||||
|
||||
|
||||
def test_tag_distribution_min_count_filter(tmp_data_dir):
|
||||
init_db()
|
||||
create_memory("knowledge", "a", domain_tags=["firmware"])
|
||||
create_memory("knowledge", "b", domain_tags=["firmware"])
|
||||
create_memory("knowledge", "c", domain_tags=["once"])
|
||||
|
||||
dist = get_tag_distribution(min_count=2)
|
||||
assert "firmware" in dist
|
||||
assert "once" not in dist
|
||||
|
||||
|
||||
# --- apply_tag_alias ---
|
||||
|
||||
|
||||
def test_apply_tag_alias_rewrites_across_memories(tmp_data_dir):
|
||||
init_db()
|
||||
m1 = create_memory("knowledge", "a", domain_tags=["fw", "p06"])
|
||||
m2 = create_memory("knowledge", "b", domain_tags=["fw"])
|
||||
m3 = create_memory("knowledge", "c", domain_tags=["optics"]) # untouched
|
||||
|
||||
result = apply_tag_alias("fw", "firmware")
|
||||
assert result["memories_touched"] == 2
|
||||
|
||||
import json as _json
|
||||
with get_connection() as conn:
|
||||
r1 = conn.execute("SELECT domain_tags FROM memories WHERE id = ?", (m1.id,)).fetchone()
|
||||
r2 = conn.execute("SELECT domain_tags FROM memories WHERE id = ?", (m2.id,)).fetchone()
|
||||
r3 = conn.execute("SELECT domain_tags FROM memories WHERE id = ?", (m3.id,)).fetchone()
|
||||
assert "firmware" in _json.loads(r1["domain_tags"])
|
||||
assert "fw" not in _json.loads(r1["domain_tags"])
|
||||
assert "firmware" in _json.loads(r2["domain_tags"])
|
||||
assert _json.loads(r3["domain_tags"]) == ["optics"] # untouched
|
||||
|
||||
|
||||
def test_apply_tag_alias_dedupes_when_both_present(tmp_data_dir):
|
||||
"""Memory has both fw AND firmware → rewrite collapses to just firmware."""
|
||||
init_db()
|
||||
m = create_memory("knowledge", "dual-tagged", domain_tags=["fw", "firmware", "p06"])
|
||||
|
||||
result = apply_tag_alias("fw", "firmware")
|
||||
assert result["memories_touched"] == 1
|
||||
|
||||
import json as _json
|
||||
with get_connection() as conn:
|
||||
r = conn.execute("SELECT domain_tags FROM memories WHERE id = ?", (m.id,)).fetchone()
|
||||
tags = _json.loads(r["domain_tags"])
|
||||
assert tags.count("firmware") == 1
|
||||
assert "fw" not in tags
|
||||
assert "p06" in tags
|
||||
|
||||
|
||||
def test_apply_tag_alias_skips_memories_without_alias(tmp_data_dir):
|
||||
init_db()
|
||||
m = create_memory("knowledge", "no match", domain_tags=["optics", "p04"])
|
||||
result = apply_tag_alias("fw", "firmware")
|
||||
assert result["memories_touched"] == 0
|
||||
|
||||
|
||||
def test_apply_tag_alias_writes_audit(tmp_data_dir):
|
||||
init_db()
|
||||
m = create_memory("knowledge", "audited", domain_tags=["fw"])
|
||||
apply_tag_alias("fw", "firmware", actor="auto-tag-canon")
|
||||
|
||||
audit = get_memory_audit(m.id)
|
||||
actions = [a["action"] for a in audit]
|
||||
assert "tag_canonicalized" in actions
|
||||
entry = next(a for a in audit if a["action"] == "tag_canonicalized")
|
||||
assert entry["actor"] == "auto-tag-canon"
|
||||
assert "fw → firmware" in entry["note"]
|
||||
assert "fw" in entry["before"]["domain_tags"]
|
||||
assert "firmware" in entry["after"]["domain_tags"]
|
||||
|
||||
|
||||
def test_apply_tag_alias_rejects_identity(tmp_data_dir):
|
||||
init_db()
|
||||
with pytest.raises(ValueError):
|
||||
apply_tag_alias("foo", "foo")
|
||||
|
||||
|
||||
def test_apply_tag_alias_rejects_empty(tmp_data_dir):
|
||||
init_db()
|
||||
with pytest.raises(ValueError):
|
||||
apply_tag_alias("", "firmware")
|
||||
|
||||
|
||||
# --- Proposal lifecycle ---
|
||||
|
||||
|
||||
def test_create_proposal_inserts_pending(tmp_data_dir):
|
||||
init_db()
|
||||
pid = create_tag_alias_proposal("fw", "firmware", confidence=0.65,
|
||||
alias_count=5, canonical_count=23,
|
||||
reason="standard abbreviation")
|
||||
assert pid is not None
|
||||
|
||||
rows = get_tag_alias_proposals(status="pending")
|
||||
assert len(rows) == 1
|
||||
assert rows[0]["alias"] == "fw"
|
||||
assert rows[0]["confidence"] == pytest.approx(0.65)
|
||||
|
||||
|
||||
def test_create_proposal_idempotent(tmp_data_dir):
|
||||
init_db()
|
||||
first = create_tag_alias_proposal("fw", "firmware", confidence=0.6)
|
||||
second = create_tag_alias_proposal("fw", "firmware", confidence=0.7)
|
||||
assert first is not None
|
||||
assert second is None
|
||||
|
||||
|
||||
def test_approve_applies_rewrite(tmp_data_dir):
|
||||
init_db()
|
||||
m = create_memory("knowledge", "x", domain_tags=["fw"])
|
||||
pid = create_tag_alias_proposal("fw", "firmware", confidence=0.7)
|
||||
result = approve_tag_alias(pid, actor="human-triage")
|
||||
assert result is not None
|
||||
assert result["memories_touched"] == 1
|
||||
|
||||
# Proposal now approved with applied_to_memories recorded
|
||||
rows = get_tag_alias_proposals(status="approved")
|
||||
assert len(rows) == 1
|
||||
assert rows[0]["applied_to_memories"] == 1
|
||||
|
||||
# Memory actually rewritten
|
||||
import json as _json
|
||||
with get_connection() as conn:
|
||||
r = conn.execute("SELECT domain_tags FROM memories WHERE id = ?", (m.id,)).fetchone()
|
||||
assert "firmware" in _json.loads(r["domain_tags"])
|
||||
|
||||
|
||||
def test_approve_already_resolved_returns_none(tmp_data_dir):
|
||||
init_db()
|
||||
pid = create_tag_alias_proposal("a", "b", confidence=0.6)
|
||||
approve_tag_alias(pid)
|
||||
assert approve_tag_alias(pid) is None # second approve — no-op
|
||||
|
||||
|
||||
def test_reject_leaves_memories_untouched(tmp_data_dir):
|
||||
init_db()
|
||||
m = create_memory("knowledge", "x", domain_tags=["fw"])
|
||||
pid = create_tag_alias_proposal("fw", "firmware", confidence=0.6)
|
||||
assert reject_tag_alias(pid)
|
||||
|
||||
rows = get_tag_alias_proposals(status="rejected")
|
||||
assert len(rows) == 1
|
||||
|
||||
# Memory still has the original tag
|
||||
import json as _json
|
||||
with get_connection() as conn:
|
||||
r = conn.execute("SELECT domain_tags FROM memories WHERE id = ?", (m.id,)).fetchone()
|
||||
assert "fw" in _json.loads(r["domain_tags"])
|
||||
|
||||
|
||||
# --- Schema sanity ---
|
||||
|
||||
|
||||
def test_tag_aliases_table_exists(tmp_data_dir):
|
||||
init_db()
|
||||
with get_connection() as conn:
|
||||
cols = [r["name"] for r in conn.execute("PRAGMA table_info(tag_aliases)").fetchall()]
|
||||
expected = {"id", "alias", "canonical", "status", "confidence",
|
||||
"alias_count", "canonical_count", "reason",
|
||||
"applied_to_memories", "created_at", "resolved_at", "resolved_by"}
|
||||
assert expected.issubset(set(cols))
|
||||
219
tests/test_triage_escalation.py
Normal file
219
tests/test_triage_escalation.py
Normal file
@@ -0,0 +1,219 @@
|
||||
"""Tests for 3-tier triage escalation logic (Phase Triage Quality).
|
||||
|
||||
The actual LLM calls are gated by ``shutil.which('claude')`` and can't be
|
||||
exercised in CI without the CLI, so we mock the tier functions directly
|
||||
and verify the control-flow (escalation routing, discard vs human, project
|
||||
misattribution, metadata update).
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import sys
|
||||
from pathlib import Path
|
||||
from unittest import mock
|
||||
|
||||
import pytest
|
||||
|
||||
# Import the script as a module for unit testing
|
||||
_SCRIPTS = str(Path(__file__).resolve().parent.parent / "scripts")
|
||||
if _SCRIPTS not in sys.path:
|
||||
sys.path.insert(0, _SCRIPTS)
|
||||
|
||||
import auto_triage # noqa: E402
|
||||
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def reset_thresholds(monkeypatch):
|
||||
"""Make sure env-var overrides don't leak between tests."""
|
||||
monkeypatch.setattr(auto_triage, "AUTO_PROMOTE_MIN_CONFIDENCE", 0.8)
|
||||
monkeypatch.setattr(auto_triage, "ESCALATION_CONFIDENCE_THRESHOLD", 0.75)
|
||||
monkeypatch.setattr(auto_triage, "TIER3_ACTION", "discard")
|
||||
monkeypatch.setattr(auto_triage, "TIER1_MODEL", "sonnet")
|
||||
monkeypatch.setattr(auto_triage, "TIER2_MODEL", "opus")
|
||||
|
||||
|
||||
def test_parse_verdict_captures_suggested_project():
|
||||
raw = '{"verdict": "promote", "confidence": 0.9, "reason": "clear", "suggested_project": "p04-gigabit"}'
|
||||
v = auto_triage.parse_verdict(raw)
|
||||
assert v["verdict"] == "promote"
|
||||
assert v["suggested_project"] == "p04-gigabit"
|
||||
|
||||
|
||||
def test_parse_verdict_defaults_suggested_project_to_empty():
|
||||
raw = '{"verdict": "reject", "confidence": 0.9, "reason": "dup"}'
|
||||
v = auto_triage.parse_verdict(raw)
|
||||
assert v["suggested_project"] == ""
|
||||
|
||||
|
||||
def test_high_confidence_tier1_promote_no_escalation():
|
||||
"""Tier 1 confident promote → no tier 2 call."""
|
||||
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p-test"}
|
||||
|
||||
with mock.patch("auto_triage.triage_one") as t1, \
|
||||
mock.patch("auto_triage.triage_escalation") as t2, \
|
||||
mock.patch("auto_triage.api_post"), \
|
||||
mock.patch("auto_triage._apply_metadata_update"):
|
||||
t1.return_value = {
|
||||
"verdict": "promote", "confidence": 0.95, "reason": "clear",
|
||||
"domain_tags": [], "valid_until": "", "suggested_project": "",
|
||||
}
|
||||
action, _ = auto_triage.process_candidate(
|
||||
cand, "http://fake", {"p-test": []}, {"p-test": []},
|
||||
{"p-test": []}, dry_run=False,
|
||||
)
|
||||
assert action == "promote"
|
||||
t2.assert_not_called()
|
||||
|
||||
|
||||
def test_high_confidence_tier1_reject_no_escalation():
|
||||
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p-test"}
|
||||
|
||||
with mock.patch("auto_triage.triage_one") as t1, \
|
||||
mock.patch("auto_triage.triage_escalation") as t2, \
|
||||
mock.patch("auto_triage.api_post"):
|
||||
t1.return_value = {
|
||||
"verdict": "reject", "confidence": 0.9, "reason": "duplicate",
|
||||
"domain_tags": [], "valid_until": "", "suggested_project": "",
|
||||
}
|
||||
action, _ = auto_triage.process_candidate(
|
||||
cand, "http://fake", {"p-test": []}, {"p-test": []},
|
||||
{"p-test": []}, dry_run=False,
|
||||
)
|
||||
assert action == "reject"
|
||||
t2.assert_not_called()
|
||||
|
||||
|
||||
def test_low_confidence_escalates_to_tier2():
|
||||
"""Tier 1 low confidence → tier 2 is consulted."""
|
||||
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p-test"}
|
||||
|
||||
with mock.patch("auto_triage.triage_one") as t1, \
|
||||
mock.patch("auto_triage.triage_escalation") as t2, \
|
||||
mock.patch("auto_triage.api_post"), \
|
||||
mock.patch("auto_triage._apply_metadata_update"):
|
||||
t1.return_value = {
|
||||
"verdict": "promote", "confidence": 0.6, "reason": "maybe",
|
||||
"domain_tags": [], "valid_until": "", "suggested_project": "",
|
||||
}
|
||||
t2.return_value = {
|
||||
"verdict": "promote", "confidence": 0.9, "reason": "opus agrees",
|
||||
"domain_tags": [], "valid_until": "", "suggested_project": "",
|
||||
}
|
||||
action, note = auto_triage.process_candidate(
|
||||
cand, "http://fake", {"p-test": []}, {"p-test": []},
|
||||
{"p-test": []}, dry_run=False,
|
||||
)
|
||||
assert action == "promote"
|
||||
assert "opus" in note
|
||||
t2.assert_called_once()
|
||||
|
||||
|
||||
def test_needs_human_tier1_always_escalates():
|
||||
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p-test"}
|
||||
|
||||
with mock.patch("auto_triage.triage_one") as t1, \
|
||||
mock.patch("auto_triage.triage_escalation") as t2, \
|
||||
mock.patch("auto_triage.api_post"):
|
||||
t1.return_value = {
|
||||
"verdict": "needs_human", "confidence": 0.5, "reason": "uncertain",
|
||||
"domain_tags": [], "valid_until": "", "suggested_project": "",
|
||||
}
|
||||
t2.return_value = {
|
||||
"verdict": "reject", "confidence": 0.88, "reason": "opus decided",
|
||||
"domain_tags": [], "valid_until": "", "suggested_project": "",
|
||||
}
|
||||
action, _ = auto_triage.process_candidate(
|
||||
cand, "http://fake", {"p-test": []}, {"p-test": []},
|
||||
{"p-test": []}, dry_run=False,
|
||||
)
|
||||
assert action == "reject"
|
||||
t2.assert_called_once()
|
||||
|
||||
|
||||
def test_tier2_uncertain_leads_to_discard_by_default(monkeypatch):
|
||||
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p-test"}
|
||||
monkeypatch.setattr(auto_triage, "TIER3_ACTION", "discard")
|
||||
|
||||
with mock.patch("auto_triage.triage_one") as t1, \
|
||||
mock.patch("auto_triage.triage_escalation") as t2, \
|
||||
mock.patch("auto_triage.api_post") as api_post:
|
||||
t1.return_value = {
|
||||
"verdict": "needs_human", "confidence": 0.4, "reason": "unclear",
|
||||
"domain_tags": [], "valid_until": "", "suggested_project": "",
|
||||
}
|
||||
t2.return_value = {
|
||||
"verdict": "needs_human", "confidence": 0.5, "reason": "still unclear",
|
||||
"domain_tags": [], "valid_until": "", "suggested_project": "",
|
||||
}
|
||||
action, _ = auto_triage.process_candidate(
|
||||
cand, "http://fake", {"p-test": []}, {"p-test": []},
|
||||
{"p-test": []}, dry_run=False,
|
||||
)
|
||||
assert action == "discard"
|
||||
# Should have called reject on the API
|
||||
api_post.assert_called_once()
|
||||
assert "reject" in api_post.call_args.args[1]
|
||||
|
||||
|
||||
def test_tier2_uncertain_goes_to_human_when_configured(monkeypatch):
|
||||
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p-test"}
|
||||
monkeypatch.setattr(auto_triage, "TIER3_ACTION", "human")
|
||||
|
||||
with mock.patch("auto_triage.triage_one") as t1, \
|
||||
mock.patch("auto_triage.triage_escalation") as t2, \
|
||||
mock.patch("auto_triage.api_post") as api_post:
|
||||
t1.return_value = {
|
||||
"verdict": "needs_human", "confidence": 0.4, "reason": "unclear",
|
||||
"domain_tags": [], "valid_until": "", "suggested_project": "",
|
||||
}
|
||||
t2.return_value = {
|
||||
"verdict": "needs_human", "confidence": 0.5, "reason": "still unclear",
|
||||
"domain_tags": [], "valid_until": "", "suggested_project": "",
|
||||
}
|
||||
action, _ = auto_triage.process_candidate(
|
||||
cand, "http://fake", {"p-test": []}, {"p-test": []},
|
||||
{"p-test": []}, dry_run=False,
|
||||
)
|
||||
assert action == "human"
|
||||
# Should NOT have touched the API — leave candidate in queue
|
||||
api_post.assert_not_called()
|
||||
|
||||
|
||||
def test_dry_run_does_not_call_api():
|
||||
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p-test"}
|
||||
|
||||
with mock.patch("auto_triage.triage_one") as t1, \
|
||||
mock.patch("auto_triage.api_post") as api_post:
|
||||
t1.return_value = {
|
||||
"verdict": "promote", "confidence": 0.9, "reason": "clear",
|
||||
"domain_tags": [], "valid_until": "", "suggested_project": "",
|
||||
}
|
||||
action, _ = auto_triage.process_candidate(
|
||||
cand, "http://fake", {"p-test": []}, {"p-test": []},
|
||||
{"p-test": []}, dry_run=True,
|
||||
)
|
||||
assert action == "promote"
|
||||
api_post.assert_not_called()
|
||||
|
||||
|
||||
def test_misattribution_flagged_when_suggestion_differs(capsys):
|
||||
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p04-gigabit"}
|
||||
|
||||
with mock.patch("auto_triage.triage_one") as t1, \
|
||||
mock.patch("auto_triage.api_post"), \
|
||||
mock.patch("auto_triage._apply_metadata_update"):
|
||||
t1.return_value = {
|
||||
"verdict": "promote", "confidence": 0.9, "reason": "clear",
|
||||
"domain_tags": [], "valid_until": "",
|
||||
"suggested_project": "p05-interferometer",
|
||||
}
|
||||
auto_triage.process_candidate(
|
||||
cand, "http://fake",
|
||||
{"p04-gigabit": [], "p05-interferometer": []},
|
||||
{"p04-gigabit": [], "p05-interferometer": []},
|
||||
{"p04-gigabit": [], "p05-interferometer": []},
|
||||
dry_run=True,
|
||||
)
|
||||
out = capsys.readouterr().out
|
||||
assert "misattribution" in out
|
||||
assert "p05-interferometer" in out
|
||||
163
tests/test_wiki_pages.py
Normal file
163
tests/test_wiki_pages.py
Normal file
@@ -0,0 +1,163 @@
|
||||
"""Tests for the new wiki pages shipped in the UI refresh:
|
||||
- /wiki/capture (7I follow-up)
|
||||
- /wiki/memories/{id} (7E)
|
||||
- /wiki/domains/{tag} (7F)
|
||||
- /wiki/activity (activity feed)
|
||||
- home refresh (topnav + activity snippet)
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import pytest
|
||||
|
||||
from atocore.engineering.wiki import (
|
||||
render_activity,
|
||||
render_capture,
|
||||
render_domain,
|
||||
render_homepage,
|
||||
render_memory_detail,
|
||||
)
|
||||
from atocore.engineering.service import init_engineering_schema
|
||||
from atocore.memory.service import create_memory
|
||||
from atocore.models.database import init_db
|
||||
|
||||
|
||||
def _init_all():
|
||||
"""Wiki pages read from both the memory and engineering schemas, so
|
||||
tests need both initialized (the engineering schema is a separate
|
||||
init_engineering_schema() call)."""
|
||||
init_db()
|
||||
init_engineering_schema()
|
||||
|
||||
|
||||
def test_capture_page_renders_as_fallback(tmp_data_dir):
|
||||
_init_all()
|
||||
html = render_capture()
|
||||
# Page is reachable but now labeled as a fallback, not promoted
|
||||
assert "fallback only" in html
|
||||
assert "sanctioned capture surfaces are Claude Code" in html
|
||||
# Form inputs still exist for emergency use
|
||||
assert "cap-prompt" in html
|
||||
assert "cap-response" in html
|
||||
|
||||
|
||||
def test_capture_not_in_topnav(tmp_data_dir):
|
||||
"""The paste form should NOT appear in topnav — it's not the sanctioned path."""
|
||||
_init_all()
|
||||
html = render_homepage()
|
||||
assert "/wiki/capture" not in html
|
||||
assert "📥 Capture" not in html
|
||||
|
||||
|
||||
def test_memory_detail_renders(tmp_data_dir):
|
||||
_init_all()
|
||||
m = create_memory(
|
||||
"knowledge", "APM uses NX bridge for DXF → STL",
|
||||
project="apm", confidence=0.7, domain_tags=["apm", "nx", "cad"],
|
||||
)
|
||||
html = render_memory_detail(m.id)
|
||||
assert html is not None
|
||||
assert "APM uses NX" in html
|
||||
assert "Audit trail" in html
|
||||
# Tag links go to domain pages
|
||||
assert '/wiki/domains/apm' in html
|
||||
assert '/wiki/domains/nx' in html
|
||||
# Project link present
|
||||
assert '/wiki/projects/apm' in html
|
||||
|
||||
|
||||
def test_memory_detail_404(tmp_data_dir):
|
||||
_init_all()
|
||||
assert render_memory_detail("nonexistent-id") is None
|
||||
|
||||
|
||||
def test_domain_page_lists_memories(tmp_data_dir):
|
||||
_init_all()
|
||||
create_memory("knowledge", "optics fact 1", project="p04-gigabit",
|
||||
domain_tags=["optics"])
|
||||
create_memory("knowledge", "optics fact 2", project="p05-interferometer",
|
||||
domain_tags=["optics", "metrology"])
|
||||
create_memory("knowledge", "other", project="p06-polisher",
|
||||
domain_tags=["firmware"])
|
||||
|
||||
html = render_domain("optics")
|
||||
assert "Domain: <code>optics</code>" in html
|
||||
assert "p04-gigabit" in html
|
||||
assert "p05-interferometer" in html
|
||||
assert "optics fact 1" in html
|
||||
assert "optics fact 2" in html
|
||||
# Unrelated memory should NOT appear
|
||||
assert "other" not in html or "firmware" not in html
|
||||
|
||||
|
||||
def test_domain_page_empty(tmp_data_dir):
|
||||
_init_all()
|
||||
html = render_domain("definitely-not-a-tag")
|
||||
assert "No memories currently carry" in html
|
||||
|
||||
|
||||
def test_domain_page_normalizes_tag(tmp_data_dir):
|
||||
_init_all()
|
||||
create_memory("knowledge", "x", domain_tags=["firmware"])
|
||||
# Case-insensitive
|
||||
assert "firmware" in render_domain("FIRMWARE")
|
||||
# Whitespace tolerant
|
||||
assert "firmware" in render_domain(" firmware ")
|
||||
|
||||
|
||||
def test_activity_feed_renders(tmp_data_dir):
|
||||
_init_all()
|
||||
m = create_memory("knowledge", "activity test")
|
||||
html = render_activity()
|
||||
assert "Activity Feed" in html
|
||||
# The newly-created memory should appear as a "created" event
|
||||
assert "created" in html
|
||||
# Short timestamp format
|
||||
assert m.id[:8] in html
|
||||
|
||||
|
||||
def test_activity_feed_groups_by_action_and_actor(tmp_data_dir):
|
||||
_init_all()
|
||||
for i in range(3):
|
||||
create_memory("knowledge", f"m{i}", actor="test-actor")
|
||||
|
||||
html = render_activity()
|
||||
# Summary row should show "created: 3" or similar
|
||||
assert "created" in html
|
||||
assert "test-actor" in html
|
||||
|
||||
|
||||
def test_homepage_has_topnav_and_activity(tmp_data_dir):
|
||||
_init_all()
|
||||
create_memory("knowledge", "homepage test")
|
||||
html = render_homepage()
|
||||
# Topnav with expected items (Capture removed — it's not sanctioned capture)
|
||||
assert "🏠 Home" in html
|
||||
assert "📡 Activity" in html
|
||||
assert "/wiki/activity" in html
|
||||
assert "/wiki/capture" not in html
|
||||
# Activity snippet
|
||||
assert "What the brain is doing" in html
|
||||
|
||||
|
||||
def test_memory_detail_shows_superseded_sources(tmp_data_dir):
|
||||
"""After a merge, sources go to status=superseded. Detail page should
|
||||
still render them."""
|
||||
from atocore.memory.service import (
|
||||
create_merge_candidate, merge_memories,
|
||||
)
|
||||
_init_all()
|
||||
m1 = create_memory("knowledge", "alpha variant 1", project="test")
|
||||
m2 = create_memory("knowledge", "alpha variant 2", project="test")
|
||||
cid = create_merge_candidate(
|
||||
memory_ids=[m1.id, m2.id], similarity=0.9,
|
||||
proposed_content="alpha merged",
|
||||
proposed_memory_type="knowledge", proposed_project="test",
|
||||
)
|
||||
merge_memories(cid, actor="auto-dedup-tier1")
|
||||
|
||||
# Source detail page should render and show the superseded status
|
||||
html1 = render_memory_detail(m1.id)
|
||||
assert html1 is not None
|
||||
assert "superseded" in html1
|
||||
assert "auto-dedup-tier1" in html1 # audit trail shows who merged
|
||||
Reference in New Issue
Block a user