Files
ATOCore/tests/test_triage_escalation.py
Anto01 3ca19724a5 feat: 3-tier triage escalation + project validation + enriched context
Addresses the triage-quality problems the user observed:
- Candidates getting wrong project/product attribution
- Stale facts promoted as if still true
- "Hard to decide" items reaching human queue without real value

Solution: let sonnet handle the easy 80%, escalate borderline cases
to opus, auto-discard (or flag) what two models can't resolve.
Plus enrich the context the triage model sees so it can catch
misattribution, contradictions, and temporal drift earlier.

THE 3-TIER FLOW (scripts/auto_triage.py):

Tier 1: sonnet (fast, cheap)
  - confidence >= 0.8 + clear verdict → PROMOTE or REJECT (done)
  - otherwise → escalate to tier 2

Tier 2: opus (smarter, sees tier-1 verdict + reasoning)
  - second opinion with explicit "sonnet said X, resolve the uncertainty"
  - confidence >= 0.8 → PROMOTE or REJECT with note="[opus]"
  - still uncertain → tier 3

Tier 3: configurable (default discard)
  - ATOCORE_TRIAGE_TIER3=discard (default): auto-reject with
    "two models couldn't decide" reason
  - ATOCORE_TRIAGE_TIER3=human: leave in queue for /admin/triage

Configuration via env vars:
  ATOCORE_TRIAGE_MODEL_TIER1  (default sonnet)
  ATOCORE_TRIAGE_MODEL_TIER2  (default opus)
  ATOCORE_TRIAGE_TIER3        (default discard)
  ATOCORE_TRIAGE_ESCALATION_THRESHOLD (default 0.75)
  ATOCORE_TRIAGE_TIER2_TIMEOUT_S (default 120 — opus is slower)

ENRICHED CONTEXT shown to the triage model (both tiers):
- List of registered project ids so misattribution is detectable
- Trusted project state entries (ground truth, higher trust than memories)
- Top 30 active memories for the claimed project (was 20)
- Tier 2 additionally sees tier 1's verdict + reason

PROJECT MISATTRIBUTION DETECTION:
- Triage prompt asks the model to output "suggested_project" when it
  detects the claimed project is wrong but the content clearly belongs
  to a registered one
- Main loop auto-applies the fix via PUT /memory/{id} (which canonicalizes
  through the registry)
- Misattribution is the #1 pollution source — this catches it upstream

TEMPORAL AGGRESSIVENESS:
- Prompt upgraded: "be aggressive with valid_until for anything that
  reads like 'current state' or 'this week'. When in doubt, 2-4 week
  expiry rather than null."
- Stale facts decay automatically via Phase 3's expiry filter

CONFIDENCE GRADING (new in prompt):
- 0.9+: crystal clear durable fact or clear noise
- 0.75-0.9: confident but not cryptographic-certain
- 0.6-0.75: borderline — WILL escalate
- <0.6: genuinely ambiguous — human or discard

Tests: 356 → 366 (10 new, all in test_triage_escalation.py):
- High-confidence tier-1 promote/reject → no tier-2 call
- Low-confidence tier-1 → tier-2 escalates → decides
- needs_human always escalates regardless of confidence
- tier-2 uncertain → discard by default
- tier-2 uncertain → human when configured
- dry-run skips all API calls
- suggested_project flag surfaces + gets printed
- parse_verdict captures suggested_project

Runtime behavior unchanged for the clear cases (sonnet still handles
them). The 20-30% of candidates that currently land as needs_human
will now route through opus, and only the genuinely stuck get a human
(or discard) action.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 09:09:58 -04:00

220 lines
8.8 KiB
Python

"""Tests for 3-tier triage escalation logic (Phase Triage Quality).
The actual LLM calls are gated by ``shutil.which('claude')`` and can't be
exercised in CI without the CLI, so we mock the tier functions directly
and verify the control-flow (escalation routing, discard vs human, project
misattribution, metadata update).
"""
from __future__ import annotations
import sys
from pathlib import Path
from unittest import mock
import pytest
# Import the script as a module for unit testing
_SCRIPTS = str(Path(__file__).resolve().parent.parent / "scripts")
if _SCRIPTS not in sys.path:
sys.path.insert(0, _SCRIPTS)
import auto_triage # noqa: E402
@pytest.fixture(autouse=True)
def reset_thresholds(monkeypatch):
"""Make sure env-var overrides don't leak between tests."""
monkeypatch.setattr(auto_triage, "AUTO_PROMOTE_MIN_CONFIDENCE", 0.8)
monkeypatch.setattr(auto_triage, "ESCALATION_CONFIDENCE_THRESHOLD", 0.75)
monkeypatch.setattr(auto_triage, "TIER3_ACTION", "discard")
monkeypatch.setattr(auto_triage, "TIER1_MODEL", "sonnet")
monkeypatch.setattr(auto_triage, "TIER2_MODEL", "opus")
def test_parse_verdict_captures_suggested_project():
raw = '{"verdict": "promote", "confidence": 0.9, "reason": "clear", "suggested_project": "p04-gigabit"}'
v = auto_triage.parse_verdict(raw)
assert v["verdict"] == "promote"
assert v["suggested_project"] == "p04-gigabit"
def test_parse_verdict_defaults_suggested_project_to_empty():
raw = '{"verdict": "reject", "confidence": 0.9, "reason": "dup"}'
v = auto_triage.parse_verdict(raw)
assert v["suggested_project"] == ""
def test_high_confidence_tier1_promote_no_escalation():
"""Tier 1 confident promote → no tier 2 call."""
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p-test"}
with mock.patch("auto_triage.triage_one") as t1, \
mock.patch("auto_triage.triage_escalation") as t2, \
mock.patch("auto_triage.api_post"), \
mock.patch("auto_triage._apply_metadata_update"):
t1.return_value = {
"verdict": "promote", "confidence": 0.95, "reason": "clear",
"domain_tags": [], "valid_until": "", "suggested_project": "",
}
action, _ = auto_triage.process_candidate(
cand, "http://fake", {"p-test": []}, {"p-test": []},
{"p-test": []}, dry_run=False,
)
assert action == "promote"
t2.assert_not_called()
def test_high_confidence_tier1_reject_no_escalation():
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p-test"}
with mock.patch("auto_triage.triage_one") as t1, \
mock.patch("auto_triage.triage_escalation") as t2, \
mock.patch("auto_triage.api_post"):
t1.return_value = {
"verdict": "reject", "confidence": 0.9, "reason": "duplicate",
"domain_tags": [], "valid_until": "", "suggested_project": "",
}
action, _ = auto_triage.process_candidate(
cand, "http://fake", {"p-test": []}, {"p-test": []},
{"p-test": []}, dry_run=False,
)
assert action == "reject"
t2.assert_not_called()
def test_low_confidence_escalates_to_tier2():
"""Tier 1 low confidence → tier 2 is consulted."""
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p-test"}
with mock.patch("auto_triage.triage_one") as t1, \
mock.patch("auto_triage.triage_escalation") as t2, \
mock.patch("auto_triage.api_post"), \
mock.patch("auto_triage._apply_metadata_update"):
t1.return_value = {
"verdict": "promote", "confidence": 0.6, "reason": "maybe",
"domain_tags": [], "valid_until": "", "suggested_project": "",
}
t2.return_value = {
"verdict": "promote", "confidence": 0.9, "reason": "opus agrees",
"domain_tags": [], "valid_until": "", "suggested_project": "",
}
action, note = auto_triage.process_candidate(
cand, "http://fake", {"p-test": []}, {"p-test": []},
{"p-test": []}, dry_run=False,
)
assert action == "promote"
assert "opus" in note
t2.assert_called_once()
def test_needs_human_tier1_always_escalates():
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p-test"}
with mock.patch("auto_triage.triage_one") as t1, \
mock.patch("auto_triage.triage_escalation") as t2, \
mock.patch("auto_triage.api_post"):
t1.return_value = {
"verdict": "needs_human", "confidence": 0.5, "reason": "uncertain",
"domain_tags": [], "valid_until": "", "suggested_project": "",
}
t2.return_value = {
"verdict": "reject", "confidence": 0.88, "reason": "opus decided",
"domain_tags": [], "valid_until": "", "suggested_project": "",
}
action, _ = auto_triage.process_candidate(
cand, "http://fake", {"p-test": []}, {"p-test": []},
{"p-test": []}, dry_run=False,
)
assert action == "reject"
t2.assert_called_once()
def test_tier2_uncertain_leads_to_discard_by_default(monkeypatch):
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p-test"}
monkeypatch.setattr(auto_triage, "TIER3_ACTION", "discard")
with mock.patch("auto_triage.triage_one") as t1, \
mock.patch("auto_triage.triage_escalation") as t2, \
mock.patch("auto_triage.api_post") as api_post:
t1.return_value = {
"verdict": "needs_human", "confidence": 0.4, "reason": "unclear",
"domain_tags": [], "valid_until": "", "suggested_project": "",
}
t2.return_value = {
"verdict": "needs_human", "confidence": 0.5, "reason": "still unclear",
"domain_tags": [], "valid_until": "", "suggested_project": "",
}
action, _ = auto_triage.process_candidate(
cand, "http://fake", {"p-test": []}, {"p-test": []},
{"p-test": []}, dry_run=False,
)
assert action == "discard"
# Should have called reject on the API
api_post.assert_called_once()
assert "reject" in api_post.call_args.args[1]
def test_tier2_uncertain_goes_to_human_when_configured(monkeypatch):
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p-test"}
monkeypatch.setattr(auto_triage, "TIER3_ACTION", "human")
with mock.patch("auto_triage.triage_one") as t1, \
mock.patch("auto_triage.triage_escalation") as t2, \
mock.patch("auto_triage.api_post") as api_post:
t1.return_value = {
"verdict": "needs_human", "confidence": 0.4, "reason": "unclear",
"domain_tags": [], "valid_until": "", "suggested_project": "",
}
t2.return_value = {
"verdict": "needs_human", "confidence": 0.5, "reason": "still unclear",
"domain_tags": [], "valid_until": "", "suggested_project": "",
}
action, _ = auto_triage.process_candidate(
cand, "http://fake", {"p-test": []}, {"p-test": []},
{"p-test": []}, dry_run=False,
)
assert action == "human"
# Should NOT have touched the API — leave candidate in queue
api_post.assert_not_called()
def test_dry_run_does_not_call_api():
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p-test"}
with mock.patch("auto_triage.triage_one") as t1, \
mock.patch("auto_triage.api_post") as api_post:
t1.return_value = {
"verdict": "promote", "confidence": 0.9, "reason": "clear",
"domain_tags": [], "valid_until": "", "suggested_project": "",
}
action, _ = auto_triage.process_candidate(
cand, "http://fake", {"p-test": []}, {"p-test": []},
{"p-test": []}, dry_run=True,
)
assert action == "promote"
api_post.assert_not_called()
def test_misattribution_flagged_when_suggestion_differs(capsys):
cand = {"id": "m1", "content": "x", "memory_type": "knowledge", "project": "p04-gigabit"}
with mock.patch("auto_triage.triage_one") as t1, \
mock.patch("auto_triage.api_post"), \
mock.patch("auto_triage._apply_metadata_update"):
t1.return_value = {
"verdict": "promote", "confidence": 0.9, "reason": "clear",
"domain_tags": [], "valid_until": "",
"suggested_project": "p05-interferometer",
}
auto_triage.process_candidate(
cand, "http://fake",
{"p04-gigabit": [], "p05-interferometer": []},
{"p04-gigabit": [], "p05-interferometer": []},
{"p04-gigabit": [], "p05-interferometer": []},
dry_run=True,
)
out = capsys.readouterr().out
assert "misattribution" in out
assert "p05-interferometer" in out