feat: length-aware reinforcement + batch triage CLI + off-host backup

- Reinforcement matcher now handles paragraph-length memories via a
  dual-mode threshold: short memories keep the 70% overlap rule,
  long memories (>15 stems) require 12 absolute overlaps AND 35%
  fraction so organic paraphrase can still reinforce. Diagnosis:
  every active memory stayed at reference_count=0 because 40-token
  project summaries never hit 70% overlap on real responses.
- scripts/atocore_client.py gains batch-extract (fan out
  /interactions/{id}/extract over recent interactions) and triage
  (interactive promote/reject walker for the candidate queue),
  matching the Phase 9 reflection-loop review flow without pulling
  extraction into the capture hot path.
- deploy/dalidou/cron-backup.sh adds an optional off-host rsync step
  gated on ATOCORE_BACKUP_RSYNC, fail-open when the target is offline
  so a laptop being off at 03:00 UTC never reds the local backup.
- docs/next-steps.md records the retrieval-quality sweep: project
  state surfaces, chunks are on-topic but broad, active memories
  never reach the pack (reflection loop has no retrieval outlet yet).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-04-11 11:20:03 -04:00
parent c5bad996a7
commit 9366ba7879
5 changed files with 294 additions and 6 deletions

View File

@@ -51,6 +51,15 @@ _STOP_WORDS: frozenset[str] = frozenset({
})
_MATCH_THRESHOLD = 0.70
# Long memories can't realistically hit 70% overlap through organic
# paraphrase — a 40-token memory would need 28 stemmed tokens echoed
# verbatim. Above this token count the matcher switches to an absolute
# overlap floor plus a softer fraction floor so paragraph-length memories
# still reinforce when the response genuinely uses them.
_LONG_MEMORY_TOKEN_COUNT = 15
_LONG_MODE_MIN_OVERLAP = 12
_LONG_MODE_MIN_FRACTION = 0.35
DEFAULT_CONFIDENCE_DELTA = 0.02
@@ -188,9 +197,14 @@ def _tokenize(text: str) -> set[str]:
def _memory_matches(memory_content: str, normalized_response: str) -> bool:
"""Return True if enough of the memory's tokens appear in the response.
Uses token-overlap: tokenize both sides (lowercase, stem, drop stop
words), then check whether >= 70 % of the memory's content tokens
appear in the response token set.
Dual-mode token overlap:
- Short memories (<= _LONG_MEMORY_TOKEN_COUNT stems): require
>= 70 % of memory tokens echoed.
- Long memories (paragraphs): require an absolute floor of
_LONG_MODE_MIN_OVERLAP distinct stems echoed AND a softer
fraction of _LONG_MODE_MIN_FRACTION, so organic paraphrase
of a real project memory can reinforce without the response
quoting the paragraph verbatim.
"""
if not memory_content:
return False
@@ -202,4 +216,10 @@ def _memory_matches(memory_content: str, normalized_response: str) -> bool:
return False
response_tokens = _tokenize(normalized_response)
overlap = memory_tokens & response_tokens
return len(overlap) / len(memory_tokens) >= _MATCH_THRESHOLD
fraction = len(overlap) / len(memory_tokens)
if len(memory_tokens) <= _LONG_MEMORY_TOKEN_COUNT:
return fraction >= _MATCH_THRESHOLD
return (
len(overlap) >= _LONG_MODE_MIN_OVERLAP
and fraction >= _LONG_MODE_MIN_FRACTION
)