feat: length-aware reinforcement + batch triage CLI + off-host backup

- Reinforcement matcher now handles paragraph-length memories via a dual-mode threshold: short memories keep the 70% overlap rule, long memories (>15 stems) require 12 absolute overlaps AND 35% fraction so organic paraphrase can still reinforce. Diagnosis: every active memory stayed at reference_count=0 because 40-token project summaries never hit 70% overlap on real responses. - scripts/atocore_client.py gains batch-extract (fan out /interactions/{id}/extract over recent interactions) and triage (interactive promote/reject walker for the candidate queue), matching the Phase 9 reflection-loop review flow without pulling extraction into the capture hot path. - deploy/dalidou/cron-backup.sh adds an optional off-host rsync step gated on ATOCORE_BACKUP_RSYNC, fail-open when the target is offline so a laptop being off at 03:00 UTC never reds the local backup. - docs/next-steps.md records the retrieval-quality sweep: project state surfaces, chunks are on-topic but broad, active memories never reach the pack (reflection loop has no retrieval outlet yet). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:20:03 -04:00
parent c5bad996a7
commit 9366ba7879
5 changed files with 294 additions and 6 deletions
--- a/src/atocore/memory/reinforcement.py
+++ b/src/atocore/memory/reinforcement.py
@@ -51,6 +51,15 @@ _STOP_WORDS: frozenset[str] = frozenset({
 })
 _MATCH_THRESHOLD = 0.70

+# Long memories can't realistically hit 70% overlap through organic
+# paraphrase — a 40-token memory would need 28 stemmed tokens echoed
+# verbatim. Above this token count the matcher switches to an absolute
+# overlap floor plus a softer fraction floor so paragraph-length memories
+# still reinforce when the response genuinely uses them.
+_LONG_MEMORY_TOKEN_COUNT = 15
+_LONG_MODE_MIN_OVERLAP = 12
+_LONG_MODE_MIN_FRACTION = 0.35
+
 DEFAULT_CONFIDENCE_DELTA = 0.02


@@ -188,9 +197,14 @@ def _tokenize(text: str) -> set[str]:
 def _memory_matches(memory_content: str, normalized_response: str) -> bool:
    """Return True if enough of the memory's tokens appear in the response.

-    Uses token-overlap: tokenize both sides (lowercase, stem, drop stop
-    words), then check whether >= 70 % of the memory's content tokens
-    appear in the response token set.
+    Dual-mode token overlap:
+    - Short memories (<= _LONG_MEMORY_TOKEN_COUNT stems): require
+      >= 70 % of memory tokens echoed.
+    - Long memories (paragraphs): require an absolute floor of
+      _LONG_MODE_MIN_OVERLAP distinct stems echoed AND a softer
+      fraction of _LONG_MODE_MIN_FRACTION, so organic paraphrase
+      of a real project memory can reinforce without the response
+      quoting the paragraph verbatim.
    """
    if not memory_content:
        return False
@@ -202,4 +216,10 @@ def _memory_matches(memory_content: str, normalized_response: str) -> bool:
        return False
    response_tokens = _tokenize(normalized_response)
    overlap = memory_tokens & response_tokens
-    return len(overlap) / len(memory_tokens) >= _MATCH_THRESHOLD
+    fraction = len(overlap) / len(memory_tokens)
+    if len(memory_tokens) <= _LONG_MEMORY_TOKEN_COUNT:
+        return fraction >= _MATCH_THRESHOLD
+    return (
+        len(overlap) >= _LONG_MODE_MIN_OVERLAP
+        and fraction >= _LONG_MODE_MIN_FRACTION
+    )