feat: length-aware reinforcement + batch triage CLI + off-host backup

- Reinforcement matcher now handles paragraph-length memories via a dual-mode threshold: short memories keep the 70% overlap rule, long memories (>15 stems) require 12 absolute overlaps AND 35% fraction so organic paraphrase can still reinforce. Diagnosis: every active memory stayed at reference_count=0 because 40-token project summaries never hit 70% overlap on real responses.
- scripts/atocore_client.py gains batch-extract (fan out /interactions/{id}/extract over recent interactions) and triage (interactive promote/reject walker for the candidate queue), matching the Phase 9 reflection-loop review flow without pulling extraction into the capture hot path.
- deploy/dalidou/cron-backup.sh adds an optional off-host rsync step gated on ATOCORE_BACKUP_RSYNC, fail-open when the target is offline so a laptop being off at 03:00 UTC never reds the local backup.
- docs/next-steps.md records the retrieval-quality sweep: project state surfaces, chunks are on-topic but broad, active memories never reach the pack (reflection loop has no retrieval outlet yet).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
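
The dual-mode threshold in the first bullet can be sketched roughly as follows. This is an illustrative reading of the commit message, not the matcher's actual code: the function names, the regex-based stand-in for stemming, the choice of the memory's stem count as the fraction's denominator, and the inclusive (`>=`) comparisons are all assumptions.

```python
import re

# Thresholds taken from the commit message.
SHORT_FRACTION = 0.70    # short memories: 70% overlap rule
LONG_STEM_CUTOFF = 15    # memories with more than 15 stems count as long
LONG_MIN_OVERLAP = 12    # long memories: at least 12 overlapping stems...
LONG_FRACTION = 0.35     # ...AND at least 35% of the memory's stems


def stems(text: str) -> set[str]:
    """Hypothetical stand-in for real stemming: unique lowercase tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def should_reinforce(memory_text: str, response_text: str) -> bool:
    """Return True if the response overlaps the memory enough to reinforce it."""
    memory_stems = stems(memory_text)
    if not memory_stems:
        return False
    overlap = len(memory_stems & stems(response_text))
    # Assumption: the fraction is measured against the memory, not the response.
    fraction = overlap / len(memory_stems)
    if len(memory_stems) > LONG_STEM_CUTOFF:
        return overlap >= LONG_MIN_OVERLAP and fraction >= LONG_FRACTION
    return fraction >= SHORT_FRACTION


if __name__ == "__main__":
    long_memory = " ".join(f"w{i}" for i in range(40))
    paraphrase = " ".join(f"w{i}" for i in range(16))
    print(should_reinforce(long_memory, paraphrase))
```

Under this reading, a 40-stem summary with 16 matching stems (40%) is reinforced, whereas a flat 70% rule would have required 28 matches.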
2026-04-11 11:20:03 -04:00
parent c5bad996a7
commit 9366ba7879
5 changed files with 294 additions and 6 deletions
--- a/docs/next-steps.md
+++ b/docs/next-steps.md
@@ -159,6 +159,44 @@ The next batch is successful if:
 - project ingestion remains controlled rather than noisy
 - the canonical Dalidou instance stays stable

+## Retrieval Quality Review — 2026-04-11
+
+First sweep with real project-hinted queries on Dalidou. Used
+`POST /context/build` against p04, p05, p06 with representative
+questions and inspected `formatted_context`.
+
+Findings:
+
+- **Trusted Project State is surfacing correctly.** The DECISION and
+  REQUIREMENT categories appear at the top of the pack and include
+  the expected key facts (e.g. p04 "Option B conical-back mirror
+  architecture"). This is the strongest signal in the pack today.
+- **Chunk retrieval is on-topic but broad.** Top chunks for
+  the p04 architecture query are PDR intro, CAD assembly overview,
+  and the index — all on the right project but none of them directly
+  answer the "why was Option B chosen" question. The authoritative
+  answer sits in Project State, not in the chunks.
+- **Active memories are NOT reaching the pack.** The context builder
+  surfaces Trusted Project State and retrieved chunks but does not
+  include the 21 active project/knowledge memories. Reinforcement
+  (Phase 9 Commit B) bumps memory confidence without the memory ever
+  being read back into a prompt — the reflection loop has no outlet
+  on the retrieval side. This is a design gap, not a bug: it needs a
+  decision on whether memories should feed into context assembly,
+  and if so at what trust level (below project_state, above chunks).
+- **Cross-project bleed is low.** The p04 query did pull one p05
+  chunk (CGH_Design_Input_for_AOM) as the bottom hit but the top-4
+  were all p04.
+
+Proposed follow-ups (not yet scheduled):
+
+1. Decide whether memories should be folded into `formatted_context`
+   and under what section header. Candidate: a "--- Project Memories ---"
+   band between Trusted Project State and Retrieved Context, filtered
+   to active memories for the target project plus identity/preference.
+2. Re-run the same three queries after any builder change and compare
+   `formatted_context` diffs.
+
 ## Long-Run Goal

 The long-run target is: