feat(phase9-B): reinforce active memories from captured interactions

Phase 9 Commit B from the agreed plan. With Commit A capturing what
AtoCore fed to the LLM and what came back, this commit closes the
weakest part of the loop: when a memory is actually referenced in a
response, its confidence should drift up, and stale memories that
nobody ever mentions should stay where they are.

This is reinforcement only: nothing is promoted into trusted state
and no candidates are created. Extraction is Commit C.

Schema (additive migration):
- memories.last_referenced_at DATETIME (null by default)
- memories.reference_count INTEGER DEFAULT 0
- idx_memories_last_referenced on last_referenced_at
- memories.status now accepts the new "candidate" value so Commit C
  has the status slot to land on. Existing active/superseded/invalid
  rows are untouched.
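
A minimal sketch of what "additive migration" means here, assuming a plain sqlite3 connection. The helper names `column_exists` and `apply_reinforcement_migration` and the toy `memories` table are illustrative, not AtoCore's actual code; only the column and index names come from this commit.

```python
import sqlite3


def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def apply_reinforcement_migration(conn: sqlite3.Connection) -> None:
    # Only add what is missing, so re-running the migration is harmless.
    if not column_exists(conn, "memories", "last_referenced_at"):
        conn.execute("ALTER TABLE memories ADD COLUMN last_referenced_at DATETIME")
    if not column_exists(conn, "memories", "reference_count"):
        conn.execute("ALTER TABLE memories ADD COLUMN reference_count INTEGER DEFAULT 0")
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_memories_last_referenced "
        "ON memories(last_referenced_at)"
    )


# Toy pre-migration table (hypothetical, much smaller than the real schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (id TEXT PRIMARY KEY, status TEXT DEFAULT 'active')")
conn.execute("INSERT INTO memories (id) VALUES ('m1')")

apply_reinforcement_migration(conn)
apply_reinforcement_migration(conn)  # second run changes nothing

row = conn.execute(
    "SELECT last_referenced_at, reference_count FROM memories WHERE id = 'm1'"
).fetchone()
```

Existing rows pick up NULL for last_referenced_at and the default 0 for reference_count, which is why rows written before the migration need no backfill.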

New module: src/atocore/memory/reinforcement.py
- reinforce_from_interaction(interaction): scans the interaction's
  response + response_summary for echoes of active memories and
  bumps confidence / reference_count for each match
- matching is intentionally simple and explainable:
  * normalize both sides (lowercase, collapse whitespace)
  * require >= 12 chars of memory content to match
  * compare the leading 80-char window of each memory
- the candidate pool is project-scoped memories for the interaction's
  project + global identity + preference memories, deduplicated
- candidates and invalidated memories are NEVER reinforced; only
  active memories move
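
The matching rules above can be sketched as a standalone function. This is an illustration of the three rules only; the names `normalize`, `memory_matches`, `MIN_LEN` and `WINDOW` are placeholders chosen for the example, and the sample strings are made up.

```python
import re

MIN_LEN = 12  # shorter normalized memories are never matched
WINDOW = 80   # only the leading window of a memory is compared


def normalize(text: str) -> str:
    # Lowercase and collapse every run of whitespace to a single space.
    return re.sub(r"\s+", " ", text.lower()).strip()


def memory_matches(memory_content: str, response: str) -> bool:
    mem = normalize(memory_content)
    if len(mem) < MIN_LEN:
        return False
    # Substring test of the leading window against the normalized response.
    return mem[:WINDOW] in normalize(response)


hit = memory_matches(
    "Prefers   Python\nover Ruby",
    "As noted, the user prefers python over ruby for scripts.",
)
too_short = memory_matches("uses SI", "Everything uses SI here.")
```

Comparing only the leading window means a long memory still matches when the response quotes its opening and paraphrases the rest. The cost is that two memories sharing the same first 80 characters are indistinguishable to this matcher.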

Memory service changes:
- MEMORY_STATUSES = ["candidate", "active", "superseded", "invalid"]
- create_memory(status="candidate"|"active"|...) with per-status
  duplicate scoping so a candidate and an active with identical text
  can legitimately coexist during review
- get_memories(status=...) explicit override of the legacy active_only
  flag; callers can now list the review queue cleanly
- update_memory accepts any valid status including "candidate"
- reinforce_memory(id, delta): low-level primitive that bumps
  confidence (capped at 1.0), increments reference_count, and sets
  last_referenced_at. Only active memories; returns (applied, old, new)
- promote_memory / reject_candidate_memory helpers prepping Commit C
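
To make the reinforce_memory contract concrete, here is an in-memory sketch of the same rules (active only, cap at 1.0, one reference per call, negative delta rejected). The `Mem` dataclass and the `reinforce` function are hypothetical stand-ins; the real primitive works against the database row.

```python
from dataclasses import dataclass


@dataclass
class Mem:
    status: str
    confidence: float
    reference_count: int = 0


def reinforce(mem: Mem, delta: float = 0.02) -> tuple[bool, float, float]:
    if delta < 0:
        raise ValueError("delta must be non-negative for reinforcement")
    if mem.status != "active":
        # Candidates and invalidated memories are left untouched.
        return False, 0.0, 0.0
    old = mem.confidence
    mem.confidence = min(1.0, old + delta)  # cap at 1.0
    mem.reference_count += 1                # one per call, regardless of delta
    return True, old, mem.confidence


active = Mem(status="active", confidence=0.98)
capped = reinforce(active, delta=0.05)

candidate = Mem(status="candidate", confidence=0.5)
skipped = reinforce(candidate)
```

Returning (applied, old, new) rather than raising for non-active memories lets the caller skip a memory whose status changed after the pool was fetched, without special error handling.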

Interactions service:
- record_interaction(reinforce=True) runs reinforce_from_interaction
  automatically when the interaction has response content. Reinforcement
  errors are logged but never raised back to the caller, so capture
  itself is never blocked by a flaky downstream.
- circular import between the interactions service and
  memory.reinforcement is avoided by a lazy import inside the function
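
The "log but never raise" behaviour can be sketched as follows. `capture` and its `reinforcer` parameter are hypothetical and exist only to keep the example self-contained; in the commit the reinforcement function is imported lazily inside record_interaction rather than passed in.

```python
import logging

log = logging.getLogger("interactions")


def capture(prompt: str, response: str = "", reinforce: bool = True, reinforcer=None) -> dict:
    interaction = {"prompt": prompt, "response": response}
    if reinforce and response and reinforcer is not None:
        try:
            # In the real service this is where the lazy import happens,
            # which avoids a circular import at module load time.
            reinforcer(interaction)
        except Exception as exc:
            # Reinforcement is best-effort: log the failure, keep the capture.
            log.error("reinforcement_failed_on_capture: %s", exc)
    return interaction


def flaky(_interaction: dict) -> None:
    raise RuntimeError("downstream unavailable")


calls: list[dict] = []
saved = capture("what units?", "Everything uses SI.", reinforcer=flaky)
skipped = capture("what units?", "Everything uses SI.", reinforce=False, reinforcer=calls.append)
```

A failed pass leaves the interaction stored but unreinforced. That gap is what the POST /interactions/{id}/reinforce endpoint described below is for.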

API:
- POST /interactions now accepts a reinforce bool field (default true)
- POST /interactions/{id}/reinforce runs reinforcement on an existing
  captured interaction; useful for backfilling or for retrying after
  a transient error in the automatic pass
- response lists which memory ids were reinforced with
  old / new confidence for audit

Tests (17 new, all green):
- reinforce_memory bumps, caps at 1.0, accumulates reference_count
- reinforce_memory rejects candidates and missing ids
- reinforce_memory rejects negative delta
- reinforce_from_interaction matches active memory
- reinforce_from_interaction ignores candidates and inactive
- reinforce_from_interaction requires minimum content length
- reinforce_from_interaction handles empty response cleanly
- reinforce_from_interaction normalizes casing and whitespace
- reinforce_from_interaction deduplicates across memory buckets
- record_interaction auto-reinforces by default
- record_interaction reinforce=False skips the pass
- record_interaction handles empty response
- POST /interactions/{id}/reinforce runs against stored interaction
- POST /interactions/{id}/reinforce returns 404 for missing id
- POST /interactions accepts reinforce=false

Full suite: 135 passing (was 118).

Trust model unchanged:
- reinforcement only moves confidence within the existing active set
- the candidate lifecycle is declared but only Commit C will actually
  create candidate memories
- trusted project state is never touched by reinforcement

Next: Commit C adds the rule-based extractor that produces candidate
memories from captured interactions plus the promote/reject review
queue endpoints.
@@ -30,11 +30,15 @@ from atocore.interactions.service import (
     list_interactions,
     record_interaction,
 )
+from atocore.memory.reinforcement import reinforce_from_interaction
 from atocore.memory.service import (
+    MEMORY_STATUSES,
     MEMORY_TYPES,
     create_memory,
     get_memories,
     invalidate_memory,
+    promote_memory,
+    reject_candidate_memory,
     supersede_memory,
     update_memory,
 )
@@ -461,6 +465,7 @@ class InteractionRecordRequest(BaseModel):
     memories_used: list[str] = []
     chunks_used: list[str] = []
     context_pack: dict | None = None
+    reinforce: bool = True


 @router.post("/interactions")
@@ -468,9 +473,11 @@ def api_record_interaction(req: InteractionRecordRequest) -> dict:
     """Capture one interaction (prompt + response + what was used).

     This is the foundation of the AtoCore reflection loop. It records
-    what the system fed to an LLM and what came back, but does not
-    promote anything into trusted state. Phase 9 Commit B/C will layer
-    reinforcement and extraction on top of this audit trail.
+    what the system fed to an LLM and what came back. If ``reinforce``
+    is true (default) and there is response content, the Phase 9
+    Commit B reinforcement pass runs automatically, bumping the
+    confidence of any active memory echoed in the response. Nothing is
+    ever promoted into trusted state automatically.
     """
     try:
         interaction = record_interaction(
@@ -483,6 +490,7 @@ def api_record_interaction(req: InteractionRecordRequest) -> dict:
             memories_used=req.memories_used,
             chunks_used=req.chunks_used,
             context_pack=req.context_pack,
+            reinforce=req.reinforce,
         )
     except ValueError as e:
         raise HTTPException(status_code=400, detail=str(e))
@@ -493,6 +501,33 @@ def api_record_interaction(req: InteractionRecordRequest) -> dict:
     }


+@router.post("/interactions/{interaction_id}/reinforce")
+def api_reinforce_interaction(interaction_id: str) -> dict:
+    """Run the reinforcement pass on an already-captured interaction.
+
+    Useful for backfilling reinforcement over historical interactions,
+    or for retrying after a transient failure in the automatic pass
+    that runs inside ``POST /interactions``.
+    """
+    interaction = get_interaction(interaction_id)
+    if interaction is None:
+        raise HTTPException(status_code=404, detail=f"Interaction not found: {interaction_id}")
+    results = reinforce_from_interaction(interaction)
+    return {
+        "interaction_id": interaction_id,
+        "reinforced_count": len(results),
+        "reinforced": [
+            {
+                "memory_id": r.memory_id,
+                "memory_type": r.memory_type,
+                "old_confidence": round(r.old_confidence, 4),
+                "new_confidence": round(r.new_confidence, 4),
+            }
+            for r in results
+        ],
+    }
+
+
 @router.get("/interactions")
 def api_list_interactions(
     project: str | None = None,
@@ -53,12 +53,21 @@ def record_interaction(
     memories_used: list[str] | None = None,
     chunks_used: list[str] | None = None,
     context_pack: dict | None = None,
+    reinforce: bool = True,
 ) -> Interaction:
     """Persist a single interaction to the audit trail.

     The only required field is ``prompt`` so this can be called even when
     the caller is in the middle of a partial turn (for example to record
     that AtoCore was queried even before the LLM response is back).
+
+    When ``reinforce`` is True (default) and the interaction has response
+    content, the Phase 9 Commit B reinforcement pass runs automatically
+    against the active memory set. This bumps the confidence of any
+    memory whose content is echoed in the response. Set ``reinforce`` to
+    False to capture the interaction without touching memory confidence,
+    which is useful for backfill and for tests that want to isolate the
+    audit trail from the reinforcement loop.
     """
     if not prompt or not prompt.strip():
         raise ValueError("Interaction prompt must be non-empty")
@@ -109,7 +118,7 @@ def record_interaction(
         response_chars=len(response),
     )

-    return Interaction(
+    interaction = Interaction(
         id=interaction_id,
         prompt=prompt,
         response=response,
@@ -123,6 +132,23 @@ def record_interaction(
         created_at=now,
     )
+
+    if reinforce and (response or response_summary):
+        # Import inside the function to avoid a circular import between
+        # the interactions service and the reinforcement module which
+        # depends on it.
+        try:
+            from atocore.memory.reinforcement import reinforce_from_interaction
+
+            reinforce_from_interaction(interaction)
+        except Exception as exc:  # pragma: no cover - reinforcement must never block capture
+            log.error(
+                "reinforcement_failed_on_capture",
+                interaction_id=interaction_id,
+                error=str(exc),
+            )
+
+    return interaction


 def list_interactions(
     project: str | None = None,
155
src/atocore/memory/reinforcement.py
Normal file
@@ -0,0 +1,155 @@
+"""Reinforce active memories from captured interactions (Phase 9 Commit B).
+
+When an interaction is captured with a non-empty response, this module
+scans the response text against currently-active memories and bumps the
+confidence of any memory whose content appears in the response. The
+intent is to surface a weak signal that the LLM actually relied on a
+given memory, without ever promoting anything new into trusted state.
+
+Design notes
+------------
+- Matching is intentionally simple and explainable:
+    * normalize both sides (lowercase, collapse whitespace)
+    * require the normalized memory content (or its first 80 chars) to
+      appear as a substring in the normalized response
+- Candidates and invalidated memories are NEVER considered — reinforcement
+  must not revive history.
+- Reinforcement is capped at 1.0 and monotonically non-decreasing.
+- The function is idempotent with respect to a single call but will
+  accumulate confidence across multiple calls; that is intentional — if
+  the same memory is mentioned in 10 separate conversations it is, by
+  definition, more confidently useful.
+"""
+
+from __future__ import annotations
+
+import re
+from dataclasses import dataclass
+
+from atocore.interactions.service import Interaction
+from atocore.memory.service import (
+    Memory,
+    get_memories,
+    reinforce_memory,
+)
+from atocore.observability.logger import get_logger
+
+log = get_logger("reinforcement")
+
+# Minimum memory content length to consider for matching. Too-short
+# memories (e.g. "use SI") would otherwise fire on almost every response
+# and generate noise. 12 characters is long enough to require real
+# semantic content but short enough to match one-liner identity
+# memories like "prefers Python".
+_MIN_MEMORY_CONTENT_LENGTH = 12
+
+# When a memory's content is very long, match on its leading window only
+# to avoid punishing small paraphrases further into the body.
+_MATCH_WINDOW_CHARS = 80
+
+DEFAULT_CONFIDENCE_DELTA = 0.02
+
+
+@dataclass
+class ReinforcementResult:
+    memory_id: str
+    memory_type: str
+    old_confidence: float
+    new_confidence: float
+
+
+def reinforce_from_interaction(
+    interaction: Interaction,
+    confidence_delta: float = DEFAULT_CONFIDENCE_DELTA,
+) -> list[ReinforcementResult]:
+    """Scan an interaction's response for active-memory mentions.
+
+    Returns the list of memories that were reinforced. An empty list is
+    returned if the interaction has no response content, if no memories
+    match, or if the interaction has no project scope and the global
+    active set is empty.
+    """
+    response_text = _combined_response_text(interaction)
+    if not response_text:
+        return []
+
+    normalized_response = _normalize(response_text)
+    if not normalized_response:
+        return []
+
+    # Fetch the candidate pool of active memories. We cast a wide net
+    # here: project-scoped memories for the interaction's project first,
+    # plus identity and preference memories which are global by nature.
+    candidate_pool: list[Memory] = []
+    seen_ids: set[str] = set()
+
+    def _add_batch(batch: list[Memory]) -> None:
+        for mem in batch:
+            if mem.id in seen_ids:
+                continue
+            seen_ids.add(mem.id)
+            candidate_pool.append(mem)
+
+    if interaction.project:
+        _add_batch(get_memories(project=interaction.project, active_only=True, limit=200))
+    _add_batch(get_memories(memory_type="identity", active_only=True, limit=50))
+    _add_batch(get_memories(memory_type="preference", active_only=True, limit=50))
+
+    reinforced: list[ReinforcementResult] = []
+    for memory in candidate_pool:
+        if not _memory_matches(memory.content, normalized_response):
+            continue
+        applied, old_conf, new_conf = reinforce_memory(
+            memory.id, confidence_delta=confidence_delta
+        )
+        if not applied:
+            continue
+        reinforced.append(
+            ReinforcementResult(
+                memory_id=memory.id,
+                memory_type=memory.memory_type,
+                old_confidence=old_conf,
+                new_confidence=new_conf,
+            )
+        )
+
+    if reinforced:
+        log.info(
+            "reinforcement_applied",
+            interaction_id=interaction.id,
+            project=interaction.project,
+            reinforced_count=len(reinforced),
+        )
+    return reinforced
+
+
+def _combined_response_text(interaction: Interaction) -> str:
+    """Pick the best available response text from an interaction."""
+    parts: list[str] = []
+    if interaction.response:
+        parts.append(interaction.response)
+    if interaction.response_summary:
+        parts.append(interaction.response_summary)
+    return "\n".join(parts).strip()
+
+
+def _normalize(text: str) -> str:
+    """Lowercase and collapse whitespace for substring matching."""
+    if not text:
+        return ""
+    lowered = text.lower()
+    # Collapse any run of whitespace (including newlines and tabs) to
+    # a single space so multi-line responses match single-line memories.
+    collapsed = re.sub(r"\s+", " ", lowered)
+    return collapsed.strip()
+
+
+def _memory_matches(memory_content: str, normalized_response: str) -> bool:
+    """Return True if the memory content appears in the response."""
+    if not memory_content:
+        return False
+    normalized_memory = _normalize(memory_content)
+    if len(normalized_memory) < _MIN_MEMORY_CONTENT_LENGTH:
+        return False
+    window = normalized_memory[:_MATCH_WINDOW_CHARS]
+    return window in normalized_response
@@ -10,7 +10,16 @@ Memory types (per Master Plan):

 Memories have:
 - confidence (0.0–1.0): how certain we are
-- status (active/superseded/invalid): lifecycle state
+- status: lifecycle state, one of MEMORY_STATUSES
+    * candidate: extracted from an interaction, awaiting human review
+                 (Phase 9 Commit C). Candidates are NEVER included in
+                 context packs.
+    * active: promoted/curated, visible to retrieval and context
+    * superseded: replaced by a newer entry
+    * invalid: rejected / error-corrected
+- last_referenced_at / reference_count: reinforcement signal
+  (Phase 9 Commit B). Bumped whenever a captured interaction's
+  response content echoes this memory.
 - optional link to source chunk: traceability
 """

@@ -32,6 +41,13 @@ MEMORY_TYPES = [
     "adaptation",
 ]
+
+MEMORY_STATUSES = [
+    "candidate",
+    "active",
+    "superseded",
+    "invalid",
+]


 @dataclass
 class Memory:
@@ -44,6 +60,8 @@ class Memory:
     status: str
     created_at: str
     updated_at: str
+    last_referenced_at: str = ""
+    reference_count: int = 0


 def create_memory(
@@ -52,35 +70,57 @@ def create_memory(
     project: str = "",
     source_chunk_id: str = "",
     confidence: float = 1.0,
+    status: str = "active",
 ) -> Memory:
-    """Create a new memory entry."""
+    """Create a new memory entry.
+
+    ``status`` defaults to ``active`` for backward compatibility. Pass
+    ``candidate`` when the memory is being proposed by the Phase 9 Commit C
+    extractor and still needs human review before it can influence context.
+    """
     if memory_type not in MEMORY_TYPES:
         raise ValueError(f"Invalid memory type '{memory_type}'. Must be one of: {MEMORY_TYPES}")
+    if status not in MEMORY_STATUSES:
+        raise ValueError(f"Invalid status '{status}'. Must be one of: {MEMORY_STATUSES}")
     _validate_confidence(confidence)

     memory_id = str(uuid.uuid4())
     now = datetime.now(timezone.utc).isoformat()

-    # Check for duplicate content within same type+project
+    # Check for duplicate content within the same type+project at the same status.
+    # Scoping by status keeps active curation separate from the candidate
+    # review queue: a candidate and an active memory with identical text can
+    # legitimately coexist if the candidate is a fresh extraction of something
+    # already curated.
     with get_connection() as conn:
         existing = conn.execute(
             "SELECT id FROM memories "
-            "WHERE memory_type = ? AND content = ? AND project = ? AND status = 'active'",
-            (memory_type, content, project),
+            "WHERE memory_type = ? AND content = ? AND project = ? AND status = ?",
+            (memory_type, content, project, status),
         ).fetchone()
         if existing:
-            log.info("memory_duplicate_skipped", memory_type=memory_type, content_preview=content[:80])
+            log.info(
+                "memory_duplicate_skipped",
+                memory_type=memory_type,
+                status=status,
+                content_preview=content[:80],
+            )
             return _row_to_memory(
                 conn.execute("SELECT * FROM memories WHERE id = ?", (existing["id"],)).fetchone()
             )

         conn.execute(
             "INSERT INTO memories (id, memory_type, content, project, source_chunk_id, confidence, status) "
-            "VALUES (?, ?, ?, ?, ?, ?, 'active')",
-            (memory_id, memory_type, content, project, source_chunk_id or None, confidence),
+            "VALUES (?, ?, ?, ?, ?, ?, ?)",
+            (memory_id, memory_type, content, project, source_chunk_id or None, confidence, status),
         )

-    log.info("memory_created", memory_type=memory_type, content_preview=content[:80])
+    log.info(
+        "memory_created",
+        memory_type=memory_type,
+        status=status,
+        content_preview=content[:80],
+    )

     return Memory(
         id=memory_id,
@@ -89,9 +129,11 @@ def create_memory(
         project=project,
         source_chunk_id=source_chunk_id,
         confidence=confidence,
-        status="active",
+        status=status,
         created_at=now,
         updated_at=now,
+        last_referenced_at="",
+        reference_count=0,
     )


@@ -101,8 +143,18 @@ def get_memories(
     active_only: bool = True,
     min_confidence: float = 0.0,
     limit: int = 50,
+    status: str | None = None,
 ) -> list[Memory]:
-    """Retrieve memories, optionally filtered."""
+    """Retrieve memories, optionally filtered.
+
+    When ``status`` is provided explicitly, it takes precedence over
+    ``active_only`` so callers can list the candidate review queue via
+    ``get_memories(status='candidate')``. When ``status`` is omitted the
+    legacy ``active_only`` behaviour still applies.
+    """
+    if status is not None and status not in MEMORY_STATUSES:
+        raise ValueError(f"Invalid status '{status}'. Must be one of: {MEMORY_STATUSES}")
+
     query = "SELECT * FROM memories WHERE 1=1"
     params: list = []

@@ -112,7 +164,10 @@ def get_memories(
     if project is not None:
         query += " AND project = ?"
         params.append(project)
-    if active_only:
+    if status is not None:
+        query += " AND status = ?"
+        params.append(status)
+    elif active_only:
         query += " AND status = 'active'"
     if min_confidence > 0:
         query += " AND confidence >= ?"
@@ -163,8 +218,8 @@ def update_memory(
         updates.append("confidence = ?")
         params.append(confidence)
     if status is not None:
-        if status not in ("active", "superseded", "invalid"):
-            raise ValueError(f"Invalid status '{status}'")
+        if status not in MEMORY_STATUSES:
+            raise ValueError(f"Invalid status '{status}'. Must be one of: {MEMORY_STATUSES}")
         updates.append("status = ?")
         params.append(status)

@@ -195,6 +250,83 @@ def supersede_memory(memory_id: str) -> bool:
     return update_memory(memory_id, status="superseded")


+def promote_memory(memory_id: str) -> bool:
+    """Promote a candidate memory to active (Phase 9 Commit C review queue).
+
+    Returns False if the memory does not exist or is not currently a
+    candidate. Raises ValueError only if the promotion would create a
+    duplicate active memory (delegates to update_memory's existing check).
+    """
+    with get_connection() as conn:
+        row = conn.execute(
+            "SELECT status FROM memories WHERE id = ?", (memory_id,)
+        ).fetchone()
+    if row is None:
+        return False
+    if row["status"] != "candidate":
+        return False
+    return update_memory(memory_id, status="active")
+
+
+def reject_candidate_memory(memory_id: str) -> bool:
+    """Reject a candidate memory (Phase 9 Commit C).
+
+    Sets the candidate's status to ``invalid`` so it drops out of the
+    review queue without polluting the active set. Returns False if the
+    memory does not exist or is not currently a candidate.
+    """
+    with get_connection() as conn:
+        row = conn.execute(
+            "SELECT status FROM memories WHERE id = ?", (memory_id,)
+        ).fetchone()
+    if row is None:
+        return False
+    if row["status"] != "candidate":
+        return False
+    return update_memory(memory_id, status="invalid")
+
+
+def reinforce_memory(
+    memory_id: str,
+    confidence_delta: float = 0.02,
+) -> tuple[bool, float, float]:
+    """Bump a memory's confidence and reference count (Phase 9 Commit B).
+
+    Returns a 3-tuple ``(applied, old_confidence, new_confidence)``.
+    ``applied`` is False if the memory does not exist or is not in the
+    ``active`` state — reinforcement only touches live memories so the
+    candidate queue and invalidated history are never silently revived.
+
+    Confidence is capped at 1.0. last_referenced_at is set to the current
+    UTC time in SQLite-comparable format. reference_count is incremented
+    by one per call (not per delta amount).
+    """
+    if confidence_delta < 0:
+        raise ValueError("confidence_delta must be non-negative for reinforcement")
+    now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
+    with get_connection() as conn:
+        row = conn.execute(
+            "SELECT confidence, status FROM memories WHERE id = ?", (memory_id,)
+        ).fetchone()
+        if row is None or row["status"] != "active":
+            return False, 0.0, 0.0
+        old_confidence = float(row["confidence"])
+        new_confidence = min(1.0, old_confidence + confidence_delta)
+        conn.execute(
+            "UPDATE memories SET confidence = ?, last_referenced_at = ?, "
+            "reference_count = COALESCE(reference_count, 0) + 1 "
+            "WHERE id = ?",
+            (new_confidence, now, memory_id),
+        )
+    log.info(
+        "memory_reinforced",
+        memory_id=memory_id,
+        old_confidence=round(old_confidence, 4),
+        new_confidence=round(new_confidence, 4),
+    )
+    return True, old_confidence, new_confidence
+
+
 def get_memories_for_context(
     memory_types: list[str] | None = None,
     project: str | None = None,
@@ -251,6 +383,9 @@ def get_memories_for_context(

 def _row_to_memory(row) -> Memory:
     """Convert a DB row to Memory dataclass."""
+    keys = row.keys() if hasattr(row, "keys") else []
+    last_ref = row["last_referenced_at"] if "last_referenced_at" in keys else None
+    ref_count = row["reference_count"] if "reference_count" in keys else 0
     return Memory(
         id=row["id"],
         memory_type=row["memory_type"],
@@ -261,6 +396,8 @@ def _row_to_memory(row) -> Memory:
         status=row["status"],
         created_at=row["created_at"],
         updated_at=row["updated_at"],
+        last_referenced_at=last_ref or "",
+        reference_count=int(ref_count or 0),
     )


@@ -41,6 +41,8 @@ CREATE TABLE IF NOT EXISTS memories (
     source_chunk_id TEXT REFERENCES source_chunks(id),
     confidence REAL DEFAULT 1.0,
     status TEXT DEFAULT 'active',
+    last_referenced_at DATETIME,
+    reference_count INTEGER DEFAULT 0,
     created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
     updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
 );
@@ -99,6 +101,20 @@ def _apply_migrations(conn: sqlite3.Connection) -> None:
         conn.execute("ALTER TABLE memories ADD COLUMN project TEXT DEFAULT ''")
     conn.execute("CREATE INDEX IF NOT EXISTS idx_memories_project ON memories(project)")

+    # Phase 9 Commit B: reinforcement columns.
+    # last_referenced_at records when a memory was most recently referenced
+    # in a captured interaction; reference_count is a monotonically
+    # increasing counter bumped on every reference. Together they let
+    # Reflection (Commit C) and decay (deferred) reason about which
+    # memories are actually being used versus which have gone cold.
+    if not _column_exists(conn, "memories", "last_referenced_at"):
+        conn.execute("ALTER TABLE memories ADD COLUMN last_referenced_at DATETIME")
+    if not _column_exists(conn, "memories", "reference_count"):
+        conn.execute("ALTER TABLE memories ADD COLUMN reference_count INTEGER DEFAULT 0")
+    conn.execute(
+        "CREATE INDEX IF NOT EXISTS idx_memories_last_referenced ON memories(last_referenced_at)"
+    )
+
     # Phase 9 Commit A: capture loop columns on the interactions table.
     # The original schema only carried prompt + project_id + a context_pack
     # JSON blob. To make interactions a real audit trail of what AtoCore fed
316
tests/test_reinforcement.py
Normal file
@@ -0,0 +1,316 @@
|
||||
"""Tests for Phase 9 Commit B reinforcement loop."""
|
||||
|
||||
from fastapi.testclient import TestClient
|
||||
|
||||
from atocore.interactions.service import record_interaction
|
||||
from atocore.main import app
|
||||
from atocore.memory.reinforcement import (
|
||||
DEFAULT_CONFIDENCE_DELTA,
|
||||
reinforce_from_interaction,
|
||||
)
|
||||
from atocore.memory.service import (
|
||||
create_memory,
|
||||
get_memories,
|
||||
reinforce_memory,
|
||||
)
|
||||
from atocore.models.database import init_db
|
||||
|
||||
|
||||
# --- service-level tests: reinforce_memory primitive ----------------------
|
||||
|
||||
|
||||
def test_reinforce_memory_bumps_active_memory(tmp_data_dir):
|
||||
init_db()
|
||||
mem = create_memory(
|
||||
memory_type="preference",
|
||||
content="prefers Python over Ruby for scripting",
|
||||
confidence=0.6,
|
||||
)
|
||||
|
||||
applied, old_conf, new_conf = reinforce_memory(mem.id, confidence_delta=0.05)
|
||||
|
||||
assert applied is True
|
||||
assert old_conf == 0.6
|
||||
assert abs(new_conf - 0.65) < 1e-9
|
||||
|
||||
reloaded = get_memories(memory_type="preference", limit=10)
|
||||
match = next((m for m in reloaded if m.id == mem.id), None)
|
||||
assert match is not None
|
||||
assert abs(match.confidence - 0.65) < 1e-9
|
||||
assert match.reference_count == 1
|
||||
assert match.last_referenced_at # non-empty
|
||||
|
||||
|
||||
def test_reinforce_memory_caps_at_one(tmp_data_dir):
|
||||
init_db()
|
||||
mem = create_memory(
|
||||
memory_type="identity",
|
||||
content="is a mechanical engineer who runs AtoCore",
|
||||
confidence=0.98,
|
||||
)
|
||||
applied, old_conf, new_conf = reinforce_memory(mem.id, confidence_delta=0.05)
|
||||
assert applied is True
|
||||
assert old_conf == 0.98
|
||||
assert new_conf == 1.0
|
||||
|
||||
|
||||
def test_reinforce_memory_rejects_candidate_and_missing(tmp_data_dir):
|
||||
init_db()
|
||||
candidate = create_memory(
|
||||
memory_type="knowledge",
|
||||
content="the lateral support uses GF-PTFE pads",
|
||||
confidence=0.5,
|
||||
status="candidate",
|
||||
)
|
||||
applied, _, _ = reinforce_memory(candidate.id)
|
||||
assert applied is False
|
||||
|
||||
missing, _, _ = reinforce_memory("no-such-id")
|
||||
assert missing is False
|
||||
|
||||
|
||||
def test_reinforce_memory_accumulates_reference_count(tmp_data_dir):
|
||||
init_db()
|
||||
mem = create_memory(
|
||||
memory_type="preference",
|
||||
content="likes concise code reviews that focus on the why",
|
||||
confidence=0.5,
|
||||
)
|
||||
for _ in range(5):
|
||||
reinforce_memory(mem.id, confidence_delta=0.01)
|
||||
reloaded = [m for m in get_memories(memory_type="preference", limit=10) if m.id == mem.id][0]
|
||||
assert reloaded.reference_count == 5
|
||||
assert abs(reloaded.confidence - 0.55) < 1e-9
|
||||
|
||||
|
||||
def test_reinforce_memory_rejects_negative_delta(tmp_data_dir):
|
||||
init_db()
|
||||
mem = create_memory(memory_type="preference", content="always uses structured logging")
|
||||
import pytest
|
||||
|
||||
with pytest.raises(ValueError):
|
||||
reinforce_memory(mem.id, confidence_delta=-0.01)
|
||||
|
||||
|
||||
# --- reinforce_from_interaction: the high-level matcher -------------------
|
||||
|
||||
|
||||
def _make_interaction(**overrides):
|
||||
return record_interaction(
|
||||
prompt=overrides.get("prompt", "ignored"),
|
||||
response=overrides.get("response", ""),
|
||||
response_summary=overrides.get("response_summary", ""),
|
||||
project=overrides.get("project", ""),
|
||||
client=overrides.get("client", ""),
|
||||
session_id=overrides.get("session_id", ""),
|
||||
reinforce=False, # the matcher is tested in isolation here
|
||||
)
|
||||
|
||||
|
||||
def test_reinforce_from_interaction_matches_active_memory(tmp_data_dir):
    init_db()
    mem = create_memory(
        memory_type="preference",
        content="prefers tests that describe behaviour in plain English",
        confidence=0.5,
    )
    interaction = _make_interaction(
        response=(
            "I wrote the new tests in plain English, since the project "
            "prefers tests that describe behaviour in plain English and "
            "that makes them easier to review."
        ),
    )
    results = reinforce_from_interaction(interaction)
    assert len(results) == 1
    assert results[0].memory_id == mem.id
    assert abs(results[0].new_confidence - (0.5 + DEFAULT_CONFIDENCE_DELTA)) < 1e-9


def test_reinforce_from_interaction_ignores_candidates_and_inactive(tmp_data_dir):
    init_db()
    candidate = create_memory(
        memory_type="knowledge",
        content="the polisher frame uses kinematic mounts for thermal isolation",
        confidence=0.6,
        status="candidate",
    )
    interaction = _make_interaction(
        response=(
            "The polisher frame uses kinematic mounts for thermal isolation, "
            "which matches the note in the design log."
        ),
    )
    results = reinforce_from_interaction(interaction)
    # Candidate should NOT be reinforced even though the text matches
    assert all(r.memory_id != candidate.id for r in results)


def test_reinforce_from_interaction_requires_min_content_length(tmp_data_dir):
    init_db()
    short_mem = create_memory(
        memory_type="preference",
        content="uses SI",  # below min length
    )
    interaction = _make_interaction(
        response="Everything uses SI for this project, consistently.",
    )
    results = reinforce_from_interaction(interaction)
    assert all(r.memory_id != short_mem.id for r in results)


def test_reinforce_from_interaction_empty_response_is_noop(tmp_data_dir):
    init_db()
    create_memory(memory_type="preference", content="prefers structured logging")
    interaction = _make_interaction(response="", response_summary="")
    results = reinforce_from_interaction(interaction)
    assert results == []


def test_reinforce_from_interaction_is_normalized(tmp_data_dir):
    init_db()
    mem = create_memory(
        memory_type="preference",
        content="Prefers concise commit messages focused on the why",
    )
    # Response has different casing and extra whitespace — should still match
    interaction = _make_interaction(
        response=(
            "The commit message was short on purpose — the user\n\n"
            "PREFERS concise commit MESSAGES focused on the WHY, "
            "so I stuck to one sentence."
        ),
    )
    results = reinforce_from_interaction(interaction)
    assert any(r.memory_id == mem.id for r in results)


def test_reinforce_from_interaction_deduplicates_across_buckets(tmp_data_dir):
    init_db()
    mem = create_memory(
        memory_type="identity",
        content="mechanical engineer who runs AtoCore",
        project="",
    )
    # This memory belongs to the identity bucket AND would also be
    # fetched via the project query if project matched. We want to ensure
    # we don't double-reinforce.
    interaction = _make_interaction(
        response="The mechanical engineer who runs AtoCore asked for this patch.",
        project="p05-interferometer",
    )
    results = reinforce_from_interaction(interaction)
    assert sum(1 for r in results if r.memory_id == mem.id) == 1


# --- automatic reinforcement on record_interaction ------------------------


def test_record_interaction_auto_reinforces_by_default(tmp_data_dir):
    init_db()
    mem = create_memory(
        memory_type="preference",
        content="writes tests before hooking features into API routes",
        confidence=0.5,
    )
    record_interaction(
        prompt="please add the /foo endpoint with tests",
        response=(
            "Wrote tests first, then added the /foo endpoint. The project "
            "writes tests before hooking features into API routes so the "
            "order is enforced."
        ),
    )
    reloaded = [m for m in get_memories(memory_type="preference", limit=20) if m.id == mem.id][0]
    assert reloaded.confidence > 0.5
    assert reloaded.reference_count == 1


def test_record_interaction_reinforce_false_skips_pass(tmp_data_dir):
    init_db()
    mem = create_memory(
        memory_type="preference",
        content="always includes a rollback note in risky commits",
        confidence=0.5,
    )
    record_interaction(
        prompt="ignored",
        response=(
            "I always includes a rollback note in risky commits, so the "
            "commit message mentions how to revert if needed."
        ),
        reinforce=False,
    )
    reloaded = [m for m in get_memories(memory_type="preference", limit=20) if m.id == mem.id][0]
    assert reloaded.confidence == 0.5
    assert reloaded.reference_count == 0


def test_record_interaction_auto_reinforce_handles_empty_response(tmp_data_dir):
    init_db()
    mem = create_memory(memory_type="preference", content="prefers descriptive branch names")
    # No response text — reinforcement should be a silent no-op
    record_interaction(prompt="hi", response="", response_summary="")
    reloaded = [m for m in get_memories(memory_type="preference", limit=20) if m.id == mem.id][0]
    assert reloaded.reference_count == 0


# --- API level ------------------------------------------------------------


def test_api_reinforce_endpoint_runs_against_stored_interaction(tmp_data_dir):
    init_db()
    mem = create_memory(
        memory_type="preference",
        content="rejects commits that touch credential files",
        confidence=0.5,
    )
    interaction = record_interaction(
        prompt="review commit",
        response=(
            "I rejects commits that touch credential files on sight. "
            "That commit touched ~/.git-credentials, so it was blocked."
        ),
        reinforce=False,  # leave untouched for the endpoint to do it
    )

    client = TestClient(app)
    response = client.post(f"/interactions/{interaction.id}/reinforce")
    assert response.status_code == 200
    body = response.json()
    assert body["interaction_id"] == interaction.id
    assert body["reinforced_count"] >= 1
    ids = [r["memory_id"] for r in body["reinforced"]]
    assert mem.id in ids


def test_api_reinforce_endpoint_returns_404_for_missing(tmp_data_dir):
    init_db()
    client = TestClient(app)
    response = client.post("/interactions/does-not-exist/reinforce")
    assert response.status_code == 404


def test_api_post_interactions_accepts_reinforce_false(tmp_data_dir):
    init_db()
    mem = create_memory(
        memory_type="preference",
        content="writes runbooks alongside new services",
        confidence=0.5,
    )
    client = TestClient(app)
    response = client.post(
        "/interactions",
        json={
            "prompt": "review",
            "response": (
                "I writes runbooks alongside new services and the diff includes "
                "one under docs/runbooks/."
            ),
            "reinforce": False,
        },
    )
    assert response.status_code == 200
    reloaded = [m for m in get_memories(memory_type="preference", limit=20) if m.id == mem.id][0]
    assert reloaded.confidence == 0.5
    assert reloaded.reference_count == 0