Commit Graph

17 Commits

Author SHA1 Message Date
e840ef4be3 feat: Phase 7D — confidence decay on unreferenced cold memories
Daily job multiplies confidence by 0.97 (~23-day half-life) for
active memories with reference_count=0 AND idle > 30 days. Below
0.3 → auto-supersede with audit. Reversible via reinforcement
(which already bumps confidence back up).

Rationale: stale memories currently rank equal to fresh ones in
retrieval. Without decay, the brain accumulates obsolete facts
that compete with fresh knowledge for context-pack slots. With
decay, memories earn their longevity via reference.

- decay_unreferenced_memories() in service.py (stdlib-only, no cron
  infra needed)
- POST /admin/memory/decay-run endpoint
- Nightly Step F4 in batch-extract.sh
- Exempt: reinforced (refcount > 0), graduated, superseded, invalid
- Audit row per supersession ("decayed below floor, no references"),
  actor="confidence-decay". Per-decay rows skipped (chatty, no
  human value — status change is the meaningful signal).
- Configurable via env: ATOCORE_DECAY_* (exposed through endpoint body)
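
A minimal sketch of the decay rule described above, assuming a hypothetical `Mem` record in place of the real memory row. The constants mirror the commit message (0.97 daily factor, 0.3 floor, 30 idle days); the function names and in-memory shape are illustrative and are not the actual `decay_unreferenced_memories()` signature.

```python
import math
from dataclasses import dataclass

DECAY_FACTOR = 0.97    # daily multiplier, from the commit message
SUPERSEDE_FLOOR = 0.3  # below this the memory is superseded
IDLE_DAYS_MIN = 30     # only memories idle longer than this decay


@dataclass
class Mem:
    # Hypothetical stand-in for the real memory row.
    confidence: float
    reference_count: int
    idle_days: int
    status: str = "active"


def decay_once(mem: Mem) -> Mem:
    """Apply one daily decay step; exempt memories are returned unchanged."""
    # Non-active statuses (graduated, superseded, invalid) and reinforced
    # memories are exempt.
    if mem.status != "active" or mem.reference_count > 0:
        return mem
    if mem.idle_days <= IDLE_DAYS_MIN:
        return mem
    mem.confidence *= DECAY_FACTOR
    if mem.confidence < SUPERSEDE_FLOOR:
        mem.status = "superseded"
    return mem


def half_life_days(factor: float = DECAY_FACTOR) -> float:
    """Number of daily steps needed for confidence to halve."""
    return math.log(0.5) / math.log(factor)
```

`half_life_days()` evaluates to roughly 22.8, so under a daily 0.97 multiplier an unreferenced memory loses half its confidence in a little over three weeks.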

Tests: +13 (basic decay, reinforcement protection, supersede at floor,
audit trail, graduated/superseded exemption, reinforcement reversibility,
threshold tuning, parameter validation, cross-run stacking).
401 → 414.

Next in Phase 7: 7C tag canonicalization (weekly), then 7B contradiction
detection.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 16:50:20 -04:00
028d4c3594 feat: Phase 7A — semantic memory dedup ("sleep cycle" V1)
New table memory_merge_candidates + service functions to cluster
near-duplicate active memories within (project, memory_type) buckets,
draft a unified content via LLM, and merge on human approval. Source
memories become superseded (never deleted); merged memory carries
union of tags, max of confidence, sum of reference_count.
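
The merge semantics above (union of tags, max confidence, summed reference_count, sources superseded rather than deleted) can be sketched as follows. The `Mem` dataclass and `merge_fields` name are hypothetical; the real `merge_memories` service function also persists rows and writes audit entries, which this sketch leaves out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Mem:
    # Hypothetical in-memory stand-in for a memory row.
    content: str
    domain_tags: List[str] = field(default_factory=list)
    confidence: float = 0.5
    reference_count: int = 0
    status: str = "active"


def merge_fields(sources: List[Mem], merged_content: str) -> Mem:
    """Build the merged memory and mark every source as superseded."""
    merged = Mem(
        content=merged_content,
        domain_tags=sorted({t for m in sources for t in m.domain_tags}),
        confidence=max(m.confidence for m in sources),
        reference_count=sum(m.reference_count for m in sources),
    )
    for m in sources:
        m.status = "superseded"  # never deleted, only superseded
    return merged
```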

- schema migration for memory_merge_candidates
- atocore.memory.similarity: cosine + transitive clustering
- atocore.memory._dedup_prompt: stdlib-only LLM prompt preserving every specific
- service: merge_memories / create_merge_candidate / get_merge_candidates / reject_merge_candidate
- scripts/memory_dedup.py: host-side detector (HTTP-only, idempotent)
- 5 API endpoints under /admin/memory/merge-candidates* + /admin/memory/dedup-scan
- triage UI: purple "🔗 Merge Candidates" section + "🔗 Scan for duplicates" bar
- batch-extract.sh Step B3 (0.90 daily, 0.85 Sundays)
- deploy/dalidou/dedup-watcher.sh for UI-triggered scans
- 21 new tests (374 → 395)
- docs/PHASE-7-MEMORY-CONSOLIDATION.md covering 7A-7H roadmap
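
As an illustration of "cosine + transitive clustering", the sketch below links any two vectors whose cosine similarity meets the threshold and then groups linked items with a small union-find. It handles a single (project, memory_type) bucket; the function names and the plain-list vectors are assumptions, not the actual `atocore.memory.similarity` API.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length numeric sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def cluster(vectors, threshold=0.90):
    """Return index groups (size > 1) connected by similarity >= threshold."""
    n = len(vectors)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return [g for g in groups.values() if len(g) > 1]
```

Because the grouping is transitive, items 0 and 2 can end up in one cluster through item 1 even when their direct similarity is below the threshold.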

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 10:30:49 -04:00
02055e8db3 feat: Phase 6 — Living Taxonomy + Universal Capture
Closes two real-use gaps:
1. "APM tool" gap: work done outside Claude Code (desktop, web, phone,
   other machine) was invisible to AtoCore.
2. Project discovery gap: manual JSON-file edits required to promote
   an emerging theme to a first-class project.

B — atocore_remember MCP tool (scripts/atocore_mcp.py):
- New MCP tool for universal capture from any MCP-aware client
  (Claude Desktop, Code, Cursor, Zed, Windsurf, etc.)
- Accepts content (required) + memory_type/project/confidence/
  valid_until/domain_tags (all optional with sensible defaults)
- Creates a candidate memory, goes through the existing 3-tier triage
  (no bypass — the quality gate catches noise)
- Detailed tool description guides Claude on when to invoke: "remember
  this", "save that for later", "don't lose this fact"
- Total tools exposed by MCP server: 14 → 15

C.1 Emerging-concepts detector (scripts/detect_emerging.py):
- Nightly scan of active + candidate memories for:
  * Unregistered project names with ≥3 memory occurrences
  * Top 20 domain_tags by frequency (emerging categories)
  * Active memories with reference_count ≥ 5 + valid_until set
    (reinforced transients — candidates for extension)
- Writes findings to atocore/proposals/* project state entries
- Emits "warning" alert via Phase 4 framework the FIRST time a new
  project crosses the 5-memory alert threshold (avoids spam)
- Configurable via env vars: ATOCORE_EMERGING_PROJECT_MIN (default 3),
  ATOCORE_EMERGING_ALERT_THRESHOLD (default 5), TOP_TAGS_LIMIT (20)
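
The project-name part of the detector can be sketched roughly as below. The thresholds follow the defaults listed above; `find_emerging`, its arguments, and the `already_alerted` set are hypothetical names used only to show the "propose at 3, alert once at 5" behaviour.

```python
from collections import Counter

PROJECT_MIN = 3      # ATOCORE_EMERGING_PROJECT_MIN default
ALERT_THRESHOLD = 5  # ATOCORE_EMERGING_ALERT_THRESHOLD default


def find_emerging(memory_projects, registered, already_alerted):
    """Return (proposals, new_alerts) for unregistered project names.

    memory_projects: one project name per memory (may contain None or "").
    registered: set of registered project ids.
    already_alerted: names that have triggered an alert before.
    """
    counts = Counter(p for p in memory_projects if p and p not in registered)
    proposals = {p: n for p, n in counts.items() if n >= PROJECT_MIN}
    new_alerts = sorted(
        p for p, n in proposals.items()
        if n >= ALERT_THRESHOLD and p not in already_alerted
    )
    return proposals, new_alerts
```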

C.2 Registration surface (src/atocore/api/routes.py + wiki.py):
- POST /admin/projects/register-emerging — one-click register with
  sensible defaults (ingest_roots auto-filled with
  vault:incoming/projects/<id>/ convention). Clears the proposal
  from the dashboard list on success.
- Dashboard /admin/dashboard: new "proposals" section with
  unregistered_projects + emerging_categories + reinforced_transients.
- Wiki homepage: "📋 Emerging" section rendering each unregistered
  project as a card with count + 2 sample memory previews + inline
  "📌 Register as project" button that calls the endpoint via fetch,
  reloads the page on success.

C.3 Transient-to-durable extension
(src/atocore/memory/service.py + API + cron):
- New extend_reinforced_valid_until() function — scans active memories
  with valid_until in the next 30 days and reference_count ≥ 5.
  Extends expiry by 90 days. If reference_count ≥ 10, clears expiry
  entirely (makes permanent). Writes audit rows via the Phase 4
  memory_audit framework with actor="transient-to-durable".
- POST /admin/memory/extend-reinforced — API wrapper for cron.
- Matches the user's intuition: "something transient becomes important
  if you keep coming back to it".
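
The extension rule in C.3 could look roughly like this. It is a sketch under assumptions: `next_valid_until` is a hypothetical pure function, dates are plain `date` objects, and a memory whose expiry has already passed is left alone (the commit message does not say how the real function treats that case).

```python
from datetime import date, timedelta
from typing import Optional

WINDOW_DAYS = 30      # only memories expiring within this window qualify
EXTEND_DAYS = 90      # extension applied at reference_count >= 5
EXTEND_MIN_REFS = 5
PERMANENT_REFS = 10   # at or above this, expiry is cleared


def next_valid_until(
    valid_until: Optional[date], reference_count: int, today: date
) -> Optional[date]:
    """Return the new valid_until; None means permanent."""
    if valid_until is None:
        return None  # already permanent, nothing to do
    if not (today <= valid_until <= today + timedelta(days=WINDOW_DAYS)):
        return valid_until  # far-future or already expired: skip
    if reference_count >= PERMANENT_REFS:
        return None
    if reference_count >= EXTEND_MIN_REFS:
        return valid_until + timedelta(days=EXTEND_DAYS)
    return valid_until
```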

Nightly cron (deploy/dalidou/batch-extract.sh):
- Step F2: detect_emerging.py (after F pipeline summary)
- Step F3: /admin/memory/extend-reinforced (before integrity check)
- Both fail-open; errors don't break the pipeline.

Tests: 366 → 374 (+8 for Phase 6):
- 6 tests for extend_reinforced_valid_until covering:
  extension path, permanent path, skip far-future, skip low-refs,
  skip permanent memories, audit row write
- 2 smoke tests for the detector (imports cleanly, handles empty DB)
- MCP tool changes don't need new tests — the wrapper is pure passthrough

Design decisions documented in plan file:
- atocore_remember deliberately doesn't bypass triage (quality gate)
- Detector is passive (surfaces proposals) not active (auto-registers)
- Sensible ingest-root defaults ("vault:incoming/projects/<id>/")
  so registration is one-click with no file-path thinking
- Extension adds 90 days rather than clearing expiry (gradual
  permanence earned through sustained reinforcement)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 08:08:55 -04:00
07664bd743 feat: Phase 5A — Engineering V1 foundation
First slice of the Engineering V1 sprint. Lays the schema + lifecycle
plumbing so the 10 canonical queries, memory graduation, and conflict
detection can land cleanly on top.

Schema (src/atocore/models/database.py):
- conflicts + conflict_members tables per conflict-model.md (with 5
  indexes on status/project/slot/members)
- memory_audit.entity_kind discriminator — same audit table serves
  both memories ("memory") and entities ("entity"); unified history
  without duplicating infrastructure
- memories.graduated_to_entity_id forward pointer for graduated
  memories (M → E transition preserves the memory as historical
  pointer)

Memory (src/atocore/memory/service.py):
- MEMORY_STATUSES gains "graduated" — memory-entity graduation flow
  ready to wire in Phase 5F

Engineering service (src/atocore/engineering/service.py):
- RELATIONSHIP_TYPES organized into 4 families per ontology-v1.md:
  + Structural: contains, part_of, interfaces_with
  + Intent: satisfies, constrained_by, affected_by_decision,
    based_on_assumption (new), supersedes
  + Validation: analyzed_by, validated_by, supports (new),
    conflicts_with (new), depends_on
  + Provenance: described_by, updated_by_session (new),
    evidenced_by (new), summarized_in (new)
- create_entity + create_relationship now call resolve_project_name()
  on write (canonicalization contract per doc)
- Both accept actor= parameter for audit provenance
- _audit_entity() helper uses shared memory_audit table with
  entity_kind="entity" — one observability layer for everything
- promote_entity / reject_entity_candidate / supersede_entity —
  mirror the memory lifecycle exactly (same pattern, same naming)
- get_entity_audit() reads from the shared table filtered by
  entity_kind

API (src/atocore/api/routes.py):
- POST /entities/{id}/promote (candidate → active)
- POST /entities/{id}/reject (candidate → invalid)
- GET /entities/{id}/audit (full history for one entity)
- POST /entities passes actor="api-http" through

Tests: 317 → 326 (9 new):
- test_entity_project_canonicalization (p04 → p04-gigabit)
- test_promote_entity_candidate_to_active
- test_reject_entity_candidate
- test_promote_active_entity_noop (only candidates promote)
- test_entity_audit_log_captures_lifecycle (before/after snapshots)
- test_new_relationship_types_available (6 new types present)
- test_conflicts_tables_exist
- test_memory_audit_has_entity_kind
- test_graduated_status_accepted

What's next (5B-5I, deferred): entity triage UI tab, core structure
queries, the 3 killer queries, memory graduation script, conflict
detection, MCP + context pack integration. See plan file.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 07:01:28 -04:00
88f2f7c4e1 feat: Phase 4 V1 — Robustness Hardening
Adds the observability + safety layer that turns AtoCore from
"works until something silently breaks" into "every mutation is
traceable, drift is detected, failures raise alerts."

1. Audit log (memory_audit table):
   - New table with id, memory_id, action, actor, before/after JSON,
     note, timestamp; 3 indexes for memory_id/timestamp/action
   - _audit_memory() helper called from every mutation:
     create_memory, update_memory, promote_memory,
     reject_candidate_memory, invalidate_memory, supersede_memory,
     reinforce_memory, auto_promote_reinforced, expire_stale_candidates
   - Action verb auto-selected: promoted/rejected/invalidated/
     superseded/updated based on state transition
   - "actor" threaded through: api-http, human-triage,
     phase10-auto-promote, candidate-expiry, reinforcement, etc.
   - Fail-open: audit write failure logs but never breaks the mutation
   - GET /memory/{id}/audit: full history for one memory
   - GET /admin/audit/recent: last 50 mutations across the system

2. Alerts framework (src/atocore/observability/alerts.py):
   - emit_alert(severity, title, message, context) fans out to:
     - structlog logger (always)
     - ~/atocore-logs/alerts.log append (configurable via
       ATOCORE_ALERT_LOG)
     - project_state atocore/alert/last_{severity} (dashboard surface)
     - ATOCORE_ALERT_WEBHOOK POST if set (auto-detects Discord webhook
       format for nice embeds; generic JSON otherwise)
   - Every sink fail-open — one failure doesn't prevent the others
   - Pipeline alert step in nightly cron: harness < 85% → warning;
     candidate queue > 200 → warning

3. Integrity checks (scripts/integrity_check.py):
   - Nightly scan for drift:
     - Memories → missing source_chunk_id references
     - Duplicate active memories (same type+content+project)
     - project_state → missing projects
     - Orphaned source_chunks (no parent document)
   - Results persisted to atocore/status/integrity_check_result
   - Any finding emits a warning alert
   - Added as Step G in deploy/dalidou/batch-extract.sh nightly cron

4. Dashboard surfaces it all:
   - integrity (findings + details)
   - alerts (last info/warning/critical per severity)
   - recent_audit (last 10 mutations with actor + action + preview)
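
The fail-open fan-out described for the alerts framework (item 2) can be illustrated as below. This is a sketch, not the real `emit_alert`: the `sinks` argument, the returned tuple, and the omission of the `context` parameter are assumptions made to keep the example self-contained. The severity fallback mirrors the behaviour named by `test_emit_alert_invalid_severity_falls_back_to_info`.

```python
VALID_SEVERITIES = ("info", "warning", "critical")


def emit_alert_sketch(severity, title, message, sinks):
    """Send one alert to every sink; a failing sink never blocks the rest.

    sinks: list of (name, callable) pairs, each callable taking
    (severity, title, message).
    """
    if severity not in VALID_SEVERITIES:
        severity = "info"
    delivered = []
    for name, sink in sinks:
        try:
            sink(severity, title, message)
            delivered.append(name)
        except Exception:
            # Fail-open: ignore this sink's error and try the next one.
            continue
    return severity, delivered
```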

Tests: 308 → 317 (9 new):
  - test_audit_create_logs_entry
  - test_audit_promote_logs_entry
  - test_audit_reject_logs_entry
  - test_audit_update_captures_before_after
  - test_audit_reinforce_logs_entry
  - test_recent_audit_returns_cross_memory_entries
  - test_emit_alert_writes_log_file
  - test_emit_alert_invalid_severity_falls_back_to_info
  - test_emit_alert_fails_open_on_log_write_error

Deferred: formal migration framework with rollback (current additive
pattern is fine for V1); memory detail wiki page with audit view
(quick follow-up).

To enable Discord alerts: set ATOCORE_ALERT_WEBHOOK to a Discord
webhook URL in Dalidou's environment. Default = log-only.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 21:54:10 -04:00
bfa7dba4de feat: Phase 3 V1 — Auto-Organization (domain_tags + valid_until)
Adds structural metadata that the LLM triage was already implicitly
reasoning about ("stale snapshot" → reject). Phase 3 captures that
reasoning as fields so it can DRIVE retrieval, not just rejection.

Schema (src/atocore/models/database.py):
- domain_tags TEXT DEFAULT '[]'  JSON array of lowercase topic keywords
- valid_until DATETIME            ISO date; null = permanent
- idx_memories_valid_until index for efficient expiry queries

Memory service (src/atocore/memory/service.py):
- Memory dataclass gains domain_tags + valid_until
- create_memory, update_memory accept/persist both
- _row_to_memory safely reads both (JSON-decode + null handling)
- _normalize_tags helper: lowercase, dedup, strip, cap at 10
- get_memories_for_context filters expired (valid_until < today UTC)
- _rank_memories_for_query adds tag-boost: memories whose domain_tags
  appear as substrings in query text rank higher (tertiary key after
  content-overlap density + absolute overlap, before confidence)
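
A small sketch of the tag normalization and expiry filter listed above. The rules (lowercase, strip, dedup, cap at 10; expired when `valid_until` is before today) come from the commit message; the function names and the order-preserving dedup are assumptions rather than the actual `_normalize_tags` implementation.

```python
from datetime import date


def normalize_tags(tags, limit=10):
    """Lowercase, strip, drop empties and duplicates, keep at most `limit`."""
    result = []
    for tag in tags:
        cleaned = str(tag).strip().lower()
        if cleaned and cleaned not in result:
            result.append(cleaned)
    return result[:limit]


def is_expired(valid_until, today):
    """valid_until is an ISO date string (YYYY-MM-DD) or None for permanent."""
    return valid_until is not None and date.fromisoformat(valid_until) < today
```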

LLM extractor (_llm_prompt.py → llm-0.5.0):
- SYSTEM_PROMPT documents domain_tags (2-5 keywords) + valid_until
  (time-bounded facts get expiry dates; durable facts stay null)
- normalize_candidate_item parses both fields from model output with
  graceful fallback for string/null/missing

LLM triage (scripts/auto_triage.py):
- TRIAGE_SYSTEM_PROMPT documents same two fields
- parse_verdict extracts them from verdict JSON
- On promote: PUT /memory/{id} with tags + valid_until BEFORE
  POST /memory/{id}/promote, so active memories carry them

API (src/atocore/api/routes.py):
- MemoryCreateRequest: adds domain_tags, valid_until
- MemoryUpdateRequest: adds domain_tags, valid_until, memory_type
- GET /memory response exposes domain_tags + valid_until + created_at

Triage UI (src/atocore/engineering/triage_ui.py):
- Renders existing tags as colored badges
- Adds inline text field for tags (comma-separated) + date picker for
  valid_until on every candidate card
- Save&Promote button persists edits via PUT then promotes
- Plain Promote (and Y shortcut) also saves tags/expiry if edited

Wiki (src/atocore/engineering/wiki.py):
- Search now matches memory content OR domain_tags
- Search results render tags as clickable badges linking to
  /wiki/search?q=<tag> for cross-project navigation
- valid_until shown as amber "valid until YYYY-MM-DD" hint

Tests: 303 → 308 (5 new for Phase 3 behavior):
- test_create_memory_with_tags_and_valid_until
- test_create_memory_normalizes_tags
- test_update_memory_sets_tags_and_valid_until
- test_get_memories_for_context_excludes_expired
- test_context_builder_tag_boost_orders_results

Deferred (explicitly): temporal_scope enum, source_refs memory graph,
HDBSCAN clustering, memory detail wiki page, backfill of existing
actives. See docs/MASTER-BRAIN-PLAN.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 21:37:01 -04:00
775960c8c8 feat: "Make It Actually Useful" sprint — observability + Phase 10
Pipeline observability:
- Retrieval harness runs nightly (Step E in batch-extract.sh)
- Pipeline summary persisted to project state after each run
  (pipeline_last_run, pipeline_summary, retrieval_harness_result)
- Dashboard enhanced: interaction total + by_client, pipeline health
  (last_run, hours_since, harness results, triage stats), dynamic
  project list from registry

Phase 10 — reinforcement-based auto-promotion:
- auto_promote_reinforced(): candidates with reference_count >= 3 and
  confidence >= 0.7 auto-graduate to active
- expire_stale_candidates(): candidates unreinforced for 14+ days
  auto-rejected to prevent unbounded queue growth
- Both wired into nightly cron (Step B2)
- Batch script: scripts/auto_promote_reinforced.py (--dry-run support)
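
The two Phase 10 rules can be combined into one decision function for illustration. The thresholds are the ones stated above; the function name, its return values, and the choice to check promotion before expiry are assumptions.

```python
def nightly_candidate_action(reference_count, confidence, days_unreinforced):
    """Decide what the nightly job does with one candidate memory."""
    if reference_count >= 3 and confidence >= 0.7:
        return "promote"  # auto-graduate to active
    if days_unreinforced >= 14:
        return "reject"   # stale candidate, keep the queue bounded
    return "keep"
```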

Knowledge seeding:
- scripts/seed_project_state.py: 26 curated Trusted Project State
  entries across p04-gigabit, p05-interferometer, p06-polisher,
  atomizer-v2, abb-space, atocore (decisions, requirements, facts,
  contacts, milestones)

Tests: 299 → 303 (4 new Phase 10 tests)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 13:59:12 -04:00
8951c624fe fix(R7/R9): overlap-density ranking + project trust-preservation
R7: ranking scorer now uses overlap-density (overlap_count /
memory_token_count) as primary key instead of raw overlap count.
A 5-token memory with 3 overlapping tokens (density 0.6) now beats
a 40-token overview memory with 3 overlapping tokens (density 0.075)
at the same absolute count. Secondary: absolute overlap. Tertiary:
confidence. Targeting p06-firmware-interface harness fixture.
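
The R7 ordering can be reproduced with a three-part sort key. This sketch assumes whitespace tokenization and dict-shaped memories for brevity; the real scorer uses its own stemmed tokenizer.

```python
def rank_by_density(memories, query_tokens):
    """Sort by (overlap density, absolute overlap, confidence), best first."""
    def key(mem):
        tokens = mem["content"].lower().split()
        overlap = len(set(tokens) & query_tokens)
        density = overlap / len(tokens) if tokens else 0.0
        return (density, overlap, mem["confidence"])

    return sorted(memories, key=key, reverse=True)


# 5-token memory with 3 overlapping tokens: density 0.6
short = {"content": "polisher firmware interface uses uart", "confidence": 0.5}
# 40-token memory with the same 3 overlapping tokens: density 0.075
long_ = {
    "content": "polisher firmware interface " + " ".join(f"w{i}" for i in range(37)),
    "confidence": 0.96,
}
ranked = rank_by_density([long_, short], {"polisher", "firmware", "interface"})
```

Here `ranked[0]` is the short memory, even though the long one has the higher confidence, because density is compared first.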

R9: when the LLM extractor returns a project that differs from the
interaction's known project, it now checks the project registry.
If the model's project is a registered canonical ID, trust it. If
not (hallucinated name), fall back to the interaction's project.
Uses load_project_registry() for the check. The host-side script
mirrors this via an API call to GET /projects at startup.

Two new tests: test_parser_keeps_registered_model_project and
test_parser_rejects_hallucinated_project.

Test count: 280 -> 281.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 14:34:33 -04:00
5c69f77b45 fix: cap per-entry memory length at 250 chars in context band
A 530-char program overview memory with confidence 0.96 was filling
the entire 25% project-memory budget at equal overlap score (3 tokens),
beating shorter query-relevant newly-promoted memories (confidence
0.5) on the confidence tiebreaker. The long memory legitimately
scored well, but its length starved every other memory from the band.

Fix: truncate each formatted entry to 250 chars with '...' so at
least 2-3 memories fit the ~700-char available budget. This doesn't
change ranking — the most relevant memory still goes first — but
it ensures the runner-up can also appear.

Harness fixture delta: Day 7 regression pass pending after deploy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 06:34:27 -04:00
37331d53ef fix: rank memories globally before budget walk
Per-type ranking was still starving later types: when a p05 query
matched a 'knowledge' memory best but 'project' came first in the
type order, the project-type candidates filled the budget before
the knowledge-type pool was even ranked.

Collect all candidates into a single pool, dedupe by id, then
rank the whole pool once against the query before walking the
flat budget. Python's stable sort preserves insertion order (which
still reflects the caller's memory_types order) as a natural
tiebreaker when scores are equal.
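
A compact sketch of "pool, dedupe, rank once, then walk the flat budget". The dict-shaped memories, the `score` callable, and the character-length budget are simplifying assumptions; only the overall shape follows the commit message.

```python
def select_memories(pools, score, budget):
    """Rank one deduplicated pool, then take entries while they fit."""
    seen, flat = set(), []
    for pool in pools:                 # pools arrive in the caller's type order
        for mem in pool:
            if mem["id"] not in seen:  # dedupe by id
                seen.add(mem["id"])
                flat.append(mem)

    # sorted() is stable even with reverse=True, so equal scores keep
    # their insertion order.
    ranked = sorted(flat, key=score, reverse=True)

    chosen, used = [], 0
    for mem in ranked:
        size = len(mem["content"])
        if used + size <= budget:
            chosen.append(mem)
            used += size
    return chosen
```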

Regression surfaced by the retrieval eval harness:
p05-vendor-signal still missing 'Zygo' after 5aeeb1c — the vendor
memory was type=knowledge but never reached the ranker because
type=project consumed the budget first.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 12:55:10 -04:00
5aeeb1cad1 feat: query-relevance ordering for memory selection
get_memories_for_context now accepts an optional query string.
When provided, candidate memories are reranked by lexical overlap
with the query (stemmed token intersection, ties broken by
confidence) before the budget walk. Without a query the order is
unchanged — effectively "by confidence desc" as before — so
non-builder callers see no behaviour change.

The fetch limit is raised from 10 to 30 so there's a real pool to
rerank. Token overlap reuses _normalize/_tokenize from
reinforcement.py so ranking and reinforcement matching share the
same notion of distinctive terms.

build_context passes the user_prompt through to both the
identity/preference and project-memory calls. The retrieval harness
regression this change targets:

- p05-vendor-signal FAIL @ 1161645: "Zygo" missing from the pack
  even though an active vendor memory contained it. Root cause:
  higher-confidence p05 memories filled the 25% budget slice
  before the vendor memory ever got a chance. Query-aware ordering
  puts the vendor memory first when the query is about vendors.

New regression test test_project_memories_query_relevance_ordering
locks the behaviour in with two p05 memories and a tight budget.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 12:47:05 -04:00
5913da53c5 fix: flat-budget walk in get_memories_for_context
The per-type slicing (available // len(memory_types)) starved
paragraph-length memories: with 3 types and a 450-char budget,
each type got ~131 chars while real project memories are 300-500
chars each — every entry was skipped and the new Project Memories
band never appeared in the live pack.

Switch to a flat budget pool walked type-by-type in order. Short
identity/preference memories still get first pick when the budget
is tight, but long project memories can now compete for space.

Caught on the first post-deploy probe: 2 active p04 memories
existed but none landed in formatted_context.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:43:41 -04:00
8ea53f4003 feat: fold project-scoped memories into context pack
The retrieval-quality review on 2026-04-11 found that active
project/knowledge/episodic memories never reached the pack: only
Trusted Project State and identity/preference memories were being
assembled. Reinforcement bumped confidence on memories that had
no retrieval outlet, so the reflection loop was half-open.

This change adds a third memory tier between identity/preference
and retrieved chunks:

- PROJECT_MEMORY_BUDGET_RATIO = 0.15
- Memory types: project, knowledge, episodic
- Only populated when a canonical project is in scope — without
  a project hint, project memories stay out (cross-project bleed
  would rot the signal)
- Rendered under a dedicated "--- Project Memories ---" header
  so the LLM can distinguish it from the identity/preference band
- Trim order in _trim_context_to_budget: retrieval → project
  memories → identity/preference → project state (most recently
  added tier drops first when budget is tight)

get_memories_for_context gains header/footer kwargs so the two
memory blocks can be distinguished in a single pack without a
second helper.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:35:40 -04:00
fb6298a9a1 fix(P1+P2): canonicalize project names at every trust boundary
Three findings from codex's review of the previous P1+P2 fix. The
earlier commit (f2372ef) only fixed alias resolution at the context
builder. Codex correctly pointed out that the same fragmentation
applies at every other place a project name crosses a boundary —
project_state writes/reads, interaction capture/listing/filtering,
memory create/queries, and reinforcement's downstream queries. Plus
a real bug in the interaction `since` filter where the storage
format and the documented ISO format don't compare cleanly.

The fix is one helper used at every boundary instead of duplicating
the resolution inline.

New helper: src/atocore/projects/registry.py::resolve_project_name
---------------------------------------------------------------
- Single canonicalization boundary for project names
- Returns the canonical project_id when the input matches any
  registered id or alias
- Returns the input unchanged for empty/None and for unregistered
  names (preserves backwards compat with hand-curated state that
  predates the registry)
- Documented as the contract that every read/write at the trust
  boundary should pass through
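
The contract above can be sketched as a pure function over an in-memory registry. The `REGISTRY` dict, its alias values, and the `_sketch` suffix are hypothetical (the real helper reads the project registry file), and exact-match comparison is an assumption.

```python
# Hypothetical registry: canonical project id -> list of aliases.
REGISTRY = {
    "p05-interferometer": ["p05", "interferometer"],
    "p04-gigabit": ["p04"],
}


def resolve_project_name_sketch(name, registry=REGISTRY):
    """Return the canonical id for a registered id or alias.

    Empty/None input and unregistered names are returned unchanged.
    """
    if not name:
        return name
    for project_id, aliases in registry.items():
        if name == project_id or name in aliases:
            return project_id
    return name
```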

P1 — Trusted Project State endpoints
------------------------------------
src/atocore/context/project_state.py: set_state, get_state, and
invalidate_state now all canonicalize project_name through
resolve_project_name BEFORE looking up or creating the project row.

Before this fix:
- POST /project/state with project="p05" called ensure_project("p05")
  which created a separate row in the projects table
- The state row was attached to that alias project_id
- Later context builds canonicalized "p05" -> "p05-interferometer"
  via the builder fix from f2372ef and never found the state
- Result: trusted state silently fragmented across alias rows

After this fix:
- The alias is resolved to the canonical id at every entry point
- Two captures (one via "p05", one via "p05-interferometer") write
  to the same row
- get_state via either alias or the canonical id finds the same row

Fixes the highest-priority gap codex flagged because Trusted Project
State is supposed to be the most dependable layer in the AtoCore
trust hierarchy.

P2.a — Interaction capture project canonicalization
----------------------------------------------------
src/atocore/interactions/service.py: record_interaction now
canonicalizes project before storing, so interaction.project is
always the canonical id regardless of what the client passed.

Downstream effects:
- reinforce_from_interaction queries memories by interaction.project
  -> previously missed memories stored under canonical id
  -> now consistent because interaction.project IS the canonical id
- the extractor stamps candidates with interaction.project
  -> previously created candidates in alias buckets
  -> now creates candidates in the canonical bucket
- list_interactions(project=alias) was already broken, now fixed by
  canonicalizing the filter input on the read side too

Memory service applied the same fix:
- src/atocore/memory/service.py: create_memory and get_memories
  both canonicalize project through resolve_project_name
- This keeps stored memory.project consistent with the
  reinforcement query path

P2.b — Interaction `since` filter format normalization
------------------------------------------------------
src/atocore/interactions/service.py: new _normalize_since helper.

The bug:
- created_at is stored as 'YYYY-MM-DD HH:MM:SS' (no timezone, UTC by
  convention) so it sorts lexically and compares cleanly with the
  SQLite CURRENT_TIMESTAMP default
- The `since` parameter was documented as ISO 8601 but compared as
  a raw string against the storage format
- The lexically-greater 'T' separator means an ISO timestamp like
  '2026-04-07T12:00:00Z' is GREATER than the storage form
  '2026-04-07 12:00:00' for the same instant
- Result: a client passing ISO `since` got an empty result for any
  row from the same day, even though those rows existed and were
  technically "after" the cutoff in real-world time

The fix:
- _normalize_since accepts ISO 8601 with T, optional Z suffix,
  optional fractional seconds, optional +HH:MM offsets
- Uses datetime.fromisoformat for parsing (Python 3.11+)
- Converts to UTC and reformats as the storage format before the
  SQL comparison
- The bare storage format still works (backwards compat path is a
  regex match that returns the input unchanged)
- Unparseable input is returned as-is so the comparison degrades
  gracefully (rows just don't match) instead of raising and
  breaking the listing endpoint
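
The behaviour described for `_normalize_since` can be sketched as follows. This is an approximation, not the committed helper: the name is hypothetical, and the trailing `Z` is rewritten to `+00:00` before parsing so the example also runs on Python versions older than 3.11.

```python
import re
from datetime import datetime, timezone

_STORAGE_RE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}$")


def normalize_since_sketch(since):
    """Convert an ISO 8601 timestamp to 'YYYY-MM-DD HH:MM:SS' in UTC."""
    if not since or _STORAGE_RE.match(since):
        return since  # storage format passes through unchanged
    text = since[:-1] + "+00:00" if since.endswith("Z") else since
    try:
        parsed = datetime.fromisoformat(text)
    except ValueError:
        return since  # unparseable input degrades to a raw comparison
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # naive means UTC
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
```

The original bug is visible with a plain string comparison: `"2026-04-07T12:00:00Z" > "2026-04-07 12:00:00"` is true because `T` sorts after a space.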

builder.py refactor
-------------------
The previous P1 fix had inline canonicalization. Now it uses the
shared helper for consistency:
- import changed from get_registered_project to resolve_project_name
- the inline lookup is replaced with a single helper call
- the comment block now points at representation-authority.md for
  the canonicalization contract

New shared test fixture: tests/conftest.py::project_registry
------------------------------------------------------------
- Standardizes the registry-setup pattern that was duplicated
  across test_context_builder.py, test_project_state.py,
  test_interactions.py, and test_reinforcement.py
- Returns a callable that takes (project_id, [aliases]) tuples
  and writes them into a temp registry file with the env var
  pointed at it and config.settings reloaded
- Used by all 12 new regression tests in this commit

Tests (12 new, all green on first run)
--------------------------------------
test_project_state.py:
- test_set_state_canonicalizes_alias: write via alias, read via
  every alias and the canonical id, verify same row id
- test_get_state_canonicalizes_alias_after_canonical_write
- test_invalidate_state_canonicalizes_alias
- test_unregistered_project_state_still_works (backwards compat)

test_interactions.py:
- test_record_interaction_canonicalizes_project
- test_list_interactions_canonicalizes_project_filter
- test_list_interactions_since_accepts_iso_with_t_separator
- test_list_interactions_since_accepts_z_suffix
- test_list_interactions_since_accepts_offset
- test_list_interactions_since_storage_format_still_works

test_reinforcement.py:
- test_reinforcement_works_when_capture_uses_alias (end-to-end:
  capture under alias, seed memory under canonical, verify
  reinforcement matches)
- test_get_memories_filter_by_alias

Full suite: 174 passing (was 162), 1 warning. The +12 is the
new regression tests; no existing tests regressed.

What's still NOT canonicalized (and why)
----------------------------------------
- _rank_chunks's secondary substring boost in builder.py — the
  retriever already does the right thing via its own
  _project_match_boost which calls get_registered_project. The
  redundant secondary boost still uses the raw hint but it's a
  multiplicative factor on top of correct retrieval, not a
  filter, so it can't drop relevant chunks. Tracked as a future
  cleanup but not a P1.
- update_memory's project field (you can't change a memory's
  project after creation in the API anyway).
- The retriever's project_hint parameter on direct /query calls
  — same reasoning as the builder boost, plus the retriever's
  own get_registered_project call already handles aliases there.
2026-04-07 08:29:33 -04:00
2704997256 feat(phase9-B): reinforce active memories from captured interactions
Phase 9 Commit B from the agreed plan. With Commit A capturing what
AtoCore fed to the LLM and what came back, this commit closes the
weakest part of the loop: when a memory is actually referenced in a
response, its confidence should drift up, and stale memories that
nobody ever mentions should stay where they are.

This is reinforcement only — nothing is promoted into trusted state
and no candidates are created. Extraction is Commit C.

Schema (additive migration):
- memories.last_referenced_at DATETIME      (null by default)
- memories.reference_count    INTEGER DEFAULT 0
- idx_memories_last_referenced on last_referenced_at
- memories.status now accepts the new "candidate" value so Commit C
  has the status slot to land on. Existing active/superseded/invalid
  rows are untouched.

New module: src/atocore/memory/reinforcement.py
- reinforce_from_interaction(interaction): scans the interaction's
  response + response_summary for echoes of active memories and
  bumps confidence / reference_count for each match
- matching is intentionally simple and explainable:
  * normalize both sides (lowercase, collapse whitespace)
  * require >= 12 chars of memory content to match
  * compare the leading 80-char window of each memory
- the candidate pool is project-scoped memories for the interaction's
  project + global identity + preference memories, deduplicated
- candidates and invalidated memories are NEVER reinforced; only
  active memories move

Memory service changes:
- MEMORY_STATUSES = ["candidate", "active", "superseded", "invalid"]
- create_memory(status="candidate"|"active"|...) with per-status
  duplicate scoping so a candidate and an active with identical text
  can legitimately coexist during review
- get_memories(status=...) explicit override of the legacy active_only
  flag; callers can now list the review queue cleanly
- update_memory accepts any valid status including "candidate"
- reinforce_memory(id, delta): low-level primitive that bumps
  confidence (capped at 1.0), increments reference_count, and sets
  last_referenced_at. Only active memories; returns (applied, old, new)
- promote_memory / reject_candidate_memory helpers prepping Commit C
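
The low-level reinforcement primitive can be sketched on a plain dict. The `(applied, old, new)` return shape, the 1.0 cap, and the active-only rule come from the list above; the dict representation, the `_sketch` name, and raising `ValueError` on a negative delta are assumptions.

```python
from datetime import datetime, timezone


def reinforce_sketch(mem, delta):
    """Bump confidence on an active memory; return (applied, old, new)."""
    if delta < 0:
        raise ValueError("delta must be non-negative")
    old = mem["confidence"]
    if mem["status"] != "active":
        return (False, old, old)  # candidates and inactive rows never move
    new = min(1.0, old + delta)   # cap at 1.0
    mem["confidence"] = new
    mem["reference_count"] += 1
    mem["last_referenced_at"] = datetime.now(timezone.utc)
    return (True, old, new)
```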

Interactions service:
- record_interaction(reinforce=True) runs reinforce_from_interaction
  automatically when the interaction has response content. Reinforcement
  errors are logged but never raised back to the caller, so capture
  itself is never blocked by a flaky downstream.
- circular import between interactions service and memory.reinforcement
  avoided by lazy import inside the function

API:
- POST /interactions now accepts a reinforce bool field (default true)
- POST /interactions/{id}/reinforce runs reinforcement on an existing
  captured interaction — useful for backfilling or for retrying after
  a transient error in the automatic pass
- response lists which memory ids were reinforced with
  old / new confidence for audit

Tests (17 new, all green):
- reinforce_memory bumps, caps at 1.0, accumulates reference_count
- reinforce_memory rejects candidates and missing ids
- reinforce_memory rejects negative delta
- reinforce_from_interaction matches active memory
- reinforce_from_interaction ignores candidates and inactive
- reinforce_from_interaction requires minimum content length
- reinforce_from_interaction handles empty response cleanly
- reinforce_from_interaction normalizes casing and whitespace
- reinforce_from_interaction deduplicates across memory buckets
- record_interaction auto-reinforces by default
- record_interaction reinforce=False skips the pass
- record_interaction handles empty response
- POST /interactions/{id}/reinforce runs against stored interaction
- POST /interactions/{id}/reinforce returns 404 for missing id
- POST /interactions accepts reinforce=false

Full suite: 135 passing (was 118).

Trust model unchanged:
- reinforcement only moves confidence within the existing active set
- the candidate lifecycle is declared but only Commit C will actually
  create candidate memories
- trusted project state is never touched by reinforcement

Next: Commit C adds the rule-based extractor that produces candidate
memories from captured interactions plus the promote/reject review
queue endpoints.
2026-04-06 21:18:38 -04:00
b0889b3925 Stabilize core correctness and sync project plan state
2026-04-05 17:53:23 -04:00
b48f0c95ab feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory

Context builder integration (trust precedence per Master Plan):
  1. Trusted Project State (highest trust, 20% budget)
  2. Identity + Preference memories (10% budget)
  3. Retrieved chunks (remaining budget)

Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00