ATOCore

Author	SHA1	Message	Date
Anto01	e840ef4be3	feat: Phase 7D — confidence decay on unreferenced cold memories Daily job multiplies confidence by 0.97 (confidence halves after about 23 daily runs; counting the 30-day idle window, a little under 2 months of total idle time) for active memories with reference_count=0 AND idle > 30 days. Below 0.3 → auto-supersede with audit. Reversible via reinforcement (which already bumps confidence back up). Rationale: stale memories currently rank equal to fresh ones in retrieval. Without decay, the brain accumulates obsolete facts that compete with fresh knowledge for context-pack slots. With decay, memories earn their longevity via reference. - decay_unreferenced_memories() in service.py (stdlib-only, no cron infra needed) - POST /admin/memory/decay-run endpoint - Nightly Step F4 in batch-extract.sh - Exempt: reinforced (refcount > 0), graduated, superseded, invalid - Audit row per supersession ("decayed below floor, no references"), actor="confidence-decay". Per-decay rows skipped (chatty, no human value — status change is the meaningful signal). - Configurable via env: ATOCORE_DECAY_* (exposed through endpoint body) Tests: +13 (basic decay, reinforcement protection, supersede at floor, audit trail, graduated/superseded exemption, reinforcement reversibility, threshold tuning, parameter validation, cross-run stacking). 401 → 414. Next in Phase 7: 7C tag canonicalization (weekly), then 7B contradiction detection. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 16:50:20 -04:00
Anto01	028d4c3594	feat: Phase 7A — semantic memory dedup ("sleep cycle" V1) New table memory_merge_candidates + service functions to cluster near-duplicate active memories within (project, memory_type) buckets, draft a unified content via LLM, and merge on human approval. Source memories become superseded (never deleted); merged memory carries union of tags, max of confidence, sum of reference_count. - schema migration for memory_merge_candidates - atocore.memory.similarity: cosine + transitive clustering - atocore.memory._dedup_prompt: stdlib-only LLM prompt preserving every specific - service: merge_memories / create_merge_candidate / get_merge_candidates / reject_merge_candidate - scripts/memory_dedup.py: host-side detector (HTTP-only, idempotent) - 5 API endpoints under /admin/memory/merge-candidates* + /admin/memory/dedup-scan - triage UI: purple "🔗 Merge Candidates" section + "🔗 Scan for duplicates" bar - batch-extract.sh Step B3 (0.90 daily, 0.85 Sundays) - deploy/dalidou/dedup-watcher.sh for UI-triggered scans - 21 new tests (374 → 395) - docs/PHASE-7-MEMORY-CONSOLIDATION.md covering 7A-7H roadmap Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 10:30:49 -04:00
Anto01	02055e8db3	feat: Phase 6 — Living Taxonomy + Universal Capture Closes two real-use gaps: 1. "APM tool" gap: work done outside Claude Code (desktop, web, phone, other machine) was invisible to AtoCore. 2. Project discovery gap: manual JSON-file edits required to promote an emerging theme to a first-class project. B — atocore_remember MCP tool (scripts/atocore_mcp.py): - New MCP tool for universal capture from any MCP-aware client (Claude Desktop, Code, Cursor, Zed, Windsurf, etc.) - Accepts content (required) + memory_type/project/confidence/ valid_until/domain_tags (all optional with sensible defaults) - Creates a candidate memory, goes through the existing 3-tier triage (no bypass — the quality gate catches noise) - Detailed tool description guides Claude on when to invoke: "remember this", "save that for later", "don't lose this fact" - Total tools exposed by MCP server: 14 → 15 C.1 Emerging-concepts detector (scripts/detect_emerging.py): - Nightly scan of active + candidate memories for: * Unregistered project names with ≥3 memory occurrences * Top 20 domain_tags by frequency (emerging categories) * Active memories with reference_count ≥ 5 + valid_until set (reinforced transients — candidates for extension) - Writes findings to atocore/proposals/* project state entries - Emits "warning" alert via Phase 4 framework the FIRST time a new project crosses the 5-memory alert threshold (avoids spam) - Configurable via env vars: ATOCORE_EMERGING_PROJECT_MIN (default 3), ATOCORE_EMERGING_ALERT_THRESHOLD (default 5), TOP_TAGS_LIMIT (20) C.2 Registration surface (src/atocore/api/routes.py + wiki.py): - POST /admin/projects/register-emerging — one-click register with sensible defaults (ingest_roots auto-filled with vault:incoming/projects/<id>/ convention). Clears the proposal from the dashboard list on success. - Dashboard /admin/dashboard: new "proposals" section with unregistered_projects + emerging_categories + reinforced_transients. 
- Wiki homepage: "📋 Emerging" section rendering each unregistered project as a card with count + 2 sample memory previews + inline "📌 Register as project" button that calls the endpoint via fetch, reloads the page on success. C.3 Transient-to-durable extension (src/atocore/memory/service.py + API + cron): - New extend_reinforced_valid_until() function — scans active memories with valid_until in the next 30 days and reference_count ≥ 5. Extends expiry by 90 days. If reference_count ≥ 10, clears expiry entirely (makes permanent). Writes audit rows via the Phase 4 memory_audit framework with actor="transient-to-durable". - POST /admin/memory/extend-reinforced — API wrapper for cron. - Matches the user's intuition: "something transient becomes important if you keep coming back to it". Nightly cron (deploy/dalidou/batch-extract.sh): - Step F2: detect_emerging.py (after F pipeline summary) - Step F3: /admin/memory/extend-reinforced (before integrity check) - Both fail-open; errors don't break the pipeline. Tests: 366 → 374 (+8 for Phase 6): - 6 tests for extend_reinforced_valid_until covering: extension path, permanent path, skip far-future, skip low-refs, skip permanent memories, audit row write - 2 smoke tests for the detector (imports cleanly, handles empty DB) - MCP tool changes don't need new tests — the wrapper is pure passthrough Design decisions documented in plan file: - atocore_remember deliberately doesn't bypass triage (quality gate) - Detector is passive (surfaces proposals) not active (auto-registers) - Sensible ingest-root defaults ("vault:incoming/projects/<id>/") so registration is one-click with no file-path thinking - Extension adds 90 days rather than clearing expiry (gradual permanence earned through sustained reinforcement) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-18 08:08:55 -04:00
Anto01	07664bd743	feat: Phase 5A — Engineering V1 foundation First slice of the Engineering V1 sprint. Lays the schema + lifecycle plumbing so the 10 canonical queries, memory graduation, and conflict detection can land cleanly on top. Schema (src/atocore/models/database.py): - conflicts + conflict_members tables per conflict-model.md (with 5 indexes on status/project/slot/members) - memory_audit.entity_kind discriminator — same audit table serves both memories ("memory") and entities ("entity"); unified history without duplicating infrastructure - memories.graduated_to_entity_id forward pointer for graduated memories (M → E transition preserves the memory as historical pointer) Memory (src/atocore/memory/service.py): - MEMORY_STATUSES gains "graduated" — memory-entity graduation flow ready to wire in Phase 5F Engineering service (src/atocore/engineering/service.py): - RELATIONSHIP_TYPES organized into 4 families per ontology-v1.md: + Structural: contains, part_of, interfaces_with + Intent: satisfies, constrained_by, affected_by_decision, based_on_assumption (new), supersedes + Validation: analyzed_by, validated_by, supports (new), conflicts_with (new), depends_on + Provenance: described_by, updated_by_session (new), evidenced_by (new), summarized_in (new) - create_entity + create_relationship now call resolve_project_name() on write (canonicalization contract per doc) - Both accept actor= parameter for audit provenance - _audit_entity() helper uses shared memory_audit table with entity_kind="entity" — one observability layer for everything - promote_entity / reject_entity_candidate / supersede_entity — mirror the memory lifecycle exactly (same pattern, same naming) - get_entity_audit() reads from the shared table filtered by entity_kind API (src/atocore/api/routes.py): - POST /entities/{id}/promote (candidate → active) - POST /entities/{id}/reject (candidate → invalid) - GET /entities/{id}/audit (full history for one entity) - POST /entities passes 
actor="api-http" through Tests: 317 → 326 (9 new): - test_entity_project_canonicalization (p04 → p04-gigabit) - test_promote_entity_candidate_to_active - test_reject_entity_candidate - test_promote_active_entity_noop (only candidates promote) - test_entity_audit_log_captures_lifecycle (before/after snapshots) - test_new_relationship_types_available (6 new types present) - test_conflicts_tables_exist - test_memory_audit_has_entity_kind - test_graduated_status_accepted What's next (5B-5I, deferred): entity triage UI tab, core structure queries, the 3 killer queries, memory graduation script, conflict detection, MCP + context pack integration. See plan file. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 07:01:28 -04:00
Anto01	88f2f7c4e1	feat: Phase 4 V1 — Robustness Hardening Adds the observability + safety layer that turns AtoCore from "works until something silently breaks" into "every mutation is traceable, drift is detected, failures raise alerts." 1. Audit log (memory_audit table): - New table with id, memory_id, action, actor, before/after JSON, note, timestamp; 3 indexes for memory_id/timestamp/action - _audit_memory() helper called from every mutation: create_memory, update_memory, promote_memory, reject_candidate_memory, invalidate_memory, supersede_memory, reinforce_memory, auto_promote_reinforced, expire_stale_candidates - Action verb auto-selected: promoted/rejected/invalidated/ superseded/updated based on state transition - "actor" threaded through: api-http, human-triage, phase10-auto- promote, candidate-expiry, reinforcement, etc. - Fail-open: audit write failure logs but never breaks the mutation - GET /memory/{id}/audit: full history for one memory - GET /admin/audit/recent: last 50 mutations across the system 2. Alerts framework (src/atocore/observability/alerts.py): - emit_alert(severity, title, message, context) fans out to: - structlog logger (always) - ~/atocore-logs/alerts.log append (configurable via ATOCORE_ALERT_LOG) - project_state atocore/alert/last_{severity} (dashboard surface) - ATOCORE_ALERT_WEBHOOK POST if set (auto-detects Discord webhook format for nice embeds; generic JSON otherwise) - Every sink fail-open — one failure doesn't prevent the others - Pipeline alert step in nightly cron: harness < 85% → warning; candidate queue > 200 → warning 3. 
Integrity checks (scripts/integrity_check.py): - Nightly scan for drift: - Memories → missing source_chunk_id references - Duplicate active memories (same type+content+project) - project_state → missing projects - Orphaned source_chunks (no parent document) - Results persisted to atocore/status/integrity_check_result - Any finding emits a warning alert - Added as Step G in deploy/dalidou/batch-extract.sh nightly cron 4. Dashboard surfaces it all: - integrity (findings + details) - alerts (last info/warning/critical per severity) - recent_audit (last 10 mutations with actor + action + preview) Tests: 308 → 317 (9 new): - test_audit_create_logs_entry - test_audit_promote_logs_entry - test_audit_reject_logs_entry - test_audit_update_captures_before_after - test_audit_reinforce_logs_entry - test_recent_audit_returns_cross_memory_entries - test_emit_alert_writes_log_file - test_emit_alert_invalid_severity_falls_back_to_info - test_emit_alert_fails_open_on_log_write_error Deferred: formal migration framework with rollback (current additive pattern is fine for V1); memory detail wiki page with audit view (quick follow-up). To enable Discord alerts: set ATOCORE_ALERT_WEBHOOK to a Discord webhook URL in Dalidou's environment. Default = log-only. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 21:54:10 -04:00
Anto01	bfa7dba4de	feat: Phase 3 V1 — Auto-Organization (domain_tags + valid_until) Adds structural metadata that the LLM triage was already implicitly reasoning about ("stale snapshot" → reject). Phase 3 captures that reasoning as fields so it can DRIVE retrieval, not just rejection. Schema (src/atocore/models/database.py): - domain_tags TEXT DEFAULT '[]' JSON array of lowercase topic keywords - valid_until DATETIME ISO date; null = permanent - idx_memories_valid_until index for efficient expiry queries Memory service (src/atocore/memory/service.py): - Memory dataclass gains domain_tags + valid_until - create_memory, update_memory accept/persist both - _row_to_memory safely reads both (JSON-decode + null handling) - _normalize_tags helper: lowercase, dedup, strip, cap at 10 - get_memories_for_context filters expired (valid_until < today UTC) - _rank_memories_for_query adds tag-boost: memories whose domain_tags appear as substrings in query text rank higher (tertiary key after content-overlap density + absolute overlap, before confidence) LLM extractor (_llm_prompt.py → llm-0.5.0): - SYSTEM_PROMPT documents domain_tags (2-5 keywords) + valid_until (time-bounded facts get expiry dates; durable facts stay null) - normalize_candidate_item parses both fields from model output with graceful fallback for string/null/missing LLM triage (scripts/auto_triage.py): - TRIAGE_SYSTEM_PROMPT documents same two fields - parse_verdict extracts them from verdict JSON - On promote: PUT /memory/{id} with tags + valid_until BEFORE POST /memory/{id}/promote, so active memories carry them API (src/atocore/api/routes.py): - MemoryCreateRequest: adds domain_tags, valid_until - MemoryUpdateRequest: adds domain_tags, valid_until, memory_type - GET /memory response exposes domain_tags + valid_until + created_at Triage UI (src/atocore/engineering/triage_ui.py): - Renders existing tags as colored badges - Adds inline text field for tags (comma-separated) + date picker for valid_until on every 
candidate card - Save&Promote button persists edits via PUT then promotes - Plain Promote (and Y shortcut) also saves tags/expiry if edited Wiki (src/atocore/engineering/wiki.py): - Search now matches memory content OR domain_tags - Search results render tags as clickable badges linking to /wiki/search?q=<tag> for cross-project navigation - valid_until shown as amber "valid until YYYY-MM-DD" hint Tests: 303 → 308 (5 new for Phase 3 behavior): - test_create_memory_with_tags_and_valid_until - test_create_memory_normalizes_tags - test_update_memory_sets_tags_and_valid_until - test_get_memories_for_context_excludes_expired - test_context_builder_tag_boost_orders_results Deferred (explicitly): temporal_scope enum, source_refs memory graph, HDBSCAN clustering, memory detail wiki page, backfill of existing actives. See docs/MASTER-BRAIN-PLAN.md. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 21:37:01 -04:00
Anto01	775960c8c8	feat: "Make It Actually Useful" sprint — observability + Phase 10 Pipeline observability: - Retrieval harness runs nightly (Step E in batch-extract.sh) - Pipeline summary persisted to project state after each run (pipeline_last_run, pipeline_summary, retrieval_harness_result) - Dashboard enhanced: interaction total + by_client, pipeline health (last_run, hours_since, harness results, triage stats), dynamic project list from registry Phase 10 — reinforcement-based auto-promotion: - auto_promote_reinforced(): candidates with reference_count >= 3 and confidence >= 0.7 auto-graduate to active - expire_stale_candidates(): candidates unreinforced for 14+ days auto-rejected to prevent unbounded queue growth - Both wired into nightly cron (Step B2) - Batch script: scripts/auto_promote_reinforced.py (--dry-run support) Knowledge seeding: - scripts/seed_project_state.py: 26 curated Trusted Project State entries across p04-gigabit, p05-interferometer, p06-polisher, atomizer-v2, abb-space, atocore (decisions, requirements, facts, contacts, milestones) Tests: 299 → 303 (4 new Phase 10 tests) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 13:59:12 -04:00
Anto01	8951c624fe	fix(R7/R9): overlap-density ranking + project trust-preservation R7: ranking scorer now uses overlap-density (overlap_count / memory_token_count) as primary key instead of raw overlap count. A 5-token memory with 3 overlapping tokens (density 0.6) now beats a 40-token overview memory with 3 overlapping tokens (density 0.075) at the same absolute count. Secondary: absolute overlap. Tertiary: confidence. Targeting p06-firmware-interface harness fixture. R9: when the LLM extractor returns a project that differs from the interaction's known project, it now checks the project registry. If the model's project is a registered canonical ID, trust it. If not (hallucinated name), fall back to the interaction's project. Uses load_project_registry() for the check. The host-side script mirrors this via an API call to GET /projects at startup. Two new tests: test_parser_keeps_registered_model_project and test_parser_rejects_hallucinated_project. Test count: 280 -> 281. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 14:34:33 -04:00
Anto01	5c69f77b45	fix: cap per-entry memory length at 250 chars in context band A 530-char program overview memory with confidence 0.96 was filling the entire 25% project-memory budget at equal overlap score (3 tokens), beating shorter query-relevant newly-promoted memories (confidence 0.5) on the confidence tiebreaker. The long memory legitimately scored well, but its length starved every other memory from the band. Fix: truncate each formatted entry to 250 chars with '...' so at least 2-3 memories fit the ~700-char available budget. This doesn't change ranking — the most relevant memory still goes first — but it ensures the runner-up can also appear. Harness fixture delta: Day 7 regression pass pending after deploy. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 06:34:27 -04:00
Anto01	37331d53ef	fix: rank memories globally before budget walk Per-type ranking was still starving later types: when a p05 query matched a 'knowledge' memory best but 'project' came first in the type order, the project-type candidates filled the budget before the knowledge-type pool was even ranked. Collect all candidates into a single pool, dedupe by id, then rank the whole pool once against the query before walking the flat budget. Python's stable sort preserves insertion order (which still reflects the caller's memory_types order) as a natural tiebreaker when scores are equal. Regression surfaced by the retrieval eval harness: p05-vendor-signal still missing 'Zygo' after `5aeeb1c` — the vendor memory was type=knowledge but never reached the ranker because type=project consumed the budget first. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 12:55:10 -04:00
Anto01	5aeeb1cad1	feat: query-relevance ordering for memory selection get_memories_for_context now accepts an optional query string. When provided, candidate memories are reranked by lexical overlap with the query (stemmed token intersection, ties broken by confidence) before the budget walk. Without a query the order is unchanged — effectively "by confidence desc" as before — so non-builder callers see no behaviour change. The fetch limit is raised from 10 to 30 so there's a real pool to rerank. Token overlap reuses _normalize/_tokenize from reinforcement.py so ranking and reinforcement matching share the same notion of distinctive terms. build_context passes the user_prompt through to both the identity/ preference and project-memory calls. The retrieval harness regression the fix is targeting: - p05-vendor-signal FAIL @ `1161645`: "Zygo" missing from the pack even though an active vendor memory contained it. Root cause: higher-confidence p05 memories filled the 25% budget slice before the vendor memory ever got a chance. Query-aware ordering puts the vendor memory first when the query is about vendors. New regression test test_project_memories_query_relevance_ordering locks the behaviour in with two p05 memories and a tight budget. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 12:47:05 -04:00
Anto01	5913da53c5	fix: flat-budget walk in get_memories_for_context The per-type slicing (available // len(memory_types)) starved paragraph-length memories: with 3 types and a 450-char budget, each type got ~131 chars while real project memories are 300-500 chars each — every entry was skipped and the new Project Memories band never appeared in the live pack. Switch to a flat budget pool walked type-by-type in order. Short identity/preference memories still get first pick when the budget is tight, but long project memories can now compete for space. Caught on the first post-deploy probe: 2 active p04 memories existed but none landed in formatted_context. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 11:43:41 -04:00
Anto01	8ea53f4003	feat: fold project-scoped memories into context pack The retrieval-quality review on 2026-04-11 found that active project/knowledge/episodic memories never reached the pack: only Trusted Project State and identity/preference memories were being assembled. Reinforcement bumped confidence on memories that had no retrieval outlet, so the reflection loop was half-open. This change adds a third memory tier between identity/preference and retrieved chunks: - PROJECT_MEMORY_BUDGET_RATIO = 0.15 - Memory types: project, knowledge, episodic - Only populated when a canonical project is in scope — without a project hint, project memories stay out (cross-project bleed would rot the signal) - Rendered under a dedicated "--- Project Memories ---" header so the LLM can distinguish it from the identity/preference band - Trim order in _trim_context_to_budget: retrieval → project memories → identity/preference → project state (most recently added tier drops first when budget is tight) get_memories_for_context gains header/footer kwargs so the two memory blocks can be distinguished in a single pack without a second helper. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 11:35:40 -04:00
Anto01	fb6298a9a1	fix(P1+P2): canonicalize project names at every trust boundary Three findings from codex's review of the previous P1+P2 fix. The earlier commit (`f2372ef`) only fixed alias resolution at the context builder. Codex correctly pointed out that the same fragmentation applies at every other place a project name crosses a boundary — project_state writes/reads, interaction capture/listing/filtering, memory create/queries, and reinforcement's downstream queries. Plus a real bug in the interaction `since` filter where the storage format and the documented ISO format don't compare cleanly. The fix is one helper used at every boundary instead of duplicating the resolution inline. New helper: src/atocore/projects/registry.py::resolve_project_name --------------------------------------------------------------- - Single canonicalization boundary for project names - Returns the canonical project_id when the input matches any registered id or alias - Returns the input unchanged for empty/None and for unregistered names (preserves backwards compat with hand-curated state that predates the registry) - Documented as the contract that every read/write at the trust boundary should pass through P1 — Trusted Project State endpoints ------------------------------------ src/atocore/context/project_state.py: set_state, get_state, and invalidate_state now all canonicalize project_name through resolve_project_name BEFORE looking up or creating the project row. 
Before this fix: - POST /project/state with project="p05" called ensure_project("p05") which created a separate row in the projects table - The state row was attached to that alias project_id - Later context builds canonicalized "p05" -> "p05-interferometer" via the builder fix from `f2372ef` and never found the state - Result: trusted state silently fragmented across alias rows After this fix: - The alias is resolved to the canonical id at every entry point - Two captures (one via "p05", one via "p05-interferometer") write to the same row - get_state via either alias or the canonical id finds the same row Fixes the highest-priority gap codex flagged because Trusted Project State is supposed to be the most dependable layer in the AtoCore trust hierarchy. P2.a — Interaction capture project canonicalization ---------------------------------------------------- src/atocore/interactions/service.py: record_interaction now canonicalizes project before storing, so interaction.project is always the canonical id regardless of what the client passed. Downstream effects: - reinforce_from_interaction queries memories by interaction.project -> previously missed memories stored under canonical id -> now consistent because interaction.project IS the canonical id - the extractor stamps candidates with interaction.project -> previously created candidates in alias buckets -> now creates candidates in the canonical bucket - list_interactions(project=alias) was already broken, now fixed by canonicalizing the filter input on the read side too Memory service applied the same fix: - src/atocore/memory/service.py: create_memory and get_memories both canonicalize project through resolve_project_name - This keeps stored memory.project consistent with the reinforcement query path P2.b — Interaction `since` filter format normalization ------------------------------------------------------ src/atocore/interactions/service.py: new _normalize_since helper. 
The bug: - created_at is stored as 'YYYY-MM-DD HH:MM:SS' (no timezone, UTC by convention) so it sorts lexically and compares cleanly with the SQLite CURRENT_TIMESTAMP default - The `since` parameter was documented as ISO 8601 but compared as a raw string against the storage format - The lexically-greater 'T' separator means an ISO timestamp like '2026-04-07T12:00:00Z' is GREATER than the storage form '2026-04-07 12:00:00' for the same instant - Result: a client passing ISO `since` got an empty result for any row from the same day, even though those rows existed and were technically "after" the cutoff in real-world time The fix: - _normalize_since accepts ISO 8601 with T, optional Z suffix, optional fractional seconds, optional +HH:MM offsets - Uses datetime.fromisoformat for parsing (Python 3.11+) - Converts to UTC and reformats as the storage format before the SQL comparison - The bare storage format still works (backwards compat path is a regex match that returns the input unchanged) - Unparseable input is returned as-is so the comparison degrades gracefully (rows just don't match) instead of raising and breaking the listing endpoint builder.py refactor ------------------- The previous P1 fix had inline canonicalization. 
Now it uses the shared helper for consistency: - import changed from get_registered_project to resolve_project_name - the inline lookup is replaced with a single helper call - the comment block now points at representation-authority.md for the canonicalization contract New shared test fixture: tests/conftest.py::project_registry ------------------------------------------------------------ - Standardizes the registry-setup pattern that was duplicated across test_context_builder.py, test_project_state.py, test_interactions.py, and test_reinforcement.py - Returns a callable that takes (project_id, [aliases]) tuples and writes them into a temp registry file with the env var pointed at it and config.settings reloaded - Used by all 12 new regression tests in this commit Tests (12 new, all green on first run) -------------------------------------- test_project_state.py: - test_set_state_canonicalizes_alias: write via alias, read via every alias and the canonical id, verify same row id - test_get_state_canonicalizes_alias_after_canonical_write - test_invalidate_state_canonicalizes_alias - test_unregistered_project_state_still_works (backwards compat) test_interactions.py: - test_record_interaction_canonicalizes_project - test_list_interactions_canonicalizes_project_filter - test_list_interactions_since_accepts_iso_with_t_separator - test_list_interactions_since_accepts_z_suffix - test_list_interactions_since_accepts_offset - test_list_interactions_since_storage_format_still_works test_reinforcement.py: - test_reinforcement_works_when_capture_uses_alias (end-to-end: capture under alias, seed memory under canonical, verify reinforcement matches) - test_get_memories_filter_by_alias Full suite: 174 passing (was 162), 1 warning. The +12 is the new regression tests, no existing tests regressed. 
What's still NOT canonicalized (and why) ---------------------------------------- - _rank_chunks's secondary substring boost in builder.py — the retriever already does the right thing via its own _project_match_boost which calls get_registered_project. The redundant secondary boost still uses the raw hint but it's a multiplicative factor on top of correct retrieval, not a filter, so it can't drop relevant chunks. Tracked as a future cleanup but not a P1. - update_memory's project field (you can't change a memory's project after creation in the API anyway). - The retriever's project_hint parameter on direct /query calls — same reasoning as the builder boost, plus the retriever's own get_registered_project call already handles aliases there.	2026-04-07 08:29:33 -04:00
Anto01	2704997256	feat(phase9-B): reinforce active memories from captured interactions Phase 9 Commit B from the agreed plan. With Commit A capturing what AtoCore fed to the LLM and what came back, this commit closes the weakest part of the loop: when a memory is actually referenced in a response, its confidence should drift up, and stale memories that nobody ever mentions should stay where they are. This is reinforcement only — nothing is promoted into trusted state and no candidates are created. Extraction is Commit C. Schema (additive migration): - memories.last_referenced_at DATETIME (null by default) - memories.reference_count INTEGER DEFAULT 0 - idx_memories_last_referenced on last_referenced_at - memories.status now accepts the new "candidate" value so Commit C has the status slot to land on. Existing active/superseded/invalid rows are untouched. New module: src/atocore/memory/reinforcement.py - reinforce_from_interaction(interaction): scans the interaction's response + response_summary for echoes of active memories and bumps confidence / reference_count for each match - matching is intentionally simple and explainable: * normalize both sides (lowercase, collapse whitespace) * require >= 12 chars of memory content to match * compare the leading 80-char window of each memory - the candidate pool is project-scoped memories for the interaction's project + global identity + preference memories, deduplicated - candidates and invalidated memories are NEVER reinforced; only active memories move Memory service changes: - MEMORY_STATUSES = ["candidate", "active", "superseded", "invalid"] - create_memory(status="candidate"|"active"|...) with per-status duplicate scoping so a candidate and an active with identical text can legitimately coexist during review - get_memories(status=...) 
explicit override of the legacy active_only flag; callers can now list the review queue cleanly - update_memory accepts any valid status including "candidate" - reinforce_memory(id, delta): low-level primitive that bumps confidence (capped at 1.0), increments reference_count, and sets last_referenced_at. Only active memories; returns (applied, old, new) - promote_memory / reject_candidate_memory helpers prepping Commit C Interactions service: - record_interaction(reinforce=True) runs reinforce_from_interaction automatically when the interaction has response content. reinforcement errors are logged but never raised back to the caller so capture itself is never blocked by a flaky downstream. - circular import between interactions service and memory.reinforcement avoided by lazy import inside the function API: - POST /interactions now accepts a reinforce bool field (default true) - POST /interactions/{id}/reinforce runs reinforcement on an existing captured interaction — useful for backfilling or for retrying after a transient error in the automatic pass - response lists which memory ids were reinforced with old / new confidence for audit Tests (17 new, all green): - reinforce_memory bumps, caps at 1.0, accumulates reference_count - reinforce_memory rejects candidates and missing ids - reinforce_memory rejects negative delta - reinforce_from_interaction matches active memory - reinforce_from_interaction ignores candidates and inactive - reinforce_from_interaction requires minimum content length - reinforce_from_interaction handles empty response cleanly - reinforce_from_interaction normalizes casing and whitespace - reinforce_from_interaction deduplicates across memory buckets - record_interaction auto-reinforces by default - record_interaction reinforce=False skips the pass - record_interaction handles empty response - POST /interactions/{id}/reinforce runs against stored interaction - POST /interactions/{id}/reinforce returns 404 for missing id - POST /interactions 
accepts reinforce=false Full suite: 135 passing (was 118). Trust model unchanged: - reinforcement only moves confidence within the existing active set - the candidate lifecycle is declared but only Commit C will actually create candidate memories - trusted project state is never touched by reinforcement Next: Commit C adds the rule-based extractor that produces candidate memories from captured interactions plus the promote/reject review queue endpoints.	2026-04-06 21:18:38 -04:00
Anto01	b0889b3925	Stabilize core correctness and sync project plan state	2026-04-05 17:53:23 -04:00
Anto01	b48f0c95ab	feat: Phase 2 Memory Core — structured memory with context integration Memory Core implementation: - Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation - CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede - Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid) - Memory API endpoints: POST/GET/PUT/DELETE /memory Context builder integration (trust precedence per Master Plan): 1. Trusted Project State (highest trust, 20% budget) 2. Identity + Preference memories (10% budget) 3. Retrieved chunks (remaining budget) Also fixed database.py to use dynamic settings reference for test isolation. 45/45 tests passing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 09:54:52 -04:00
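
The `since` filter fix described in commit `fb6298a9a1` can be illustrated with a minimal sketch. This is not the code in src/atocore/interactions/service.py. Only the storage format ('YYYY-MM-DD HH:MM:SS', UTC by convention) and the listed behaviours (ISO 8601 with `T`, optional `Z`, optional offsets, pass-through of storage-format and unparseable input) come from the commit message. The function name `normalize_since`, the regex, and the treatment of naive timestamps as UTC are assumptions made for illustration.

```python
import re
from datetime import datetime, timezone

# Storage format per the commit message: 'YYYY-MM-DD HH:MM:SS', UTC by convention.
_STORAGE_RE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}$")
_STORAGE_FMT = "%Y-%m-%d %H:%M:%S"


def normalize_since(since: str) -> str:
    """Hypothetical helper: convert an ISO 8601 cutoff to the storage format."""
    # Backwards-compat path: input already in storage format is returned unchanged.
    if not since or _STORAGE_RE.match(since):
        return since
    text = since.strip()
    # Map a trailing 'Z' to an explicit offset so the parse also works on
    # Python versions older than 3.11.
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    try:
        parsed = datetime.fromisoformat(text)
    except ValueError:
        # Unparseable input degrades gracefully: the caller compares it as-is.
        return since
    if parsed.tzinfo is None:
        # Assumption: a timestamp without an offset is already UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).strftime(_STORAGE_FMT)


# The raw string comparison that caused the bug: 'T' sorts after ' ',
# so a 12:00 ISO cutoff compares greater than a 12:30 stored row.
iso_cutoff = "2026-04-07T12:00:00Z"
stored_row = "2026-04-07 12:30:00"
raw_comparison_hides_row = iso_cutoff > stored_row
fixed_comparison_finds_row = stored_row >= normalize_since(iso_cutoff)
```

Normalizing the cutoff rather than the stored column keeps the SQL comparison a plain string comparison against the existing rows, which is why no data migration is implied by the commit.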

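The overlap-density ranking from commit `8951c624fe` (R7) can be sketched as follows. The sort key (density first, then absolute overlap, then confidence) and the 5-token versus 40-token comparison come from the commit message. The `Candidate` class, the whitespace tokenizer, and the use of distinct tokens as the token count are assumptions; the log states that the real ranker reuses the stemmed tokenizer from reinforcement.py.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical stand-in for a memory row."""
    content: str
    confidence: float


def tokenize(text: str) -> set:
    # Assumption: lowercase whitespace split; the real tokenizer also stems.
    return set(text.lower().split())


def rank_by_overlap_density(query: str, candidates: list) -> list:
    query_tokens = tokenize(query)

    def sort_key(candidate: Candidate):
        tokens = tokenize(candidate.content)
        overlap = len(query_tokens & tokens)
        density = overlap / len(tokens) if tokens else 0.0
        # Primary: density. Secondary: absolute overlap. Tertiary: confidence.
        return (density, overlap, candidate.confidence)

    return sorted(candidates, key=sort_key, reverse=True)


# The scenario from the commit message: same absolute overlap (3 tokens),
# but a 5-token memory (density 0.6) versus a 40-token overview (density 0.075).
short_memory = Candidate("polisher firmware interface uses rs485", 0.5)
long_memory = Candidate(
    "polisher firmware interface " + " ".join(f"w{i}" for i in range(37)), 0.96
)
ranked = rank_by_overlap_density(
    "polisher firmware interface", [long_memory, short_memory]
)
```

In this sketch the short memory ranks first even though the long one has the higher confidence, because confidence is only consulted when density and overlap are tied.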
17 Commits
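
To give a sense of scale for the Phase 7D decay rule in commit `e840ef4be3`, here is a minimal sketch. The 0.97 daily multiplier and the 0.3 supersede floor come from the commit message. The function names `decay_step` and `runs_until_superseded`, and the choice to compare against the floor after multiplying, are assumptions rather than the behaviour of `decay_unreferenced_memories()` itself.

```python
def decay_step(confidence: float, factor: float = 0.97, floor: float = 0.3):
    """One daily run on a single memory: return (new_confidence, superseded)."""
    new_confidence = confidence * factor
    # Assumption: the floor check happens after the multiplication.
    return new_confidence, new_confidence < floor


def runs_until_superseded(
    confidence: float, factor: float = 0.97, floor: float = 0.3
) -> int:
    """Count the daily runs before an unreferenced memory drops below the floor."""
    if not 0.0 < factor < 1.0:
        raise ValueError("factor must be strictly between 0 and 1")
    if floor <= 0.0:
        raise ValueError("floor must be positive")
    runs = 0
    superseded = False
    while not superseded:
        confidence, superseded = decay_step(confidence, factor, floor)
        runs += 1
    return runs


runs_from_full_confidence = runs_until_superseded(1.0)
runs_from_half_confidence = runs_until_superseded(0.5)
```

With these defaults a memory that starts at confidence 1.0 crosses the floor on the 40th run, and one that starts at 0.5 crosses it on the 17th. Both counts exclude the 30 idle days that must pass before decay begins.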