feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
"""SQLite database schema and connection management."""
|
|
|
|
|
|
|
|
|
|
import sqlite3
|
|
|
|
|
from contextlib import contextmanager
|
|
|
|
|
from pathlib import Path
|
|
|
|
|
from typing import Generator
|
|
|
|
|
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
import atocore.config as _config
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
from atocore.observability.logger import get_logger
|
|
|
|
|
|
|
|
|
|
log = get_logger("database")
|
|
|
|
|
|
|
|
|
|
SCHEMA_SQL = """
|
|
|
|
|
CREATE TABLE IF NOT EXISTS source_documents (
|
|
|
|
|
id TEXT PRIMARY KEY,
|
|
|
|
|
file_path TEXT UNIQUE NOT NULL,
|
|
|
|
|
file_hash TEXT NOT NULL,
|
|
|
|
|
title TEXT,
|
|
|
|
|
doc_type TEXT DEFAULT 'markdown',
|
|
|
|
|
tags TEXT DEFAULT '[]',
|
|
|
|
|
ingested_at DATETIME DEFAULT CURRENT_TIMESTAMP,
|
|
|
|
|
updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
|
|
|
|
|
);
|
|
|
|
|
|
|
|
|
|
CREATE TABLE IF NOT EXISTS source_chunks (
|
|
|
|
|
id TEXT PRIMARY KEY,
|
|
|
|
|
document_id TEXT NOT NULL REFERENCES source_documents(id) ON DELETE CASCADE,
|
|
|
|
|
chunk_index INTEGER NOT NULL,
|
|
|
|
|
content TEXT NOT NULL,
|
|
|
|
|
heading_path TEXT DEFAULT '',
|
|
|
|
|
char_count INTEGER NOT NULL,
|
|
|
|
|
metadata TEXT DEFAULT '{}',
|
|
|
|
|
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
|
|
|
|
|
);
|
|
|
|
|
|
|
|
|
|
CREATE TABLE IF NOT EXISTS memories (
|
|
|
|
|
id TEXT PRIMARY KEY,
|
|
|
|
|
memory_type TEXT NOT NULL,
|
|
|
|
|
content TEXT NOT NULL,
|
2026-04-05 17:53:23 -04:00
|
|
|
project TEXT DEFAULT '',
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
source_chunk_id TEXT REFERENCES source_chunks(id),
|
|
|
|
|
confidence REAL DEFAULT 1.0,
|
|
|
|
|
status TEXT DEFAULT 'active',
|
feat(phase9-B): reinforce active memories from captured interactions
Phase 9 Commit B from the agreed plan. With Commit A capturing what
AtoCore fed to the LLM and what came back, this commit closes the
weakest part of the loop: when a memory is actually referenced in a
response, its confidence should drift up, and stale memories that
nobody ever mentions should stay where they are.
This is reinforcement only — nothing is promoted into trusted state
and no candidates are created. Extraction is Commit C.
Schema (additive migration):
- memories.last_referenced_at DATETIME (null by default)
- memories.reference_count INTEGER DEFAULT 0
- idx_memories_last_referenced on last_referenced_at
- memories.status now accepts the new "candidate" value so Commit C
has the status slot to land on. Existing active/superseded/invalid
rows are untouched.
New module: src/atocore/memory/reinforcement.py
- reinforce_from_interaction(interaction): scans the interaction's
response + response_summary for echoes of active memories and
bumps confidence / reference_count for each match
- matching is intentionally simple and explainable:
* normalize both sides (lowercase, collapse whitespace)
* require >= 12 chars of memory content to match
* compare the leading 80-char window of each memory
- the candidate pool is project-scoped memories for the interaction's
project + global identity + preference memories, deduplicated
- candidates and invalidated memories are NEVER reinforced; only
active memories move
Memory service changes:
- MEMORY_STATUSES = ["candidate", "active", "superseded", "invalid"]
- create_memory(status="candidate"|"active"|...) with per-status
duplicate scoping so a candidate and an active with identical text
can legitimately coexist during review
- get_memories(status=...) explicit override of the legacy active_only
flag; callers can now list the review queue cleanly
- update_memory accepts any valid status including "candidate"
- reinforce_memory(id, delta): low-level primitive that bumps
confidence (capped at 1.0), increments reference_count, and sets
last_referenced_at. Only active memories; returns (applied, old, new)
- promote_memory / reject_candidate_memory helpers prepping Commit C
Interactions service:
- record_interaction(reinforce=True) runs reinforce_from_interaction
automatically when the interaction has response content. reinforcement
errors are logged but never raised back to the caller so capture
itself is never blocked by a flaky downstream.
- circular import between interactions service and memory.reinforcement
avoided by lazy import inside the function
API:
- POST /interactions now accepts a reinforce bool field (default true)
- POST /interactions/{id}/reinforce runs reinforcement on an existing
captured interaction — useful for backfilling or for retrying after
a transient error in the automatic pass
- response lists which memory ids were reinforced with
old / new confidence for audit
Tests (17 new, all green):
- reinforce_memory bumps, caps at 1.0, accumulates reference_count
- reinforce_memory rejects candidates and missing ids
- reinforce_memory rejects negative delta
- reinforce_from_interaction matches active memory
- reinforce_from_interaction ignores candidates and inactive
- reinforce_from_interaction requires minimum content length
- reinforce_from_interaction handles empty response cleanly
- reinforce_from_interaction normalizes casing and whitespace
- reinforce_from_interaction deduplicates across memory buckets
- record_interaction auto-reinforces by default
- record_interaction reinforce=False skips the pass
- record_interaction handles empty response
- POST /interactions/{id}/reinforce runs against stored interaction
- POST /interactions/{id}/reinforce returns 404 for missing id
- POST /interactions accepts reinforce=false
Full suite: 135 passing (was 118).
Trust model unchanged:
- reinforcement only moves confidence within the existing active set
- the candidate lifecycle is declared but only Commit C will actually
create candidate memories
- trusted project state is never touched by reinforcement
Next: Commit C adds the rule-based extractor that produces candidate
memories from captured interactions plus the promote/reject review
queue endpoints.
2026-04-06 21:18:38 -04:00
|
|
|
last_referenced_at DATETIME,
|
|
|
|
|
reference_count INTEGER DEFAULT 0,
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
|
|
|
|
|
updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
|
|
|
|
|
);
|
|
|
|
|
|
|
|
|
|
CREATE TABLE IF NOT EXISTS projects (
|
|
|
|
|
id TEXT PRIMARY KEY,
|
|
|
|
|
name TEXT UNIQUE NOT NULL,
|
|
|
|
|
description TEXT DEFAULT '',
|
|
|
|
|
status TEXT DEFAULT 'active',
|
|
|
|
|
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
|
|
|
|
|
updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
|
|
|
|
|
);
|
|
|
|
|
|
|
|
|
|
CREATE TABLE IF NOT EXISTS interactions (
|
|
|
|
|
id TEXT PRIMARY KEY,
|
|
|
|
|
prompt TEXT NOT NULL,
|
|
|
|
|
context_pack TEXT DEFAULT '{}',
|
|
|
|
|
response_summary TEXT DEFAULT '',
|
feat(phase9-A): interaction capture loop foundation
Phase 9 Commit A from the agreed plan: turn AtoCore from a stateless
context enhancer into a system that records what it actually fed to an
LLM and what came back. This is the audit trail Reflection (Commit B)
and Extraction (Commit C) will be layered on top of.
The interactions table existed in the schema since the original PoC
but nothing wrote to it. This change makes it real:
Schema migration (additive only):
- response full LLM response (caller decides how much)
- memories_used JSON list of memory ids in the context pack
- chunks_used JSON list of chunk ids in the context pack
- client identifier of the calling system
(openclaw, claude-code, manual, ...)
- session_id groups multi-turn conversations
- project project name (mirrors the memory module pattern,
no FK so capture stays cheap)
- indexes on session_id, project, created_at
The created_at column is now written explicitly with a SQLite-compatible
'YYYY-MM-DD HH:MM:SS' format so the same string lives in the DB and the
returned dataclass. Without this the `since` filter on list_interactions
would silently fail because CURRENT_TIMESTAMP and isoformat use different
shapes that do not compare cleanly as strings.
New module src/atocore/interactions/:
- Interaction dataclass
- record_interaction() persists one round-trip (prompt required;
everything else optional). Refuses empty prompts.
- list_interactions() filters by project / session_id / client / since,
newest-first, hard-capped at 500
- get_interaction() fetch by id, full response + context pack
API endpoints:
- POST /interactions capture one interaction
- GET /interactions list with summaries (no full response)
- GET /interactions/{id} full record incl. response + pack
Trust model:
- Capture is read-only with respect to memories, project state, and
source chunks. Nothing here promotes anything into trusted state.
- The audit trail becomes the dataset Commit B (reinforcement) and
Commit C (extraction + review queue) will operate on.
Tests (13 new, all green):
- service: persist + roundtrip every field
- service: minimum-fields path (prompt only)
- service: empty / whitespace prompt rejected
- service: get by id returns None for missing
- service: filter by project, session, client
- service: ordering newest-first with limit
- service: since filter inclusive on cutoff (the bug the timestamp
fix above caught)
- service: limit=0 returns empty
- API: POST records and round-trips through GET /interactions/{id}
- API: empty prompt returns 400
- API: missing id returns 404
- API: list filter returns summaries (not full response bodies)
Full suite: 118 passing (was 105).
master-plan-status.md updated to move Phase 9 from "not started" to
"started" with the explicit note that Commit A is in and Commits B/C
remain.
2026-04-06 19:31:43 -04:00
|
|
|
response TEXT DEFAULT '',
|
|
|
|
|
memories_used TEXT DEFAULT '[]',
|
|
|
|
|
chunks_used TEXT DEFAULT '[]',
|
|
|
|
|
client TEXT DEFAULT '',
|
|
|
|
|
session_id TEXT DEFAULT '',
|
|
|
|
|
project TEXT DEFAULT '',
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
project_id TEXT REFERENCES projects(id),
|
|
|
|
|
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
|
|
|
|
|
);
|
|
|
|
|
|
fix: schema init ordering, deploy.sh default, client BASE_URL docs
Three issues Dalidou Claude surfaced during the first real deploy
of commit e877e5b to the live service (report from 2026-04-08).
Bug 1 was the critical one — a schema init ordering bug that would
have bitten every future upgrade from a pre-Phase-9 schema — and
the other two were usability traps around hostname resolution.
Bug 1 (CRITICAL): schema init ordering
--------------------------------------
src/atocore/models/database.py
SCHEMA_SQL contained CREATE INDEX statements that referenced
columns added later by _apply_migrations():
CREATE INDEX IF NOT EXISTS idx_memories_project ON memories(project);
CREATE INDEX IF NOT EXISTS idx_interactions_project_name ON interactions(project);
CREATE INDEX IF NOT EXISTS idx_interactions_session ON interactions(session_id);
On a FRESH install, CREATE TABLE IF NOT EXISTS creates the tables
with the Phase 9 shape (columns present), so the CREATE INDEX runs
cleanly and _apply_migrations is effectively a no-op.
On an UPGRADE from a pre-Phase-9 schema, CREATE TABLE IF NOT EXISTS
is a no-op (the tables already exist in the old shape), the columns
are NOT added yet, and the CREATE INDEX fails with
"OperationalError: no such column: project" before
_apply_migrations gets a chance to add the columns.
Dalidou Claude hit this exactly when redeploying from 0.1.0 to
0.2.0 — had to manually ALTER TABLE to add the Phase 9 columns
before the container could start.
The fix is to remove the Phase 9-column indexes from SCHEMA_SQL.
They already exist in _apply_migrations() AFTER the corresponding
ALTER TABLE, so they still get created on both fresh and upgrade
paths — just after the columns exist, not before.
Indexes still in SCHEMA_SQL (all safe — reference columns that
have existed since the first release):
- idx_chunks_document on source_chunks(document_id)
- idx_memories_type on memories(memory_type)
- idx_memories_status on memories(status)
- idx_interactions_project on interactions(project_id)
Indexes moved to _apply_migrations (already there — just no longer
duplicated in SCHEMA_SQL):
- idx_memories_project on memories(project)
- idx_interactions_project_name on interactions(project)
- idx_interactions_session on interactions(session_id)
- idx_interactions_created_at on interactions(created_at)
Regression test: tests/test_database.py
---------------------------------------
New test_init_db_upgrades_pre_phase9_schema_without_failing:
- Seeds the DB with the exact pre-Phase-9 shape (no project /
last_referenced_at / reference_count on memories; no project /
client / session_id / response / memories_used / chunks_used on
interactions)
- Calls init_db() — which used to raise OperationalError before
the fix
- Verifies all Phase 9 columns are present after the call
- Verifies the migration indexes exist
Before the fix this test would have failed with
"OperationalError: no such column: project" on the init_db call.
After the fix it passes. This locks the invariant "init_db is
safe on any legacy schema shape" so the bug can't silently come
back.
Full suite: 216 passing (was 215), 1 warning. The +1 is the new
regression test.
Bug 3 (usability): deploy.sh DNS default
----------------------------------------
deploy/dalidou/deploy.sh
ATOCORE_GIT_REMOTE defaulted to http://dalidou:3000/Antoine/ATOCore.git
which requires the "dalidou" hostname to resolve. On the Dalidou
host itself it didn't (no /etc/hosts entry for localhost alias),
so deploy.sh had to be run with the IP as a manual workaround.
Fix: default ATOCORE_GIT_REMOTE to http://127.0.0.1:3000/Antoine/ATOCore.git.
Loopback always works on the host running the script. Callers
from a remote host (e.g. running deploy.sh from a laptop against
the Dalidou LAN) set ATOCORE_GIT_REMOTE explicitly. The script
header's Environment Variables section documents this with an
explicit reference to the 2026-04-08 Dalidou deploy report so the
rationale isn't lost.
docs/dalidou-deployment.md gets a new "Troubleshooting hostname
resolution" subsection and a new example invocation showing how
to deploy from a remote host with an explicit ATOCORE_GIT_REMOTE
override.
Bug 2 (usability): atocore_client.py ATOCORE_BASE_URL documentation
-------------------------------------------------------------------
scripts/atocore_client.py
Same class of issue as bug 3. BASE_URL defaults to
http://dalidou:8100 which resolves fine from a remote caller
(laptop, T420/OpenClaw over Tailscale) but NOT from the Dalidou
host itself or from inside the atocore container. Dalidou Claude
saw the CLI return
{"status": "unavailable", "fail_open": true}
while direct curl to http://127.0.0.1:8100 worked.
The fix here is NOT to change the default (remote callers are
the common case and would break) but to DOCUMENT the override
clearly so the next operator knows what's happening:
- The script module docstring grew a new "Environment variables"
section covering ATOCORE_BASE_URL, ATOCORE_TIMEOUT_SECONDS,
ATOCORE_REFRESH_TIMEOUT_SECONDS, and ATOCORE_FAIL_OPEN, with
the explicit override example for on-host/in-container use
- It calls out the exact symptom (fail-open envelope when the
base URL doesn't resolve) so the diagnosis is obvious from
the error alone
- docs/dalidou-deployment.md troubleshooting section mirrors
this guidance so there's one place to look regardless of
whether the operator starts with the client help or the
deploy doc
What this commit does NOT do
----------------------------
- Does NOT change the default ATOCORE_BASE_URL. Doing that would
break the T420 OpenClaw helper and every remote caller who
currently relies on the hostname. Documentation is the right
fix for this case.
- Does NOT fix /etc/hosts on Dalidou. That's a host-level
configuration issue that the user can fix if they prefer
having the hostname resolve; the deploy.sh fix makes it
unnecessary regardless.
- Does NOT re-run the validation on Dalidou. The next step is
for the live service to pull this commit via deploy.sh (which
should now work without the IP workaround) and re-run the
Phase 9 loop test to confirm nothing regressed.
2026-04-08 19:02:57 -04:00
|
|
|
-- Indexes that reference columns guaranteed to exist since the first
|
|
|
|
|
-- release ship here. Indexes that reference columns added by later
|
|
|
|
|
-- migrations (memories.project, interactions.project,
|
|
|
|
|
-- interactions.session_id) are created inside _apply_migrations AFTER
|
|
|
|
|
-- the corresponding ALTER TABLE, NOT here. Creating them here would
|
|
|
|
|
-- fail on upgrade from a pre-migration schema because CREATE TABLE
|
|
|
|
|
-- IF NOT EXISTS is a no-op on an existing table, so the new columns
|
|
|
|
|
-- wouldn't be added before the CREATE INDEX runs.
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
CREATE INDEX IF NOT EXISTS idx_chunks_document ON source_chunks(document_id);
|
|
|
|
|
CREATE INDEX IF NOT EXISTS idx_memories_type ON memories(memory_type);
|
|
|
|
|
CREATE INDEX IF NOT EXISTS idx_memories_status ON memories(status);
|
|
|
|
|
CREATE INDEX IF NOT EXISTS idx_interactions_project ON interactions(project_id);
|
|
|
|
|
"""
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def _ensure_data_dir() -> None:
|
2026-04-05 18:33:52 -04:00
|
|
|
_config.ensure_runtime_dirs()
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
|
|
|
|
|
|
|
|
|
|
def init_db() -> None:
|
|
|
|
|
"""Initialize the database with schema."""
|
|
|
|
|
_ensure_data_dir()
|
|
|
|
|
with get_connection() as conn:
|
|
|
|
|
conn.executescript(SCHEMA_SQL)
|
2026-04-05 17:53:23 -04:00
|
|
|
_apply_migrations(conn)
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
log.info("database_initialized", path=str(_config.settings.db_path))
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
|
|
|
|
|
|
2026-04-05 17:53:23 -04:00
|
|
|
def _apply_migrations(conn: sqlite3.Connection) -> None:
|
|
|
|
|
"""Apply lightweight schema migrations for existing local databases."""
|
|
|
|
|
if not _column_exists(conn, "memories", "project"):
|
|
|
|
|
conn.execute("ALTER TABLE memories ADD COLUMN project TEXT DEFAULT ''")
|
|
|
|
|
conn.execute("CREATE INDEX IF NOT EXISTS idx_memories_project ON memories(project)")
|
|
|
|
|
|
feat(phase9-B): reinforce active memories from captured interactions
Phase 9 Commit B from the agreed plan. With Commit A capturing what
AtoCore fed to the LLM and what came back, this commit closes the
weakest part of the loop: when a memory is actually referenced in a
response, its confidence should drift up, and stale memories that
nobody ever mentions should stay where they are.
This is reinforcement only — nothing is promoted into trusted state
and no candidates are created. Extraction is Commit C.
Schema (additive migration):
- memories.last_referenced_at DATETIME (null by default)
- memories.reference_count INTEGER DEFAULT 0
- idx_memories_last_referenced on last_referenced_at
- memories.status now accepts the new "candidate" value so Commit C
has the status slot to land on. Existing active/superseded/invalid
rows are untouched.
New module: src/atocore/memory/reinforcement.py
- reinforce_from_interaction(interaction): scans the interaction's
response + response_summary for echoes of active memories and
bumps confidence / reference_count for each match
- matching is intentionally simple and explainable:
* normalize both sides (lowercase, collapse whitespace)
* require >= 12 chars of memory content to match
* compare the leading 80-char window of each memory
- the candidate pool is project-scoped memories for the interaction's
project + global identity + preference memories, deduplicated
- candidates and invalidated memories are NEVER reinforced; only
active memories move
Memory service changes:
- MEMORY_STATUSES = ["candidate", "active", "superseded", "invalid"]
- create_memory(status="candidate"|"active"|...) with per-status
duplicate scoping so a candidate and an active with identical text
can legitimately coexist during review
- get_memories(status=...) explicit override of the legacy active_only
flag; callers can now list the review queue cleanly
- update_memory accepts any valid status including "candidate"
- reinforce_memory(id, delta): low-level primitive that bumps
confidence (capped at 1.0), increments reference_count, and sets
last_referenced_at. Only active memories; returns (applied, old, new)
- promote_memory / reject_candidate_memory helpers prepping Commit C
Interactions service:
- record_interaction(reinforce=True) runs reinforce_from_interaction
automatically when the interaction has response content. reinforcement
errors are logged but never raised back to the caller so capture
itself is never blocked by a flaky downstream.
- circular import between interactions service and memory.reinforcement
avoided by lazy import inside the function
API:
- POST /interactions now accepts a reinforce bool field (default true)
- POST /interactions/{id}/reinforce runs reinforcement on an existing
captured interaction — useful for backfilling or for retrying after
a transient error in the automatic pass
- response lists which memory ids were reinforced with
old / new confidence for audit
Tests (17 new, all green):
- reinforce_memory bumps, caps at 1.0, accumulates reference_count
- reinforce_memory rejects candidates and missing ids
- reinforce_memory rejects negative delta
- reinforce_from_interaction matches active memory
- reinforce_from_interaction ignores candidates and inactive
- reinforce_from_interaction requires minimum content length
- reinforce_from_interaction handles empty response cleanly
- reinforce_from_interaction normalizes casing and whitespace
- reinforce_from_interaction deduplicates across memory buckets
- record_interaction auto-reinforces by default
- record_interaction reinforce=False skips the pass
- record_interaction handles empty response
- POST /interactions/{id}/reinforce runs against stored interaction
- POST /interactions/{id}/reinforce returns 404 for missing id
- POST /interactions accepts reinforce=false
Full suite: 135 passing (was 118).
Trust model unchanged:
- reinforcement only moves confidence within the existing active set
- the candidate lifecycle is declared but only Commit C will actually
create candidate memories
- trusted project state is never touched by reinforcement
Next: Commit C adds the rule-based extractor that produces candidate
memories from captured interactions plus the promote/reject review
queue endpoints.
2026-04-06 21:18:38 -04:00
|
|
|
# Phase 9 Commit B: reinforcement columns.
|
|
|
|
|
# last_referenced_at records when a memory was most recently referenced
|
|
|
|
|
# in a captured interaction; reference_count is a monotonically
|
|
|
|
|
# increasing counter bumped on every reference. Together they let
|
|
|
|
|
# Reflection (Commit C) and decay (deferred) reason about which
|
|
|
|
|
# memories are actually being used versus which have gone cold.
|
|
|
|
|
if not _column_exists(conn, "memories", "last_referenced_at"):
|
|
|
|
|
conn.execute("ALTER TABLE memories ADD COLUMN last_referenced_at DATETIME")
|
|
|
|
|
if not _column_exists(conn, "memories", "reference_count"):
|
|
|
|
|
conn.execute("ALTER TABLE memories ADD COLUMN reference_count INTEGER DEFAULT 0")
|
|
|
|
|
conn.execute(
|
|
|
|
|
"CREATE INDEX IF NOT EXISTS idx_memories_last_referenced ON memories(last_referenced_at)"
|
|
|
|
|
)
|
|
|
|
|
|
feat: Phase 3 V1 — Auto-Organization (domain_tags + valid_until)
Adds structural metadata that the LLM triage was already implicitly
reasoning about ("stale snapshot" → reject). Phase 3 captures that
reasoning as fields so it can DRIVE retrieval, not just rejection.
Schema (src/atocore/models/database.py):
- domain_tags TEXT DEFAULT '[]' JSON array of lowercase topic keywords
- valid_until DATETIME ISO date; null = permanent
- idx_memories_valid_until index for efficient expiry queries
Memory service (src/atocore/memory/service.py):
- Memory dataclass gains domain_tags + valid_until
- create_memory, update_memory accept/persist both
- _row_to_memory safely reads both (JSON-decode + null handling)
- _normalize_tags helper: lowercase, dedup, strip, cap at 10
- get_memories_for_context filters expired (valid_until < today UTC)
- _rank_memories_for_query adds tag-boost: memories whose domain_tags
appear as substrings in query text rank higher (tertiary key after
content-overlap density + absolute overlap, before confidence)
LLM extractor (_llm_prompt.py → llm-0.5.0):
- SYSTEM_PROMPT documents domain_tags (2-5 keywords) + valid_until
(time-bounded facts get expiry dates; durable facts stay null)
- normalize_candidate_item parses both fields from model output with
graceful fallback for string/null/missing
LLM triage (scripts/auto_triage.py):
- TRIAGE_SYSTEM_PROMPT documents same two fields
- parse_verdict extracts them from verdict JSON
- On promote: PUT /memory/{id} with tags + valid_until BEFORE
POST /memory/{id}/promote, so active memories carry them
API (src/atocore/api/routes.py):
- MemoryCreateRequest: adds domain_tags, valid_until
- MemoryUpdateRequest: adds domain_tags, valid_until, memory_type
- GET /memory response exposes domain_tags + valid_until + created_at
Triage UI (src/atocore/engineering/triage_ui.py):
- Renders existing tags as colored badges
- Adds inline text field for tags (comma-separated) + date picker for
valid_until on every candidate card
- Save&Promote button persists edits via PUT then promotes
- Plain Promote (and Y shortcut) also saves tags/expiry if edited
Wiki (src/atocore/engineering/wiki.py):
- Search now matches memory content OR domain_tags
- Search results render tags as clickable badges linking to
/wiki/search?q=<tag> for cross-project navigation
- valid_until shown as amber "valid until YYYY-MM-DD" hint
Tests: 303 → 308 (5 new for Phase 3 behavior):
- test_create_memory_with_tags_and_valid_until
- test_create_memory_normalizes_tags
- test_update_memory_sets_tags_and_valid_until
- test_get_memories_for_context_excludes_expired
- test_context_builder_tag_boost_orders_results
Deferred (explicitly): temporal_scope enum, source_refs memory graph,
HDBSCAN clustering, memory detail wiki page, backfill of existing
actives. See docs/MASTER-BRAIN-PLAN.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 21:37:01 -04:00
|
|
|
# Phase 3 (Auto-Organization V1): domain tags + expiry.
|
|
|
|
|
# domain_tags is a JSON array of lowercase strings (optics, mechanics,
|
|
|
|
|
# firmware, business, etc.) inferred by the LLM during triage. Used for
|
|
|
|
|
# cross-project retrieval: a query about "optics" can surface matches from
|
|
|
|
|
# p04 + p05 + p06 without knowing all the project names.
|
|
|
|
|
# valid_until is an ISO UTC timestamp beyond which the memory is
|
|
|
|
|
# considered stale. get_memories_for_context filters these out of context
|
|
|
|
|
# packs automatically so ephemeral facts (status snapshots, weekly counts)
|
|
|
|
|
# don't pollute grounding once they've aged out.
|
|
|
|
|
if not _column_exists(conn, "memories", "domain_tags"):
|
|
|
|
|
conn.execute("ALTER TABLE memories ADD COLUMN domain_tags TEXT DEFAULT '[]'")
|
|
|
|
|
if not _column_exists(conn, "memories", "valid_until"):
|
|
|
|
|
conn.execute("ALTER TABLE memories ADD COLUMN valid_until DATETIME")
|
|
|
|
|
conn.execute(
|
|
|
|
|
"CREATE INDEX IF NOT EXISTS idx_memories_valid_until ON memories(valid_until)"
|
|
|
|
|
)
|
|
|
|
|
|
feat: Phase 5A — Engineering V1 foundation
First slice of the Engineering V1 sprint. Lays the schema + lifecycle
plumbing so the 10 canonical queries, memory graduation, and conflict
detection can land cleanly on top.
Schema (src/atocore/models/database.py):
- conflicts + conflict_members tables per conflict-model.md (with 5
indexes on status/project/slot/members)
- memory_audit.entity_kind discriminator — same audit table serves
both memories ("memory") and entities ("entity"); unified history
without duplicating infrastructure
- memories.graduated_to_entity_id forward pointer for graduated
memories (M → E transition preserves the memory as historical
pointer)
Memory (src/atocore/memory/service.py):
- MEMORY_STATUSES gains "graduated" — memory-entity graduation flow
ready to wire in Phase 5F
Engineering service (src/atocore/engineering/service.py):
- RELATIONSHIP_TYPES organized into 4 families per ontology-v1.md:
+ Structural: contains, part_of, interfaces_with
+ Intent: satisfies, constrained_by, affected_by_decision,
based_on_assumption (new), supersedes
+ Validation: analyzed_by, validated_by, supports (new),
conflicts_with (new), depends_on
+ Provenance: described_by, updated_by_session (new),
evidenced_by (new), summarized_in (new)
- create_entity + create_relationship now call resolve_project_name()
on write (canonicalization contract per doc)
- Both accept actor= parameter for audit provenance
- _audit_entity() helper uses shared memory_audit table with
entity_kind="entity" — one observability layer for everything
- promote_entity / reject_entity_candidate / supersede_entity —
mirror the memory lifecycle exactly (same pattern, same naming)
- get_entity_audit() reads from the shared table filtered by
entity_kind
API (src/atocore/api/routes.py):
- POST /entities/{id}/promote (candidate → active)
- POST /entities/{id}/reject (candidate → invalid)
- GET /entities/{id}/audit (full history for one entity)
- POST /entities passes actor="api-http" through
Tests: 317 → 326 (9 new):
- test_entity_project_canonicalization (p04 → p04-gigabit)
- test_promote_entity_candidate_to_active
- test_reject_entity_candidate
- test_promote_active_entity_noop (only candidates promote)
- test_entity_audit_log_captures_lifecycle (before/after snapshots)
- test_new_relationship_types_available (6 new types present)
- test_conflicts_tables_exist
- test_memory_audit_has_entity_kind
- test_graduated_status_accepted
What's next (5B-5I, deferred): entity triage UI tab, core structure
queries, the 3 killer queries, memory graduation script, conflict
detection, MCP + context pack integration. See plan file.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 07:01:28 -04:00
|
|
|
# Phase 5 (Engineering V1): when a memory graduates to an entity, we
|
|
|
|
|
# keep the memory row as an immutable historical pointer. The forward
|
|
|
|
|
# pointer lets downstream code follow "what did this memory become?"
|
|
|
|
|
# without having to join through source_refs.
|
|
|
|
|
if not _column_exists(conn, "memories", "graduated_to_entity_id"):
|
|
|
|
|
conn.execute("ALTER TABLE memories ADD COLUMN graduated_to_entity_id TEXT")
|
|
|
|
|
conn.execute(
|
|
|
|
|
"CREATE INDEX IF NOT EXISTS idx_memories_graduated ON memories(graduated_to_entity_id)"
|
|
|
|
|
)
|
|
|
|
|
|
feat: Phase 4 V1 — Robustness Hardening
Adds the observability + safety layer that turns AtoCore from
"works until something silently breaks" into "every mutation is
traceable, drift is detected, failures raise alerts."
1. Audit log (memory_audit table):
- New table with id, memory_id, action, actor, before/after JSON,
note, timestamp; 3 indexes for memory_id/timestamp/action
- _audit_memory() helper called from every mutation:
create_memory, update_memory, promote_memory,
reject_candidate_memory, invalidate_memory, supersede_memory,
reinforce_memory, auto_promote_reinforced, expire_stale_candidates
- Action verb auto-selected: promoted/rejected/invalidated/
superseded/updated based on state transition
- "actor" threaded through: api-http, human-triage, phase10-auto-
promote, candidate-expiry, reinforcement, etc.
- Fail-open: audit write failure logs but never breaks the mutation
- GET /memory/{id}/audit: full history for one memory
- GET /admin/audit/recent: last 50 mutations across the system
2. Alerts framework (src/atocore/observability/alerts.py):
- emit_alert(severity, title, message, context) fans out to:
- structlog logger (always)
- ~/atocore-logs/alerts.log append (configurable via
ATOCORE_ALERT_LOG)
- project_state atocore/alert/last_{severity} (dashboard surface)
- ATOCORE_ALERT_WEBHOOK POST if set (auto-detects Discord webhook
format for nice embeds; generic JSON otherwise)
- Every sink fail-open — one failure doesn't prevent the others
- Pipeline alert step in nightly cron: harness < 85% → warning;
candidate queue > 200 → warning
3. Integrity checks (scripts/integrity_check.py):
- Nightly scan for drift:
- Memories → missing source_chunk_id references
- Duplicate active memories (same type+content+project)
- project_state → missing projects
- Orphaned source_chunks (no parent document)
- Results persisted to atocore/status/integrity_check_result
- Any finding emits a warning alert
- Added as Step G in deploy/dalidou/batch-extract.sh nightly cron
4. Dashboard surfaces it all:
- integrity (findings + details)
- alerts (last info/warning/critical per severity)
- recent_audit (last 10 mutations with actor + action + preview)
Tests: 308 → 317 (9 new):
- test_audit_create_logs_entry
- test_audit_promote_logs_entry
- test_audit_reject_logs_entry
- test_audit_update_captures_before_after
- test_audit_reinforce_logs_entry
- test_recent_audit_returns_cross_memory_entries
- test_emit_alert_writes_log_file
- test_emit_alert_invalid_severity_falls_back_to_info
- test_emit_alert_fails_open_on_log_write_error
Deferred: formal migration framework with rollback (current additive
pattern is fine for V1); memory detail wiki page with audit view
(quick follow-up).
To enable Discord alerts: set ATOCORE_ALERT_WEBHOOK to a Discord
webhook URL in Dalidou's environment. Default = log-only.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 21:54:10 -04:00
|
|
|
# Phase 4 (Robustness V1): append-only audit log for memory mutations.
|
|
|
|
|
# Every create/update/promote/reject/supersede/invalidate/reinforce/expire/
|
|
|
|
|
# auto_promote writes one row here. before/after are JSON snapshots of the
|
|
|
|
|
# relevant fields. actor lets us distinguish auto-triage vs human-triage vs
|
|
|
|
|
# api vs cron. This is the "how did this memory get to its current state"
|
|
|
|
|
# trail — essential once the brain starts auto-organizing itself.
|
|
|
|
|
conn.execute(
|
|
|
|
|
"""
|
|
|
|
|
CREATE TABLE IF NOT EXISTS memory_audit (
|
|
|
|
|
id TEXT PRIMARY KEY,
|
|
|
|
|
memory_id TEXT NOT NULL,
|
|
|
|
|
action TEXT NOT NULL,
|
|
|
|
|
actor TEXT DEFAULT 'api',
|
|
|
|
|
before_json TEXT DEFAULT '{}',
|
|
|
|
|
after_json TEXT DEFAULT '{}',
|
|
|
|
|
note TEXT DEFAULT '',
|
|
|
|
|
timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
|
|
|
|
|
)
|
|
|
|
|
"""
|
|
|
|
|
)
|
|
|
|
|
conn.execute("CREATE INDEX IF NOT EXISTS idx_memory_audit_memory ON memory_audit(memory_id)")
|
|
|
|
|
conn.execute("CREATE INDEX IF NOT EXISTS idx_memory_audit_timestamp ON memory_audit(timestamp)")
|
|
|
|
|
conn.execute("CREATE INDEX IF NOT EXISTS idx_memory_audit_action ON memory_audit(action)")
|
|
|
|
|
|
feat: Phase 5A — Engineering V1 foundation
First slice of the Engineering V1 sprint. Lays the schema + lifecycle
plumbing so the 10 canonical queries, memory graduation, and conflict
detection can land cleanly on top.
Schema (src/atocore/models/database.py):
- conflicts + conflict_members tables per conflict-model.md (with 5
indexes on status/project/slot/members)
- memory_audit.entity_kind discriminator — same audit table serves
both memories ("memory") and entities ("entity"); unified history
without duplicating infrastructure
- memories.graduated_to_entity_id forward pointer for graduated
memories (M → E transition preserves the memory as historical
pointer)
Memory (src/atocore/memory/service.py):
- MEMORY_STATUSES gains "graduated" — memory-entity graduation flow
ready to wire in Phase 5F
Engineering service (src/atocore/engineering/service.py):
- RELATIONSHIP_TYPES organized into 4 families per ontology-v1.md:
+ Structural: contains, part_of, interfaces_with
+ Intent: satisfies, constrained_by, affected_by_decision,
based_on_assumption (new), supersedes
+ Validation: analyzed_by, validated_by, supports (new),
conflicts_with (new), depends_on
+ Provenance: described_by, updated_by_session (new),
evidenced_by (new), summarized_in (new)
- create_entity + create_relationship now call resolve_project_name()
on write (canonicalization contract per doc)
- Both accept actor= parameter for audit provenance
- _audit_entity() helper uses shared memory_audit table with
entity_kind="entity" — one observability layer for everything
- promote_entity / reject_entity_candidate / supersede_entity —
mirror the memory lifecycle exactly (same pattern, same naming)
- get_entity_audit() reads from the shared table filtered by
entity_kind
API (src/atocore/api/routes.py):
- POST /entities/{id}/promote (candidate → active)
- POST /entities/{id}/reject (candidate → invalid)
- GET /entities/{id}/audit (full history for one entity)
- POST /entities passes actor="api-http" through
Tests: 317 → 326 (9 new):
- test_entity_project_canonicalization (p04 → p04-gigabit)
- test_promote_entity_candidate_to_active
- test_reject_entity_candidate
- test_promote_active_entity_noop (only candidates promote)
- test_entity_audit_log_captures_lifecycle (before/after snapshots)
- test_new_relationship_types_available (6 new types present)
- test_conflicts_tables_exist
- test_memory_audit_has_entity_kind
- test_graduated_status_accepted
What's next (5B-5I, deferred): entity triage UI tab, core structure
queries, the 3 killer queries, memory graduation script, conflict
detection, MCP + context pack integration. See plan file.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 07:01:28 -04:00
|
|
|
# Phase 5 (Engineering V1): entity_kind discriminator lets one audit
|
|
|
|
|
# table serve both memories AND entities. Default "memory" keeps existing
|
|
|
|
|
# rows correct; entity mutations write entity_kind="entity".
|
|
|
|
|
if not _column_exists(conn, "memory_audit", "entity_kind"):
|
|
|
|
|
conn.execute("ALTER TABLE memory_audit ADD COLUMN entity_kind TEXT DEFAULT 'memory'")
|
|
|
|
|
conn.execute(
|
|
|
|
|
"CREATE INDEX IF NOT EXISTS idx_memory_audit_entity_kind ON memory_audit(entity_kind)"
|
|
|
|
|
)
|
|
|
|
|
|
|
|
|
|
# Phase 5: conflicts + conflict_members tables per conflict-model.md.
|
|
|
|
|
# A conflict is "two or more active rows claiming the same slot with
|
|
|
|
|
# incompatible values". slot_kind + slot_key identify the logical slot
|
|
|
|
|
# (e.g., "component.material" for some component id). Members point
|
|
|
|
|
# back to the conflicting rows (memory or entity) with layer trust so
|
|
|
|
|
# resolution can pick the highest-trust winner.
|
|
|
|
|
conn.execute(
|
|
|
|
|
"""
|
|
|
|
|
CREATE TABLE IF NOT EXISTS conflicts (
|
|
|
|
|
id TEXT PRIMARY KEY,
|
|
|
|
|
slot_kind TEXT NOT NULL,
|
|
|
|
|
slot_key TEXT NOT NULL,
|
|
|
|
|
project TEXT DEFAULT '',
|
|
|
|
|
status TEXT DEFAULT 'open',
|
|
|
|
|
resolution TEXT DEFAULT '',
|
|
|
|
|
resolved_at DATETIME,
|
|
|
|
|
detected_at DATETIME DEFAULT CURRENT_TIMESTAMP,
|
|
|
|
|
note TEXT DEFAULT ''
|
|
|
|
|
)
|
|
|
|
|
"""
|
|
|
|
|
)
|
|
|
|
|
conn.execute(
|
|
|
|
|
"""
|
|
|
|
|
CREATE TABLE IF NOT EXISTS conflict_members (
|
|
|
|
|
id TEXT PRIMARY KEY,
|
|
|
|
|
conflict_id TEXT NOT NULL REFERENCES conflicts(id) ON DELETE CASCADE,
|
|
|
|
|
member_kind TEXT NOT NULL,
|
|
|
|
|
member_id TEXT NOT NULL,
|
|
|
|
|
member_layer_trust INTEGER DEFAULT 0,
|
|
|
|
|
value_snapshot TEXT DEFAULT ''
|
|
|
|
|
)
|
|
|
|
|
"""
|
|
|
|
|
)
|
|
|
|
|
conn.execute("CREATE INDEX IF NOT EXISTS idx_conflicts_status ON conflicts(status)")
|
|
|
|
|
conn.execute("CREATE INDEX IF NOT EXISTS idx_conflicts_project ON conflicts(project)")
|
|
|
|
|
conn.execute(
|
|
|
|
|
"CREATE INDEX IF NOT EXISTS idx_conflicts_slot ON conflicts(slot_kind, slot_key)"
|
|
|
|
|
)
|
|
|
|
|
conn.execute(
|
|
|
|
|
"CREATE INDEX IF NOT EXISTS idx_conflict_members_conflict ON conflict_members(conflict_id)"
|
|
|
|
|
)
|
|
|
|
|
conn.execute(
|
|
|
|
|
"CREATE INDEX IF NOT EXISTS idx_conflict_members_member ON conflict_members(member_kind, member_id)"
|
|
|
|
|
)
|
|
|
|
|
|
feat(phase9-A): interaction capture loop foundation
Phase 9 Commit A from the agreed plan: turn AtoCore from a stateless
context enhancer into a system that records what it actually fed to an
LLM and what came back. This is the audit trail Reflection (Commit B)
and Extraction (Commit C) will be layered on top of.
The interactions table existed in the schema since the original PoC
but nothing wrote to it. This change makes it real:
Schema migration (additive only):
- response full LLM response (caller decides how much)
- memories_used JSON list of memory ids in the context pack
- chunks_used JSON list of chunk ids in the context pack
- client identifier of the calling system
(openclaw, claude-code, manual, ...)
- session_id groups multi-turn conversations
- project project name (mirrors the memory module pattern,
no FK so capture stays cheap)
- indexes on session_id, project, created_at
The created_at column is now written explicitly with a SQLite-compatible
'YYYY-MM-DD HH:MM:SS' format so the same string lives in the DB and the
returned dataclass. Without this the `since` filter on list_interactions
would silently fail because CURRENT_TIMESTAMP and isoformat use different
shapes that do not compare cleanly as strings.
New module src/atocore/interactions/:
- Interaction dataclass
- record_interaction() persists one round-trip (prompt required;
everything else optional). Refuses empty prompts.
- list_interactions() filters by project / session_id / client / since,
newest-first, hard-capped at 500
- get_interaction() fetch by id, full response + context pack
API endpoints:
- POST /interactions capture one interaction
- GET /interactions list with summaries (no full response)
- GET /interactions/{id} full record incl. response + pack
Trust model:
- Capture is read-only with respect to memories, project state, and
source chunks. Nothing here promotes anything into trusted state.
- The audit trail becomes the dataset Commit B (reinforcement) and
Commit C (extraction + review queue) will operate on.
Tests (13 new, all green):
- service: persist + roundtrip every field
- service: minimum-fields path (prompt only)
- service: empty / whitespace prompt rejected
- service: get by id returns None for missing
- service: filter by project, session, client
- service: ordering newest-first with limit
- service: since filter inclusive on cutoff (the bug the timestamp
fix above caught)
- service: limit=0 returns empty
- API: POST records and round-trips through GET /interactions/{id}
- API: empty prompt returns 400
- API: missing id returns 404
- API: list filter returns summaries (not full response bodies)
Full suite: 118 passing (was 105).
master-plan-status.md updated to move Phase 9 from "not started" to
"started" with the explicit note that Commit A is in and Commits B/C
remain.
2026-04-06 19:31:43 -04:00
|
|
|
# Phase 9 Commit A: capture loop columns on the interactions table.
|
|
|
|
|
# The original schema only carried prompt + project_id + a context_pack
|
|
|
|
|
# JSON blob. To make interactions a real audit trail of what AtoCore fed
|
|
|
|
|
# the LLM and what came back, we record the full response, the chunk
|
|
|
|
|
# and memory ids that were actually used, plus client + session metadata.
|
|
|
|
|
if not _column_exists(conn, "interactions", "response"):
|
|
|
|
|
conn.execute("ALTER TABLE interactions ADD COLUMN response TEXT DEFAULT ''")
|
|
|
|
|
if not _column_exists(conn, "interactions", "memories_used"):
|
|
|
|
|
conn.execute("ALTER TABLE interactions ADD COLUMN memories_used TEXT DEFAULT '[]'")
|
|
|
|
|
if not _column_exists(conn, "interactions", "chunks_used"):
|
|
|
|
|
conn.execute("ALTER TABLE interactions ADD COLUMN chunks_used TEXT DEFAULT '[]'")
|
|
|
|
|
if not _column_exists(conn, "interactions", "client"):
|
|
|
|
|
conn.execute("ALTER TABLE interactions ADD COLUMN client TEXT DEFAULT ''")
|
|
|
|
|
if not _column_exists(conn, "interactions", "session_id"):
|
|
|
|
|
conn.execute("ALTER TABLE interactions ADD COLUMN session_id TEXT DEFAULT ''")
|
|
|
|
|
if not _column_exists(conn, "interactions", "project"):
|
|
|
|
|
conn.execute("ALTER TABLE interactions ADD COLUMN project TEXT DEFAULT ''")
|
|
|
|
|
conn.execute(
|
|
|
|
|
"CREATE INDEX IF NOT EXISTS idx_interactions_session ON interactions(session_id)"
|
|
|
|
|
)
|
|
|
|
|
conn.execute(
|
|
|
|
|
"CREATE INDEX IF NOT EXISTS idx_interactions_project_name ON interactions(project)"
|
|
|
|
|
)
|
|
|
|
|
conn.execute(
|
|
|
|
|
"CREATE INDEX IF NOT EXISTS idx_interactions_created_at ON interactions(created_at)"
|
|
|
|
|
)
|
|
|
|
|
|
2026-04-05 17:53:23 -04:00
|
|
|
|
|
|
|
|
def _column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
|
|
|
|
|
rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
|
|
|
|
|
return any(row["name"] == column for row in rows)
|
|
|
|
|
|
|
|
|
|
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
@contextmanager
|
|
|
|
|
def get_connection() -> Generator[sqlite3.Connection, None, None]:
|
|
|
|
|
"""Get a database connection with row factory."""
|
|
|
|
|
_ensure_data_dir()
|
2026-04-06 10:15:00 -04:00
|
|
|
conn = sqlite3.connect(
|
|
|
|
|
str(_config.settings.db_path),
|
|
|
|
|
timeout=_config.settings.db_busy_timeout_ms / 1000,
|
|
|
|
|
)
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
conn.row_factory = sqlite3.Row
|
|
|
|
|
conn.execute("PRAGMA foreign_keys = ON")
|
2026-04-06 10:15:00 -04:00
|
|
|
conn.execute(f"PRAGMA busy_timeout = {_config.settings.db_busy_timeout_ms}")
|
|
|
|
|
conn.execute("PRAGMA journal_mode = WAL")
|
|
|
|
|
conn.execute("PRAGMA synchronous = NORMAL")
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
try:
|
|
|
|
|
yield conn
|
|
|
|
|
conn.commit()
|
|
|
|
|
except Exception:
|
|
|
|
|
conn.rollback()
|
|
|
|
|
raise
|
|
|
|
|
finally:
|
|
|
|
|
conn.close()
|