ATOCore

Author	SHA1	Message	Date
Anto01	ea3fed3d44	feat(phase9-A): interaction capture loop foundation Phase 9 Commit A from the agreed plan: turn AtoCore from a stateless context enhancer into a system that records what it actually fed to an LLM and what came back. This is the audit trail Reflection (Commit B) and Extraction (Commit C) will be layered on top of. The interactions table existed in the schema since the original PoC but nothing wrote to it. This change makes it real: Schema migration (additive only): - response full LLM response (caller decides how much) - memories_used JSON list of memory ids in the context pack - chunks_used JSON list of chunk ids in the context pack - client identifier of the calling system (openclaw, claude-code, manual, ...) - session_id groups multi-turn conversations - project project name (mirrors the memory module pattern, no FK so capture stays cheap) - indexes on session_id, project, created_at The created_at column is now written explicitly with a SQLite-compatible 'YYYY-MM-DD HH:MM:SS' format so the same string lives in the DB and the returned dataclass. Without this the `since` filter on list_interactions would silently fail because CURRENT_TIMESTAMP and isoformat use different shapes that do not compare cleanly as strings. New module src/atocore/interactions/: - Interaction dataclass - record_interaction() persists one round-trip (prompt required; everything else optional). Refuses empty prompts. - list_interactions() filters by project / session_id / client / since, newest-first, hard-capped at 500 - get_interaction() fetch by id, full response + context pack API endpoints: - POST /interactions capture one interaction - GET /interactions list with summaries (no full response) - GET /interactions/{id} full record incl. response + pack Trust model: - Capture is read-only with respect to memories, project state, and source chunks. Nothing here promotes anything into trusted state. - The audit trail becomes the dataset Commit B (reinforcement) and Commit C (extraction + review queue) will operate on. Tests (13 new, all green): - service: persist + roundtrip every field - service: minimum-fields path (prompt only) - service: empty / whitespace prompt rejected - service: get by id returns None for missing - service: filter by project, session, client - service: ordering newest-first with limit - service: since filter inclusive on cutoff (the bug the timestamp fix above caught) - service: limit=0 returns empty - API: POST records and round-trips through GET /interactions/{id} - API: empty prompt returns 400 - API: missing id returns 404 - API: list filter returns summaries (not full response bodies) Full suite: 118 passing (was 105). master-plan-status.md updated to move Phase 9 from "not started" to "started" with the explicit note that Commit A is in and Commits B/C remain.	2026-04-06 19:31:43 -04:00
Anto01	c9b9eede25	feat: tunable ranking, refresh status, chroma backup + admin endpoints Three small improvements that move the operational baseline forward without changing the existing trust model. 1. Tunable retrieval ranking weights - rank_project_match_boost, rank_query_token_step, rank_query_token_cap, rank_path_high_signal_boost, rank_path_low_signal_penalty are now Settings fields - all overridable via ATOCORE_* env vars - retriever no longer hard-codes 2.0 / 1.18 / 0.72 / 0.08 / 1.32 - lets ranking be tuned per environment as Wave 1 is exercised without code changes 2. /projects/{name}/refresh status - refresh_registered_project now returns an overall status field ("ingested", "partial", "nothing_to_ingest") plus roots_ingested and roots_skipped counters - ProjectRefreshResponse advertises the new fields so callers can rely on them - covers the case where every configured root is missing on disk 3. Chroma cold snapshot + admin backup endpoints - create_runtime_backup now accepts include_chroma and writes a cold directory copy of the chroma persistence path - new list_runtime_backups() and validate_backup() helpers - new endpoints: - POST /admin/backup create snapshot (optional chroma) - GET /admin/backup list snapshots - GET /admin/backup/{stamp}/validate structural validation - chroma snapshots are taken under exclusive_ingestion() so a refresh or ingest cannot race with the cold copy - backup metadata records what was actually included and how big Tests: - 8 new tests covering tunable weights, refresh status branches (ingested / partial / nothing_to_ingest), chroma snapshot, list, validate, and the API endpoints (including the lock-acquisition path) - existing fake refresh stubs in test_api_storage.py updated for the expanded ProjectRefreshResponse model - full suite: 105 passing (was 97) next-steps doc updated to reflect that the chroma snapshot + restore validation gap from current-state.md is now closed in code; only the operational retention policy remains.	2026-04-06 18:42:19 -04:00
Anto01	14ab7c8e9f	fix: pass project_hint into retrieve and add path-signal ranking Two changes that belong together: 1. builder.build_context() now passes project_hint into retrieve(), so the project-aware boost actually fires for the retrieval pipeline driven by /context/build. Before this, only direct /query callers benefited from the registered-project boost. 2. retriever now applies two more ranking signals on every chunk: - _query_match_boost: boosts chunks whose source/title/heading echo high-signal query tokens (stop list filters out generic words like "the", "project", "system") - _path_signal_boost: down-weights archival noise (_archive, _history, pre-cleanup, reviews) by 0.72 and up-weights current high-signal docs (status, decision, requirements, charter, system-map, error-budget, ...) by 1.18 Tests: - test_context_builder_passes_project_hint_to_retrieval verifies the wiring fix - test_retrieve_downranks_archive_noise_and_prefers_high_signal_paths verifies the new ranking helpers prefer current docs over archive This addresses the cross-project competition and archive bleed called out in current-state.md after the Wave 1 ingestion.	2026-04-06 18:37:07 -04:00
Anto01	bdb42dba05	Expand active project wave and serialize refreshes	2026-04-06 14:58:14 -04:00
Anto01	26bfa94c65	Add project-aware boost to raw query	2026-04-06 13:32:33 -04:00
Anto01	06aa931273	Add project registry update flow	2026-04-06 12:31:24 -04:00
Anto01	c9757e313a	Harden runtime and add backup foundation	2026-04-06 10:15:00 -04:00
Anto01	9715fe3143	Add project registration endpoint	2026-04-06 09:52:19 -04:00
Anto01	1f1e6b5749	Add project registration proposal preview	2026-04-06 09:11:11 -04:00
Anto01	827dcf2cd1	Add project registration policy and template	2026-04-06 08:46:37 -04:00
Anto01	8293099025	Add project registry refresh foundation	2026-04-06 08:02:13 -04:00
Anto01	6bfa1fcc37	Add Dalidou storage foundation and deployment prep	2026-04-05 18:33:52 -04:00
Anto01	b0889b3925	Stabilize core correctness and sync project plan state	2026-04-05 17:53:23 -04:00
Anto01	b48f0c95ab	feat: Phase 2 Memory Core — structured memory with context integration Memory Core implementation: - Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation - CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede - Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid) - Memory API endpoints: POST/GET/PUT/DELETE /memory Context builder integration (trust precedence per Master Plan): 1. Trusted Project State (highest trust, 20% budget) 2. Identity + Preference memories (10% budget) 3. Retrieved chunks (remaining budget) Also fixed database.py to use dynamic settings reference for test isolation. 45/45 tests passing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 09:54:52 -04:00
Anto01	531c560db7	feat: Phase 1 ingestion hardening + Phase 5 Trusted Project State Phase 1 - Ingestion hardening: - Encoding fallback (UTF-8/UTF-8-sig/Latin-1/CP1252) - Delete detection: purge DB/vector entries for removed files - Ingestion stats endpoint (GET /stats) Phase 5 - Trusted Project State: - project_state table with categories (status, decision, requirement, contact, milestone, fact, config) - CRUD API: POST/GET/DELETE /project/state - Upsert semantics, invalidation (supersede) support - Context builder integrates project state at highest trust precedence - Project state gets 20% budget allocation, appears first in context - Trust precedence: Project State > Retrieved Chunks (per Master Plan) 33/33 tests passing. Validated end-to-end with GigaBIT M1 project data. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 09:41:59 -04:00
Anto01	b4afbbb53a	feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC) Complete implementation of the personal context engine foundation: - FastAPI server with 5 endpoints (ingest, query, context/build, health, debug) - SQLite database with 5 tables (documents, chunks, memories, projects, interactions) - Heading-aware markdown chunker (800 char max, recursive splitting) - Multilingual embeddings via sentence-transformers (EN/FR) - ChromaDB vector store with cosine similarity retrieval - Context builder with project boosting, dedup, and budget enforcement - CLI scripts for batch ingestion and test prompt evaluation - 19 unit tests passing, 79% coverage - Validated on 482 real project files (8383 chunks, 0 errors) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 09:21:27 -04:00

16 Commits