2026-04-05 09:41:59 -04:00
"""Context pack assembly: retrieve, rank, budget, format.

Trust precedence (per Master Plan):
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
1. Trusted Project State → always included first, highest authority
2. Identity + Preference memories → included next
3. Retrieved chunks → ranked, deduplicated, budget-constrained
"""
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00

import time
from dataclasses import dataclass, field
from pathlib import Path

2026-04-05 17:53:23 -04:00
import atocore.config as _config
from atocore.context.project_state import format_project_state, get_state
from atocore.memory.service import get_memories_for_context
from atocore.observability.logger import get_logger
fix(P1+P2): canonicalize project names at every trust boundary
Three findings from codex's review of the previous P1+P2 fix. The
earlier commit (f2372ef) only fixed alias resolution at the context
builder. Codex correctly pointed out that the same fragmentation
applies at every other place a project name crosses a boundary —
project_state writes/reads, interaction capture/listing/filtering,
memory create/queries, and reinforcement's downstream queries. Plus
a real bug in the interaction `since` filter where the storage
format and the documented ISO format don't compare cleanly.
The fix is one helper used at every boundary instead of duplicating
the resolution inline.
New helper: src/atocore/projects/registry.py::resolve_project_name
---------------------------------------------------------------
- Single canonicalization boundary for project names
- Returns the canonical project_id when the input matches any
registered id or alias
- Returns the input unchanged for empty/None and for unregistered
names (preserves backwards compat with hand-curated state that
predates the registry)
- Documented as the contract that every read/write at the trust
boundary should pass through
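The contract described above can be sketched roughly as follows. This is an illustration only, not the code in src/atocore/projects/registry.py: the `RegisteredProject` record, the in-memory `_REGISTRY` list, and the case-insensitive match are assumptions made for the example. Only the function name and its return rules (canonical id for a registered id or alias, input unchanged otherwise) come from this commit message.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class RegisteredProject:
    """Hypothetical registry record: one canonical id plus its aliases."""

    project_id: str
    aliases: list[str] = field(default_factory=list)


# Hypothetical in-memory registry standing in for the real registry file.
_REGISTRY = [
    RegisteredProject("p05-interferometer", ["p05", "interferometer"]),
]


def resolve_project_name(name: str | None) -> str | None:
    """Return the canonical project_id for a registered id or alias.

    Empty/None input and unregistered names are returned unchanged, which
    keeps hand-curated state that predates the registry reachable.
    """
    if not name:
        return name
    # Assumption: matching ignores case and surrounding whitespace.
    needle = name.strip().lower()
    for project in _REGISTRY:
        candidates = [project.project_id, *project.aliases]
        if needle in (candidate.lower() for candidate in candidates):
            return project.project_id
    return name
```

Because every read and write goes through the same function, a caller passing "p05" and a caller passing "p05-interferometer" end up addressing the same row.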
P1 — Trusted Project State endpoints
------------------------------------
src/atocore/context/project_state.py: set_state, get_state, and
invalidate_state now all canonicalize project_name through
resolve_project_name BEFORE looking up or creating the project row.
Before this fix:
- POST /project/state with project="p05" called ensure_project("p05")
which created a separate row in the projects table
- The state row was attached to that alias project_id
- Later context builds canonicalized "p05" -> "p05-interferometer"
via the builder fix from f2372ef and never found the state
- Result: trusted state silently fragmented across alias rows
After this fix:
- The alias is resolved to the canonical id at every entry point
- Two captures (one via "p05", one via "p05-interferometer") write
to the same row
- get_state via either alias or the canonical id finds the same row
Fixes the highest-priority gap codex flagged because Trusted Project
State is supposed to be the most dependable layer in the AtoCore
trust hierarchy.
P2.a — Interaction capture project canonicalization
----------------------------------------------------
src/atocore/interactions/service.py: record_interaction now
canonicalizes project before storing, so interaction.project is
always the canonical id regardless of what the client passed.
Downstream effects:
- reinforce_from_interaction queries memories by interaction.project
-> previously missed memories stored under canonical id
-> now consistent because interaction.project IS the canonical id
- the extractor stamps candidates with interaction.project
-> previously created candidates in alias buckets
-> now creates candidates in the canonical bucket
- list_interactions(project=alias) was already broken, now fixed by
canonicalizing the filter input on the read side too
Memory service applied the same fix:
- src/atocore/memory/service.py: create_memory and get_memories
both canonicalize project through resolve_project_name
- This keeps stored memory.project consistent with the
reinforcement query path
P2.b — Interaction `since` filter format normalization
------------------------------------------------------
src/atocore/interactions/service.py: new _normalize_since helper.
The bug:
- created_at is stored as 'YYYY-MM-DD HH:MM:SS' (no timezone, UTC by
convention) so it sorts lexically and compares cleanly with the
SQLite CURRENT_TIMESTAMP default
- The `since` parameter was documented as ISO 8601 but compared as
a raw string against the storage format
- The lexically-greater 'T' separator means an ISO timestamp like
'2026-04-07T12:00:00Z' is GREATER than the storage form
'2026-04-07 12:00:00' for the same instant
- Result: a client passing ISO `since` got an empty result for any
row from the same day, even though those rows existed and were
technically "after" the cutoff in real-world time
The fix:
- _normalize_since accepts ISO 8601 with T, optional Z suffix,
optional fractional seconds, optional +HH:MM offsets
- Uses datetime.fromisoformat for parsing (Python 3.11+)
- Converts to UTC and reformats as the storage format before the
SQL comparison
- The bare storage format still works (backwards compat path is a
regex match that returns the input unchanged)
- Unparseable input is returned as-is so the comparison degrades
gracefully (rows just don't match) instead of raising and
breaking the listing endpoint
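A rough sketch of the normalization described above, under stated assumptions. The function name `normalize_since` is a stand-in for the private `_normalize_since` helper, whose actual body is not shown here. Treating a timezone-less ISO input as UTC is an assumption, and the explicit handling of the `Z` suffix is added only so the sketch also runs on Python versions older than 3.11.

```python
import re
from datetime import datetime, timezone

# Storage format used for created_at: 'YYYY-MM-DD HH:MM:SS', UTC by convention.
_STORAGE_RE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}$")
_STORAGE_FMT = "%Y-%m-%d %H:%M:%S"


def normalize_since(since: str) -> str:
    """Convert an ISO 8601 cutoff to the storage format (illustrative)."""
    # Backwards-compat path: the bare storage format passes through unchanged.
    if _STORAGE_RE.match(since):
        return since
    # Map a trailing 'Z' to an explicit offset so older Pythons can parse it.
    candidate = since[:-1] + "+00:00" if since.endswith("Z") else since
    try:
        parsed = datetime.fromisoformat(candidate)
    except ValueError:
        # Unparseable input is returned as-is; the SQL comparison then
        # degrades to a string comparison instead of raising.
        return since
    if parsed.tzinfo is None:
        # Assumption: naive ISO input is interpreted as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).strftime(_STORAGE_FMT)
```

The underlying bug is visible with plain string comparison: `"2026-04-07T12:00:00Z" > "2026-04-07 12:30:00"` is true because `T` sorts after the space character, so a row stored half an hour after the cutoff was still filtered out.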
builder.py refactor
-------------------
The previous P1 fix had inline canonicalization. Now it uses the
shared helper for consistency:
- import changed from get_registered_project to resolve_project_name
- the inline lookup is replaced with a single helper call
- the comment block now points at representation-authority.md for
the canonicalization contract
New shared test fixture: tests/conftest.py::project_registry
------------------------------------------------------------
- Standardizes the registry-setup pattern that was duplicated
across test_context_builder.py, test_project_state.py,
test_interactions.py, and test_reinforcement.py
- Returns a callable that takes (project_id, [aliases]) tuples
and writes them into a temp registry file with the env var
pointed at it and config.settings reloaded
- Used by all 12 new regression tests in this commit
Tests (12 new, all green on first run)
--------------------------------------
test_project_state.py:
- test_set_state_canonicalizes_alias: write via alias, read via
every alias and the canonical id, verify same row id
- test_get_state_canonicalizes_alias_after_canonical_write
- test_invalidate_state_canonicalizes_alias
- test_unregistered_project_state_still_works (backwards compat)
test_interactions.py:
- test_record_interaction_canonicalizes_project
- test_list_interactions_canonicalizes_project_filter
- test_list_interactions_since_accepts_iso_with_t_separator
- test_list_interactions_since_accepts_z_suffix
- test_list_interactions_since_accepts_offset
- test_list_interactions_since_storage_format_still_works
test_reinforcement.py:
- test_reinforcement_works_when_capture_uses_alias (end-to-end:
capture under alias, seed memory under canonical, verify
reinforcement matches)
- test_get_memories_filter_by_alias
Full suite: 174 passing (was 162), 1 warning. The +12 is the
new regression tests, no existing tests regressed.
What's still NOT canonicalized (and why)
----------------------------------------
- _rank_chunks's secondary substring boost in builder.py — the
retriever already does the right thing via its own
_project_match_boost which calls get_registered_project. The
redundant secondary boost still uses the raw hint but it's a
multiplicative factor on top of correct retrieval, not a
filter, so it can't drop relevant chunks. Tracked as a future
cleanup but not a P1.
- update_memory's project field (you can't change a memory's
project after creation in the API anyway).
- The retriever's project_hint parameter on direct /query calls
— same reasoning as the builder boost, plus the retriever's
own get_registered_project call already handles aliases there.
2026-04-07 08:29:33 -04:00
from atocore.projects.registry import resolve_project_name
from atocore.retrieval.retriever import ChunkResult, retrieve

log = get_logger("context_builder")

SYSTEM_PREFIX = (
    "You have access to the following personal context from the user's knowledge base.\n"
    "Use it to inform your answer. If the context is not relevant, ignore it.\n"
    "Do not mention the context system unless asked.\n"
    "When project state is provided, treat it as the most authoritative source."
)

# Budget allocation (per Master Plan section 9):
# identity: 5%, preferences: 5%, project state: 20%, retrieval: 60%+
PROJECT_STATE_BUDGET_RATIO = 0.20
MEMORY_BUDGET_RATIO = 0.10  # 5% identity + 5% preference
2026-04-11 11:35:40 -04:00
# Project-scoped memories (project/knowledge/episodic) are the outlet
# for the Phase 9 reflection loop on the retrieval side. Budget sits
# between identity/preference and retrieved chunks so a reinforced
# memory can actually reach the model.
PROJECT_MEMORY_BUDGET_RATIO = 0.15
PROJECT_MEMORY_TYPES = ["project", "knowledge", "episodic"]
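As a side note on how these ratios interact, the following sketch carves one character budget into per-tier caps. The helper `split_budget` and its return keys are hypothetical and do not appear in this module. The sketch also assumes every tier consumes its full cap, whereas the real builder may hand unused room to later tiers. The clamp mirrors the `min(budget, max(0, int(budget * ratio)))` pattern used for the project state budget in `build_context`.

```python
from __future__ import annotations

PROJECT_STATE_BUDGET_RATIO = 0.20
MEMORY_BUDGET_RATIO = 0.10
PROJECT_MEMORY_BUDGET_RATIO = 0.15


def _cap(budget: int, ratio: float) -> int:
    """Clamp one tier's share to the range [0, budget]."""
    return min(budget, max(0, int(budget * ratio)))


def split_budget(budget: int) -> dict[str, int]:
    """Hypothetical helper: per-tier character caps for one context pack."""
    budget = max(budget, 0)
    project_state = _cap(budget, PROJECT_STATE_BUDGET_RATIO)
    memory = _cap(budget, MEMORY_BUDGET_RATIO)
    project_memory = _cap(budget, PROJECT_MEMORY_BUDGET_RATIO)
    # Assumption: retrieved chunks receive whatever the capped tiers leave.
    retrieval = budget - project_state - memory - project_memory
    return {
        "project_state": project_state,
        "memory": memory,
        "project_memory": project_memory,
        "retrieval": retrieval,
    }
```

With all three caps fully used, roughly 55% of the budget is left for retrieved chunks. Retrieval only reaches the 60%+ mentioned in the allocation comment when an earlier tier leaves part of its share unused.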

# Last built context pack for debug inspection
_last_context_pack: "ContextPack | None" = None


@dataclass
class ContextChunk:
    content: str
    source_file: str
    heading_path: str
    score: float
    char_count: int


@dataclass
class ContextPack:
    chunks_used: list[ContextChunk] = field(default_factory=list)
    project_state_text: str = ""
    project_state_chars: int = 0
    memory_text: str = ""
    memory_chars: int = 0
    project_memory_text: str = ""
    project_memory_chars: int = 0
    total_chars: int = 0
    budget: int = 0
    budget_remaining: int = 0
    formatted_context: str = ""
    full_prompt: str = ""
    query: str = ""
    project_hint: str = ""
    duration_ms: int = 0


def build_context(
    user_prompt: str,
    project_hint: str | None = None,
    budget: int | None = None,
) -> ContextPack:
    """Build a context pack for a user prompt.

    Trust precedence applied:
    1. Project state is injected first (highest trust)
    2. Identity + preference memories (second trust level)
    3. Retrieved chunks fill the remaining budget
    """
    global _last_context_pack
    start = time.time()
    budget = _config.settings.context_budget if budget is None else max(budget, 0)

    # 1. Get Trusted Project State (highest precedence)
    project_state_text = ""
    project_state_chars = 0
    project_state_budget = min(
        budget,
        max(0, int(budget * PROJECT_STATE_BUDGET_RATIO)),
    )

    # Canonicalize the project hint through the registry so callers
    # can pass an alias (`p05`, `gigabit`) and still find trusted
    # state stored under the canonical project id. The same helper
    # is used everywhere a project name crosses a trust boundary
    # (project_state, memories, interactions). When the registry has
    # no entry the helper returns the input unchanged so hand-curated
    # state that predates the registry still works.
    canonical_project = resolve_project_name(project_hint) if project_hint else ""
    if canonical_project:
fix(P1+P2): alias-aware project state lookup + slash command corpus fallback
Two regression fixes from codex's review of the slash command
refactor commit (78d4e97). Both findings are real and now have
covered tests.
P1 — server-side alias resolution for project_state lookup
----------------------------------------------------------
The bug:
- /context/build forwarded the caller's project hint verbatim to
get_state(project_hint), which does an exact-name lookup against
the projects table (case-insensitive but no alias resolution)
- the project registry's alias matching was only used by the
client's auto-context path and the retriever's project-match
boost, never by the server's project_state lookup
- consequence: /atocore-context "... p05" would silently miss
trusted project state stored under the canonical id
"p05-interferometer", weakening project-hinted retrieval to
the point that an explicit alias hint was *worse* than no hint
The fix in src/atocore/context/builder.py:
- import get_registered_project from the projects registry
- before calling get_state(project_hint), resolve the hint
through get_registered_project; if a registry record exists,
use the canonical project_id for the state lookup
- if no registry record exists, fall back to the raw hint so a
hand-curated project_state entry that predates the registry
still works (backwards compat with pre-registry deployments)
The retriever already does its own alias expansion via
get_registered_project for the project-match boost, so the
retriever side was never broken — only the project_state lookup
in the builder. The fix is scoped to that one call site.
Tests added in tests/test_context_builder.py:
- test_alias_hint_resolves_through_registry: stands up a fresh
registry, sets state under "p05-interferometer", then verifies
build_context with project_hint="p05" finds the state, AND
with project_hint="interferometer" (the second alias) finds it
too, AND with the canonical id finds it. Covers all three
resolution paths.
- test_unknown_hint_falls_back_to_raw_lookup: empty registry,
set state under an unregistered project name, verify the
build_context call with that name as the hint still finds the
state. Locks in the backwards-compat behavior.
P2 — slash command no-hint fallback to corpus-wide context build
----------------------------------------------------------------
The bug:
- the slash command's no-hint path called auto-context, which
returns {"status": "no_project_match"} when project detection
fails and does NOT fall back to a plain context-build
- the slash command's own help text told the user "call without
a hint to use the corpus-wide context build" — which was a lie
because the wrapper no longer did that
- consequence: generic prompts like "what changed in AtoCore
backup policy?" or any cross-project question got a useless
no_project_match envelope instead of a context pack
The fix in .claude/commands/atocore-context.md:
- the no-hint path now does the 2-step fallback dance:
1. try `auto-context "<prompt>"` for project detection
2. if the response contains "no_project_match", fall back to
`context-build "<prompt>"` (no project arg)
- both branches return a real context pack, fail-open envelope
is preserved for genuine network errors
- the underlying client surface is unchanged (no new flags, no
new subcommands) — the fallback is per-frontend logic in the
slash command, leaving auto-context's existing semantics
intact for OpenClaw and any other caller that depends on the
no_project_match envelope as a "do nothing" signal
While I was here, also tightened the slash command's argument
parsing to delegate alias-knowledge to the registry instead of
embedding a hardcoded list:
- old version had a literal list of "atocore", "p04", "p05",
"p06" and their aliases that needed manual maintenance every
time a project was added
- new version takes the last token of $ARGUMENTS and asks the
client's `detect-project` subcommand whether it's a known
alias; if matched, it's the explicit hint, if not it's part
of the prompt
- this delegates registry knowledge to the registry, where it
belongs
Unrelated improvement noted but NOT fixed in this commit:
- _rank_chunks in builder.py also has a naive substring boost
that uses the original hint without alias expansion. The
retriever already does the right thing, so this secondary
boost is redundant. Tracked as a future cleanup but not in
scope for the P1/P2 fix; codex's findings are about
project_state lookup, not about the secondary chunk boost.
Full suite: 162 passing (was 160), 1 warning. The +2 is the two
new P1 regression tests.
2026-04-07 07:47:03 -04:00
|
|
|
        state_entries = get_state(canonical_project)
        if state_entries:
            project_state_text = format_project_state(state_entries)
            project_state_text, project_state_chars = _truncate_text_block(
                project_state_text,
                project_state_budget or budget,
            )

    # 2. Get identity + preference memories (second precedence)
    memory_budget = min(int(budget * MEMORY_BUDGET_RATIO), max(budget - project_state_chars, 0))
    memory_text, memory_chars = get_memories_for_context(
        memory_types=["identity", "preference"],
        budget=memory_budget,
    )

    # 2b. Get project-scoped memories (third precedence). Only
    # populated when a canonical project is in scope; cross-project
    # memory bleed would rot the pack. Active-only filtering is
    # handled by the shared min_confidence=0.5 gate inside
    # get_memories_for_context.
    project_memory_text = ""
    project_memory_chars = 0
    if canonical_project:
        project_memory_budget = min(
            int(budget * PROJECT_MEMORY_BUDGET_RATIO),
            max(budget - project_state_chars - memory_chars, 0),
        )
        project_memory_text, project_memory_chars = get_memories_for_context(
            memory_types=PROJECT_MEMORY_TYPES,
            project=canonical_project,
            budget=project_memory_budget,
            header="--- Project Memories ---",
            footer="--- End Project Memories ---",
        )

    # 3. Calculate remaining budget for retrieval
    retrieval_budget = budget - project_state_chars - memory_chars - project_memory_chars

    # 4. Retrieve candidates
|
fix: pass project_hint into retrieve and add path-signal ranking
Two changes that belong together:
1. builder.build_context() now passes project_hint into retrieve(),
so the project-aware boost actually fires for the retrieval pipeline
driven by /context/build. Before this, only direct /query callers
benefited from the registered-project boost.
2. retriever now applies two more ranking signals on every chunk:
- _query_match_boost: boosts chunks whose source/title/heading
echo high-signal query tokens (stop list filters out generic
words like "the", "project", "system")
- _path_signal_boost: down-weights archival noise (_archive,
_history, pre-cleanup, reviews) by 0.72 and up-weights current
high-signal docs (status, decision, requirements, charter,
system-map, error-budget, ...) by 1.18
Tests:
- test_context_builder_passes_project_hint_to_retrieval verifies
the wiring fix
- test_retrieve_downranks_archive_noise_and_prefers_high_signal_paths
verifies the new ranking helpers prefer current docs over archive
This addresses the cross-project competition and archive bleed
called out in current-state.md after the Wave 1 ingestion.
2026-04-06 18:37:07 -04:00
|
|
|
    candidates = (
        retrieve(
            user_prompt,
            top_k=_config.settings.context_top_k,
            project_hint=project_hint,
        )
        if retrieval_budget > 0
        else []
    )

    # 5. Score and rank
    scored = _rank_chunks(candidates, project_hint)

    # 6. Select within remaining budget
    selected = _select_within_budget(scored, max(retrieval_budget, 0))

    # 7. Format full context
    formatted = _format_full_context(
        project_state_text, memory_text, project_memory_text, selected
    )
    if len(formatted) > budget:
        formatted, selected = _trim_context_to_budget(
            project_state_text,
            memory_text,
            project_memory_text,
            selected,
            budget,
        )

    # 8. Build full prompt
    full_prompt = f"{SYSTEM_PREFIX}\n\n{formatted}\n\n{user_prompt}"

    project_state_chars = len(project_state_text)
    memory_chars = len(memory_text)
    project_memory_chars = len(project_memory_text)
    retrieval_chars = sum(c.char_count for c in selected)
    total_chars = len(formatted)
    duration_ms = int((time.time() - start) * 1000)

    pack = ContextPack(
        chunks_used=selected,
        project_state_text=project_state_text,
        project_state_chars=project_state_chars,
        memory_text=memory_text,
        memory_chars=memory_chars,
        project_memory_text=project_memory_text,
        project_memory_chars=project_memory_chars,
        total_chars=total_chars,
        budget=budget,
        budget_remaining=budget - total_chars,
        formatted_context=formatted,
        full_prompt=full_prompt,
        query=user_prompt,
        project_hint=project_hint or "",
        duration_ms=duration_ms,
    )

    _last_context_pack = pack

    log.info(
        "context_built",
        chunks_used=len(selected),
        project_state_chars=project_state_chars,
        memory_chars=memory_chars,
        project_memory_chars=project_memory_chars,
        retrieval_chars=retrieval_chars,
        total_chars=total_chars,
        budget_remaining=budget - total_chars,
        duration_ms=duration_ms,
    )
    log.debug("context_pack_detail", pack=_pack_to_dict(pack))

    return pack

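The budget cascade used in steps 2 to 3 above can be tried in isolation. This is a minimal sketch, not the module's code: the function name `cascade_budgets` is hypothetical, the 10% identity/preference ratio comes from the Phase 2 commit message, and the 15% project-memory ratio is an assumed placeholder because the real value of `PROJECT_MEMORY_BUDGET_RATIO` is not shown here.

```python
def cascade_budgets(
    budget: int,
    state_chars: int,
    memory_chars: int,
    project_memory_chars: int,
    memory_ratio: float = 0.10,          # from the commit message (10% budget)
    project_memory_ratio: float = 0.15,  # ASSUMPTION: placeholder value
) -> tuple[int, int, int]:
    """Return (memory_budget, project_memory_budget, retrieval_budget).

    Each tier is capped both by its ratio of the total budget and by
    whatever the higher-trust tiers have left over.
    """
    memory_budget = min(int(budget * memory_ratio), max(budget - state_chars, 0))
    project_memory_budget = min(
        int(budget * project_memory_ratio),
        max(budget - state_chars - memory_chars, 0),
    )
    # Retrieval gets the remainder; clamp so it never goes negative.
    retrieval_budget = max(
        budget - state_chars - memory_chars - project_memory_chars, 0
    )
    return memory_budget, project_memory_budget, retrieval_budget


# Hypothetical numbers: a 3000-char pack where project state used 600 chars,
# identity/preference memories used 250 and project memories used 300.
normal = cascade_budgets(3000, 600, 250, 300)

# Edge case: project state alone consumed the whole budget.
exhausted = cascade_budgets(1000, 1000, 0, 0)
```

The point of taking the `min` of the ratio cap and the leftover is that a lower-trust tier can never push a higher-trust tier out of the pack, and retrieval is skipped entirely once the remainder reaches zero.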
def get_last_context_pack() -> ContextPack | None:
    """Return the last built context pack for debug inspection."""
    return _last_context_pack

def _rank_chunks(
    candidates: list[ChunkResult],
    project_hint: str | None,
) -> list[tuple[float, ChunkResult]]:
    """Rank candidates with boosting for project match."""
    scored = []
    seen_content: set[str] = set()

    for chunk in candidates:
        # Deduplicate by content prefix (first 200 chars)
        content_key = chunk.content[:200]
        if content_key in seen_content:
            continue
        seen_content.add(content_key)

        # Base score from similarity
        final_score = chunk.score

        # Project boost
        if project_hint:
            tags_str = chunk.tags.lower() if chunk.tags else ""
            source_str = chunk.source_file.lower()
            title_str = chunk.title.lower() if chunk.title else ""
            hint_lower = project_hint.lower()

            if hint_lower in tags_str or hint_lower in source_str or hint_lower in title_str:
                final_score *= 1.3

        scored.append((final_score, chunk))

    # Sort by score descending
    scored.sort(key=lambda x: x[0], reverse=True)
    return scored

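To see how the prefix dedup and the 1.3x project boost interact, the ranking logic can be exercised on toy data. The sketch below is an illustration under assumptions: `Chunk` is a hypothetical stand-in for `ChunkResult` carrying only the fields the ranker reads, `rank` mirrors the logic of `_rank_chunks`, and the file names and scores are made up.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for ChunkResult (only the fields the ranker reads)."""
    content: str
    score: float
    source_file: str
    tags: str = ""
    title: str = ""


def rank(candidates, project_hint=None, boost=1.3):
    scored = []
    seen = set()
    for chunk in candidates:
        # Drop chunks whose first 200 characters were already seen.
        key = chunk.content[:200]
        if key in seen:
            continue
        seen.add(key)

        final = chunk.score
        if project_hint:
            hint = project_hint.lower()
            haystacks = (
                chunk.tags.lower(),
                chunk.source_file.lower(),
                chunk.title.lower(),
            )
            # Plain substring match on the raw hint, no alias expansion.
            if any(hint in h for h in haystacks):
                final *= boost
        scored.append((final, chunk))

    # Sort on the score only, so chunks themselves never get compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


chunks = [
    Chunk("Interferometer alignment notes", 0.60, "p05-interferometer/status.md"),
    Chunk("Generic backup policy", 0.70, "atocore/backup.md"),
    Chunk("Interferometer alignment notes", 0.55, "archive/old-status.md"),
]
ranked = rank(chunks, project_hint="p05")
```

With the hint `p05`, the first chunk is boosted from 0.60 to roughly 0.78 and overtakes the unrelated 0.70 chunk, while the third chunk is dropped as a duplicate of the first. Because the match is a plain substring test, a hint such as `interferometer` would also match here, but an alias that appears nowhere in the path, tags or title would not.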
def _select_within_budget(
    scored: list[tuple[float, ChunkResult]],
    budget: int,
) -> list[ContextChunk]:
    """Select top chunks that fit within the character budget."""
    selected = []
    used = 0

    for score, chunk in scored:
        chunk_len = len(chunk.content)
        if used + chunk_len > budget:
            continue
        selected.append(
            ContextChunk(
                content=chunk.content,
                source_file=_shorten_path(chunk.source_file),
                heading_path=chunk.heading_path,
                score=score,
                char_count=chunk_len,
            )
        )
        used += chunk_len

    return selected

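The selection loop skips an oversized chunk with `continue` rather than stopping with `break`, so a smaller, lower-ranked chunk can still fill the remaining space. A minimal sketch of that behaviour, using plain `(score, text)` tuples instead of `ChunkResult` objects (the function name and sample data are illustrative assumptions):

```python
def select_within_budget(scored, budget):
    """Greedy selection over (score, text) pairs already sorted by score."""
    selected = []
    used = 0
    for score, text in scored:
        if used + len(text) > budget:
            # Too large for the space left; keep scanning for smaller chunks.
            continue
        selected.append((score, text))
        used += len(text)
    return selected


scored = [
    (0.9, "a" * 500),
    (0.8, "b" * 700),  # does not fit after the first chunk
    (0.7, "c" * 300),  # still fits in the remaining 500 chars
]
picked = select_within_budget(scored, 1000)
picked_lengths = [len(text) for _, text in picked]
```

The trade-off is that the pack is filled more completely, at the cost of occasionally including a lower-scored chunk while a higher-scored one was left out for size.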
def _format_full_context(
|
|
|
|
|
project_state_text: str,
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
memory_text: str,
|
2026-04-11 11:35:40 -04:00
|
|
|
project_memory_text: str,
|
2026-04-05 09:41:59 -04:00
|
|
|
chunks: list[ContextChunk],
|
|
|
|
|
) -> str:
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
"""Format project state + memories + retrieved chunks into full context block."""
|
2026-04-05 09:41:59 -04:00
|
|
|
parts = []
|
|
|
|
|
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
# 1. Project state first (highest trust)
|
2026-04-05 09:41:59 -04:00
|
|
|
if project_state_text:
|
|
|
|
|
parts.append(project_state_text)
|
|
|
|
|
parts.append("")
|
|
|
|
|
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
    # 2. Identity + preference memories (second trust level)
    if memory_text:
        parts.append(memory_text)
        parts.append("")

    # 3. Project-scoped memories (third trust level)
    if project_memory_text:
        parts.append(project_memory_text)
        parts.append("")

    # 4. Retrieved chunks (lowest trust)
    if chunks:
        parts.append("--- AtoCore Retrieved Context ---")
        if project_state_text:
            parts.append("If retrieved context conflicts with Trusted Project State above, trust the Trusted Project State.")
        for chunk in chunks:
            parts.append(
                f"[Source: {chunk.source_file} | Section: {chunk.heading_path} | Score: {chunk.score:.2f}]"
            )
            parts.append(chunk.content)
            parts.append("")
        parts.append("--- End Context ---")
    elif not project_state_text and not memory_text and not project_memory_text:
        parts.append("--- AtoCore Context ---\nNo relevant context found.\n--- End Context ---")

    return "\n".join(parts)


def _shorten_path(path: str) -> str:
    """Shorten a path to its last three components for display."""
    p = Path(path)
    parts = p.parts
    if len(parts) > 3:
        return str(Path(*parts[-3:]))
    return str(p)
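
As a rough illustration of the helper above, the sketch below re-implements the same logic with `PurePosixPath` so that the result does not depend on the host operating system. The name `shorten_path` and the sample paths are assumptions made for this example only; the original function uses `Path`, whose separator follows the platform.

```python
from pathlib import PurePosixPath


def shorten_path(path: str) -> str:
    """Keep only the last three components of a path (illustrative copy)."""
    p = PurePosixPath(path)  # assumption: POSIX-style paths, for a stable demo
    parts = p.parts
    if len(parts) > 3:
        return str(PurePosixPath(*parts[-3:]))
    return str(p)


# Hypothetical paths, not taken from the project.
long_path = shorten_path("/home/user/projects/atocore/docs/plan.md")
short_path = shorten_path("docs/plan.md")
print(long_path)
print(short_path)
```

Note that the root `/` counts as one component, so an absolute path needs at least three named components before anything is cut.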


def _pack_to_dict(pack: ContextPack) -> dict:
    """Convert a context pack to a JSON-serializable dict."""
    return {
        "query": pack.query,
        "project_hint": pack.project_hint,
        "project_state_chars": pack.project_state_chars,
        "memory_chars": pack.memory_chars,
        "project_memory_chars": pack.project_memory_chars,
        "chunks_used": len(pack.chunks_used),
        "total_chars": pack.total_chars,
        "budget": pack.budget,
        "budget_remaining": pack.budget_remaining,
        "duration_ms": pack.duration_ms,
        "has_project_state": bool(pack.project_state_text),
        "has_memories": bool(pack.memory_text),
        "has_project_memories": bool(pack.project_memory_text),
        "chunks": [
            {
                "source_file": c.source_file,
                "heading_path": c.heading_path,
                "score": c.score,
                "char_count": c.char_count,
                "content_preview": c.content[:100],
            }
            for c in pack.chunks_used
        ],
    }


def _truncate_text_block(text: str, budget: int) -> tuple[str, int]:
    """Trim a formatted text block so trusted tiers cannot exceed the total budget."""
    if budget <= 0 or not text:
        return "", 0
    if len(text) <= budget:
        return text, len(text)
    if budget <= 3:
        trimmed = text[:budget]
    else:
        trimmed = f"{text[: budget - 3].rstrip()}..."
    return trimmed, len(trimmed)
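
The truncation rules above have a few edge cases that are easier to see with concrete inputs. The following is a standalone copy of the same logic under the illustrative name `truncate_text_block`; the sample strings are made up for this sketch.

```python
def truncate_text_block(text: str, budget: int) -> tuple[str, int]:
    """Illustrative copy of the truncation logic shown above."""
    if budget <= 0 or not text:
        return "", 0
    if len(text) <= budget:
        return text, len(text)
    if budget <= 3:
        # Too small for an ellipsis: hard cut.
        trimmed = text[:budget]
    else:
        # Reserve three characters for "..." and drop trailing whitespace.
        trimmed = f"{text[: budget - 3].rstrip()}..."
    return trimmed, len(trimmed)


fits = truncate_text_block("hello", 10)          # text already within budget
cut = truncate_text_block("hello world", 8)      # ellipsis replaces the tail
stripped = truncate_text_block("hello world", 9) # rstrip can undershoot the budget
tiny = truncate_text_block("hello", 2)           # no room for an ellipsis
empty = truncate_text_block("hello", 0)          # non-positive budget
```

Because trailing whitespace is stripped before the ellipsis is appended, the returned length can be smaller than the budget, which is why the function reports the actual length instead of assuming it equals `budget`.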


def _trim_context_to_budget(
    project_state_text: str,
    memory_text: str,
    project_memory_text: str,
    chunks: list[ContextChunk],
    budget: int,
) -> tuple[str, list[ContextChunk]]:
    """Trim retrieval → project memories → identity/preference → project state."""
    kept_chunks = list(chunks)
    formatted = _format_full_context(
        project_state_text, memory_text, project_memory_text, kept_chunks
    )
    # Retrieved chunks are the lowest trust tier: drop them from the end
    # of the list until the pack fits or no chunks remain.
    while len(formatted) > budget and kept_chunks:
        kept_chunks.pop()
        formatted = _format_full_context(
            project_state_text, memory_text, project_memory_text, kept_chunks
        )

    if len(formatted) <= budget:
        return formatted, kept_chunks

    # Trim project memories next (they were the most recently added
    # tier and carry less trust than identity/preference).
    project_memory_text, _ = _truncate_text_block(
        project_memory_text,
        max(budget - len(project_state_text) - len(memory_text), 0),
    )
    formatted = _format_full_context(
        project_state_text, memory_text, project_memory_text, kept_chunks
    )
    if len(formatted) <= budget:
        return formatted, kept_chunks

    # Then trim identity/preference memories to the room left after project state.
    memory_text, _ = _truncate_text_block(memory_text, max(budget - len(project_state_text), 0))
    formatted = _format_full_context(
        project_state_text, memory_text, project_memory_text, kept_chunks
    )
    if len(formatted) <= budget:
        return formatted, kept_chunks

    # Last resort: keep only project state, truncated to the whole budget.
    project_state_text, _ = _truncate_text_block(project_state_text, budget)
    formatted = _format_full_context(project_state_text, "", "", [])
    if len(formatted) > budget:
        formatted, _ = _truncate_text_block(formatted, budget)
    return formatted, []
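
The first stage of the trimming above (dropping retrieved chunks before touching any trusted tier) can be sketched in isolation. Everything in the block below is a simplified assumption for illustration: `Chunk` stands in for `ContextChunk` with only two fields, `format_context` is a two-tier reduction of `_format_full_context`, and the sample data is invented.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for ContextChunk; only the fields used here."""
    source_file: str
    content: str


def format_context(state: str, chunks: list[Chunk]) -> str:
    """Simplified formatter: trusted state first, then retrieved chunks."""
    parts: list[str] = []
    if state:
        parts.append(state)
        parts.append("")
    if chunks:
        parts.append("--- AtoCore Retrieved Context ---")
        for chunk in chunks:
            parts.append(f"[Source: {chunk.source_file}]")
            parts.append(chunk.content)
            parts.append("")
        parts.append("--- End Context ---")
    return "\n".join(parts)


def trim_chunks_to_budget(
    state: str, chunks: list[Chunk], budget: int
) -> tuple[str, list[Chunk]]:
    """Drop chunks from the end of the list until the formatted pack fits."""
    kept = list(chunks)  # copy so the caller's list is not mutated
    formatted = format_context(state, kept)
    while len(formatted) > budget and kept:
        kept.pop()
        formatted = format_context(state, kept)
    return formatted, kept


state = "STATE: phase 2 active"
chunks = [
    Chunk("a.md", "x" * 50),
    Chunk("b.md", "y" * 50),
    Chunk("c.md", "z" * 50),
]
formatted, kept = trim_chunks_to_budget(state, chunks, budget=180)
print(len(formatted), [c.source_file for c in kept])
```

Re-formatting after every `pop()` costs extra work, but it measures the real output length, including separators and header lines, instead of estimating it from the chunk sizes.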