2026-04-05 09:41:59 -04:00
|
|
|
"""Context pack assembly: retrieve, rank, budget, format.
|
|
|
|
|
|
|
|
|
|
Trust precedence (per Master Plan):
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
1. Trusted Project State → always included first, highest authority
|
|
|
|
|
2. Identity + Preference memories → included next
|
|
|
|
|
3. Retrieved chunks → ranked, deduplicated, budget-constrained
|
2026-04-05 09:41:59 -04:00
|
|
|
"""
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
|
|
|
|
|
import time
|
|
|
|
|
from dataclasses import dataclass, field
|
|
|
|
|
from pathlib import Path
|
|
|
|
|
|
2026-04-05 17:53:23 -04:00
|
|
|
import atocore.config as _config
|
2026-04-05 09:41:59 -04:00
|
|
|
from atocore.context.project_state import format_project_state, get_state
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
from atocore.memory.service import get_memories_for_context
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
from atocore.observability.logger import get_logger
|
fix(P1+P2): alias-aware project state lookup + slash command corpus fallback
Two regression fixes from codex's review of the slash command
refactor commit (78d4e97). Both findings are real and now have
covered tests.
P1 — server-side alias resolution for project_state lookup
----------------------------------------------------------
The bug:
- /context/build forwarded the caller's project hint verbatim to
get_state(project_hint), which does an exact-name lookup against
the projects table (case-insensitive but no alias resolution)
- the project registry's alias matching was only used by the
client's auto-context path and the retriever's project-match
boost, never by the server's project_state lookup
- consequence: /atocore-context "... p05" would silently miss
trusted project state stored under the canonical id
"p05-interferometer", weakening project-hinted retrieval to
the point that an explicit alias hint was *worse* than no hint
The fix in src/atocore/context/builder.py:
- import get_registered_project from the projects registry
- before calling get_state(project_hint), resolve the hint
through get_registered_project; if a registry record exists,
use the canonical project_id for the state lookup
- if no registry record exists, fall back to the raw hint so a
hand-curated project_state entry that predates the registry
still works (backwards compat with pre-registry deployments)
The retriever already does its own alias expansion via
get_registered_project for the project-match boost, so the
retriever side was never broken — only the project_state lookup
in the builder. The fix is scoped to that one call site.
Tests added in tests/test_context_builder.py:
- test_alias_hint_resolves_through_registry: stands up a fresh
registry, sets state under "p05-interferometer", then verifies
build_context with project_hint="p05" finds the state, AND
with project_hint="interferometer" (the second alias) finds it
too, AND with the canonical id finds it. Covers all three
resolution paths.
- test_unknown_hint_falls_back_to_raw_lookup: empty registry,
set state under an unregistered project name, verify the
build_context call with that name as the hint still finds the
state. Locks in the backwards-compat behavior.
P2 — slash command no-hint fallback to corpus-wide context build
----------------------------------------------------------------
The bug:
- the slash command's no-hint path called auto-context, which
returns {"status": "no_project_match"} when project detection
fails and does NOT fall back to a plain context-build
- the slash command's own help text told the user "call without
a hint to use the corpus-wide context build" — which was a lie
because the wrapper no longer did that
- consequence: generic prompts like "what changed in AtoCore
backup policy?" or any cross-project question got a useless
no_project_match envelope instead of a context pack
The fix in .claude/commands/atocore-context.md:
- the no-hint path now does the 2-step fallback dance:
1. try `auto-context "<prompt>"` for project detection
2. if the response contains "no_project_match", fall back to
`context-build "<prompt>"` (no project arg)
- both branches return a real context pack, fail-open envelope
is preserved for genuine network errors
- the underlying client surface is unchanged (no new flags, no
new subcommands) — the fallback is per-frontend logic in the
slash command, leaving auto-context's existing semantics
intact for OpenClaw and any other caller that depends on the
no_project_match envelope as a "do nothing" signal
While I was here, also tightened the slash command's argument
parsing to delegate alias-knowledge to the registry instead of
embedding a hardcoded list:
- old version had a literal list of "atocore", "p04", "p05",
"p06" and their aliases that needed manual maintenance every
time a project was added
- new version takes the last token of $ARGUMENTS and asks the
client's `detect-project` subcommand whether it's a known
alias; if matched, it's the explicit hint, if not it's part
of the prompt
- this delegates registry knowledge to the registry, where it
belongs
Unrelated improvement noted but NOT fixed in this commit:
- _rank_chunks in builder.py also has a naive substring boost
that uses the original hint without alias expansion. The
retriever already does the right thing, so this secondary
boost is redundant. Tracked as a future cleanup but not in
scope for the P1/P2 fix; codex's findings are about
project_state lookup, not about the secondary chunk boost.
Full suite: 162 passing (was 160), 1 warning. The +2 is the two
new P1 regression tests.
2026-04-07 07:47:03 -04:00
|
|
|
from atocore.projects.registry import get_registered_project
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
from atocore.retrieval.retriever import ChunkResult, retrieve
|
|
|
|
|
|
|
|
|
|
log = get_logger("context_builder")
|
|
|
|
|
|
|
|
|
|
SYSTEM_PREFIX = (
|
|
|
|
|
"You have access to the following personal context from the user's knowledge base.\n"
|
|
|
|
|
"Use it to inform your answer. If the context is not relevant, ignore it.\n"
|
2026-04-05 09:41:59 -04:00
|
|
|
"Do not mention the context system unless asked.\n"
|
|
|
|
|
"When project state is provided, treat it as the most authoritative source."
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
)
|
|
|
|
|
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
# Budget allocation (per Master Plan section 9):
|
|
|
|
|
# identity: 5%, preferences: 5%, project state: 20%, retrieval: 60%+
|
2026-04-05 09:41:59 -04:00
|
|
|
PROJECT_STATE_BUDGET_RATIO = 0.20
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
MEMORY_BUDGET_RATIO = 0.10 # 5% identity + 5% preference
|
2026-04-05 09:41:59 -04:00
|
|
|
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
# Last built context pack for debug inspection
|
|
|
|
|
_last_context_pack: "ContextPack | None" = None
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
@dataclass
|
|
|
|
|
class ContextChunk:
|
|
|
|
|
content: str
|
|
|
|
|
source_file: str
|
|
|
|
|
heading_path: str
|
|
|
|
|
score: float
|
|
|
|
|
char_count: int
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
@dataclass
|
|
|
|
|
class ContextPack:
|
|
|
|
|
chunks_used: list[ContextChunk] = field(default_factory=list)
|
2026-04-05 09:41:59 -04:00
|
|
|
project_state_text: str = ""
|
|
|
|
|
project_state_chars: int = 0
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
memory_text: str = ""
|
|
|
|
|
memory_chars: int = 0
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
total_chars: int = 0
|
|
|
|
|
budget: int = 0
|
|
|
|
|
budget_remaining: int = 0
|
|
|
|
|
formatted_context: str = ""
|
|
|
|
|
full_prompt: str = ""
|
|
|
|
|
query: str = ""
|
|
|
|
|
project_hint: str = ""
|
|
|
|
|
duration_ms: int = 0
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def build_context(
|
|
|
|
|
user_prompt: str,
|
|
|
|
|
project_hint: str | None = None,
|
|
|
|
|
budget: int | None = None,
|
|
|
|
|
) -> ContextPack:
|
2026-04-05 09:41:59 -04:00
|
|
|
"""Build a context pack for a user prompt.
|
|
|
|
|
|
|
|
|
|
Trust precedence applied:
|
|
|
|
|
1. Project state is injected first (highest trust)
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
2. Identity + preference memories (second trust level)
|
|
|
|
|
3. Retrieved chunks fill the remaining budget
|
2026-04-05 09:41:59 -04:00
|
|
|
"""
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
global _last_context_pack
|
|
|
|
|
start = time.time()
|
2026-04-05 17:53:23 -04:00
|
|
|
budget = _config.settings.context_budget if budget is None else max(budget, 0)
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
|
2026-04-05 09:41:59 -04:00
|
|
|
# 1. Get Trusted Project State (highest precedence)
|
|
|
|
|
project_state_text = ""
|
|
|
|
|
project_state_chars = 0
|
2026-04-05 17:53:23 -04:00
|
|
|
project_state_budget = min(
|
|
|
|
|
budget,
|
|
|
|
|
max(0, int(budget * PROJECT_STATE_BUDGET_RATIO)),
|
|
|
|
|
)
|
2026-04-05 09:41:59 -04:00
|
|
|
|
fix(P1+P2): alias-aware project state lookup + slash command corpus fallback
Two regression fixes from codex's review of the slash command
refactor commit (78d4e97). Both findings are real and now have
covered tests.
P1 — server-side alias resolution for project_state lookup
----------------------------------------------------------
The bug:
- /context/build forwarded the caller's project hint verbatim to
get_state(project_hint), which does an exact-name lookup against
the projects table (case-insensitive but no alias resolution)
- the project registry's alias matching was only used by the
client's auto-context path and the retriever's project-match
boost, never by the server's project_state lookup
- consequence: /atocore-context "... p05" would silently miss
trusted project state stored under the canonical id
"p05-interferometer", weakening project-hinted retrieval to
the point that an explicit alias hint was *worse* than no hint
The fix in src/atocore/context/builder.py:
- import get_registered_project from the projects registry
- before calling get_state(project_hint), resolve the hint
through get_registered_project; if a registry record exists,
use the canonical project_id for the state lookup
- if no registry record exists, fall back to the raw hint so a
hand-curated project_state entry that predates the registry
still works (backwards compat with pre-registry deployments)
The retriever already does its own alias expansion via
get_registered_project for the project-match boost, so the
retriever side was never broken — only the project_state lookup
in the builder. The fix is scoped to that one call site.
Tests added in tests/test_context_builder.py:
- test_alias_hint_resolves_through_registry: stands up a fresh
registry, sets state under "p05-interferometer", then verifies
build_context with project_hint="p05" finds the state, AND
with project_hint="interferometer" (the second alias) finds it
too, AND with the canonical id finds it. Covers all three
resolution paths.
- test_unknown_hint_falls_back_to_raw_lookup: empty registry,
set state under an unregistered project name, verify the
build_context call with that name as the hint still finds the
state. Locks in the backwards-compat behavior.
P2 — slash command no-hint fallback to corpus-wide context build
----------------------------------------------------------------
The bug:
- the slash command's no-hint path called auto-context, which
returns {"status": "no_project_match"} when project detection
fails and does NOT fall back to a plain context-build
- the slash command's own help text told the user "call without
a hint to use the corpus-wide context build" — which was a lie
because the wrapper no longer did that
- consequence: generic prompts like "what changed in AtoCore
backup policy?" or any cross-project question got a useless
no_project_match envelope instead of a context pack
The fix in .claude/commands/atocore-context.md:
- the no-hint path now does the 2-step fallback dance:
1. try `auto-context "<prompt>"` for project detection
2. if the response contains "no_project_match", fall back to
`context-build "<prompt>"` (no project arg)
- both branches return a real context pack, fail-open envelope
is preserved for genuine network errors
- the underlying client surface is unchanged (no new flags, no
new subcommands) — the fallback is per-frontend logic in the
slash command, leaving auto-context's existing semantics
intact for OpenClaw and any other caller that depends on the
no_project_match envelope as a "do nothing" signal
While I was here, also tightened the slash command's argument
parsing to delegate alias-knowledge to the registry instead of
embedding a hardcoded list:
- old version had a literal list of "atocore", "p04", "p05",
"p06" and their aliases that needed manual maintenance every
time a project was added
- new version takes the last token of $ARGUMENTS and asks the
client's `detect-project` subcommand whether it's a known
alias; if matched, it's the explicit hint, if not it's part
of the prompt
- this delegates registry knowledge to the registry, where it
belongs
Unrelated improvement noted but NOT fixed in this commit:
- _rank_chunks in builder.py also has a naive substring boost
that uses the original hint without alias expansion. The
retriever already does the right thing, so this secondary
boost is redundant. Tracked as a future cleanup but not in
scope for the P1/P2 fix; codex's findings are about
project_state lookup, not about the secondary chunk boost.
Full suite: 162 passing (was 160), 1 warning. The +2 is the two
new P1 regression tests.
2026-04-07 07:47:03 -04:00
|
|
|
# Resolve the project hint through the registry so callers can pass
|
|
|
|
|
# an alias (`p05`, `gigabit`) and still find trusted state stored
|
|
|
|
|
# under the canonical project id (`p05-interferometer`,
|
|
|
|
|
# `p04-gigabit`). The retriever already does this for the
|
|
|
|
|
# project-match boost — the project_state lookup needs the same
|
|
|
|
|
# courtesy. If the registry has no entry for the hint, fall back to
|
|
|
|
|
# the raw hint so a hand-curated project_state entry that predates
|
|
|
|
|
# the registry still works.
|
|
|
|
|
canonical_project = project_hint
|
2026-04-05 09:41:59 -04:00
|
|
|
if project_hint:
|
fix(P1+P2): alias-aware project state lookup + slash command corpus fallback
Two regression fixes from codex's review of the slash command
refactor commit (78d4e97). Both findings are real and now have
covered tests.
P1 — server-side alias resolution for project_state lookup
----------------------------------------------------------
The bug:
- /context/build forwarded the caller's project hint verbatim to
get_state(project_hint), which does an exact-name lookup against
the projects table (case-insensitive but no alias resolution)
- the project registry's alias matching was only used by the
client's auto-context path and the retriever's project-match
boost, never by the server's project_state lookup
- consequence: /atocore-context "... p05" would silently miss
trusted project state stored under the canonical id
"p05-interferometer", weakening project-hinted retrieval to
the point that an explicit alias hint was *worse* than no hint
The fix in src/atocore/context/builder.py:
- import get_registered_project from the projects registry
- before calling get_state(project_hint), resolve the hint
through get_registered_project; if a registry record exists,
use the canonical project_id for the state lookup
- if no registry record exists, fall back to the raw hint so a
hand-curated project_state entry that predates the registry
still works (backwards compat with pre-registry deployments)
The retriever already does its own alias expansion via
get_registered_project for the project-match boost, so the
retriever side was never broken — only the project_state lookup
in the builder. The fix is scoped to that one call site.
Tests added in tests/test_context_builder.py:
- test_alias_hint_resolves_through_registry: stands up a fresh
registry, sets state under "p05-interferometer", then verifies
build_context with project_hint="p05" finds the state, AND
with project_hint="interferometer" (the second alias) finds it
too, AND with the canonical id finds it. Covers all three
resolution paths.
- test_unknown_hint_falls_back_to_raw_lookup: empty registry,
set state under an unregistered project name, verify the
build_context call with that name as the hint still finds the
state. Locks in the backwards-compat behavior.
P2 — slash command no-hint fallback to corpus-wide context build
----------------------------------------------------------------
The bug:
- the slash command's no-hint path called auto-context, which
returns {"status": "no_project_match"} when project detection
fails and does NOT fall back to a plain context-build
- the slash command's own help text told the user "call without
a hint to use the corpus-wide context build" — which was a lie
because the wrapper no longer did that
- consequence: generic prompts like "what changed in AtoCore
backup policy?" or any cross-project question got a useless
no_project_match envelope instead of a context pack
The fix in .claude/commands/atocore-context.md:
- the no-hint path now does the 2-step fallback dance:
1. try `auto-context "<prompt>"` for project detection
2. if the response contains "no_project_match", fall back to
`context-build "<prompt>"` (no project arg)
- both branches return a real context pack, fail-open envelope
is preserved for genuine network errors
- the underlying client surface is unchanged (no new flags, no
new subcommands) — the fallback is per-frontend logic in the
slash command, leaving auto-context's existing semantics
intact for OpenClaw and any other caller that depends on the
no_project_match envelope as a "do nothing" signal
While I was here, also tightened the slash command's argument
parsing to delegate alias-knowledge to the registry instead of
embedding a hardcoded list:
- old version had a literal list of "atocore", "p04", "p05",
"p06" and their aliases that needed manual maintenance every
time a project was added
- new version takes the last token of $ARGUMENTS and asks the
client's `detect-project` subcommand whether it's a known
alias; if matched, it's the explicit hint, if not it's part
of the prompt
- this delegates registry knowledge to the registry, where it
belongs
Unrelated improvement noted but NOT fixed in this commit:
- _rank_chunks in builder.py also has a naive substring boost
that uses the original hint without alias expansion. The
retriever already does the right thing, so this secondary
boost is redundant. Tracked as a future cleanup but not in
scope for the P1/P2 fix; codex's findings are about
project_state lookup, not about the secondary chunk boost.
Full suite: 162 passing (was 160), 1 warning. The +2 is the two
new P1 regression tests.
2026-04-07 07:47:03 -04:00
|
|
|
registered = get_registered_project(project_hint)
|
|
|
|
|
if registered is not None:
|
|
|
|
|
canonical_project = registered.project_id
|
|
|
|
|
|
|
|
|
|
state_entries = get_state(canonical_project)
|
2026-04-05 09:41:59 -04:00
|
|
|
if state_entries:
|
|
|
|
|
project_state_text = format_project_state(state_entries)
|
2026-04-05 17:53:23 -04:00
|
|
|
project_state_text, project_state_chars = _truncate_text_block(
|
|
|
|
|
project_state_text,
|
|
|
|
|
project_state_budget or budget,
|
|
|
|
|
)
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
|
|
|
|
|
# 2. Get identity + preference memories (second precedence)
|
2026-04-05 17:53:23 -04:00
|
|
|
memory_budget = min(int(budget * MEMORY_BUDGET_RATIO), max(budget - project_state_chars, 0))
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
memory_text, memory_chars = get_memories_for_context(
|
|
|
|
|
memory_types=["identity", "preference"],
|
|
|
|
|
budget=memory_budget,
|
|
|
|
|
)
|
|
|
|
|
|
|
|
|
|
# 3. Calculate remaining budget for retrieval
|
|
|
|
|
retrieval_budget = budget - project_state_chars - memory_chars
|
|
|
|
|
|
|
|
|
|
# 4. Retrieve candidates
|
fix: pass project_hint into retrieve and add path-signal ranking
Two changes that belong together:
1. builder.build_context() now passes project_hint into retrieve(),
so the project-aware boost actually fires for the retrieval pipeline
driven by /context/build. Before this, only direct /query callers
benefited from the registered-project boost.
2. retriever now applies two more ranking signals on every chunk:
- _query_match_boost: boosts chunks whose source/title/heading
echo high-signal query tokens (stop list filters out generic
words like "the", "project", "system")
- _path_signal_boost: down-weights archival noise (_archive,
_history, pre-cleanup, reviews) by 0.72 and up-weights current
high-signal docs (status, decision, requirements, charter,
system-map, error-budget, ...) by 1.18
Tests:
- test_context_builder_passes_project_hint_to_retrieval verifies
the wiring fix
- test_retrieve_downranks_archive_noise_and_prefers_high_signal_paths
verifies the new ranking helpers prefer current docs over archive
This addresses the cross-project competition and archive bleed
called out in current-state.md after the Wave 1 ingestion.
2026-04-06 18:37:07 -04:00
|
|
|
candidates = (
|
|
|
|
|
retrieve(
|
|
|
|
|
user_prompt,
|
|
|
|
|
top_k=_config.settings.context_top_k,
|
|
|
|
|
project_hint=project_hint,
|
|
|
|
|
)
|
|
|
|
|
if retrieval_budget > 0
|
|
|
|
|
else []
|
|
|
|
|
)
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
# 5. Score and rank
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
scored = _rank_chunks(candidates, project_hint)
|
|
|
|
|
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
# 6. Select within remaining budget
|
2026-04-05 09:41:59 -04:00
|
|
|
selected = _select_within_budget(scored, max(retrieval_budget, 0))
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
# 7. Format full context
|
|
|
|
|
formatted = _format_full_context(project_state_text, memory_text, selected)
|
2026-04-05 17:53:23 -04:00
|
|
|
if len(formatted) > budget:
|
|
|
|
|
formatted, selected = _trim_context_to_budget(
|
|
|
|
|
project_state_text,
|
|
|
|
|
memory_text,
|
|
|
|
|
selected,
|
|
|
|
|
budget,
|
|
|
|
|
)
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
# 8. Build full prompt
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
full_prompt = f"{SYSTEM_PREFIX}\n\n{formatted}\n\n{user_prompt}"
|
|
|
|
|
|
2026-04-05 17:53:23 -04:00
|
|
|
project_state_chars = len(project_state_text)
|
|
|
|
|
memory_chars = len(memory_text)
|
2026-04-05 09:41:59 -04:00
|
|
|
retrieval_chars = sum(c.char_count for c in selected)
|
2026-04-05 17:53:23 -04:00
|
|
|
total_chars = len(formatted)
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
duration_ms = int((time.time() - start) * 1000)
|
|
|
|
|
|
|
|
|
|
pack = ContextPack(
|
|
|
|
|
chunks_used=selected,
|
2026-04-05 09:41:59 -04:00
|
|
|
project_state_text=project_state_text,
|
|
|
|
|
project_state_chars=project_state_chars,
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
memory_text=memory_text,
|
|
|
|
|
memory_chars=memory_chars,
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
total_chars=total_chars,
|
|
|
|
|
budget=budget,
|
|
|
|
|
budget_remaining=budget - total_chars,
|
|
|
|
|
formatted_context=formatted,
|
|
|
|
|
full_prompt=full_prompt,
|
|
|
|
|
query=user_prompt,
|
|
|
|
|
project_hint=project_hint or "",
|
|
|
|
|
duration_ms=duration_ms,
|
|
|
|
|
)
|
|
|
|
|
|
|
|
|
|
_last_context_pack = pack
|
|
|
|
|
|
|
|
|
|
log.info(
|
|
|
|
|
"context_built",
|
|
|
|
|
chunks_used=len(selected),
|
2026-04-05 09:41:59 -04:00
|
|
|
project_state_chars=project_state_chars,
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
memory_chars=memory_chars,
|
2026-04-05 09:41:59 -04:00
|
|
|
retrieval_chars=retrieval_chars,
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
total_chars=total_chars,
|
|
|
|
|
budget_remaining=budget - total_chars,
|
|
|
|
|
duration_ms=duration_ms,
|
|
|
|
|
)
|
|
|
|
|
log.debug("context_pack_detail", pack=_pack_to_dict(pack))
|
|
|
|
|
|
|
|
|
|
return pack
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def get_last_context_pack() -> ContextPack | None:
|
|
|
|
|
"""Return the last built context pack for debug inspection."""
|
|
|
|
|
return _last_context_pack
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def _rank_chunks(
|
|
|
|
|
candidates: list[ChunkResult],
|
|
|
|
|
project_hint: str | None,
|
|
|
|
|
) -> list[tuple[float, ChunkResult]]:
|
|
|
|
|
"""Rank candidates with boosting for project match."""
|
|
|
|
|
scored = []
|
|
|
|
|
seen_content: set[str] = set()
|
|
|
|
|
|
|
|
|
|
for chunk in candidates:
|
|
|
|
|
# Deduplicate by content prefix (first 200 chars)
|
|
|
|
|
content_key = chunk.content[:200]
|
|
|
|
|
if content_key in seen_content:
|
|
|
|
|
continue
|
|
|
|
|
seen_content.add(content_key)
|
|
|
|
|
|
|
|
|
|
# Base score from similarity
|
|
|
|
|
final_score = chunk.score
|
|
|
|
|
|
|
|
|
|
# Project boost
|
|
|
|
|
if project_hint:
|
|
|
|
|
tags_str = chunk.tags.lower() if chunk.tags else ""
|
|
|
|
|
source_str = chunk.source_file.lower()
|
|
|
|
|
title_str = chunk.title.lower() if chunk.title else ""
|
|
|
|
|
hint_lower = project_hint.lower()
|
|
|
|
|
|
|
|
|
|
if hint_lower in tags_str or hint_lower in source_str or hint_lower in title_str:
|
2026-04-05 09:35:37 -04:00
|
|
|
final_score *= 1.3
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
|
|
|
|
|
scored.append((final_score, chunk))
|
|
|
|
|
|
|
|
|
|
# Sort by score descending
|
|
|
|
|
scored.sort(key=lambda x: x[0], reverse=True)
|
|
|
|
|
return scored
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def _select_within_budget(
|
|
|
|
|
scored: list[tuple[float, ChunkResult]],
|
|
|
|
|
budget: int,
|
|
|
|
|
) -> list[ContextChunk]:
|
|
|
|
|
"""Select top chunks that fit within the character budget."""
|
|
|
|
|
selected = []
|
|
|
|
|
used = 0
|
|
|
|
|
|
|
|
|
|
for score, chunk in scored:
|
|
|
|
|
chunk_len = len(chunk.content)
|
|
|
|
|
if used + chunk_len > budget:
|
|
|
|
|
continue
|
|
|
|
|
selected.append(
|
|
|
|
|
ContextChunk(
|
|
|
|
|
content=chunk.content,
|
|
|
|
|
source_file=_shorten_path(chunk.source_file),
|
|
|
|
|
heading_path=chunk.heading_path,
|
|
|
|
|
score=score,
|
|
|
|
|
char_count=chunk_len,
|
|
|
|
|
)
|
|
|
|
|
)
|
|
|
|
|
used += chunk_len
|
|
|
|
|
|
|
|
|
|
return selected
|
|
|
|
|
|
|
|
|
|
|
2026-04-05 09:41:59 -04:00
|
|
|
def _format_full_context(
|
|
|
|
|
project_state_text: str,
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
memory_text: str,
|
2026-04-05 09:41:59 -04:00
|
|
|
chunks: list[ContextChunk],
|
|
|
|
|
) -> str:
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
"""Format project state + memories + retrieved chunks into full context block."""
|
2026-04-05 09:41:59 -04:00
|
|
|
parts = []
|
|
|
|
|
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
# 1. Project state first (highest trust)
|
2026-04-05 09:41:59 -04:00
|
|
|
if project_state_text:
|
|
|
|
|
parts.append(project_state_text)
|
|
|
|
|
parts.append("")
|
|
|
|
|
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
# 2. Identity + preference memories (second trust level)
|
|
|
|
|
if memory_text:
|
|
|
|
|
parts.append(memory_text)
|
|
|
|
|
parts.append("")
|
|
|
|
|
|
|
|
|
|
# 3. Retrieved chunks (lowest trust)
|
2026-04-05 09:41:59 -04:00
|
|
|
if chunks:
|
|
|
|
|
parts.append("--- AtoCore Retrieved Context ---")
|
2026-04-05 17:53:23 -04:00
|
|
|
if project_state_text:
|
|
|
|
|
parts.append("If retrieved context conflicts with Trusted Project State above, trust the Trusted Project State.")
|
2026-04-05 09:41:59 -04:00
|
|
|
for chunk in chunks:
|
|
|
|
|
parts.append(
|
|
|
|
|
f"[Source: {chunk.source_file} | Section: {chunk.heading_path} | Score: {chunk.score:.2f}]"
|
|
|
|
|
)
|
|
|
|
|
parts.append(chunk.content)
|
|
|
|
|
parts.append("")
|
|
|
|
|
parts.append("--- End Context ---")
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
elif not project_state_text and not memory_text:
|
2026-04-05 09:41:59 -04:00
|
|
|
parts.append("--- AtoCore Context ---\nNo relevant context found.\n--- End Context ---")
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
|
2026-04-05 09:41:59 -04:00
|
|
|
return "\n".join(parts)
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
|
|
|
|
|
|
|
|
|
|
def _shorten_path(path: str) -> str:
|
|
|
|
|
"""Shorten an absolute path to a relative-like display."""
|
|
|
|
|
p = Path(path)
|
|
|
|
|
parts = p.parts
|
|
|
|
|
if len(parts) > 3:
|
|
|
|
|
return str(Path(*parts[-3:]))
|
|
|
|
|
return str(p)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def _pack_to_dict(pack: ContextPack) -> dict:
|
|
|
|
|
"""Convert a context pack to a JSON-serializable dict."""
|
|
|
|
|
return {
|
|
|
|
|
"query": pack.query,
|
|
|
|
|
"project_hint": pack.project_hint,
|
2026-04-05 09:41:59 -04:00
|
|
|
"project_state_chars": pack.project_state_chars,
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
"memory_chars": pack.memory_chars,
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
"chunks_used": len(pack.chunks_used),
|
|
|
|
|
"total_chars": pack.total_chars,
|
|
|
|
|
"budget": pack.budget,
|
|
|
|
|
"budget_remaining": pack.budget_remaining,
|
|
|
|
|
"duration_ms": pack.duration_ms,
|
2026-04-05 09:41:59 -04:00
|
|
|
"has_project_state": bool(pack.project_state_text),
|
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
|
|
|
"has_memories": bool(pack.memory_text),
|
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
|
|
|
"chunks": [
|
|
|
|
|
{
|
|
|
|
|
"source_file": c.source_file,
|
|
|
|
|
"heading_path": c.heading_path,
|
|
|
|
|
"score": c.score,
|
|
|
|
|
"char_count": c.char_count,
|
|
|
|
|
"content_preview": c.content[:100],
|
|
|
|
|
}
|
|
|
|
|
for c in pack.chunks_used
|
|
|
|
|
],
|
|
|
|
|
}
|
2026-04-05 17:53:23 -04:00
|
|
|
|
|
|
|
|
|
|
|
|
|
def _truncate_text_block(text: str, budget: int) -> tuple[str, int]:
|
|
|
|
|
"""Trim a formatted text block so trusted tiers cannot exceed the total budget."""
|
|
|
|
|
if budget <= 0 or not text:
|
|
|
|
|
return "", 0
|
|
|
|
|
if len(text) <= budget:
|
|
|
|
|
return text, len(text)
|
|
|
|
|
if budget <= 3:
|
|
|
|
|
trimmed = text[:budget]
|
|
|
|
|
else:
|
|
|
|
|
trimmed = f"{text[: budget - 3].rstrip()}..."
|
|
|
|
|
return trimmed, len(trimmed)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def _trim_context_to_budget(
|
|
|
|
|
project_state_text: str,
|
|
|
|
|
memory_text: str,
|
|
|
|
|
chunks: list[ContextChunk],
|
|
|
|
|
budget: int,
|
|
|
|
|
) -> tuple[str, list[ContextChunk]]:
|
|
|
|
|
"""Trim retrieval first, then memory, then project state until formatted context fits."""
|
|
|
|
|
kept_chunks = list(chunks)
|
|
|
|
|
formatted = _format_full_context(project_state_text, memory_text, kept_chunks)
|
|
|
|
|
while len(formatted) > budget and kept_chunks:
|
|
|
|
|
kept_chunks.pop()
|
|
|
|
|
formatted = _format_full_context(project_state_text, memory_text, kept_chunks)
|
|
|
|
|
|
|
|
|
|
if len(formatted) <= budget:
|
|
|
|
|
return formatted, kept_chunks
|
|
|
|
|
|
|
|
|
|
memory_text, _ = _truncate_text_block(memory_text, max(budget - len(project_state_text), 0))
|
|
|
|
|
formatted = _format_full_context(project_state_text, memory_text, kept_chunks)
|
|
|
|
|
if len(formatted) <= budget:
|
|
|
|
|
return formatted, kept_chunks
|
|
|
|
|
|
|
|
|
|
project_state_text, _ = _truncate_text_block(project_state_text, budget)
|
|
|
|
|
formatted = _format_full_context(project_state_text, "", [])
|
|
|
|
|
if len(formatted) > budget:
|
|
|
|
|
formatted, _ = _truncate_text_block(formatted, budget)
|
|
|
|
|
return formatted, []
|