Commit Graph

14 Commits

Author SHA1 Message Date
ccc49d3a8f feat: engineering-aware context assembly
When a query matches a known engineering entity by name, the context
pack now includes a structured '--- Engineering Context ---' band
showing the entity's type, description, and its relationships to
other entities (subsystems, materials, requirements, decisions).

Six-tier context assembly:
  1. Trusted Project State
  2. Identity / Preferences
  3. Project Memories
  4. Domain Knowledge
  5. Engineering Context (NEW)
  6. Retrieved Chunks

The engineering band uses the same token-overlap scoring as memory
ranking: query tokens are matched against entity names + descriptions.
The top match gets its full relationship context included.

10% budget allocation. Trims before domain knowledge (lowest
priority of the structured tiers since the same info may appear in
chunks).

Example: query 'lateral support design' against p04-gigabit
surfaces the Lateral Support subsystem entity with its relationships
to GF-PTFE material, M1 Mirror Assembly parent system, and related
components.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:17:01 -04:00
9118f824fa feat: dual-layer knowledge extraction + domain knowledge band
The extraction system now produces two kinds of candidates from
the same conversation:

A. PROJECT-SPECIFIC: applied facts scoped to a named project
   (unchanged behavior)
B. DOMAIN KNOWLEDGE: generalizable engineering insight earned
   through project work, tagged with a domain (physics, materials,
   optics, mechanics, manufacturing, metrology, controls, software,
   math, finance) and stored with project="" so it surfaces across
   all projects.

Critical quality bar enforced in the system prompt: "Would a
competent engineer need experience to know this, or could they
find it in 30 seconds on Google?" Textbook values, definitions,
and obvious facts are explicitly excluded. Only hard-won insight
qualifies — the kind that takes weeks of FEA or real machining
experience to discover.

Domain tags are embedded in the content as a prefix ("[physics]",
"[materials]") so they survive without a schema migration. A future
column can parse them out.

Context builder gains a new tier between project memories and
retrieved chunks:

  Tier 1: Trusted Project State     (project-specific)
  Tier 2: Identity / Preferences    (global)
  Tier 3: Project Memories          (project-specific)
  Tier 4: Domain Knowledge (NEW)    (cross-project, 10% budget)
  Tier 5: Retrieved Chunks          (project-boosted)

Trim order: chunks -> domain knowledge -> project memories ->
identity/preference -> project state.

Host-side extraction script updated with the same prompt and
domain-tag handling.

LLM_EXTRACTOR_VERSION bumped to llm-0.3.0.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 09:04:04 -04:00
8b77e83f0a feat: Phase 4 — seed identity + preference memories, lower band to 5%
3 identity memories (Antoine's role, projects, infrastructure) and
3 preference memories (no API keys, multi-model collab, action bias)
seeded on live Dalidou. These fill the identity/preference band
that was previously empty.

Lowered MEMORY_BUDGET_RATIO from 0.10 to 0.05 because the 10%
allocation squeezed project memories and retrieval chunks enough
to regress 4 harness fixtures. At 5% the band fits at most 1 short
memory — enough for the most relevant identity/preference fact
without starving the project-specific tiers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 16:48:56 -04:00
5aeeb1cad1 feat: query-relevance ordering for memory selection
get_memories_for_context now accepts an optional query string.
When provided, candidate memories are reranked by lexical overlap
with the query (stemmed token intersection, ties broken by
confidence) before the budget walk. Without a query the order is
unchanged — effectively "by confidence desc" as before — so
non-builder callers see no behaviour change.

The fetch limit is raised from 10 to 30 so there's a real pool to
rerank. Token overlap reuses _normalize/_tokenize from
reinforcement.py so ranking and reinforcement matching share the
same notion of distinctive terms.

build_context passes the user_prompt through to both the identity/
preference and project-memory calls. The retrieval harness
regression the fix is targeting:

- p05-vendor-signal FAIL @ 1161645: "Zygo" missing from the pack
  even though an active vendor memory contained it. Root cause:
  higher-confidence p05 memories filled the 25% budget slice
  before the vendor memory ever got a chance. Query-aware ordering
  puts the vendor memory first when the query is about vendors.

New regression test test_project_memories_query_relevance_ordering
locks the behaviour in with two p05 memories and a tight budget.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 12:47:05 -04:00
1161645415 fix: raise project-memory budget ratio to 0.25
At 0.15 the effective per-call allowance (450 - 55 wrapper) was 395
chars, which is just under the length of a real paragraph-length
project memory (~400 chars). Verified on live p04 probe: band was
still absent after the flat-budget fix because the first memory
entry was one character too long for the budget.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:51:04 -04:00
8ea53f4003 feat: fold project-scoped memories into context pack
The retrieval-quality review on 2026-04-11 found that active
project/knowledge/episodic memories never reached the pack: only
Trusted Project State and identity/preference memories were being
assembled. Reinforcement bumped confidence on memories that had
no retrieval outlet, so the reflection loop was half-open.

This change adds a third memory tier between identity/preference
and retrieved chunks:

- PROJECT_MEMORY_BUDGET_RATIO = 0.15
- Memory types: project, knowledge, episodic
- Only populated when a canonical project is in scope — without
  a project hint, project memories stay out (cross-project bleed
  would rot the signal)
- Rendered under a dedicated "--- Project Memories ---" header
  so the LLM can distinguish it from the identity/preference band
- Trim order in _trim_context_to_budget: retrieval → project
  memories → identity/preference → project state (most recently
  added tier drops first when budget is tight)

get_memories_for_context gains header/footer kwargs so the two
memory blocks can be distinguished in a single pack without a
second helper.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:35:40 -04:00
fb6298a9a1 fix(P1+P2): canonicalize project names at every trust boundary
Three findings from codex's review of the previous P1+P2 fix. The
earlier commit (f2372ef) only fixed alias resolution at the context
builder. Codex correctly pointed out that the same fragmentation
applies at every other place a project name crosses a boundary —
project_state writes/reads, interaction capture/listing/filtering,
memory create/queries, and reinforcement's downstream queries. Plus
a real bug in the interaction `since` filter where the storage
format and the documented ISO format don't compare cleanly.

The fix is one helper used at every boundary instead of duplicating
the resolution inline.

New helper: src/atocore/projects/registry.py::resolve_project_name
---------------------------------------------------------------
- Single canonicalization boundary for project names
- Returns the canonical project_id when the input matches any
  registered id or alias
- Returns the input unchanged for empty/None and for unregistered
  names (preserves backwards compat with hand-curated state that
  predates the registry)
- Documented as the contract that every read/write at the trust
  boundary should pass through

P1 — Trusted Project State endpoints
------------------------------------
src/atocore/context/project_state.py: set_state, get_state, and
invalidate_state now all canonicalize project_name through
resolve_project_name BEFORE looking up or creating the project row.

Before this fix:
- POST /project/state with project="p05" called ensure_project("p05")
  which created a separate row in the projects table
- The state row was attached to that alias project_id
- Later context builds canonicalized "p05" -> "p05-interferometer"
  via the builder fix from f2372ef and never found the state
- Result: trusted state silently fragmented across alias rows

After this fix:
- The alias is resolved to the canonical id at every entry point
- Two captures (one via "p05", one via "p05-interferometer") write
  to the same row
- get_state via either alias or the canonical id finds the same row

Fixes the highest-priority gap codex flagged because Trusted Project
State is supposed to be the most dependable layer in the AtoCore
trust hierarchy.

P2.a — Interaction capture project canonicalization
----------------------------------------------------
src/atocore/interactions/service.py: record_interaction now
canonicalizes project before storing, so interaction.project is
always the canonical id regardless of what the client passed.

Downstream effects:
- reinforce_from_interaction queries memories by interaction.project
  -> previously missed memories stored under canonical id
  -> now consistent because interaction.project IS the canonical id
- the extractor stamps candidates with interaction.project
  -> previously created candidates in alias buckets
  -> now creates candidates in the canonical bucket
- list_interactions(project=alias) was already broken, now fixed by
  canonicalizing the filter input on the read side too

Memory service applied the same fix:
- src/atocore/memory/service.py: create_memory and get_memories
  both canonicalize project through resolve_project_name
- This keeps stored memory.project consistent with the
  reinforcement query path

P2.b — Interaction `since` filter format normalization
------------------------------------------------------
src/atocore/interactions/service.py: new _normalize_since helper.

The bug:
- created_at is stored as 'YYYY-MM-DD HH:MM:SS' (no timezone, UTC by
  convention) so it sorts lexically and compares cleanly with the
  SQLite CURRENT_TIMESTAMP default
- The `since` parameter was documented as ISO 8601 but compared as
  a raw string against the storage format
- The lexically-greater 'T' separator means an ISO timestamp like
  '2026-04-07T12:00:00Z' is GREATER than the storage form
  '2026-04-07 12:00:00' for the same instant
- Result: a client passing ISO `since` got an empty result for any
  row from the same day, even though those rows existed and were
  technically "after" the cutoff in real-world time

The fix:
- _normalize_since accepts ISO 8601 with T, optional Z suffix,
  optional fractional seconds, optional +HH:MM offsets
- Uses datetime.fromisoformat for parsing (Python 3.11+)
- Converts to UTC and reformats as the storage format before the
  SQL comparison
- The bare storage format still works (backwards compat path is a
  regex match that returns the input unchanged)
- Unparseable input is returned as-is so the comparison degrades
  gracefully (rows just don't match) instead of raising and
  breaking the listing endpoint

builder.py refactor
-------------------
The previous P1 fix had inline canonicalization. Now it uses the
shared helper for consistency:
- import changed from get_registered_project to resolve_project_name
- the inline lookup is replaced with a single helper call
- the comment block now points at representation-authority.md for
  the canonicalization contract

New shared test fixture: tests/conftest.py::project_registry
------------------------------------------------------------
- Standardizes the registry-setup pattern that was duplicated
  across test_context_builder.py, test_project_state.py,
  test_interactions.py, and test_reinforcement.py
- Returns a callable that takes (project_id, [aliases]) tuples
  and writes them into a temp registry file with the env var
  pointed at it and config.settings reloaded
- Used by all 12 new regression tests in this commit

Tests (12 new, all green on first run)
--------------------------------------
test_project_state.py:
- test_set_state_canonicalizes_alias: write via alias, read via
  every alias and the canonical id, verify same row id
- test_get_state_canonicalizes_alias_after_canonical_write
- test_invalidate_state_canonicalizes_alias
- test_unregistered_project_state_still_works (backwards compat)

test_interactions.py:
- test_record_interaction_canonicalizes_project
- test_list_interactions_canonicalizes_project_filter
- test_list_interactions_since_accepts_iso_with_t_separator
- test_list_interactions_since_accepts_z_suffix
- test_list_interactions_since_accepts_offset
- test_list_interactions_since_storage_format_still_works

test_reinforcement.py:
- test_reinforcement_works_when_capture_uses_alias (end-to-end:
  capture under alias, seed memory under canonical, verify
  reinforcement matches)
- test_get_memories_filter_by_alias

Full suite: 174 passing (was 162), 1 warning. The +12 is the
new regression tests, no existing tests regressed.

What's still NOT canonicalized (and why)
----------------------------------------
- _rank_chunks's secondary substring boost in builder.py — the
  retriever already does the right thing via its own
  _project_match_boost which calls get_registered_project. The
  redundant secondary boost still uses the raw hint but it's a
  multiplicative factor on top of correct retrieval, not a
  filter, so it can't drop relevant chunks. Tracked as a future
  cleanup but not a P1.
- update_memory's project field (you can't change a memory's
  project after creation in the API anyway).
- The retriever's project_hint parameter on direct /query calls
  — same reasoning as the builder boost, plus the retriever's
  own get_registered_project call already handles aliases there.
2026-04-07 08:29:33 -04:00
f2372eff9e fix(P1+P2): alias-aware project state lookup + slash command corpus fallback
Two regression fixes from codex's review of the slash command
refactor commit (78d4e97). Both findings are real and now have
covered tests.

P1 — server-side alias resolution for project_state lookup
----------------------------------------------------------
The bug:
- /context/build forwarded the caller's project hint verbatim to
  get_state(project_hint), which does an exact-name lookup against
  the projects table (case-insensitive but no alias resolution)
- the project registry's alias matching was only used by the
  client's auto-context path and the retriever's project-match
  boost, never by the server's project_state lookup
- consequence: /atocore-context "... p05" would silently miss
  trusted project state stored under the canonical id
  "p05-interferometer", weakening project-hinted retrieval to
  the point that an explicit alias hint was *worse* than no hint

The fix in src/atocore/context/builder.py:
- import get_registered_project from the projects registry
- before calling get_state(project_hint), resolve the hint
  through get_registered_project; if a registry record exists,
  use the canonical project_id for the state lookup
- if no registry record exists, fall back to the raw hint so a
  hand-curated project_state entry that predates the registry
  still works (backwards compat with pre-registry deployments)

The retriever already does its own alias expansion via
get_registered_project for the project-match boost, so the
retriever side was never broken — only the project_state lookup
in the builder. The fix is scoped to that one call site.

Tests added in tests/test_context_builder.py:
- test_alias_hint_resolves_through_registry: stands up a fresh
  registry, sets state under "p05-interferometer", then verifies
  build_context with project_hint="p05" finds the state, AND
  with project_hint="interferometer" (the second alias) finds it
  too, AND with the canonical id finds it. Covers all three
  resolution paths.
- test_unknown_hint_falls_back_to_raw_lookup: empty registry,
  set state under an unregistered project name, verify the
  build_context call with that name as the hint still finds the
  state. Locks in the backwards-compat behavior.

P2 — slash command no-hint fallback to corpus-wide context build
----------------------------------------------------------------
The bug:
- the slash command's no-hint path called auto-context, which
  returns {"status": "no_project_match"} when project detection
  fails and does NOT fall back to a plain context-build
- the slash command's own help text told the user "call without
  a hint to use the corpus-wide context build" — which was a lie
  because the wrapper no longer did that
- consequence: generic prompts like "what changed in AtoCore
  backup policy?" or any cross-project question got a useless
  no_project_match envelope instead of a context pack

The fix in .claude/commands/atocore-context.md:
- the no-hint path now does the 2-step fallback dance:
    1. try `auto-context "<prompt>"` for project detection
    2. if the response contains "no_project_match", fall back to
       `context-build "<prompt>"` (no project arg)
- both branches return a real context pack, fail-open envelope
  is preserved for genuine network errors
- the underlying client surface is unchanged (no new flags, no
  new subcommands) — the fallback is per-frontend logic in the
  slash command, leaving auto-context's existing semantics
  intact for OpenClaw and any other caller that depends on the
  no_project_match envelope as a "do nothing" signal

While I was here, also tightened the slash command's argument
parsing to delegate alias-knowledge to the registry instead of
embedding a hardcoded list:
- old version had a literal list of "atocore", "p04", "p05",
  "p06" and their aliases that needed manual maintenance every
  time a project was added
- new version takes the last token of $ARGUMENTS and asks the
  client's `detect-project` subcommand whether it's a known
  alias; if matched, it's the explicit hint, if not it's part
  of the prompt
- this delegates registry knowledge to the registry, where it
  belongs

Unrelated improvement noted but NOT fixed in this commit:
- _rank_chunks in builder.py also has a naive substring boost
  that uses the original hint without alias expansion. The
  retriever already does the right thing, so this secondary
  boost is redundant. Tracked as a future cleanup but not in
  scope for the P1/P2 fix; codex's findings are about
  project_state lookup, not about the secondary chunk boost.

Full suite: 162 passing (was 160), 1 warning. The +2 is the two
new P1 regression tests.
2026-04-07 07:47:03 -04:00
14ab7c8e9f fix: pass project_hint into retrieve and add path-signal ranking
Two changes that belong together:

1. builder.build_context() now passes project_hint into retrieve(),
   so the project-aware boost actually fires for the retrieval pipeline
   driven by /context/build. Before this, only direct /query callers
   benefited from the registered-project boost.

2. retriever now applies two more ranking signals on every chunk:
   - _query_match_boost: boosts chunks whose source/title/heading
     echo high-signal query tokens (stop list filters out generic
     words like "the", "project", "system")
   - _path_signal_boost: down-weights archival noise (_archive,
     _history, pre-cleanup, reviews) by 0.72 and up-weights current
     high-signal docs (status, decision, requirements, charter,
     system-map, error-budget, ...) by 1.18

Tests:
- test_context_builder_passes_project_hint_to_retrieval verifies
  the wiring fix
- test_retrieve_downranks_archive_noise_and_prefers_high_signal_paths
  verifies the new ranking helpers prefer current docs over archive

This addresses the cross-project competition and archive bleed
called out in current-state.md after the Wave 1 ingestion.
2026-04-06 18:37:07 -04:00
b0889b3925 Stabilize core correctness and sync project plan state 2026-04-05 17:53:23 -04:00
b48f0c95ab feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory

Context builder integration (trust precedence per Master Plan):
  1. Trusted Project State (highest trust, 20% budget)
  2. Identity + Preference memories (10% budget)
  3. Retrieved chunks (remaining budget)

Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
531c560db7 feat: Phase 1 ingestion hardening + Phase 5 Trusted Project State
Phase 1 - Ingestion hardening:
- Encoding fallback (UTF-8/UTF-8-sig/Latin-1/CP1252)
- Delete detection: purge DB/vector entries for removed files
- Ingestion stats endpoint (GET /stats)

Phase 5 - Trusted Project State:
- project_state table with categories (status, decision, requirement, contact, milestone, fact, config)
- CRUD API: POST/GET/DELETE /project/state
- Upsert semantics, invalidation (supersede) support
- Context builder integrates project state at highest trust precedence
- Project state gets 20% budget allocation, appears first in context
- Trust precedence: Project State > Retrieved Chunks (per Master Plan)

33/33 tests passing. Validated end-to-end with GigaBIT M1 project data.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:41:59 -04:00
6081462058 fix: critical bugs and hardening from validation audit
- Fix infinite loop in chunker _hard_split when overlap >= max_size
- Fix tag filter false positives by quoting tag values in ChromaDB query
- Fix score boost semantics (additive → multiplicative) to stay within 0-1 range
- Add error handling and type hints to all API routes
- Update README with proper project documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:35:37 -04:00
b4afbbb53a feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00