diff --git a/.claude/commands/atocore-context.md b/.claude/commands/atocore-context.md
index d02ff93..f3389c5 100644
--- a/.claude/commands/atocore-context.md
+++ b/.claude/commands/atocore-context.md
@@ -22,42 +22,80 @@ The user invoked `/atocore-context` with:
 $ARGUMENTS
 ```
 
-Treat the **entire argument string** as the prompt by default. If the
-last whitespace-separated token looks like a registered project id or
-alias (`atocore`, `p04`, `p04-gigabit`, `gigabit`, `p05`,
-`p05-interferometer`, `interferometer`, `p06`, `p06-polisher`,
-`polisher`, or any case-insensitive variant), pull it off and treat
-it as an explicit project hint. The remaining tokens become the
-prompt. Otherwise leave the project hint empty and the client will
-try to auto-detect one from the prompt itself.
+You need to figure out two things:
 
-## Step 2 — call the shared client
+1. The **prompt text** — what AtoCore will retrieve context for
+2. An **optional project hint** — used to scope retrieval to a
+   specific project's trusted state and corpus
 
-Use the Bash tool. The client respects `ATOCORE_BASE_URL` (default
-`http://dalidou:8100`) and is fail-open by default — if AtoCore is
-unreachable it returns a `{"status": "unavailable"}` payload and
-exits 0, which is what the daily-use loop wants.
+The user may have passed a project id or alias as the **last
+whitespace-separated token**. Don't maintain a hardcoded list of
+known aliases — let the shared client decide. Use this rule:
 
-**If the user passed an explicit project hint**, call `context-build`
-directly so AtoCore uses exactly that project:
+- Take the last token of `$ARGUMENTS`. Call it `MAYBE_HINT`.
+- Run `python scripts/atocore_client.py detect-project "$MAYBE_HINT"`
+  to ask the registry whether it's a known project id or alias.
+  This call is cheap (it just hits `/projects` and does a regex
+  match) and inherits the client's fail-open behavior.
+- If the response has a non-null `matched_project`, the last
+  token was an explicit project hint. `PROMPT_TEXT` is everything
+  except the last token; `PROJECT_HINT` is the matched canonical
+  project id.
+- Otherwise the last token is just part of the prompt.
+  `PROMPT_TEXT` is the full `$ARGUMENTS`; `PROJECT_HINT` is empty.
+
+This delegates the alias-knowledge to the registry instead of
+embedding a stale list in this markdown file. When you add a new
+project to the registry, the slash command picks it up
+automatically with no edits here.
+
+## Step 2 — call the shared client for the context pack
+
+The server resolves project hints through the registry before
+looking up trusted state, so you can pass either the canonical id
+or any alias to `context-build` and the trusted state lookup will
+work either way. (Regression test:
+`tests/test_context_builder.py::test_alias_hint_resolves_through_registry`.)
+
+**If `PROJECT_HINT` is non-empty**, call `context-build` directly
+with that hint:
 
 ```bash
 python scripts/atocore_client.py context-build \
-  "" \
-  ""
+  "$PROMPT_TEXT" \
+  "$PROJECT_HINT"
 ```
 
-**If no explicit project hint**, call `auto-context` which will run
-the client's `detect-project` routing first and only call
-`context-build` once it has a match:
+**If `PROJECT_HINT` is empty**, do the 2-step fallback dance so the
+user always gets a context pack regardless of whether the prompt
+implies a project:
 
 ```bash
-python scripts/atocore_client.py auto-context ""
+# Try project auto-detection first.
+RESULT=$(python scripts/atocore_client.py auto-context "$PROMPT_TEXT")
+
+# If auto-context could not detect a project it returns a small
+# {"status": "no_project_match", ...} envelope. In that case fall
+# back to a corpus-wide context build with no project hint, which
+# is the right behaviour for cross-project or generic prompts like
+# "what changed in AtoCore backup policy this week?"
+if echo "$RESULT" | grep -q '"no_project_match"'; then
+  RESULT=$(python scripts/atocore_client.py context-build "$PROMPT_TEXT")
+fi
+
+echo "$RESULT"
 ```
 
-In both cases the response is the JSON payload from `/context/build`
-(or, for the `auto-context` no-match case, a small
-`{"status": "no_project_match"}` envelope).
+This is the fix for the P2 finding from codex's review: previously
+the slash command sent every no-hint prompt through `auto-context`
+and returned `no_project_match` to the user with no context, even
+though the underlying client's `context-build` subcommand has
+always supported corpus-wide context builds.
+
+In both branches the response is the JSON payload from
+`/context/build` (or, in the rare case where even the corpus-wide
+build fails, a `{"status": "unavailable"}` envelope from the
+client's fail-open layer).
 
 ## Step 3 — present the context pack to the user
 
@@ -78,21 +116,15 @@ Render in this order:
 3. The `chunks` array as a small bullet list with `source_file`,
    `heading_path`, and `score` per chunk
 
-Three special cases:
+Two special cases:
 
-- **`{"status": "no_project_match"}`** (from `auto-context`)
-  → Tell the user: "AtoCore could not auto-detect a project from the
-  prompt. Re-run with an explicit project id:
-  `/atocore-context ` (or call without a hint
-  to use the corpus-wide context build)."
 - **`{"status": "unavailable"}`** (fail-open from the client)
   → Tell the user: "AtoCore is unreachable at `$ATOCORE_BASE_URL`.
   Check `python scripts/atocore_client.py health` for diagnostics."
 - **Empty `chunks_used: 0` with no project state and no memories**
   → Tell the user: "AtoCore returned no context for this prompt —
-  either the corpus does not have relevant information for the
-  detected project or the project hint is wrong. Try a different
-  hint or a longer prompt."
+  either the corpus does not have relevant information or the
+  project hint is wrong. Try a different hint or a longer prompt."
 
 ## Step 4 — what about capturing the interaction
 
@@ -115,8 +147,11 @@ queued.
   client is the contract between AtoCore and every LLM frontend; if
   you find a missing capability, the right fix is to extend the
   client, not to work around it.
-- DO NOT silently change `ATOCORE_BASE_URL`. If the env var points at
-  the wrong instance, surface the error so the user can fix it.
+- DO NOT maintain a hardcoded list of project aliases in this
+  file. Use `detect-project` to ask the registry — that's the
+  whole point of having a registry.
+- DO NOT silently change `ATOCORE_BASE_URL`. If the env var points
+  at the wrong instance, surface the error so the user can fix it.
 - DO NOT hide the formatted context pack from the user. Showing
   what AtoCore would feed an LLM is the whole point.
 - The output goes into the user's working context as background;
diff --git a/src/atocore/context/builder.py b/src/atocore/context/builder.py
index 928ea56..57e159d 100644
--- a/src/atocore/context/builder.py
+++ b/src/atocore/context/builder.py
@@ -14,6 +14,7 @@ import atocore.config as _config
 from atocore.context.project_state import format_project_state, get_state
 from atocore.memory.service import get_memories_for_context
 from atocore.observability.logger import get_logger
+from atocore.projects.registry import get_registered_project
 from atocore.retrieval.retriever import ChunkResult, retrieve
 
 log = get_logger("context_builder")
@@ -84,8 +85,21 @@ def build_context(
         max(0, int(budget * PROJECT_STATE_BUDGET_RATIO)),
     )
 
+    # Resolve the project hint through the registry so callers can pass
+    # an alias (`p05`, `gigabit`) and still find trusted state stored
+    # under the canonical project id (`p05-interferometer`,
+    # `p04-gigabit`). The retriever already does this for the
+    # project-match boost — the project_state lookup needs the same
+    # courtesy. If the registry has no entry for the hint, fall back to
+    # the raw hint so a hand-curated project_state entry that predates
+    # the registry still works.
+    canonical_project = project_hint
     if project_hint:
-        state_entries = get_state(project_hint)
+        registered = get_registered_project(project_hint)
+        if registered is not None:
+            canonical_project = registered.project_id
+
+        state_entries = get_state(canonical_project)
         if state_entries:
             project_state_text = format_project_state(state_entries)
             project_state_text, project_state_chars = _truncate_text_block(
diff --git a/tests/test_context_builder.py b/tests/test_context_builder.py
index fe3c24f..85c59ec 100644
--- a/tests/test_context_builder.py
+++ b/tests/test_context_builder.py
@@ -1,5 +1,8 @@
 """Tests for the context builder."""
 
+import json
+
+import atocore.config as config
 from atocore.context.builder import build_context, get_last_context_pack
 from atocore.context.project_state import init_project_state_schema, set_state
 from atocore.ingestion.pipeline import ingest_file
@@ -162,3 +165,89 @@ def test_no_project_state_without_hint(tmp_data_dir, sample_markdown):
     pack = build_context("What is AtoCore?")
     assert pack.project_state_chars == 0
     assert "--- Trusted Project State ---" not in pack.formatted_context
+
+
+def test_alias_hint_resolves_through_registry(tmp_data_dir, sample_markdown, monkeypatch):
+    """An alias hint like 'p05' should find project state stored under 'p05-interferometer'.
+
+    This is the regression test for the P1 finding from codex's review:
+    /context/build was previously doing an exact-name lookup that
+    silently dropped trusted project state when the caller passed an
+    alias instead of the canonical project id.
+    """
+    init_db()
+    init_project_state_schema()
+    ingest_file(sample_markdown)
+
+    # Stand up a minimal project registry that knows the aliases.
+    # The registry lives in a JSON file pointed to by
+    # ATOCORE_PROJECT_REGISTRY_PATH; the dataclass-driven loader picks
+    # it up on every call (no in-process cache to invalidate).
+    registry_path = tmp_data_dir / "project-registry.json"
+    registry_path.write_text(
+        json.dumps(
+            {
+                "projects": [
+                    {
+                        "id": "p05-interferometer",
+                        "aliases": ["p05", "interferometer"],
+                        "description": "P05 alias-resolution regression test",
+                        "ingest_roots": [
+                            {"source": "vault", "subpath": "incoming/projects/p05"}
+                        ],
+                    }
+                ]
+            }
+        ),
+        encoding="utf-8",
+    )
+    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
+    config.settings = config.Settings()
+
+    # Trusted state is stored under the canonical id (the way the
+    # /project/state endpoint always writes it).
+    set_state(
+        "p05-interferometer",
+        "status",
+        "next_focus",
+        "Wave 2 trusted-operational ingestion",
+    )
+
+    # The bug: pack with alias hint used to silently miss the state.
+    pack_with_alias = build_context("status?", project_hint="p05", budget=2000)
+    assert "Wave 2 trusted-operational ingestion" in pack_with_alias.formatted_context
+    assert pack_with_alias.project_state_chars > 0
+
+    # The canonical id should still work the same way.
+    pack_with_canonical = build_context(
+        "status?", project_hint="p05-interferometer", budget=2000
+    )
+    assert "Wave 2 trusted-operational ingestion" in pack_with_canonical.formatted_context
+
+    # A second alias should also resolve.
+    pack_with_other_alias = build_context(
+        "status?", project_hint="interferometer", budget=2000
+    )
+    assert "Wave 2 trusted-operational ingestion" in pack_with_other_alias.formatted_context
+
+
+def test_unknown_hint_falls_back_to_raw_lookup(tmp_data_dir, sample_markdown, monkeypatch):
+    """A hint that isn't in the registry should still try the raw name.
+
+    This preserves backwards compatibility with hand-curated
+    project_state entries that predate the project registry.
+    """
+    init_db()
+    init_project_state_schema()
+    ingest_file(sample_markdown)
+
+    # Empty registry — the hint won't resolve through it.
+    registry_path = tmp_data_dir / "project-registry.json"
+    registry_path.write_text('{"projects": []}', encoding="utf-8")
+    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
+    config.settings = config.Settings()
+
+    set_state("orphan-project", "status", "phase", "Solo run")
+
+    pack = build_context("status?", project_hint="orphan-project", budget=2000)
+    assert "Solo run" in pack.formatted_context