refactor slash command onto shared client + llm-client-integration doc

Codex's review caught that the Claude Code slash command shipped in
Session 2 was a parallel reimplementation of routing logic the
existing scripts/atocore_client.py already had. That client was
introduced via the codex/port-atocore-ops-client merge and is
already a comprehensive operator client (auto-context,
detect-project, refresh-project, project-state, audit-query, etc.).
The slash command should have been a thin wrapper from the start.

This commit fixes the shape without expanding scope.

.claude/commands/atocore-context.md
-----------------------------------
Rewritten as a thin Claude Code-specific frontend that shells out
to the shared client:

- explicit project hint -> calls
  `python scripts/atocore_client.py context-build "<prompt>" "<project>"`
- no explicit hint -> calls
  `python scripts/atocore_client.py auto-context "<prompt>"`
  which runs the client's detect-project routing first and falls
  through to context-build with the match

Inherits the client's stable behaviour for free:

- ATOCORE_BASE_URL env var (default http://dalidou:8100)
- fail-open on network errors via ATOCORE_FAIL_OPEN
- consistent JSON output shape
- the same project alias matching the OpenClaw helper uses

Removes the speculative `--capture` capture path that was in the
original draft. Capture/extract/queue/promote/reject are
intentionally NOT in the shared client yet (memory-review workflow
not exercised in real use), so the slash command can't expose them
either.

docs/architecture/llm-client-integration.md
-------------------------------------------
New planning doc that defines the layering rule for AtoCore's
relationship with LLM client contexts:

Three layers:

1. AtoCore HTTP API (universal, src/atocore/api/routes.py)
2. Shared operator client (scripts/atocore_client.py): the
   canonical Python backbone for stable AtoCore operations
3. Per-agent thin frontends (Claude Code slash command, OpenClaw
   helper, future Codex skill, future MCP server) that shell out
   to the shared client

Three non-negotiable rules:

- every per-agent frontend is a thin wrapper (translate the
  agent's command format and render the JSON; nothing else)
- the shared client never duplicates the API (it composes
  endpoints; new logic goes in the API first)
- the shared client only exposes stable operations (subcommands
  land only after the API has been exercised in a real workflow)

Doc covers:

- the full table of subcommands currently in scope (project
  lifecycle, ingestion, project-state, retrieval, context build,
  audit-query, debug-context, health/stats)
- the three deferred families with rationale: memory review queue
  (workflow not exercised), backup admin (fail-open default would
  hide errors), engineering layer entities (V1 not yet implemented)
- the integration recipe for new agent platforms
- explicit acknowledgement that the OpenClaw helper currently
  duplicates routing logic and that the refactor to the shared
  client is a queued cross-repo follow-up
- how the layering connects to phase 8 (OpenClaw) and phase 11
  (multi-model)
- versioning and stability rules for the shared client surface
- open follow-ups: OpenClaw refactor, memory-review subcommands
  when ready, optional backup admin subcommands, engineering entity
  subcommands during V1 implementation

master-plan-status.md updated
-----------------------------
- New "LLM Client Integration" subsection that points to the
  layering doc and explicitly notes the deferral of memory-review
  and engineering-entity subcommands
- Frames the layering as sitting between phase 8 and phase 11

Scope is intentionally narrow per codex's framing: promote the
existing client to canonical status, refactor the slash command to
use it, document the layering. No new client subcommands added in
this commit. The OpenClaw helper refactor is a separate cross-repo
follow-up. Memory-review and engineering-entity work stay deferred.

Full suite: 160 passing, no behavior changes.
2026-04-07 07:22:54 -04:00
parent d6ce6128cf
commit 78d4e979e5
3 changed files with 426 additions and 86 deletions
--- a/docs/architecture/llm-client-integration.md
+++ b/docs/architecture/llm-client-integration.md
@@ -0,0 +1,323 @@
+# LLM Client Integration (the layering)
+
+## Why this document exists
+
+AtoCore must be reachable from many different LLM client contexts:
+
+- **OpenClaw** on the T420 (already integrated via the read-only
+  helper skill at `/home/papa/clawd/skills/atocore-context/`)
+- **Claude Code** on the laptop (via the slash command shipped in
+  this repo at `.claude/commands/atocore-context.md`)
+- **Codex** sessions (future)
+- **Direct API consumers** — scripts, Python code, ad-hoc curl
+- **The eventual MCP server** when it's worth building
+
+Without an explicit layering rule, every new client tends to
+reimplement the same routing logic (project detection, context
+build, retrieval audit, project-state inspection) in slightly
+different ways. That nearly happened with the Claude Code slash
+command, whose first draft was a curl + jq script that duplicated
+capabilities the existing operator client already had.
+
+This document defines the layering so future clients don't repeat
+that mistake.
+
+## The layering
+
+Three layers, top to bottom:
+
+```
+        +----------------------------------------------------+
+        |  Per-agent thin frontends                          |
+        |                                                    |
+        |  - Claude Code slash command                       |
+        |    (.claude/commands/atocore-context.md)           |
+        |  - OpenClaw helper skill                           |
+        |    (/home/papa/clawd/skills/atocore-context/)      |
+        |  - Codex skill (future)                            |
+        |  - MCP server (future)                             |
+        +----------------------------------------------------+
+                              |
+                              | shells out to / imports
+                              v
+        +----------------------------------------------------+
+        |  Shared operator client                            |
+        |  scripts/atocore_client.py                         |
+        |                                                    |
+        |  - subcommands for stable AtoCore operations       |
+        |  - fail-open on network errors                     |
+        |  - consistent JSON output across all subcommands   |
+        |  - environment-driven configuration                |
+        |    (ATOCORE_BASE_URL, ATOCORE_TIMEOUT_SECONDS,     |
+        |     ATOCORE_REFRESH_TIMEOUT_SECONDS,               |
+        |     ATOCORE_FAIL_OPEN)                             |
+        +----------------------------------------------------+
+                              |
+                              | HTTP
+                              v
+        +----------------------------------------------------+
+        |  AtoCore HTTP API                                  |
+        |  src/atocore/api/routes.py                         |
+        |                                                    |
+        |  - the universal interface to AtoCore              |
+        |  - everything else above is glue                   |
+        +----------------------------------------------------+
+```
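
A minimal sketch (not part of this commit) of the environment-driven configuration named in the middle box. The four variable names and the `http://dalidou:8100` default come from this commit; `ClientConfig`, `_as_bool`, the boolean parsing rule, and both timeout defaults are hypothetical placeholders, not the client's actual values.

```python
import os
from dataclasses import dataclass


def _as_bool(value):
    """Treat common 'off' spellings as False (this parsing rule is an assumption)."""
    return value.strip().lower() not in {"0", "false", "no", "off"}


@dataclass(frozen=True)
class ClientConfig:
    base_url: str
    timeout_seconds: float
    refresh_timeout_seconds: float
    fail_open: bool

    @classmethod
    def from_env(cls, env=None):
        """Build the config from a mapping, defaulting to the process environment."""
        env = os.environ if env is None else env
        return cls(
            base_url=env.get("ATOCORE_BASE_URL", "http://dalidou:8100"),
            # Placeholder defaults: the real client's timeouts are not stated here.
            timeout_seconds=float(env.get("ATOCORE_TIMEOUT_SECONDS", "30")),
            refresh_timeout_seconds=float(
                env.get("ATOCORE_REFRESH_TIMEOUT_SECONDS", "300")
            ),
            # The doc describes fail-open as the client's default behaviour.
            fail_open=_as_bool(env.get("ATOCORE_FAIL_OPEN", "true")),
        )
```

Passing the mapping explicitly, rather than always reading `os.environ`, keeps the sketch testable and makes it obvious which instance a call targets.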
+
+## The non-negotiable rules
+
+These rules are what make the layering work.
+
+### Rule 1 — every per-agent frontend is a thin wrapper
+
+A per-agent frontend exists to do exactly two things:
+
+1. **Translate the agent platform's command/skill format** into an
+   invocation of the shared client (or a small sequence of them)
+2. **Render the JSON response** into whatever shape the agent
+   platform wants (markdown for Claude Code, plaintext for
+   OpenClaw, MCP tool result for an MCP server, etc.)
+
+Everything else — talking to AtoCore, project detection, retrieval
+audit, fail-open behavior, configuration — is the **shared
+client's** job.
+
+If a per-agent frontend grows logic beyond the two responsibilities
+above, that logic is in the wrong place. It belongs in the shared
+client where every other frontend gets to use it.
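
A minimal sketch (not part of this commit) of everything Rule 1 allows a frontend to contain. The two invocation forms follow the commit message; `build_argv`, `render_markdown`, and `run_frontend` are hypothetical names, and the renderer assumes only that the client prints a JSON object, not any particular field names.

```python
import json
import subprocess
import sys

CLIENT = "scripts/atocore_client.py"  # path from the doc; assumes the repo root as cwd


def build_argv(prompt, project=None):
    """Responsibility 1: translate the agent's command into a client invocation."""
    if project:
        return [sys.executable, CLIENT, "context-build", prompt, project]
    return [sys.executable, CLIENT, "auto-context", prompt]


def render_markdown(payload):
    """Responsibility 2: render the client's JSON object for the agent platform."""
    lines = [f"- **{key}**: {value}" for key, value in sorted(payload.items())]
    return "## AtoCore context\n\n" + "\n".join(lines)


def run_frontend(prompt, project=None):
    """Shell out, parse, render. No routing, retries, or HTTP live here."""
    proc = subprocess.run(
        build_argv(prompt, project), capture_output=True, text=True, check=False
    )
    return render_markdown(json.loads(proc.stdout))
```

Anything this sketch would need beyond these three functions (project detection, fail-open handling, base URL selection) is a sign the logic belongs in the shared client.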
+
+### Rule 2 — the shared client never duplicates the API
+
+The shared client is allowed to **compose** API calls (e.g.
+`auto-context` calls `detect-project` then `context-build`), but
+it never reimplements API logic. If a useful operation can't be
+expressed via the existing API endpoints, the right fix is to
+extend the API, not to embed the logic in the client.
+
+This rule keeps the API as the single source of truth for what
+AtoCore can do.
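
A minimal sketch (not part of this commit) of what "compose, don't reimplement" can look like for `auto-context`. The callables `get_projects` and `build_context` stand in for the `GET /projects` and `POST /context/build` calls and are injected so the sketch makes no claim about request or response shapes. The `id` and `aliases` fields, the whole-word matching rule, and passing `None` when nothing matches are all assumptions.

```python
import re


def detect_project(prompt, projects):
    """Local matching step: return the first project whose id or alias
    appears in the prompt as a whole word, or None if nothing matches."""
    for project in projects:
        names = [project["id"], *project.get("aliases", [])]
        for name in names:
            if re.search(rf"\b{re.escape(name)}\b", prompt, re.IGNORECASE):
                return project["id"]
    return None


def auto_context(prompt, get_projects, build_context):
    """Compose two existing API calls; no retrieval or ranking happens here."""
    project = detect_project(prompt, get_projects())
    return build_context(prompt, project)
```

The client adds only sequencing and a local match; the context pack itself still comes entirely from the API.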
+
+### Rule 3 — the shared client only exposes stable operations
+
+A subcommand only makes it into the shared client when:
+
+- the API endpoint behind it has been exercised by at least one
+  real workflow
+- the request and response shapes are unlikely to change
+- the operation is one that more than one frontend will plausibly
+  want
+
+This rule keeps the client surface stable so frontends don't have
+to chase changes. New endpoints land in the API first, get
+exercised in real use, and only then get a client subcommand.
+
+## What's in scope for the shared client today
+
+The currently shipped scope (per `scripts/atocore_client.py`):
+
+| Subcommand | Purpose | API endpoint(s) |
+|---|---|---|
+| `health` | service status, mount + source readiness | `GET /health` |
+| `sources` | enabled source roots and their existence | `GET /sources` |
+| `stats` | document/chunk/vector counts | `GET /stats` |
+| `projects` | registered projects | `GET /projects` |
+| `project-template` | starter shape for a new project | `GET /projects/template` |
+| `propose-project` | preview a registration | `POST /projects/proposal` |
+| `register-project` | persist a registration | `POST /projects/register` |
+| `update-project` | update an existing registration | `PUT /projects/{name}` |
+| `refresh-project` | re-ingest a project's roots | `POST /projects/{name}/refresh` |
+| `project-state` | list trusted state for a project | `GET /project/state/{name}` |
+| `project-state-set` | curate trusted state | `POST /project/state` |
+| `project-state-invalidate` | supersede trusted state | `DELETE /project/state` |
+| `query` | raw retrieval | `POST /query` |
+| `context-build` | full context pack | `POST /context/build` |
+| `auto-context` | detect-project then context-build | composes `/projects` + `/context/build` |
+| `detect-project` | match a prompt to a registered project | composes `/projects` + local regex |
+| `audit-query` | retrieval-quality audit with classification | composes `/query` + local labelling |
+| `debug-context` | last context pack inspection | `GET /debug/context` |
+| `ingest-sources` | ingest configured source dirs | `POST /ingest/sources` |
+
+That covers everything in the "stable operations" set today:
+project lifecycle, ingestion, project-state curation, retrieval and
+context build, retrieval-quality audit, and health and stats
+inspection.
+
+## What's intentionally NOT in scope today
+
+Three families of operations are explicitly **deferred**, each for
+its own reason (unexercised workflow, fail-open risk, or not yet
+implemented):
+
+### 1. Memory review queue and reflection loop
+
+Phase 9 Commit C shipped these endpoints:
+
+- `POST /interactions` (capture)
+- `POST /interactions/{id}/reinforce`
+- `POST /interactions/{id}/extract`
+- `GET /memory?status=candidate`
+- `POST /memory/{id}/promote`
+- `POST /memory/{id}/reject`
+
+The contracts are stable, but the **workflow ergonomics** are not.
+Until a real human has actually exercised the capture → extract →
+review → promote/reject loop a few times and we know what feels
+right, exposing those operations through the shared client would
+prematurely freeze a UX that's still being designed.
+
+When the loop has been exercised in real use and we know what
+the right subcommand shapes are, the shared client gains:
+
+- `capture <prompt> <response> [--project P] [--client C]`
+- `extract <interaction-id> [--persist]`
+- `queue` (list candidate review queue)
+- `promote <memory-id>`
+- `reject <memory-id>`
+
+At that point the Claude Code slash command can grow a companion
+`/atocore-record-response` command and the OpenClaw helper can be
+extended with the same flow.
+
+### 2. Backup and restore admin operations
+
+Phase 9 Commit B shipped these endpoints:
+
+- `POST /admin/backup` (with `include_chroma`)
+- `GET /admin/backup` (list)
+- `GET /admin/backup/{stamp}/validate`
+
+The backup endpoints are stable, but the documented operational
+procedure (`docs/backup-restore-procedure.md`) intentionally uses
+direct curl rather than the shared client. The reason is that
+backup operations are *administrative* and benefit from being
+explicit about which instance they're targeting, with no
+fail-open behavior. The shared client's fail-open default would
+hide a real backup failure.
+
+If we later decide to add backup commands to the shared client,
+they would set `ATOCORE_FAIL_OPEN=false` for the duration of the
+call so the operator gets a real error on failure rather than a
+silent fail-open envelope.
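
A minimal sketch (not part of this commit) of scoping `ATOCORE_FAIL_OPEN=false` to a single call. No backup subcommand exists in the shared client today, so `run_admin` and `strict_env` are hypothetical, and the sketch assumes the client exits non-zero on failure once fail-open is disabled.

```python
import os
import subprocess
import sys


def strict_env(base_env=None):
    """Return a copy of the environment with fail-open disabled.

    The caller's environment is left untouched, so the override lasts
    only for the child process it is passed to.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["ATOCORE_FAIL_OPEN"] = "false"
    return env


def run_admin(subcommand, *args):
    """Run a (hypothetical) admin subcommand so failures surface as errors."""
    return subprocess.run(
        [sys.executable, "scripts/atocore_client.py", subcommand, *args],
        env=strict_env(),
        capture_output=True,
        text=True,
        check=True,  # assumption: a failed call exits non-zero when fail-open is off
    )
```

Passing the override through `env=` rather than mutating `os.environ` means other subcommands in the same process keep the fail-open default.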
+
+### 3. Engineering layer entity operations
+
+The engineering layer is in planning, not implementation. When
+V1 ships per `engineering-v1-acceptance.md`, the shared client
+will gain entity, relationship, conflict, and Mirror commands.
+None of those exist as stable contracts yet, so they are not in
+the shared client today.
+
+## How a new agent platform integrates
+
+When a new LLM client needs AtoCore (e.g. Codex, a ChatGPT custom
+GPT, a Cursor extension), the integration recipe is:
+
+1. **Don't reimplement.** Don't write a new HTTP client. Use the
+   shared client.
+2. **Write a thin frontend** that translates the platform's
+   command/skill format into a shell call to
+   `python scripts/atocore_client.py <subcommand> <args...>`.
+3. **Render the JSON response** in the platform's preferred shape.
+4. **Inherit fail-open and env-var behavior** from the shared
+   client. Don't override unless the platform explicitly needs
+   to (e.g. an admin tool that wants to see real errors).
+5. **If a needed capability is missing**, propose adding it to
+   the shared client. If the underlying API endpoint also
+   doesn't exist, propose adding it to the API first. Don't
+   add the logic to your frontend.
+
+The Claude Code slash command in this repo is a worked example:
+~50 lines of markdown that does argument parsing, calls the
+shared client, and renders the result. It contains zero AtoCore
+business logic of its own.
+
+## How OpenClaw fits
+
+OpenClaw's helper skill at `/home/papa/clawd/skills/atocore-context/`
+on the T420 currently has its own implementation of `auto-context`,
+`detect-project`, and the project lifecycle commands. It predates
+this layering doc.
+
+The right long-term shape is to **refactor the OpenClaw helper to
+shell out to the shared client** instead of duplicating the
+routing logic. This isn't urgent because:
+
+- OpenClaw's helper works today and is in active use
+- The duplication is on the OpenClaw side; AtoCore itself is not
+  affected
+- The shared client and the OpenClaw helper are in different
+  repos (AtoCore vs OpenClaw clawd), so the refactor is a
+  cross-repo coordination
+
+The refactor is queued as a follow-up. Until then, **the OpenClaw
+helper remains a parallel implementation** of routing logic the
+shared client already provides, while the Claude Code slash command
+goes through the shared client. The shared client is the canonical
+backbone going forward; new clients should follow the shell-out
+pattern even though the existing OpenClaw helper still has its own
+implementation.
+
+## How this connects to the master plan
+
+| Layer | Phase home | Status |
+|---|---|---|
+| AtoCore HTTP API | Phases 0/0.5/1/2/3/5/7/9 | shipped |
+| Shared operator client (`scripts/atocore_client.py`) | implicitly Phase 8 (OpenClaw integration) infrastructure | shipped via codex/port-atocore-ops-client merge |
+| OpenClaw helper skill (T420) | Phase 8 — partial | shipped (own implementation, refactor queued) |
+| Claude Code slash command (this repo) | precursor to Phase 11 (multi-model) | shipped (refactored to use the shared client) |
+| Codex skill | Phase 11 | future |
+| MCP server | Phase 11 | future |
+| Web UI / dashboard | Phase 11+ | future |
+
+The shared client is the **substrate Phase 11 will build on**.
+Every new client added in Phase 11 should be a thin frontend on
+the shared client, not a fresh reimplementation.
+
+## Versioning and stability
+
+The shared client's subcommand surface is **stable**. Adding new
+subcommands is non-breaking. Changing or removing existing
+subcommands is breaking and would require a coordinated update
+of every frontend that depends on them.
+
+The current shared client has no explicit version constant; the
+implicit contract is "the subcommands and JSON shapes documented
+in this file". When the client surface meaningfully changes,
+add a `CLIENT_VERSION = "x.y.z"` constant to
+`scripts/atocore_client.py` and bump it per semver:
+
+- patch: bug fixes, no surface change
+- minor: new subcommands or new optional fields
+- major: removed subcommands, renamed fields, changed defaults
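
A minimal sketch (not part of this commit) of the bump rules above applied to the subcommand set. `classify_change` and `bump` are hypothetical helpers; comparing subcommand names cannot detect renamed fields or changed defaults, so those major changes would still need a manual call.

```python
def classify_change(old_subcommands, new_subcommands):
    """Classify a surface change by comparing subcommand names only."""
    old, new = set(old_subcommands), set(new_subcommands)
    if old - new:
        return "major"  # something frontends may depend on was removed
    if new - old:
        return "minor"  # additions are non-breaking
    return "patch"      # same surface: bug fixes only


def bump(version, level):
    """Apply a semver bump to an 'x.y.z' version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump level: {level!r}")


# Adding a subcommand to the v0.1.0 baseline is a minor bump.
level = classify_change({"health", "query"}, {"health", "query", "capture"})
print(bump("0.1.0", level))  # prints 0.2.0
```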
+
+## Open follow-ups
+
+1. **Refactor the OpenClaw helper** to shell out to the shared
+   client. Cross-repo coordination, not blocking anything in
+   AtoCore itself.
+2. **Add memory-review subcommands** when the Phase 9 review
+   workflow has been exercised in real use.
+3. **Add backup admin subcommands** if and when we decide the
+   shared client should be the canonical backup operator
+   interface (with fail-open disabled for admin commands).
+4. **Add engineering-layer entity subcommands** as part of the
+   engineering V1 implementation sprint, per
+   `engineering-v1-acceptance.md`.
+5. **Tag a `CLIENT_VERSION` constant** the next time the shared
+   client surface meaningfully changes. Today's surface is the
+   v0.1.0 baseline.
+
+## TL;DR
+
+- AtoCore HTTP API is the universal interface
+- `scripts/atocore_client.py` is the canonical shared Python
+  backbone for stable AtoCore operations
+- Per-agent frontends (Claude Code slash command, OpenClaw
+  helper, future Codex skill, future MCP server) are thin
+  wrappers that shell out to the shared client
+- The shared client today covers project lifecycle, ingestion,
+  retrieval, context build, project-state, and retrieval audit
+- Memory-review and engineering-entity commands are deferred
+  until their workflows are exercised
+- The OpenClaw helper is currently a parallel implementation and
+  the refactor to the shared client is a queued follow-up
+- New LLM clients should never reimplement HTTP calls — they
+  follow the shell-out pattern documented here
--- a/docs/master-plan-status.md
+++ b/docs/master-plan-status.md
@@ -72,6 +72,22 @@ active project set.
 The concrete next step is the V1 implementation sprint, which
 should follow engineering-v1-acceptance.md as its checklist.

+### LLM Client Integration
+
+A separate but related architectural concern: how AtoCore is reachable
+from many different LLM client contexts (OpenClaw, Claude Code, future
+Codex skills, future MCP server). The layering rule is documented in:
+
+- [llm-client-integration.md](architecture/llm-client-integration.md) —
+  three-layer shape: HTTP API → shared operator client
+  (`scripts/atocore_client.py`) → per-agent thin frontends; the
+  shared client is the canonical backbone every new client should
+  shell out to instead of reimplementing HTTP calls
+
+This sits implicitly between Phase 8 (OpenClaw) and Phase 11
+(multi-model). Memory-review and engineering-entity commands are
+deferred from the shared client until their workflows are exercised.
+
 ## What Is Real Today

 - canonical AtoCore runtime on Dalidou