Files
ATOCore/docs/architecture/llm-client-integration.md
Anto01 78d4e979e5 refactor slash command onto shared client + llm-client-integration doc
Codex's review caught that the Claude Code slash command shipped in
Session 2 was a parallel reimplementation of routing logic the
existing scripts/atocore_client.py already had. That client was
introduced via the codex/port-atocore-ops-client merge and is
already a comprehensive operator client (auto-context,
detect-project, refresh-project, project-state, audit-query, etc.).
The slash command should have been a thin wrapper from the start.

This commit fixes the shape without expanding scope.

.claude/commands/atocore-context.md
-----------------------------------
Rewritten as a thin Claude Code-specific frontend that shells out
to the shared client:

- explicit project hint -> calls `python scripts/atocore_client.py
  context-build "<prompt>" "<project>"`
- no explicit hint -> calls `python scripts/atocore_client.py
  auto-context "<prompt>"` which runs the client's detect-project
  routing first and falls through to context-build with the match

Inherits the client's stable behaviour for free:
- ATOCORE_BASE_URL env var (default http://dalidou:8100)
- fail-open on network errors via ATOCORE_FAIL_OPEN
- consistent JSON output shape
- the same project alias matching the OpenClaw helper uses

Removes the speculative `--capture` capture path that was in the
original draft. Capture/extract/queue/promote/reject are
intentionally NOT in the shared client yet (memory-review
workflow not exercised in real use), so the slash command can't
expose them either.

docs/architecture/llm-client-integration.md
-------------------------------------------
New planning doc that defines the layering rule for AtoCore's
relationship with LLM client contexts:

Three layers:
1. AtoCore HTTP API (universal, src/atocore/api/routes.py)
2. Shared operator client (scripts/atocore_client.py) — the
   canonical Python backbone for stable AtoCore operations
3. Per-agent thin frontends (Claude Code slash command,
   OpenClaw helper, future Codex skill, future MCP server)
   that shell out to the shared client

Three non-negotiable rules:
- every per-agent frontend is a thin wrapper (translate the
  agent's command format and render the JSON; nothing else)
- the shared client never duplicates the API (it composes
  endpoints; new logic goes in the API first)
- the shared client only exposes stable operations (subcommands
  land only after the API has been exercised in a real workflow)

Doc covers:
- the full table of subcommands currently in scope (project
  lifecycle, ingestion, project-state, retrieval, context build,
  audit-query, debug-context, health/stats)
- the three deferred families with rationale: memory review
  queue (workflow not exercised), backup admin (fail-open
  default would hide errors), engineering layer entities (V1
  not yet implemented)
- the integration recipe for new agent platforms
- explicit acknowledgement that the OpenClaw helper currently
  duplicates routing logic and that the refactor to the shared
  client is a queued cross-repo follow-up
- how the layering connects to phase 8 (OpenClaw) and phase 11
  (multi-model)
- versioning and stability rules for the shared client surface
- open follow-ups: OpenClaw refactor, memory-review subcommands
  when ready, optional backup admin subcommands, engineering
  entity subcommands during V1 implementation

master-plan-status.md updated
-----------------------------
- New "LLM Client Integration" subsection that points to the
  layering doc and explicitly notes the deferral of memory-review
  and engineering-entity subcommands
- Frames the layering as sitting between phase 8 and phase 11

Scope is intentionally narrow per codex's framing: promote the
existing client to canonical status, refactor the slash command
to use it, document the layering. No new client subcommands
added in this commit. The OpenClaw helper refactor is a
separate cross-repo follow-up. Memory-review and engineering-
entity work stay deferred.

Full suite: 160 passing, no behavior changes.
2026-04-07 07:22:54 -04:00

324 lines
14 KiB
Markdown

# LLM Client Integration (the layering)
## Why this document exists
AtoCore must be reachable from many different LLM client contexts:
- **OpenClaw** on the T420 (already integrated via the read-only
helper skill at `/home/papa/clawd/skills/atocore-context/`)
- **Claude Code** on the laptop (via the slash command shipped in
this repo at `.claude/commands/atocore-context.md`)
- **Codex** sessions (future)
- **Direct API consumers** — scripts, Python code, ad-hoc curl
- **The eventual MCP server** when it's worth building
Without an explicit layering rule, every new client tends to
reimplement the same routing logic (project detection, context
build, retrieval audit, project-state inspection) in slightly
different ways. That is exactly what almost happened in the first
draft of the Claude Code slash command, which started as a curl +
jq script that duplicated capabilities the existing operator client
already had.
This document defines the layering so future clients don't repeat
that mistake.
## The layering
Three layers, top to bottom:
```
+----------------------------------------------------+
| Per-agent thin frontends |
| |
| - Claude Code slash command |
| (.claude/commands/atocore-context.md) |
| - OpenClaw helper skill |
| (/home/papa/clawd/skills/atocore-context/) |
| - Codex skill (future) |
| - MCP server (future) |
+----------------------------------------------------+
|
| shells out to / imports
v
+----------------------------------------------------+
| Shared operator client |
| scripts/atocore_client.py |
| |
| - subcommands for stable AtoCore operations |
| - fail-open on network errors |
| - consistent JSON output across all subcommands |
| - environment-driven configuration |
| (ATOCORE_BASE_URL, ATOCORE_TIMEOUT_SECONDS, |
| ATOCORE_REFRESH_TIMEOUT_SECONDS, |
| ATOCORE_FAIL_OPEN) |
+----------------------------------------------------+
|
| HTTP
v
+----------------------------------------------------+
| AtoCore HTTP API |
| src/atocore/api/routes.py |
| |
| - the universal interface to AtoCore |
| - everything else above is glue |
+----------------------------------------------------+
```
## The non-negotiable rules
These rules are what make the layering work.
### Rule 1 — every per-agent frontend is a thin wrapper
A per-agent frontend exists to do exactly two things:
1. **Translate the agent platform's command/skill format** into an
invocation of the shared client (or a small sequence of them)
2. **Render the JSON response** into whatever shape the agent
platform wants (markdown for Claude Code, plaintext for
OpenClaw, MCP tool result for an MCP server, etc.)
Everything else — talking to AtoCore, project detection, retrieval
audit, fail-open behavior, configuration — is the **shared
client's** job.
If a per-agent frontend grows logic beyond the two responsibilities
above, that logic is in the wrong place. It belongs in the shared
client where every other frontend gets to use it.
### Rule 2 — the shared client never duplicates the API
The shared client is allowed to **compose** API calls (e.g.
`auto-context` calls `detect-project` then `context-build`), but
it never reimplements API logic. If a useful operation can't be
expressed via the existing API endpoints, the right fix is to
extend the API, not to embed the logic in the client.
This rule keeps the API as the single source of truth for what
AtoCore can do.
### Rule 3 — the shared client only exposes stable operations
A subcommand only makes it into the shared client when:
- the API endpoint behind it has been exercised by at least one
real workflow
- the request and response shapes are unlikely to change
- the operation is one that more than one frontend will plausibly
want
This rule keeps the client surface stable so frontends don't have
to chase changes. New endpoints land in the API first, get
exercised in real use, and only then get a client subcommand.
## What's in scope for the shared client today
The currently shipped scope (per `scripts/atocore_client.py`):
| Subcommand | Purpose | API endpoint(s) |
|---|---|---|
| `health` | service status, mount + source readiness | `GET /health` |
| `sources` | enabled source roots and their existence | `GET /sources` |
| `stats` | document/chunk/vector counts | `GET /stats` |
| `projects` | registered projects | `GET /projects` |
| `project-template` | starter shape for a new project | `GET /projects/template` |
| `propose-project` | preview a registration | `POST /projects/proposal` |
| `register-project` | persist a registration | `POST /projects/register` |
| `update-project` | update an existing registration | `PUT /projects/{name}` |
| `refresh-project` | re-ingest a project's roots | `POST /projects/{name}/refresh` |
| `project-state` | list trusted state for a project | `GET /project/state/{name}` |
| `project-state-set` | curate trusted state | `POST /project/state` |
| `project-state-invalidate` | supersede trusted state | `DELETE /project/state` |
| `query` | raw retrieval | `POST /query` |
| `context-build` | full context pack | `POST /context/build` |
| `auto-context` | detect-project then context-build | composes `/projects` + `/context/build` |
| `detect-project` | match a prompt to a registered project | composes `/projects` + local regex |
| `audit-query` | retrieval-quality audit with classification | composes `/query` + local labelling |
| `debug-context` | last context pack inspection | `GET /debug/context` |
| `ingest-sources` | ingest configured source dirs | `POST /ingest/sources` |
That covers everything in the "stable operations" set today:
project lifecycle, ingestion, project-state curation, retrieval and
context build, retrieval-quality audit, health and stats inspection.
## What's intentionally NOT in scope today
Three families of operations are explicitly **deferred** until
their workflows have been exercised in real use:
### 1. Memory review queue and reflection loop
Phase 9 Commit C shipped these endpoints:
- `POST /interactions` (capture)
- `POST /interactions/{id}/reinforce`
- `POST /interactions/{id}/extract`
- `GET /memory?status=candidate`
- `POST /memory/{id}/promote`
- `POST /memory/{id}/reject`
The contracts are stable, but the **workflow ergonomics** are not.
Until a real human has actually exercised the capture → extract →
review → promote/reject loop a few times and we know what feels
right, exposing those operations through the shared client would
prematurely freeze a UX that's still being designed.
When the loop has been exercised in real use and we know what
the right subcommand shapes are, the shared client gains:
- `capture <prompt> <response> [--project P] [--client C]`
- `extract <interaction-id> [--persist]`
- `queue` (list candidate review queue)
- `promote <memory-id>`
- `reject <memory-id>`
At that point the Claude Code slash command can grow a companion
`/atocore-record-response` command and the OpenClaw helper can be
extended with the same flow.
### 2. Backup and restore admin operations
Phase 9 Commit B shipped these endpoints:
- `POST /admin/backup` (with `include_chroma`)
- `GET /admin/backup` (list)
- `GET /admin/backup/{stamp}/validate`
The backup endpoints are stable, but the documented operational
procedure (`docs/backup-restore-procedure.md`) intentionally uses
direct curl rather than the shared client. The reason is that
backup operations are *administrative* and benefit from being
explicit about which instance they're targeting, with no
fail-open behavior. The shared client's fail-open default would
hide a real backup failure.
If we later decide to add backup commands to the shared client,
they would set `ATOCORE_FAIL_OPEN=false` for the duration of the
call so the operator gets a real error on failure rather than a
silent fail-open envelope.
### 3. Engineering layer entity operations
The engineering layer is in planning, not implementation. When
V1 ships per `engineering-v1-acceptance.md`, the shared client
will gain entity, relationship, conflict, and Mirror commands.
None of those exist as stable contracts yet, so they are not in
the shared client today.
## How a new agent platform integrates
When a new LLM client needs AtoCore (e.g. Codex, ChatGPT custom
GPT, a Cursor extension), the integration recipe is:
1. **Don't reimplement.** Don't write a new HTTP client. Use the
shared client.
2. **Write a thin frontend** that translates the platform's
command/skill format into a shell call to
`python scripts/atocore_client.py <subcommand> <args...>`.
3. **Render the JSON response** in the platform's preferred shape.
4. **Inherit fail-open and env-var behavior** from the shared
client. Don't override unless the platform explicitly needs
to (e.g. an admin tool that wants to see real errors).
5. **If a needed capability is missing**, propose adding it to
the shared client. If the underlying API endpoint also
doesn't exist, propose adding it to the API first. Don't
add the logic to your frontend.
The Claude Code slash command in this repo is a worked example:
~50 lines of markdown that does argument parsing, calls the
shared client, and renders the result. It contains zero AtoCore
business logic of its own.
## How OpenClaw fits
OpenClaw's helper skill at `/home/papa/clawd/skills/atocore-context/`
on the T420 currently has its own implementation of `auto-context`,
`detect-project`, and the project lifecycle commands. It predates
this layering doc.
The right long-term shape is to **refactor the OpenClaw helper to
shell out to the shared client** instead of duplicating the
routing logic. This isn't urgent because:
- OpenClaw's helper works today and is in active use
- The duplication is on the OpenClaw side; AtoCore itself is not
affected
- The shared client and the OpenClaw helper are in different
repos (AtoCore vs OpenClaw clawd), so the refactor is a
cross-repo coordination
The refactor is queued as a follow-up. Until then, **the OpenClaw
helper and the Claude Code slash command are parallel
implementations** of the same idea. The shared client is the
canonical backbone going forward; new clients should follow the
new pattern even though the existing OpenClaw helper still has
its own.
## How this connects to the master plan
| Layer | Phase home | Status |
|---|---|---|
| AtoCore HTTP API | Phases 0/0.5/1/2/3/5/7/9 | shipped |
| Shared operator client (`scripts/atocore_client.py`) | implicitly Phase 8 (OpenClaw integration) infrastructure | shipped via codex/port-atocore-ops-client merge |
| OpenClaw helper skill (T420) | Phase 8 — partial | shipped (own implementation, refactor queued) |
| Claude Code slash command (this repo) | precursor to Phase 11 (multi-model) | shipped (refactored to use the shared client) |
| Codex skill | Phase 11 | future |
| MCP server | Phase 11 | future |
| Web UI / dashboard | Phase 11+ | future |
The shared client is the **substrate Phase 11 will build on**.
Every new client added in Phase 11 should be a thin frontend on
the shared client, not a fresh reimplementation.
## Versioning and stability
The shared client's subcommand surface is **stable**. Adding new
subcommands is non-breaking. Changing or removing existing
subcommands is breaking and would require a coordinated update
of every frontend that depends on them.
The current shared client has no explicit version constant; the
implicit contract is "the subcommands and JSON shapes documented
in this file". When the client surface meaningfully changes,
add a `CLIENT_VERSION = "x.y.z"` constant to
`scripts/atocore_client.py` and bump it per semver:
- patch: bug fixes, no surface change
- minor: new subcommands or new optional fields
- major: removed subcommands, renamed fields, changed defaults
## Open follow-ups
1. **Refactor the OpenClaw helper** to shell out to the shared
client. Cross-repo coordination, not blocking anything in
AtoCore itself.
2. **Add memory-review subcommands** when the Phase 9 review
workflow has been exercised in real use.
3. **Add backup admin subcommands** if and when we decide the
shared client should be the canonical backup operator
interface (with fail-open disabled for admin commands).
4. **Add engineering-layer entity subcommands** as part of the
engineering V1 implementation sprint, per
`engineering-v1-acceptance.md`.
5. **Tag a `CLIENT_VERSION` constant** the next time the shared
client surface meaningfully changes. Today's surface is the
v0.1.0 baseline.
## TL;DR
- AtoCore HTTP API is the universal interface
- `scripts/atocore_client.py` is the canonical shared Python
backbone for stable AtoCore operations
- Per-agent frontends (Claude Code slash command, OpenClaw
helper, future Codex skill, future MCP server) are thin
wrappers that shell out to the shared client
- The shared client today covers project lifecycle, ingestion,
retrieval, context build, project-state, and retrieval audit
- Memory-review and engineering-entity commands are deferred
until their workflows are exercised
- The OpenClaw helper is currently a parallel implementation and
the refactor to the shared client is a queued follow-up
- New LLM clients should never reimplement HTTP calls — they
follow the shell-out pattern documented here