
# LLM Client Integration (the layering)

## Why this document exists

AtoCore must be reachable from many different LLM client contexts:

- **OpenClaw** on the T420 (already integrated via the read-only
  helper skill at `/home/papa/clawd/skills/atocore-context/`)
- **Claude Code** on the laptop (via the slash command shipped in
  this repo at `.claude/commands/atocore-context.md`)
- **Codex** sessions (future)
- **Direct API consumers** — scripts, Python code, ad-hoc curl
- **The eventual MCP server** when it's worth building

Without an explicit layering rule, every new client tends to
reimplement the same routing logic (project detection, context
build, retrieval audit, project-state inspection) in slightly
different ways. That nearly happened with the first draft of the
Claude Code slash command, which started as a curl + jq script
duplicating capabilities the existing operator client already
had.

This document defines the layering so future clients don't repeat
that mistake.

## The layering

Three layers, top to bottom:

```
        +----------------------------------------------------+
        |  Per-agent thin frontends                          |
        |                                                    |
        |  - Claude Code slash command                       |
        |    (.claude/commands/atocore-context.md)           |
        |  - OpenClaw helper skill                           |
        |    (/home/papa/clawd/skills/atocore-context/)      |
        |  - Codex skill (future)                            |
        |  - MCP server (future)                             |
        +----------------------------------------------------+
                              |
                              | shells out to / imports
                              v
        +----------------------------------------------------+
        |  Shared operator client                            |
        |  scripts/atocore_client.py                         |
        |                                                    |
        |  - subcommands for stable AtoCore operations       |
        |  - fail-open on network errors                     |
        |  - consistent JSON output across all subcommands   |
        |  - environment-driven configuration                |
        |    (ATOCORE_BASE_URL, ATOCORE_TIMEOUT_SECONDS,     |
        |     ATOCORE_REFRESH_TIMEOUT_SECONDS,               |
        |     ATOCORE_FAIL_OPEN)                             |
        +----------------------------------------------------+
                              |
                              | HTTP
                              v
        +----------------------------------------------------+
        |  AtoCore HTTP API                                  |
        |  src/atocore/api/routes.py                         |
        |                                                    |
        |  - the universal interface to AtoCore              |
        |  - everything else above is glue                   |
        +----------------------------------------------------+
```
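
The environment-driven configuration in the middle layer could be read
along these lines. This is a minimal sketch, not the code in
`scripts/atocore_client.py`: the `ClientConfig` class, the `load_config`
helper, the placeholder URL and timeout defaults, and the set of strings
treated as true are all assumptions made for illustration. Only the four
variable names and the fail-open default come from this document.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientConfig:
    base_url: str
    timeout_seconds: float
    refresh_timeout_seconds: float
    fail_open: bool


def _as_bool(raw: str) -> bool:
    # Assumed truthy spellings; the real client may accept a different set.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_config(env=None) -> ClientConfig:
    """Build a config from environment variables with placeholder defaults."""
    env = os.environ if env is None else env
    return ClientConfig(
        # The URL and both timeouts below are hypothetical defaults.
        base_url=env.get("ATOCORE_BASE_URL", "http://localhost:8000").rstrip("/"),
        timeout_seconds=float(env.get("ATOCORE_TIMEOUT_SECONDS", "30")),
        refresh_timeout_seconds=float(
            env.get("ATOCORE_REFRESH_TIMEOUT_SECONDS", "1800")
        ),
        # Fail-open is on unless explicitly disabled.
        fail_open=_as_bool(env.get("ATOCORE_FAIL_OPEN", "true")),
    )
```

Keeping every knob in the environment means a frontend that shells out
to the client needs no configuration of its own.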

## The non-negotiable rules

These rules are what make the layering work.

### Rule 1 — every per-agent frontend is a thin wrapper

A per-agent frontend exists to do exactly two things:

1. **Translate the agent platform's command/skill format** into an
   invocation of the shared client (or a small sequence of them)
2. **Render the JSON response** into whatever shape the agent
   platform wants (markdown for Claude Code, plaintext for
   OpenClaw, MCP tool result for an MCP server, etc.)

Everything else — talking to AtoCore, project detection, retrieval
audit, fail-open behavior, configuration — is the **shared
client's** job.

If a per-agent frontend grows logic beyond the two responsibilities
above, that logic is in the wrong place. It belongs in the shared
client where every other frontend gets to use it.
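
As an illustration of how little a frontend should contain, the two
responsibilities can be sketched as two small functions. The names
`translate` and `render` are hypothetical, and passing the prompt as a
single positional argument to `auto-context` is an assumption about the
client's argument shape, not something this document specifies.

```python
import json


def translate(platform_args: str) -> list[str]:
    """Responsibility 1: map the platform's raw argument string to client argv."""
    prompt = platform_args.strip()
    # Assumed invocation shape: auto-context takes the prompt as one argument.
    return ["python", "scripts/atocore_client.py", "auto-context", prompt]


def render(client_stdout: str) -> str:
    """Responsibility 2: turn the client's JSON output into markdown."""
    payload = json.loads(client_stdout)
    lines = ["### AtoCore context", ""]
    # Render top-level keys generically; no AtoCore field names are assumed.
    for key, value in payload.items():
        lines.append(f"- **{key}**: {value}")
    return "\n".join(lines)
```

Anything a frontend needs beyond these two functions, such as retries
or project matching, is a sign the logic should move into the shared
client.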

### Rule 2 — the shared client never duplicates the API

The shared client is allowed to **compose** API calls (e.g.
`auto-context` calls `detect-project` then `context-build`), but
it never reimplements API logic. If a useful operation can't be
expressed via the existing API endpoints, the right fix is to
extend the API, not to embed the logic in the client.

This rule keeps the API as the single source of truth for what
AtoCore can do.
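
A rough sketch of what "compose, don't reimplement" can look like for
`auto-context`. The two callables stand in for the client's existing
wrappers around `GET /projects` and `POST /context/build`; their
signatures, the whole-word matching rule, and the returned payload are
assumptions for illustration, not the shipped behavior.

```python
import re
from typing import Callable, Optional


def detect_project(prompt: str, project_names: list[str]) -> Optional[str]:
    """Return the first registered name found in the prompt as a whole word."""
    for name in project_names:
        pattern = rf"\b{re.escape(name)}\b"
        if re.search(pattern, prompt, flags=re.IGNORECASE):
            return name
    return None


def auto_context(
    prompt: str,
    list_projects: Callable[[], list[str]],
    build_context: Callable[[str, Optional[str]], dict],
) -> dict:
    """Compose two existing operations; no retrieval logic lives here."""
    project = detect_project(prompt, list_projects())
    # Context assembly itself stays behind the API.
    return build_context(prompt, project)
```

The only local logic is the name match. Ranking, chunk selection, and
pack assembly stay server-side, which is the point of the rule.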

### Rule 3 — the shared client only exposes stable operations

A subcommand only makes it into the shared client when:

- the API endpoint behind it has been exercised by at least one
  real workflow
- the request and response shapes are unlikely to change
- the operation is one that more than one frontend will plausibly
  want

This rule keeps the client surface stable so frontends don't have
to chase changes. New endpoints land in the API first, get
exercised in real use, and only then get a client subcommand.

## What's in scope for the shared client today

The currently shipped scope (per `scripts/atocore_client.py`):

| Subcommand | Purpose | API endpoint(s) |
|---|---|---|
| `health` | service status, mount + source readiness | `GET /health` |
| `sources` | enabled source roots and their existence | `GET /sources` |
| `stats` | document/chunk/vector counts | `GET /stats` |
| `projects` | registered projects | `GET /projects` |
| `project-template` | starter shape for a new project | `GET /projects/template` |
| `propose-project` | preview a registration | `POST /projects/proposal` |
| `register-project` | persist a registration | `POST /projects/register` |
| `update-project` | update an existing registration | `PUT /projects/{name}` |
| `refresh-project` | re-ingest a project's roots | `POST /projects/{name}/refresh` |
| `project-state` | list trusted state for a project | `GET /project/state/{name}` |
| `project-state-set` | curate trusted state | `POST /project/state` |
| `project-state-invalidate` | supersede trusted state | `DELETE /project/state` |
| `query` | raw retrieval | `POST /query` |
| `context-build` | full context pack | `POST /context/build` |
| `auto-context` | detect-project then context-build | composes `/projects` + `/context/build` |
| `detect-project` | match a prompt to a registered project | composes `/projects` + local regex |
| `audit-query` | retrieval-quality audit with classification | composes `/query` + local labelling |
| `debug-context` | last context pack inspection | `GET /debug/context` |
| `ingest-sources` | ingest configured source dirs | `POST /ingest/sources` |

That covers everything in the "stable operations" set today:
project lifecycle, ingestion, project-state curation, retrieval and
context build, retrieval-quality audit, health and stats inspection.

## What's intentionally NOT in scope today

Three families of operations are explicitly **deferred** until
their workflows have been exercised in real use:

### 1. Memory review queue and reflection loop

Phase 9 Commit C shipped these endpoints:

- `POST /interactions` (capture)
- `POST /interactions/{id}/reinforce`
- `POST /interactions/{id}/extract`
- `GET /memory?status=candidate`
- `POST /memory/{id}/promote`
- `POST /memory/{id}/reject`

The contracts are stable, but the **workflow ergonomics** are not.
Until a real human has actually exercised the capture → extract →
review → promote/reject loop a few times and we know what feels
right, exposing those operations through the shared client would
prematurely freeze a UX that's still being designed.

When the loop has been exercised in real use and we know what
the right subcommand shapes are, the shared client gains:

- `capture <prompt> <response> [--project P] [--client C]`
- `extract <interaction-id> [--persist]`
- `queue` (list candidate review queue)
- `promote <memory-id>`
- `reject <memory-id>`

At that point the Claude Code slash command can grow a companion
`/atocore-record-response` command and the OpenClaw helper can be
extended with the same flow.

### 2. Backup and restore admin operations

Phase 9 Commit B shipped these endpoints:

- `POST /admin/backup` (with `include_chroma`)
- `GET /admin/backup` (list)
- `GET /admin/backup/{stamp}/validate`

The backup endpoints are stable, but the documented operational
procedure (`docs/backup-restore-procedure.md`) intentionally uses
direct curl rather than the shared client. The reason is that
backup operations are *administrative* and benefit from being
explicit about which instance they're targeting, with no
fail-open behavior. The shared client's fail-open default would
hide a real backup failure.

If we later decide to add backup commands to the shared client,
they would set `ATOCORE_FAIL_OPEN=false` for the duration of the
call so the operator gets a real error on failure rather than a
silent fail-open envelope.
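
One way to scope that override is a small context manager, sketched
below. The `fail_closed` name is hypothetical, and the sketch assumes
the client reads `ATOCORE_FAIL_OPEN` at call time (or in a child process
started inside the block) rather than caching it at import.

```python
import os
from contextlib import contextmanager


@contextmanager
def fail_closed():
    """Force ATOCORE_FAIL_OPEN=false inside the block, then restore the prior value."""
    previous = os.environ.get("ATOCORE_FAIL_OPEN")
    os.environ["ATOCORE_FAIL_OPEN"] = "false"
    try:
        yield
    finally:
        # Restore even if the admin call raised.
        if previous is None:
            os.environ.pop("ATOCORE_FAIL_OPEN", None)
        else:
            os.environ["ATOCORE_FAIL_OPEN"] = previous
```

A hypothetical backup subcommand would wrap its HTTP call in
`with fail_closed():` so the operator sees the real error, while every
other subcommand keeps the fail-open default.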

### 3. Engineering layer entity operations

The engineering layer is in planning, not implementation. When
V1 ships per `engineering-v1-acceptance.md`, the shared client
will gain entity, relationship, conflict, and Mirror commands.
None of those exist as stable contracts yet, so they are not in
the shared client today.

## How a new agent platform integrates

When a new LLM client needs AtoCore (e.g. Codex, a ChatGPT custom
GPT, or a Cursor extension), the integration recipe is:

1. **Don't reimplement.** Don't write a new HTTP client. Use the
   shared client.
2. **Write a thin frontend** that translates the platform's
   command/skill format into a shell call to
   `python scripts/atocore_client.py <subcommand> <args...>`.
3. **Render the JSON response** in the platform's preferred shape.
4. **Inherit fail-open and env-var behavior** from the shared
   client. Don't override unless the platform explicitly needs
   to (e.g. an admin tool that wants to see real errors).
5. **If a needed capability is missing**, propose adding it to
   the shared client. If the underlying API endpoint also
   doesn't exist, propose adding it to the API first. Don't
   add the logic to your frontend.
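
For a frontend written in Python, steps 2 to 4 might reduce to a helper
like the one below. This is a sketch under assumptions: `run_client`,
the injectable `runner` parameter, and the error dictionary returned for
non-JSON output are hypothetical, and the client path assumes the
frontend runs from the repo root.

```python
import json
import subprocess
import sys

CLIENT_PATH = "scripts/atocore_client.py"


def run_client(subcommand, *args, runner=subprocess.run):
    """Shell out to the shared client and parse its JSON stdout."""
    argv = [sys.executable, CLIENT_PATH, subcommand, *args]
    # No env= argument is passed, so the child inherits the ATOCORE_*
    # variables (and therefore the fail-open setting) from this process.
    completed = runner(argv, capture_output=True, text=True, check=False)
    try:
        return json.loads(completed.stdout)
    except json.JSONDecodeError:
        # Hypothetical fallback shape so the renderer always gets a dict.
        return {
            "ok": False,
            "error": "client returned non-JSON output",
            "stdout": completed.stdout,
        }
```

A frontend would pass the returned dictionary straight to its renderer,
for example `run_client("health")` for a status check.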

The Claude Code slash command in this repo is a worked example:
about 50 lines of markdown that parse arguments, call the shared
client, and render the result. It contains zero AtoCore business
logic of its own.

## How OpenClaw fits

OpenClaw's helper skill at `/home/papa/clawd/skills/atocore-context/`
on the T420 currently has its own implementation of `auto-context`,
`detect-project`, and the project lifecycle commands. It predates
this layering doc.

The right long-term shape is to **refactor the OpenClaw helper to
shell out to the shared client** instead of duplicating the
routing logic. This isn't urgent because:

- OpenClaw's helper works today and is in active use
- The duplication is on the OpenClaw side; AtoCore itself is not
  affected
- The shared client and the OpenClaw helper are in different
  repos (AtoCore vs OpenClaw clawd), so the refactor is a
  cross-repo coordination

The refactor is queued as a follow-up. Until then, **the OpenClaw
helper and the Claude Code slash command are parallel
implementations** of the same idea. The shared client is the
canonical backbone going forward; new clients should follow the
shared-client pattern even though the existing OpenClaw helper
still carries its own implementation.

## How this connects to the master plan

| Layer | Phase home | Status |
|---|---|---|
| AtoCore HTTP API | Phases 0/0.5/1/2/3/5/7/9 | shipped |
| Shared operator client (`scripts/atocore_client.py`) | implicitly Phase 8 (OpenClaw integration) infrastructure | shipped via codex/port-atocore-ops-client merge |
| OpenClaw helper skill (T420) | Phase 8 — partial | shipped (own implementation, refactor queued) |
| Claude Code slash command (this repo) | precursor to Phase 11 (multi-model) | shipped (refactored to use the shared client) |
| Codex skill | Phase 11 | future |
| MCP server | Phase 11 | future |
| Web UI / dashboard | Phase 11+ | future |

The shared client is the **substrate Phase 11 will build on**.
Every new client added in Phase 11 should be a thin frontend on
the shared client, not a fresh reimplementation.

## Versioning and stability

The shared client's subcommand surface is **stable**. Adding new
subcommands is non-breaking. Changing or removing existing
subcommands is breaking and would require a coordinated update
of every frontend that depends on them.

The current shared client has no explicit version constant; the
implicit contract is "the subcommands and JSON shapes documented
in this file". When the client surface meaningfully changes,
add a `CLIENT_VERSION = "x.y.z"` constant to
`scripts/atocore_client.py` and bump it per semver:

- patch: bug fixes, no surface change
- minor: new subcommands or new optional fields
- major: removed subcommands, renamed fields, changed defaults
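
The bump rules above can be sketched as a small check over subcommand
names. Both helpers are hypothetical and cover only added or removed
subcommands. Renamed fields, changed defaults, and new optional fields
still need human review.

```python
def classify_bump(old_subcommands: set[str], new_subcommands: set[str]) -> str:
    """Pick a semver level from the difference between two subcommand sets."""
    if old_subcommands - new_subcommands:
        return "major"  # something was removed
    if new_subcommands - old_subcommands:
        return "minor"  # only additions
    return "patch"  # surface unchanged


def bump(version: str, level: str) -> str:
    """Apply a semver bump to an 'x.y.z' string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

For example, adding a `queue` subcommand to a `0.1.0` surface classifies
as `minor` and yields `0.2.0`.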

## Open follow-ups

1. **Refactor the OpenClaw helper** to shell out to the shared
   client. Cross-repo coordination, not blocking anything in
   AtoCore itself.
2. **Add memory-review subcommands** when the Phase 9 review
   workflow has been exercised in real use.
3. **Add backup admin subcommands** if and when we decide the
   shared client should be the canonical backup operator
   interface (with fail-open disabled for admin commands).
4. **Add engineering-layer entity subcommands** as part of the
   engineering V1 implementation sprint, per
   `engineering-v1-acceptance.md`.
5. **Tag a `CLIENT_VERSION` constant** the next time the shared
   client surface meaningfully changes. Today's surface is the
   v0.1.0 baseline.

## TL;DR

- AtoCore HTTP API is the universal interface
- `scripts/atocore_client.py` is the canonical shared Python
  backbone for stable AtoCore operations
- Per-agent frontends (Claude Code slash command, OpenClaw
  helper, future Codex skill, future MCP server) are thin
  wrappers that shell out to the shared client
- The shared client today covers project lifecycle, ingestion,
  retrieval, context build, project-state, and retrieval audit
- Memory-review, backup-admin, and engineering-entity commands
  are deferred until their workflows or contracts are settled
- The OpenClaw helper is currently a parallel implementation and
  the refactor to the shared client is a queued follow-up
- New LLM clients should never reimplement HTTP calls — they
  follow the shell-out pattern documented here