ATOCore/docs/architecture/llm-client-integration.md
Anto01 78d4e979e5 refactor slash command onto shared client + llm-client-integration doc
Codex's review caught that the Claude Code slash command shipped in
Session 2 was a parallel reimplementation of routing logic the
existing scripts/atocore_client.py already had. That client was
introduced via the codex/port-atocore-ops-client merge and is
already a comprehensive operator client (auto-context,
detect-project, refresh-project, project-state, audit-query, etc.).
The slash command should have been a thin wrapper from the start.

This commit fixes the shape without expanding scope.

.claude/commands/atocore-context.md
-----------------------------------
Rewritten as a thin Claude Code-specific frontend that shells out
to the shared client:

- explicit project hint -> calls `python scripts/atocore_client.py
  context-build "<prompt>" "<project>"`
- no explicit hint -> calls `python scripts/atocore_client.py
  auto-context "<prompt>"` which runs the client's detect-project
  routing first and falls through to context-build with the match

Inherits the client's stable behaviour for free:
- ATOCORE_BASE_URL env var (default http://dalidou:8100)
- fail-open on network errors via ATOCORE_FAIL_OPEN
- consistent JSON output shape
- the same project alias matching the OpenClaw helper uses

Removes the speculative `--capture` path that was in the
original draft. Capture/extract/queue/promote/reject are
intentionally NOT in the shared client yet (memory-review
workflow not exercised in real use), so the slash command can't
expose them either.

docs/architecture/llm-client-integration.md
-------------------------------------------
New planning doc that defines the layering rule for AtoCore's
relationship with LLM client contexts:

Three layers:
1. AtoCore HTTP API (universal, src/atocore/api/routes.py)
2. Shared operator client (scripts/atocore_client.py) — the
   canonical Python backbone for stable AtoCore operations
3. Per-agent thin frontends (Claude Code slash command,
   OpenClaw helper, future Codex skill, future MCP server)
   that shell out to the shared client

Three non-negotiable rules:
- every per-agent frontend is a thin wrapper (translate the
  agent's command format and render the JSON; nothing else)
- the shared client never duplicates the API (it composes
  endpoints; new logic goes in the API first)
- the shared client only exposes stable operations (subcommands
  land only after the API has been exercised in a real workflow)

Doc covers:
- the full table of subcommands currently in scope (project
  lifecycle, ingestion, project-state, retrieval, context build,
  audit-query, debug-context, health/stats)
- the three deferred families with rationale: memory review
  queue (workflow not exercised), backup admin (fail-open
  default would hide errors), engineering layer entities (V1
  not yet implemented)
- the integration recipe for new agent platforms
- explicit acknowledgement that the OpenClaw helper currently
  duplicates routing logic and that the refactor to the shared
  client is a queued cross-repo follow-up
- how the layering connects to phase 8 (OpenClaw) and phase 11
  (multi-model)
- versioning and stability rules for the shared client surface
- open follow-ups: OpenClaw refactor, memory-review subcommands
  when ready, optional backup admin subcommands, engineering
  entity subcommands during V1 implementation

master-plan-status.md updated
-----------------------------
- New "LLM Client Integration" subsection that points to the
  layering doc and explicitly notes the deferral of memory-review
  and engineering-entity subcommands
- Frames the layering as sitting between phase 8 and phase 11

Scope is intentionally narrow per codex's framing: promote the
existing client to canonical status, refactor the slash command
to use it, document the layering. No new client subcommands
added in this commit. The OpenClaw helper refactor is a
separate cross-repo follow-up. Memory-review and engineering-
entity work stay deferred.

Full suite: 160 passing, no behavior changes.
2026-04-07 07:22:54 -04:00


LLM Client Integration (the layering)

Why this document exists

AtoCore must be reachable from many different LLM client contexts:

  • OpenClaw on the T420 (already integrated via the read-only helper skill at /home/papa/clawd/skills/atocore-context/)
  • Claude Code on the laptop (via the slash command shipped in this repo at .claude/commands/atocore-context.md)
  • Codex sessions (future)
  • Direct API consumers — scripts, Python code, ad-hoc curl
  • The eventual MCP server when it's worth building

Without an explicit layering rule, every new client tends to reimplement the same routing logic (project detection, context build, retrieval audit, project-state inspection) in slightly different ways. That nearly happened with the first draft of the Claude Code slash command, which started as a curl + jq script that duplicated capabilities the existing operator client already had.

This document defines the layering so future clients don't repeat that mistake.

The layering

Three layers, top to bottom:

        +----------------------------------------------------+
        |  Per-agent thin frontends                          |
        |                                                    |
        |  - Claude Code slash command                       |
        |    (.claude/commands/atocore-context.md)           |
        |  - OpenClaw helper skill                           |
        |    (/home/papa/clawd/skills/atocore-context/)      |
        |  - Codex skill (future)                            |
        |  - MCP server (future)                             |
        +----------------------------------------------------+
                              |
                              | shells out to / imports
                              v
        +----------------------------------------------------+
        |  Shared operator client                            |
        |  scripts/atocore_client.py                         |
        |                                                    |
        |  - subcommands for stable AtoCore operations       |
        |  - fail-open on network errors                     |
        |  - consistent JSON output across all subcommands   |
        |  - environment-driven configuration                |
        |    (ATOCORE_BASE_URL, ATOCORE_TIMEOUT_SECONDS,     |
        |     ATOCORE_REFRESH_TIMEOUT_SECONDS,               |
        |     ATOCORE_FAIL_OPEN)                             |
        +----------------------------------------------------+
                              |
                              | HTTP
                              v
        +----------------------------------------------------+
        |  AtoCore HTTP API                                  |
        |  src/atocore/api/routes.py                         |
        |                                                    |
        |  - the universal interface to AtoCore              |
        |  - everything else above is glue                   |
        +----------------------------------------------------+
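The environment-driven configuration named in the middle layer can be sketched as follows. This is a minimal illustration, not the actual code in scripts/atocore_client.py: the `ClientConfig` and `load_config` names are hypothetical, the two timeout defaults are placeholders, and the set of accepted truthy spellings is an assumption. Only the variable names, the default base URL, and the fail-open-by-default behavior come from this repo.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientConfig:
    """Hypothetical container for the four documented environment variables."""
    base_url: str
    timeout_seconds: float
    refresh_timeout_seconds: float
    fail_open: bool


def _env_flag(value):
    """Interpret common truthy spellings; anything else counts as False (assumption)."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_config(env=None):
    """Build a config from a mapping, defaulting to the process environment."""
    env = os.environ if env is None else env
    return ClientConfig(
        base_url=env.get("ATOCORE_BASE_URL", "http://dalidou:8100").rstrip("/"),
        # The timeout defaults below are placeholders, not the client's real values.
        timeout_seconds=float(env.get("ATOCORE_TIMEOUT_SECONDS", "30")),
        refresh_timeout_seconds=float(env.get("ATOCORE_REFRESH_TIMEOUT_SECONDS", "300")),
        # Fail-open is on unless explicitly switched off.
        fail_open=_env_flag(env.get("ATOCORE_FAIL_OPEN", "true")),
    )
```

Because every frontend shells out to the same client, a setting such as ATOCORE_BASE_URL only has to be exported once per machine for all frontends to pick it up.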

The non-negotiable rules

These rules are what make the layering work.

Rule 1 — every per-agent frontend is a thin wrapper

A per-agent frontend exists to do exactly two things:

  1. Translate the agent platform's command/skill format into an invocation of the shared client (or a small sequence of them)
  2. Render the JSON response into whatever shape the agent platform wants (markdown for Claude Code, plaintext for OpenClaw, MCP tool result for an MCP server, etc.)

Everything else — talking to AtoCore, project detection, retrieval audit, fail-open behavior, configuration — is the shared client's job.

If a per-agent frontend grows logic beyond the two responsibilities above, that logic is in the wrong place. It belongs in the shared client where every other frontend gets to use it.
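As a rough sketch of how small a frontend can be, the two responsibilities map onto two functions. The function names are hypothetical; the two invocations match the ones the Claude Code slash command uses (context-build with an explicit project hint, auto-context without one). The renderer deliberately assumes nothing about the JSON keys, since the response shape is not specified in this document.

```python
import json
import sys

# The shared client, invoked with the current interpreter.
CLIENT = [sys.executable, "scripts/atocore_client.py"]


def build_argv(prompt, project=None):
    """Responsibility 1: translate the platform's input into a shared-client call."""
    if project:
        return CLIENT + ["context-build", prompt, project]
    return CLIENT + ["auto-context", prompt]


def render_markdown(payload):
    """Responsibility 2: render the client's JSON in the platform's preferred shape.

    Emits one markdown bullet per top-level key, so no field names are assumed.
    """
    return "\n".join(
        f"- **{key}**: {json.dumps(value)}" for key, value in payload.items()
    )
```

Anything that would not fit into one of these two functions (project matching, retries, endpoint selection) is a sign that the logic belongs in the shared client instead.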

Rule 2 — the shared client never duplicates the API

The shared client is allowed to compose API calls (e.g. auto-context calls detect-project then context-build), but it never reimplements API logic. If a useful operation can't be expressed via the existing API endpoints, the right fix is to extend the API, not to embed the logic in the client.

This rule keeps the API as the single source of truth for what AtoCore can do.
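The difference between composing and reimplementing can be illustrated with a sketch of the auto-context composition. Everything here is an assumption made for illustration: the `api_get` and `api_post` callables stand in for the client's HTTP helpers, the project record shape (`id`, `aliases`) and the request body fields (`prompt`, `project`) are hypothetical, and the behavior when no project matches is a guess. The only grounded part is the call order: read the registered projects, match locally, then call context build.

```python
import re


def detect_project(prompt, projects):
    """Match a prompt against project ids and aliases with a local regex.

    `projects` is assumed to be a list of dicts with an "id" and optional "aliases".
    """
    for project in projects:
        names = [project["id"], *project.get("aliases", [])]
        for name in names:
            if re.search(rf"\b{re.escape(name)}\b", prompt, re.IGNORECASE):
                return project["id"]
    return None


def auto_context(prompt, api_get, api_post):
    """Compose two existing endpoints; no retrieval or ranking logic lives here."""
    projects = api_get("/projects")
    project = detect_project(prompt, projects)
    body = {"prompt": prompt}
    if project is not None:
        body["project"] = project
    # Context assembly itself stays behind the API.
    return api_post("/context/build", body)
```

The client decides which endpoints to call and in what order, but the work of building the context pack is never copied out of the API.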

Rule 3 — the shared client only exposes stable operations

A subcommand only makes it into the shared client when:

  • the API endpoint behind it has been exercised by at least one real workflow
  • the request and response shapes are unlikely to change
  • the operation is one that more than one frontend will plausibly want

This rule keeps the client surface stable so frontends don't have to chase changes. New endpoints land in the API first, get exercised in real use, and only then get a client subcommand.

What's in scope for the shared client today

The currently shipped scope (per scripts/atocore_client.py):

| Subcommand | Purpose | API endpoint(s) |
|---|---|---|
| health | service status, mount + source readiness | GET /health |
| sources | enabled source roots and their existence | GET /sources |
| stats | document/chunk/vector counts | GET /stats |
| projects | registered projects | GET /projects |
| project-template | starter shape for a new project | GET /projects/template |
| propose-project | preview a registration | POST /projects/proposal |
| register-project | persist a registration | POST /projects/register |
| update-project | update an existing registration | PUT /projects/{name} |
| refresh-project | re-ingest a project's roots | POST /projects/{name}/refresh |
| project-state | list trusted state for a project | GET /project/state/{name} |
| project-state-set | curate trusted state | POST /project/state |
| project-state-invalidate | supersede trusted state | DELETE /project/state |
| query | raw retrieval | POST /query |
| context-build | full context pack | POST /context/build |
| auto-context | detect-project then context-build | composes /projects + /context/build |
| detect-project | match a prompt to a registered project | composes /projects + local regex |
| audit-query | retrieval-quality audit with classification | composes /query + local labelling |
| debug-context | last context pack inspection | GET /debug/context |
| ingest-sources | ingest configured source dirs | POST /ingest/sources |

That covers everything in the "stable operations" set today: project lifecycle, ingestion, project-state curation, retrieval and context build, retrieval-quality audit, health and stats inspection.

What's intentionally NOT in scope today

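Three families of operations are explicitly deferred, each for its own reason: the memory-review workflow has not been exercised in real use, backup admin needs errors to surface rather than fail open, and the engineering layer is not yet implemented.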

1. Memory review queue and reflection loop

Phase 9 Commit C shipped these endpoints:

  • POST /interactions (capture)
  • POST /interactions/{id}/reinforce
  • POST /interactions/{id}/extract
  • GET /memory?status=candidate
  • POST /memory/{id}/promote
  • POST /memory/{id}/reject

The contracts are stable, but the workflow ergonomics are not. Until a real human has actually exercised the capture → extract → review → promote/reject loop a few times and we know what feels right, exposing those operations through the shared client would prematurely freeze a UX that's still being designed.

When the loop has been exercised in real use and we know what the right subcommand shapes are, the shared client gains:

  • capture <prompt> <response> [--project P] [--client C]
  • extract <interaction-id> [--persist]
  • queue (list candidate review queue)
  • promote <memory-id>
  • reject <memory-id>

At that point the Claude Code slash command can grow a companion /atocore-record-response command and the OpenClaw helper can be extended with the same flow.
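For discussion purposes, the proposed subcommand shapes could be expressed as an argparse layout like the one below. This is only a sketch of the shapes listed above; none of these subcommands exist in the shared client today, the `build_review_parser` name is hypothetical, and whether the real client uses argparse at all is an assumption.

```python
import argparse


def build_review_parser():
    """Hypothetical parser for the deferred memory-review subcommands."""
    parser = argparse.ArgumentParser(prog="atocore_client.py")
    sub = parser.add_subparsers(dest="command", required=True)

    # capture <prompt> <response> [--project P] [--client C]
    capture = sub.add_parser("capture")
    capture.add_argument("prompt")
    capture.add_argument("response")
    capture.add_argument("--project")
    capture.add_argument("--client")

    # extract <interaction-id> [--persist]
    extract = sub.add_parser("extract")
    extract.add_argument("interaction_id")
    extract.add_argument("--persist", action="store_true")

    # queue (list candidate review queue)
    sub.add_parser("queue")

    # promote <memory-id> / reject <memory-id>
    for name in ("promote", "reject"):
        decision = sub.add_parser(name)
        decision.add_argument("memory_id")

    return parser
```

Writing the shapes down this way makes it cheap to revise them once the review loop has been used for real, which is the point of deferring the implementation.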

2. Backup and restore admin operations

Phase 9 Commit B shipped these endpoints:

  • POST /admin/backup (with include_chroma)
  • GET /admin/backup (list)
  • GET /admin/backup/{stamp}/validate

The backup endpoints are stable, but the documented operational procedure (docs/backup-restore-procedure.md) intentionally uses direct curl rather than the shared client. The reason is that backup operations are administrative and benefit from being explicit about which instance they're targeting, with no fail-open behavior. The shared client's fail-open default would hide a real backup failure.

If we later decide to add backup commands to the shared client, they would set ATOCORE_FAIL_OPEN=false for the duration of the call so the operator gets a real error on failure rather than a silent fail-open envelope.
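One way to scope that override is a small context manager that forces the variable off and restores the previous value afterwards. The `fail_open_disabled` name is hypothetical, and the sketch assumes the client reads ATOCORE_FAIL_OPEN at call time rather than once at import time; if it reads the value at import, the override would have to happen before the client module is loaded.

```python
import os
from contextlib import contextmanager


@contextmanager
def fail_open_disabled(env=None):
    """Temporarily set ATOCORE_FAIL_OPEN=false, then restore the prior state."""
    env = os.environ if env is None else env
    previous = env.get("ATOCORE_FAIL_OPEN")
    env["ATOCORE_FAIL_OPEN"] = "false"
    try:
        yield
    finally:
        if previous is None:
            # The variable was unset before; leave it unset again.
            env.pop("ATOCORE_FAIL_OPEN", None)
        else:
            env["ATOCORE_FAIL_OPEN"] = previous
```

A hypothetical backup subcommand would wrap its HTTP call in `with fail_open_disabled():` so that a failed backup raises instead of returning a fail-open envelope.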

3. Engineering layer entity operations

The engineering layer is in planning, not implementation. When V1 ships per engineering-v1-acceptance.md, the shared client will gain entity, relationship, conflict, and Mirror commands. None of those exist as stable contracts yet, so they are not in the shared client today.

How a new agent platform integrates

When a new LLM client needs AtoCore (e.g. Codex, ChatGPT custom GPT, a Cursor extension), the integration recipe is:

  1. Don't reimplement. Don't write a new HTTP client. Use the shared client.
  2. Write a thin frontend that translates the platform's command/skill format into a shell call to python scripts/atocore_client.py <subcommand> <args...>.
  3. Render the JSON response in the platform's preferred shape.
  4. Inherit fail-open and env-var behavior from the shared client. Don't override unless the platform explicitly needs to (e.g. an admin tool that wants to see real errors).
  5. If a needed capability is missing, propose adding it to the shared client. If the underlying API endpoint also doesn't exist, propose adding it to the API first. Don't add the logic to your frontend.

The Claude Code slash command in this repo is a worked example: ~50 lines of markdown that parse arguments, call the shared client, and render the result. It contains zero AtoCore business logic of its own.
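For a platform whose frontends are written in Python, the shell-out step of the recipe might look like the sketch below. The `run_client` helper and its `client_argv` and `timeout` parameters are hypothetical, and raising on non-JSON output is a design choice made for this example rather than documented client behavior. The environment is inherited unchanged, so the ATOCORE_* variables and fail-open behavior come from the shared client as the recipe requires.

```python
import json
import subprocess
import sys


def run_client(args, client_argv=None, timeout=60):
    """Run a shared-client subcommand and return its parsed JSON output.

    `client_argv` lets callers (and tests) substitute the command prefix.
    """
    prefix = list(client_argv or [sys.executable, "scripts/atocore_client.py"])
    completed = subprocess.run(
        prefix + list(args),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    try:
        return json.loads(completed.stdout)
    except json.JSONDecodeError as exc:
        # Surface the client's stderr instead of guessing at a fallback shape.
        raise RuntimeError(
            f"shared client returned non-JSON output: {completed.stderr.strip()}"
        ) from exc
```

A call such as `run_client(["auto-context", "what changed this week?"])` returns a dict that the frontend only needs to render.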

How OpenClaw fits

OpenClaw's helper skill at /home/papa/clawd/skills/atocore-context/ on the T420 currently has its own implementation of auto-context, detect-project, and the project lifecycle commands. It predates this layering doc.

The right long-term shape is to refactor the OpenClaw helper to shell out to the shared client instead of duplicating the routing logic. This isn't urgent because:

  • OpenClaw's helper works today and is in active use
  • The duplication is on the OpenClaw side; AtoCore itself is not affected
  • The shared client and the OpenClaw helper are in different repos (AtoCore vs OpenClaw clawd), so the refactor is a cross-repo coordination

The refactor is queued as a follow-up. Until then, the OpenClaw helper remains a parallel implementation of routing logic the shared client already provides, while the Claude Code slash command already shells out to the shared client. The shared client is the canonical backbone going forward; new clients should follow the shell-out pattern even though the existing OpenClaw helper still has its own implementation.

How this connects to the master plan

| Layer | Phase home | Status |
|---|---|---|
| AtoCore HTTP API | Phases 0/0.5/1/2/3/5/7/9 | shipped |
| Shared operator client (scripts/atocore_client.py) | implicitly Phase 8 (OpenClaw integration) infrastructure | shipped via codex/port-atocore-ops-client merge |
| OpenClaw helper skill (T420) | Phase 8 (partial) | shipped (own implementation, refactor queued) |
| Claude Code slash command (this repo) | precursor to Phase 11 (multi-model) | shipped (refactored to use the shared client) |
| Codex skill | Phase 11 | future |
| MCP server | Phase 11 | future |
| Web UI / dashboard | Phase 11+ | future |

The shared client is the substrate Phase 11 will build on. Every new client added in Phase 11 should be a thin frontend on the shared client, not a fresh reimplementation.

Versioning and stability

The shared client's subcommand surface is stable. Adding new subcommands is non-breaking. Changing or removing existing subcommands is breaking and would require a coordinated update of every frontend that depends on them.

The current shared client has no explicit version constant; the implicit contract is "the subcommands and JSON shapes documented in this file". When the client surface meaningfully changes, add a CLIENT_VERSION = "x.y.z" constant to scripts/atocore_client.py and bump it per semver:

  • patch: bug fixes, no surface change
  • minor: new subcommands or new optional fields
  • major: removed subcommands, renamed fields, changed defaults
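The bump rules above amount to a few lines of code. In this sketch the `bump` helper is hypothetical and not part of the shared client; the starting value reflects the v0.1.0 baseline that today's surface represents.

```python
CLIENT_VERSION = "0.1.0"  # baseline for today's subcommand surface


def bump(version, change):
    """Return the next version string for a 'patch', 'minor', or 'major' change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":
        # Removed subcommands, renamed fields, changed defaults.
        return f"{major + 1}.0.0"
    if change == "minor":
        # New subcommands or new optional fields.
        return f"{major}.{minor + 1}.0"
    if change == "patch":
        # Bug fixes with no surface change.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")
```

For example, adding a new subcommand would take the baseline to `bump(CLIENT_VERSION, "minor")`, which yields "0.2.0".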

Open follow-ups

  1. Refactor the OpenClaw helper to shell out to the shared client. Cross-repo coordination, not blocking anything in AtoCore itself.
  2. Add memory-review subcommands when the Phase 9 review workflow has been exercised in real use.
  3. Add backup admin subcommands if and when we decide the shared client should be the canonical backup operator interface (with fail-open disabled for admin commands).
  4. Add engineering-layer entity subcommands as part of the engineering V1 implementation sprint, per engineering-v1-acceptance.md.
  5. Tag a CLIENT_VERSION constant the next time the shared client surface meaningfully changes. Today's surface is the v0.1.0 baseline.

TL;DR

  • AtoCore HTTP API is the universal interface
  • scripts/atocore_client.py is the canonical shared Python backbone for stable AtoCore operations
  • Per-agent frontends (Claude Code slash command, OpenClaw helper, future Codex skill, future MCP server) are thin wrappers that shell out to the shared client
  • The shared client today covers project lifecycle, ingestion, retrieval, context build, project-state, and retrieval audit
  • Memory-review, backup-admin, and engineering-entity commands are deferred (workflow not yet exercised, fail-open would hide errors, and V1 not yet implemented, respectively)
  • The OpenClaw helper is currently a parallel implementation and the refactor to the shared client is a queued follow-up
  • New LLM clients should never reimplement HTTP calls — they follow the shell-out pattern documented here