Anto01 fad30d5461 feat(client): Phase 9 reflection loop surface in shared operator CLI

Codex's sequence step 3: finish the Phase 9 operator surface in the
shared client. The previous client version (0.1.0) covered stable
operations (project lifecycle, retrieval, context build, trusted
state, audit-query) but explicitly deferred capture/extract/queue/
promote/reject pending "exercised workflow". That deferral ran
into a bootstrap problem: real Claude Code sessions can't exercise
the Phase 9 loop without a usable client surface to drive it. This
commit ships the 8 missing subcommands so the next step (real
validation on Dalidou) is unblocked.

Bumps CLIENT_VERSION from 0.1.0 to 0.2.0 per the semver rules in
llm-client-integration.md (new subcommands = minor bump).

New subcommands in scripts/atocore_client.py
--------------------------------------------
| Subcommand            | Endpoint                                  |
|-----------------------|-------------------------------------------|
| capture               | POST /interactions                        |
| extract               | POST /interactions/{id}/extract           |
| reinforce-interaction | POST /interactions/{id}/reinforce         |
| list-interactions     | GET  /interactions                        |
| get-interaction       | GET  /interactions/{id}                   |
| queue                 | GET  /memory?status=candidate             |
| promote               | POST /memory/{id}/promote                 |
| reject                | POST /memory/{id}/reject                  |

Each follows the existing client style: positional arguments with
empty-string defaults for optional filters, truthy-string arguments
for booleans (matching the existing refresh-project pattern), JSON
output via print_json(), fail-open behavior inherited from
request().

capture accepts prompt + response + project + client + session_id +
reinforce as positionals, defaulting the client field to
"atocore-client" when omitted so every capture from the shared
client is identifiable in the interactions audit trail.

extract defaults to preview mode (persist=false). Pass "true" as
the second positional to create candidate memories.
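
A minimal sketch of how the truthy-string convention could map onto the
extract request body. The accepted truthy values, the helper names, and
the {"persist": ...} body shape are assumptions for illustration, not
taken from scripts/atocore_client.py:

```python
# Assumed truthy set; the real client's list may differ.
TRUTHY = {"1", "true", "yes", "on"}


def parse_bool(value: str) -> bool:
    """Interpret a positional string argument as a boolean flag."""
    return value.strip().lower() in TRUTHY


def extract_body(persist: str = "") -> dict:
    """Build a hypothetical JSON body for POST /interactions/{id}/extract.

    An omitted (empty-string) second positional means preview mode.
    """
    return {"persist": parse_bool(persist)}


print(extract_body())        # {'persist': False}
print(extract_body("true"))  # {'persist': True}
```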

list-interactions and queue build URL query strings with
url-encoded values and always include the limit, matching how the
existing context-build subcommand handles its parameters.
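
The query-string construction described above might look roughly like
this. build_query is a hypothetical helper name and the exact set of
filter parameters is an assumption; only the "drop empty filters,
always send limit, url-encode values" behavior comes from this commit:

```python
from urllib.parse import urlencode


def build_query(path: str, limit, **filters: str) -> str:
    """Return path plus a URL-encoded query string.

    Empty-string filters are dropped (the client's convention for
    "not provided"); limit is always included and coerced to int.
    """
    params = {key: value for key, value in filters.items() if value != ""}
    params["limit"] = int(limit)
    return f"{path}?{urlencode(params)}"


print(build_query("/interactions", "50", project="", since="2026-04-08T12:00:00Z"))
# /interactions?since=2026-04-08T12%3A00%3A00Z&limit=50
```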

Security fix: ID-field URL encoding
-----------------------------------
The initial draft used urllib.parse.quote() with the default safe
set, which does NOT encode "/" because it's a reserved path
character. That's a security footgun on ID fields: passing
"promote mem/evil/action" would build /memory/mem/evil/action/promote
and hit a completely different endpoint than intended.

Fixed by passing safe="" to urllib.parse.quote() on every ID field
(interaction_id and memory_id). The tests cover this explicitly via
test_extract_url_encodes_interaction_id and test_promote_url_encodes_memory_id,
both of which would have failed with the default behavior.

Project names keep the default quote behavior because a project
name with a slash would already be broken elsewhere in the system
(ingest root resolution, file paths, etc).
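
The difference between the two quote() calls can be reproduced with the
standard library alone. The f-string path construction is an
illustrative assumption about how the client assembles its URLs:

```python
from urllib.parse import quote

memory_id = "mem/evil/action"

# Default safe set is "/", so slashes survive and change the route.
unsafe_path = f"/memory/{quote(memory_id)}/promote"

# safe="" encodes "/" as %2F, keeping the ID inside one path segment.
safe_path = f"/memory/{quote(memory_id, safe='')}/promote"

print(unsafe_path)  # /memory/mem/evil/action/promote
print(safe_path)    # /memory/mem%2Fevil%2Faction/promote
```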

tests/test_atocore_client.py (new, 18 tests, all green)
-------------------------------------------------------
A dedicated test file for the shared client that mocks the
request() helper and verifies each subcommand:
- calls the correct HTTP method and path
- builds the correct JSON body (or query string)
- passes the right subset of CLI arguments through
- URL-encodes ID fields so path traversal isn't possible

Tests are structured as unit tests (not integration tests) because
the API surface on the server side already has its own route tests
in test_api_storage.py and the Phase 9 specific files. These tests
are the wiring contract between CLI args and HTTP calls.

Test file highlights:
- capture: default values, custom client, reinforce=false
- extract: preview by default, persist=true opt-in, URL encoding
- reinforce-interaction: correct path construction
- list-interactions: no filters, single filter, full filter set
  (including ISO 8601 since parameter with T separator and Z)
- get-interaction: fetch by id
- queue: always filters status=candidate, accepts memory_type
  and project, coerces limit to int
- promote / reject: correct path + URL encoding
- test_phase9_full_loop_via_client_shape: end-to-end sequence
  that drives capture -> extract preview -> extract persist ->
  queue list -> promote -> reject through the shared client and
  verifies the exact sequence of HTTP calls that would be made

These tests run in ~0.2s because they mock request() — no DB, no
Chroma, no HTTP. The fast feedback loop matters because the
client surface is what every agent integration eventually depends
on.
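
For readers unfamiliar with the pattern, here is a sketch of what one
such wiring test could look like. The promote() function below is a
hypothetical stand-in, the (method, path) signature of request() is an
assumption, and the fake is injected as a parameter to keep the example
self-contained (the real tests mock the request() helper instead):

```python
from unittest.mock import Mock
from urllib.parse import quote


def promote(memory_id: str, request):
    """Hypothetical stand-in for the client's promote subcommand."""
    return request("POST", f"/memory/{quote(memory_id, safe='')}/promote")


def test_promote_url_encodes_memory_id():
    # No DB, no Chroma, no HTTP: the fake records the call it receives.
    fake_request = Mock(return_value={"status": "ok"})

    result = promote("mem/evil/action", fake_request)

    # The wiring contract: exact method and exact, fully encoded path.
    fake_request.assert_called_once_with(
        "POST", "/memory/mem%2Fevil%2Faction/promote"
    )
    assert result == {"status": "ok"}


test_promote_url_encodes_memory_id()
```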

docs/architecture/llm-client-integration.md updates
---------------------------------------------------
- New "Phase 9 reflection loop (shipped after migration safety
  work)" section under "What's in scope for the shared client
  today" with the full 8-subcommand table and a note explaining
  the bootstrap-problem rationale
- Removed the "Memory review queue and reflection loop" section
  from "What's intentionally NOT in scope today"; backup admin
  and engineering-entity commands remain the only deferred
  families
- Renumbered the deferred-commands list (was 3 items, now 2)
- Open follow-ups updated: memory-review-subcommand item replaced
  with "real-usage validation of the Phase 9 loop" as the next
  concrete dependency
- TL;DR updated to list the reflection-loop subcommands
- Versioning note records the v0.1.0 -> v0.2.0 bump with the
  subcommands included

Full suite: 215 passing (was 197), 1 warning. The +18 is
tests/test_atocore_client.py. Runtime unchanged because the new
tests don't touch the DB.

What this commit does NOT do
----------------------------
- Does NOT change the server-side endpoints. All 8 subcommands
  call existing API routes that were shipped in Phase 9 Commits
  A/B/C. This is purely a client-side wiring commit.
- Does NOT run the reflection loop against the live Dalidou
  instance. That's the next concrete step and is explicitly
  called out in the open-follow-ups section of the updated doc.
- Does NOT modify the Claude Code slash command. It still pulls
  context only; the capture/extract/queue/promote companion
  commands (e.g. /atocore-record-response) are deferred until the
  capture workflow has been exercised in real use at least once.
- Does NOT refactor the OpenClaw helper. That's a cross-repo
  change and remains a queued follow-up, now unblocked by the
  shared client having the reflection-loop subcommands.

2026-04-08 16:09:42 -04:00

LLM Client Integration (the layering)

Why this document exists

AtoCore must be reachable from many different LLM client contexts:

- OpenClaw on the T420 (already integrated via the read-only helper skill at /home/papa/clawd/skills/atocore-context/)
- Claude Code on the laptop (via the slash command shipped in this repo at .claude/commands/atocore-context.md)
- Codex sessions (future)
- Direct API consumers: scripts, Python code, ad-hoc curl
- The eventual MCP server when it's worth building

Without an explicit layering rule, every new client tends to reimplement the same routing logic (project detection, context build, retrieval audit, project-state inspection) in slightly different ways. That is exactly what almost happened in the first draft of the Claude Code slash command, which started as a curl + jq script that duplicated capabilities the existing operator client already had.

This document defines the layering so future clients don't repeat that mistake.

The layering

Three layers, top to bottom:

        +----------------------------------------------------+
        |  Per-agent thin frontends                          |
        |                                                    |
        |  - Claude Code slash command                       |
        |    (.claude/commands/atocore-context.md)           |
        |  - OpenClaw helper skill                           |
        |    (/home/papa/clawd/skills/atocore-context/)      |
        |  - Codex skill (future)                            |
        |  - MCP server (future)                             |
        +----------------------------------------------------+
                              |
                              | shells out to / imports
                              v
        +----------------------------------------------------+
        |  Shared operator client                            |
        |  scripts/atocore_client.py                         |
        |                                                    |
        |  - subcommands for stable AtoCore operations       |
        |  - fail-open on network errors                     |
        |  - consistent JSON output across all subcommands   |
        |  - environment-driven configuration                |
        |    (ATOCORE_BASE_URL, ATOCORE_TIMEOUT_SECONDS,     |
        |     ATOCORE_REFRESH_TIMEOUT_SECONDS,               |
        |     ATOCORE_FAIL_OPEN)                             |
        +----------------------------------------------------+
                              |
                              | HTTP
                              v
        +----------------------------------------------------+
        |  AtoCore HTTP API                                  |
        |  src/atocore/api/routes.py                         |
        |                                                    |
        |  - the universal interface to AtoCore              |
        |  - everything else above is glue                   |
        +----------------------------------------------------+
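
As an illustration of the middle layer's environment-driven configuration, the sketch below reads the four variables named in the diagram. The default values, the ClientConfig and load_config names, and the set of strings treated as true are placeholders chosen for the example, not the values used by scripts/atocore_client.py:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientConfig:
    base_url: str
    timeout_seconds: float
    refresh_timeout_seconds: float
    fail_open: bool


def load_config(env=None) -> ClientConfig:
    """Read client settings from the environment.

    All defaults below are placeholder assumptions for this sketch.
    """
    env = os.environ if env is None else env
    fail_open_raw = env.get("ATOCORE_FAIL_OPEN", "true")
    return ClientConfig(
        base_url=env.get("ATOCORE_BASE_URL", "http://localhost:8000").rstrip("/"),
        timeout_seconds=float(env.get("ATOCORE_TIMEOUT_SECONDS", "30")),
        refresh_timeout_seconds=float(
            env.get("ATOCORE_REFRESH_TIMEOUT_SECONDS", "1800")
        ),
        fail_open=fail_open_raw.strip().lower() in {"1", "true", "yes", "on"},
    )
```

Passing a plain dict as env keeps the function easy to exercise without touching the real process environment.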

The non-negotiable rules

These rules are what make the layering work.

Rule 1 — every per-agent frontend is a thin wrapper

A per-agent frontend exists to do exactly two things:

1. Translate the agent platform's command/skill format into an invocation of the shared client (or a small sequence of them)
2. Render the JSON response into whatever shape the agent platform wants (markdown for Claude Code, plaintext for OpenClaw, MCP tool result for an MCP server, etc.)

Everything else — talking to AtoCore, project detection, retrieval audit, fail-open behavior, configuration — is the shared client's job.

If a per-agent frontend grows logic beyond the two responsibilities above, that logic is in the wrong place. It belongs in the shared client where every other frontend gets to use it.

Rule 2 — the shared client never duplicates the API

The shared client is allowed to compose API calls (e.g. auto-context calls detect-project then context-build), but it never reimplements API logic. If a useful operation can't be expressed via the existing API endpoints, the right fix is to extend the API, not to embed the logic in the client.

This rule keeps the API as the single source of truth for what AtoCore can do.

Rule 3 — the shared client only exposes stable operations

A subcommand only makes it into the shared client when:

- the API endpoint behind it has been exercised by at least one real workflow
- the request and response shapes are unlikely to change
- the operation is one that more than one frontend will plausibly want

This rule keeps the client surface stable so frontends don't have to chase changes. New endpoints land in the API first, get exercised in real use, and only then get a client subcommand.

What's in scope for the shared client today

The currently shipped scope (per scripts/atocore_client.py):

Stable operations (shipped since the client was introduced)

| Subcommand | Purpose | API endpoint(s) |
|---|---|---|
| `health` | service status, mount + source readiness | `GET /health` |
| `sources` | enabled source roots and their existence | `GET /sources` |
| `stats` | document/chunk/vector counts | `GET /stats` |
| `projects` | registered projects | `GET /projects` |
| `project-template` | starter shape for a new project | `GET /projects/template` |
| `propose-project` | preview a registration | `POST /projects/proposal` |
| `register-project` | persist a registration | `POST /projects/register` |
| `update-project` | update an existing registration | `PUT /projects/{name}` |
| `refresh-project` | re-ingest a project's roots | `POST /projects/{name}/refresh` |
| `project-state` | list trusted state for a project | `GET /project/state/{name}` |
| `project-state-set` | curate trusted state | `POST /project/state` |
| `project-state-invalidate` | supersede trusted state | `DELETE /project/state` |
| `query` | raw retrieval | `POST /query` |
| `context-build` | full context pack | `POST /context/build` |
| `auto-context` | detect-project then context-build | composes `/projects` + `/context/build` |
| `detect-project` | match a prompt to a registered project | composes `/projects` + local regex |
| `audit-query` | retrieval-quality audit with classification | composes `/query` + local labelling |
| `debug-context` | last context pack inspection | `GET /debug/context` |
| `ingest-sources` | ingest configured source dirs | `POST /ingest/sources` |

Phase 9 reflection loop (shipped after migration safety work)

These were explicitly deferred in earlier versions of this doc pending "exercised workflow". The constraint was real — premature API freeze would have made it harder to iterate on the ergonomics — but the deferral ran into a bootstrap problem: you can't exercise the workflow in real Claude Code sessions without a usable client surface to drive it from. The fix is to ship a minimal Phase 9 surface now and treat it as stable-but-refinable: adding new optional parameters is fine, renaming subcommands is not.

| Subcommand | Purpose | API endpoint(s) |
|---|---|---|
| `capture` | record one interaction round-trip | `POST /interactions` |
| `extract` | run the rule-based extractor (preview or persist) | `POST /interactions/{id}/extract` |
| `reinforce-interaction` | backfill reinforcement on an existing interaction | `POST /interactions/{id}/reinforce` |
| `list-interactions` | paginated list with filters | `GET /interactions` |
| `get-interaction` | fetch one interaction by id | `GET /interactions/{id}` |
| `queue` | list the candidate review queue | `GET /memory?status=candidate` |
| `promote` | move a candidate memory to active | `POST /memory/{id}/promote` |
| `reject` | mark a candidate memory invalid | `POST /memory/{id}/reject` |

All 8 Phase 9 subcommands have test coverage in tests/test_atocore_client.py via mocked request(), including an end-to-end test that drives the full capture → extract → queue → promote/reject cycle through the client.

Coverage summary

That covers everything in the "stable operations" set AND the full Phase 9 reflection loop: project lifecycle, ingestion, project-state curation, retrieval, context build, retrieval-quality audit, health and stats inspection, interaction capture, candidate extraction, candidate review queue.

What's intentionally NOT in scope today

Two families of operations remain deferred:

1. Backup and restore admin operations

Phase 9 Commit B shipped these endpoints:

- POST /admin/backup (with include_chroma)
- GET /admin/backup (list)
- GET /admin/backup/{stamp}/validate

The backup endpoints are stable, but the documented operational procedure (docs/backup-restore-procedure.md) intentionally uses direct curl rather than the shared client. The reason is that backup operations are administrative and benefit from being explicit about which instance they're targeting, with no fail-open behavior. The shared client's fail-open default would hide a real backup failure.

If we later decide to add backup commands to the shared client, they would set ATOCORE_FAIL_OPEN=false for the duration of the call so the operator gets a real error on failure rather than a silent fail-open envelope.
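
One way to scope that override to a single call is a small context manager. This is only a sketch: fail_closed is a hypothetical name, and it assumes the shared client reads ATOCORE_FAIL_OPEN at call time rather than once at import:

```python
import os
from contextlib import contextmanager


@contextmanager
def fail_closed():
    """Force ATOCORE_FAIL_OPEN=false inside the block, then restore it."""
    previous = os.environ.get("ATOCORE_FAIL_OPEN")
    os.environ["ATOCORE_FAIL_OPEN"] = "false"
    try:
        yield
    finally:
        # Restore the operator's original setting, including "unset".
        if previous is None:
            os.environ.pop("ATOCORE_FAIL_OPEN", None)
        else:
            os.environ["ATOCORE_FAIL_OPEN"] = previous


with fail_closed():
    # A backup call made here would surface real errors.
    print(os.environ["ATOCORE_FAIL_OPEN"])  # false
```

Restoring in a finally block means a failed backup call still leaves the environment as the operator had it.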

2. Engineering layer entity operations

The engineering layer is in planning, not implementation. When V1 ships per engineering-v1-acceptance.md, the shared client will gain entity, relationship, conflict, and Mirror commands. None of those exist as stable contracts yet, so they are not in the shared client today.

How a new agent platform integrates

When a new LLM client needs AtoCore (e.g. Codex, ChatGPT custom GPT, a Cursor extension), the integration recipe is:

1. Don't reimplement. Don't write a new HTTP client. Use the shared client.
2. Write a thin frontend that translates the platform's command/skill format into a shell call to python scripts/atocore_client.py <subcommand> <args...>.
3. Render the JSON response in the platform's preferred shape.
4. Inherit fail-open and env-var behavior from the shared client. Don't override unless the platform explicitly needs to (e.g. an admin tool that wants to see real errors).
5. If a needed capability is missing, propose adding it to the shared client. If the underlying API endpoint also doesn't exist, propose adding it to the API first. Don't add the logic to your frontend.

The Claude Code slash command in this repo is a worked example: ~50 lines of markdown that does argument parsing, calls the shared client, and renders the result. It contains zero AtoCore business logic of its own.
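
For a platform whose frontend is written in Python, the same recipe could be sketched as below. run_client and render_markdown are hypothetical names, the markdown layout is arbitrary, and the sketch assumes the client prints a single JSON object on stdout:

```python
import json
import subprocess
import sys


def run_client(subcommand: str, *args: str, runner=subprocess.run) -> dict:
    """Shell out to the shared client and parse its JSON output."""
    completed = runner(
        [sys.executable, "scripts/atocore_client.py", subcommand, *args],
        capture_output=True,
        text=True,
        check=False,  # fail-open handling is left to the shared client
    )
    return json.loads(completed.stdout)


def render_markdown(payload: dict) -> str:
    """Render the response in the shape this (hypothetical) platform wants."""
    lines = ["## AtoCore response", ""]
    for key, value in payload.items():
        lines.append(f"- **{key}**: {value}")
    return "\n".join(lines)
```

The frontend holds only translation and rendering; the runner parameter exists so the shell-out can be replaced with a fake in tests.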

How OpenClaw fits

OpenClaw's helper skill at /home/papa/clawd/skills/atocore-context/ on the T420 currently has its own implementation of auto-context, detect-project, and the project lifecycle commands. It predates this layering doc.

The right long-term shape is to refactor the OpenClaw helper to shell out to the shared client instead of duplicating the routing logic. This isn't urgent because:

- OpenClaw's helper works today and is in active use
- The duplication is on the OpenClaw side; AtoCore itself is not affected
- The shared client and the OpenClaw helper are in different repos (AtoCore vs OpenClaw clawd), so the refactor is a cross-repo coordination

The refactor is queued as a follow-up. Until then, the OpenClaw helper and the Claude Code slash command are parallel implementations of the same idea. The shared client is the canonical backbone going forward; new clients should follow the shared-client pattern even though the existing OpenClaw helper still has its own implementation.

How this connects to the master plan

| Layer | Phase home | Status |
|---|---|---|
| AtoCore HTTP API | Phases 0/0.5/1/2/3/5/7/9 | shipped |
| Shared operator client (`scripts/atocore_client.py`) | implicitly Phase 8 (OpenClaw integration) infrastructure | shipped via codex/port-atocore-ops-client merge |
| OpenClaw helper skill (T420) | Phase 8 (partial) | shipped (own implementation, refactor queued) |
| Claude Code slash command (this repo) | precursor to Phase 11 (multi-model) | shipped (refactored to use the shared client) |
| Codex skill | Phase 11 | future |
| MCP server | Phase 11 | future |
| Web UI / dashboard | Phase 11+ | future |

The shared client is the substrate Phase 11 will build on. Every new client added in Phase 11 should be a thin frontend on the shared client, not a fresh reimplementation.

Versioning and stability

The shared client's subcommand surface is stable. Adding new subcommands is non-breaking. Changing or removing existing subcommands is breaking and would require a coordinated update of every frontend that depends on them.

The shared client records its version in a CLIENT_VERSION constant in scripts/atocore_client.py; the contract it versions is "the subcommands and JSON shapes documented in this file". The stable-ops-only surface was v0.1.0, and adding the Phase 9 reflection-loop subcommands made it v0.2.0. When the client surface meaningfully changes, bump the constant per semver:

- patch: bug fixes, no surface change
- minor: new subcommands or new optional fields
- major: removed subcommands, renamed fields, changed defaults
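
Applied mechanically, these rules amount to the following. The bump helper is a hypothetical illustration, not part of the shared client:

```python
def bump(version: str, change: str) -> str:
    """Return the next version string for a 'patch', 'minor' or 'major' change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":   # removed subcommands, renamed fields, changed defaults
        return f"{major + 1}.0.0"
    if change == "minor":   # new subcommands or new optional fields
        return f"{major}.{minor + 1}.0"
    if change == "patch":   # bug fixes, no surface change
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")


print(bump("0.1.0", "minor"))  # 0.2.0
```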

Open follow-ups

- Refactor the OpenClaw helper to shell out to the shared client. Cross-repo coordination, not blocking anything in AtoCore itself. With the Phase 9 subcommands now in the shared client, the OpenClaw refactor can reuse all the reflection-loop work instead of duplicating it.
- Real-usage validation of the Phase 9 loop, now that the client surface exists. First capture → extract → review cycle against the live Dalidou instance, likely via the Claude Code slash command flow. Findings feed back into subcommand refinement (new optional flags are fine, renames require a major semver bump).
- Add backup admin subcommands if and when we decide the shared client should be the canonical backup operator interface (with fail-open disabled for admin commands).
- Add engineering-layer entity subcommands as part of the engineering V1 implementation sprint, per engineering-v1-acceptance.md.
- Bump CLIENT_VERSION the next time the shared client surface meaningfully changes. Today's surface with the Phase 9 loop added is the v0.2.0 baseline (v0.1.0 was the stable-ops-only version).

TL;DR

- AtoCore HTTP API is the universal interface
- scripts/atocore_client.py is the canonical shared Python backbone for stable AtoCore operations
- Per-agent frontends (Claude Code slash command, OpenClaw helper, future Codex skill, future MCP server) are thin wrappers that shell out to the shared client
- The shared client today covers project lifecycle, ingestion, retrieval, context build, project-state, retrieval audit, AND the full Phase 9 reflection loop (capture / extract / reinforce / list / get / queue / promote / reject)
- Backup admin and engineering-entity commands remain deferred
- The OpenClaw helper is currently a parallel implementation and the refactor to the shared client is a queued follow-up
- New LLM clients should never reimplement HTTP calls; they follow the shell-out pattern documented here
