ATOCore/docs/universal-consumption.md

# Universal Consumption — Connecting LLM Clients to AtoCore

Phase 1 of the Master Brain plan. Every LLM interaction across the ecosystem
pulls context from AtoCore automatically, without the user or agent having
to remember to ask for it.

## Architecture

```
                 ┌─────────────────────┐
                 │  AtoCore HTTP API   │  ← single source of truth
                 │  http://dalidou:8100│
                 └──────────┬──────────┘
                            │
       ┌────────────────────┼────────────────────┐
       │                    │                    │
   ┌───┴────┐         ┌─────┴────┐         ┌────┴────┐
   │  MCP   │         │ OpenClaw │         │  HTTP   │
   │ server │         │  plugin  │         │  proxy  │
   └───┬────┘         └──────┬───┘         └────┬────┘
       │                     │                   │
   Claude/Cursor/         OpenClaw            Codex/Ollama/
   Zed/Windsurf                                any OpenAI-compat client
```

Three adapters, one HTTP backend. Each adapter is a thin passthrough — no
business logic duplicated.

---

## Adapter 1: MCP Server (Claude Desktop, Claude Code, Cursor, Zed, Windsurf)

The MCP server is `scripts/atocore_mcp.py` — stdlib-only Python, stdio
transport, wraps the HTTP API. Claude-family clients see AtoCore as built-in
tools just like `Read` or `Bash`.

### Tools exposed

- **`atocore_context`** (most important): Full context pack for a query —
  Trusted Project State + memories + retrieved chunks. Use at the start of
  any project-related conversation to ground it.
- **`atocore_search`**: Semantic search over ingested documents (top-K chunks).
- **`atocore_memory_list`**: List active memories, filterable by project + type.
- **`atocore_memory_create`**: Propose a candidate memory (enters triage queue).
- **`atocore_project_state`**: Get Trusted Project State entries by category.
- **`atocore_projects`**: List registered projects + aliases.
- **`atocore_health`**: Service status check.

### Registration

#### Claude Code (CLI)
```bash
claude mcp add atocore -- python C:/Users/antoi/ATOCore/scripts/atocore_mcp.py
claude mcp list    # verify: "atocore ... ✓ Connected"
```

#### Claude Desktop (GUI)
Edit `~/Library/Application Support/Claude/claude_desktop_config.json`
(macOS) or `%APPDATA%\Claude\claude_desktop_config.json` (Windows):

```json
{
  "mcpServers": {
    "atocore": {
      "command": "python",
      "args": ["C:/Users/antoi/ATOCore/scripts/atocore_mcp.py"],
      "env": {
        "ATOCORE_URL": "http://dalidou:8100"
      }
    }
  }
}
```
Restart Claude Desktop.

#### Cursor / Zed / Windsurf
Similar JSON config in each tool's MCP settings. Consult their docs —
the config schema is standard MCP.

### Configuration

Environment variables the MCP server honors:

| Var | Default | Purpose |
|---|---|---|
| `ATOCORE_URL` | `http://dalidou:8100` | Where to reach AtoCore |
| `ATOCORE_TIMEOUT` | `10` | Per-request HTTP timeout (seconds) |

### Behavior

- Fail-open: if Dalidou is unreachable, tools return "AtoCore unavailable"
  error messages but don't crash the client.
- Zero business logic: every tool is a direct HTTP passthrough.
- stdlib only: no MCP SDK dependency.

---

## Adapter 2: OpenClaw Plugin (`openclaw-plugins/atocore-capture/handler.js`)

The plugin on T420 OpenClaw has two responsibilities:

1. **CAPTURE**: On `before_agent_start` + `llm_output`, POST completed turns
   to AtoCore `/interactions` (existing).
2. **PULL**: On `before_prompt_build`, call `/context/build` and inject the
   context pack via `prependContext` so the agent's system prompt includes
   AtoCore knowledge.

### Deployment

The plugin is loaded from
`/tmp/atocore-openclaw-capture-plugin/openclaw-plugins/atocore-capture/`
on the T420 (per OpenClaw's plugin config at `~/.openclaw/openclaw.json`).

To update:
```bash
scp openclaw-plugins/atocore-capture/handler.js \
    papa@192.168.86.39:/tmp/atocore-openclaw-capture-plugin/openclaw-plugins/atocore-capture/index.js
ssh papa@192.168.86.39 'systemctl --user restart openclaw-gateway'
```

Verify in gateway logs: look for "ready (7 plugins: acpx, atocore-capture, ...)"

### Configuration (env vars set on T420)

| Var | Default | Purpose |
|---|---|---|
| `ATOCORE_BASE_URL` | `http://dalidou:8100` | AtoCore HTTP endpoint |
| `ATOCORE_PULL_DISABLED` | (unset) | Set to `1` to disable context pull |

### Behavior

- Fail-open: AtoCore unreachable = no injection, no capture, agent runs
  normally.
- 6s timeout on context pull, 10s on capture — won't stall the agent.
- Context pack prepended as a clearly-bracketed block so the agent can see
  it's auto-injected grounding info.

---

## Adapter 3: HTTP Proxy (`scripts/atocore_proxy.py`)

A stdlib-only OpenAI-compatible HTTP proxy. Sits between any
OpenAI-API-speaking client and the real provider, enriches every
`/chat/completions` request with AtoCore context.

Works with:
- **Codex CLI** (OpenAI-compatible endpoint)
- **Ollama** (has OpenAI-compatible `/v1` endpoint since 0.1.24)
- **LiteLLM**, **llama.cpp server**, custom agents
- Anything that can be pointed at a custom base URL

### Start it

```bash
# For Ollama (local models):
ATOCORE_UPSTREAM=http://localhost:11434/v1 \
  python scripts/atocore_proxy.py

# For OpenAI cloud:
ATOCORE_UPSTREAM=https://api.openai.com/v1 \
  ATOCORE_CLIENT_LABEL=codex \
  python scripts/atocore_proxy.py

# Test:
curl http://127.0.0.1:11435/healthz
```

### Point a client at it

Set the client's OpenAI base URL to `http://127.0.0.1:11435/v1`.

#### Ollama example:
```bash
OPENAI_BASE_URL=http://127.0.0.1:11435/v1 \
  some-openai-client --model llama3:8b
```

#### Codex CLI:
Set `OPENAI_BASE_URL=http://127.0.0.1:11435/v1` in your codex config.

### Configuration

| Var | Default | Purpose |
|---|---|---|
| `ATOCORE_URL` | `http://dalidou:8100` | AtoCore HTTP endpoint |
| `ATOCORE_UPSTREAM` | (required) | Real provider base URL |
| `ATOCORE_PROXY_PORT` | `11435` | Proxy listen port |
| `ATOCORE_PROXY_HOST` | `127.0.0.1` | Proxy bind address |
| `ATOCORE_CLIENT_LABEL` | `proxy` | Client id in captures |
| `ATOCORE_INJECT` | `1` | Inject context (set `0` to disable) |
| `ATOCORE_CAPTURE` | `1` | Capture interactions (set `0` to disable) |

### Behavior

- GET requests (model listing etc) pass through unchanged
- POST to `/chat/completions` (or `/v1/chat/completions`) gets enriched:
  1. Last user message extracted as query
  2. AtoCore `/context/build` called with 6s timeout
  3. Pack injected as system message (or prepended to existing system)
  4. Enriched body forwarded to upstream
  5. After success, interaction POSTed to `/interactions` in background
- Fail-open: AtoCore unreachable = pass through without injection
- Streaming responses: currently buffered (not true stream). Good enough for
  most cases; can be upgraded later if needed.

### Running as a service

On Linux, create `~/.config/systemd/user/atocore-proxy.service`:
```ini
[Unit]
Description=AtoCore HTTP proxy

[Service]
Environment=ATOCORE_UPSTREAM=http://localhost:11434/v1
Environment=ATOCORE_CLIENT_LABEL=ollama
ExecStart=/usr/bin/python3 /path/to/scripts/atocore_proxy.py
Restart=on-failure

[Install]
WantedBy=default.target
```
Then: `systemctl --user enable --now atocore-proxy`

On Windows, register via Task Scheduler (similar pattern to backup task)
or use NSSM to install as a service.

---

## Verification Checklist

Fresh end-to-end test to confirm Phase 1 is working:

### For Claude Code (MCP)
1. Open a new Claude Code session (not this one).
2. Ask: "what do we know about p06 polisher's control architecture?"
3. Claude should invoke `atocore_context` or `atocore_project_state`
   on its own and answer grounded in AtoCore data.

### For OpenClaw (plugin pull)
1. Send a Discord message to OpenClaw: "what's the status on p04?"
2. Check T420 logs: `journalctl --user -u openclaw-gateway --since "1 min ago" | grep atocore-pull`
3. Expect: `atocore-pull:injected project=p04-gigabit chars=NNN`

### For proxy (any OpenAI-compat client)
1. Start proxy with appropriate upstream
2. Run a client query through it
3. Check stderr: `[atocore-proxy] inject: project=... chars=...`
4. Check `curl http://127.0.0.1:8100/interactions?client=proxy` — should
   show the captured turn

---

## Why not just MCP everywhere?

MCP is great for Claude-family clients but:
- Not supported natively by Codex CLI, Ollama, or OpenAI's own API
- No universal "attach MCP" mechanism in all LLM runtimes
- HTTP APIs are truly universal

HTTP API is the truth, each adapter is the thinnest possible shim for its
ecosystem. When new adapters are needed (Gemini CLI, Claude Code plugin
system, etc.), they follow the same pattern.

---

## Future enhancements

- **Streaming passthrough** in the proxy (currently buffered for simplicity)
- **Response grounding check**: parse assistant output for references to
  injected context, count reinforcement events
- **Per-client metrics** in the dashboard: how often each client pulls,
  context pack size, injection rate
- **Smart project detection**: today we use keyword matching; could use
  AtoCore's own project resolver endpoint