# Universal Consumption — Connecting LLM Clients to AtoCore

Phase 1 of the Master Brain plan. Every LLM interaction across the ecosystem
pulls context from AtoCore automatically, without the user or agent having
to remember to ask for it.

## Architecture

```
                 ┌─────────────────────┐
                 │  AtoCore HTTP API   │ ← single source of truth
                 │ http://dalidou:8100 │
                 └──────────┬──────────┘
                            │
        ┌───────────────────┼───────────────────┐
        │                   │                   │
    ┌───┴────┐         ┌────┴─────┐        ┌────┴────┐
    │  MCP   │         │ OpenClaw │        │  HTTP   │
    │ server │         │  plugin  │        │  proxy  │
    └───┬────┘         └────┬─────┘        └────┬────┘
        │                   │                   │
 Claude/Cursor/         OpenClaw          Codex/Ollama/
  Zed/Windsurf                       any OpenAI-compat client
```

Three adapters, one HTTP backend. Each adapter is a thin passthrough — no
business logic duplicated.

---

## Adapter 1: MCP Server (Claude Desktop, Claude Code, Cursor, Zed, Windsurf)

The MCP server is `scripts/atocore_mcp.py` — stdlib-only Python, stdio
transport, wraps the HTTP API. Claude-family clients see AtoCore as built-in
tools just like `Read` or `Bash`.

### Tools exposed

- **`atocore_context`** (most important): Full context pack for a query —
  Trusted Project State + memories + retrieved chunks. Use at the start of
  any project-related conversation to ground it.
- **`atocore_search`**: Semantic search over ingested documents (top-K chunks).
- **`atocore_memory_list`**: List active memories, filterable by project + type.
- **`atocore_memory_create`**: Propose a candidate memory (enters the triage queue).
- **`atocore_project_state`**: Get Trusted Project State entries by category.
- **`atocore_projects`**: List registered projects + aliases.
- **`atocore_health`**: Service status check.

### Registration

#### Claude Code (CLI)

```bash
claude mcp add atocore -- python C:/Users/antoi/ATOCore/scripts/atocore_mcp.py
claude mcp list   # verify: "atocore ... ✓ Connected"
```

#### Claude Desktop (GUI)

Edit `~/Library/Application Support/Claude/claude_desktop_config.json`
(macOS) or `%APPDATA%\Claude\claude_desktop_config.json` (Windows):

```json
{
  "mcpServers": {
    "atocore": {
      "command": "python",
      "args": ["C:/Users/antoi/ATOCore/scripts/atocore_mcp.py"],
      "env": {
        "ATOCORE_URL": "http://dalidou:8100"
      }
    }
  }
}
```

Restart Claude Desktop.

#### Cursor / Zed / Windsurf

Similar JSON config in each tool's MCP settings. Consult their docs —
the config schema is standard MCP.

### Configuration

Environment variables the MCP server honors:

| Var | Default | Purpose |
|---|---|---|
| `ATOCORE_URL` | `http://dalidou:8100` | Where to reach AtoCore |
| `ATOCORE_TIMEOUT` | `10` | Per-request HTTP timeout (seconds) |

### Behavior

- Fail-open: if Dalidou is unreachable, tools return "AtoCore unavailable"
  error messages but don't crash the client.
- Zero business logic: every tool is a direct HTTP passthrough.
- stdlib only: no MCP SDK dependency.

---

## Adapter 2: OpenClaw Plugin (`openclaw-plugins/atocore-capture/handler.js`)

The plugin on T420 OpenClaw has two responsibilities:

1. **CAPTURE**: On `before_agent_start` + `llm_output`, POST completed turns
   to AtoCore `/interactions` (existing).
2. **PULL**: On `before_prompt_build`, call `/context/build` and inject the
   context pack via `prependContext` so the agent's system prompt includes
   AtoCore knowledge.

### Deployment

The plugin is loaded from
`/tmp/atocore-openclaw-capture-plugin/openclaw-plugins/atocore-capture/`
on the T420 (per OpenClaw's plugin config at `~/.openclaw/openclaw.json`).

To update:

```bash
scp openclaw-plugins/atocore-capture/handler.js \
  papa@192.168.86.39:/tmp/atocore-openclaw-capture-plugin/openclaw-plugins/atocore-capture/index.js
ssh papa@192.168.86.39 'systemctl --user restart openclaw-gateway'
```

Verify in gateway logs: look for "ready (7 plugins: acpx, atocore-capture, ...)"

### Configuration (env vars set on T420)

| Var | Default | Purpose |
|---|---|---|
| `ATOCORE_BASE_URL` | `http://dalidou:8100` | AtoCore HTTP endpoint |
| `ATOCORE_PULL_DISABLED` | (unset) | Set to `1` to disable context pull |

### Behavior

- Fail-open: AtoCore unreachable = no injection, no capture, agent runs
  normally.
- 6s timeout on context pull, 10s on capture — won't stall the agent.
- Context pack prepended as a clearly-bracketed block so the agent can see
  it's auto-injected grounding info.

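The pull half of the plugin is JavaScript, but its fail-open logic is simple enough to sketch in Python. `fetch_pack` stands in for the HTTP call to `/context/build`, and the bracket wording is an assumption, not the plugin's literal output:

```python
def pull_context(prompt: str, fetch_pack, timeout: float = 6.0) -> str:
    """Return a bracketed context block to prepend, or "" if AtoCore is down."""
    try:
        pack = fetch_pack(prompt, timeout=timeout)
    except Exception:
        return ""  # fail-open: no injection, the agent runs normally
    if not pack:
        return ""
    return (
        "[AtoCore context -- auto-injected grounding info]\n"
        f"{pack}\n"
        "[end AtoCore context]"
    )
```

The key design point is that every failure path collapses to the empty string, so a dead AtoCore can never block an agent turn.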
---

## Adapter 3: HTTP Proxy (`scripts/atocore_proxy.py`)

A stdlib-only, OpenAI-compatible HTTP proxy. It sits between any
OpenAI-API-speaking client and the real provider and enriches every
`/chat/completions` request with AtoCore context.

Works with:

- **Codex CLI** (OpenAI-compatible endpoint)
- **Ollama** (has an OpenAI-compatible `/v1` endpoint since 0.1.24)
- **LiteLLM**, **llama.cpp server**, custom agents
- anything that can be pointed at a custom base URL

### Start it

```bash
# For Ollama (local models):
ATOCORE_UPSTREAM=http://localhost:11434/v1 \
  python scripts/atocore_proxy.py

# For OpenAI cloud:
ATOCORE_UPSTREAM=https://api.openai.com/v1 \
ATOCORE_CLIENT_LABEL=codex \
  python scripts/atocore_proxy.py

# Test:
curl http://127.0.0.1:11435/healthz
```

### Point a client at it

Set the client's OpenAI base URL to `http://127.0.0.1:11435/v1`.

#### Ollama example

```bash
OPENAI_BASE_URL=http://127.0.0.1:11435/v1 \
  some-openai-client --model llama3:8b
```

#### Codex CLI

Set `OPENAI_BASE_URL=http://127.0.0.1:11435/v1` in your codex config.

### Configuration

| Var | Default | Purpose |
|---|---|---|
| `ATOCORE_URL` | `http://dalidou:8100` | AtoCore HTTP endpoint |
| `ATOCORE_UPSTREAM` | (required) | Real provider base URL |
| `ATOCORE_PROXY_PORT` | `11435` | Proxy listen port |
| `ATOCORE_PROXY_HOST` | `127.0.0.1` | Proxy bind address |
| `ATOCORE_CLIENT_LABEL` | `proxy` | Client id in captures |
| `ATOCORE_INJECT` | `1` | Inject context (set `0` to disable) |
| `ATOCORE_CAPTURE` | `1` | Capture interactions (set `0` to disable) |

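A sketch of how the table above might be resolved at startup — illustrative only, not the script's actual parsing code; the dictionary keys are made up for the example:

```python
def load_config(env: dict) -> dict:
    """Resolve proxy settings from an environment mapping, with defaults."""
    cfg = {
        "atocore_url": env.get("ATOCORE_URL", "http://dalidou:8100"),
        "upstream": env.get("ATOCORE_UPSTREAM"),
        "port": int(env.get("ATOCORE_PROXY_PORT", "11435")),
        "host": env.get("ATOCORE_PROXY_HOST", "127.0.0.1"),
        "label": env.get("ATOCORE_CLIENT_LABEL", "proxy"),
        # Injection and capture default on; "0" turns them off.
        "inject": env.get("ATOCORE_INJECT", "1") != "0",
        "capture": env.get("ATOCORE_CAPTURE", "1") != "0",
    }
    if not cfg["upstream"]:
        raise SystemExit("ATOCORE_UPSTREAM is required")
    return cfg
```

Passing the environment in as a plain dict (rather than reading `os.environ` directly) keeps the resolution logic trivially testable.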
### Behavior

- GET requests (model listing, etc.) pass through unchanged
- POST to `/chat/completions` (or `/v1/chat/completions`) gets enriched:
  1. Last user message extracted as query
  2. AtoCore `/context/build` called with 6s timeout
  3. Pack injected as system message (or prepended to existing system)
  4. Enriched body forwarded to upstream
  5. After success, interaction POSTed to `/interactions` in background
- Fail-open: AtoCore unreachable = pass through without injection
- Streaming responses: currently buffered (not a true stream). Good enough for
  most cases; can be upgraded later if needed.

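Steps 1–3 of the enrichment can be sketched as a pure function over the request body. This is a simplified illustration of the described behavior, not code from `atocore_proxy.py`:

```python
def enrich(body: dict, pack: str) -> dict:
    """Inject a context pack into an OpenAI chat-completions request body."""
    msgs = list(body.get("messages", []))
    # Step 1: the last user message is the query (used to build the pack).
    query = next((m["content"] for m in reversed(msgs)
                  if m.get("role") == "user"), None)
    if query is None or not pack:
        return body  # nothing to ground: pass through unchanged
    # Step 3: new system message, or prepend to the existing one.
    if msgs and msgs[0].get("role") == "system":
        msgs[0] = {**msgs[0], "content": pack + "\n\n" + msgs[0]["content"]}
    else:
        msgs.insert(0, {"role": "system", "content": pack})
    return {**body, "messages": msgs}
```

Returning a new dict instead of mutating the original keeps the fail-open path trivial: on any AtoCore error the proxy simply forwards `body` as received.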
### Running as a service

On Linux, create `~/.config/systemd/user/atocore-proxy.service`:

```ini
[Unit]
Description=AtoCore HTTP proxy

[Service]
Environment=ATOCORE_UPSTREAM=http://localhost:11434/v1
Environment=ATOCORE_CLIENT_LABEL=ollama
ExecStart=/usr/bin/python3 /path/to/scripts/atocore_proxy.py
Restart=on-failure

[Install]
WantedBy=default.target
```

Then: `systemctl --user enable --now atocore-proxy`

On Windows, register via Task Scheduler (similar pattern to the backup task)
or use NSSM to install it as a service.

---

## Verification Checklist

Fresh end-to-end test to confirm Phase 1 is working:

### For Claude Code (MCP)

1. Open a new Claude Code session (not this one).
2. Ask: "what do we know about p06 polisher's control architecture?"
3. Claude should invoke `atocore_context` or `atocore_project_state`
   on its own and answer grounded in AtoCore data.

### For OpenClaw (plugin pull)

1. Send a Discord message to OpenClaw: "what's the status on p04?"
2. Check T420 logs: `journalctl --user -u openclaw-gateway --since "1 min ago" | grep atocore-pull`
3. Expect: `atocore-pull:injected project=p04-gigabit chars=NNN`

### For proxy (any OpenAI-compat client)

1. Start proxy with the appropriate upstream
2. Run a client query through it
3. Check stderr: `[atocore-proxy] inject: project=... chars=...`
4. Check `curl http://127.0.0.1:8100/interactions?client=proxy` — should
   show the captured turn

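For step 3 of the proxy check, a hypothetical helper can tally the inject lines from captured stderr. The log format is taken from the line quoted above; the function itself is not part of the shipped scripts:

```python
import re

# Matches the proxy's stderr inject line as quoted in the checklist.
INJECT_RE = re.compile(r"\[atocore-proxy\] inject: project=(\S+) chars=(\d+)")

def count_injections(stderr_text: str) -> dict:
    """Return {project: injection_count} from captured proxy stderr."""
    counts: dict = {}
    for project, _chars in INJECT_RE.findall(stderr_text):
        counts[project] = counts.get(project, 0) + 1
    return counts
```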
---

## Why not just MCP everywhere?

MCP is great for Claude-family clients, but:

- Not supported natively by Codex CLI, Ollama, or OpenAI's own API
- No universal "attach MCP" mechanism across LLM runtimes
- HTTP APIs are truly universal

The HTTP API is the source of truth; each adapter is the thinnest possible
shim for its ecosystem. When new adapters are needed (Gemini CLI, the Claude
Code plugin system, etc.), they follow the same pattern.

---

## Future enhancements

- **Streaming passthrough** in the proxy (currently buffered for simplicity)
- **Response grounding check**: parse assistant output for references to
  injected context, count reinforcement events
- **Per-client metrics** in the dashboard: how often each client pulls,
  context pack size, injection rate
- **Smart project detection**: today we use keyword matching; could use
  AtoCore's own project resolver endpoint
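The current keyword-matching approach from the last bullet can be sketched as a minimal alias lookup. The alias table here is illustrative; the real aliases live in AtoCore's project registry (see `atocore_projects`):

```python
def detect_project(query: str, aliases: dict) -> str:
    """Return the first project whose alias appears in the query, else ""."""
    q = query.lower()
    for project, words in aliases.items():
        if any(w.lower() in q for w in words):
            return project
    return ""
```

A resolver endpoint would replace this substring scan with AtoCore's own (presumably smarter) matching, so every adapter resolves projects identically.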