feat: universal LLM consumption (Phase 1 complete)
Completes the Phase 1 master brain keystone: every LLM interaction
across the ecosystem now pulls context from AtoCore automatically.
Three adapters, one HTTP backend:
1. OpenClaw plugin pull (handler.js):
- Added before_prompt_build hook that calls /context/build and
injects the pack via prependContext
- Existing capture hooks (before_agent_start + llm_output)
unchanged
- 6s context timeout, fail-open on AtoCore unreachable
- Deployed to T420, gateway restarted, "7 plugins loaded"
2. atocore-proxy (scripts/atocore_proxy.py):
- Stdlib-only OpenAI-compatible HTTP middleware
- Drop-in layer for Codex, Ollama, LiteLLM, any OpenAI-compat client
- Intercepts /chat/completions: extracts query, pulls context,
injects as system message, forwards to upstream, captures back
- Fail-open: AtoCore down = passthrough without injection
- Configurable via env: UPSTREAM, PORT, CLIENT_LABEL, INJECT, CAPTURE
3. (from prior commit c49363f) atocore-mcp:
- stdio MCP server, stdlib Python, 7 tools exposed
- Registered in Claude Code: "✓ Connected"
Plus quick win:
- Project synthesis moved from Sunday-only to daily cron so wiki /
mirror pages stay fresh (Step C in batch-extract.sh). Lint stays
weekly.
Plus docs:
- docs/universal-consumption.md: configuration guide for all 3 adapters
with registration/env-var tables and verification checklist
Plus housekeeping:
- .gitignore: add .mypy_cache/
Tests: 303/303 passing.
This closes the consumption gap: the reinforcement feedback loop
can now actually work (memories get injected → get referenced →
reinforcement fires → auto-promotion). Every Claude, OpenClaw,
Codex, or Ollama session is automatically AtoCore-grounded.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Universal Consumption — Connecting LLM Clients to AtoCore

Phase 1 of the Master Brain plan. Every LLM interaction across the ecosystem
pulls context from AtoCore automatically, without the user or agent having
to remember to ask for it.

## Architecture

```
              ┌─────────────────────┐
              │  AtoCore HTTP API   │ ← single source of truth
              │ http://dalidou:8100 │
              └──────────┬──────────┘
                         │
    ┌────────────────────┼────────────────────┐
    │                    │                    │
┌───┴────┐         ┌─────┴────┐          ┌────┴────┐
│  MCP   │         │ OpenClaw │          │  HTTP   │
│ server │         │  plugin  │          │  proxy  │
└───┬────┘         └──────┬───┘          └────┬────┘
    │                     │                   │
Claude/Cursor/         OpenClaw         Codex/Ollama/
Zed/Windsurf                     any OpenAI-compat client
```

Three adapters, one HTTP backend. Each adapter is a thin passthrough — no
business logic duplicated.

---

## Adapter 1: MCP Server (Claude Desktop, Claude Code, Cursor, Zed, Windsurf)

The MCP server is `scripts/atocore_mcp.py` — stdlib-only Python, stdio
transport, wraps the HTTP API. Claude-family clients see AtoCore as built-in
tools just like `Read` or `Bash`.

### Tools exposed

- **`atocore_context`** (most important): Full context pack for a query —
  Trusted Project State + memories + retrieved chunks. Use at the start of
  any project-related conversation to ground it.
- **`atocore_search`**: Semantic search over ingested documents (top-K chunks).
- **`atocore_memory_list`**: List active memories, filterable by project + type.
- **`atocore_memory_create`**: Propose a candidate memory (enters triage queue).
- **`atocore_project_state`**: Get Trusted Project State entries by category.
- **`atocore_projects`**: List registered projects + aliases.
- **`atocore_health`**: Service status check.

### Registration

#### Claude Code (CLI)

```bash
claude mcp add atocore -- python C:/Users/antoi/ATOCore/scripts/atocore_mcp.py
claude mcp list   # verify: "atocore ... ✓ Connected"
```

#### Claude Desktop (GUI)

Edit `~/Library/Application Support/Claude/claude_desktop_config.json`
(macOS) or `%APPDATA%\Claude\claude_desktop_config.json` (Windows):

```json
{
  "mcpServers": {
    "atocore": {
      "command": "python",
      "args": ["C:/Users/antoi/ATOCore/scripts/atocore_mcp.py"],
      "env": {
        "ATOCORE_URL": "http://dalidou:8100"
      }
    }
  }
}
```

Restart Claude Desktop.

#### Cursor / Zed / Windsurf

Similar JSON config in each tool's MCP settings. Consult their docs —
the config schema is standard MCP.

### Configuration

Environment variables the MCP server honors:

| Var | Default | Purpose |
|---|---|---|
| `ATOCORE_URL` | `http://dalidou:8100` | Where to reach AtoCore |
| `ATOCORE_TIMEOUT` | `10` | Per-request HTTP timeout (seconds) |

### Behavior

- Fail-open: if Dalidou is unreachable, tools return "AtoCore unavailable"
  error messages but don't crash the client.
- Zero business logic: every tool is a direct HTTP passthrough.
- stdlib only: no MCP SDK dependency.

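The fail-open pattern the tools use can be sketched in stdlib Python. This is a minimal illustration, assuming `/context/build` accepts a JSON POST with a `query` field; the real request shape lives in `scripts/atocore_mcp.py` and may differ:

```python
import json
import urllib.error
import urllib.request

ATOCORE_URL = "http://dalidou:8100"  # the real server reads ATOCORE_URL from env
TIMEOUT = 10                         # seconds, mirrors ATOCORE_TIMEOUT


def atocore_context(query: str, base: str = ATOCORE_URL, timeout: float = TIMEOUT) -> str:
    """Fetch a context pack; on any failure, return an error string instead of raising."""
    req = urllib.request.Request(
        f"{base}/context/build",
        data=json.dumps({"query": query}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.read().decode("utf-8")
    except (urllib.error.URLError, OSError) as exc:
        # Fail-open: report unavailability to the model, never crash the client.
        return f"AtoCore unavailable: {exc}"
```

The point is the `except` branch: the tool always returns a string, so the MCP client never sees a raised exception when Dalidou is down.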
---
## Adapter 2: OpenClaw Plugin (`openclaw-plugins/atocore-capture/handler.js`)

The plugin on T420 OpenClaw has two responsibilities:

1. **CAPTURE**: On `before_agent_start` + `llm_output`, POST completed turns
   to AtoCore `/interactions` (existing).
2. **PULL**: On `before_prompt_build`, call `/context/build` and inject the
   context pack via `prependContext` so the agent's system prompt includes
   AtoCore knowledge.

### Deployment

The plugin is loaded from
`/tmp/atocore-openclaw-capture-plugin/openclaw-plugins/atocore-capture/`
on the T420 (per OpenClaw's plugin config at `~/.openclaw/openclaw.json`).

To update:

```bash
scp openclaw-plugins/atocore-capture/handler.js \
  papa@192.168.86.39:/tmp/atocore-openclaw-capture-plugin/openclaw-plugins/atocore-capture/index.js
ssh papa@192.168.86.39 'systemctl --user restart openclaw-gateway'
```

Verify in gateway logs: look for "ready (7 plugins: acpx, atocore-capture, ...)"

### Configuration (env vars set on T420)

| Var | Default | Purpose |
|---|---|---|
| `ATOCORE_BASE_URL` | `http://dalidou:8100` | AtoCore HTTP endpoint |
| `ATOCORE_PULL_DISABLED` | (unset) | Set to `1` to disable context pull |

### Behavior

- Fail-open: AtoCore unreachable = no injection, no capture, agent runs
  normally.
- 6s timeout on context pull, 10s on capture — won't stall the agent.
- Context pack prepended as a clearly-bracketed block so the agent can see
  it's auto-injected grounding info.

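The injection behavior above can be illustrated in Python (the actual handler is JavaScript). The bracket delimiter text here is a placeholder, not the plugin's exact output:

```python
def wrap_context_pack(pack: str) -> str:
    """Bracket the pack so the agent can tell it is auto-injected grounding info."""
    return (
        "[BEGIN AtoCore auto-injected context]\n"
        f"{pack.strip()}\n"
        "[END AtoCore auto-injected context]"
    )


def prepend_context(prompt: str, pack: str) -> str:
    """What the plugin's prependContext call conceptually does."""
    if not pack:
        # Fail-open: no pack pulled, prompt goes through unchanged.
        return prompt
    return wrap_context_pack(pack) + "\n\n" + prompt
```

The empty-pack branch is the fail-open guarantee: a failed or disabled pull leaves the agent's prompt byte-for-byte untouched.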
---
## Adapter 3: HTTP Proxy (`scripts/atocore_proxy.py`)

A stdlib-only OpenAI-compatible HTTP proxy. It sits between any
OpenAI-API-speaking client and the real provider, and enriches every
`/chat/completions` request with AtoCore context.

Works with:

- **Codex CLI** (OpenAI-compatible endpoint)
- **Ollama** (has an OpenAI-compatible `/v1` endpoint since 0.1.24)
- **LiteLLM**, **llama.cpp server**, custom agents
- Anything that can be pointed at a custom base URL

### Start it

```bash
# For Ollama (local models):
ATOCORE_UPSTREAM=http://localhost:11434/v1 \
python scripts/atocore_proxy.py

# For OpenAI cloud:
ATOCORE_UPSTREAM=https://api.openai.com/v1 \
ATOCORE_CLIENT_LABEL=codex \
python scripts/atocore_proxy.py

# Test:
curl http://127.0.0.1:11435/healthz
```

### Point a client at it

Set the client's OpenAI base URL to `http://127.0.0.1:11435/v1`.

#### Ollama example:

```bash
OPENAI_BASE_URL=http://127.0.0.1:11435/v1 \
some-openai-client --model llama3:8b
```

#### Codex CLI:

Set `OPENAI_BASE_URL=http://127.0.0.1:11435/v1` in your codex config.

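From a script, talking to the proxy is an ordinary OpenAI-style request against its base URL. A minimal stdlib sketch that assembles such a request (no API key or error handling shown):

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, user_msg: str) -> urllib.request.Request:
    """Assemble an OpenAI-compatible /chat/completions request aimed at the proxy."""
    url = base_url.rstrip("/") + "/chat/completions"
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_msg}],
    }).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


# Usage (requires the proxy running on 11435):
# req = build_chat_request("http://127.0.0.1:11435/v1", "llama3:8b", "status of p04?")
# print(urllib.request.urlopen(req, timeout=60).read().decode())
```

Nothing about the request is proxy-specific; that is the point of the drop-in design.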
### Configuration

| Var | Default | Purpose |
|---|---|---|
| `ATOCORE_URL` | `http://dalidou:8100` | AtoCore HTTP endpoint |
| `ATOCORE_UPSTREAM` | (required) | Real provider base URL |
| `ATOCORE_PROXY_PORT` | `11435` | Proxy listen port |
| `ATOCORE_PROXY_HOST` | `127.0.0.1` | Proxy bind address |
| `ATOCORE_CLIENT_LABEL` | `proxy` | Client id in captures |
| `ATOCORE_INJECT` | `1` | Inject context (set `0` to disable) |
| `ATOCORE_CAPTURE` | `1` | Capture interactions (set `0` to disable) |

### Behavior

- GET requests (model listing, etc.) pass through unchanged
- POST to `/chat/completions` (or `/v1/chat/completions`) gets enriched:
  1. Last user message extracted as query
  2. AtoCore `/context/build` called with 6s timeout
  3. Pack injected as system message (or prepended to existing system)
  4. Enriched body forwarded to upstream
  5. After success, interaction POSTed to `/interactions` in background
- Fail-open: AtoCore unreachable = pass through without injection
- Streaming responses: currently buffered (not a true stream). Good enough for
  most cases; can be upgraded later if needed.

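Steps 1 and 3 of the enrichment reduce to a small transformation of the request body. A sketch with illustrative helper names, not the proxy's actual internals:

```python
def last_user_message(messages: list[dict]) -> str:
    """Step 1: the query is the most recent user turn."""
    for msg in reversed(messages):
        if msg.get("role") == "user":
            return msg.get("content", "")
    return ""


def inject_context(body: dict, pack: str) -> dict:
    """Step 3: add the pack as a system message, or prepend it to an existing one."""
    if not pack:
        return body  # fail-open: pass through without injection
    messages = list(body.get("messages", []))
    if messages and messages[0].get("role") == "system":
        messages[0] = {**messages[0],
                       "content": pack + "\n\n" + messages[0].get("content", "")}
    else:
        messages.insert(0, {"role": "system", "content": pack})
    return {**body, "messages": messages}
```

Prepending to an existing system message (rather than adding a second one) keeps clients that assume a single system turn working unchanged.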
### Running as a service

On Linux, create `~/.config/systemd/user/atocore-proxy.service`:

```ini
[Unit]
Description=AtoCore HTTP proxy

[Service]
Environment=ATOCORE_UPSTREAM=http://localhost:11434/v1
Environment=ATOCORE_CLIENT_LABEL=ollama
ExecStart=/usr/bin/python3 /path/to/scripts/atocore_proxy.py
Restart=on-failure

[Install]
WantedBy=default.target
```

Then: `systemctl --user enable --now atocore-proxy`

On Windows, register via Task Scheduler (similar pattern to backup task)
or use NSSM to install as a service.

---

## Verification Checklist

Fresh end-to-end test to confirm Phase 1 is working:

### For Claude Code (MCP)

1. Open a new Claude Code session (not this one).
2. Ask: "what do we know about p06 polisher's control architecture?"
3. Claude should invoke `atocore_context` or `atocore_project_state`
   on its own and answer grounded in AtoCore data.

### For OpenClaw (plugin pull)

1. Send a Discord message to OpenClaw: "what's the status on p04?"
2. Check T420 logs: `journalctl --user -u openclaw-gateway --since "1 min ago" | grep atocore-pull`
3. Expect: `atocore-pull:injected project=p04-gigabit chars=NNN`

### For proxy (any OpenAI-compat client)

1. Start the proxy with the appropriate upstream
2. Run a client query through it
3. Check stderr: `[atocore-proxy] inject: project=... chars=...`
4. Check `curl http://127.0.0.1:8100/interactions?client=proxy` — should
   show the captured turn

---

## Why not just MCP everywhere?

MCP is great for Claude-family clients, but:

- Not supported natively by Codex CLI, Ollama, or OpenAI's own API
- No universal "attach MCP" mechanism in all LLM runtimes
- HTTP APIs are truly universal

The HTTP API is the source of truth; each adapter is the thinnest possible
shim for its ecosystem. When new adapters are needed (Gemini CLI, Claude
Code plugin system, etc.), they follow the same pattern.

---

## Future enhancements

- **Streaming passthrough** in the proxy (currently buffered for simplicity)
- **Response grounding check**: parse assistant output for references to
  injected context, count reinforcement events
- **Per-client metrics** in the dashboard: how often each client pulls,
  context pack size, injection rate
- **Smart project detection**: today we use keyword matching; could use
  AtoCore's own project resolver endpoint