ATOCore/docs/universal-consumption.md
Anto01 86637f8eee feat: universal LLM consumption (Phase 1 complete)
Completes the Phase 1 master brain keystone: every LLM interaction
across the ecosystem now pulls context from AtoCore automatically.

Three adapters, one HTTP backend:

1. OpenClaw plugin pull (handler.js):
   - Added before_prompt_build hook that calls /context/build and
     injects the pack via prependContext
   - Existing capture hooks (before_agent_start + llm_output)
     unchanged
   - 6s context timeout, fail-open on AtoCore unreachable
   - Deployed to T420, gateway restarted, "7 plugins loaded"

2. atocore-proxy (scripts/atocore_proxy.py):
   - Stdlib-only OpenAI-compatible HTTP middleware
   - Drop-in layer for Codex, Ollama, LiteLLM, any OpenAI-compat client
   - Intercepts /chat/completions: extracts query, pulls context,
     injects as system message, forwards to upstream, captures back
   - Fail-open: AtoCore down = passthrough without injection
   - Configurable via env: UPSTREAM, PORT, CLIENT_LABEL, INJECT, CAPTURE

3. (from prior commit c49363f) atocore-mcp:
   - stdio MCP server, stdlib Python, 7 tools exposed
   - Registered in Claude Code: "✓ Connected"

Plus quick win:
- Project synthesis moved from Sunday-only to daily cron so wiki /
  mirror pages stay fresh (Step C in batch-extract.sh). Lint stays
  weekly.

Plus docs:
- docs/universal-consumption.md: configuration guide for all 3 adapters
  with registration/env-var tables and verification checklist

Plus housekeeping:
- .gitignore: add .mypy_cache/

Tests: 303/303 passing.

This closes the consumption gap: the reinforcement feedback loop
can now actually work (memories get injected → get referenced →
reinforcement fires → auto-promotion). Every Claude, OpenClaw,
Codex, or Ollama session is automatically AtoCore-grounded.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 20:14:25 -04:00


Universal Consumption — Connecting LLM Clients to AtoCore

Phase 1 of the Master Brain plan. Every LLM interaction across the ecosystem pulls context from AtoCore automatically, without the user or agent having to remember to ask for it.

Architecture

                 ┌─────────────────────┐
                 │  AtoCore HTTP API   │  ← single source of truth
                 │  http://dalidou:8100│
                 └──────────┬──────────┘
                            │
       ┌────────────────────┼────────────────────┐
       │                    │                    │
   ┌───┴────┐         ┌─────┴────┐         ┌────┴────┐
   │  MCP   │         │ OpenClaw │         │  HTTP   │
   │ server │         │  plugin  │         │  proxy  │
   └───┬────┘         └──────┬───┘         └────┬────┘
       │                     │                   │
   Claude/Cursor/         OpenClaw            Codex/Ollama/
   Zed/Windsurf                                any OpenAI-compat client

Three adapters, one HTTP backend. Each adapter is a thin passthrough — no business logic duplicated.
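As a sketch of what every adapter does under the hood, here is a minimal fail-open context pull against the HTTP API in stdlib Python. The request/response shape (`POST /context/build` with a `query` field, a `pack` field in the reply) is assumed for illustration; the real adapters live in `scripts/`:

```python
import json
import urllib.request

def build_context(query, base="http://dalidou:8100", timeout=6):
    """Ask AtoCore for a context pack; fail open (empty string) on any error."""
    req = urllib.request.Request(
        base + "/context/build",
        data=json.dumps({"query": query}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return json.load(resp).get("pack", "")  # "pack" field name is assumed
    except Exception:
        return ""  # fail-open: caller proceeds without grounding
```

The `except Exception: return ""` branch is the whole fail-open story: an unreachable AtoCore degrades to "no injection", never to a crashed client.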


Adapter 1: MCP Server (Claude Desktop, Claude Code, Cursor, Zed, Windsurf)

The MCP server is scripts/atocore_mcp.py — stdlib-only Python, stdio transport, wraps the HTTP API. Claude-family clients see AtoCore as built-in tools just like Read or Bash.

Tools exposed

  • atocore_context (most important): Full context pack for a query — Trusted Project State + memories + retrieved chunks. Use at the start of any project-related conversation to ground it.
  • atocore_search: Semantic search over ingested documents (top-K chunks).
  • atocore_memory_list: List active memories, filterable by project + type.
  • atocore_memory_create: Propose a candidate memory (enters triage queue).
  • atocore_project_state: Get Trusted Project State entries by category.
  • atocore_projects: List registered projects + aliases.
  • atocore_health: Service status check.
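Because the transport is stdio MCP, invoking one of these tools is just a JSON-RPC 2.0 `tools/call` message written to the server's stdin. A hypothetical call to atocore_context could look like this (the argument name `query` is an assumption; check the tool's declared input schema):

```python
import json

# JSON-RPC 2.0 "tools/call" request, as an MCP client writes it over stdio
call = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "atocore_context",
        "arguments": {"query": "p06 polisher control architecture"},  # arg name assumed
    },
}
line = json.dumps(call)  # one JSON message per line on stdin
```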

Registration

Claude Code (CLI)

claude mcp add atocore -- python C:/Users/antoi/ATOCore/scripts/atocore_mcp.py
claude mcp list    # verify: "atocore ... ✓ Connected"

Claude Desktop (GUI)

Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "atocore": {
      "command": "python",
      "args": ["C:/Users/antoi/ATOCore/scripts/atocore_mcp.py"],
      "env": {
        "ATOCORE_URL": "http://dalidou:8100"
      }
    }
  }
}

Restart Claude Desktop.

Cursor / Zed / Windsurf

Similar JSON config in each tool's MCP settings. Consult their docs — the config schema is standard MCP.

Configuration

Environment variables the MCP server honors:

| Var | Default | Purpose |
| --- | --- | --- |
| ATOCORE_URL | http://dalidou:8100 | Where to reach AtoCore |
| ATOCORE_TIMEOUT | 10 | Per-request HTTP timeout (seconds) |

Behavior

  • Fail-open: if Dalidou is unreachable, tools return "AtoCore unavailable" error messages but don't crash the client.
  • Zero business logic: every tool is a direct HTTP passthrough.
  • stdlib only: no MCP SDK dependency.

Adapter 2: OpenClaw Plugin (openclaw-plugins/atocore-capture/handler.js)

The plugin on T420 OpenClaw has two responsibilities:

  1. CAPTURE: On before_agent_start + llm_output, POST completed turns to AtoCore /interactions (existing).
  2. PULL: On before_prompt_build, call /context/build and inject the context pack via prependContext so the agent's system prompt includes AtoCore knowledge.

Deployment

The plugin is loaded from /tmp/atocore-openclaw-capture-plugin/openclaw-plugins/atocore-capture/ on the T420 (per OpenClaw's plugin config at ~/.openclaw/openclaw.json).

To update:

scp openclaw-plugins/atocore-capture/handler.js \
    papa@192.168.86.39:/tmp/atocore-openclaw-capture-plugin/openclaw-plugins/atocore-capture/index.js
ssh papa@192.168.86.39 'systemctl --user restart openclaw-gateway'

Verify in gateway logs: look for "ready (7 plugins: acpx, atocore-capture, ...)"

Configuration (env vars set on T420)

| Var | Default | Purpose |
| --- | --- | --- |
| ATOCORE_BASE_URL | http://dalidou:8100 | AtoCore HTTP endpoint |
| ATOCORE_PULL_DISABLED | (unset) | Set to 1 to disable context pull |

Behavior

  • Fail-open: AtoCore unreachable = no injection, no capture, agent runs normally.
  • 6s timeout on context pull, 10s on capture — won't stall the agent.
  • Context pack prepended as a clearly-bracketed block so the agent can see it's auto-injected grounding info.
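A minimal sketch of that prepend step, written in Python for brevity even though handler.js does it in JavaScript. The delimiter wording here is illustrative only, not the plugin's exact text:

```python
def wrap_pack(pack: str) -> str:
    """Bracket the context pack so the agent can tell it is auto-injected.

    The delimiter strings are illustrative; the real plugin's wording may differ.
    """
    return (
        "[[ AtoCore auto-injected context -- begin ]]\n"
        + pack.strip()
        + "\n[[ AtoCore auto-injected context -- end ]]"
    )
```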

Adapter 3: HTTP Proxy (scripts/atocore_proxy.py)

A stdlib-only OpenAI-compatible HTTP proxy. It sits between any OpenAI-API-speaking client and the real provider, and enriches every /chat/completions request with AtoCore context.

Works with:

  • Codex CLI (OpenAI-compatible endpoint)
  • Ollama (has OpenAI-compatible /v1 endpoint since 0.1.24)
  • LiteLLM, llama.cpp server, custom agents
  • Anything that can be pointed at a custom base URL

Start it

# For Ollama (local models):
ATOCORE_UPSTREAM=http://localhost:11434/v1 \
  python scripts/atocore_proxy.py

# For OpenAI cloud:
ATOCORE_UPSTREAM=https://api.openai.com/v1 \
  ATOCORE_CLIENT_LABEL=codex \
  python scripts/atocore_proxy.py

# Test:
curl http://127.0.0.1:11435/healthz

Point a client at it

Set the client's OpenAI base URL to http://127.0.0.1:11435/v1.

Ollama example:

OPENAI_BASE_URL=http://127.0.0.1:11435/v1 \
  some-openai-client --model llama3:8b

Codex CLI:

Set OPENAI_BASE_URL=http://127.0.0.1:11435/v1 in your codex config.

Configuration

| Var | Default | Purpose |
| --- | --- | --- |
| ATOCORE_URL | http://dalidou:8100 | AtoCore HTTP endpoint |
| ATOCORE_UPSTREAM | (required) | Real provider base URL |
| ATOCORE_PROXY_PORT | 11435 | Proxy listen port |
| ATOCORE_PROXY_HOST | 127.0.0.1 | Proxy bind address |
| ATOCORE_CLIENT_LABEL | proxy | Client id in captures |
| ATOCORE_INJECT | 1 | Inject context (set 0 to disable) |
| ATOCORE_CAPTURE | 1 | Capture interactions (set 0 to disable) |
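Parsing the toggles above can be as simple as the following sketch, which assumes "0" is the only value that disables a flag, as the table implies:

```python
import os

ATOCORE_URL = os.environ.get("ATOCORE_URL", "http://dalidou:8100")
UPSTREAM = os.environ.get("ATOCORE_UPSTREAM")  # required: no sensible default
PORT = int(os.environ.get("ATOCORE_PROXY_PORT", "11435"))
INJECT = os.environ.get("ATOCORE_INJECT", "1") != "0"    # default: on
CAPTURE = os.environ.get("ATOCORE_CAPTURE", "1") != "0"  # default: on
```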

Behavior

  • GET requests (model listing, etc.) pass through unchanged
  • POST to /chat/completions (or /v1/chat/completions) gets enriched:
    1. Last user message extracted as query
    2. AtoCore /context/build called with 6s timeout
    3. Pack injected as system message (or prepended to existing system)
    4. Enriched body forwarded to upstream
    5. After success, interaction POSTed to /interactions in background
  • Fail-open: AtoCore unreachable = pass through without injection
  • Streaming responses: currently buffered (not true stream). Good enough for most cases; can be upgraded later if needed.
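
Steps 1–3 above amount to a small transform of the request body. A sketch, where the field names follow the standard OpenAI chat schema (the real implementation is scripts/atocore_proxy.py):

```python
def query_of(body: dict) -> str:
    """The last user message becomes the retrieval query."""
    users = [m for m in body.get("messages", []) if m.get("role") == "user"]
    return users[-1]["content"] if users else ""

def enrich(body: dict, pack: str) -> dict:
    """Inject an AtoCore context pack into an OpenAI-style chat request.

    Prepends the pack to an existing system message, or inserts a new one.
    Returns the body untouched when there is no pack (the fail-open path).
    """
    if not pack:
        return body
    msgs = list(body.get("messages", []))
    if msgs and msgs[0].get("role") == "system":
        msgs[0] = {**msgs[0], "content": pack + "\n\n" + msgs[0]["content"]}
    else:
        msgs.insert(0, {"role": "system", "content": pack})
    return {**body, "messages": msgs}
```

Keeping the transform pure (body in, body out) is what makes the fail-open guarantee easy: when the context pull fails, the original body is forwarded byte-for-byte.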

Running as a service

On Linux, create ~/.config/systemd/user/atocore-proxy.service:

[Unit]
Description=AtoCore HTTP proxy

[Service]
Environment=ATOCORE_UPSTREAM=http://localhost:11434/v1
Environment=ATOCORE_CLIENT_LABEL=ollama
ExecStart=/usr/bin/python3 /path/to/scripts/atocore_proxy.py
Restart=on-failure

[Install]
WantedBy=default.target

Then: systemctl --user enable --now atocore-proxy

On Windows, register via Task Scheduler (similar pattern to backup task) or use NSSM to install as a service.


Verification Checklist

Fresh end-to-end test to confirm Phase 1 is working:

For Claude Code (MCP)

  1. Open a new Claude Code session (not this one).
  2. Ask: "what do we know about p06 polisher's control architecture?"
  3. Claude should invoke atocore_context or atocore_project_state on its own and answer grounded in AtoCore data.

For OpenClaw (plugin pull)

  1. Send a Discord message to OpenClaw: "what's the status on p04?"
  2. Check T420 logs: journalctl --user -u openclaw-gateway --since "1 min ago" | grep atocore-pull
  3. Expect: atocore-pull:injected project=p04-gigabit chars=NNN

For proxy (any OpenAI-compat client)

  1. Start proxy with appropriate upstream
  2. Run a client query through it
  3. Check stderr: [atocore-proxy] inject: project=... chars=...
  4. Run curl http://127.0.0.1:8100/interactions?client=proxy — it should show the captured turn

Why not just MCP everywhere?

MCP is great for Claude-family clients but:

  • Not supported natively by Codex CLI, Ollama, or OpenAI's own API
  • No universal "attach MCP" mechanism in all LLM runtimes
  • HTTP APIs are truly universal

The HTTP API is the source of truth; each adapter is the thinnest possible shim for its ecosystem. When new adapters are needed (Gemini CLI, Claude Code plugin system, etc.), they follow the same pattern.


Future enhancements

  • Streaming passthrough in the proxy (currently buffered for simplicity)
  • Response grounding check: parse assistant output for references to injected context, count reinforcement events
  • Per-client metrics in the dashboard: how often each client pulls, context pack size, injection rate
  • Smart project detection: today we use keyword matching; could use AtoCore's own project resolver endpoint