feat: universal LLM consumption (Phase 1 complete)
Completes the Phase 1 master brain keystone: every LLM interaction
across the ecosystem now pulls context from AtoCore automatically.
Three adapters, one HTTP backend:
1. OpenClaw plugin pull (handler.js):
- Added before_prompt_build hook that calls /context/build and
injects the pack via prependContext
- Existing capture hooks (before_agent_start + llm_output)
unchanged
- 6s context timeout, fail-open on AtoCore unreachable
- Deployed to T420, gateway restarted, "7 plugins loaded"
2. atocore-proxy (scripts/atocore_proxy.py):
- Stdlib-only OpenAI-compatible HTTP middleware
- Drop-in layer for Codex, Ollama, LiteLLM, any OpenAI-compat client
- Intercepts /chat/completions: extracts query, pulls context,
injects as system message, forwards to upstream, captures back
- Fail-open: AtoCore down = passthrough without injection
- Configurable via env: UPSTREAM, PORT, CLIENT_LABEL, INJECT, CAPTURE
3. (from prior commit c49363f) atocore-mcp:
- stdio MCP server, stdlib Python, 7 tools exposed
- Registered in Claude Code: "✓ Connected"
Plus quick win:
- Project synthesis moved from Sunday-only to daily cron so wiki /
mirror pages stay fresh (Step C in batch-extract.sh). Lint stays
weekly.
Plus docs:
- docs/universal-consumption.md: configuration guide for all 3 adapters
with registration/env-var tables and verification checklist
Plus housekeeping:
- .gitignore: add .mypy_cache/
Tests: 303/303 passing.
This closes the consumption gap: the reinforcement feedback loop
can now actually work (memories get injected → get referenced →
reinforcement fires → auto-promotion). Every Claude, OpenClaw,
Codex, or Ollama session is automatically AtoCore-grounded.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Universal Consumption — Connecting LLM Clients to AtoCore

Phase 1 of the Master Brain plan. Every LLM interaction across the ecosystem
pulls context from AtoCore automatically, without the user or agent having
to remember to ask for it.

## Architecture

```
              ┌─────────────────────┐
              │  AtoCore HTTP API   │ ← single source of truth
              │ http://dalidou:8100 │
              └──────────┬──────────┘
                         │
    ┌────────────────────┼────────────────────┐
    │                    │                    │
┌───┴────┐         ┌─────┴────┐          ┌────┴────┐
│  MCP   │         │ OpenClaw │          │  HTTP   │
│ server │         │  plugin  │          │  proxy  │
└───┬────┘         └──────┬───┘          └────┬────┘
    │                     │                   │
Claude/Cursor/         OpenClaw         Codex/Ollama/
Zed/Windsurf                     any OpenAI-compat client
```

Three adapters, one HTTP backend. Each adapter is a thin passthrough — no
business logic duplicated.

---

## Adapter 1: MCP Server (Claude Desktop, Claude Code, Cursor, Zed, Windsurf)

The MCP server is `scripts/atocore_mcp.py` — stdlib-only Python, stdio
transport, wraps the HTTP API. Claude-family clients see AtoCore as built-in
tools just like `Read` or `Bash`.

### Tools exposed

- **`atocore_context`** (most important): Full context pack for a query —
  Trusted Project State + memories + retrieved chunks. Use at the start of
  any project-related conversation to ground it.
- **`atocore_search`**: Semantic search over ingested documents (top-K chunks).
- **`atocore_memory_list`**: List active memories, filterable by project + type.
- **`atocore_memory_create`**: Propose a candidate memory (enters triage queue).
- **`atocore_project_state`**: Get Trusted Project State entries by category.
- **`atocore_projects`**: List registered projects + aliases.
- **`atocore_health`**: Service status check.

### Registration

#### Claude Code (CLI)

```bash
claude mcp add atocore -- python C:/Users/antoi/ATOCore/scripts/atocore_mcp.py
claude mcp list   # verify: "atocore ... ✓ Connected"
```

#### Claude Desktop (GUI)

Edit `~/Library/Application Support/Claude/claude_desktop_config.json`
(macOS) or `%APPDATA%\Claude\claude_desktop_config.json` (Windows):

```json
{
  "mcpServers": {
    "atocore": {
      "command": "python",
      "args": ["C:/Users/antoi/ATOCore/scripts/atocore_mcp.py"],
      "env": {
        "ATOCORE_URL": "http://dalidou:8100"
      }
    }
  }
}
```

Restart Claude Desktop.

#### Cursor / Zed / Windsurf

Similar JSON config in each tool's MCP settings. Consult their docs —
the config schema is standard MCP.

### Configuration

Environment variables the MCP server honors:

| Var | Default | Purpose |
|---|---|---|
| `ATOCORE_URL` | `http://dalidou:8100` | Where to reach AtoCore |
| `ATOCORE_TIMEOUT` | `10` | Per-request HTTP timeout (seconds) |

### Behavior

- Fail-open: if Dalidou is unreachable, tools return "AtoCore unavailable"
  error messages but don't crash the client.
- Zero business logic: every tool is a direct HTTP passthrough.
- stdlib only: no MCP SDK dependency.

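The fail-open pattern the tools use can be sketched in stdlib Python. This is a minimal illustration, assuming `/context/build` accepts a JSON POST with a `query` field; the real request shape lives in `scripts/atocore_mcp.py` and may differ:

```python
import json
import urllib.error
import urllib.request

ATOCORE_URL = "http://dalidou:8100"  # the real server reads ATOCORE_URL from env
TIMEOUT = 10                         # seconds, mirrors ATOCORE_TIMEOUT


def atocore_context(query: str, base: str = ATOCORE_URL, timeout: float = TIMEOUT) -> str:
    """Fetch a context pack; on any failure, return an error string instead of raising."""
    req = urllib.request.Request(
        f"{base}/context/build",
        data=json.dumps({"query": query}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.read().decode("utf-8")
    except (urllib.error.URLError, OSError) as exc:
        # Fail-open: report unavailability to the model, never crash the client.
        return f"AtoCore unavailable: {exc}"
```

The point is the `except` branch: the tool always returns a string, so the MCP client never sees a raised exception when Dalidou is down.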
---
## Adapter 2: OpenClaw Plugin (`openclaw-plugins/atocore-capture/handler.js`)

The plugin on T420 OpenClaw has two responsibilities:

1. **CAPTURE**: On `before_agent_start` + `llm_output`, POST completed turns
   to AtoCore `/interactions` (existing).
2. **PULL**: On `before_prompt_build`, call `/context/build` and inject the
   context pack via `prependContext` so the agent's system prompt includes
   AtoCore knowledge.

### Deployment

The plugin is loaded from
`/tmp/atocore-openclaw-capture-plugin/openclaw-plugins/atocore-capture/`
on the T420 (per OpenClaw's plugin config at `~/.openclaw/openclaw.json`).

To update:

```bash
scp openclaw-plugins/atocore-capture/handler.js \
  papa@192.168.86.39:/tmp/atocore-openclaw-capture-plugin/openclaw-plugins/atocore-capture/index.js
ssh papa@192.168.86.39 'systemctl --user restart openclaw-gateway'
```

Verify in gateway logs: look for "ready (7 plugins: acpx, atocore-capture, ...)"

### Configuration (env vars set on T420)

| Var | Default | Purpose |
|---|---|---|
| `ATOCORE_BASE_URL` | `http://dalidou:8100` | AtoCore HTTP endpoint |
| `ATOCORE_PULL_DISABLED` | (unset) | Set to `1` to disable context pull |

### Behavior

- Fail-open: AtoCore unreachable = no injection, no capture, agent runs
  normally.
- 6s timeout on context pull, 10s on capture — won't stall the agent.
- Context pack prepended as a clearly-bracketed block so the agent can see
  it's auto-injected grounding info.

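The injection behavior above can be illustrated in Python (the actual handler is JavaScript). The bracket delimiter text here is a placeholder, not the plugin's exact output:

```python
def wrap_context_pack(pack: str) -> str:
    """Bracket the pack so the agent can tell it is auto-injected grounding info."""
    return (
        "[BEGIN AtoCore auto-injected context]\n"
        f"{pack.strip()}\n"
        "[END AtoCore auto-injected context]"
    )


def prepend_context(prompt: str, pack: str) -> str:
    """What the plugin's prependContext call conceptually does."""
    if not pack:
        # Fail-open: no pack pulled, prompt goes through unchanged.
        return prompt
    return wrap_context_pack(pack) + "\n\n" + prompt
```

The empty-pack branch is the fail-open guarantee: a failed or disabled pull leaves the agent's prompt byte-for-byte untouched.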
---
## Adapter 3: HTTP Proxy (`scripts/atocore_proxy.py`)

A stdlib-only OpenAI-compatible HTTP proxy. It sits between any
OpenAI-API-speaking client and the real provider, and enriches every
`/chat/completions` request with AtoCore context.

Works with:

- **Codex CLI** (OpenAI-compatible endpoint)
- **Ollama** (has an OpenAI-compatible `/v1` endpoint since 0.1.24)
- **LiteLLM**, **llama.cpp server**, custom agents
- Anything that can be pointed at a custom base URL

### Start it

```bash
# For Ollama (local models):
ATOCORE_UPSTREAM=http://localhost:11434/v1 \
python scripts/atocore_proxy.py

# For OpenAI cloud:
ATOCORE_UPSTREAM=https://api.openai.com/v1 \
ATOCORE_CLIENT_LABEL=codex \
python scripts/atocore_proxy.py

# Test:
curl http://127.0.0.1:11435/healthz
```

### Point a client at it

Set the client's OpenAI base URL to `http://127.0.0.1:11435/v1`.

#### Ollama example:

```bash
OPENAI_BASE_URL=http://127.0.0.1:11435/v1 \
some-openai-client --model llama3:8b
```

#### Codex CLI:

Set `OPENAI_BASE_URL=http://127.0.0.1:11435/v1` in your codex config.

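From a script, talking to the proxy is an ordinary OpenAI-style request against its base URL. A minimal stdlib sketch that assembles such a request (no API key or error handling shown):

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, user_msg: str) -> urllib.request.Request:
    """Assemble an OpenAI-compatible /chat/completions request aimed at the proxy."""
    url = base_url.rstrip("/") + "/chat/completions"
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_msg}],
    }).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


# Usage (requires the proxy running on 11435):
# req = build_chat_request("http://127.0.0.1:11435/v1", "llama3:8b", "status of p04?")
# print(urllib.request.urlopen(req, timeout=60).read().decode())
```

Nothing about the request is proxy-specific; that is the point of the drop-in design.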
### Configuration

| Var | Default | Purpose |
|---|---|---|
| `ATOCORE_URL` | `http://dalidou:8100` | AtoCore HTTP endpoint |
| `ATOCORE_UPSTREAM` | (required) | Real provider base URL |
| `ATOCORE_PROXY_PORT` | `11435` | Proxy listen port |
| `ATOCORE_PROXY_HOST` | `127.0.0.1` | Proxy bind address |
| `ATOCORE_CLIENT_LABEL` | `proxy` | Client id in captures |
| `ATOCORE_INJECT` | `1` | Inject context (set `0` to disable) |
| `ATOCORE_CAPTURE` | `1` | Capture interactions (set `0` to disable) |

### Behavior

- GET requests (model listing, etc.) pass through unchanged
- POST to `/chat/completions` (or `/v1/chat/completions`) gets enriched:
  1. Last user message extracted as query
  2. AtoCore `/context/build` called with 6s timeout
  3. Pack injected as system message (or prepended to existing system)
  4. Enriched body forwarded to upstream
  5. After success, interaction POSTed to `/interactions` in background
- Fail-open: AtoCore unreachable = pass through without injection
- Streaming responses: currently buffered (not a true stream). Good enough for
  most cases; can be upgraded later if needed.

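Steps 1 and 3 of the enrichment reduce to a small transformation of the request body. A sketch with illustrative helper names, not the proxy's actual internals:

```python
def last_user_message(messages: list[dict]) -> str:
    """Step 1: the query is the most recent user turn."""
    for msg in reversed(messages):
        if msg.get("role") == "user":
            return msg.get("content", "")
    return ""


def inject_context(body: dict, pack: str) -> dict:
    """Step 3: add the pack as a system message, or prepend it to an existing one."""
    if not pack:
        return body  # fail-open: pass through without injection
    messages = list(body.get("messages", []))
    if messages and messages[0].get("role") == "system":
        messages[0] = {**messages[0],
                       "content": pack + "\n\n" + messages[0].get("content", "")}
    else:
        messages.insert(0, {"role": "system", "content": pack})
    return {**body, "messages": messages}
```

Prepending to an existing system message (rather than adding a second one) keeps clients that assume a single system turn working unchanged.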
### Running as a service

On Linux, create `~/.config/systemd/user/atocore-proxy.service`:

```ini
[Unit]
Description=AtoCore HTTP proxy

[Service]
Environment=ATOCORE_UPSTREAM=http://localhost:11434/v1
Environment=ATOCORE_CLIENT_LABEL=ollama
ExecStart=/usr/bin/python3 /path/to/scripts/atocore_proxy.py
Restart=on-failure

[Install]
WantedBy=default.target
```

Then: `systemctl --user enable --now atocore-proxy`

On Windows, register via Task Scheduler (similar pattern to backup task)
or use NSSM to install as a service.

---

## Verification Checklist

Fresh end-to-end test to confirm Phase 1 is working:

### For Claude Code (MCP)

1. Open a new Claude Code session (not this one).
2. Ask: "what do we know about p06 polisher's control architecture?"
3. Claude should invoke `atocore_context` or `atocore_project_state`
   on its own and answer grounded in AtoCore data.

### For OpenClaw (plugin pull)

1. Send a Discord message to OpenClaw: "what's the status on p04?"
2. Check T420 logs: `journalctl --user -u openclaw-gateway --since "1 min ago" | grep atocore-pull`
3. Expect: `atocore-pull:injected project=p04-gigabit chars=NNN`

### For proxy (any OpenAI-compat client)

1. Start the proxy with the appropriate upstream
2. Run a client query through it
3. Check stderr: `[atocore-proxy] inject: project=... chars=...`
4. Check `curl http://127.0.0.1:8100/interactions?client=proxy` — should
   show the captured turn

---

## Why not just MCP everywhere?

MCP is great for Claude-family clients, but:

- Not supported natively by Codex CLI, Ollama, or OpenAI's own API
- No universal "attach MCP" mechanism in all LLM runtimes
- HTTP APIs are truly universal

The HTTP API is the source of truth; each adapter is the thinnest possible
shim for its ecosystem. When new adapters are needed (Gemini CLI, Claude
Code plugin system, etc.), they follow the same pattern.

---

## Future enhancements

- **Streaming passthrough** in the proxy (currently buffered for simplicity)
- **Response grounding check**: parse assistant output for references to
  injected context, count reinforcement events
- **Per-client metrics** in the dashboard: how often each client pulls,
  context pack size, injection rate
- **Smart project detection**: today we use keyword matching; could use
  AtoCore's own project resolver endpoint