# ATOCore/scripts/atocore_client.py

"""Operator-facing API client for live AtoCore instances.
This script is intentionally external to the app runtime. It is for admins and
operators who want a convenient way to inspect live project state, refresh
projects, audit retrieval quality, and manage trusted project-state entries.
"""
from __future__ import annotations
import argparse
import json
import os
import re
import sys
import urllib.error
import urllib.parse
import urllib.request
from typing import Any
BASE_URL = os.environ.get("ATOCORE_BASE_URL", "http://dalidou:8100").rstrip("/")
TIMEOUT = int(os.environ.get("ATOCORE_TIMEOUT_SECONDS", "30"))
REFRESH_TIMEOUT = int(os.environ.get("ATOCORE_REFRESH_TIMEOUT_SECONDS", "1800"))
FAIL_OPEN = os.environ.get("ATOCORE_FAIL_OPEN", "true").lower() == "true"
# History (from commit "feat(client): Phase 9 reflection loop surface in
# shared operator CLI", 2026-04-08): v0.1.0 shipped stable operations only;
# v0.2.0 adds the eight Phase 9 reflection-loop subcommands so real
# sessions can drive the capture -> extract -> queue -> promote/reject
# loop. Interaction and memory IDs are quoted with safe="" so a "/" in an
# id cannot splice extra path segments into the endpoint; project names
# keep the default quote behavior.
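The safe="" hardening used for interaction and memory IDs below can be verified in isolation: urllib.parse.quote() leaves "/" unencoded by default because it is a reserved path character, so only safe="" keeps a hostile id inside a single path segment. A minimal check (the id value here is illustrative):

```python
import urllib.parse

malicious_id = "mem/evil/action"

# Default safe set includes "/": the slashes survive and would splice
# extra path segments into a route like /memory/{id}/promote.
default_quoted = urllib.parse.quote(malicious_id)

# safe="": the slashes are percent-encoded, so the id stays one segment.
strict_quoted = urllib.parse.quote(malicious_id, safe="")

print(default_quoted)  # mem/evil/action
print(strict_quoted)   # mem%2Fevil%2Faction
```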
# Bumped when the subcommand surface or JSON output shapes meaningfully
# change. See docs/architecture/llm-client-integration.md for the
# semver rules. History:
# 0.1.0 initial stable-ops-only client
# 0.2.0 Phase 9 reflection loop added: capture, extract,
# reinforce-interaction, list-interactions, get-interaction,
# queue, promote, reject
CLIENT_VERSION = "0.2.0"
def print_json(payload: Any) -> None:
print(json.dumps(payload, ensure_ascii=True, indent=2))
def fail_open_payload() -> dict[str, Any]:
return {"status": "unavailable", "source": "atocore", "fail_open": True}
def request(
method: str,
path: str,
data: dict[str, Any] | None = None,
timeout: int | None = None,
) -> Any:
url = f"{BASE_URL}{path}"
headers = {"Content-Type": "application/json"} if data is not None else {}
payload = json.dumps(data).encode("utf-8") if data is not None else None
req = urllib.request.Request(url, data=payload, headers=headers, method=method)
try:
with urllib.request.urlopen(req, timeout=timeout or TIMEOUT) as response:
body = response.read().decode("utf-8")
except urllib.error.HTTPError as exc:
body = exc.read().decode("utf-8")
if body:
print(body)
raise SystemExit(22) from exc
except (urllib.error.URLError, TimeoutError, OSError):
if FAIL_OPEN:
print_json(fail_open_payload())
raise SystemExit(0)
raise
if not body.strip():
return {}
return json.loads(body)
def parse_aliases(aliases_csv: str) -> list[str]:
return [alias.strip() for alias in aliases_csv.split(",") if alias.strip()]
def detect_project(prompt: str) -> dict[str, Any]:
payload = request("GET", "/projects")
prompt_lower = prompt.lower()
best_project = None
best_alias = None
best_score = -1
for project in payload.get("projects", []):
candidates = [project.get("id", ""), *project.get("aliases", [])]
for candidate in candidates:
candidate = (candidate or "").strip()
if not candidate:
continue
pattern = rf"(?<![a-z0-9]){re.escape(candidate.lower())}(?![a-z0-9])"
matched = re.search(pattern, prompt_lower) is not None
if not matched and candidate.lower() not in prompt_lower:
continue
score = len(candidate)
if score > best_score:
best_project = project.get("id")
best_alias = candidate
best_score = score
return {"matched_project": best_project, "matched_alias": best_alias}
def classify_result(result: dict[str, Any]) -> dict[str, Any]:
source_file = (result.get("source_file") or "").lower()
heading = (result.get("heading_path") or "").lower()
title = (result.get("title") or "").lower()
text = " ".join([source_file, heading, title])
labels: list[str] = []
if any(token in text for token in ["_archive", "/archive", "archive/", "pre-cleanup", "pre-migration", "history"]):
labels.append("archive_or_history")
if any(token in text for token in ["status", "dashboard", "current-state", "current state", "next-steps", "next steps"]):
labels.append("current_status")
if any(token in text for token in ["decision", "adr", "tradeoff", "selected architecture", "selection"]):
labels.append("decision")
if any(token in text for token in ["requirement", "spec", "constraints", "baseline", "cdr", "sow"]):
labels.append("requirements")
if any(token in text for token in ["roadmap", "milestone", "plan", "workflow", "calibration", "contract"]):
labels.append("execution_plan")
if not labels:
labels.append("reference")
return {
"score": result.get("score"),
"title": result.get("title"),
"heading_path": result.get("heading_path"),
"source_file": result.get("source_file"),
"labels": labels,
"is_noise_risk": "archive_or_history" in labels,
}
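classify_result's labeling is plain substring matching over the joined, lowercased metadata fields, with a "reference" fallback when no family matches. A reduced two-rule sketch (the real classifier checks five label families; the function name here is illustrative):

```python
def labels_for(source_file: str, heading_path: str, title: str) -> list[str]:
    # Reduced sketch of classify_result's labeling: substring checks over
    # the joined, lowercased metadata, falling back to "reference".
    text = " ".join([source_file.lower(), heading_path.lower(), title.lower()])
    labels = []
    if any(token in text for token in ["_archive", "/archive", "history"]):
        labels.append("archive_or_history")
    if any(token in text for token in ["decision", "adr", "tradeoff"]):
        labels.append("decision")
    return labels or ["reference"]

print(labels_for("docs/_archive/old-notes.md", "", ""))
# ['archive_or_history']
```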
def audit_query(prompt: str, top_k: int, project: str | None) -> dict[str, Any]:
response = request(
"POST",
"/query",
{"prompt": prompt, "top_k": top_k, "project": project or None},
)
classifications = [classify_result(result) for result in response.get("results", [])]
broad_prompt = len(prompt.split()) <= 2
noise_hits = sum(1 for item in classifications if item["is_noise_risk"])
current_hits = sum(1 for item in classifications if "current_status" in item["labels"])
decision_hits = sum(1 for item in classifications if "decision" in item["labels"])
requirements_hits = sum(1 for item in classifications if "requirements" in item["labels"])
recommendations: list[str] = []
if broad_prompt:
recommendations.append("Prompt is broad; prefer a project-specific question with intent, artifact type, or constraint language.")
if noise_hits:
recommendations.append("Archive/history noise is present; prefer current-status, decision, requirements, and baseline docs in the next ingestion/ranking pass.")
if current_hits == 0:
recommendations.append("No current-status docs surfaced in the top results; Wave 2 should ingest or strengthen trusted operational truth.")
if decision_hits == 0:
recommendations.append("No decision docs surfaced in the top results; add or freeze decision logs for the active project.")
if requirements_hits == 0:
recommendations.append("No requirements/baseline docs surfaced in the top results; prioritize baseline and architecture-freeze material.")
if not recommendations:
recommendations.append("Ranking looks healthy for this prompt.")
return {
"prompt": prompt,
"project": project,
"top_k": top_k,
"broad_prompt": broad_prompt,
"noise_hits": noise_hits,
"current_status_hits": current_hits,
"decision_hits": decision_hits,
"requirements_hits": requirements_hits,
"results": classifications,
"recommendations": recommendations,
}
def project_payload(
project_id: str,
aliases_csv: str,
source: str,
subpath: str,
description: str,
label: str,
) -> dict[str, Any]:
return {
"project_id": project_id,
"aliases": parse_aliases(aliases_csv),
"description": description,
"ingest_roots": [{"source": source, "subpath": subpath, "label": label}],
}
def build_parser() -> argparse.ArgumentParser:
parser = argparse.ArgumentParser(description="AtoCore live API client")
sub = parser.add_subparsers(dest="command", required=True)
for name in ["health", "sources", "stats", "projects", "project-template", "debug-context", "ingest-sources"]:
sub.add_parser(name)
p = sub.add_parser("detect-project")
p.add_argument("prompt")
p = sub.add_parser("auto-context")
p.add_argument("prompt")
p.add_argument("budget", nargs="?", type=int, default=3000)
p.add_argument("project", nargs="?", default="")
for name in ["propose-project", "register-project"]:
p = sub.add_parser(name)
p.add_argument("project_id")
p.add_argument("aliases_csv")
p.add_argument("source")
p.add_argument("subpath")
p.add_argument("description", nargs="?", default="")
p.add_argument("label", nargs="?", default="")
p = sub.add_parser("update-project")
p.add_argument("project")
p.add_argument("description")
p.add_argument("aliases_csv", nargs="?", default="")
p = sub.add_parser("refresh-project")
p.add_argument("project")
p.add_argument("purge_deleted", nargs="?", default="false")
p = sub.add_parser("project-state")
p.add_argument("project")
p.add_argument("category", nargs="?", default="")
p = sub.add_parser("project-state-set")
p.add_argument("project")
p.add_argument("category")
p.add_argument("key")
p.add_argument("value")
p.add_argument("source", nargs="?", default="")
p.add_argument("confidence", nargs="?", type=float, default=1.0)
p = sub.add_parser("project-state-invalidate")
p.add_argument("project")
p.add_argument("category")
p.add_argument("key")
p = sub.add_parser("query")
p.add_argument("prompt")
p.add_argument("top_k", nargs="?", type=int, default=5)
p.add_argument("project", nargs="?", default="")
p = sub.add_parser("context-build")
p.add_argument("prompt")
p.add_argument("project", nargs="?", default="")
p.add_argument("budget", nargs="?", type=int, default=3000)
p = sub.add_parser("audit-query")
p.add_argument("prompt")
p.add_argument("top_k", nargs="?", type=int, default=5)
p.add_argument("project", nargs="?", default="")
# --- Phase 9 reflection loop surface --------------------------------
#
# capture: record one interaction (prompt + response + context used).
# Mirrors POST /interactions. response is positional so shell
# callers can pass it via $(cat file.txt) or heredoc. project,
# client, and session_id are optional positionals with empty
# defaults, matching the existing script's style.
p = sub.add_parser("capture")
p.add_argument("prompt")
p.add_argument("response", nargs="?", default="")
p.add_argument("project", nargs="?", default="")
p.add_argument("client", nargs="?", default="")
p.add_argument("session_id", nargs="?", default="")
p.add_argument("reinforce", nargs="?", default="true")
# extract: run the Phase 9 C rule-based extractor against an
# already-captured interaction. persist='true' writes the
# candidates as status='candidate' memories; default is
# preview-only.
p = sub.add_parser("extract")
p.add_argument("interaction_id")
p.add_argument("persist", nargs="?", default="false")
# reinforce: backfill reinforcement on an already-captured interaction.
p = sub.add_parser("reinforce-interaction")
p.add_argument("interaction_id")
# list-interactions: paginated listing with filters.
p = sub.add_parser("list-interactions")
p.add_argument("project", nargs="?", default="")
p.add_argument("session_id", nargs="?", default="")
p.add_argument("client", nargs="?", default="")
p.add_argument("since", nargs="?", default="")
p.add_argument("limit", nargs="?", type=int, default=50)
# get-interaction: fetch one by id
p = sub.add_parser("get-interaction")
p.add_argument("interaction_id")
# queue: list the candidate review queue
p = sub.add_parser("queue")
p.add_argument("memory_type", nargs="?", default="")
p.add_argument("project", nargs="?", default="")
p.add_argument("limit", nargs="?", type=int, default=50)
# promote: candidate -> active
p = sub.add_parser("promote")
p.add_argument("memory_id")
# reject: candidate -> invalid
p = sub.add_parser("reject")
p.add_argument("memory_id")
return parser
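The parser above follows one CLI convention throughout: optional positionals via nargs="?" with empty-string or numeric defaults, and booleans passed as truthy strings. A minimal sketch of how that behaves (the "demo" subcommand and its argument names are illustrative, not part of the real surface):

```python
import argparse

parser = argparse.ArgumentParser()
sub = parser.add_subparsers(dest="command", required=True)
p = sub.add_parser("demo")  # hypothetical subcommand for illustration
p.add_argument("prompt")
p.add_argument("top_k", nargs="?", type=int, default=5)
p.add_argument("purge", nargs="?", default="false")

# Omitted optional positionals fall back to their defaults.
args = parser.parse_args(["demo", "why is the build red"])
print(args.top_k)  # 5

# Booleans arrive as strings and are tested against a truthy set,
# matching the refresh-project / capture / extract pattern.
print(args.purge.lower() in {"1", "true", "yes", "y"})  # False
```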
def main() -> int:
args = build_parser().parse_args()
cmd = args.command
if cmd == "health":
print_json(request("GET", "/health"))
elif cmd == "sources":
print_json(request("GET", "/sources"))
elif cmd == "stats":
print_json(request("GET", "/stats"))
elif cmd == "projects":
print_json(request("GET", "/projects"))
elif cmd == "project-template":
print_json(request("GET", "/projects/template"))
elif cmd == "debug-context":
print_json(request("GET", "/debug/context"))
elif cmd == "ingest-sources":
print_json(request("POST", "/ingest/sources", {}))
elif cmd == "detect-project":
print_json(detect_project(args.prompt))
elif cmd == "auto-context":
project = args.project or detect_project(args.prompt).get("matched_project") or ""
if not project:
print_json({"status": "no_project_match", "source": "atocore", "mode": "auto-context"})
else:
print_json(request("POST", "/context/build", {"prompt": args.prompt, "project": project, "budget": args.budget}))
elif cmd in {"propose-project", "register-project"}:
path = "/projects/proposal" if cmd == "propose-project" else "/projects/register"
print_json(request("POST", path, project_payload(args.project_id, args.aliases_csv, args.source, args.subpath, args.description, args.label)))
elif cmd == "update-project":
payload: dict[str, Any] = {"description": args.description}
if args.aliases_csv.strip():
payload["aliases"] = parse_aliases(args.aliases_csv)
print_json(request("PUT", f"/projects/{urllib.parse.quote(args.project)}", payload))
elif cmd == "refresh-project":
purge_deleted = args.purge_deleted.lower() in {"1", "true", "yes", "y"}
path = f"/projects/{urllib.parse.quote(args.project)}/refresh?purge_deleted={str(purge_deleted).lower()}"
print_json(request("POST", path, {}, timeout=REFRESH_TIMEOUT))
elif cmd == "project-state":
suffix = f"?category={urllib.parse.quote(args.category)}" if args.category else ""
print_json(request("GET", f"/project/state/{urllib.parse.quote(args.project)}{suffix}"))
elif cmd == "project-state-set":
print_json(request("POST", "/project/state", {
"project": args.project,
"category": args.category,
"key": args.key,
"value": args.value,
"source": args.source,
"confidence": args.confidence,
}))
elif cmd == "project-state-invalidate":
print_json(request("DELETE", "/project/state", {"project": args.project, "category": args.category, "key": args.key}))
elif cmd == "query":
print_json(request("POST", "/query", {"prompt": args.prompt, "top_k": args.top_k, "project": args.project or None}))
elif cmd == "context-build":
print_json(request("POST", "/context/build", {"prompt": args.prompt, "project": args.project or None, "budget": args.budget}))
elif cmd == "audit-query":
print_json(audit_query(args.prompt, args.top_k, args.project or None))
# --- Phase 9 reflection loop surface ------------------------------
elif cmd == "capture":
body: dict[str, Any] = {
"prompt": args.prompt,
"response": args.response,
"project": args.project,
"client": args.client or "atocore-client",
"session_id": args.session_id,
"reinforce": args.reinforce.lower() in {"1", "true", "yes", "y"},
}
print_json(request("POST", "/interactions", body))
elif cmd == "extract":
persist = args.persist.lower() in {"1", "true", "yes", "y"}
print_json(
request(
"POST",
f"/interactions/{urllib.parse.quote(args.interaction_id, safe='')}/extract",
{"persist": persist},
)
)
elif cmd == "reinforce-interaction":
print_json(
request(
"POST",
f"/interactions/{urllib.parse.quote(args.interaction_id, safe='')}/reinforce",
{},
)
)
elif cmd == "list-interactions":
query_parts: list[str] = []
if args.project:
query_parts.append(f"project={urllib.parse.quote(args.project)}")
if args.session_id:
query_parts.append(f"session_id={urllib.parse.quote(args.session_id)}")
if args.client:
query_parts.append(f"client={urllib.parse.quote(args.client)}")
if args.since:
query_parts.append(f"since={urllib.parse.quote(args.since)}")
query_parts.append(f"limit={int(args.limit)}")
query = "?" + "&".join(query_parts)
print_json(request("GET", f"/interactions{query}"))
elif cmd == "get-interaction":
print_json(
request(
"GET",
f"/interactions/{urllib.parse.quote(args.interaction_id, safe='')}",
)
)
elif cmd == "queue":
query_parts = ["status=candidate"]
if args.memory_type:
query_parts.append(f"memory_type={urllib.parse.quote(args.memory_type)}")
if args.project:
query_parts.append(f"project={urllib.parse.quote(args.project)}")
query_parts.append(f"limit={int(args.limit)}")
query = "?" + "&".join(query_parts)
print_json(request("GET", f"/memory{query}"))
elif cmd == "promote":
print_json(
request(
"POST",
f"/memory/{urllib.parse.quote(args.memory_id, safe='')}/promote",
{},
)
)
elif cmd == "reject":
print_json(
request(
"POST",
f"/memory/{urllib.parse.quote(args.memory_id, safe='')}/reject",
{},
)
)
else:
return 1
return 0
if __name__ == "__main__":
raise SystemExit(main())