Compare commits: 775960c8c8...ba36a28453 (2 commits)

| SHA1 |
|---|
| ba36a28453 |
| 999788b790 |
@@ -6,22 +6,23 @@
 
 ## Orientation
 
-- **live_sha** (Dalidou `/health` build_sha): `c2e7064` (verified 2026-04-15 via /health, build_time 2026-04-15T15:08:51Z)
-- **last_updated**: 2026-04-15 by Claude (deploy caught up; R10/R13 closed)
-- **main_tip**: `c2e7064` (plus one pending doc/ledger commit for this session)
-- **test_count**: 299 collected via `pytest --collect-only -q` on a clean checkout, 2026-04-15 (reproduction recipe in Quick Commands)
-- **harness**: `18/18 PASS`
+- **live_sha** (Dalidou `/health` build_sha): `775960c` (verified 2026-04-16 via /health, build_time 2026-04-16T17:59:30Z)
+- **last_updated**: 2026-04-16 by Claude ("Make It Actually Useful" sprint — observability + Phase 10)
+- **main_tip**: `999788b`
+- **test_count**: 303 (4 new Phase 10 tests)
+- **harness**: `17/18 PASS` on live Dalidou (p04-constraints expects "Zerodur" — retrieval content gap, not regression)
 - **vectors**: 33,253
 - **active_memories**: 84 (31 project, 23 knowledge, 10 episodic, 8 adaptation, 7 preference, 5 identity)
 - **candidate_memories**: 2
+- **interactions**: 234 total (192 claude-code, 38 openclaw, 4 test)
 - **registered_projects**: atocore, p04-gigabit, p05-interferometer, p06-polisher, atomizer-v2, abb-space (aliased p08)
-- **project_state_entries**: 78 total (p04=9, p05=13, p06=13, atocore=43)
+- **project_state_entries**: 110 total (atocore=47, p06=19, p05=18, p04=15, abb=6, atomizer=5)
 - **entities**: 35 (engineering knowledge graph, Layer 2)
 - **off_host_backup**: `papa@192.168.86.39:/home/papa/atocore-backups/` via cron, verified
-- **nightly_pipeline**: backup → cleanup → rsync → **OpenClaw import** (NEW) → vault refresh (NEW) → extract → auto-triage → weekly synth/lint Sundays
-- **capture_clients**: claude-code (Stop hook), openclaw (plugin + file importer)
+- **nightly_pipeline**: backup → cleanup → rsync → OpenClaw import → vault refresh → extract → auto-triage → **auto-promote/expire (NEW)** → weekly synth/lint Sundays → **retrieval harness (NEW)** → **pipeline summary (NEW)**
+- **capture_clients**: claude-code (Stop hook + cwd project inference), openclaw (before_agent_start + llm_output plugin, verified live)
 - **wiki**: http://dalidou:8100/wiki (browse), /wiki/projects/{id}, /wiki/entities/{id}, /wiki/search
-- **dashboard**: http://dalidou:8100/admin/dashboard
+- **dashboard**: http://dalidou:8100/admin/dashboard (now shows pipeline health, interaction totals by client, all registered projects)
 
 ## Active Plan
 
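The `live_sha` entry in the Orientation hunk is verified by comparing the `build_sha` reported by `/health` with the deployed commit. A minimal sketch of that comparison follows, written in JavaScript to match `handler.js`. The helper name `checkLiveSha` is hypothetical, and the payload shape (just `build_sha` and `build_time`) is an assumption drawn from the ledger wording rather than from the AtoCore source.

```javascript
// Hypothetical helper: compare a /health payload with the SHA the ledger claims is live.
// Assumes the payload exposes `build_sha` and `build_time` (field names taken from the ledger).
function checkLiveSha(health, expectedSha) {
  const live = String(health.build_sha || "");
  const expected = String(expectedSha || "");
  // Short and full SHAs may be mixed, so accept a prefix match in either direction.
  const matches =
    live.length > 0 &&
    expected.length > 0 &&
    (live.startsWith(expected) || expected.startsWith(live));
  return { matches, live, buildTime: health.build_time || null };
}

// Values copied from the updated Orientation entry and the compare range.
const sampleHealth = { build_sha: "775960c", build_time: "2026-04-16T17:59:30Z" };
console.log(checkLiveSha(sampleHealth, "775960c8c8"));
```

In practice the payload would presumably come from a `fetch` against the `/health` endpoint; keeping the comparison as a pure function makes it easy to test without a live Dalidou host.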
@@ -159,6 +160,12 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
 
 ## Session Log
 
+- **2026-04-16 Claude** `b687e7f..999788b` **"Make It Actually Useful" sprint.** Two-part session: ops fixes then consolidation sprint.
+
+**Part 1 — Ops fixes:** Deployed `b687e7f` (project inference from cwd). Fixed cron logging (was `/dev/null` — redirected to `~/atocore-logs/`). Fixed OpenClaw gateway crash-loop (`discord.replyToMode: "any"` invalid → `"all"`). Deployed `atocore-capture` plugin on T420 OpenClaw using `before_agent_start` + `llm_output` hooks — verified end-to-end: 38 `client=openclaw` interactions captured. Backfilled project tags on 179/181 unscoped interactions (165 atocore, 8 p06, 6 p04).
+
+**Part 2 — Sprint (Phase A+C):** Pipeline observability: retrieval harness now runs nightly (Step E), pipeline summary persisted to project state (Step F), dashboard enhanced with interaction totals by client + pipeline health section + dynamic project list. Phase 10 landed: `auto_promote_reinforced()` (candidate→active when reference_count≥3, confidence≥0.7) + `expire_stale_candidates()` (14-day unreinforced→auto-reject), both wired into nightly cron Step B2. Seeding script created (26 entries across 6 projects — all already existed from prior session). Tests 299→303. Harness 17/18 on live Dalidou (p04-constraints expects "Zerodur" — retrieval content gap, not regression). Deployed `775960c`.
+
 - **2026-04-15 Claude (pm)** Closed the last harness failure honestly. **p06-tailscale fixed: 18/18 PASS.** Root-caused: not a retrieval bug — the p06 `ARCHITECTURE.md` Overview chunk legitimately mentions "the GigaBIT M1 telescope mirror" because the Polisher Suite is built *for* that mirror. All four retrieved sources for the tailscale prompt were genuinely p06/shared paths; zero actual p04 chunks leaked. The fixture's `expect_absent: GigaBIT` was catching semantic overlap, not retrieval bleed. Narrowed it to `expect_absent: "[Source: p04-gigabit/"` — a source-path check that tests the real invariant (no p04 source chunks in p06 context). Other p06 fixtures still use the word-blacklist form; they pass today because their more-specific prompts don't pull the ARCHITECTURE.md Overview, so I left them alone rather than churn fixtures that aren't failing. Did NOT change retrieval/ranking — no code change, fixture-only fix. Tests unchanged at 299.
 
 - **2026-04-15 Claude** Deploy + doc debt sweep. Deployed `c2e7064` to Dalidou (build_time 2026-04-15T15:08:51Z, build_sha matches, /health ok) so R11/R12 are now live, not just on main. **R11 verified on live**: `POST /admin/extract-batch {"mode":"llm"}` against http://127.0.0.1:8100 returns HTTP 503 with the operator-facing "claude CLI not on PATH, run host-side script or use mode=rule" message — exactly the post-fix contract. **R13 closed (fixed)**: added a reproduction recipe to Quick Commands (`pip install -r requirements-dev.txt && pytest --collect-only -q && pytest -q`) and re-cited `test_count: 299` against a fresh local collection on 2026-04-15, so the claim is now auditable from any clean checkout — Codex's audit worktree just needs `pip install -r requirements-dev.txt`. **R10 closed (fixed)**: rewrote the `docs/master-plan-status.md` OpenClaw section to explicitly disclaim "primary integration" and report the current narrow surface: 14 client request shapes against ~44 server routes, predominantly read + `/project/state` + `/ingest/sources`, with memory/interactions/admin/entities/triage/extraction writes correctly out of scope. Open findings now: none blocking. Next natural move: the last harness failure `p06-tailscale` (chunk bleed).
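The 2026-04-15 (pm) entry narrows a fixture from a word blacklist (`expect_absent: GigaBIT`) to a source-path check (`expect_absent: "[Source: p04-gigabit/"`). The difference can be sketched as a plain substring check over the assembled context pack. This is an illustration only: `findForbidden` is a hypothetical name, the harness's real matching logic is not shown in the diff, and the p06 source path in the sample pack is made up.

```javascript
// Hypothetical fixture check: return every forbidden substring found in the context pack.
function findForbidden(contextPack, expectAbsent) {
  return expectAbsent.filter((needle) => contextPack.includes(needle));
}

// Illustrative pack: a p06 chunk that legitimately mentions GigaBIT.
// The "[Source: ...]" label mirrors the ledger wording; the exact path is an assumption.
const p06Pack = [
  "[Source: p06-polisher/ARCHITECTURE.md]",
  "The Polisher Suite is built for the GigaBIT M1 telescope mirror.",
].join("\n");

// The word blacklist flags semantic overlap even though no p04 chunk leaked.
const wordBlacklist = findForbidden(p06Pack, ["GigaBIT"]);
// The source-path check only fires when a p04 source chunk is actually present.
const sourcePathCheck = findForbidden(p06Pack, ["[Source: p04-gigabit/"]);

console.log({ wordBlacklist, sourcePathCheck });
```

The source-path form tests the invariant the ledger cares about (no p04 source chunks in p06 context) without penalising a p06 document for naming the mirror it was built for.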
@@ -126,25 +126,29 @@ This sits implicitly between Phase 8 (OpenClaw) and Phase 11
 (multi-model). Memory-review and engineering-entity commands are
 deferred from the shared client until their workflows are exercised.
 
-## What Is Real Today (updated 2026-04-12)
+## What Is Real Today (updated 2026-04-16)
 
-- canonical AtoCore runtime on Dalidou (build_sha tracked, deploy.sh verified)
-- 33,253 vectors across 5 registered projects
-- project registry with template, proposal, register, update, refresh
-- 5 registered projects:
-- `p04-gigabit` (483 docs, 5 state entries)
-- `p05-interferometer` (109 docs, 9 state entries)
-- `p06-polisher` (564 docs, 9 state entries)
-- `atomizer-v2` (568 docs, newly ingested 2026-04-12)
-- `atocore` (drive source, 38 state entries)
-- 47 active memories (16 project, 16 knowledge, 6 adaptation, 3 identity, 3 preference, 3 episodic)
+- canonical AtoCore runtime on Dalidou (`775960c`, deploy.sh verified)
+- 33,253 vectors across 6 registered projects
+- 234 captured interactions (192 claude-code, 38 openclaw, 4 test)
+- 6 registered projects:
+- `p04-gigabit` (483 docs, 15 state entries)
+- `p05-interferometer` (109 docs, 18 state entries)
+- `p06-polisher` (564 docs, 19 state entries)
+- `atomizer-v2` (568 docs, 5 state entries)
+- `abb-space` (6 state entries)
+- `atocore` (drive source, 47 state entries)
+- 110 Trusted Project State entries across all projects (decisions, requirements, facts, contacts, milestones)
+- 84 active memories (31 project, 23 knowledge, 10 episodic, 8 adaptation, 7 preference, 5 identity)
 - context pack assembly with 4 tiers: Trusted Project State > identity/preference > project memories > retrieved chunks
 - query-relevance memory ranking with overlap-density scoring
-- retrieval eval harness: 18 fixtures, 17/18 passing
-- 290 tests passing
-- nightly pipeline: backup → cleanup → rsync → LLM extraction (sonnet) → auto-triage
+- retrieval eval harness: 18 fixtures, 17/18 passing on live
+- 303 tests passing
+- nightly pipeline: backup → cleanup → rsync → OpenClaw import → vault refresh → extract → triage → **auto-promote/expire** → weekly synth/lint → **retrieval harness** → **pipeline summary to project state**
+- Phase 10 operational: reinforcement-based auto-promotion (ref_count ≥ 3, confidence ≥ 0.7) + stale candidate expiry (14 days unreinforced)
+- pipeline health visible in dashboard: interaction totals by client, pipeline last_run, harness results, triage stats
 - off-host backup to clawdbot (T420) via rsync
-- both Claude Code and OpenClaw capture interactions to AtoCore
+- both Claude Code and OpenClaw capture interactions to AtoCore (OpenClaw via `before_agent_start` + `llm_output` plugin, verified live)
 - DEV-LEDGER.md as shared operating memory between Claude and Codex
 - observability dashboard at GET /admin/dashboard
 
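The Phase 10 rule listed above (promote at ref_count ≥ 3 and confidence ≥ 0.7, expire after 14 days unreinforced) can be sketched as a single decision function. The diff does not include the real `auto_promote_reinforced()` or `expire_stale_candidates()` code, so the sketch below is in JavaScript to match `handler.js`. The field names (`referenceCount`, `confidence`, `lastReinforcedAt`), the choice to measure staleness from the last reinforcement, and the decision to check promotion before expiry are all assumptions.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;
const PROMOTE_MIN_REFS = 3; // threshold from the ledger
const PROMOTE_MIN_CONFIDENCE = 0.7; // threshold from the ledger
const EXPIRE_AFTER_DAYS = 14; // threshold from the ledger

// Hypothetical classifier for one candidate memory.
// `lastReinforcedAt` is assumed to be an epoch timestamp in milliseconds.
function classifyCandidate(candidate, now = Date.now()) {
  // Assumption: promotion wins over expiry when both conditions hold.
  if (
    candidate.referenceCount >= PROMOTE_MIN_REFS &&
    candidate.confidence >= PROMOTE_MIN_CONFIDENCE
  ) {
    return "promote";
  }
  // Assumption: "unreinforced" is measured from the last reinforcement.
  if (now - candidate.lastReinforcedAt >= EXPIRE_AFTER_DAYS * DAY_MS) {
    return "expire";
  }
  return "keep";
}

const now = Date.UTC(2026, 3, 16); // 2026-04-16, the date of the ledger entry
console.log(
  classifyCandidate({ referenceCount: 3, confidence: 0.7, lastReinforcedAt: now }, now)
);
```

Keeping both rules in one function makes the precedence explicit; the real implementation splits them into two functions wired into the same nightly cron step, so the ordering there may differ.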
@@ -152,26 +156,28 @@ deferred from the shared client until their workflows are exercised.
 
 These are the current practical priorities.
 
-1. **Observe and stabilize** — let the nightly pipeline run for a week,
-check the dashboard daily, verify memories accumulate correctly
-from organic Claude Code and OpenClaw use
-2. **Multi-model triage** (Phase 11 entry) — switch auto-triage to a
+1. **Observe the enhanced pipeline** — let the nightly pipeline run for a
+week with the new harness + summary + auto-promote steps. Check the
+dashboard daily. Verify pipeline summary populates correctly.
+2. **Knowledge density** — run batch extraction over the full 234
+interactions (`--since 2026-01-01`) to mine the backlog for knowledge.
+Target: 100+ active memories.
+3. **Multi-model triage** (Phase 11 entry) — switch auto-triage to a
 different model than the extractor for independent validation
-3. **Automated eval in cron** (Phase 12 entry) — add retrieval harness
-to the nightly cron so regressions are caught automatically
-4. **Atomizer-v2 state entries** — curate Trusted Project State for the
-newly ingested Atomizer knowledge base
+4. **Fix p04-constraints harness failure** — retrieval doesn't surface
+"Zerodur" for p04 constraint queries. Investigate if it's a missing
+memory or retrieval ranking issue.
 
 ## Next
 
 These are the next major layers after the current stabilization pass.
 
-1. Phase 10 Write-back — confidence-based auto-promotion from
-reinforcement signal (a memory reinforced N times auto-promotes)
-2. Phase 6 AtoDrive — clarify Google Drive as a trusted operational
+1. Phase 6 AtoDrive — clarify Google Drive as a trusted operational
 source and ingest from it
-3. Phase 13 Hardening — Chroma backup policy, monitoring, alerting,
+2. Phase 13 Hardening — Chroma backup policy, monitoring, alerting,
 failure visibility beyond log files
+3. Engineering V1 implementation sprint — once knowledge density is
+sufficient and the pipeline feels boring and dependable
 
 ## Later
 
@@ -193,9 +199,10 @@ These remain intentionally deferred.
 plugin now exists (`openclaw-plugins/atocore-capture/`), interactions
 flow. Write-back of promoted memories back to OpenClaw's own memory
 system is still deferred.
-- ~~automatic memory promotion~~ — auto-triage now handles promote/reject
-for extraction candidates. Reinforcement-based auto-promotion
-(Phase 10) is the remaining piece.
+- ~~automatic memory promotion~~ — Phase 10 complete: auto-triage handles
+extraction candidates, reinforcement-based auto-promotion graduates
+candidates referenced 3+ times to active, stale candidates expire
+after 14 days unreinforced.
 - ~~reflection loop integration~~ — fully operational: capture (both
 clients) → reinforce (automatic) → extract (nightly cron, sonnet) →
 auto-triage (nightly, sonnet) → only needs_human reaches the user.
63
openclaw-plugins/atocore-capture/handler.js
Normal file
@@ -0,0 +1,63 @@
+/**
+ * AtoCore capture hook for OpenClaw.
+ *
+ * Listens on message:received (buffer prompt) and message:sent (POST pair).
+ * Fail-open: errors are caught silently.
+ */
+
+const BASE_URL = process.env.ATOCORE_BASE_URL || "http://dalidou:8100";
+const MIN_LEN = 15;
+const MAX_RESP = 50000;
+
+let lastPrompt = null; // simple single-slot buffer
+
+const atocoreCaptureHook = async (event) => {
+  try {
+    if (process.env.ATOCORE_CAPTURE_DISABLED === "1") return;
+
+    if (event.type === "message" && event.action === "received") {
+      const content = (event.context?.content || "").trim();
+      if (content.length >= MIN_LEN && !content.startsWith("<")) {
+        lastPrompt = { text: content, ts: Date.now() };
+      }
+      return;
+    }
+
+    if (event.type === "message" && event.action === "sent") {
+      if (!event.context?.success) return;
+      const response = (event.context?.content || "").trim();
+      if (!response || !lastPrompt) return;
+
+      // Discard stale prompts (>5 min old)
+      if (Date.now() - lastPrompt.ts > 300000) {
+        lastPrompt = null;
+        return;
+      }
+
+      const prompt = lastPrompt.text;
+      lastPrompt = null;
+
+      const body = JSON.stringify({
+        prompt,
+        response: response.length > MAX_RESP
+          ? response.slice(0, MAX_RESP) + "\n\n[truncated]"
+          : response,
+        client: "openclaw",
+        session_id: event.sessionKey || "",
+        project: "",
+        reinforce: true,
+      });
+
+      fetch(BASE_URL.replace(/\/$/, "") + "/interactions", {
+        method: "POST",
+        headers: { "Content-Type": "application/json" },
+        body,
+        signal: AbortSignal.timeout(10000),
+      }).catch(() => {});
+    }
+  } catch {
+    // fail-open: never crash the gateway
+  }
+};
+
+export default atocoreCaptureHook;
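The hook above mixes three concerns: buffering, staleness, and payload construction. One way to make the staleness and truncation rules testable without a gateway or network is to pull them into a pure function. The sketch below is not part of the commit; `buildCapturePayload` and `STALE_MS` are hypothetical names, while the constants and payload fields are copied from `handler.js`.

```javascript
const MAX_RESP = 50000; // same cap as handler.js
const STALE_MS = 300000; // 5 minutes, the inline literal used in handler.js

// Hypothetical pure helper: returns the /interactions payload, or null when
// the pair should be dropped (no buffered prompt, empty response, stale prompt).
function buildCapturePayload(lastPrompt, response, sessionKey, now) {
  if (!lastPrompt || !response) return null;
  if (now - lastPrompt.ts > STALE_MS) return null;
  return {
    prompt: lastPrompt.text,
    // Long responses are cut at MAX_RESP characters and marked as truncated.
    response:
      response.length > MAX_RESP
        ? response.slice(0, MAX_RESP) + "\n\n[truncated]"
        : response,
    client: "openclaw",
    session_id: sessionKey || "",
    project: "",
    reinforce: true,
  };
}

const buffered = { text: "How is the polisher calibrated?", ts: 1000 };
console.log(buildCapturePayload(buffered, "Short answer.", "session-1", 2000));
```

With this split, the hook would only need to manage the single-slot buffer and fire the `fetch`; whether that refactor is worth it for a 63-line file is a judgment call.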