ATOCore

Author	SHA1	Message	Date
Anto01	a87d9845a8	fix(memory): widen query-time context candidates	2026-04-24 20:29:00 -04:00
Antoine Letarte	4744c69d10	fix(memory): rank project memories by query intent	2026-04-24 20:49:53 +00:00
Anto01	05c11fd4fb	fix(retrieval): fail open on registry resolution errors	2026-04-24 11:32:46 -04:00
Anto01	ce6ffdbb63	fix(retrieval): preserve project ids across unscoped ingest	2026-04-24 11:22:13 -04:00
Anto01	c03022d864	feat(retrieval): persist explicit chunk project ids	2026-04-24 11:02:30 -04:00
Anto01	c7212900b0	fix(retrieval): enforce project-scoped context boundaries	2026-04-24 10:46:56 -04:00
Anto01	0989fed9ee	fix(api): R14 — promote route translates V1-0 ValueError to 400 Squash-merge of branch claude/r14-promote-400 (`3888db9`), approved by Codex (no findings, targeted suite 15 passed). POST /entities/{id}/promote now wraps promote_entity in try/except ValueError → HTTPException(400). Previously the V1-0 provenance re-check raised ValueError that the route didn't catch, so legacy no-provenance candidates promoted via the API surfaced as 500 instead of 400. Matches the existing ValueError → 400 handling on POST /entities (routes.py:1490). New regression test test_api_promote_returns_400_on_legacy_no_provenance inserts a pre-V1-0 candidate directly, POSTs promote, asserts 400 with the expected detail, asserts the row stays candidate. Also adds .obsidian/, .vscode/, .idea/ to .gitignore so editor state doesn't sneak into future commits. Test count: 547 → 548. Closes R14. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 11:20:15 -04:00
Anto01	2712c5d2d0	feat(engineering): enforce V1-0 write invariants	2026-04-22 14:59:17 -04:00
Anto01	e147ab2abd	feat(wiki): [[wikilinks]] with redlinks + cross-project resolver (Issue B) Last P2 from Antoine's "daily-usable" sprint. Entities referenced via [[Name]] in descriptions or mirror markdown now render as: - live wikilink if the name matches an entity in the same project - live cross-project link with "(in project X)" scope indicator if the only match is in another project - red italic redlink pointing at /wiki/new?name=... otherwise Clicking a redlink opens a pre-filled "create this entity" form that POSTs to /v1/entities and redirects to the new entity's page. - engineering/wiki.py: _wikilink_transform + _resolve_wikilink, applied in render_project (pre-markdown) and render_entity (description body). render_new_entity_form for the create page. CSS for .wikilink / .wikilink-cross / .redlink / .new-entity-form - api/routes.py: GET /wiki/new?name&project - tests/test_wikilinks.py: 12 tests including the spec regression (A references [[B]] -> redlink; create B -> link becomes live) - DEV-LEDGER.md: session log + test_count 521 -> 533 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 09:15:14 -04:00
Anto01	b94f9dff56	feat(api): PATCH /entities/{id} + /v1/engineering/* aliases PATCH lets users edit an active entity's description, properties, confidence, and source_refs without cloning — closes the duplicate-trap half-fixed by /invalidate + /supersede. Issue D just adds the /engineering/* query surface to the /v1 allowlist. - engineering/service.py: update_entity supports description replace, properties shallow merge with null-delete semantics, confidence 0..1 bounds check, source_refs dedup-append. Writes audit row - api/routes.py: PATCH /entities/{id} with EntityPatchRequest - main.py: engineering/* query endpoints aliased under /v1 (Issue D) - tests/test_patch_entity.py: 12 tests (merge, null-delete, bounds, dedup, 404, audit, v1 alias) - DEV-LEDGER.md: session log + test_count 509 -> 521 Forbidden fields via PATCH (by design): entity_type, project, name, status. Use supersede+create or the dedicated status endpoints. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 09:02:13 -04:00
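The PATCH semantics in b94f9dff56 (description replace, shallow properties merge with null-delete, confidence bounds check, source_refs dedup-append) can be sketched roughly as below. This is an illustration only, not the code in engineering/service.py: the function name `apply_entity_patch` and the plain-dict entity shape are assumptions made for the example.

```python
from typing import Any, Optional


def apply_entity_patch(
    entity: dict[str, Any],
    description: Optional[str] = None,
    properties: Optional[dict[str, Any]] = None,
    confidence: Optional[float] = None,
    source_refs: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Return a patched copy of a hypothetical entity dict (input is not mutated)."""
    patched = dict(entity)

    if description is not None:
        # Description is replaced wholesale.
        patched["description"] = description

    if properties is not None:
        # Shallow merge: a None value deletes the key, anything else overwrites it.
        merged = dict(patched.get("properties", {}))
        for key, value in properties.items():
            if value is None:
                merged.pop(key, None)
            else:
                merged[key] = value
        patched["properties"] = merged

    if confidence is not None:
        # Bounds check on the 0..1 range.
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must be within 0..1")
        patched["confidence"] = confidence

    if source_refs is not None:
        # Append only refs that are not already present, preserving order.
        refs = list(patched.get("source_refs", []))
        for ref in source_refs:
            if ref not in refs:
                refs.append(ref)
        patched["source_refs"] = refs

    return patched
```

Returning a copy keeps the sketch easy to test; the real service also writes an audit row, which is omitted here.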
Anto01	081c058f77	feat(api): invalidate + supersede for active entities and memories (Issue E) Public retraction path so mistakes can be corrected without SQL. Unblocks the correction workflows that the live AKC p05 session exposed. - engineering/service.py: invalidate_active_entity returns (ok, code) with codes invalidated/already_invalid/not_active/not_found for clean HTTP mapping. supersede_entity gains superseded_by + auto-creates the supersedes relationship (new SUPERSEDES old), rejects self-supersede - memory/service.py: invalidate_memory/supersede_memory accept reason string that lands in audit note - api/routes.py: POST /entities/{id}/invalidate, /supersede; POST /memory/{id}/invalidate, /supersede (all 4 behind /v1 aliases) - tests/test_invalidate_supersede.py: 15 tests (idempotency, 404/409, supersede relationship auto-creation, self-supersede rejection, missing-replacement rejection, v1 alias presence) - DEV-LEDGER.md: session log + test_count 494 -> 509 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 21:56:24 -04:00
Anto01	069d155585	feat(assets): binary asset store + artifact entity + wiki evidence (Issue F) Wires visual evidence into the knowledge graph. Images, PDFs, and CAD exports can now be uploaded, deduped by SHA-256, thumbnailed, linked to entities via EVIDENCED_BY, and rendered inline on wiki pages. Unblocks AKC uploading voice-session screenshots alongside extracted entities. - assets/ module: store_asset (hash dedup + MIME allowlist + 20 MB cap), get_asset_binary, get_thumbnail (Pillow, on-disk cache under .thumbnails/<size>/), list_orphan_assets, invalidate_asset - models/database.py: new `assets` table + indexes - engineering/service.py: `artifact` added to ENTITY_TYPES - api/routes.py: POST /assets (multipart), GET /assets/{id}, /assets/{id}/thumbnail, /assets/{id}/meta, /admin/assets/orphans, DELETE /assets/{id} (409 if still referenced), GET /entities/{id}/evidence (EVIDENCED_BY artifacts with asset meta) - main.py: all new paths aliased under /v1 - engineering/wiki.py: entity pages render EVIDENCED_BY → artifact as a "Visual evidence" thumbnail strip; artifact pages render the full image + caption + capture_context - deploy/dalidou/docker-compose.yml: bind-mount ${ATOCORE_ASSETS_DIR} - config.py: assets_dir + assets_max_upload_bytes settings - requirements.txt + pyproject.toml: python-multipart, Pillow>=10.0.0 - tests/test_assets.py: 16 tests (dedup, cap, thumbnail cache, orphan detection, invalidate gating, API upload/fetch, evidence, v1 aliases, wiki rendering) - DEV-LEDGER.md: session log + cleanup note + test_count 478 -> 494 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 21:46:52 -04:00
Anto01	b1a3dd071e	feat(entities): inbox + cross-project (project="") support (Issue C) Makes `inbox` a reserved pseudo-project and `project=""` a first-class cross-project bucket. Unblocks AKC capturing pre-project leads/quotes and cross-project facts (materials, vendors) that don't fit a single registered project. - projects/registry.py: INBOX_PROJECT/GLOBAL_PROJECT constants, is_reserved_project(), register/update guards, resolve_project_name passthrough for "inbox" - engineering/service.py: get_entities scoping rules (inbox-only, global-only, real+global default, scope_only=true strict). promote_entity accepts target_project to retarget on promote - api/routes.py: GET /entities gains scope_only; POST /entities accepts project=null as ""; POST /entities/{id}/promote accepts {target_project, note} - engineering/wiki.py: homepage shows "Inbox & Global" cards with live counts linking to scoped lists - tests/test_inbox_crossproject.py: 15 tests (reserved enforcement, scoping rules, API shape, promote retargeting) - DEV-LEDGER.md: session log, test_count 463 -> 478 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 20:17:32 -04:00
Anto01	5fbd7e6094	feat(api): /v1 alias router for stable external contract (Issue A) Mounts an explicit allowlist of public handlers under /v1 alongside the existing unversioned paths. External clients (AKC, OpenClaw, future tools) should target /v1; internal callers (hooks, wiki, admin UI) keep working unchanged. Breaking schema changes will bump the prefix to /v2. - src/atocore/main.py: _V1_PUBLIC_PATHS allowlist + second router - tests/test_v1_aliases.py: parity + OpenAPI presence (5 tests) - README.md: API versioning section - DEV-LEDGER.md: session log, test_count 459 -> 463 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 20:04:46 -04:00
Anto01	90001c1956	fix(7A): host-side memory_dedup.py must stay stdlib-only Broke the dedup-watcher cron when I wrote memory_dedup.py in session 7A: imported atocore.memory.similarity, which transitively pulls sentence-transformers + pydantic_settings onto host Python that intentionally doesn't have them. Every UI-triggered + cron dedup scan since 7A deployed was silently crashing with ModuleNotFoundError (visible only in /home/papa/atocore-logs/dedup-ondemand-*.log). I even documented this architecture rule in atocore.memory._llm_prompt ('This module MUST stay stdlib-only') then violated it one session later. Shame. Real fix — matches the extractor pattern: - New endpoint POST /admin/memory/dedup-cluster on the server: takes {project, similarity_threshold, max_clusters}, runs the embedding + transitive-clustering inside the container where sentence-transformers lives, returns cluster shape. - scripts/memory_dedup.py now pure stdlib: pulls clusters via HTTP, LLM-drafts merges via claude CLI, POSTs proposals back. No atocore imports beyond the stdlib-only _dedup_prompt shared module. - Regression test pins the rule: test_memory_dedup_script_is_stdlib_only snapshots sys.modules before/after importing the script and asserts no non-allowed atocore modules were pulled. Also: similarity.py + cluster_by_threshold stay server-side, still covered by the same tests that used to live in the host tier-helper section. Tests 459 → 458 (-1 via rewrite of obsolete host-tier helper tests, +2 for the new stdlib-only regression + endpoint shape tests). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 16:18:00 -04:00
Anto01	9c91d778d9	feat: Claude Code context injection (UserPromptSubmit hook) Closes the asymmetry the user surfaced: before this, Claude Code captured every turn (Stop hook) but retrieval only happened when Claude chose to call atocore_context (opt-in MCP tool). OpenClaw had both sides covered after 7I; Claude Code did not. Now symmetric. Every Claude Code prompt is auto-sent to /context/build and the returned pack is prepended via hookSpecificOutput.additionalContext — same as what OpenClaw's before_agent_start hook now does. - deploy/hooks/inject_context.py — UserPromptSubmit hook. Fail-open (always exit 0). Skips short/XML prompts. 5s timeout. Project inference mirrors capture_stop.py cwd→slug table. Kill switch: ATOCORE_CONTEXT_DISABLED=1. - ~/.claude/settings.json registered the hook (local config, not committed; copy-paste snippet in docs/capture-surfaces.md). - Removed /wiki/capture from topnav. Endpoint still exists but the page is now labeled "fallback only" with a warning banner. The sanctioned surfaces are Claude Code + OpenClaw; manual paste is explicitly not the design. - docs/capture-surfaces.md — scope statement: two surfaces, nothing else. Anthropic API polling explicitly prohibited. Tests: +8 for inject_context.py (exit 0 on all failure modes, kill switch, short prompt filter, XML filter, bad stdin, mock-server success shape, project inference from cwd). Updated 2 wiki tests for the topnav change. 450 → 459. Verified live with real AtoCore: injected 2979 chars of atocore project context on a cwd-matched prompt. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 12:01:41 -04:00
Anto01	6e43cc7383	feat: Phase 7I + UI refresh (capture form, memory/domain/activity pages, topnav) Closes three gaps the user surfaced: (1) OpenClaw agents run blind without AtoCore context, (2) mobile/desktop chats can't be captured at all, (3) wiki UI hadn't kept up with backend capabilities. Phase 7I — OpenClaw two-way bridge - Plugin now calls /context/build on before_agent_start and prepends the context pack to event.prompt, so whatever LLM runs underneath (sonnet, opus, codex, local model) answers grounded in AtoCore knowledge. Captured prompt stays the user's original text; fail-open with a 5s timeout. Config-gated via injectContext flag. - Plugin version 0.0.0 → 0.2.0; README rewritten. UI refresh - /wiki/capture — paste-to-ingest form for Claude Desktop / web / mobile / ChatGPT / other. Goes through normal /interactions pipeline with client="claude-desktop|claude-web|claude-mobile|chatgpt|other". Fixes the rotovap/mushroom-on-phone gap. - /wiki/memories/{id} (Phase 7E) — full memory detail: content, status, confidence, refs, valid_until, domain_tags (clickable to domain pages), project link, source chunk, graduated-to-entity link, full audit trail, related-by-tag neighbors. - /wiki/domains/{tag} (Phase 7F) — cross-project view: all active memories with the given tag grouped by project, sorted by count. Case-insensitive, whitespace-tolerant. Also surfaces graduated entities carrying the tag. - /wiki/activity — autonomous-activity timeline feed. Summary chips by action (created/promoted/merged/superseded/decayed/canonicalized) and by actor (auto-dedup-tier1, auto-dedup-tier2, confidence-decay, phase10-auto-promote, transient-to-durable, tag-canon, human-triage). Answers "what has the brain been doing while I was away?" - Home refresh: persistent topnav (Home · Activity · Capture · Triage · Dashboard), "What the brain is doing" snippet above project cards showing recent autonomous-actor counts, link to full activity.
Tests: +10 (capture page, memory detail + 404, domain cross-project + empty + tag normalization, activity feed + groupings, home topnav, superseded-source detail after merge). 440 → 450. Known next: capture-browser extension for Claude.ai web (bigger project, deferred); voice/mobile relay (adjacent). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 10:14:15 -04:00
Anto01	877b97ec78	feat: Phase 7C — tag canonicalization (autonomous, weekly) LLM proposes alias→canonical mappings for domain_tags; confidence >= 0.8 auto-apply, below goes to human triage. Protects project identifiers (p04, p05, p06, atocore, apm, etc.) from ever being canonicalized since they're their own namespace, not concepts. Problem solved: tag drift fragments retrieval. "fw" vs "firmware" vs "firmware-control" all mean the same thing, but cross-cutting queries that filter by tag only hit one variant. Weekly canonicalization pass keeps the tag graph clean. - Schema: tag_aliases table (pending | approved | rejected) - atocore.memory._tag_canon_prompt (stdlib-only, protected project tokens) - service: get_tag_distribution, apply_tag_alias (atomic per-memory, dedupes if both alias + canonical present), create / approve / reject proposal lifecycle, per-memory audit rows with action="tag_canonicalized" - scripts/canonicalize_tags.py: host-side detector, autonomous by default, --no-auto-approve kill switch - 6 API endpoints under /admin/tags/* (distribution, list, propose, apply, approve/{id}, reject/{id}) - Step B4 in batch-extract.sh (Sundays only — weekly cadence) - 26 new tests (prompt parser, normalizer protections, distribution counting, rewrite atomicity, dedup, audit, lifecycle). 414 → 440. Design: aggressive protection of project tokens because a false canonicalization (p04 → p04-gigabit, or vice versa) would scramble cross-project filtering. Err toward preservation; the alias only applies if the model is very confident AND both strings appear in the current distribution. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 09:41:02 -04:00
Anto01	e840ef4be3	feat: Phase 7D — confidence decay on unreferenced cold memories Daily job multiplies confidence by 0.97 (~2-month half-life) for active memories with reference_count=0 AND idle > 30 days. Below 0.3 → auto-supersede with audit. Reversible via reinforcement (which already bumps confidence back up). Rationale: stale memories currently rank equal to fresh ones in retrieval. Without decay, the brain accumulates obsolete facts that compete with fresh knowledge for context-pack slots. With decay, memories earn their longevity via reference. - decay_unreferenced_memories() in service.py (stdlib-only, no cron infra needed) - POST /admin/memory/decay-run endpoint - Nightly Step F4 in batch-extract.sh - Exempt: reinforced (refcount > 0), graduated, superseded, invalid - Audit row per supersession ("decayed below floor, no references"), actor="confidence-decay". Per-decay rows skipped (chatty, no human value — status change is the meaningful signal). - Configurable via env: ATOCORE_DECAY_* (exposed through endpoint body) Tests: +13 (basic decay, reinforcement protection, supersede at floor, audit trail, graduated/superseded exemption, reinforcement reversibility, threshold tuning, parameter validation, cross-run stacking). 401 → 414. Next in Phase 7: 7C tag canonicalization (weekly), then 7B contradiction detection. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 16:50:20 -04:00
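The decay rule in e840ef4be3 (multiply by 0.97 per daily run when reference_count is 0 and the memory has been idle for more than 30 days, supersede below 0.3) can be sketched as below. This is a minimal illustration, not the body of decay_unreferenced_memories(): the function names `decay_step` and `runs_to_halve` and the tuple return shape are assumptions. Under the stated 0.97 factor, halving takes roughly 23 daily runs once decay has started, on top of the 30 idle days.

```python
import math

# Values quoted in the commit message; the real job reads them from ATOCORE_DECAY_* env vars.
DECAY_FACTOR = 0.97
CONFIDENCE_FLOOR = 0.3
MIN_IDLE_DAYS = 30


def decay_step(confidence: float, reference_count: int, idle_days: int,
               status: str = "active") -> tuple[float, str]:
    """Apply one daily decay run to a single memory; returns (confidence, status)."""
    # Exempt: anything not active (graduated, superseded, invalid), reinforced, or still warm.
    if status != "active" or reference_count > 0 or idle_days <= MIN_IDLE_DAYS:
        return confidence, status
    decayed = confidence * DECAY_FACTOR
    if decayed < CONFIDENCE_FLOOR:
        # Below the floor the memory is superseded rather than deleted.
        return decayed, "superseded"
    return decayed, status


def runs_to_halve(factor: float = DECAY_FACTOR) -> float:
    """Number of decay runs needed to halve a confidence value."""
    return math.log(0.5) / math.log(factor)
```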
Anto01	56d5df0ab4	feat: Phase 7A.1 — autonomous merge tiering (sonnet → opus → human) Dedup detector now merges high-confidence duplicates silently instead of piling every proposal into a human triage queue. Matches the 3-tier escalation pattern that auto_triage already uses. Tiering decision per cluster: TIER-1 auto-approve: sonnet confidence >= 0.8 AND min_pairwise_sim >= 0.92 AND all sources share project+type → auto-merge silently (actor="auto-dedup-tier1" in audit log) TIER-2 escalation: sonnet 0.5-0.8 conf OR sim 0.85-0.92 → opus second opinion. Opus confirms with conf >= 0.8 → auto-merge (actor="auto-dedup-tier2"). Opus overrides (reject) → skip silently. Opus low conf → human triage with opus's refined draft. HUMAN triage: Only the genuinely ambiguous land in /admin/triage. Env-tunable thresholds: ATOCORE_DEDUP_AUTO_APPROVE_CONF (0.8) ATOCORE_DEDUP_AUTO_APPROVE_SIM (0.92) ATOCORE_DEDUP_TIER2_MIN_CONF (0.5) ATOCORE_DEDUP_TIER2_MIN_SIM (0.85) ATOCORE_DEDUP_TIER2_MODEL (opus) New flag --no-auto-approve for kill-switch testing (everything → human queue). Tests: +6 (tier-2 prompt content, same_bucket edges, min_pairwise_similarity on identical + transitive clusters). 395 → 401. Rationale: user asked for autonomous behavior — "this needs to be intelligent, I don't want to manually triage stuff". Matches the consolidation principle: never discard details, but let the brain tidy up on its own for the easy cases. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 15:46:26 -04:00
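The first-stage routing in 56d5df0ab4 can be sketched as a small pure function. This is one possible reading of the thresholds in the commit message, not the code in scripts/memory_dedup.py: the names `route_cluster` and `load_thresholds` are assumptions, and so is the choice to send a confident, highly similar cluster whose sources do not share project and type to human triage (the message does not say what happens in that case).

```python
import os
from typing import Mapping

# Defaults quoted in the commit message.
DEFAULTS = {"auto_conf": 0.8, "auto_sim": 0.92, "tier2_conf": 0.5, "tier2_sim": 0.85}


def load_thresholds(env: Mapping[str, str] = os.environ) -> dict[str, float]:
    """Read the env-tunable thresholds, falling back to the documented defaults."""
    return {
        "auto_conf": float(env.get("ATOCORE_DEDUP_AUTO_APPROVE_CONF", DEFAULTS["auto_conf"])),
        "auto_sim": float(env.get("ATOCORE_DEDUP_AUTO_APPROVE_SIM", DEFAULTS["auto_sim"])),
        "tier2_conf": float(env.get("ATOCORE_DEDUP_TIER2_MIN_CONF", DEFAULTS["tier2_conf"])),
        "tier2_sim": float(env.get("ATOCORE_DEDUP_TIER2_MIN_SIM", DEFAULTS["tier2_sim"])),
    }


def route_cluster(confidence: float, min_pairwise_sim: float, same_bucket: bool,
                  t: Mapping[str, float] = DEFAULTS) -> str:
    """Decide what happens to one duplicate cluster after the first model pass."""
    # TIER-1: all three conditions must hold.
    if same_bucket and confidence >= t["auto_conf"] and min_pairwise_sim >= t["auto_sim"]:
        return "tier1_auto_merge"
    # TIER-2: confidence OR similarity sits in the borderline band.
    conf_borderline = t["tier2_conf"] <= confidence < t["auto_conf"]
    sim_borderline = t["tier2_sim"] <= min_pairwise_sim < t["auto_sim"]
    if conf_borderline or sim_borderline:
        return "tier2_escalate"
    # Everything else is left for a human (assumption, see lead-in).
    return "human_triage"
```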
Anto01	028d4c3594	feat: Phase 7A — semantic memory dedup ("sleep cycle" V1) New table memory_merge_candidates + service functions to cluster near-duplicate active memories within (project, memory_type) buckets, draft a unified content via LLM, and merge on human approval. Source memories become superseded (never deleted); merged memory carries union of tags, max of confidence, sum of reference_count. - schema migration for memory_merge_candidates - atocore.memory.similarity: cosine + transitive clustering - atocore.memory._dedup_prompt: stdlib-only LLM prompt preserving every specific - service: merge_memories / create_merge_candidate / get_merge_candidates / reject_merge_candidate - scripts/memory_dedup.py: host-side detector (HTTP-only, idempotent) - 5 API endpoints under /admin/memory/merge-candidates* + /admin/memory/dedup-scan - triage UI: purple "🔗 Merge Candidates" section + "🔗 Scan for duplicates" bar - batch-extract.sh Step B3 (0.90 daily, 0.85 Sundays) - deploy/dalidou/dedup-watcher.sh for UI-triggered scans - 21 new tests (374 → 395) - docs/PHASE-7-MEMORY-CONSOLIDATION.md covering 7A-7H roadmap Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 10:30:49 -04:00
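The "cosine + transitive clustering" step in 028d4c3594 can be illustrated with a union-find over pairwise cosine similarity. This is a sketch under assumptions, not the contents of atocore.memory.similarity: the log names `cluster_by_threshold`, but the signature used here (plain lists of floats in, lists of indices out) is made up for the example, and the real code works on sentence-transformers embeddings.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_by_threshold(vectors: list[list[float]], threshold: float) -> list[list[int]]:
    """Group indices whose vectors are linked, directly or transitively, at >= threshold."""
    n = len(vectors)
    parent = list(range(n))

    def find(i: int) -> int:
        # Path-halving find.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    groups: dict[int, list[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    # Singletons are not duplicates of anything, so only multi-member clusters are returned.
    return sorted(g for g in groups.values() if len(g) > 1)
```

Transitivity means A and C can land in one cluster through B even when A and C are below the threshold themselves, which is why the later tiering commit also looks at the minimum pairwise similarity of a cluster.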
Anto01	02055e8db3	feat: Phase 6 — Living Taxonomy + Universal Capture Closes two real-use gaps: 1. "APM tool" gap: work done outside Claude Code (desktop, web, phone, other machine) was invisible to AtoCore. 2. Project discovery gap: manual JSON-file edits required to promote an emerging theme to a first-class project. B — atocore_remember MCP tool (scripts/atocore_mcp.py): - New MCP tool for universal capture from any MCP-aware client (Claude Desktop, Code, Cursor, Zed, Windsurf, etc.) - Accepts content (required) + memory_type/project/confidence/ valid_until/domain_tags (all optional with sensible defaults) - Creates a candidate memory, goes through the existing 3-tier triage (no bypass — the quality gate catches noise) - Detailed tool description guides Claude on when to invoke: "remember this", "save that for later", "don't lose this fact" - Total tools exposed by MCP server: 14 → 15 C.1 Emerging-concepts detector (scripts/detect_emerging.py): - Nightly scan of active + candidate memories for: * Unregistered project names with ≥3 memory occurrences * Top 20 domain_tags by frequency (emerging categories) * Active memories with reference_count ≥ 5 + valid_until set (reinforced transients — candidates for extension) - Writes findings to atocore/proposals/* project state entries - Emits "warning" alert via Phase 4 framework the FIRST time a new project crosses the 5-memory alert threshold (avoids spam) - Configurable via env vars: ATOCORE_EMERGING_PROJECT_MIN (default 3), ATOCORE_EMERGING_ALERT_THRESHOLD (default 5), TOP_TAGS_LIMIT (20) C.2 Registration surface (src/atocore/api/routes.py + wiki.py): - POST /admin/projects/register-emerging — one-click register with sensible defaults (ingest_roots auto-filled with vault:incoming/projects/<id>/ convention). Clears the proposal from the dashboard list on success. - Dashboard /admin/dashboard: new "proposals" section with unregistered_projects + emerging_categories + reinforced_transients. 
- Wiki homepage: "📋 Emerging" section rendering each unregistered project as a card with count + 2 sample memory previews + inline "📌 Register as project" button that calls the endpoint via fetch, reloads the page on success. C.3 Transient-to-durable extension (src/atocore/memory/service.py + API + cron): - New extend_reinforced_valid_until() function — scans active memories with valid_until in the next 30 days and reference_count ≥ 5. Extends expiry by 90 days. If reference_count ≥ 10, clears expiry entirely (makes permanent). Writes audit rows via the Phase 4 memory_audit framework with actor="transient-to-durable". - POST /admin/memory/extend-reinforced — API wrapper for cron. - Matches the user's intuition: "something transient becomes important if you keep coming back to it". Nightly cron (deploy/dalidou/batch-extract.sh): - Step F2: detect_emerging.py (after F pipeline summary) - Step F3: /admin/memory/extend-reinforced (before integrity check) - Both fail-open; errors don't break the pipeline. Tests: 366 → 374 (+8 for Phase 6): - 6 tests for extend_reinforced_valid_until covering: extension path, permanent path, skip far-future, skip low-refs, skip permanent memories, audit row write - 2 smoke tests for the detector (imports cleanly, handles empty DB) - MCP tool changes don't need new tests — the wrapper is pure passthrough Design decisions documented in plan file: - atocore_remember deliberately doesn't bypass triage (quality gate) - Detector is passive (surfaces proposals) not active (auto-registers) - Sensible ingest-root defaults ("vault:incoming/projects/<id>/") so registration is one-click with no file-path thinking - Extension adds 90 days rather than clearing expiry (gradual permanence earned through sustained reinforcement) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-18 08:08:55 -04:00
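The transient-to-durable rule from C.3 above can be sketched per memory as below. This is an illustration, not the body of extend_reinforced_valid_until(): the helper name `extended_valid_until` is hypothetical, and two details are assumptions because the commit message does not pin them down: the 90 days are added to the current expiry (rather than to today), and already-expired memories are left alone.

```python
from datetime import date, timedelta
from typing import Optional

WINDOW_DAYS = 30        # only memories expiring within this window are considered
EXTEND_DAYS = 90        # extension applied at reference_count >= 5
MIN_REFS_EXTEND = 5
MIN_REFS_PERMANENT = 10


def extended_valid_until(valid_until: Optional[date], reference_count: int,
                         today: date) -> Optional[date]:
    """Return the new expiry for one active memory (None means permanent)."""
    if valid_until is None:
        return None  # already permanent, nothing to do
    if reference_count < MIN_REFS_EXTEND:
        return valid_until  # not reinforced enough
    if not (today <= valid_until <= today + timedelta(days=WINDOW_DAYS)):
        return valid_until  # far-future or already expired (assumption)
    if reference_count >= MIN_REFS_PERMANENT:
        return None  # sustained reinforcement clears the expiry
    return valid_until + timedelta(days=EXTEND_DAYS)
```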
Anto01	3ca19724a5	feat: 3-tier triage escalation + project validation + enriched context Addresses the triage-quality problems the user observed: - Candidates getting wrong project/product attribution - Stale facts promoted as if still true - "Hard to decide" items reaching human queue without real value Solution: let sonnet handle the easy 80%, escalate borderline cases to opus, auto-discard (or flag) what two models can't resolve. Plus enrich the context the triage model sees so it can catch misattribution, contradictions, and temporal drift earlier. THE 3-TIER FLOW (scripts/auto_triage.py): Tier 1: sonnet (fast, cheap) - confidence >= 0.8 + clear verdict → PROMOTE or REJECT (done) - otherwise → escalate to tier 2 Tier 2: opus (smarter, sees tier-1 verdict + reasoning) - second opinion with explicit "sonnet said X, resolve the uncertainty" - confidence >= 0.8 → PROMOTE or REJECT with note="[opus]" - still uncertain → tier 3 Tier 3: configurable (default discard) - ATOCORE_TRIAGE_TIER3=discard (default): auto-reject with "two models couldn't decide" reason - ATOCORE_TRIAGE_TIER3=human: leave in queue for /admin/triage Configuration via env vars: ATOCORE_TRIAGE_MODEL_TIER1 (default sonnet) ATOCORE_TRIAGE_MODEL_TIER2 (default opus) ATOCORE_TRIAGE_TIER3 (default discard) ATOCORE_TRIAGE_ESCALATION_THRESHOLD (default 0.75) ATOCORE_TRIAGE_TIER2_TIMEOUT_S (default 120 — opus is slower) ENRICHED CONTEXT shown to the triage model (both tiers): - List of registered project ids so misattribution is detectable - Trusted project state entries (ground truth, higher trust than memories) - Top 30 active memories for the claimed project (was 20) - Tier 2 additionally sees tier 1's verdict + reason PROJECT MISATTRIBUTION DETECTION: - Triage prompt asks the model to output "suggested_project" when it detects the claimed project is wrong but the content clearly belongs to a registered one - Main loop auto-applies the fix via PUT /memory/{id} (which canonicalizes through the registry) 
- Misattribution is the #1 pollution source — this catches it upstream TEMPORAL AGGRESSIVENESS: - Prompt upgraded: "be aggressive with valid_until for anything that reads like 'current state' or 'this week'. When in doubt, 2-4 week expiry rather than null." - Stale facts decay automatically via Phase 3's expiry filter CONFIDENCE GRADING (new in prompt): - 0.9+: crystal clear durable fact or clear noise - 0.75-0.9: confident but not cryptographic-certain - 0.6-0.75: borderline — WILL escalate - <0.6: genuinely ambiguous — human or discard Tests: 356 → 366 (10 new, all in test_triage_escalation.py): - High-confidence tier-1 promote/reject → no tier-2 call - Low-confidence tier-1 → tier-2 escalates → decides - needs_human always escalates regardless of confidence - tier-2 uncertain → discard by default - tier-2 uncertain → human when configured - dry-run skips all API calls - suggested_project flag surfaces + gets printed - parse_verdict captures suggested_project Runtime behavior unchanged for the clear cases (sonnet still handles them). The 20-30% of candidates that currently land as needs_human will now route through opus, and only the genuinely stuck get a human (or discard) action. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 09:09:58 -04:00
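The 3-tier flow in 3ca19724a5 can be sketched with the two model calls passed in as plain callables. This is a simplified illustration, not scripts/auto_triage.py: the `Verdict` dataclass, the `triage` function, and the callable signatures are assumptions, and the 0.8 cut-off follows the tier descriptions in the commit message rather than the separately listed ATOCORE_TRIAGE_ESCALATION_THRESHOLD default.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Verdict:
    decision: str      # "promote", "reject", or "needs_human"
    confidence: float


def triage(
    candidate: Any,
    tier1: Callable[[Any], Verdict],
    tier2: Callable[[Any, Verdict], Verdict],
    tier3_mode: str = "discard",   # mirrors ATOCORE_TRIAGE_TIER3: "discard" or "human"
    threshold: float = 0.8,
) -> tuple[str, str]:
    """Return (final decision, tier that made it) for one candidate."""
    first = tier1(candidate)
    # needs_human never counts as a clear verdict, so it always escalates.
    if first.decision in ("promote", "reject") and first.confidence >= threshold:
        return first.decision, "tier1"

    # Tier 2 sees the tier-1 verdict so it can resolve the uncertainty.
    second = tier2(candidate, first)
    if second.decision in ("promote", "reject") and second.confidence >= threshold:
        return second.decision, "tier2"

    # Tier 3: either leave it for the human queue or auto-reject.
    if tier3_mode == "human":
        return "needs_human", "tier3"
    return "reject", "tier3"
```

Passing the models in as callables keeps the routing testable without any API calls, which is in the spirit of the dry-run test listed in the commit.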
Anto01	3316ff99f9	feat: Phase 5F/5G/5H — graduation, conflicts, MCP engineering tools The population move + the safety net + the universal consumer hookup, all shipped together. This is where the engineering graph becomes genuinely useful against the real 262-memory corpus. 5F: Memory → Entity graduation (THE population move) - src/atocore/engineering/_graduation_prompt.py: stdlib-only shared prompt module mirroring _llm_prompt.py pattern (container + host use same system prompt, no drift) - scripts/graduate_memories.py: host-side batch driver that asks claude-p "does this memory describe a typed entity?" and creates entity candidates with source_refs pointing back to the memory - promote_entity() now scans source_refs for memory:* prefix; if found, flips source memory to status='graduated' with graduated_to_entity_id forward pointer + writes memory_audit row - GET /admin/graduation/stats exposes graduation rate for dashboard 5G: Sync conflict detection on entity promote - src/atocore/engineering/conflicts.py: detect_conflicts_for_entity() runs on every active promote. 
V1 checks 3 slot kinds narrowly to avoid false positives: * component.material (multiple USES_MATERIAL edges) * component.part_of (multiple PART_OF edges) * requirement.name (duplicate active Requirements in same project) - Conflicts + members persist via the tables built in 5A - Fires a "warning" alert via Phase 4 framework - Deduplicates: same (slot_kind, slot_key) won't get a new row - resolve_conflict(action="dismiss|supersede_others|no_action"): supersede_others marks non-winner members as status='superseded' - GET /admin/conflicts + POST /admin/conflicts/{id}/resolve 5H: MCP + context pack integration - scripts/atocore_mcp.py: 7 new engineering tools exposed to every MCP-aware client (Claude Desktop, Claude Code, Cursor, Zed): * atocore_engineering_map (Q-001/004 system tree) * atocore_engineering_gaps (Q-006/009/011 killer queries — THE director's question surfaced as a built-in tool) * atocore_engineering_requirements_for_component (Q-005) * atocore_engineering_decisions (Q-008) * atocore_engineering_changes (Q-013 — reads entity audit log) * atocore_engineering_impact (Q-016 BFS downstream) * atocore_engineering_evidence (Q-017 inbound provenance) - MCP tools total: 14 (7 memory/state/health + 7 engineering) - context/builder.py _build_engineering_context now appends a compact gaps summary ("Gaps: N orphan reqs, M risky decisions, K unsupported claims") so every project-scoped LLM call sees "what we're missing" Tests: 341 → 356 (15 new): - 5F: graduation prompt parses positive/negative decisions, rejects unknown entity types, tolerates markdown fences; promote_entity marks source memory graduated with forward pointer; entity without memory refs promotes cleanly - 5G: component.material + component.part_of + requirement.name conflicts detected; clean component triggers nothing; dedup works; supersede_others resolution marks losers; dismiss leaves both active; end-to-end promote triggers detection - 5H: graduation user message includes project + type + 
content No regressions across the 341 prior tests. The MCP server now answers "which p05 requirements aren't satisfied?" directly from any Claude session — no user prompt engineering, no context hacks. Next to kick off from user: run graduation script on Dalidou to populate the graph from 262 existing memories: ssh papa@dalidou 'cd /srv/storage/atocore/app && PYTHONPATH=src \ python3 scripts/graduate_memories.py --project p05-interferometer --limit 30 --dry-run' Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 07:53:03 -04:00
Anto01	53b71639ad	feat: Phase 5B-5D — 10 canonical engineering queries + triage UI The graph becomes useful. Before this commit, entities sat in the DB as data with no narrative. After: the director can ask "what am I forgetting?" and get a structured answer in milliseconds. New module (src/atocore/engineering/queries.py, 360 lines): Structure queries (Q-001/004/005/008/013): - system_map(project): full subsystem → component tree + orphans + materials joined per component - decisions_affecting(project, subsystem_id?): decisions linked via AFFECTED_BY_DECISION, scoped to a subsystem or whole project - requirements_for(component_id): Q-005 forward trace - recent_changes(project, since, limit): Q-013 via memory_audit join (reuses the Phase 4 audit infrastructure — entity_kind='entity') The 3 killer queries (the real value): - orphan_requirements(project): requirements with NO inbound SATISFIES edge. "What do I claim the system must do that nothing actually claims to handle?" Q-006. - risky_decisions(project): decisions whose BASED_ON_ASSUMPTION edge points to an assumption with status in ('superseded','invalid') OR properties.flagged=True. Finds cascading risk from shaky premises. Q-009. - unsupported_claims(project): ValidationClaim entities with no inbound SUPPORTS edge — asserted but no Result to back them. Q-011. - all_gaps(project): runs all three in one call for dashboards. History + impact (Q-016/017): - impact_analysis(entity_id, max_depth=3): BFS over outbound edges. "What's downstream of this if I change it?" - evidence_chain(entity_id): inbound SUPPORTS/EVIDENCED_BY/DESCRIBED_BY/ VALIDATED_BY/ANALYZED_BY. "How do I know this is true?" 
API (src/atocore/api/routes.py) exposes 10 endpoints: - GET /engineering/projects/{p}/systems - GET /engineering/decisions?project=&subsystem= - GET /engineering/components/{id}/requirements - GET /engineering/changes?project=&since=&limit= - GET /engineering/gaps/orphan-requirements?project= - GET /engineering/gaps/risky-decisions?project= - GET /engineering/gaps/unsupported-claims?project= - GET /engineering/gaps?project= (combined) - GET /engineering/impact?entity=&max_depth= - GET /engineering/evidence?entity= Mirror integration (src/atocore/engineering/mirror.py): - New _gaps_section() renders at top of every project page - If any gap non-empty: shows up-to-10 per category with names + context - Clean project: "✅ No gaps detected" — signals everything is traced Triage UI (src/atocore/engineering/triage_ui.py): - /admin/triage now shows BOTH memory candidates AND entity candidates - Entity cards: name, type, project, confidence, source provenance, Promote/Reject buttons, link to wiki entity page - Entity promote/reject via fetch to /entities/{id}/promote|reject - One triage UI for the whole pipeline — consistent muscle memory Tests: 326 → 341 (15 new, all in test_engineering_queries.py): - System map structure + orphan detection + material joins - Killer queries: positive + negative cases (empty when clean) - Decisions query: project-wide and subsystem-scoped - Impact analysis walks outbound BFS - Evidence chain walks inbound provenance No regressions. All 10 daily queries from the plan are now live and answering real questions against the graph. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 07:18:46 -04:00
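The orphan-requirement check and the outbound BFS described in commit 53b71639ad can be sketched roughly as follows. This is a minimal illustration over a hypothetical in-memory edge list, not the contents of queries.py: the sample ids, the tuple layout, and the function signatures are assumptions made for the example, and the real module reads from the entities and relationships tables.

```python
from collections import deque

# Hypothetical sample graph: (source_id, relationship_type, target_id).
EDGES = [
    ("comp-mount", "satisfies", "req-stiffness"),
    ("comp-mount", "depends_on", "mat-invar"),
    ("mat-invar", "depends_on", "vendor-x"),
]
REQUIREMENTS = ["req-stiffness", "req-thermal"]


def orphan_requirements(requirements, edges):
    """Return requirements that have no inbound 'satisfies' edge."""
    satisfied = {dst for _, rel, dst in edges if rel == "satisfies"}
    return [req for req in requirements if req not in satisfied]


def impact_analysis(entity_id, edges, max_depth=3):
    """Breadth-first walk over outbound edges, at most max_depth hops.

    Returns (entity_id, depth) pairs in the order they are discovered.
    """
    outbound = {}
    for src, _, dst in edges:
        outbound.setdefault(src, []).append(dst)

    seen = {entity_id}
    queue = deque([(entity_id, 0)])
    reached = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for nxt in outbound.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                reached.append((nxt, depth + 1))
                queue.append((nxt, depth + 1))
    return reached
```

With the sample data, `req-thermal` is the only orphan, and changing `comp-mount` reaches `vendor-x` at depth 2 through `mat-invar`.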
Anto01	07664bd743	feat: Phase 5A — Engineering V1 foundation First slice of the Engineering V1 sprint. Lays the schema + lifecycle plumbing so the 10 canonical queries, memory graduation, and conflict detection can land cleanly on top. Schema (src/atocore/models/database.py): - conflicts + conflict_members tables per conflict-model.md (with 5 indexes on status/project/slot/members) - memory_audit.entity_kind discriminator — same audit table serves both memories ("memory") and entities ("entity"); unified history without duplicating infrastructure - memories.graduated_to_entity_id forward pointer for graduated memories (M → E transition preserves the memory as historical pointer) Memory (src/atocore/memory/service.py): - MEMORY_STATUSES gains "graduated" — memory-entity graduation flow ready to wire in Phase 5F Engineering service (src/atocore/engineering/service.py): - RELATIONSHIP_TYPES organized into 4 families per ontology-v1.md: + Structural: contains, part_of, interfaces_with + Intent: satisfies, constrained_by, affected_by_decision, based_on_assumption (new), supersedes + Validation: analyzed_by, validated_by, supports (new), conflicts_with (new), depends_on + Provenance: described_by, updated_by_session (new), evidenced_by (new), summarized_in (new) - create_entity + create_relationship now call resolve_project_name() on write (canonicalization contract per doc) - Both accept actor= parameter for audit provenance - _audit_entity() helper uses shared memory_audit table with entity_kind="entity" — one observability layer for everything - promote_entity / reject_entity_candidate / supersede_entity — mirror the memory lifecycle exactly (same pattern, same naming) - get_entity_audit() reads from the shared table filtered by entity_kind API (src/atocore/api/routes.py): - POST /entities/{id}/promote (candidate → active) - POST /entities/{id}/reject (candidate → invalid) - GET /entities/{id}/audit (full history for one entity) - POST /entities passes 
actor="api-http" through Tests: 317 → 326 (9 new): - test_entity_project_canonicalization (p04 → p04-gigabit) - test_promote_entity_candidate_to_active - test_reject_entity_candidate - test_promote_active_entity_noop (only candidates promote) - test_entity_audit_log_captures_lifecycle (before/after snapshots) - test_new_relationship_types_available (6 new types present) - test_conflicts_tables_exist - test_memory_audit_has_entity_kind - test_graduated_status_accepted What's next (5B-5I, deferred): entity triage UI tab, core structure queries, the 3 killer queries, memory graduation script, conflict detection, MCP + context pack integration. See plan file. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 07:01:28 -04:00
Anto01	88f2f7c4e1	feat: Phase 4 V1 — Robustness Hardening Adds the observability + safety layer that turns AtoCore from "works until something silently breaks" into "every mutation is traceable, drift is detected, failures raise alerts." 1. Audit log (memory_audit table): - New table with id, memory_id, action, actor, before/after JSON, note, timestamp; 3 indexes for memory_id/timestamp/action - _audit_memory() helper called from every mutation: create_memory, update_memory, promote_memory, reject_candidate_memory, invalidate_memory, supersede_memory, reinforce_memory, auto_promote_reinforced, expire_stale_candidates - Action verb auto-selected: promoted/rejected/invalidated/superseded/updated based on state transition - "actor" threaded through: api-http, human-triage, phase10-auto-promote, candidate-expiry, reinforcement, etc. - Fail-open: audit write failure logs but never breaks the mutation - GET /memory/{id}/audit: full history for one memory - GET /admin/audit/recent: last 50 mutations across the system 2. Alerts framework (src/atocore/observability/alerts.py): - emit_alert(severity, title, message, context) fans out to: - structlog logger (always) - ~/atocore-logs/alerts.log append (configurable via ATOCORE_ALERT_LOG) - project_state atocore/alert/last_{severity} (dashboard surface) - ATOCORE_ALERT_WEBHOOK POST if set (auto-detects Discord webhook format for nice embeds; generic JSON otherwise) - Every sink fail-open — one failure doesn't prevent the others - Pipeline alert step in nightly cron: harness < 85% → warning; candidate queue > 200 → warning 3.
Integrity checks (scripts/integrity_check.py): - Nightly scan for drift: - Memories → missing source_chunk_id references - Duplicate active memories (same type+content+project) - project_state → missing projects - Orphaned source_chunks (no parent document) - Results persisted to atocore/status/integrity_check_result - Any finding emits a warning alert - Added as Step G in deploy/dalidou/batch-extract.sh nightly cron 4. Dashboard surfaces it all: - integrity (findings + details) - alerts (last info/warning/critical per severity) - recent_audit (last 10 mutations with actor + action + preview) Tests: 308 → 317 (9 new): - test_audit_create_logs_entry - test_audit_promote_logs_entry - test_audit_reject_logs_entry - test_audit_update_captures_before_after - test_audit_reinforce_logs_entry - test_recent_audit_returns_cross_memory_entries - test_emit_alert_writes_log_file - test_emit_alert_invalid_severity_falls_back_to_info - test_emit_alert_fails_open_on_log_write_error Deferred: formal migration framework with rollback (current additive pattern is fine for V1); memory detail wiki page with audit view (quick follow-up). To enable Discord alerts: set ATOCORE_ALERT_WEBHOOK to a Discord webhook URL in Dalidou's environment. Default = log-only. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 21:54:10 -04:00
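The fail-open fan-out that commit 88f2f7c4e1 describes for alerts can be illustrated with a small sketch. It is not the code in alerts.py: the function name `emit_alert_sketch`, the `sinks` parameter, and the use of the standard `logging` module (instead of structlog) are assumptions chosen to keep the example self-contained. The fallback to `info` for an unknown severity follows the test name listed in the commit.

```python
import logging

logger = logging.getLogger("alerts-sketch")  # hypothetical logger name
VALID_SEVERITIES = ("info", "warning", "critical")


def emit_alert_sketch(severity, title, message, sinks):
    """Send one alert to every sink; a failing sink never blocks the others.

    `sinks` is a list of (name, callable) pairs, a stand-in for the log file,
    project-state, and webhook sinks named in the commit message.
    Returns the names of the sinks that accepted the alert.
    """
    if severity not in VALID_SEVERITIES:
        severity = "info"  # invalid severity falls back to info

    delivered = []
    for name, sink in sinks:
        try:
            sink(severity, title, message)
            delivered.append(name)
        except Exception:
            # Fail-open: record the failure and keep going.
            logger.exception("alert sink %s failed", name)
    return delivered
```

Catching a broad `Exception` per sink is deliberate here: the point of the pattern is that an offline webhook or an unwritable log file should degrade the alert, not suppress it.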
Anto01	bfa7dba4de	feat: Phase 3 V1 — Auto-Organization (domain_tags + valid_until) Adds structural metadata that the LLM triage was already implicitly reasoning about ("stale snapshot" → reject). Phase 3 captures that reasoning as fields so it can DRIVE retrieval, not just rejection. Schema (src/atocore/models/database.py): - domain_tags TEXT DEFAULT '[]' JSON array of lowercase topic keywords - valid_until DATETIME ISO date; null = permanent - idx_memories_valid_until index for efficient expiry queries Memory service (src/atocore/memory/service.py): - Memory dataclass gains domain_tags + valid_until - create_memory, update_memory accept/persist both - _row_to_memory safely reads both (JSON-decode + null handling) - _normalize_tags helper: lowercase, dedup, strip, cap at 10 - get_memories_for_context filters expired (valid_until < today UTC) - _rank_memories_for_query adds tag-boost: memories whose domain_tags appear as substrings in query text rank higher (tertiary key after content-overlap density + absolute overlap, before confidence) LLM extractor (_llm_prompt.py → llm-0.5.0): - SYSTEM_PROMPT documents domain_tags (2-5 keywords) + valid_until (time-bounded facts get expiry dates; durable facts stay null) - normalize_candidate_item parses both fields from model output with graceful fallback for string/null/missing LLM triage (scripts/auto_triage.py): - TRIAGE_SYSTEM_PROMPT documents same two fields - parse_verdict extracts them from verdict JSON - On promote: PUT /memory/{id} with tags + valid_until BEFORE POST /memory/{id}/promote, so active memories carry them API (src/atocore/api/routes.py): - MemoryCreateRequest: adds domain_tags, valid_until - MemoryUpdateRequest: adds domain_tags, valid_until, memory_type - GET /memory response exposes domain_tags + valid_until + created_at Triage UI (src/atocore/engineering/triage_ui.py): - Renders existing tags as colored badges - Adds inline text field for tags (comma-separated) + date picker for valid_until on every 
candidate card - Save&Promote button persists edits via PUT then promotes - Plain Promote (and Y shortcut) also saves tags/expiry if edited Wiki (src/atocore/engineering/wiki.py): - Search now matches memory content OR domain_tags - Search results render tags as clickable badges linking to /wiki/search?q=<tag> for cross-project navigation - valid_until shown as amber "valid until YYYY-MM-DD" hint Tests: 303 → 308 (5 new for Phase 3 behavior): - test_create_memory_with_tags_and_valid_until - test_create_memory_normalizes_tags - test_update_memory_sets_tags_and_valid_until - test_get_memories_for_context_excludes_expired - test_context_builder_tag_boost_orders_results Deferred (explicitly): temporal_scope enum, source_refs memory graph, HDBSCAN clustering, memory detail wiki page, backfill of existing actives. See docs/MASTER-BRAIN-PLAN.md. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 21:37:01 -04:00
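The tag normalization and expiry filter listed in commit bfa7dba4de (lowercase, dedup, strip, cap at 10; drop memories whose valid_until is before today in UTC) might look something like the sketch below. The function names and the assumption that valid_until is stored as a `YYYY-MM-DD` string are illustrative; the actual helpers in memory/service.py may differ.

```python
from datetime import date, datetime, timezone


def normalize_tags(tags, cap=10):
    """Lowercase, strip, de-duplicate (keeping first occurrence), cap the list."""
    cleaned = []
    for tag in tags:
        value = str(tag).strip().lower()
        if value and value not in cleaned:
            cleaned.append(value)
    return cleaned[:cap]


def is_expired(valid_until, today=None):
    """True when valid_until (assumed 'YYYY-MM-DD', or None) is before today UTC.

    None means the memory is permanent. A memory is still valid on the
    valid_until date itself, matching the 'valid_until < today' rule.
    """
    if valid_until is None:
        return False
    if today is None:
        today = datetime.now(timezone.utc).date()
    return date.fromisoformat(valid_until) < today
```

Passing `today` explicitly keeps the filter deterministic in tests; production code would normally rely on the UTC default.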
Anto01	775960c8c8	feat: "Make It Actually Useful" sprint — observability + Phase 10 Pipeline observability: - Retrieval harness runs nightly (Step E in batch-extract.sh) - Pipeline summary persisted to project state after each run (pipeline_last_run, pipeline_summary, retrieval_harness_result) - Dashboard enhanced: interaction total + by_client, pipeline health (last_run, hours_since, harness results, triage stats), dynamic project list from registry Phase 10 — reinforcement-based auto-promotion: - auto_promote_reinforced(): candidates with reference_count >= 3 and confidence >= 0.7 auto-graduate to active - expire_stale_candidates(): candidates unreinforced for 14+ days auto-rejected to prevent unbounded queue growth - Both wired into nightly cron (Step B2) - Batch script: scripts/auto_promote_reinforced.py (--dry-run support) Knowledge seeding: - scripts/seed_project_state.py: 26 curated Trusted Project State entries across p04-gigabit, p05-interferometer, p06-polisher, atomizer-v2, abb-space, atocore (decisions, requirements, facts, contacts, milestones) Tests: 299 → 303 (4 new Phase 10 tests) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 13:59:12 -04:00
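The two Phase 10 rules in commit 775960c8c8 (auto-promote at reference_count >= 3 and confidence >= 0.7; auto-reject after 14 or more days without reinforcement) can be expressed as a single decision function. This is a sketch under assumptions: the `Candidate` dataclass, the `last_reinforced` field, and the choice to check promotion before expiry are hypothetical, since the commit message does not state how staleness is measured or in which order the two steps run.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Candidate:
    id: str
    reference_count: int
    confidence: float
    last_reinforced: datetime  # hypothetical field; could equally be created_at


def nightly_decision(candidate, now):
    """Return 'promote', 'reject', or 'keep' for one candidate memory."""
    if candidate.reference_count >= 3 and candidate.confidence >= 0.7:
        return "promote"  # enough reinforcement and confidence to become active
    if now - candidate.last_reinforced >= timedelta(days=14):
        return "reject"  # stale: unreinforced for 14+ days
    return "keep"
```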
Anto01	c2e7064238	fix(extraction): R11 container 503 + R12 shared prompt module R11: POST /admin/extract-batch with mode=llm now returns 503 when the claude CLI is unavailable (was silently returning success with 0 candidates), with a message pointing at the host-side script. +2 tests. R12: extracted SYSTEM_PROMPT + parse_llm_json_array + normalize_candidate_item + build_user_message into stdlib-only src/atocore/memory/_llm_prompt.py. Both the container extractor and scripts/batch_llm_extract_live.py now import from it, eliminating the prompt/parser drift risk. Tests 297 -> 299. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 10:47:01 -04:00
Anto01	c57617f611	feat: auto-project-detection + project stages Three changes: 1. ABB-Space registered as a lead project with stage=lead in Trusted Project State. Projects now have lifecycle awareness (lead/proposition vs active contract vs completed). 2. Extraction no longer drops unregistered project tags. When the LLM extractor sees a conversation about a project not in the registry, it keeps the model's tag on the candidate instead of falling back to empty. This enables auto-detection of new projects/leads from organic conversations. The nightly pipeline surfaces these candidates for triage, where the operator sees "hey, there's a new project called X" and can decide whether to register it. 3. Extraction prompt updated to tell the model: "If the conversation discusses a project NOT in the known list, still tag it — the system will auto-detect it." This removes the artificial ceiling that prevented new project discovery. Updated Case D test: unregistered + unscoped now keeps the model's tag instead of dropping to empty. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 17:16:04 -04:00
Anto01	dc20033a93	feat: Engineering Knowledge Layer V1 — entities + relationships Layer 2 of the AtoCore architecture. Adds typed engineering entities with relationships on top of the flat memory/state/chunk substrate. Schema: - entities table: id, entity_type, name, project, description, properties (JSON), status, confidence, source_refs, timestamps - relationships table: source_entity_id, target_entity_id, relationship_type, confidence, source_refs 15 entity types: project, system, subsystem, component, interface, requirement, constraint, decision, material, parameter, analysis_model, result, validation_claim, vendor, process 12 relationship types: contains, part_of, interfaces_with, satisfies, constrained_by, affected_by_decision, analyzed_by, validated_by, depends_on, uses_material, described_by, supersedes Service layer: full CRUD + get_entity_with_context (returns an entity with its relationships and all related entities in one call). API endpoints: - POST /entities — create entity - GET /entities — list/filter by type, project, status, name - GET /entities/{id} — entity + relationships + related entities - POST /relationships — create relationship Schema auto-initialized on app startup via init_engineering_schema(). 7 tests covering entity CRUD, relationships, context traversal, filtering, name search, and validation. Test count: 290 -> 297. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 09:50:58 -04:00
Anto01	e5e9a9931e	fix(R9): trust hierarchy for project attribution Batch 3, Days 1-3. The core R9 failure was Case F: when the model returned a registered project DIFFERENT from the interaction's known scope, the old code trusted the model because the project was registered. A p06-polisher interaction could silently produce a p04-gigabit candidate. New rule (trust hierarchy): 1. Interaction scope always wins when set (cases A, C, E, F) 2. Model project used only for unscoped interactions AND only when it resolves to a registered project (cases D, G) 3. Empty string when both are empty or unregistered (case B) The rule is: interaction.project is the strongest signal because it comes from the capture hook's project detection, which runs before the LLM ever sees the content. The model's project guess is only useful when the capture hook had no project context. 7 case tests (A-G) cover every combination of model/interaction project state. Pre-existing tests updated for the new behavior. Host-side script mirrors the same hierarchy using _known_projects fetched from GET /projects at startup. Test count: 286 -> 290 (+4 net, 7 new R9 cases, 3 old tests consolidated). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 15:37:29 -04:00
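The three-rule trust hierarchy from commit e5e9a9931e reduces to a short function. The sketch below is an illustration, not the extractor's code: `resolve_project` and its parameters are hypothetical names, and `registered` stands in for whatever the registry lookup returns. It reflects the rule as of this commit; commit c57617f611 later changed the unscoped, unregistered case to keep the model's tag.

```python
def resolve_project(interaction_project, model_project, registered):
    """Pick the project for a candidate memory.

    1. The interaction's scope always wins when it is set.
    2. Otherwise the model's project is used only if it is registered.
    3. Otherwise the candidate is left unscoped (empty string).
    """
    if interaction_project:
        return interaction_project
    if model_project and model_project in registered:
        return model_project
    return ""
```

The Case F failure from the commit message corresponds to `resolve_project("p06-polisher", "p04-gigabit", registered)`, which returns `"p06-polisher"` under this rule.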
Anto01	69c971708a	feat: Day 4+5 — R7/R9 fixes + integration tests (R8) Day 4: - R7 fixed: overlap-density ranking. p06-firmware-interface now passes (was the last memory-ranking failure). Harness 16/18→17/18. - R9 fixed: LLM extractor checks project registry before trusting model-supplied project. Hallucinated projects fall back to interaction's known scope. Registry lookup via load_project_registry(), matched by project_id. Host-side script mirrors this via GET /projects at startup. Day 5: - R8 addressed: 5 integration tests in test_extraction_pipeline.py covering the full LLM extract → persist as candidate → promote/reject flow, project fallback, failure handling, and dedup behavior. Uses mocked subprocess to avoid real claude -p calls. Harness: 17/18 (only p06-tailscale remains — chunk bleed from source content, not a memory/ranking issue). Tests: 280 → 286 (+6). Batch complete. Before/after for this batch: R1: fixed (extraction pipeline operational on Dalidou) R5: fixed (batch endpoint + host-side script) R7: fixed (overlap-density ranking) R9: fixed (project trust-preservation via registry check) R8: addressed (5 integration tests) Harness: 16/18 → 17/18 Active memories: 36 → 41 Nightly pipeline: backup → cleanup → rsync → extract → auto-triage Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 14:44:02 -04:00
Anto01	8951c624fe	fix(R7/R9): overlap-density ranking + project trust-preservation R7: ranking scorer now uses overlap-density (overlap_count / memory_token_count) as primary key instead of raw overlap count. A 5-token memory with 3 overlapping tokens (density 0.6) now beats a 40-token overview memory with 3 overlapping tokens (density 0.075) at the same absolute count. Secondary: absolute overlap. Tertiary: confidence. Targeting p06-firmware-interface harness fixture. R9: when the LLM extractor returns a project that differs from the interaction's known project, it now checks the project registry. If the model's project is a registered canonical ID, trust it. If not (hallucinated name), fall back to the interaction's project. Uses load_project_registry() for the check. The host-side script mirrors this via an API call to GET /projects at startup. Two new tests: test_parser_keeps_registered_model_project and test_parser_rejects_hallucinated_project. Test count: 280 -> 281. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 14:34:33 -04:00
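The overlap-density ordering in commit 8951c624fe (primary key overlap_count / memory_token_count, then absolute overlap, then confidence) can be sketched as a sort key. The function name, the use of token lists, and counting overlap over distinct tokens are assumptions for the example; the real scorer works on stemmed tokens produced elsewhere.

```python
def rank_key(memory_tokens, query_tokens, confidence):
    """Sort key: (overlap density, absolute overlap, confidence).

    Sort descending so the densest match comes first.
    """
    overlap = len(set(memory_tokens) & set(query_tokens))
    density = overlap / len(memory_tokens) if memory_tokens else 0.0
    return (density, overlap, confidence)


# The example from the commit message: same absolute overlap, different length.
query = ["firmware", "interface", "protocol"]
short_memory = ["firmware", "interface", "protocol", "uses", "uart"]  # 5 tokens
long_memory = query + ["filler%d" % i for i in range(37)]  # 40 tokens

ranked = sorted(
    [("long", long_memory, 0.9), ("short", short_memory, 0.5)],
    key=lambda item: rank_key(item[1], query, item[2]),
    reverse=True,
)
```

Here the short memory scores 3/5 = 0.6 and the long one 3/40 = 0.075, so the short memory ranks first even though its confidence is lower.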
Anto01	39d73e91b4	fix(R6): fall back to interaction.project when LLM returns empty Codex R6: the LLM extractor accepted the model's project field verbatim. When the model returned empty string, clearly p06 memories got promoted as project='', making them invisible to the p06 project-memory band and explaining the p06-offline-design harness failure. Fix: if model returns empty project but interaction.project is set, inherit the interaction's project. Model-supplied project still takes precedence when non-empty. Two new tests lock the fallback and precedence behaviors. R5 acknowledged (LLM extractor not yet wired into API — next task). Test count: 278 -> 280. Harness re-run pending after deploy. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 07:37:14 -04:00
Anto01	a29b5e22f2	feat(eval-loop): Day 4 — LLM extractor via claude -p (OAuth, no API key) Second pass on the LLM-assisted extractor after Antoine's explicit rule: no API key, ever. Refactored src/atocore/memory/extractor_llm.py to shell out to the Claude Code 'claude -p' CLI via subprocess instead of the anthropic SDK, so extraction reuses the user's existing Claude.ai OAuth credentials and needs zero secret management. Implementation: - subprocess.run(["claude", "-p", "--model", "haiku", "--append-system-prompt", <instructions>, "--no-session-persistence", "--disable-slash-commands", user_message], ...) - cwd is a cached tempfile.mkdtemp() so every invocation starts with a clean context instead of auto-discovering CLAUDE.md / AGENTS.md / DEV-LEDGER.md from the repo root. We cannot use --bare because it forces API-key auth, which defeats the purpose; the temp-cwd trick is the lightest way to keep OAuth auth while skipping project context loading. - Silent-failure contract unchanged: missing CLI, non-zero exit, timeout, malformed JSON — all return [] and log an error. The capture audit trail must not break on an optional side effect. - Default timeout bumped from 20s to 90s: Haiku + Node.js startup + OAuth check is ~20-40s per call in practice, plus real responses up to 8KB take longer. 45s hit 2 timeouts on the first live run. - tests/test_extractor_llm.py refactored: the API-key / anthropic SDK tests are replaced by subprocess-mocking tests covering missing CLI, timeout, non-zero exit, and a happy-path stdout parse. 14 tests, all green. scripts/extractor_eval.py: - New --output <path> flag writes the JSON result directly to a file, bypassing stdout/log interleaving (structlog sends INFO to stdout via PrintLoggerFactory, so a naive '> out.json' pollutes the file). - Forces UTF-8 on stdout so real LLM output with em-dashes / arrows / CJK doesn't crash the human report on Windows cp1252 consoles. 
First live baseline run against the 20-interaction labeled corpus (scripts/eval_data/extractor_llm_baseline_2026-04-11.json): mode=llm labeled=20 recall=1.0 precision=0.357 yield_rate=2.55 total_actual_candidates=51 total_expected_candidates=7 false_negative_interactions=0 false_positive_interactions=9 Recall 0% -> 100% vs rule baseline — every human-labeled positive is caught. Precision reads low (0.357) but inspection shows the "false positives" are real candidates the human labels under-counted. For example interaction a6b0d279 was labeled at 2 expected candidates, the model caught all 6 polisher architectural facts; interaction 52c8c0f3 was labeled at 1, the model caught all 5 infra commitments. The labels are the bottleneck, not the model. Day 4 gate against Codex's criteria: - candidate yield: 255% vs ≥15-25% target - FP rate tolerable for manual triage: 51 candidates reviewable in ~10 minutes via the triage CLI - ≥2 real non-synthetic candidates worth review: 20+ obvious wins (polisher architecture set, p05 infra set, DEV-LEDGER protocol set) Gate cleared. LLM-assisted extraction is the path forward for conversational captures. Rule-based extractor stays as-is for structured-cue inputs and remains the default mode. The next step (Day 5 stabilize / document) will wire LLM mode behind a flag in the public extraction endpoint and document scope. Test count: 276 -> 278 passing. No existing tests changed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 17:45:24 -04:00
Anto01	b309e7fd49	feat(eval-loop): Day 4 — LLM-assisted extractor path (additive, flagged) Day 2 baseline showed 0% recall for the rule-based extractor across 5 distinct miss classes. Day 4 decision gate: prototype an LLM-assisted mode behind a flag. Option A ratified by Antoine. New module src/atocore/memory/extractor_llm.py: - extract_candidates_llm(interaction) returns the same MemoryCandidate dataclass the rule extractor produces, so both paths flow through the existing triage / candidate pipeline unchanged. - extract_candidates_llm_verbose() also returns the raw model output and any error string, for eval and debugging. - Uses Claude Haiku 4.5 by default; model overridable via ATOCORE_LLM_EXTRACTOR_MODEL env. Timeout via ATOCORE_LLM_EXTRACTOR_TIMEOUT_S (default 20s). - Silent-failure contract: missing API key, unreachable model, malformed JSON — all return [] and log an error. Never raises into the caller. The capture audit trail must not break on an optional side effect. - Parser tolerates markdown fences, surrounding prose, invalid memory types, clamps confidence to [0,1], drops empty content. - System prompt explicitly tells the model to return [] for most conversational turns (durable-fact bar, not "extract everything"). - Trust rules unchanged: candidates are never auto-promoted, extraction stays off the capture hot path, human triages via the existing CLI. scripts/extractor_eval.py: new --mode {rule,llm} flag so the same labeled corpus can be scored against both extractors. Default remains rule so existing invocations are unchanged. tests/test_extractor_llm.py: 12 new unit tests covering the parser (empty array, malformed JSON, markdown fences, surrounding prose, invalid types, empty content, confidence clamping, version tagging), plus contract tests for missing API key, empty response, and a mocked api_error path so failure modes never raise. Test count: 264 -> 276 passing. No existing tests changed. 
Next step: run `python scripts/extractor_eval.py --mode llm` against the labeled set with ANTHROPIC_API_KEY in env, record the delta, decide whether to wire LLM mode into the API endpoint and CLI or keep it script-only for now. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 15:18:30 -04:00
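The tolerant parsing behaviour listed in commit b309e7fd49 (markdown fences, surrounding prose, confidence clamped to [0,1], empty content dropped, malformed JSON returning an empty list) can be approximated as below. This is a sketch, not the parser in extractor_llm.py: the function name, the bracket-slicing approach, and the default confidence of 0.5 are assumptions, and handling of invalid memory types is left out because the commit does not say whether such items are dropped or coerced.

```python
import json


def parse_candidates(raw):
    """Extract a JSON array of candidates from noisy model output.

    Never raises: any parsing problem yields an empty list.
    """
    # Slice from the first '[' to the last ']' to skip fences and prose.
    start, end = raw.find("["), raw.rfind("]")
    if start == -1 or end <= start:
        return []
    try:
        items = json.loads(raw[start : end + 1])
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []

    candidates = []
    for item in items:
        if not isinstance(item, dict):
            continue
        content = str(item.get("content", "")).strip()
        if not content:
            continue  # drop empty content
        try:
            confidence = float(item.get("confidence", 0.5))  # assumed default
        except (TypeError, ValueError):
            confidence = 0.5
        confidence = min(1.0, max(0.0, confidence))  # clamp to [0, 1]
        candidates.append({"content": content, "confidence": confidence})
    return candidates
```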
Anto01	5aeeb1cad1	feat: query-relevance ordering for memory selection get_memories_for_context now accepts an optional query string. When provided, candidate memories are reranked by lexical overlap with the query (stemmed token intersection, ties broken by confidence) before the budget walk. Without a query the order is unchanged — effectively "by confidence desc" as before — so non-builder callers see no behaviour change. The fetch limit is raised from 10 to 30 so there's a real pool to rerank. Token overlap reuses _normalize/_tokenize from reinforcement.py so ranking and reinforcement matching share the same notion of distinctive terms. build_context passes the user_prompt through to both the identity/preference and project-memory calls. The retrieval harness regression the fix is targeting: - p05-vendor-signal FAIL @ `1161645`: "Zygo" missing from the pack even though an active vendor memory contained it. Root cause: higher-confidence p05 memories filled the 25% budget slice before the vendor memory ever got a chance. Query-aware ordering puts the vendor memory first when the query is about vendors. New regression test test_project_memories_query_relevance_ordering locks the behaviour in with two p05 memories and a tight budget. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 12:47:05 -04:00
Anto01	8ea53f4003	feat: fold project-scoped memories into context pack The retrieval-quality review on 2026-04-11 found that active project/knowledge/episodic memories never reached the pack: only Trusted Project State and identity/preference memories were being assembled. Reinforcement bumped confidence on memories that had no retrieval outlet, so the reflection loop was half-open. This change adds a third memory tier between identity/preference and retrieved chunks: - PROJECT_MEMORY_BUDGET_RATIO = 0.15 - Memory types: project, knowledge, episodic - Only populated when a canonical project is in scope — without a project hint, project memories stay out (cross-project bleed would rot the signal) - Rendered under a dedicated "--- Project Memories ---" header so the LLM can distinguish it from the identity/preference band - Trim order in _trim_context_to_budget: retrieval → project memories → identity/preference → project state (most recently added tier drops first when budget is tight) get_memories_for_context gains header/footer kwargs so the two memory blocks can be distinguished in a single pack without a second helper. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 11:35:40 -04:00
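The trim order stated in commit 8ea53f4003 (retrieval, then project memories, then identity/preference, then project state) can be illustrated with a small budget walk. The sketch assumes a character-based budget and truncation from the end of each section; the section keys and the function name are hypothetical, and the real _trim_context_to_budget may trim differently.

```python
# Sections are trimmed in this order; project state is the last to be cut.
TRIM_ORDER = ["retrieval", "project_memories", "identity_preference", "project_state"]


def trim_to_budget(sections, budget):
    """Return a copy of `sections` whose total length fits within `budget`."""
    trimmed = dict(sections)
    total = sum(len(text) for text in trimmed.values())
    for name in TRIM_ORDER:
        excess = total - budget
        if excess <= 0:
            break
        text = trimmed.get(name, "")
        cut = min(excess, len(text))
        trimmed[name] = text[: len(text) - cut]
        total -= cut
    return trimmed
```

With four 10-character sections and a budget of 25, retrieval is emptied and project memories lose half their text, while identity/preference and project state are untouched.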
Anto01	9366ba7879	feat: length-aware reinforcement + batch triage CLI + off-host backup - Reinforcement matcher now handles paragraph-length memories via a dual-mode threshold: short memories keep the 70% overlap rule, long memories (>15 stems) require 12 absolute overlaps AND 35% fraction so organic paraphrase can still reinforce. Diagnosis: every active memory stayed at reference_count=0 because 40-token project summaries never hit 70% overlap on real responses. - scripts/atocore_client.py gains batch-extract (fan out /interactions/{id}/extract over recent interactions) and triage (interactive promote/reject walker for the candidate queue), matching the Phase 9 reflection-loop review flow without pulling extraction into the capture hot path. - deploy/dalidou/cron-backup.sh adds an optional off-host rsync step gated on ATOCORE_BACKUP_RSYNC, fail-open when the target is offline so a laptop being off at 03:00 UTC never reds the local backup. - docs/next-steps.md records the retrieval-quality sweep: project state surfaces, chunks are on-topic but broad, active memories never reach the pack (reflection loop has no retrieval outlet yet). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 11:20:03 -04:00
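The dual-mode threshold from commit 9366ba7879 can be written as one predicate: short memories keep the 70% overlap rule, while memories with more than 15 stems need at least 12 overlapping stems and a 35% overlap fraction. The function name and the set-based inputs are assumptions for this sketch; stemming and stop-word removal are taken as already done.

```python
def memory_reinforced(memory_stems, response_stems):
    """Decide whether a response reinforces a memory, given stemmed tokens."""
    memory_stems = set(memory_stems)
    if not memory_stems:
        return False
    overlap = len(memory_stems & set(response_stems))
    fraction = overlap / len(memory_stems)
    if len(memory_stems) > 15:
        # Long memory: absolute floor AND a lower fractional floor.
        return overlap >= 12 and fraction >= 0.35
    # Short memory: original 70% rule.
    return fraction >= 0.70
```

For a 40-stem project summary, 16 overlapping stems (40%) now count as reinforcement, whereas the old rule would have required 28.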
Anto01	c5bad996a7	feat: enable reinforcement on live capture The Stop hook now sends reinforce=true so the token-overlap matcher runs on every captured interaction. Memory confidence will accumulate signal from organic Claude Code use. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-11 10:58:56 -04:00
Anto01	58c744fd2f	feat: post-backup validation + retention cleanup (Tasks B & C) - create_runtime_backup() now auto-validates its output and includes validated/validation_errors fields in returned metadata - New cleanup_old_backups() with retention policy: 7 daily, 4 weekly (Sundays), 6 monthly (1st of month), dry-run by default - CLI `cleanup` subcommand added to backup module - 9 new tests (2 validation + 7 retention), 259 total passing Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-11 09:46:46 -04:00
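The retention policy in commit 58c744fd2f (7 daily, 4 weekly on Sundays, 6 monthly on the 1st) can be sketched as a function that selects which backup dates to keep. Interpreting "7 daily" as the seven most recent backup dates is an assumption, as are the function name and the use of plain `date` objects rather than backup stamps.

```python
from datetime import date, timedelta


def backups_to_keep(stamps, daily=7, weekly=4, monthly=6):
    """Return the set of backup dates retained by the policy."""
    ordered = sorted(set(stamps), reverse=True)  # newest first
    keep = set(ordered[:daily])
    # date.weekday() returns 6 for Sunday.
    keep.update([d for d in ordered if d.weekday() == 6][:weekly])
    keep.update([d for d in ordered if d.day == 1][:monthly])
    return keep


# Example: 60 consecutive nightly backups ending on 2026-04-11.
stamps = [date(2026, 4, 11) - timedelta(days=i) for i in range(60)]
kept = backups_to_keep(stamps)
to_delete = sorted(set(stamps) - kept)
```

Everything in `to_delete` would be removed only when the caller opts out of the dry-run default mentioned in the commit.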
Anto01	a34a7a995f	fix: token-overlap matcher for reinforcement (Phase 9B) Replace the substring-based _memory_matches() with a token-overlap matcher that tokenizes both memory content and response, applies lightweight stemming (trailing s/ed/ing) and stop-word removal, then checks whether >= 70% of the memory's tokens appear in the response. This fixes the paraphrase blindness that prevented reinforcement from ever firing on natural responses ("prefers" vs "prefer", "because history" vs "because the history"). 7 new tests (26 total reinforcement tests, all passing). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-11 09:40:05 -04:00
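A token-overlap matcher of the kind commit a34a7a995f describes (tokenize, strip trailing s/ed/ing, remove stop words, require 70% of the memory's tokens in the response) could be sketched as follows. The stop-word list, the minimum-length guard in `stem`, and the function names are illustrative assumptions rather than the contents of reinforcement.py.

```python
import re

# Hypothetical, deliberately small stop-word list.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "he", "she"}


def stem(token):
    """Strip one trailing 'ing', 'ed', or 's', keeping at least 3 characters."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def tokens(text):
    """Lowercase, split on non-alphanumerics, drop stop words, stem."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {stem(word) for word in words if word not in STOP_WORDS}


def memory_matches(memory, response, threshold=0.70):
    """True when at least `threshold` of the memory's tokens occur in the response."""
    memory_tokens = tokens(memory)
    if not memory_tokens:
        return False
    overlap = len(memory_tokens & tokens(response))
    return overlap / len(memory_tokens) >= threshold
```

With this approach "prefers" and "prefer" reduce to the same stem, and an inserted "the" no longer breaks the match, which are the two paraphrase cases the commit cites.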
Anto01	92fc250b54	fix: use correct hook field name last_assistant_message The Claude Code Stop hook sends `last_assistant_message`, not `assistant_message`. This was causing response_chars=0 on all captured interactions. Also removes the temporary debug log block. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-11 09:17:21 -04:00
Anto01	2d911909f8	feat: auto-capture Claude Code sessions via Stop hook Add deploy/hooks/capture_stop.py — a Claude Code Stop hook that reads the transcript JSONL, extracts the last user prompt, and POSTs to the AtoCore /interactions endpoint in conservative mode (reinforce=false). Conservative mode means: capture only, no automatic reinforcement or extraction into the review queue. Kill switch: ATOCORE_CAPTURE_DISABLED=1. Also: note build_sha cosmetic issue after restore in runbook, update project status docs to reflect drill pass and auto-capture wiring. 17 new tests (243 total, all passing). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-11 09:00:42 -04:00
Anto01	1a8fdf4225	fix: chroma restore bind-mount bug + consolidate docs Two fixes from the 2026-04-09 first real restore drill on Dalidou, plus the long-overdue doc consolidation I should have done when I added the drill runbook instead of creating a duplicate. ## Chroma restore bind-mount bug (drill finding) src/atocore/ops/backup.py: restore_runtime_backup() used to call shutil.rmtree(dst_chroma) before copying the snapshot back. In the Dockerized Dalidou deployment the chroma dir is a bind-mounted volume — you can't unlink a mount point, rmtree raises OSError [Errno 16] Device or resource busy and the restore silently fails to touch Chroma. This bit the first real drill; the operator worked around it with --no-chroma plus a manual cp -a. Fix: clear the destination's CONTENTS (iterdir + rmtree/unlink per child) and use copytree(dirs_exist_ok=True) so the mount point itself is never touched. Equivalent semantics, bind-mount-safe. Regression test: tests/test_backup.py::test_restore_chroma_does_not_unlink_destination_directory captures Path.stat().st_ino of the dest dir before and after restore and asserts they match. That's the same invariant a bind-mounted chroma dir enforces — if the inode changed, the mount would have failed. 11/11 backup tests now pass. ## Doc consolidation docs/backup-restore-drill.md existed as a duplicate of the authoritative docs/backup-restore-procedure.md. When I added the drill runbook in commit `3362080` I wrote it from scratch instead of updating the existing procedure — bad doc hygiene on a project that's literally about being a context engine. 
- Deleted docs/backup-restore-drill.md - Folded its contents into docs/backup-restore-procedure.md: - Replaced the manual sudo cp restore sequence with the new `python -m atocore.ops.backup restore <STAMP> --confirm-service-stopped` CLI - Added the one-shot docker compose run pattern for running restore inside a container that reuses the live volume mounts - Documented the --no-pre-snapshot / --no-chroma / --chroma flags - New "Chroma restore and bind-mounted volumes" subsection explaining the bug and the regression test that protects the fix - New "Restore drill" subsection with three levels (unit tests, module round-trip, live Dalidou drill) and the cadence list - Failure-mode table gained four entries: restored_integrity_ok, Device-or-resource-busy, drill marker still present, chroma_snapshot_missing - "Open follow-ups" struck the restore_runtime_backup item (done) and added a "Done (historical)" note referencing 2026-04-09 - Quickstart cheat sheet now has a full drill one-liner using memory_type=episodic (the 2026-04-09 drill found the runbook's memory_type=note was invalid — the valid set is identity, preference, project, episodic, knowledge, adaptation) ## Status doc sync Long overdue — I've been landing code without updating the project's narrative state docs. 
docs/current-state.md: - "Reliability Baseline" now reflects: restore_runtime_backup is real with CLI, pre-restore safety snapshot, WAL cleanup, integrity check; live drill on 2026-04-09 surfaced and fixed Chroma bind-mount bug; deploy provenance via /health build_sha; deploy.sh self-update re-exec guard - "Immediate Next Focus" reshuffled: drill re-run (priority 1) and auto-capture (priority 2) are now ahead of retrieval quality work, reflecting the updated unblock sequence docs/next-steps.md: - New item 1: re-run the drill with chroma working end-to-end - New item 2: auto-capture conservative mode (Stop hook) - Old item 7 rewritten as item 9 listing what's DONE (create/list/validate/restore, admin/backup endpoint with include_chroma, /health provenance, self-update guard, procedure doc with failure modes) and what's still pending (retention cleanup, off-Dalidou target, auto-validation) ## Test count 226 passing (was 225 + 1 new inode-stability regression test). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 09:13:21 -04:00
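Illustrative note on 1a8fdf4225: a minimal sketch of the bind-mount-safe pattern the message describes (clear the destination's contents, then copytree with dirs_exist_ok=True). The function name `replace_dir_contents` is hypothetical and this is not the code in src/atocore/ops/backup.py; it only shows why the destination directory itself is never unlinked.

```python
import shutil
from pathlib import Path


def replace_dir_contents(src: Path, dst: Path) -> None:
    """Make dst's contents match src without removing dst itself.

    Removing dst would fail when dst is a mount point, so only its
    children are deleted before the snapshot is copied back in.
    """
    dst.mkdir(parents=True, exist_ok=True)
    for child in dst.iterdir():
        if child.is_dir() and not child.is_symlink():
            shutil.rmtree(child)
        else:
            child.unlink()
    # dirs_exist_ok=True lets copytree write into the existing directory.
    shutil.copytree(src, dst, dirs_exist_ok=True)
```

Comparing `dst.stat().st_ino` before and after the call is one way to check the same invariant the regression test asserts.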
Anto01	336208004c	ops: add restore_runtime_backup + drill runbook Close the backup side of the loop: we had create/list/validate but no restore, and no documented drill. A backup you've never restored is not a backup. This lands the missing restore surface and the procedure to exercise it before enabling any write-path automation (auto-capture, automated ingestion, reinforcement sweeps). Code — src/atocore/ops/backup.py: - restore_runtime_backup(stamp, *, include_chroma, pre_restore_snapshot, confirm_service_stopped) performs: 1. validate_backup() gate — refuse on any error 2. pre-restore safety snapshot of current state (reversibility anchor) 3. PRAGMA wal_checkpoint(TRUNCATE) on target db (flush + release OS handles; Windows needs this after conn.backup() reads) 4. unlink stale -wal/-shm sidecars (tolerant to Windows lock races) 5. shutil.copy2 snapshot db over target 6. restore registry if snapshot captured one 7. restore Chroma tree if snapshot captured one and include_chroma resolves to true (defaults to whether backup has Chroma) 8. PRAGMA integrity_check on restored db, report result - Refuses without confirm_service_stopped=True to prevent hot-restore into a running service (would corrupt SQLite state) - Rewrote main() as argparse with 4 subcommands: create, list, validate, restore. 
`python -m atocore.ops.backup restore STAMP --confirm-service-stopped` is the drill CLI entry point, run via `docker compose run --rm --entrypoint python atocore` so it reuses the live service's volume mounts Tests — tests/test_backup.py (6 new): - test_restore_refuses_without_confirm_service_stopped - test_restore_raises_on_invalid_backup - test_restore_round_trip_reverses_post_backup_mutations (canonical drill flow: seed -> backup -> mutate -> restore -> mutation gone + baseline survived + pre-restore snapshot has the mutation captured as rollback anchor) - test_restore_round_trip_with_chroma - test_restore_skips_pre_snapshot_when_requested - test_restore_cleans_stale_wal_sidecars (asserts stale byte markers do not survive, not file existence, since PRAGMA integrity_check may legitimately recreate -wal) Docs — docs/backup-restore-drill.md (new): - What gets backed up (hot sqlite, cold chroma, registry JSON, metadata.json) and what doesn't (.env, source content) - What restore does, step by step, and why confirm_service_stopped is a hard gate - 8-step drill procedure: capture -> baseline -> mutate -> stop -> restore -> start -> verify marker gone -> optional cleanup - Correct endpoint bodies verified against routes.py: POST /admin/backup with JSON body {"include_chroma": true} POST /memory with memory_type/content/project/confidence GET /memory?project=drill to list drill markers POST /query with {"prompt": ..., "top_k": ...} (not "query") - Failure modes: integrity_check fail, container won't start, marker still present after restore, with remediation for each - When to run: before new write-path automation, after backup.py or schema changes, after infra bumps, monthly as standing check 225/225 tests passing (219 existing + 6 new restore). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 21:17:48 -04:00
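Illustrative note on 336208004c: a minimal sketch of the SQLite part of the restore sequence listed above (hard gate, WAL checkpoint, sidecar cleanup, copy, integrity check). The function name and signature are hypothetical, and the validate_backup gate, pre-restore snapshot, registry and Chroma steps are left out; it is not the actual restore_runtime_backup.

```python
import shutil
import sqlite3
from pathlib import Path


def restore_sqlite_snapshot(
    snapshot_db: Path,
    target_db: Path,
    *,
    confirm_service_stopped: bool = False,
) -> str:
    """Copy snapshot_db over target_db and return the integrity_check result."""
    if not confirm_service_stopped:
        # Hard gate: restoring under a running service risks corrupting state.
        raise RuntimeError("refusing to restore without confirm_service_stopped")

    if target_db.exists():
        # Flush the WAL into the main file and release handles.
        conn = sqlite3.connect(target_db)
        try:
            conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
        finally:
            conn.close()

    # Remove stale sidecars; tolerate files that are missing or briefly locked.
    for suffix in ("-wal", "-shm"):
        sidecar = target_db.with_name(target_db.name + suffix)
        try:
            sidecar.unlink()
        except OSError:
            pass

    shutil.copy2(snapshot_db, target_db)

    conn = sqlite3.connect(target_db)
    try:
        row = conn.execute("PRAGMA integrity_check").fetchone()
    finally:
        conn.close()
    return row[0]
```

A caller would treat any return value other than "ok" as a failed restore.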
Anto01	be4099486c	deploy: add build_sha visibility for precise drift detection Make /health report the precise git SHA the container was built from, so 'is the live service current?' can be answered without ambiguity. 0.2.0 was too coarse to trust as a 'live is current' signal — many commits share the same __version__. Three layers: 1. /health endpoint (src/atocore/api/routes.py) - Reads ATOCORE_BUILD_SHA, ATOCORE_BUILD_TIME, ATOCORE_BUILD_BRANCH from environment, defaults to 'unknown' - Reports them alongside existing code_version field 2. docker-compose.yml - Forwards the three env vars from the host into the container - Defaults to 'unknown' so direct `docker compose up` runs (without deploy.sh) cleanly signal missing build provenance 3. deploy.sh - Step 2 captures git SHA + UTC timestamp + branch and exports them as env vars before `docker compose up -d --build` - Step 6 reads /health post-deploy and compares the reported build_sha against the freshly-built one. Mismatch exits non-zero (exit code 6) with a remediation hint covering cached image, env propagation, and concurrent restart cases Tests (tests/test_api_storage.py): - test_health_endpoint_reports_code_version_from_module - test_health_endpoint_reports_build_metadata_from_env - test_health_endpoint_reports_unknown_when_build_env_unset Docs (docs/dalidou-deployment.md): - Three-level drift detection table (code_version coarse, build_sha precise, build_time/branch forensic) - Canonical drift check script using LIVE_SHA vs EXPECTED_SHA - Note that running deploy.sh is itself the simplest drift check 219/219 tests passing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 20:25:32 -04:00
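Illustrative note on be4099486c: a minimal sketch of reading build provenance from the environment with an 'unknown' default, plus the SHA comparison behind the drift check. The environment variable names and `build_sha` come from the commit message; the `build_time` and `build_branch` key names and both function names are assumptions, and the real /health handler in routes.py may be shaped differently.

```python
import os
from typing import Mapping


def build_metadata(env: Mapping[str, str] = os.environ) -> dict:
    """Collect build provenance, defaulting each field to 'unknown'."""
    return {
        "build_sha": env.get("ATOCORE_BUILD_SHA", "unknown"),
        "build_time": env.get("ATOCORE_BUILD_TIME", "unknown"),
        "build_branch": env.get("ATOCORE_BUILD_BRANCH", "unknown"),
    }


def is_drifted(live_sha: str, expected_sha: str) -> bool:
    """True when the live service does not report the expected build SHA.

    An 'unknown' SHA counts as drift because provenance is missing.
    """
    return live_sha == "unknown" or live_sha != expected_sha
```

Treating 'unknown' as drift is a design choice of this sketch: a container started without deploy.sh then fails the check loudly instead of passing by accident.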
Anto01	b492f5f7b0	fix: schema init ordering, deploy.sh default, client BASE_URL docs Three issues Dalidou Claude surfaced during the first real deploy of commit `e877e5b` to the live service (report from 2026-04-08). Bug 1 was the critical one — a schema init ordering bug that would have bitten every future upgrade from a pre-Phase-9 schema — and the other two were usability traps around hostname resolution. Bug 1 (CRITICAL): schema init ordering -------------------------------------- src/atocore/models/database.py SCHEMA_SQL contained CREATE INDEX statements that referenced columns added later by _apply_migrations(): CREATE INDEX IF NOT EXISTS idx_memories_project ON memories(project); CREATE INDEX IF NOT EXISTS idx_interactions_project_name ON interactions(project); CREATE INDEX IF NOT EXISTS idx_interactions_session ON interactions(session_id); On a FRESH install, CREATE TABLE IF NOT EXISTS creates the tables with the Phase 9 shape (columns present), so the CREATE INDEX runs cleanly and _apply_migrations is effectively a no-op. On an UPGRADE from a pre-Phase-9 schema, CREATE TABLE IF NOT EXISTS is a no-op (the tables already exist in the old shape), the columns are NOT added yet, and the CREATE INDEX fails with "OperationalError: no such column: project" before _apply_migrations gets a chance to add the columns. Dalidou Claude hit this exactly when redeploying from 0.1.0 to 0.2.0 — had to manually ALTER TABLE to add the Phase 9 columns before the container could start. The fix is to remove the Phase 9-column indexes from SCHEMA_SQL. They already exist in _apply_migrations() AFTER the corresponding ALTER TABLE, so they still get created on both fresh and upgrade paths — just after the columns exist, not before. 
Indexes still in SCHEMA_SQL (all safe — reference columns that have existed since the first release): - idx_chunks_document on source_chunks(document_id) - idx_memories_type on memories(memory_type) - idx_memories_status on memories(status) - idx_interactions_project on interactions(project_id) Indexes moved to _apply_migrations (already there — just no longer duplicated in SCHEMA_SQL): - idx_memories_project on memories(project) - idx_interactions_project_name on interactions(project) - idx_interactions_session on interactions(session_id) - idx_interactions_created_at on interactions(created_at) Regression test: tests/test_database.py --------------------------------------- New test_init_db_upgrades_pre_phase9_schema_without_failing: - Seeds the DB with the exact pre-Phase-9 shape (no project / last_referenced_at / reference_count on memories; no project / client / session_id / response / memories_used / chunks_used on interactions) - Calls init_db() — which used to raise OperationalError before the fix - Verifies all Phase 9 columns are present after the call - Verifies the migration indexes exist Before the fix this test would have failed with "OperationalError: no such column: project" on the init_db call. After the fix it passes. This locks the invariant "init_db is safe on any legacy schema shape" so the bug can't silently come back. Full suite: 216 passing (was 215), 1 warning. The +1 is the new regression test. Bug 3 (usability): deploy.sh DNS default ---------------------------------------- deploy/dalidou/deploy.sh ATOCORE_GIT_REMOTE defaulted to http://dalidou:3000/Antoine/ATOCore.git which requires the "dalidou" hostname to resolve. On the Dalidou host itself it didn't (no /etc/hosts entry for localhost alias), so deploy.sh had to be run with the IP as a manual workaround. Fix: default ATOCORE_GIT_REMOTE to http://127.0.0.1:3000/Antoine/ATOCore.git. Loopback always works on the host running the script. Callers from a remote host (e.g. 
running deploy.sh from a laptop against the Dalidou LAN) set ATOCORE_GIT_REMOTE explicitly. The script header's Environment Variables section documents this with an explicit reference to the 2026-04-08 Dalidou deploy report so the rationale isn't lost. docs/dalidou-deployment.md gets a new "Troubleshooting hostname resolution" subsection and a new example invocation showing how to deploy from a remote host with an explicit ATOCORE_GIT_REMOTE override. Bug 2 (usability): atocore_client.py ATOCORE_BASE_URL documentation ------------------------------------------------------------------- scripts/atocore_client.py Same class of issue as bug 3. BASE_URL defaults to http://dalidou:8100 which resolves fine from a remote caller (laptop, T420/OpenClaw over Tailscale) but NOT from the Dalidou host itself or from inside the atocore container. Dalidou Claude saw the CLI return {"status": "unavailable", "fail_open": true} while direct curl to http://127.0.0.1:8100 worked. The fix here is NOT to change the default (remote callers are the common case and would break) but to DOCUMENT the override clearly so the next operator knows what's happening: - The script module docstring grew a new "Environment variables" section covering ATOCORE_BASE_URL, ATOCORE_TIMEOUT_SECONDS, ATOCORE_REFRESH_TIMEOUT_SECONDS, and ATOCORE_FAIL_OPEN, with the explicit override example for on-host/in-container use - It calls out the exact symptom (fail-open envelope when the base URL doesn't resolve) so the diagnosis is obvious from the error alone - docs/dalidou-deployment.md troubleshooting section mirrors this guidance so there's one place to look regardless of whether the operator starts with the client help or the deploy doc What this commit does NOT do ---------------------------- - Does NOT change the default ATOCORE_BASE_URL. Doing that would break the T420 OpenClaw helper and every remote caller who currently relies on the hostname. Documentation is the right fix for this case. 
- Does NOT fix /etc/hosts on Dalidou. That's a host-level configuration issue that the user can fix if they prefer having the hostname resolve; the deploy.sh fix makes it unnecessary regardless. - Does NOT re-run the validation on Dalidou. The next step is for the live service to pull this commit via deploy.sh (which should now work without the IP workaround) and re-run the Phase 9 loop test to confirm nothing regressed.	2026-04-08 19:02:57 -04:00
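Illustrative note on b492f5f7b0 (Bug 1): a minimal sketch of the ordering the fix relies on, where the base schema only indexes columns that have always existed and the migration step adds a column before indexing it. The table is reduced to three columns and the helper names are hypothetical; this is not the contents of src/atocore/models/database.py.

```python
import sqlite3

# Base schema: only indexes columns present since the first release.
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS memories (
    id INTEGER PRIMARY KEY,
    memory_type TEXT,
    project TEXT
);
CREATE INDEX IF NOT EXISTS idx_memories_type ON memories(memory_type);
"""


def _column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    return any(row[1] == column for row in conn.execute(f"PRAGMA table_info({table})"))


def apply_migrations(conn: sqlite3.Connection) -> None:
    """Add later columns first, then create the indexes that need them."""
    if not _column_exists(conn, "memories", "project"):
        conn.execute("ALTER TABLE memories ADD COLUMN project TEXT")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_memories_project ON memories(project)")


def init_db(conn: sqlite3.Connection) -> None:
    """Safe on both a fresh database and a legacy table without 'project'."""
    conn.executescript(SCHEMA_SQL)
    apply_migrations(conn)
    conn.commit()
```

If the idx_memories_project statement were moved into SCHEMA_SQL, init_db on a legacy table would fail before the migration could add the column, which is the upgrade failure the commit describes.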

74 Commits