Commit Graph

160 Commits

Author SHA1 Message Date
3ca19724a5 feat: 3-tier triage escalation + project validation + enriched context
Addresses the triage-quality problems the user observed:
- Candidates getting wrong project/product attribution
- Stale facts promoted as if still true
- "Hard to decide" items reaching human queue without real value

Solution: let sonnet handle the easy 80%, escalate borderline cases
to opus, auto-discard (or flag) what two models can't resolve.
Plus enrich the context the triage model sees so it can catch
misattribution, contradictions, and temporal drift earlier.

THE 3-TIER FLOW (scripts/auto_triage.py):

Tier 1: sonnet (fast, cheap)
  - confidence >= 0.8 + clear verdict → PROMOTE or REJECT (done)
  - otherwise → escalate to tier 2

Tier 2: opus (smarter, sees tier-1 verdict + reasoning)
  - second opinion with explicit "sonnet said X, resolve the uncertainty"
  - confidence >= 0.8 → PROMOTE or REJECT with note="[opus]"
  - still uncertain → tier 3

Tier 3: configurable (default discard)
  - ATOCORE_TRIAGE_TIER3=discard (default): auto-reject with
    "two models couldn't decide" reason
  - ATOCORE_TRIAGE_TIER3=human: leave in queue for /admin/triage

Configuration via env vars:
  ATOCORE_TRIAGE_MODEL_TIER1  (default sonnet)
  ATOCORE_TRIAGE_MODEL_TIER2  (default opus)
  ATOCORE_TRIAGE_TIER3        (default discard)
  ATOCORE_TRIAGE_ESCALATION_THRESHOLD (default 0.75)
  ATOCORE_TRIAGE_TIER2_TIMEOUT_S (default 120 — opus is slower)

ENRICHED CONTEXT shown to the triage model (both tiers):
- List of registered project ids so misattribution is detectable
- Trusted project state entries (ground truth, higher trust than memories)
- Top 30 active memories for the claimed project (was 20)
- Tier 2 additionally sees tier 1's verdict + reason

PROJECT MISATTRIBUTION DETECTION:
- Triage prompt asks the model to output "suggested_project" when it
  detects the claimed project is wrong but the content clearly belongs
  to a registered one
- Main loop auto-applies the fix via PUT /memory/{id} (which canonicalizes
  through the registry)
- Misattribution is the #1 pollution source — this catches it upstream

TEMPORAL AGGRESSIVENESS:
- Prompt upgraded: "be aggressive with valid_until for anything that
  reads like 'current state' or 'this week'. When in doubt, 2-4 week
  expiry rather than null."
- Stale facts decay automatically via Phase 3's expiry filter

CONFIDENCE GRADING (new in prompt):
- 0.9+: crystal clear durable fact or clear noise
- 0.75-0.9: confident but not cryptographic-certain
- 0.6-0.75: borderline — WILL escalate
- <0.6: genuinely ambiguous — human or discard

Tests: 356 → 366 (10 new, all in test_triage_escalation.py):
- High-confidence tier-1 promote/reject → no tier-2 call
- Low-confidence tier-1 → tier-2 escalates → decides
- needs_human always escalates regardless of confidence
- tier-2 uncertain → discard by default
- tier-2 uncertain → human when configured
- dry-run skips all API calls
- suggested_project flag surfaces + gets printed
- parse_verdict captures suggested_project

Runtime behavior unchanged for the clear cases (sonnet still handles
them). The 20-30% of candidates that currently land as needs_human
will now route through opus, and only the genuinely stuck get a human
(or discard) action.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 09:09:58 -04:00
3316ff99f9 feat: Phase 5F/5G/5H — graduation, conflicts, MCP engineering tools
The population move + the safety net + the universal consumer hookup,
all shipped together. This is where the engineering graph becomes
genuinely useful against the real 262-memory corpus.

5F: Memory → Entity graduation (THE population move)
- src/atocore/engineering/_graduation_prompt.py: stdlib-only shared
  prompt module mirroring _llm_prompt.py pattern (container + host
  use same system prompt, no drift)
- scripts/graduate_memories.py: host-side batch driver that asks
  claude-p "does this memory describe a typed entity?" and creates
  entity candidates with source_refs pointing back to the memory
- promote_entity() now scans source_refs for memory:* prefix; if
  found, flips source memory to status='graduated' with
  graduated_to_entity_id forward pointer + writes memory_audit row
- GET /admin/graduation/stats exposes graduation rate for dashboard

5G: Sync conflict detection on entity promote
- src/atocore/engineering/conflicts.py: detect_conflicts_for_entity()
  runs on every active promote. V1 checks 3 slot kinds narrowly to
  avoid false positives:
  * component.material (multiple USES_MATERIAL edges)
  * component.part_of (multiple PART_OF edges)
  * requirement.name (duplicate active Requirements in same project)
- Conflicts + members persist via the tables built in 5A
- Fires a "warning" alert via Phase 4 framework
- Deduplicates: same (slot_kind, slot_key) won't get a new row
- resolve_conflict(action="dismiss|supersede_others|no_action"):
  supersede_others marks non-winner members as status='superseded'
- GET /admin/conflicts + POST /admin/conflicts/{id}/resolve

5H: MCP + context pack integration
- scripts/atocore_mcp.py: 7 new engineering tools exposed to every
  MCP-aware client (Claude Desktop, Claude Code, Cursor, Zed):
  * atocore_engineering_map (Q-001/004 system tree)
  * atocore_engineering_gaps (Q-006/009/011 killer queries — THE
    director's question surfaced as a built-in tool)
  * atocore_engineering_requirements_for_component (Q-005)
  * atocore_engineering_decisions (Q-008)
  * atocore_engineering_changes (Q-013 — reads entity audit log)
  * atocore_engineering_impact (Q-016 BFS downstream)
  * atocore_engineering_evidence (Q-017 inbound provenance)
- MCP tools total: 14 (7 memory/state/health + 7 engineering)
- context/builder.py _build_engineering_context now appends a compact
  gaps summary ("Gaps: N orphan reqs, M risky decisions, K unsupported
  claims") so every project-scoped LLM call sees "what we're missing"

Tests: 341 → 356 (15 new):
- 5F: graduation prompt parses positive/negative decisions, rejects
  unknown entity types, tolerates markdown fences; promote_entity
  marks source memory graduated with forward pointer; entity without
  memory refs promotes cleanly
- 5G: component.material + component.part_of + requirement.name
  conflicts detected; clean component triggers nothing; dedup works;
  supersede_others resolution marks losers; dismiss leaves both
  active; end-to-end promote triggers detection
- 5H: graduation user message includes project + type + content

No regressions across the 341 prior tests. The MCP server now answers
"which p05 requirements aren't satisfied?" directly from any Claude
session — no user prompt engineering, no context hacks.

Next to kick off from user: run graduation script on Dalidou to
populate the graph from 262 existing memories:
  ssh papa@dalidou 'cd /srv/storage/atocore/app && PYTHONPATH=src \
    python3 scripts/graduate_memories.py --project p05-interferometer --limit 30 --dry-run'

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 07:53:03 -04:00
53b71639ad feat: Phase 5B-5D — 10 canonical engineering queries + triage UI
The graph becomes useful. Before this commit, entities sat in the DB
as data with no narrative. After: the director can ask "what am I
forgetting?" and get a structured answer in milliseconds.

New module (src/atocore/engineering/queries.py, 360 lines):

Structure queries (Q-001/004/005/008/013):
- system_map(project): full subsystem → component tree + orphans +
  materials joined per component
- decisions_affecting(project, subsystem_id?): decisions linked via
  AFFECTED_BY_DECISION, scoped to a subsystem or whole project
- requirements_for(component_id): Q-005 forward trace
- recent_changes(project, since, limit): Q-013 via memory_audit join
  (reuses the Phase 4 audit infrastructure — entity_kind='entity')

The 3 killer queries (the real value):
- orphan_requirements(project): requirements with NO inbound SATISFIES
  edge. "What do I claim the system must do that nothing actually
  claims to handle?" Q-006.
- risky_decisions(project): decisions whose BASED_ON_ASSUMPTION edge
  points to an assumption with status in ('superseded','invalid') OR
  properties.flagged=True. Finds cascading risk from shaky premises. Q-009.
- unsupported_claims(project): ValidationClaim entities with no inbound
  SUPPORTS edge — asserted but no Result to back them. Q-011.
- all_gaps(project): runs all three in one call for dashboards.

History + impact (Q-016/017):
- impact_analysis(entity_id, max_depth=3): BFS over outbound edges.
  "What's downstream of this if I change it?"
- evidence_chain(entity_id): inbound SUPPORTS/EVIDENCED_BY/DESCRIBED_BY/
  VALIDATED_BY/ANALYZED_BY. "How do I know this is true?"

API (src/atocore/api/routes.py) exposes 10 endpoints:
- GET /engineering/projects/{p}/systems
- GET /engineering/decisions?project=&subsystem=
- GET /engineering/components/{id}/requirements
- GET /engineering/changes?project=&since=&limit=
- GET /engineering/gaps/orphan-requirements?project=
- GET /engineering/gaps/risky-decisions?project=
- GET /engineering/gaps/unsupported-claims?project=
- GET /engineering/gaps?project=  (combined)
- GET /engineering/impact?entity=&max_depth=
- GET /engineering/evidence?entity=

Mirror integration (src/atocore/engineering/mirror.py):
- New _gaps_section() renders at top of every project page
- If any gap non-empty: shows up-to-10 per category with names + context
- Clean project: " No gaps detected" — signals everything is traced

Triage UI (src/atocore/engineering/triage_ui.py):
- /admin/triage now shows BOTH memory candidates AND entity candidates
- Entity cards: name, type, project, confidence, source provenance,
  Promote/Reject buttons, link to wiki entity page
- Entity promote/reject via fetch to /entities/{id}/promote|reject
- One triage UI for the whole pipeline — consistent muscle memory

Tests: 326 → 341 (15 new, all in test_engineering_queries.py):
- System map structure + orphan detection + material joins
- Killer queries: positive + negative cases (empty when clean)
- Decisions query: project-wide and subsystem-scoped
- Impact analysis walks outbound BFS
- Evidence chain walks inbound provenance

No regressions. All 10 daily queries from the plan are now live and
answering real questions against the graph.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 07:18:46 -04:00
07664bd743 feat: Phase 5A — Engineering V1 foundation
First slice of the Engineering V1 sprint. Lays the schema + lifecycle
plumbing so the 10 canonical queries, memory graduation, and conflict
detection can land cleanly on top.

Schema (src/atocore/models/database.py):
- conflicts + conflict_members tables per conflict-model.md (with 5
  indexes on status/project/slot/members)
- memory_audit.entity_kind discriminator — same audit table serves
  both memories ("memory") and entities ("entity"); unified history
  without duplicating infrastructure
- memories.graduated_to_entity_id forward pointer for graduated
  memories (M → E transition preserves the memory as historical
  pointer)

Memory (src/atocore/memory/service.py):
- MEMORY_STATUSES gains "graduated" — memory-entity graduation flow
  ready to wire in Phase 5F

Engineering service (src/atocore/engineering/service.py):
- RELATIONSHIP_TYPES organized into 4 families per ontology-v1.md:
  + Structural: contains, part_of, interfaces_with
  + Intent: satisfies, constrained_by, affected_by_decision,
    based_on_assumption (new), supersedes
  + Validation: analyzed_by, validated_by, supports (new),
    conflicts_with (new), depends_on
  + Provenance: described_by, updated_by_session (new),
    evidenced_by (new), summarized_in (new)
- create_entity + create_relationship now call resolve_project_name()
  on write (canonicalization contract per doc)
- Both accept actor= parameter for audit provenance
- _audit_entity() helper uses shared memory_audit table with
  entity_kind="entity" — one observability layer for everything
- promote_entity / reject_entity_candidate / supersede_entity —
  mirror the memory lifecycle exactly (same pattern, same naming)
- get_entity_audit() reads from the shared table filtered by
  entity_kind

API (src/atocore/api/routes.py):
- POST /entities/{id}/promote (candidate → active)
- POST /entities/{id}/reject (candidate → invalid)
- GET /entities/{id}/audit (full history for one entity)
- POST /entities passes actor="api-http" through

Tests: 317 → 326 (9 new):
- test_entity_project_canonicalization (p04 → p04-gigabit)
- test_promote_entity_candidate_to_active
- test_reject_entity_candidate
- test_promote_active_entity_noop (only candidates promote)
- test_entity_audit_log_captures_lifecycle (before/after snapshots)
- test_new_relationship_types_available (6 new types present)
- test_conflicts_tables_exist
- test_memory_audit_has_entity_kind
- test_graduated_status_accepted

What's next (5B-5I, deferred): entity triage UI tab, core structure
queries, the 3 killer queries, memory graduation script, conflict
detection, MCP + context pack integration. See plan file.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 07:01:28 -04:00
bb46e21c9b fix: integrity check runs in container (host lacks deps)
scripts/integrity_check.py now POSTs to /admin/integrity-check
instead of importing atocore directly. The actual scan lives in
the container where DB access + deps are available. Host-side
cron just triggers and logs the result.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 22:01:43 -04:00
88f2f7c4e1 feat: Phase 4 V1 — Robustness Hardening
Adds the observability + safety layer that turns AtoCore from
"works until something silently breaks" into "every mutation is
traceable, drift is detected, failures raise alerts."

1. Audit log (memory_audit table):
   - New table with id, memory_id, action, actor, before/after JSON,
     note, timestamp; 3 indexes for memory_id/timestamp/action
   - _audit_memory() helper called from every mutation:
     create_memory, update_memory, promote_memory,
     reject_candidate_memory, invalidate_memory, supersede_memory,
     reinforce_memory, auto_promote_reinforced, expire_stale_candidates
   - Action verb auto-selected: promoted/rejected/invalidated/
     superseded/updated based on state transition
   - "actor" threaded through: api-http, human-triage, phase10-auto-
     promote, candidate-expiry, reinforcement, etc.
   - Fail-open: audit write failure logs but never breaks the mutation
   - GET /memory/{id}/audit: full history for one memory
   - GET /admin/audit/recent: last 50 mutations across the system

2. Alerts framework (src/atocore/observability/alerts.py):
   - emit_alert(severity, title, message, context) fans out to:
     - structlog logger (always)
     - ~/atocore-logs/alerts.log append (configurable via
       ATOCORE_ALERT_LOG)
     - project_state atocore/alert/last_{severity} (dashboard surface)
     - ATOCORE_ALERT_WEBHOOK POST if set (auto-detects Discord webhook
       format for nice embeds; generic JSON otherwise)
   - Every sink fail-open — one failure doesn't prevent the others
   - Pipeline alert step in nightly cron: harness < 85% → warning;
     candidate queue > 200 → warning

3. Integrity checks (scripts/integrity_check.py):
   - Nightly scan for drift:
     - Memories → missing source_chunk_id references
     - Duplicate active memories (same type+content+project)
     - project_state → missing projects
     - Orphaned source_chunks (no parent document)
   - Results persisted to atocore/status/integrity_check_result
   - Any finding emits a warning alert
   - Added as Step G in deploy/dalidou/batch-extract.sh nightly cron

4. Dashboard surfaces it all:
   - integrity (findings + details)
   - alerts (last info/warning/critical per severity)
   - recent_audit (last 10 mutations with actor + action + preview)

Tests: 308 → 317 (9 new):
  - test_audit_create_logs_entry
  - test_audit_promote_logs_entry
  - test_audit_reject_logs_entry
  - test_audit_update_captures_before_after
  - test_audit_reinforce_logs_entry
  - test_recent_audit_returns_cross_memory_entries
  - test_emit_alert_writes_log_file
  - test_emit_alert_invalid_severity_falls_back_to_info
  - test_emit_alert_fails_open_on_log_write_error

Deferred: formal migration framework with rollback (current additive
pattern is fine for V1); memory detail wiki page with audit view
(quick follow-up).

To enable Discord alerts: set ATOCORE_ALERT_WEBHOOK to a Discord
webhook URL in Dalidou's environment. Default = log-only.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 21:54:10 -04:00
bfa7dba4de feat: Phase 3 V1 — Auto-Organization (domain_tags + valid_until)
Adds structural metadata that the LLM triage was already implicitly
reasoning about ("stale snapshot" → reject). Phase 3 captures that
reasoning as fields so it can DRIVE retrieval, not just rejection.

Schema (src/atocore/models/database.py):
- domain_tags TEXT DEFAULT '[]'  JSON array of lowercase topic keywords
- valid_until DATETIME            ISO date; null = permanent
- idx_memories_valid_until index for efficient expiry queries

Memory service (src/atocore/memory/service.py):
- Memory dataclass gains domain_tags + valid_until
- create_memory, update_memory accept/persist both
- _row_to_memory safely reads both (JSON-decode + null handling)
- _normalize_tags helper: lowercase, dedup, strip, cap at 10
- get_memories_for_context filters expired (valid_until < today UTC)
- _rank_memories_for_query adds tag-boost: memories whose domain_tags
  appear as substrings in query text rank higher (tertiary key after
  content-overlap density + absolute overlap, before confidence)

LLM extractor (_llm_prompt.py → llm-0.5.0):
- SYSTEM_PROMPT documents domain_tags (2-5 keywords) + valid_until
  (time-bounded facts get expiry dates; durable facts stay null)
- normalize_candidate_item parses both fields from model output with
  graceful fallback for string/null/missing

LLM triage (scripts/auto_triage.py):
- TRIAGE_SYSTEM_PROMPT documents same two fields
- parse_verdict extracts them from verdict JSON
- On promote: PUT /memory/{id} with tags + valid_until BEFORE
  POST /memory/{id}/promote, so active memories carry them

API (src/atocore/api/routes.py):
- MemoryCreateRequest: adds domain_tags, valid_until
- MemoryUpdateRequest: adds domain_tags, valid_until, memory_type
- GET /memory response exposes domain_tags + valid_until + created_at

Triage UI (src/atocore/engineering/triage_ui.py):
- Renders existing tags as colored badges
- Adds inline text field for tags (comma-separated) + date picker for
  valid_until on every candidate card
- Save&Promote button persists edits via PUT then promotes
- Plain Promote (and Y shortcut) also saves tags/expiry if edited

Wiki (src/atocore/engineering/wiki.py):
- Search now matches memory content OR domain_tags
- Search results render tags as clickable badges linking to
  /wiki/search?q=<tag> for cross-project navigation
- valid_until shown as amber "valid until YYYY-MM-DD" hint

Tests: 303 → 308 (5 new for Phase 3 behavior):
- test_create_memory_with_tags_and_valid_until
- test_create_memory_normalizes_tags
- test_update_memory_sets_tags_and_valid_until
- test_get_memories_for_context_excludes_expired
- test_context_builder_tag_boost_orders_results

Deferred (explicitly): temporal_scope enum, source_refs memory graph,
HDBSCAN clustering, memory detail wiki page, backfill of existing
actives. See docs/MASTER-BRAIN-PLAN.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 21:37:01 -04:00
271ee25d99 feat: on-demand auto-triage from web UI
Adds an "Auto-process queue" button to /admin/triage that lets the
user kick off a full LLM triage pass without SSH. Bridges the gap
between web UI (in container) and claude CLI (host-only).

Architecture:
- UI button POSTs to /admin/triage/request-drain
- Endpoint writes atocore/config/auto_triage_requested_at flag
- Host-side watcher cron (every 2 min) checks for the flag
- When found: clears flag, acquires lock, runs auto_triage.py,
  records progress via atocore/status/* entries
- UI polls /admin/triage/drain-status every 10s to show progress,
  auto-reloads when done

Safety:
- Lock file prevents concurrent runs on host
- Flag cleared before run so duplicate clicks queue at most one re-run
- Fail-open: watcher errors just log, don't break anything
- Status endpoint stays read-only

Installation on host (one-time):
  */2 * * * * /srv/storage/atocore/app/deploy/dalidou/auto-triage-watcher.sh \
    >> /home/papa/atocore-logs/auto-triage-watcher.log 2>&1

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 21:05:30 -04:00
d8b370fd0a feat: /admin/triage web UI + auto-drain loop
Makes human triage sustainable. Before: command-line-only review,
auto-triage stopped after 100 candidates/run. Now:

1. Web UI at /admin/triage
   - Lists all pending candidates with inline promote/reject/edit
   - Edit content in-place before promoting (PUT /memory/{id})
   - Change type via dropdown
   - Keyboard shortcuts: Y=promote, N=reject, E=edit, S=scroll-next
   - Cards fade out after action, queue count updates live
   - Zero JS framework — vanilla fetch + event delegation

2. auto_triage.py drains queue
   - Loops up to 20 batches (default) of 100 candidates each
   - Tracks seen IDs so needs_human items don't reprocess
   - Exits cleanly when queue empty
   - Nightly cron naturally drains everything

3. Dashboard + wiki surface triage queue
   - Dashboard /admin/dashboard: new "triage" section with pending
     count + /admin/triage URL + warning/notice severity levels
   - Wiki homepage: prominent callout "N candidates awaiting triage —
     review now" linking to /admin/triage, styled with triage-warning
     (>50) or triage-notice (>20) CSS

Pattern: human intervenes only when AI can't decide. The UI makes
that intervention fast (20 candidates in 60 seconds). Nightly
auto-triage drains the queue so the human queue stays bounded.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 20:28:56 -04:00
86637f8eee feat: universal LLM consumption (Phase 1 complete)
Completes the Phase 1 master brain keystone: every LLM interaction
across the ecosystem now pulls context from AtoCore automatically.

Three adapters, one HTTP backend:

1. OpenClaw plugin pull (handler.js):
   - Added before_prompt_build hook that calls /context/build and
     injects the pack via prependContext
   - Existing capture hooks (before_agent_start + llm_output)
     unchanged
   - 6s context timeout, fail-open on AtoCore unreachable
   - Deployed to T420, gateway restarted, "7 plugins loaded"

2. atocore-proxy (scripts/atocore_proxy.py):
   - Stdlib-only OpenAI-compatible HTTP middleware
   - Drop-in layer for Codex, Ollama, LiteLLM, any OpenAI-compat client
   - Intercepts /chat/completions: extracts query, pulls context,
     injects as system message, forwards to upstream, captures back
   - Fail-open: AtoCore down = passthrough without injection
   - Configurable via env: UPSTREAM, PORT, CLIENT_LABEL, INJECT, CAPTURE

3. (from prior commit c49363f) atocore-mcp:
   - stdio MCP server, stdlib Python, 7 tools exposed
   - Registered in Claude Code: "✓ Connected"

Plus quick win:
- Project synthesis moved from Sunday-only to daily cron so wiki /
  mirror pages stay fresh (Step C in batch-extract.sh). Lint stays
  weekly.

Plus docs:
- docs/universal-consumption.md: configuration guide for all 3 adapters
  with registration/env-var tables and verification checklist

Plus housekeeping:
- .gitignore: add .mypy_cache/

Tests: 303/303 passing.

This closes the consumption gap: the reinforcement feedback loop
can now actually work (memories get injected → get referenced →
reinforcement fires → auto-promotion). Every Claude, OpenClaw,
Codex, or Ollama session is automatically AtoCore-grounded.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 20:14:25 -04:00
c49363fccc feat: atocore-mcp server for universal LLM consumption (Phase 1)
Stdlib-only Python stdio MCP server that wraps the AtoCore HTTP
API. Makes AtoCore available as built-in tools to every MCP-aware
client (Claude Desktop, Claude Code, Cursor, Zed, Windsurf).

7 tools exposed:
- atocore_context: full context pack (state + memories + chunks)
- atocore_search: semantic retrieval with scores + sources
- atocore_memory_list: filter active memories by project/type
- atocore_memory_create: propose a candidate memory
- atocore_project_state: query Trusted Project State by category
- atocore_projects: list registered projects + aliases
- atocore_health: service status check

Design choices:
- stdlib only (no mcp SDK dep) — AtoCore philosophy
- Thin HTTP passthrough — zero business logic, zero drift risk
- Fail-open: AtoCore unreachable returns graceful error, not crash
- Protocol MCP 2024-11-05 compatible

Registered in Claude Code: `claude mcp add atocore -- python ...`
Verified: ✓ Connected, all 7 tools exposed, context/search/state
return live data from Dalidou (sha=775960c8, vectors=33253).

This is the keystone for master brain vision: every Claude session
now has AtoCore available as built-in capability without the user
or agent having to remember to invoke it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 20:08:20 -04:00
33a6c61ca6 feat: daily backup to Windows main computer via pull-based scp
Third backup tier (after Dalidou local + T420 off-host): pull-based
backup to the user's Windows main computer.

- scripts/windows/atocore-backup-pull.ps1: PowerShell script using
  built-in OpenSSH scp. Fail-open: exits cleanly if Dalidou
  unreachable (e.g., laptop on the road). Pulls whole snapshots dir
  (~45MB, bounded by Dalidou's retention policy).
- docs/windows-backup-setup.md: Task Scheduler setup (automated +
  manual). Runs daily 10:00 local, catches up missed days via
  StartWhenAvailable, retries 2x on failure.

Verified: pulled 3 snapshots (45MB) to
C:\Users\antoi\Documents\ATOCore_Backups\. Task "AtoCore Backup
Pull" registered in Task Scheduler, State: Ready.

Three independent backup tiers now: Dalidou local, T420 off-host,
user Windows machine. Any two can fail without data loss.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 20:04:00 -04:00
33a106732f docs: master brain plan — vision, universal consumption, roadmap
Documents the path from current AtoCore (capture-only, thin
knowledge) to master brain status (universal consumption, dense
knowledge, auto-organized, self-growing, flawless).

Key strategic decisions documented:
- HTTP API is the canonical truth; every client gets a thin adapter
- MCP is for Claude ecosystem; OpenClaw plugin + middleware proxy
  handle Codex/Ollama/others
- Three-tier integration: MCP server, OpenClaw plugin, generic proxy
- Phase 1 (keystone) = universal consumption at prompt time
- 7-phase roadmap over 8-10 weeks

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 19:55:19 -04:00
3011aa77da fix: retry + stderr capture + pacing in triage/extractor
Both scripts now:
- Retry up to 3x with 2s/4s exponential backoff on transient
  failures (rate limits, capacity spikes)
- Capture claude CLI stderr in the error message (200 char cap)
  instead of just the exit code — diagnostics actually useful now
- Sleep 0.5s between calls to avoid bursting the backend

Context: last batch run hit 100% failure in triage (every call
exit 1) after 40% failure in extraction. claude CLI worked fine
immediately after, so the failures were capacity/rate-limit
transients. With retry + pacing these batches should complete
cleanly now. 439 candidates are already in the queue waiting
for triage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 16:29:20 -04:00
ba36a28453 docs: sprint documentation — ledger + master-plan sync
Updated DEV-LEDGER orientation with post-sprint state:
- live_sha 775960c, tests 303, harness 17/18 on live
- interactions 234 (192 claude-code + 38 openclaw)
- project_state_entries 110 across 6 projects
- nightly pipeline now includes auto-promote, harness, summary

Updated master-plan-status.md "What Is Real Today" to match
actual 2026-04-16 state. Phase 10 moved from "Next" to
operational. New "Now" priorities: observe pipeline, knowledge
density, multi-model triage, fix p04-constraints.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 14:08:19 -04:00
999788b790 chore: OpenClaw capture handler (llm_output) + ledger sync
- openclaw-plugins/atocore-capture/handler.js: simplified version
  using before_agent_start + llm_output hooks (survives gateway
  restarts). The production copy lives on T420 at
  /tmp/atocore-openclaw-capture-plugin/openclaw-plugins/atocore-capture/
- DEV-LEDGER: updated orientation (live_sha b687e7f, capture clients)
  and session log for 2026-04-16

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 14:04:40 -04:00
775960c8c8 feat: "Make It Actually Useful" sprint — observability + Phase 10
Pipeline observability:
- Retrieval harness runs nightly (Step E in batch-extract.sh)
- Pipeline summary persisted to project state after each run
  (pipeline_last_run, pipeline_summary, retrieval_harness_result)
- Dashboard enhanced: interaction total + by_client, pipeline health
  (last_run, hours_since, harness results, triage stats), dynamic
  project list from registry

Phase 10 — reinforcement-based auto-promotion:
- auto_promote_reinforced(): candidates with reference_count >= 3 and
  confidence >= 0.7 auto-graduate to active
- expire_stale_candidates(): candidates unreinforced for 14+ days
  auto-rejected to prevent unbounded queue growth
- Both wired into nightly cron (Step B2)
- Batch script: scripts/auto_promote_reinforced.py (--dry-run support)

Knowledge seeding:
- scripts/seed_project_state.py: 26 curated Trusted Project State
  entries across p04-gigabit, p05-interferometer, p06-polisher,
  atomizer-v2, abb-space, atocore (decisions, requirements, facts,
  contacts, milestones)

Tests: 299 → 303 (4 new Phase 10 tests)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 13:59:12 -04:00
b687e7fa6f feat(capture): wire project inference from cwd
Populate _PROJECT_PATH_MAP in capture_stop.py so Claude Code
interactions get tagged with the correct project at capture time
instead of relying on the nightly LLM extractor to guess from
content. Covers 6 vault PARA sub-projects (P04, P05, P11/P06,
P08, I01, I02) and 4 local code repos (ATOCore, Polisher-Sim,
Fullum-Interferometer, Atomizer-V2).

Also sync project-registry.json with live Dalidou (adds abb-space,
atomizer-v2, and p11/polisher-fullum aliases to p06-polisher).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 09:01:38 -04:00
4d4d5f437a test(harness): fix p06-tailscale false positive, 18/18 PASS
The fixture's expect_absent: "GigaBIT" was catching legitimate
semantic overlap, not retrieval bleed. The p06 ARCHITECTURE.md
Overview describes the Polisher Suite as built for the GigaBIT M1
mirror — it is what the polisher is for, so the word appears
correctly in p06 content. All retrieved sources for this prompt
were genuinely p06/shared paths; zero actual p04 chunks leaked.

Narrowed the assertion to expect_absent: "[Source: p04-gigabit/",
which tests the real invariant (no p04 source chunks retrieved
into p06 context) without the false positive.

No retrieval/ranking code change. Fixture-only fix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 11:23:00 -04:00
5b114baa87 docs(ledger): deploy c2e7064 live; close R10 + R13
- R10 fixed: master-plan-status Phase 8 now disclaims "primary
  integration", reports current narrow surface (14 client shapes vs
  ~44 routes, read-heavy + project-state/ingest writes).
- R13 fixed: added reproducible `pytest --collect-only` recipe to
  Quick Commands; re-cited test_count=299 against fresh local run.
- Orientation bumped: live_sha and main_tip c2e7064.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 11:19:55 -04:00
c2e7064238 fix(extraction): R11 container 503 + R12 shared prompt module
R11: POST /admin/extract-batch with mode=llm now returns 503 when the
claude CLI is unavailable (was silently returning success with 0
candidates), with a message pointing at the host-side script. +2 tests.

R12: extracted SYSTEM_PROMPT + parse_llm_json_array +
normalize_candidate_item + build_user_message into stdlib-only
src/atocore/memory/_llm_prompt.py. Both the container extractor and
scripts/batch_llm_extract_live.py now import from it, eliminating the
prompt/parser drift risk.

Tests 297 -> 299.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 10:47:01 -04:00
dc9fdd3a38 chore(ledger): end-of-session sync (2026-04-14)
Reflects today's massive work: engineering layer + wiki + Karpathy
upgrades + OpenClaw importer + auto-detection. Active memories
47 -> 84. Ready for next session to pick up cold.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:24:25 -04:00
58ea21df80 fix: triage prompt leniency for OpenClaw-curated imports (real this time)
Previous commit had the wrong message — the diff was the config
persistence fix, not triage. This properly adds rule 4 to the
triage prompt: when candidate content starts with 'From OpenClaw/',
apply a much lower bar. OpenClaw's SOUL.md, USER.md, MEMORY.md,
MODEL-ROUTING.md, and daily memory/*.md are already curated —
promote unless clearly wrong or duplicate.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:55:08 -04:00
8c0f1ff6f3 fix: triage is lenient on OpenClaw-curated content
Auto-triage was rejecting 8 of 10 OpenClaw imports as 'session log'
or 'process rule belongs elsewhere'. But OpenClaw's SOUL.md, USER.md,
MEMORY.md and daily memory/*.md files are already curated — they ARE
the canonical continuity layer we want to absorb. Applying the
conservative LLM-conversation triage bar to them discards the signal
the importer was designed to capture.

Triage prompt now has a rule 4: when candidate content starts with
'From OpenClaw/' apply a much lower bar. Session events, project
updates, stakeholder notes, and decisions from daily memory files
should promote, not reject.

The ABB-Space Schott quote that DID promote was the lucky exception
— after this fix, the other 7 daily notes (CDR execution log,
Discord migration plan, isogrid research, etc.) will promote too.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:54:17 -04:00
3db1dd99b5 fix: OpenClaw importer default path = /home/papa/clawd
The .openclaw/workspace-* dirs were empty templates. Antoine's real
OpenClaw workspace is /home/papa/clawd with SOUL.md, USER.md,
MEMORY.md, MODEL-ROUTING.md, IDENTITY.md, PROJECT_STATE.md and
rich continuity subdirs (decisions/, lessons/, knowledge/,
commitments/, preferences/, goals/, projects/, handoffs/, memory/).

First real import: 10 candidates produced from 11 files scanned.
MEMORY.md (36K chars) skipped as duplicate content; needs smarter
section-level splitting in a follow-up.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:41:49 -04:00
57b64523fb feat: OpenClaw state importer — one-way pull via SSH
scripts/import_openclaw_state.py reads the OpenClaw file continuity
layer from clawdbot (T420) via SSH and imports candidate memories
into AtoCore. Loose coupling: OpenClaw's internals don't need to
change, AtoCore pulls from stable markdown files.

Per codex's integration proposal (docs/openclaw-atocore-integration-proposal.md):

Classification:
- SOUL.md          -> identity candidate
- USER.md          -> identity candidate
- MODEL-ROUTING.md -> adaptation candidate (routing rules)
- MEMORY.md        -> memory candidate (long-term curated)
- memory/YYYY-MM-DD.md -> episodic candidate (daily logs, last 7 days)
- heartbeat-state.json -> skipped (ops metadata only, not canonical)

Delta detection: SHA-256 hash per file stored in project_state
under atocore/status/openclaw_import_hashes. Only changed files
re-import. Hashes persist across runs so no wasted work.

All imports land as status=candidate. Auto-triage filters. Nothing
auto-promotes — the importer is a signal producer, the pipeline
decides what graduates.

Discord: deferred per codex's proposal — no durable local store in
current OpenClaw snapshot. Revisit if OpenClaw exposes an export.

Wired into cron-backup.sh as Step 3a (before vault refresh +
extraction) so OpenClaw signals flow through the same pipeline.
Gated on ATOCORE_OPENCLAW_IMPORT=true (default true).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:39:27 -04:00
a13ea3b9d1 docs: propose OpenClaw one-way pull integration 2026-04-14 10:34:15 -04:00
3f23ca1bc6 feat: signal-aggressive extraction + auto vault refresh in nightly cron
Extraction prompt rewritten for signal-aggressive mode. The old prompt
rewarded silence ("durable insight only, empty is correct") which
caused quiet failures — real project signal (Schott quotes arriving,
stakeholder events, blockers) was dropped as "not architectural enough".

New prompt explicitly lists what to emit:
1. Project activity (mentions with context — quote received, blocker,
   action item)
2. Decisions and choices (architectural commitments, vendor selection)
3. Durable engineering insight (earned knowledge, generalizable)
4. Stakeholder and vendor events (emails sent, meetings scheduled)
5. Preferences and adaptations (how Antoine works)

Philosophy shift: "capture more signal, let triage filter noise"
replaces "extract only durable architectural facts". Auto-triage
already rejects noise well, so moving the filter downstream gives us
visibility into weak signals without polluting active memory.

Added 'episodic' to the candidate types list to support stakeholder
events with a timestamp feel.

LLM_EXTRACTOR_VERSION bumped to llm-0.4.0.

Also: cron-backup.sh now runs POST /ingest/sources before extraction
so new PKM files flow in automatically. Fail-open, non-blocking.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:24:50 -04:00
c1f5b3bdee feat: Karpathy-inspired upgrades — contradiction, lint, synthesis
Three additive upgrades borrowed from Karpathy's LLM Wiki pattern:

1. CONTRADICTION DETECTION: auto-triage now has a fourth verdict —
   "contradicts". When a candidate conflicts with an existing memory
   (not duplicates, genuine disagreement like "Option A selected"
   vs "Option B selected"), the triage model flags it and leaves
   it in the queue for human review instead of silently rejecting
   or double-storing. Preserves source tension rather than
   suppressing it.

2. WEEKLY LINT PASS: scripts/lint_knowledge_base.py checks for:
   - Orphan memories (active but zero references after 14 days)
   - Stale candidates (>7 days unreviewed)
   - Unused entities (no relationships)
   - Empty-state projects
   - Unregistered projects auto-detected in memories
   Runs Sundays via the cron. Outputs a report.

3. WEEKLY SYNTHESIS: scripts/synthesize_projects.py uses sonnet to
   generate a 3-5 sentence "current state" paragraph per project
   from state + memories + entities. Cached in project_state under
   status/synthesis_cache. Wiki project pages now show this at the
   top under "Current State (auto-synthesis)". Falls back to a
   deterministic summary if no cache exists.

deploy/dalidou/batch-extract.sh: added Step C (synthesis) and
Step D (lint) gated to Sundays via date check.

All additive — nothing existing changes behavior. The database
remains the source of truth; these operations just produce better
synthesized views and catch rot.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 21:08:13 -04:00
761c483474 feat: wiki homepage groups projects by stage
Projects now appear under three buckets based on their state entries:
- Active Contracts
- Leads & Prospects
- Internal Tools & Infra

Each card shows the stage as a tag on the project title, the client
as an italic subtitle, and the project description. Empty buckets
hide. Makes it obvious at a glance what's contracted vs lead vs
internal.

Paired with stage/type/client state entries added to all 6 projects
so the grouping has data to work with.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:47:44 -04:00
c57617f611 feat: auto-project-detection + project stages
Three changes:

1. ABB-Space registered as a lead project with stage=lead in
   Trusted Project State. Projects now have lifecycle awareness
   (lead/proposition vs active contract vs completed).

2. Extraction no longer drops unregistered project tags. When the
   LLM extractor sees a conversation about a project not in the
   registry, it keeps the model's tag on the candidate instead of
   falling back to empty. This enables auto-detection of new
   projects/leads from organic conversations. The nightly pipeline
   surfaces these candidates for triage, where the operator sees
   "hey, there's a new project called X" and can decide whether
   to register it.

3. Extraction prompt updated to tell the model: "If the conversation
   discusses a project NOT in the known list, still tag it — the
   system will auto-detect it." This removes the artificial ceiling
   that prevented new project discovery.

Updated Case D test: unregistered + unscoped now keeps the model's
tag instead of dropping to empty.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 17:16:04 -04:00
3f18ba3b35 feat: AtoCore Wiki — navigable project knowledge browser
Full wiki interface at /wiki with:

- /wiki — Homepage with project cards, search box, system stats
- /wiki/projects/{name} — Project page with clickable entity links
- /wiki/entities/{id} — Entity detail with relationships as links
- /wiki/search?q=... — Search across entities and memories

Every entity name in a project page links to its detail page.
Entity detail pages show properties, relationships as clickable
links to related entities, and breadcrumb navigation back to the
project and wiki home.

Responsive, dark-mode, mobile-friendly. Card grid for projects.
Generated on-demand from the database — always current, no static
files, source of truth is the DB.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 16:09:12 -04:00
8527c369ee fix: add markdown to pyproject.toml (container pip install reads this, not requirements.txt)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 15:37:22 -04:00
bd3dc50100 feat: HTML mirror pages — readable project dashboards in browser
GET /projects/{name}/mirror.html serves a styled HTML page rendered
from the mirror markdown. Clean typography, responsive, dark mode
support, mobile-friendly. Open from phone or desktop:

  http://dalidou:8100/projects/p04-gigabit/mirror.html
  http://dalidou:8100/projects/p05-interferometer/mirror.html
  http://dalidou:8100/projects/p06-polisher/mirror.html

Uses the markdown library for md→html conversion. Added to
requirements.txt. The JSON endpoint (/mirror) still exists for
programmatic access.

Source of truth remains the AtoCore database. The HTML page is a
derived view with a clear disclaimer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 15:31:03 -04:00
700e3ca2c2 feat: Human Mirror — GET /projects/{name}/mirror
Layer 3 of the AtoCore architecture. Generates a human-readable
project overview in markdown from structured data:

- Trusted Project State (by category)
- System Architecture (systems → subsystems → components with
  material and interface links)
- Decisions (with affected entities)
- Requirements & Constraints
- Materials
- Vendors
- Active Memories (with confidence and reference counts)

The mirror is DERIVED — every line traces back to an entity, state
entry, or memory. The footer stamps the generation timestamp and
the "not canonical truth" disclaimer.

API: GET /projects/{project_name}/mirror returns {project, format,
content} where content is the full markdown page. Supports project
aliases via resolve_project_name.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 14:37:12 -04:00
ccc49d3a8f feat: engineering-aware context assembly
When a query matches a known engineering entity by name, the context
pack now includes a structured '--- Engineering Context ---' band
showing the entity's type, description, and its relationships to
other entities (subsystems, materials, requirements, decisions).

Six-tier context assembly:
  1. Trusted Project State
  2. Identity / Preferences
  3. Project Memories
  4. Domain Knowledge
  5. Engineering Context (NEW)
  6. Retrieved Chunks

The engineering band uses the same token-overlap scoring as memory
ranking: query tokens are matched against entity names + descriptions.
The top match gets its full relationship context included.

10% budget allocation. Trims before domain knowledge (lowest
priority of the structured tiers since the same info may appear in
chunks).

Example: query 'lateral support design' against p04-gigabit
surfaces the Lateral Support subsystem entity with its relationships
to GF-PTFE material, M1 Mirror Assembly parent system, and related
components.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:17:01 -04:00
3e0a357441 feat: bootstrap 35 engineering entities + relationships from project knowledge
Seeds the entity graph from existing project state, memories, and
vault docs across p04-gigabit (11 entities), p05-interferometer (10),
and p06-polisher (14). Covers systems, subsystems, components,
materials, decisions, requirements, constraints, vendors, and
parameters with structural and intent relationships.

Example: GET /entities/{M1 Mirror Assembly id} returns the full
context — 4 subsystems it contains, 2 requirements it's constrained
by, and the parent project — traversable in one API call.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 09:57:53 -04:00
dc20033a93 feat: Engineering Knowledge Layer V1 — entities + relationships
Layer 2 of the AtoCore architecture. Adds typed engineering entities
with relationships on top of the flat memory/state/chunk substrate.

Schema:
- entities table: id, entity_type, name, project, description,
  properties (JSON), status, confidence, source_refs, timestamps
- relationships table: source_entity_id, target_entity_id,
  relationship_type, confidence, source_refs

15 entity types: project, system, subsystem, component, interface,
requirement, constraint, decision, material, parameter,
analysis_model, result, validation_claim, vendor, process

12 relationship types: contains, part_of, interfaces_with,
satisfies, constrained_by, affected_by_decision, analyzed_by,
validated_by, depends_on, uses_material, described_by, supersedes

Service layer: full CRUD + get_entity_with_context (returns an
entity with its relationships and all related entities in one call).

API endpoints:
- POST /entities — create entity
- GET /entities — list/filter by type, project, status, name
- GET /entities/{id} — entity + relationships + related entities
- POST /relationships — create relationship

Schema auto-initialized on app startup via init_engineering_schema().

7 tests covering entity CRUD, relationships, context traversal,
filtering, name search, and validation.

Test count: 290 -> 297.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 09:50:58 -04:00
b86181eb6c docs: knowledge architecture — dual-layer model + domain knowledge
Comprehensive architecture doc covering:
- The problem (applied vs domain knowledge separation)
- The quality bar (earned insight vs common knowledge, with examples)
- Five-tier context assembly with budget allocation
- Knowledge domains (10 domains: physics through finance)
- Domain tag encoding (prefix in content, no schema migration)
- Full flow: capture → extract → triage → surface
- Cross-project example (p04 insight surfaces in p06 context)
- Future directions: personal branch, multi-model, reinforcement

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 09:14:32 -04:00
9118f824fa feat: dual-layer knowledge extraction + domain knowledge band
The extraction system now produces two kinds of candidates from
the same conversation:

A. PROJECT-SPECIFIC: applied facts scoped to a named project
   (unchanged behavior)
B. DOMAIN KNOWLEDGE: generalizable engineering insight earned
   through project work, tagged with a domain (physics, materials,
   optics, mechanics, manufacturing, metrology, controls, software,
   math, finance) and stored with project="" so it surfaces across
   all projects.

Critical quality bar enforced in the system prompt: "Would a
competent engineer need experience to know this, or could they
find it in 30 seconds on Google?" Textbook values, definitions,
and obvious facts are explicitly excluded. Only hard-won insight
qualifies — the kind that takes weeks of FEA or real machining
experience to discover.

Domain tags are embedded in the content as a prefix ("[physics]",
"[materials]") so they survive without a schema migration. A future
column can parse them out.

Context builder gains a new tier between project memories and
retrieved chunks:

  Tier 1: Trusted Project State     (project-specific)
  Tier 2: Identity / Preferences    (global)
  Tier 3: Project Memories          (project-specific)
  Tier 4: Domain Knowledge (NEW)    (cross-project, 10% budget)
  Tier 5: Retrieved Chunks          (project-boosted)

Trim order: chunks -> domain knowledge -> project memories ->
identity/preference -> project state.

Host-side extraction script updated with the same prompt and
domain-tag handling.

LLM_EXTRACTOR_VERSION bumped to llm-0.3.0.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 09:04:04 -04:00
db89978871 docs: full session sync — master plan + ledger + atomizer-v2 ingested
Master plan status updated to reflect current reality:
- 5 registered projects (atomizer-v2 newly ingested, 33,253 vectors)
- 47 active memories across all types
- 61 project state entries
- Nightly pipeline fully operational (both capture clients)
- 7/14 phases baseline complete
- "Now" section updated: observe/stabilize, multi-model triage,
  automated eval, atomizer state entries
- "Next" section updated: write-back, AtoDrive, hardening
- "Not Yet" items crossed off where applicable (reflection loop,
  auto-promotion, OpenClaw write-back)

DEV-LEDGER orientation fully refreshed with current vectors,
projects, pipeline state, and capture clients.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 20:32:47 -04:00
4ac4e5cc44 Merge codex/openclaw-capture-plugin — OpenClaw capture integration
Adds openclaw-plugins/atocore-capture/: a minimal OpenClaw plugin
that mirrors Claude Code's Stop hook. Captures user-triggered
assistant turns and POSTs to AtoCore /interactions with
client=openclaw, reinforce=true, fail-open.

Review verdict: functionally complete, one polish item (prompt
includes wrapper context — not blocking, extraction pipeline
handles noisy prompts). End-to-end verified on Dalidou with a
real client=openclaw interaction.

Both Claude Code and OpenClaw now feed AtoCore's reflection loop.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 18:34:47 -04:00
a6ae6166a4 feat: add OpenClaw AtoCore capture plugin 2026-04-12 22:06:07 +00:00
4f8bec7419 feat: deeper Wave 2 + observability dashboard
Wave 2 deeper ingestion:
- 6 new Trusted Project State entries from design-level docs:
  p05: test rig architecture, CGH specification, procurement combos
  p06: force control architecture, control channels, calibration loop
- Total state entries: ~23 (was ~17)

Observability:
- GET /admin/dashboard — one-shot system overview: memory counts
  by type/project/status, reinforced count, project state entry
  counts, recent interaction timestamp, extraction pipeline status.
  Replaces the need to query 4+ endpoints to understand system state.

Harness: 17/18 (no regression from new state entries).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:09:36 -04:00
52380a233e docs: Phase 4 baseline complete
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 16:56:24 -04:00
8b77e83f0a feat: Phase 4 — seed identity + preference memories, lower band to 5%
3 identity memories (Antoine's role, projects, infrastructure) and
3 preference memories (no API keys, multi-model collab, action bias)
seeded on live Dalidou. These fill the identity/preference band
that was previously empty.

Lowered MEMORY_BUDGET_RATIO from 0.10 to 0.05 because the 10%
allocation squeezed project memories and retrieval chunks enough
to regress 4 harness fixtures. At 5% the band fits at most 1 short
memory — enough for the most relevant identity/preference fact
without starving the project-specific tiers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 16:48:56 -04:00
dbb8f915e2 chore(ledger): Batch 3 close — R9 fixed, before/after documented
Before: a model returning 'p04-gigabit' for a p06-polisher
interaction would silently override the known scope because the
project was registered. After: interaction.project always wins
when set. Model project is only a fallback for unscoped captures.

Not yet guaranteed: within-project semantic errors (model says
the right project but wrong content). That's a content-quality
concern, not a trust-hierarchy issue.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 15:38:19 -04:00
e5e9a9931e fix(R9): trust hierarchy for project attribution
Batch 3, Days 1-3. The core R9 failure was Case F: when the model
returned a registered project DIFFERENT from the interaction's
known scope, the old code trusted the model because the project
was registered. A p06-polisher interaction could silently produce
a p04-gigabit candidate.

New rule (trust hierarchy):
1. Interaction scope always wins when set (cases A, C, E, F)
2. Model project used only for unscoped interactions AND only when
   it resolves to a registered project (cases D, G)
3. Empty string when both are empty or unregistered (case B)

The rule is: interaction.project is the strongest signal because
it comes from the capture hook's project detection, which runs
before the LLM ever sees the content. The model's project guess
is only useful when the capture hook had no project context.

7 case tests (A-G) cover every combination of model/interaction
project state. Pre-existing tests updated for the new behavior.

Host-side script mirrors the same hierarchy using _known_projects
fetched from GET /projects at startup.

Test count: 286 -> 290 (+4 net, 7 new R9 cases, 3 old tests
consolidated).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 15:37:29 -04:00
144dbbd700 Merge codex/audit-batch2 — R7/R8 confirmed fixed, R9 stays open
Codex verified R1/R5/R7/R8 fixed, harness 17/18, auto-triage
dry-run works. R9 stays open: registered-but-wrong project from
model can still override interaction scope. Fair — the registry
check prevents hallucinated names but not misattribution between
real projects.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 15:28:00 -04:00
7650c339a2 audit: verify batch2 claims and findings 2026-04-12 19:06:51 +00:00