fad30d546176fbf430275818efcd79cc8fa4054a
24 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
| fad30d5461 |
feat(client): Phase 9 reflection loop surface in shared operator CLI
Codex's sequence step 3: finish the Phase 9 operator surface in the
shared client. The previous client version (0.1.0) covered stable
operations (project lifecycle, retrieval, context build, trusted
state, audit-query) but explicitly deferred capture/extract/queue/
promote/reject pending "exercised workflow". That deferral ran
into a bootstrap problem: real Claude Code sessions can't exercise
the Phase 9 loop without a usable client surface to drive it. This
commit ships the 8 missing subcommands so the next step (real
validation on Dalidou) is unblocked.
Bumps CLIENT_VERSION from 0.1.0 to 0.2.0 per the semver rules in
llm-client-integration.md (new subcommands = minor bump).
New subcommands in scripts/atocore_client.py
--------------------------------------------
| Subcommand | Endpoint |
|-----------------------|-------------------------------------------|
| capture | POST /interactions |
| extract | POST /interactions/{id}/extract |
| reinforce-interaction | POST /interactions/{id}/reinforce |
| list-interactions | GET /interactions |
| get-interaction | GET /interactions/{id} |
| queue | GET /memory?status=candidate |
| promote | POST /memory/{id}/promote |
| reject | POST /memory/{id}/reject |
Each follows the existing client style: positional arguments with
empty-string defaults for optional filters, truthy-string arguments
for booleans (matching the existing refresh-project pattern), JSON
output via print_json(), fail-open behavior inherited from
request().
capture accepts prompt + response + project + client + session_id +
reinforce as positionals, defaulting the client field to
"atocore-client" when omitted so every capture from the shared
client is identifiable in the interactions audit trail.
extract defaults to preview mode (persist=false). Pass "true" as
the second positional to create candidate memories.
list-interactions and queue build URL query strings with
url-encoded values and always include the limit, matching how the
existing context-build subcommand handles its parameters.
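A minimal sketch of the query-string construction described above. The helper name and the default limit of 50 are assumptions for illustration; only the filter names (project, session_id, client, since) and the "always include the limit" rule come from this log.

```python
from urllib.parse import urlencode


def build_list_interactions_path(project="", session_id="", client="", since="", limit=50):
    """Hypothetical helper: build the GET /interactions path with an encoded query string."""
    params = {"limit": int(limit)}  # the limit is always sent
    # Empty-string positionals mean "filter not set", so they are left out.
    for key, value in (
        ("project", project),
        ("session_id", session_id),
        ("client", client),
        ("since", since),
    ):
        if value:
            params[key] = value
    # urlencode percent-encodes each value, including the ":" in ISO 8601 timestamps.
    return "/interactions?" + urlencode(params)
```

With no filters set, this yields only the limit parameter; a since value such as 2026-04-01T00:00:00Z is sent with its colons percent-encoded.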
Security fix: ID-field URL encoding
-----------------------------------
The initial draft used urllib.parse.quote() with the default safe
set, which does NOT encode "/" because it's a reserved path
character. That's a security footgun on ID fields: passing
"promote mem/evil/action" would build /memory/mem/evil/action/promote
and hit a completely different endpoint than intended.
Fixed by passing safe="" to urllib.parse.quote() on every ID field
(interaction_id and memory_id). The tests cover this explicitly via
test_extract_url_encodes_interaction_id and test_promote_url_encodes_memory_id,
both of which would have failed with the default behavior.
Project names keep the default quote behavior because a project
name with a slash would already be broken elsewhere in the system
(ingest root resolution, file paths, etc.).
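The difference between the two quote() calls can be seen in a short sketch. The path-builder name is hypothetical; the urllib.parse.quote() behavior is standard library behavior.

```python
from urllib.parse import quote


def promote_path(memory_id):
    """Hypothetical path builder for POST /memory/{id}/promote."""
    # safe="" makes quote() percent-encode "/" too, so the ID stays a single path segment.
    return "/memory/" + quote(memory_id, safe="") + "/promote"


# Default safe="/" leaves slashes alone, so the ID leaks into the path structure.
unsafe_path = "/memory/" + quote("mem/evil/action") + "/promote"
safe_path = promote_path("mem/evil/action")
```

Here unsafe_path is /memory/mem/evil/action/promote, while safe_path is /memory/mem%2Fevil%2Faction/promote.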
tests/test_atocore_client.py (new, 18 tests, all green)
-------------------------------------------------------
A dedicated test file for the shared client that mocks the
request() helper and verifies each subcommand:
- calls the correct HTTP method and path
- builds the correct JSON body (or query string)
- passes the right subset of CLI arguments through
- URL-encodes ID fields so path traversal isn't possible
Tests are structured as unit tests (not integration tests) because
the API surface on the server side already has its own route tests
in test_api_storage.py and the Phase 9 specific files. These tests
are the wiring contract between CLI args and HTTP calls.
Test file highlights:
- capture: default values, custom client, reinforce=false
- extract: preview by default, persist=true opt-in, URL encoding
- reinforce-interaction: correct path construction
- list-interactions: no filters, single filter, full filter set
(including ISO 8601 since parameter with T separator and Z)
- get-interaction: fetch by id
- queue: always filters status=candidate, accepts memory_type
and project, coerces limit to int
- promote / reject: correct path + URL encoding
- test_phase9_full_loop_via_client_shape: end-to-end sequence
that drives capture -> extract preview -> extract persist ->
queue list -> promote -> reject through the shared client and
verifies the exact sequence of HTTP calls that would be made
These tests run in ~0.2s because they mock request() — no DB, no
Chroma, no HTTP. The fast feedback loop matters because the
client surface is what every agent integration eventually depends
on.
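The mocking pattern described above might look roughly like the following. This is a self-contained sketch, not the actual test file: the request() signature, the reject() body, and the mocked return value are all assumptions.

```python
from unittest import mock
from urllib.parse import quote


def request(method, path, body=None):
    """Stand-in for the client's HTTP helper (assumed signature)."""
    raise RuntimeError("unit tests must not reach the network")


def reject(memory_id):
    """Hypothetical subcommand body for POST /memory/{id}/reject."""
    return request("POST", "/memory/" + quote(memory_id, safe="") + "/reject")


def test_reject_url_encodes_memory_id():
    fake = mock.Mock(return_value={"ok": True})  # arbitrary canned response
    # Swap the module-level request() for the mock; patch.dict restores it on exit.
    with mock.patch.dict(globals(), {"request": fake}):
        result = reject("mem/1")
    # The wiring contract: exact HTTP method and exact encoded path.
    fake.assert_called_once_with("POST", "/memory/mem%2F1/reject")
    assert result == {"ok": True}
```

Because the HTTP helper is replaced, the test exercises only the mapping from CLI arguments to the HTTP call, which is why such tests need no DB, Chroma, or server.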
docs/architecture/llm-client-integration.md updates
---------------------------------------------------
- New "Phase 9 reflection loop (shipped after migration safety
work)" section under "What's in scope for the shared client
today" with the full 8-subcommand table and a note explaining
the bootstrap-problem rationale
- Removed the "Memory review queue and reflection loop" section
from "What's intentionally NOT in scope today"; backup admin
and engineering-entity commands remain the only deferred
families
- Renumbered the deferred-commands list (was 3 items, now 2)
- Open follow-ups updated: memory-review-subcommand item replaced
with "real-usage validation of the Phase 9 loop" as the next
concrete dependency
- TL;DR updated to list the reflection-loop subcommands
- Versioning note records the v0.1.0 -> v0.2.0 bump with the
subcommands included
Full suite: 215 passing (was 197), 1 warning. The +18 is
tests/test_atocore_client.py. Runtime unchanged because the new
tests don't touch the DB.
What this commit does NOT do
----------------------------
- Does NOT change the server-side endpoints. All 8 subcommands
call existing API routes that were shipped in Phase 9 Commits
A/B/C. This is purely a client-side wiring commit.
- Does NOT run the reflection loop against the live Dalidou
instance. That's the next concrete step and is explicitly
called out in the open-follow-ups section of the updated doc.
- Does NOT modify the Claude Code slash command. It still pulls
context only; the capture/extract/queue/promote companion
commands (e.g. /atocore-record-response) are deferred until the
capture workflow has been exercised in real use at least once.
- Does NOT refactor the OpenClaw helper. That's a cross-repo
change and remains a queued follow-up, now unblocked by the
shared client having the reflection-loop subcommands.
|
|||
| 261277fd51 |
fix(migration): preserve superseded/invalid shadow state during rekey
Codex caught a real data-loss bug in the legacy alias migration
shipped in
|
|||
| 7e60f5a0e6 |
feat(ops): legacy alias migration script with dry-run/apply modes
Closes the compatibility gap documented in docs/architecture/project-identity-canonicalization.md. Before |
|||
| 1953e559f9 |
docs+test: clarify legacy alias compatibility gap, add gap regression test
Codex caught a real documentation accuracy bug in the previous canonicalization doc commit ( |
|||
| fb6298a9a1 |
fix(P1+P2): canonicalize project names at every trust boundary
Three findings from codex's review of the previous P1+P2 fix. The earlier commit ( |
|||
| f2372eff9e |
fix(P1+P2): alias-aware project state lookup + slash command corpus fallback
Two regression fixes from codex's review of the slash command
refactor commit (
|
|||
| 53147d326c |
feat(phase9-C): rule-based candidate extractor and review queue
Phase 9 Commit C. Closes the capture loop: Commit A records what
AtoCore fed the LLM and what came back, Commit B bumps confidence on
active memories the response actually references, and this commit
turns structured cues in the response into candidate memories for a
human review queue.
Nothing extracted here is ever automatically promoted into trusted
state. Every candidate sits at status="candidate" until a human (or
later, a confident automatic policy) calls /memory/{id}/promote or
/memory/{id}/reject. This keeps the "bad memory is worse than no
memory" invariant from the operating model intact.
New module: src/atocore/memory/extractor.py
- MemoryCandidate dataclass (type, content, rule, source_span,
project, confidence, source_interaction_id)
- extract_candidates_from_interaction(interaction): runs a fixed set
of regex rules over the response + response_summary and returns
a list of candidates
V0 rule set (deliberately narrow to keep false positives low):
- decision_heading ## Decision: / ## Decision - / ## Decision —
-> adaptation candidate
- constraint_heading ## Constraint: ... -> project candidate
- requirement_heading ## Requirement: ... -> project candidate
- fact_heading ## Fact: ... -> knowledge candidate
- preference_sentence "I prefer X" / "the user prefers X"
-> preference candidate
- decided_to_sentence "decided to X" -> adaptation candidate
- requirement_sentence "the requirement is X" -> project candidate
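A sketch of how the four heading rules could be expressed as one regex. The pattern itself is an assumption (the real rules are not shown in this log); the heading-to-memory-type mapping and the rule names are taken from the list above.

```python
import re

# Mapping from the rule list above: heading keyword -> (memory type, rule name).
HEADING_TYPES = {
    "decision": ("adaptation", "decision_heading"),
    "constraint": ("project", "constraint_heading"),
    "requirement": ("project", "requirement_heading"),
    "fact": ("knowledge", "fact_heading"),
}

# Assumed pattern: "## <Keyword>" followed by ":", "-", or an em-dash (\u2014), then the value.
HEADING_RULE = re.compile(
    r"^##\s+(Decision|Constraint|Requirement|Fact)\s*[:\-\u2014]\s*(.+)$",
    re.MULTILINE | re.IGNORECASE,
)


def extract_heading_candidates(response_text):
    """Return (memory_type, value, rule) tuples for every matching heading line."""
    candidates = []
    for match in HEADING_RULE.finditer(response_text):
        memory_type, rule = HEADING_TYPES[match.group(1).lower()]
        candidates.append((memory_type, match.group(2).strip(), rule))
    return candidates
```

Plain prose without a matching heading produces an empty list, which matches the "returns empty for prose without structural cues" behavior the tests describe.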
Extractor post-processing:
- clean_value: collapse whitespace, strip trailing punctuation
- min content length 8 chars, max 280 (keeps candidates reviewable)
- dedupe by (memory_type, normalized value, rule)
- drop candidates whose content already matches an active memory of
the same type+project so the queue doesn't ask humans to re-curate
things they already promoted
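The first three post-processing steps can be sketched as follows. The tuple shape, the exact punctuation set, lowercasing as the normalization, and dropping (rather than truncating) over-long values are assumptions; the 8 and 280 character bounds and the dedupe key come from the list above. The check against existing active memories is omitted because it needs the memory store.

```python
import re

MIN_LEN, MAX_LEN = 8, 280  # bounds quoted in the commit message


def clean_value(raw):
    """Collapse runs of whitespace and strip trailing punctuation."""
    collapsed = re.sub(r"\s+", " ", raw).strip()
    return collapsed.rstrip(".;:!?, ")


def postprocess(matches):
    """matches: iterable of (memory_type, raw_value, rule) tuples (hypothetical shape)."""
    seen = set()
    kept = []
    for memory_type, raw, rule in matches:
        value = clean_value(raw)
        if not (MIN_LEN <= len(value) <= MAX_LEN):
            continue  # too short or too long to be a reviewable candidate
        key = (memory_type, value.lower(), rule)  # assumed normalization: lowercase
        if key in seen:
            continue  # duplicate of an earlier match
        seen.add(key)
        kept.append((memory_type, value, rule))
    return kept
```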
Memory service (extends Commit B candidate-status foundation):
- promote_memory(id): candidate -> active (404 if not a candidate)
- reject_candidate_memory(id): candidate -> invalid
- both are no-ops that return False if the target isn't currently a
candidate, so the API can surface 404 without the caller needing to pre-check
API endpoints (new):
- POST /interactions/{id}/extract: run extractor, preview-only
body: {"persist": false} (default) returns candidates
body: {"persist": true} creates candidate memories
- POST /memory/{id}/promote: candidate -> active
- POST /memory/{id}/reject: candidate -> invalid
- GET /memory?status=candidate: list review queue explicitly
(existing endpoint now accepts status= override)
- GET /memory now also returns reference_count and last_referenced_at
per memory so the Commit B reinforcement signal is visible to clients
Trust model unchanged:
- candidates NEVER appear in context packs (get_memories_for_context
still filters to active via the active_only default)
- candidates NEVER get reinforced by the Commit B loop (reinforcement
refuses non-active memories)
- trusted project state is untouched end-to-end
Tests (25 new, all green):
- heading pattern: decision, constraint, requirement, fact
- separator variants :, -, em-dash
- sentence patterns: preference, decided_to, requirement
- rejects too-short matches
- dedupes identical matches
- strips trailing punctuation
- carries project and source_interaction_id onto candidates
- drops candidates that duplicate an existing active memory
- returns empty for prose without structural cues
- candidate and active coexist in the memory table
- promote_memory moves candidate -> active
- promote on non-candidate returns False
- reject_candidate_memory moves candidate -> invalid
- reject on non-candidate returns False
- get_memories(status="candidate") returns just the queue
- POST /interactions/{id}/extract preview-only path
- POST /interactions/{id}/extract persist=true path
- POST /interactions/{id}/extract 404 for missing interaction
- POST /memory/{id}/promote success + 404 on non-candidate
- POST /memory/{id}/reject 404 on missing
- GET /memory?status=candidate surfaces the queue
- GET /memory?status=<invalid> returns 400
Full suite: 160 passing (was 135).
What Phase 9 looks like end to end after this commit
----------------------------------------------------
prompt
-> context pack assembled
-> LLM response
-> POST /interactions (capture)
-> automatic Commit B reinforcement (active memories only)
-> [optional] POST /interactions/{id}/extract
-> Commit C extractor proposes candidates
-> human reviews via GET /memory?status=candidate
-> POST /memory/{id}/promote (candidate -> active)
OR POST /memory/{id}/reject (candidate -> invalid)
Not in this commit (deferred on purpose):
- Decay of unused memories (we keep reference_count and
last_referenced_at so a later decay job has the signal it needs)
- LLM-based extractor as an alternative to the regex rules
- Automatic promotion of high-confidence candidates
- Candidate-to-entity upgrade path (needs the engineering layer
memory-vs-entities decision, planned in a coming architecture doc)
|
|||
| 2704997256 |
feat(phase9-B): reinforce active memories from captured interactions
Phase 9 Commit B from the agreed plan. With Commit A capturing what
AtoCore fed to the LLM and what came back, this commit closes the
weakest part of the loop: when a memory is actually referenced in a
response, its confidence should drift up, and stale memories that
nobody ever mentions should stay where they are.
This is reinforcement only — nothing is promoted into trusted state
and no candidates are created. Extraction is Commit C.
Schema (additive migration):
- memories.last_referenced_at DATETIME (null by default)
- memories.reference_count INTEGER DEFAULT 0
- idx_memories_last_referenced on last_referenced_at
- memories.status now accepts the new "candidate" value so Commit C
has the status slot to land on. Existing active/superseded/invalid
rows are untouched.
New module: src/atocore/memory/reinforcement.py
- reinforce_from_interaction(interaction): scans the interaction's
response + response_summary for echoes of active memories and
bumps confidence / reference_count for each match
- matching is intentionally simple and explainable:
* normalize both sides (lowercase, collapse whitespace)
* require >= 12 chars of memory content to match
* compare the leading 80-char window of each memory
- the candidate pool is project-scoped memories for the interaction's
project + global identity + preference memories, deduplicated
- candidates and invalidated memories are NEVER reinforced; only
active memories move
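The matching rule above could be sketched like this. Substring containment as the comparison is an assumption; the normalization steps, the 12-character minimum, and the 80-character window are the values stated in the list.

```python
import re

MIN_MATCH_CHARS = 12   # memories shorter than this never match
WINDOW_CHARS = 80      # only the leading window of each memory is compared


def normalize(text):
    """Lowercase and collapse whitespace."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def memory_is_echoed(memory_content, response_text):
    """Return True if the leading window of the memory appears in the response."""
    needle = normalize(memory_content)[:WINDOW_CHARS]
    if len(needle) < MIN_MATCH_CHARS:
        return False
    return needle in normalize(response_text)
```

A rule this simple stays easy to explain in an audit: for any reinforced memory, the matching text can be pointed to directly in the response.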
Memory service changes:
- MEMORY_STATUSES = ["candidate", "active", "superseded", "invalid"]
- create_memory(status="candidate"|"active"|...) with per-status
duplicate scoping so a candidate and an active with identical text
can legitimately coexist during review
- get_memories(status=...) explicit override of the legacy active_only
flag; callers can now list the review queue cleanly
- update_memory accepts any valid status including "candidate"
- reinforce_memory(id, delta): low-level primitive that bumps
confidence (capped at 1.0), increments reference_count, and sets
last_referenced_at. Only active memories; returns (applied, old, new)
- promote_memory / reject_candidate_memory helpers prepping Commit C
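A rough in-memory sketch of the reinforce_memory contract described above. The real primitive takes a memory id and works against the database; the Memory stand-in, the default delta, and raising ValueError on a negative delta are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Memory:
    """Hypothetical stand-in for one row of the memories table."""
    status: str
    confidence: float
    reference_count: int = 0
    last_referenced_at: Optional[str] = None


def reinforce_memory(memory, delta=0.02):
    """Bump an active memory; returns (applied, old_confidence, new_confidence)."""
    if delta < 0:
        raise ValueError("delta must be non-negative")
    old = memory.confidence
    if memory.status != "active":
        return (False, old, old)  # candidates, superseded and invalid rows never move
    memory.confidence = min(1.0, old + delta)  # capped at 1.0
    memory.reference_count += 1
    memory.last_referenced_at = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    return (True, old, memory.confidence)
```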
Interactions service:
- record_interaction(reinforce=True) runs reinforce_from_interaction
automatically when the interaction has response content. Reinforcement
errors are logged but never raised back to the caller, so capture
itself is never blocked by a flaky downstream.
- circular import between interactions service and memory.reinforcement
avoided by lazy import inside the function
API:
- POST /interactions now accepts a reinforce bool field (default true)
- POST /interactions/{id}/reinforce runs reinforcement on an existing
captured interaction — useful for backfilling or for retrying after
a transient error in the automatic pass
- response lists which memory ids were reinforced with
old / new confidence for audit
Tests (17 new, all green):
- reinforce_memory bumps, caps at 1.0, accumulates reference_count
- reinforce_memory rejects candidates and missing ids
- reinforce_memory rejects negative delta
- reinforce_from_interaction matches active memory
- reinforce_from_interaction ignores candidates and inactive
- reinforce_from_interaction requires minimum content length
- reinforce_from_interaction handles empty response cleanly
- reinforce_from_interaction normalizes casing and whitespace
- reinforce_from_interaction deduplicates across memory buckets
- record_interaction auto-reinforces by default
- record_interaction reinforce=False skips the pass
- record_interaction handles empty response
- POST /interactions/{id}/reinforce runs against stored interaction
- POST /interactions/{id}/reinforce returns 404 for missing id
- POST /interactions accepts reinforce=false
Full suite: 135 passing (was 118).
Trust model unchanged:
- reinforcement only moves confidence within the existing active set
- the candidate lifecycle is declared but only Commit C will actually
create candidate memories
- trusted project state is never touched by reinforcement
Next: Commit C adds the rule-based extractor that produces candidate
memories from captured interactions plus the promote/reject review
queue endpoints.
|
|||
| ea3fed3d44 |
feat(phase9-A): interaction capture loop foundation
Phase 9 Commit A from the agreed plan: turn AtoCore from a stateless
context enhancer into a system that records what it actually fed to an
LLM and what came back. This is the audit trail Reflection (Commit B)
and Extraction (Commit C) will be layered on top of.
The interactions table has existed in the schema since the original PoC,
but nothing wrote to it. This change makes it real:
Schema migration (additive only):
- response: full LLM response (caller decides how much)
- memories_used: JSON list of memory ids in the context pack
- chunks_used: JSON list of chunk ids in the context pack
- client: identifier of the calling system
(openclaw, claude-code, manual, ...)
- session_id: groups multi-turn conversations
- project: project name (mirrors the memory module pattern,
no FK so capture stays cheap)
- indexes on session_id, project, created_at
The created_at column is now written explicitly with a SQLite-compatible
'YYYY-MM-DD HH:MM:SS' format so the same string lives in the DB and the
returned dataclass. Without this the `since` filter on list_interactions
would silently fail because CURRENT_TIMESTAMP and isoformat use different
shapes that do not compare cleanly as strings.
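The string-comparison problem can be reproduced with the standard library alone. The helper name and the example instant are arbitrary; the two formats are the ones named above.

```python
from datetime import datetime, timezone


def to_db_timestamp(moment):
    """Format a datetime in the 'YYYY-MM-DD HH:MM:SS' shape used by SQLite's CURRENT_TIMESTAMP."""
    return moment.strftime("%Y-%m-%d %H:%M:%S")


moment = datetime(2026, 4, 1, 12, 0, 0, tzinfo=timezone.utc)  # arbitrary example instant
stored = to_db_timestamp(moment)   # space between date and time
iso_cutoff = moment.isoformat()    # "T" between date and time, plus a UTC offset

# Same instant, but " " sorts before "T", so the stored row looks older than the cutoff.
mismatched_passes = stored >= iso_cutoff
# With both sides in the same shape, an inclusive "since" comparison behaves as expected.
matched_passes = stored >= to_db_timestamp(moment)
```

Here mismatched_passes is False even though the row sits exactly on the cutoff, while matched_passes is True.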
New module src/atocore/interactions/:
- Interaction dataclass
- record_interaction() persists one round-trip (prompt required;
everything else optional). Refuses empty prompts.
- list_interactions() filters by project / session_id / client / since,
newest-first, hard-capped at 500
- get_interaction() fetch by id, full response + context pack
API endpoints:
- POST /interactions: capture one interaction
- GET /interactions: list with summaries (no full response)
- GET /interactions/{id}: full record incl. response + pack
Trust model:
- Capture is read-only with respect to memories, project state, and
source chunks. Nothing here promotes anything into trusted state.
- The audit trail becomes the dataset Commit B (reinforcement) and
Commit C (extraction + review queue) will operate on.
Tests (13 new, all green):
- service: persist + roundtrip every field
- service: minimum-fields path (prompt only)
- service: empty / whitespace prompt rejected
- service: get by id returns None for missing
- service: filter by project, session, client
- service: ordering newest-first with limit
- service: since filter inclusive on cutoff (the bug the timestamp
fix above caught)
- service: limit=0 returns empty
- API: POST records and round-trips through GET /interactions/{id}
- API: empty prompt returns 400
- API: missing id returns 404
- API: list filter returns summaries (not full response bodies)
Full suite: 118 passing (was 105).
master-plan-status.md updated to move Phase 9 from "not started" to
"started" with the explicit note that Commit A is in and Commits B/C
remain.
|
|||
| c9b9eede25 |
feat: tunable ranking, refresh status, chroma backup + admin endpoints
Three small improvements that move the operational baseline forward
without changing the existing trust model.
1. Tunable retrieval ranking weights
- rank_project_match_boost, rank_query_token_step,
rank_query_token_cap, rank_path_high_signal_boost,
rank_path_low_signal_penalty are now Settings fields
- all overridable via ATOCORE_* env vars
- retriever no longer hard-codes 2.0 / 1.18 / 0.72 / 0.08 / 1.32
- lets ranking be tuned per environment as Wave 1 is exercised
without code changes
2. /projects/{name}/refresh status
- refresh_registered_project now returns an overall status field
("ingested", "partial", "nothing_to_ingest") plus roots_ingested
and roots_skipped counters
- ProjectRefreshResponse advertises the new fields so callers can
rely on them
- covers the case where every configured root is missing on disk
3. Chroma cold snapshot + admin backup endpoints
- create_runtime_backup now accepts include_chroma and writes a
cold directory copy of the chroma persistence path
- new list_runtime_backups() and validate_backup() helpers
- new endpoints:
- POST /admin/backup: create snapshot (optional chroma)
- GET /admin/backup: list snapshots
- GET /admin/backup/{stamp}/validate: structural validation
- chroma snapshots are taken under exclusive_ingestion() so a refresh
or ingest cannot race with the cold copy
- backup metadata records what was actually included and how big
Tests:
- 8 new tests covering tunable weights, refresh status branches
(ingested / partial / nothing_to_ingest), chroma snapshot, list,
validate, and the API endpoints (including the lock-acquisition path)
- existing fake refresh stubs in test_api_storage.py updated for the
expanded ProjectRefreshResponse model
- full suite: 105 passing (was 97)
next-steps doc updated to reflect that the chroma snapshot + restore
validation gap from current-state.md is now closed in code; only the
operational retention policy remains.
|
|||
| 14ab7c8e9f |
fix: pass project_hint into retrieve and add path-signal ranking
Two changes that belong together:
1. builder.build_context() now passes project_hint into retrieve(),
so the project-aware boost actually fires for the retrieval pipeline
driven by /context/build. Before this, only direct /query callers
benefited from the registered-project boost.
2. retriever now applies two more ranking signals on every chunk:
- _query_match_boost: boosts chunks whose source/title/heading
echo high-signal query tokens (stop list filters out generic
words like "the", "project", "system")
- _path_signal_boost: down-weights archival noise (_archive,
_history, pre-cleanup, reviews) by 0.72 and up-weights current
high-signal docs (status, decision, requirements, charter,
system-map, error-budget, ...) by 1.18
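A sketch of the path-signal idea under stated assumptions: substring matching on the lowercased path, checking low-signal markers first, and a neutral multiplier of 1.0 are guesses. The marker lists are the partial ones named above (the original list continues), and 0.72 and 1.18 are the multipliers from this message.

```python
# Partial marker lists taken from the commit message; the real lists are longer.
LOW_SIGNAL_MARKERS = ("_archive", "_history", "pre-cleanup", "reviews")
HIGH_SIGNAL_MARKERS = ("status", "decision", "requirements", "charter", "system-map", "error-budget")


def path_signal_boost(source_path, high_boost=1.18, low_penalty=0.72):
    """Return a score multiplier for a chunk based on its source path."""
    path = source_path.lower()
    # Assumed precedence: archival markers win, so an archived status doc is still down-weighted.
    if any(marker in path for marker in LOW_SIGNAL_MARKERS):
        return low_penalty
    if any(marker in path for marker in HIGH_SIGNAL_MARKERS):
        return high_boost
    return 1.0
```

A retriever would multiply each chunk's similarity score by this value before sorting, so current status and decision documents outrank archived copies with similar text.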
Tests:
- test_context_builder_passes_project_hint_to_retrieval verifies
the wiring fix
- test_retrieve_downranks_archive_noise_and_prefers_high_signal_paths
verifies the new ranking helpers prefer current docs over archive
This addresses the cross-project competition and archive bleed
called out in current-state.md after the Wave 1 ingestion.
|
|||
| bdb42dba05 | Expand active project wave and serialize refreshes | |||
| 26bfa94c65 | Add project-aware boost to raw query | |||
| 06aa931273 | Add project registry update flow | |||
| c9757e313a | Harden runtime and add backup foundation | |||
| 9715fe3143 | Add project registration endpoint | |||
| 1f1e6b5749 | Add project registration proposal preview | |||
| 827dcf2cd1 | Add project registration policy and template | |||
| 8293099025 | Add project registry refresh foundation | |||
| 6bfa1fcc37 | Add Dalidou storage foundation and deployment prep | |||
| b0889b3925 | Stabilize core correctness and sync project plan state | |||
| b48f0c95ab |
feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory
Context builder integration (trust precedence per Master Plan):
1. Trusted Project State (highest trust, 20% budget)
2. Identity + Preference memories (10% budget)
3. Retrieved chunks (remaining budget)
Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
|||
| 531c560db7 |
feat: Phase 1 ingestion hardening + Phase 5 Trusted Project State
Phase 1 - Ingestion hardening:
- Encoding fallback (UTF-8/UTF-8-sig/Latin-1/CP1252)
- Delete detection: purge DB/vector entries for removed files
- Ingestion stats endpoint (GET /stats)
Phase 5 - Trusted Project State:
- project_state table with categories (status, decision, requirement, contact, milestone, fact, config)
- CRUD API: POST/GET/DELETE /project/state
- Upsert semantics, invalidation (supersede) support
- Context builder integrates project state at highest trust precedence
- Project state gets 20% budget allocation, appears first in context
- Trust precedence: Project State > Retrieved Chunks (per Master Plan)
33/33 tests passing. Validated end-to-end with GigaBIT M1 project data.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
|||
| b4afbbb53a |
feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|