Commit Graph

8 Commits

Author SHA1 Message Date
b492f5f7b0 fix: schema init ordering, deploy.sh default, client BASE_URL docs
Three issues Dalidou Claude surfaced during the first real deploy
of commit e877e5b to the live service (report from 2026-04-08).
Bug 1 was the critical one — a schema init ordering bug that would
have bitten every future upgrade from a pre-Phase-9 schema — and
the other two were usability traps around hostname resolution.

Bug 1 (CRITICAL): schema init ordering
--------------------------------------
src/atocore/models/database.py

SCHEMA_SQL contained CREATE INDEX statements that referenced
columns added later by _apply_migrations():

    CREATE INDEX IF NOT EXISTS idx_memories_project ON memories(project);
    CREATE INDEX IF NOT EXISTS idx_interactions_project_name ON interactions(project);
    CREATE INDEX IF NOT EXISTS idx_interactions_session ON interactions(session_id);

On a FRESH install, CREATE TABLE IF NOT EXISTS creates the tables
with the Phase 9 shape (columns present), so the CREATE INDEX runs
cleanly and _apply_migrations is effectively a no-op.

On an UPGRADE from a pre-Phase-9 schema, CREATE TABLE IF NOT EXISTS
is a no-op (the tables already exist in the old shape), the columns
are NOT added yet, and the CREATE INDEX fails with
"OperationalError: no such column: project" before
_apply_migrations gets a chance to add the columns.

Dalidou Claude hit this exactly when redeploying from 0.1.0 to
0.2.0 — had to manually ALTER TABLE to add the Phase 9 columns
before the container could start.

The fix is to remove the Phase 9-column indexes from SCHEMA_SQL.
They already exist in _apply_migrations() AFTER the corresponding
ALTER TABLE, so they still get created on both fresh and upgrade
paths — just after the columns exist, not before.

Indexes still in SCHEMA_SQL (all safe — reference columns that
have existed since the first release):
- idx_chunks_document on source_chunks(document_id)
- idx_memories_type on memories(memory_type)
- idx_memories_status on memories(status)
- idx_interactions_project on interactions(project_id)

Indexes moved to _apply_migrations (already there — just no longer
duplicated in SCHEMA_SQL):
- idx_memories_project on memories(project)
- idx_interactions_project_name on interactions(project)
- idx_interactions_session on interactions(session_id)
- idx_interactions_created_at on interactions(created_at)

Regression test: tests/test_database.py
---------------------------------------
New test_init_db_upgrades_pre_phase9_schema_without_failing:

- Seeds the DB with the exact pre-Phase-9 shape (no project /
  last_referenced_at / reference_count on memories; no project /
  client / session_id / response / memories_used / chunks_used on
  interactions)
- Calls init_db() — which used to raise OperationalError before
  the fix
- Verifies all Phase 9 columns are present after the call
- Verifies the migration indexes exist

Before the fix this test would have failed with
"OperationalError: no such column: project" on the init_db call.
After the fix it passes. This locks the invariant "init_db is
safe on any legacy schema shape" so the bug can't silently come
back.

Full suite: 216 passing (was 215), 1 warning. The +1 is the new
regression test.

Bug 3 (usability): deploy.sh DNS default
----------------------------------------
deploy/dalidou/deploy.sh

ATOCORE_GIT_REMOTE defaulted to http://dalidou:3000/Antoine/ATOCore.git
which requires the "dalidou" hostname to resolve. On the Dalidou
host itself it didn't (no /etc/hosts entry for localhost alias),
so deploy.sh had to be run with the IP as a manual workaround.

Fix: default ATOCORE_GIT_REMOTE to http://127.0.0.1:3000/Antoine/ATOCore.git.
Loopback always works on the host running the script. Callers
from a remote host (e.g. running deploy.sh from a laptop against
the Dalidou LAN) set ATOCORE_GIT_REMOTE explicitly. The script
header's Environment Variables section documents this with an
explicit reference to the 2026-04-08 Dalidou deploy report so the
rationale isn't lost.

docs/dalidou-deployment.md gets a new "Troubleshooting hostname
resolution" subsection and a new example invocation showing how
to deploy from a remote host with an explicit ATOCORE_GIT_REMOTE
override.

Bug 2 (usability): atocore_client.py ATOCORE_BASE_URL documentation
-------------------------------------------------------------------
scripts/atocore_client.py

Same class of issue as bug 3. BASE_URL defaults to
http://dalidou:8100 which resolves fine from a remote caller
(laptop, T420/OpenClaw over Tailscale) but NOT from the Dalidou
host itself or from inside the atocore container. Dalidou Claude
saw the CLI return
{"status": "unavailable", "fail_open": true}
while direct curl to http://127.0.0.1:8100 worked.

The fix here is NOT to change the default (remote callers are
the common case and would break) but to DOCUMENT the override
clearly so the next operator knows what's happening:

- The script module docstring grew a new "Environment variables"
  section covering ATOCORE_BASE_URL, ATOCORE_TIMEOUT_SECONDS,
  ATOCORE_REFRESH_TIMEOUT_SECONDS, and ATOCORE_FAIL_OPEN, with
  the explicit override example for on-host/in-container use
- It calls out the exact symptom (fail-open envelope when the
  base URL doesn't resolve) so the diagnosis is obvious from
  the error alone
- docs/dalidou-deployment.md troubleshooting section mirrors
  this guidance so there's one place to look regardless of
  whether the operator starts with the client help or the
  deploy doc

What this commit does NOT do
----------------------------
- Does NOT change the default ATOCORE_BASE_URL. Doing that would
  break the T420 OpenClaw helper and every remote caller who
  currently relies on the hostname. Documentation is the right
  fix for this case.
- Does NOT fix /etc/hosts on Dalidou. That's a host-level
  configuration issue that the user can fix if they prefer
  having the hostname resolve; the deploy.sh fix makes it
  unnecessary regardless.
- Does NOT re-run the validation on Dalidou. The next step is
  for the live service to pull this commit via deploy.sh (which
  should now work without the IP workaround) and re-run the
  Phase 9 loop test to confirm nothing regressed.
2026-04-08 19:02:57 -04:00
2704997256 feat(phase9-B): reinforce active memories from captured interactions
Phase 9 Commit B from the agreed plan. With Commit A capturing what
AtoCore fed to the LLM and what came back, this commit closes the
weakest part of the loop: when a memory is actually referenced in a
response, its confidence should drift up, and stale memories that
nobody ever mentions should stay where they are.

This is reinforcement only — nothing is promoted into trusted state
and no candidates are created. Extraction is Commit C.

Schema (additive migration):
- memories.last_referenced_at DATETIME      (null by default)
- memories.reference_count    INTEGER DEFAULT 0
- idx_memories_last_referenced on last_referenced_at
- memories.status now accepts the new "candidate" value so Commit C
  has the status slot to land on. Existing active/superseded/invalid
  rows are untouched.

New module: src/atocore/memory/reinforcement.py
- reinforce_from_interaction(interaction): scans the interaction's
  response + response_summary for echoes of active memories and
  bumps confidence / reference_count for each match
- matching is intentionally simple and explainable:
  * normalize both sides (lowercase, collapse whitespace)
  * require >= 12 chars of memory content to match
  * compare the leading 80-char window of each memory
- the candidate pool is project-scoped memories for the interaction's
  project + global identity + preference memories, deduplicated
- candidates and invalidated memories are NEVER reinforced; only
  active memories move

Memory service changes:
- MEMORY_STATUSES = ["candidate", "active", "superseded", "invalid"]
- create_memory(status="candidate"|"active"|...) with per-status
  duplicate scoping so a candidate and an active with identical text
  can legitimately coexist during review
- get_memories(status=...) explicit override of the legacy active_only
  flag; callers can now list the review queue cleanly
- update_memory accepts any valid status including "candidate"
- reinforce_memory(id, delta): low-level primitive that bumps
  confidence (capped at 1.0), increments reference_count, and sets
  last_referenced_at. Only active memories; returns (applied, old, new)
- promote_memory / reject_candidate_memory helpers prepping Commit C

Interactions service:
- record_interaction(reinforce=True) runs reinforce_from_interaction
  automatically when the interaction has response content. reinforcement
  errors are logged but never raised back to the caller so capture
  itself is never blocked by a flaky downstream.
- circular import between interactions service and memory.reinforcement
  avoided by lazy import inside the function

API:
- POST /interactions now accepts a reinforce bool field (default true)
- POST /interactions/{id}/reinforce runs reinforcement on an existing
  captured interaction — useful for backfilling or for retrying after
  a transient error in the automatic pass
- response lists which memory ids were reinforced with
  old / new confidence for audit

Tests (17 new, all green):
- reinforce_memory bumps, caps at 1.0, accumulates reference_count
- reinforce_memory rejects candidates and missing ids
- reinforce_memory rejects negative delta
- reinforce_from_interaction matches active memory
- reinforce_from_interaction ignores candidates and inactive
- reinforce_from_interaction requires minimum content length
- reinforce_from_interaction handles empty response cleanly
- reinforce_from_interaction normalizes casing and whitespace
- reinforce_from_interaction deduplicates across memory buckets
- record_interaction auto-reinforces by default
- record_interaction reinforce=False skips the pass
- record_interaction handles empty response
- POST /interactions/{id}/reinforce runs against stored interaction
- POST /interactions/{id}/reinforce returns 404 for missing id
- POST /interactions accepts reinforce=false

Full suite: 135 passing (was 118).

Trust model unchanged:
- reinforcement only moves confidence within the existing active set
- the candidate lifecycle is declared but only Commit C will actually
  create candidate memories
- trusted project state is never touched by reinforcement

Next: Commit C adds the rule-based extractor that produces candidate
memories from captured interactions plus the promote/reject review
queue endpoints.
2026-04-06 21:18:38 -04:00
ea3fed3d44 feat(phase9-A): interaction capture loop foundation
Phase 9 Commit A from the agreed plan: turn AtoCore from a stateless
context enhancer into a system that records what it actually fed to an
LLM and what came back. This is the audit trail Reflection (Commit B)
and Extraction (Commit C) will be layered on top of.

The interactions table existed in the schema since the original PoC
but nothing wrote to it. This change makes it real:

Schema migration (additive only):
- response                full LLM response (caller decides how much)
- memories_used           JSON list of memory ids in the context pack
- chunks_used             JSON list of chunk ids in the context pack
- client                  identifier of the calling system
                          (openclaw, claude-code, manual, ...)
- session_id              groups multi-turn conversations
- project                 project name (mirrors the memory module pattern,
                          no FK so capture stays cheap)
- indexes on session_id, project, created_at

The created_at column is now written explicitly with a SQLite-compatible
'YYYY-MM-DD HH:MM:SS' format so the same string lives in the DB and the
returned dataclass. Without this the `since` filter on list_interactions
would silently fail because CURRENT_TIMESTAMP and isoformat use different
shapes that do not compare cleanly as strings.

New module src/atocore/interactions/:
- Interaction dataclass
- record_interaction()    persists one round-trip (prompt required;
                          everything else optional). Refuses empty prompts.
- list_interactions()     filters by project / session_id / client / since,
                          newest-first, hard-capped at 500
- get_interaction()       fetch by id, full response + context pack

API endpoints:
- POST   /interactions                capture one interaction
- GET    /interactions                list with summaries (no full response)
- GET    /interactions/{id}           full record incl. response + pack

Trust model:
- Capture is read-only with respect to memories, project state, and
  source chunks. Nothing here promotes anything into trusted state.
- The audit trail becomes the dataset Commit B (reinforcement) and
  Commit C (extraction + review queue) will operate on.

Tests (13 new, all green):
- service: persist + roundtrip every field
- service: minimum-fields path (prompt only)
- service: empty / whitespace prompt rejected
- service: get by id returns None for missing
- service: filter by project, session, client
- service: ordering newest-first with limit
- service: since filter inclusive on cutoff (the bug the timestamp
  fix above caught)
- service: limit=0 returns empty
- API: POST records and round-trips through GET /interactions/{id}
- API: empty prompt returns 400
- API: missing id returns 404
- API: list filter returns summaries (not full response bodies)

Full suite: 118 passing (was 105).

master-plan-status.md updated to move Phase 9 from "not started" to
"started" with the explicit note that Commit A is in and Commits B/C
remain.
2026-04-06 19:31:43 -04:00
c9757e313a Harden runtime and add backup foundation 2026-04-06 10:15:00 -04:00
6bfa1fcc37 Add Dalidou storage foundation and deployment prep 2026-04-05 18:33:52 -04:00
b0889b3925 Stabilize core correctness and sync project plan state 2026-04-05 17:53:23 -04:00
b48f0c95ab feat: Phase 2 Memory Core — structured memory with context integration
Memory Core implementation:
- Memory service with 6 types: identity, preference, project, episodic, knowledge, adaptation
- CRUD operations: create (with dedup), get (filtered), update, invalidate, supersede
- Confidence scoring (0.0-1.0) and lifecycle management (active/superseded/invalid)
- Memory API endpoints: POST/GET/PUT/DELETE /memory

Context builder integration (trust precedence per Master Plan):
  1. Trusted Project State (highest trust, 20% budget)
  2. Identity + Preference memories (10% budget)
  3. Retrieved chunks (remaining budget)

Also fixed database.py to use dynamic settings reference for test isolation.
45/45 tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:54:52 -04:00
b4afbbb53a feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00