Compare commits: codex/atoc...330ecfb6a6 (79 commits)

330ecfb6a6, d9dc55f841, 81307cec47, 59331e522d, b3253f35ee, 30ee857d62, 38f6e525af, 37331d53ef, 5aeeb1cad1, 4da81c9e4e, 7bf83bf46a, 1161645415, 5913da53c5, 8ea53f4003, 9366ba7879, c5bad996a7, 0b1742770a, 2829d5ec1c, 58c744fd2f, a34a7a995f, 92fc250b54, 2d911909f8, 1a8fdf4225, 336208004c, 03822389a1, be4099486c, 2c0b214137, b492f5f7b0, e877e5b8ff, fad30d5461, 261277fd51, 7e60f5a0e6, 1953e559f9, f521aab97b, fb6298a9a1, f2372eff9e, 78d4e979e5, d6ce6128cf, 368adf2ebc, a637017900, d0ff8b5738, b9da5b6d84, bd291ff874, 480f13a6df, 53147d326c, 2704997256, ac14f8d6a4, ceb129c7d1, 2e449a4c33, ea3fed3d44, c9b9eede25, 14ab7c8e9f, bdb42dba05, 46a5d5887a, 9943338846, 26bfa94c65, 4aa2b696a9, af01dd3e70, 8f74cab0e6, 06aa931273, c9757e313a, 9715fe3143, 1f1e6b5749, 827dcf2cd1, d8028f406e, 3b8d717bdf, 8293099025, 0f95415530, 82c7535d15, 8a94da4bf4, 5069d5b1b6, 440fc1d9ba, 6bfa1fcc37, b0889b3925, b48f0c95ab, 531c560db7, 6081462058, b4afbbb53a, 32ce409a7b
.claude/commands/atocore-context.md (new file, 159 lines)
@@ -0,0 +1,159 @@

---
description: Pull a context pack from the live AtoCore service for the current prompt
argument-hint: <prompt text> [project-id]
---

You are about to enrich a user prompt with context from the live AtoCore service. This is the daily-use entry point for AtoCore from inside Claude Code.

The work happens via the **shared AtoCore operator client** at `scripts/atocore_client.py`. That client is the canonical Python backbone for stable AtoCore operations and is meant to be reused by every LLM client (OpenClaw helper, future Codex skill, etc.) — see `docs/architecture/llm-client-integration.md` for the layering. This slash command is a thin Claude Code-specific frontend on top of it.
## Step 1 — parse the arguments

The user invoked `/atocore-context` with:

```
$ARGUMENTS
```

You need to figure out two things:

1. The **prompt text** — what AtoCore will retrieve context for
2. An **optional project hint** — used to scope retrieval to a specific project's trusted state and corpus

The user may have passed a project id or alias as the **last whitespace-separated token**. Don't maintain a hardcoded list of known aliases — let the shared client decide. Use this rule:

- Take the last token of `$ARGUMENTS`. Call it `MAYBE_HINT`.
- Run `python scripts/atocore_client.py detect-project "$MAYBE_HINT"` to ask the registry whether it's a known project id or alias. This call is cheap (it just hits `/projects` and does a regex match) and inherits the client's fail-open behavior.
- If the response has a non-null `matched_project`, the last token was an explicit project hint. `PROMPT_TEXT` is everything except the last token; `PROJECT_HINT` is the matched canonical project id.
- Otherwise the last token is just part of the prompt. `PROMPT_TEXT` is the full `$ARGUMENTS`; `PROJECT_HINT` is empty.

This delegates the alias knowledge to the registry instead of embedding a stale list in this markdown file. When you add a new project to the registry, the slash command picks it up automatically with no edits here.
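The last-token rule can be sketched in Python. This is an illustration only, not part of the command: `run_detect_project` is a hypothetical wrapper around `scripts/atocore_client.py detect-project`, and the `matched_project` envelope shape is taken from the description above.

```python
import json
import subprocess


def run_detect_project(token: str) -> dict:
    """Hypothetical wrapper: ask the registry whether `token` is a known
    project id or alias, via `atocore_client.py detect-project`."""
    out = subprocess.run(
        ["python", "scripts/atocore_client.py", "detect-project", token],
        capture_output=True, text=True, check=False,
    )
    try:
        return json.loads(out.stdout)
    except json.JSONDecodeError:
        return {"matched_project": None}  # fail open, like the client


def split_arguments(arguments: str, detect=run_detect_project) -> tuple[str, str]:
    """Return (PROMPT_TEXT, PROJECT_HINT) per the last-token rule above."""
    tokens = arguments.split()
    if not tokens:
        return arguments, ""
    matched = detect(tokens[-1]).get("matched_project")
    if matched:
        # Last token was an explicit hint: strip it from the prompt.
        return " ".join(tokens[:-1]), matched
    # Last token is just part of the prompt.
    return arguments, ""
```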
## Step 2 — call the shared client for the context pack

The server resolves project hints through the registry before looking up trusted state, so you can pass either the canonical id or any alias to `context-build` and the trusted state lookup will work either way. (Regression test: `tests/test_context_builder.py::test_alias_hint_resolves_through_registry`.)

**If `PROJECT_HINT` is non-empty**, call `context-build` directly with that hint:

```bash
python scripts/atocore_client.py context-build \
    "$PROMPT_TEXT" \
    "$PROJECT_HINT"
```

**If `PROJECT_HINT` is empty**, do the 2-step fallback dance so the user always gets a context pack regardless of whether the prompt implies a project:

```bash
# Try project auto-detection first.
RESULT=$(python scripts/atocore_client.py auto-context "$PROMPT_TEXT")

# If auto-context could not detect a project it returns a small
# {"status": "no_project_match", ...} envelope. In that case fall
# back to a corpus-wide context build with no project hint, which
# is the right behaviour for cross-project or generic prompts like
# "what changed in AtoCore backup policy this week?"
if echo "$RESULT" | grep -q '"no_project_match"'; then
    RESULT=$(python scripts/atocore_client.py context-build "$PROMPT_TEXT")
fi

echo "$RESULT"
```
This is the fix for the P2 finding from codex's review: previously the slash command sent every no-hint prompt through `auto-context` and returned `no_project_match` to the user with no context, even though the underlying client's `context-build` subcommand has always supported corpus-wide context builds.

In both branches the response is the JSON payload from `/context/build` (or, in the rare case where even the corpus-wide build fails, a `{"status": "unavailable"}` envelope from the client's fail-open layer).
## Step 3 — present the context pack to the user

The successful response contains at least:

- `formatted_context` — the assembled context block AtoCore would feed an LLM
- `chunks_used`, `total_chars`, `budget`, `budget_remaining`, `duration_ms`
- `chunks` — array of source documents that contributed, each with `source_file`, `heading_path`, `score`

Render in this order:

1. A one-line stats banner: `chunks=N, chars=X/budget, duration=Yms`
2. The `formatted_context` block verbatim inside a fenced text code block so the user can read what AtoCore would feed an LLM
3. The `chunks` array as a small bullet list with `source_file`, `heading_path`, and `score` per chunk

Two special cases:

- **`{"status": "unavailable"}`** (fail-open from the client) → Tell the user: "AtoCore is unreachable at `$ATOCORE_BASE_URL`. Check `python scripts/atocore_client.py health` for diagnostics."
- **Empty `chunks_used: 0` with no project state and no memories** → Tell the user: "AtoCore returned no context for this prompt — either the corpus does not have relevant information or the project hint is wrong. Try a different hint or a longer prompt."
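As an illustration only, the render order can be sketched in Python, assuming the response fields listed above; this helper is hypothetical and not part of the shared client:

```python
def render_context_pack(resp: dict) -> str:
    """Render a /context/build response in the order described above."""
    if resp.get("status") == "unavailable":
        return ("AtoCore is unreachable at $ATOCORE_BASE_URL. "
                "Check `python scripts/atocore_client.py health` for diagnostics.")

    fence = "`" * 3  # built dynamically so this example nests inside a fence
    lines = [
        # 1. one-line stats banner
        f"chunks={resp['chunks_used']}, "
        f"chars={resp['total_chars']}/{resp['budget']}, "
        f"duration={resp['duration_ms']}ms",
        # 2. the formatted context, verbatim, inside a text fence
        fence + "text",
        resp["formatted_context"],
        fence,
    ]
    # 3. contributing chunks as a small bullet list
    for c in resp.get("chunks", []):
        lines.append(
            f"- {c['source_file']} :: {c['heading_path']} (score={c['score']:.2f})"
        )
    return "\n".join(lines)
```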
## Step 4 — what about capturing the interaction

Capture (Phase 9 Commit A) and the rest of the reflection loop (reinforcement, extraction, review queue) are intentionally NOT exposed by the shared client yet. The contracts are stable but the workflow ergonomics are not, so the daily-use slash command stays focused on context retrieval until those review flows have been exercised in real use. See `docs/architecture/llm-client-integration.md` for the deferral rationale.

When capture is added to the shared client, this slash command will gain a follow-up `/atocore-record-response` companion command that posts the LLM's response back to the same interaction. That work is queued.

## Notes for the assistant

- DO NOT bypass the shared client by calling curl yourself. The client is the contract between AtoCore and every LLM frontend; if you find a missing capability, the right fix is to extend the client, not to work around it.
- DO NOT maintain a hardcoded list of project aliases in this file. Use `detect-project` to ask the registry — that's the whole point of having a registry.
- DO NOT silently change `ATOCORE_BASE_URL`. If the env var points at the wrong instance, surface the error so the user can fix it.
- DO NOT hide the formatted context pack from the user. Showing what AtoCore would feed an LLM is the whole point.
- The output goes into the user's working context as background; they may follow up with their actual question, and the AtoCore context pack acts as informal injected knowledge.
.dockerignore (new file, 9 lines)
@@ -0,0 +1,9 @@
.git
.pytest_cache
.coverage
.claude
data
__pycache__
*.pyc
tests
docs
.env.example (new file, 24 lines)
@@ -0,0 +1,24 @@
ATOCORE_ENV=development
ATOCORE_DEBUG=false
ATOCORE_LOG_LEVEL=INFO
ATOCORE_DATA_DIR=./data
ATOCORE_DB_DIR=
ATOCORE_CHROMA_DIR=
ATOCORE_CACHE_DIR=
ATOCORE_TMP_DIR=
ATOCORE_VAULT_SOURCE_DIR=./sources/vault
ATOCORE_DRIVE_SOURCE_DIR=./sources/drive
ATOCORE_SOURCE_VAULT_ENABLED=true
ATOCORE_SOURCE_DRIVE_ENABLED=true
ATOCORE_LOG_DIR=./logs
ATOCORE_BACKUP_DIR=./backups
ATOCORE_RUN_DIR=./run
ATOCORE_PROJECT_REGISTRY_DIR=./config
ATOCORE_PROJECT_REGISTRY_PATH=./config/project-registry.json
ATOCORE_HOST=127.0.0.1
ATOCORE_PORT=8100
ATOCORE_DB_BUSY_TIMEOUT_MS=5000
ATOCORE_EMBEDDING_MODEL=sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
ATOCORE_CHUNK_MAX_SIZE=800
ATOCORE_CHUNK_OVERLAP=100
ATOCORE_CONTEXT_BUDGET=3000
.gitignore (new file, 15 lines, vendored)
@@ -0,0 +1,15 @@
data/
__pycache__/
*.pyc
.env
*.egg-info/
dist/
build/
.pytest_cache/
htmlcov/
.coverage
venv/
.venv/
.claude/*
!.claude/commands/
!.claude/commands/**
AGENTS.md (new file, 45 lines)
@@ -0,0 +1,45 @@

# AGENTS.md

## Session protocol (read first, every session)

**Before doing anything else, read `DEV-LEDGER.md` at the repo root.** It is the one-file source of truth for "what is currently true" — live SHA, active plan, open review findings, recent decisions. The narrative docs under `docs/` may lag; the ledger does not.

**Before ending a session, append a Session Log line to `DEV-LEDGER.md`** with what you did and which commit range it covers, and bump the Orientation section if anything there changed.

This rule applies equally to Claude, Codex, and any future agent working in this repo.

## Project role

This repository is AtoCore, the runtime and machine-memory layer of the Ato ecosystem.

## Ecosystem definitions

- AtoCore = app/runtime/API/ingestion/retrieval/context builder/machine DB logic
- AtoMind = future intelligence layer for promotion, reflection, conflict handling, trust decisions
- AtoVault = human-readable memory source, intended for Obsidian
- AtoDrive = trusted operational project source, higher trust than general vault notes

## Storage principles

- Human-readable source layers and machine operational storage must remain separate
- AtoVault is not the live vector database location
- AtoDrive is not the live vector database location
- Machine operational storage includes SQLite, the vector store, indexes, embeddings, and runtime metadata
- The machine DB is derived operational state, not the primary human source of truth

## Deployment principles

- Dalidou is the canonical host for the AtoCore service and machine database
- OpenClaw on the T420 should consume AtoCore over API/network/Tailscale
- Do not design around Syncthing for the live SQLite/vector DB
- Prefer one canonical running service over multi-node live DB replication

## Coding guidance

- Keep path handling explicit and configurable via environment variables
- Do not hard-code machine-specific absolute paths
- Keep implementations small, testable, and reversible
- Preserve current working behavior unless a change is necessary
- Add or update tests when changing config, storage, or path logic

## Change policy

Before large refactors:

1. explain the architectural reason
2. propose the smallest safe batch
3. implement incrementally
4. summarize changed files and migration impact
CLAUDE.md (new file, 30 lines)
@@ -0,0 +1,30 @@

# CLAUDE.md — project instructions for AtoCore

## Session protocol

Before doing anything else in this repo, read `DEV-LEDGER.md` at the repo root. It is the shared operating memory between Claude, Codex, and the human operator — live Dalidou SHA, active plan, open P1/P2 review findings, recent decisions, and session log. The narrative docs under `docs/` sometimes lag; the ledger does not.

Before ending a session, append a Session Log line to `DEV-LEDGER.md` covering:

- which commits you produced (SHA range)
- what changed at a high level
- any harness / test count deltas
- anything you overclaimed and later corrected

Bump the **Orientation** section if `live_sha`, `main_tip`, `test_count`, or `harness` changed.

`AGENTS.md` at the repo root carries the broader project principles (storage separation, deployment model, coding guidance). Read it when you need the "why" behind a constraint.

## Deploy workflow

```bash
git push origin main && ssh papa@dalidou "bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh"
```

The deploy script self-verifies via the `/health` build_sha — if it exits non-zero, do not assume the change is live.

## Working model

- Claude builds; Codex audits. No parallel work on the same files.
- P1 review findings block further `main` commits until acknowledged in the ledger's **Open Review Findings** table.
- Codex branches must fork from `origin/main` (no orphan commits that require `--allow-unrelated-histories`).
DEV-LEDGER.md (new file, 172 lines)
@@ -0,0 +1,172 @@

# AtoCore Dev Ledger

> Shared operating memory between humans, Claude, and Codex.
> **Every session MUST read this file at start and append a Session Log entry before ending.**
> Section headers are stable - do not rename them. Trim Session Log and Recent Decisions to the last 20 entries at session end; older history lives in `git log` and `docs/`.

## Orientation

- **live_sha** (Dalidou `/health` build_sha): `38f6e52`
- **last_updated**: 2026-04-11 by Claude (Day 1+2 eval on working branch, Day 4 gate escalated)
- **main_tip**: `d9dc55f` (unchanged; Day 1+2 artifacts live on `claude/extractor-eval-loop @ 7d8d599`)
- **test_count**: 264 passing
- **harness**: `6/6 PASS` (`python scripts/retrieval_eval.py` against live Dalidou)
- **off_host_backup**: `papa@192.168.86.39:/home/papa/atocore-backups/` via cron env `ATOCORE_BACKUP_RSYNC`, verified

## Active Plan

**Mini-phase**: Extractor improvement (eval-driven) + retrieval harness expansion.
**Duration**: 8 days, hard gates at each day boundary.
**Plan author**: Codex (2026-04-11). **Executor**: Claude. **Audit**: Codex.

### Preflight (before Day 1)

Stop if any of these fail:

- `git rev-parse HEAD` on `main` matches the expected branching tip
- Live `/health` on Dalidou reports the SHA you think is deployed
- `python scripts/retrieval_eval.py --json` still passes at the current baseline
- `batch-extract` over the known 42-capture slice reproduces the current low-yield baseline
- A frozen sample set exists for extractor labeling so the target does not move mid-phase

Success: baseline eval output saved, baseline extract output saved, working branch created from `origin/main`.

### Day 1 - Labeled extractor eval set

Pick 30 real captures: 10 that should produce 0 candidates, 10 that should plausibly produce 1, 10 ambiguous/hard. Store as a stable artifact (interaction id, expected count, expected type, notes). Add a runner that scores extractor output against the labels.

Success: 30 labeled interactions in a stable artifact, one-command precision/recall output.
Fail-early: if labeling 30 takes more than a day because the concept is unclear, tighten the extraction target before touching code.

### Day 2 - Measure current extractor

Run the rule-based extractor on all 30. Record yield, TP, FP, FN. Bucket misses by class (conversational preference, decision summary, status/constraint, meta chatter).

Success: short scorecard with counts by miss type, top 2 miss classes obvious.
Fail-early: if the labeled set shows fewer than 5 plausible positives total, the corpus is too weak - relabel before tuning.

### Day 3 - Smallest rule expansion for top miss class

Add 1-2 narrow, explainable rules for the worst miss class. Add unit tests from real paraphrase examples in the labeled set. Then rerun the eval.

Success: recall up on the labeled set, false positives do not materially rise, new tests cover the new cue class.
Fail-early: if one rule expansion raises FP above ~20% of extracted candidates, revert or narrow before adding more.

### Day 4 - Decision gate: more rules or LLM-assisted prototype

If rule expansion reaches a **meaningfully reviewable queue**, keep going with rules. Otherwise prototype an LLM-assisted extraction mode behind a flag.

"Meaningfully reviewable queue":

- >= 15-25% candidate yield on the 30 labeled captures
- FP rate low enough that manual triage feels tolerable
- >= 2 real non-synthetic candidates worth review

Hard stop: if candidate yield is still under 10% after this point, stop rule tinkering and switch to architecture review (LLM-assisted OR narrower extraction scope).

### Day 5 - Stabilize and document

Add the remaining focused rules or the flagged LLM-assisted path. Write down in-scope and out-of-scope utterance kinds.

Success: labeled eval green against the target threshold, extractor scope explainable in <= 5 bullets.

### Day 6 - Retrieval harness expansion (6 -> 15-20 fixtures)

Grow across p04/p05/p06. Include short ambiguous prompts, cross-project collision cases, expected project-state wins, expected project-memory wins, and 1-2 "should fail open / low confidence" cases.

Success: >= 15 fixtures, each active project has easy + medium + hard cases.
Fail-early: if fixtures are mostly obvious wins, add harder adversarial cases before claiming coverage.

### Day 7 - Regression pass and calibration

Run the harness on current code vs live Dalidou. Inspect failures (ranking, ingestion gap, project bleed, budget). Make at most ONE ranking/budget tweak if the harness clearly justifies it. Do not mix harness expansion and ranking changes in a single commit unless tightly coupled.

Success: harness still passes or improves after extractor work; any ranking tweak is justified by a concrete fixture delta.
Fail-early: if > 20-25% of harness fixtures regress after extractor changes, separate concerns before merging.

### Day 8 - Merge and close

Clean commit sequence. Save before/after metrics (extractor scorecard, harness results). Update docs only with claims the metrics support.

Merge order: labeled corpus + runner -> extractor improvements + tests -> harness expansion -> any justified ranking tweak -> docs sync last.

Success: point to a before/after delta for both extraction and retrieval; docs do not overclaim.

### Hard Gates (stop/rethink points)

- Extractor yield < 10% after 30 labeled interactions -> stop, reconsider rule-only extraction
- FP rate > 20% on labeled set -> narrow rules before adding more
- Harness expansion finds < 3 genuinely hard cases -> harness still too soft
- Ranking change improves one project but regresses another -> do not merge without an explicit tradeoff note

### Branching

One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-harness-expansion` for Day 6-7. This keeps extraction and retrieval judgments auditable.

## Review Protocol

- Codex records review findings in **Open Review Findings**.
- Claude must read **Open Review Findings** at session start before coding.
- Codex owns finding text. Claude may update operational fields only:
  - `status`
  - `owner`
  - `resolved_by`
- If Claude disagrees with a finding, do not rewrite it. Mark it `declined` and explain why in the **Session Log**.
- Any commit or session that addresses a finding should reference the finding id in the commit message or **Session Log**.
- `P1` findings block further commits in the affected area until they are at least acknowledged and explicitly tracked.
- Findings may be code-level, claim-level, or ops-level. If the implementation boundary changes, retarget the finding instead of silently closing it.

## Open Review Findings

| id | finder | severity | file:line | summary | status | owner | opened_at | resolved_by |
|----|--------|----------|-----------|---------|--------|-------|-----------|-------------|
| R1 | Codex | P1 | deploy/hooks/capture_stop.py:76-85 | Live Claude capture still omits `extract`, so "loop closed both sides" remains overstated in practice even though the API supports it | acknowledged | Claude | 2026-04-11 | |
| R2 | Codex | P1 | src/atocore/context/builder.py | Project memories excluded from pack | fixed | Claude | 2026-04-11 | 8ea53f4 |
| R3 | Claude | P2 | src/atocore/memory/extractor.py | Rule cues (`## Decision:`) never fire on conversational LLM text | open | Claude | 2026-04-11 | |
| R4 | Codex | P2 | DEV-LEDGER.md:11 | Orientation `main_tip` was stale versus `HEAD` / `origin/main` | fixed | Codex | 2026-04-11 | 81307ce |

## Recent Decisions

- **2026-04-11** Adopt this ledger as shared operating memory between Claude and Codex. *Proposed by:* Antoine. *Ratified by:* Antoine.
- **2026-04-11** Accept Codex's 8-day mini-phase plan verbatim as the Active Plan. *Proposed by:* Codex. *Ratified by:* Antoine.
- **2026-04-11** Review findings live in `DEV-LEDGER.md` with Codex owning finding text and Claude updating status fields only. *Proposed by:* Codex. *Ratified by:* Antoine.
- **2026-04-11** Project memories land in the pack under `--- Project Memories ---` at a 25% budget ratio, gated on a canonical project hint. *Proposed by:* Claude.
- **2026-04-11** Extraction stays off the capture hot path. Batch / manual only. *Proposed by:* Antoine.
- **2026-04-11** 4-step roadmap: extractor -> harness expansion -> Wave 2 ingestion -> OpenClaw finish. Steps 1+2 as one mini-phase. *Ratified by:* Antoine.
- **2026-04-11** Codex branches must fork from `main`, not be orphan commits. *Proposed by:* Claude. *Agreed by:* Codex.

## Session Log

- **2026-04-11 Claude** `claude/extractor-eval-loop @ 7d8d599` — Day 1+2 of the mini-phase. Froze a 64-interaction snapshot (`scripts/eval_data/interactions_snapshot_2026-04-11.json`) and labeled 20 by length-stratified random sample (5 positive, 15 zero; 7 total expected candidates). Built `scripts/extractor_eval.py` as a file-based eval runner. **Day 2 baseline: the rule extractor hit 0% yield / 0% recall / 0% precision on the labeled set; 5 false negatives across 5 distinct miss classes (recommendation_prose, architectural_change_summary, spec_update_announcement, layered_recommendation, alignment_assertion).** This is the Day 4 hard-stop signal arriving two days early — a single rule expansion cannot close a 5-way miss, and widening rules blindly will collapse precision. The Day 4 decision gate is escalated to Antoine for ratification before Day 3 touches any extractor code. No extractor code on main has changed.
- **2026-04-11 Codex (ledger audit)** Fixed the stale `main_tip`, retargeted R1 from the API surface to the live Claude Stop hook, and formalized the review write protocol so Claude can consume findings without rewriting them.
- **2026-04-11 Claude** `b3253f3..59331e5` (1 commit). Wired the DEV-LEDGER, added the session protocol to AGENTS.md, created a project-local CLAUDE.md, deleted the stale `codex/port-atocore-ops-client` remote branch. No code changes, no redeploy needed.
- **2026-04-11 Claude** `c5bad99..b3253f3` (11 commits + 1 merge). Length-aware reinforcement, project memories in the pack, query-relevance memory ranking, hyphenated-identifier tokenizer, retrieval eval harness seeded, off-host backup wired end-to-end, docs synced, codex integration-pass branch merged. Harness went 0->6/6 on live Dalidou.
- **2026-04-11 Codex (async review)** Identified 2 P1s against a stale checkout. R1 was fair (extraction not automated); R2 was outdated (project memories had already landed on main). Delivered the 8-day execution plan now in Active Plan.
- **2026-04-06 Antoine** Created `codex/atocore-integration-pass` with the `t420-openclaw/` workspace (merged 2026-04-11).

## Working Rules

- Claude builds; Codex audits. No parallel work on the same files.
- Codex branches fork from `main`: `git fetch origin && git checkout -b codex/<topic> origin/main`.
- P1 findings block further main commits until acknowledged in Open Review Findings.
- Every session appends at least one Session Log line and bumps Orientation.
- Trim Session Log and Recent Decisions to the last 20 at session end.
- Docs in `docs/` may overclaim stale status; the ledger is the one-file source of truth for "what is true right now."

## Quick Commands

```bash
# Check live state
ssh papa@dalidou "curl -s http://localhost:8100/health"

# Run the retrieval harness
python scripts/retrieval_eval.py          # human-readable
python scripts/retrieval_eval.py --json   # machine-readable

# Deploy a new main tip
git push origin main && ssh papa@dalidou "bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh"

# Reflection-loop ops
python scripts/atocore_client.py batch-extract '' '' 200 false   # preview
python scripts/atocore_client.py batch-extract '' '' 200 true    # persist
python scripts/atocore_client.py triage
```
Dockerfile (new file, 21 lines)
@@ -0,0 +1,21 @@

FROM python:3.12-slim

ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1

WORKDIR /app

RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential curl git \
    && rm -rf /var/lib/apt/lists/*

COPY pyproject.toml README.md requirements.txt requirements-dev.txt ./
COPY config ./config
COPY src ./src

RUN pip install --no-cache-dir --upgrade pip \
    && pip install --no-cache-dir .

EXPOSE 8100

CMD ["python", "-m", "uvicorn", "atocore.main:app", "--host", "0.0.0.0", "--port", "8100"]
README.md (new file, 88 lines)
@@ -0,0 +1,88 @@

# AtoCore

A personal context engine that enriches LLM interactions with durable memory, structured context, and project knowledge.

## Quick Start

```bash
pip install -e .
uvicorn src.atocore.main:app --port 8100
```

## Usage

```bash
# Ingest markdown files
curl -X POST http://localhost:8100/ingest \
    -H "Content-Type: application/json" \
    -d '{"path": "/path/to/notes"}'

# Build enriched context for a prompt
curl -X POST http://localhost:8100/context/build \
    -H "Content-Type: application/json" \
    -d '{"prompt": "What is the project status?", "project": "myproject"}'

# CLI ingestion
python scripts/ingest_folder.py --path /path/to/notes

# Live operator client
python scripts/atocore_client.py health
python scripts/atocore_client.py audit-query "gigabit" 5
```

## API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| POST | /ingest | Ingest a markdown file or folder |
| POST | /query | Retrieve relevant chunks |
| POST | /context/build | Build a full context pack |
| GET | /health | Health check |
| GET | /debug/context | Inspect the last context pack |

## Architecture

```text
FastAPI (port 8100)
|- Ingestion: markdown -> parse -> chunk -> embed -> store
|- Retrieval: query -> embed -> vector search -> rank
|- Context Builder: retrieve -> boost -> budget -> format
|- SQLite (documents, chunks, memories, projects, interactions)
'- ChromaDB (vector embeddings)
```

## Configuration

Set via environment variables (prefix `ATOCORE_`):

| Variable | Default | Description |
|----------|---------|-------------|
| ATOCORE_DEBUG | false | Enable debug logging |
| ATOCORE_PORT | 8100 | Server port |
| ATOCORE_CHUNK_MAX_SIZE | 800 | Max chunk size (chars) |
| ATOCORE_CONTEXT_BUDGET | 3000 | Context pack budget (chars) |
| ATOCORE_EMBEDDING_MODEL | paraphrase-multilingual-MiniLM-L12-v2 | Embedding model |

## Testing

```bash
pip install -e ".[dev]"
pytest
```

## Operations

- `scripts/atocore_client.py` provides a live API client for project refresh, project-state inspection, and retrieval-quality audits.
- `docs/operations.md` captures the current operational priority order: retrieval quality, Wave 2 trusted-operational ingestion, AtoDrive scoping, and restore validation.

## Architecture Notes

Implementation-facing architecture notes live under `docs/architecture/`.

Current additions:

- `docs/architecture/engineering-knowledge-hybrid-architecture.md` — 5-layer hybrid model
- `docs/architecture/engineering-ontology-v1.md` — V1 object and relationship inventory
- `docs/architecture/engineering-query-catalog.md` — 20 v1-required queries
- `docs/architecture/memory-vs-entities.md` — canonical home split
- `docs/architecture/promotion-rules.md` — Layer 0 to Layer 2 pipeline
- `docs/architecture/conflict-model.md` — contradictory-facts detection and resolution
config/project-registry.example.json (Normal file, 21 lines)
@@ -0,0 +1,21 @@
{
  "projects": [
    {
      "id": "p07-example",
      "aliases": ["p07", "example-project"],
      "description": "Short description of the project and the staged source set.",
      "ingest_roots": [
        {
          "source": "vault",
          "subpath": "incoming/projects/p07-example",
          "label": "Primary staged project docs"
        },
        {
          "source": "drive",
          "subpath": "projects/p07-example",
          "label": "Trusted operational docs"
        }
      ]
    }
  ]
}
config/project-registry.json (Normal file, 52 lines)
@@ -0,0 +1,52 @@
{
  "projects": [
    {
      "id": "atocore",
      "aliases": ["ato core"],
      "description": "AtoCore platform docs and trusted project materials.",
      "ingest_roots": [
        {
          "source": "drive",
          "subpath": "atocore",
          "label": "AtoCore drive docs"
        }
      ]
    },
    {
      "id": "p04-gigabit",
      "aliases": ["p04", "gigabit", "gigaBIT"],
      "description": "Active P04 GigaBIT mirror project corpus from PKM plus staged operational docs.",
      "ingest_roots": [
        {
          "source": "vault",
          "subpath": "incoming/projects/p04-gigabit",
          "label": "P04 staged project docs"
        }
      ]
    },
    {
      "id": "p05-interferometer",
      "aliases": ["p05", "interferometer"],
      "description": "Active P05 interferometer corpus from PKM plus selected repo context and vendor documentation.",
      "ingest_roots": [
        {
          "source": "vault",
          "subpath": "incoming/projects/p05-interferometer",
          "label": "P05 staged project docs"
        }
      ]
    },
    {
      "id": "p06-polisher",
      "aliases": ["p06", "polisher"],
      "description": "Active P06 polisher corpus from PKM, software-suite notes, and selected repo context.",
      "ingest_roots": [
        {
          "source": "vault",
          "subpath": "incoming/projects/p06-polisher",
          "label": "P06 staged project docs"
        }
      ]
    }
  ]
}
deploy/dalidou/.env.example (Normal file, 19 lines)
@@ -0,0 +1,19 @@
ATOCORE_ENV=production
ATOCORE_DEBUG=false
ATOCORE_LOG_LEVEL=INFO
ATOCORE_HOST=0.0.0.0
ATOCORE_PORT=8100

ATOCORE_DATA_DIR=/srv/storage/atocore/data
ATOCORE_DB_DIR=/srv/storage/atocore/data/db
ATOCORE_CHROMA_DIR=/srv/storage/atocore/data/chroma
ATOCORE_CACHE_DIR=/srv/storage/atocore/data/cache
ATOCORE_TMP_DIR=/srv/storage/atocore/data/tmp
ATOCORE_LOG_DIR=/srv/storage/atocore/logs
ATOCORE_BACKUP_DIR=/srv/storage/atocore/backups
ATOCORE_RUN_DIR=/srv/storage/atocore/run

ATOCORE_VAULT_SOURCE_DIR=/srv/storage/atocore/sources/vault
ATOCORE_DRIVE_SOURCE_DIR=/srv/storage/atocore/sources/drive
ATOCORE_SOURCE_VAULT_ENABLED=true
ATOCORE_SOURCE_DRIVE_ENABLED=true
deploy/dalidou/cron-backup.sh (Executable file, 85 lines)
@@ -0,0 +1,85 @@
#!/usr/bin/env bash
#
# deploy/dalidou/cron-backup.sh
# ------------------------------
# Daily backup + retention cleanup via the AtoCore API.
#
# Intended to run from cron on Dalidou:
#
#   # Daily at 03:00 UTC
#   0 3 * * * /srv/storage/atocore/app/deploy/dalidou/cron-backup.sh >> /var/log/atocore-backup.log 2>&1
#
# What it does:
#   1. Creates a runtime backup (db + registry, no chroma by default)
#   2. Runs retention cleanup with --confirm to delete old snapshots
#   3. Logs results to stdout (captured by cron into the log file)
#
# Fail-open: exits 0 even on API errors so cron doesn't send noise
# emails. Check /var/log/atocore-backup.log for diagnostics.
#
# Environment variables:
#   ATOCORE_URL            default http://127.0.0.1:8100
#   ATOCORE_BACKUP_CHROMA  default false (set to "true" for cold chroma copy)
#   ATOCORE_BACKUP_DIR     default /srv/storage/atocore/backups
#   ATOCORE_BACKUP_RSYNC   optional rsync destination for off-host copies
#                          (e.g. papa@laptop:/home/papa/atocore-backups/)
#                          When set, the local snapshots tree is rsynced to
#                          the destination after cleanup. Unset = skip.
#                          SSH key auth must already be configured from this
#                          host to the destination.

set -euo pipefail

ATOCORE_URL="${ATOCORE_URL:-http://127.0.0.1:8100}"
INCLUDE_CHROMA="${ATOCORE_BACKUP_CHROMA:-false}"
BACKUP_DIR="${ATOCORE_BACKUP_DIR:-/srv/storage/atocore/backups}"
RSYNC_TARGET="${ATOCORE_BACKUP_RSYNC:-}"
TIMESTAMP="$(date -u +%Y-%m-%dT%H:%M:%SZ)"

log() { printf '[%s] %s\n' "$TIMESTAMP" "$*"; }

log "=== AtoCore daily backup starting ==="

# Step 1: Create backup
log "Step 1: creating backup (chroma=$INCLUDE_CHROMA)"
BACKUP_RESULT=$(curl -sf -X POST \
  -H "Content-Type: application/json" \
  -d "{\"include_chroma\": $INCLUDE_CHROMA}" \
  "$ATOCORE_URL/admin/backup" 2>&1) || {
  log "ERROR: backup creation failed: $BACKUP_RESULT"
  exit 0
}
log "Backup created: $BACKUP_RESULT"

# Step 2: Retention cleanup (confirm=true to actually delete)
log "Step 2: running retention cleanup"
CLEANUP_RESULT=$(curl -sf -X POST \
  -H "Content-Type: application/json" \
  -d '{"confirm": true}' \
  "$ATOCORE_URL/admin/backup/cleanup" 2>&1) || {
  log "ERROR: cleanup failed: $CLEANUP_RESULT"
  exit 0
}
log "Cleanup result: $CLEANUP_RESULT"

# Step 3: Off-host rsync (optional). Fail-open: log but don't abort
# the cron so a laptop being offline at 03:00 UTC never turns the
# local backup path red.
if [[ -n "$RSYNC_TARGET" ]]; then
  log "Step 3: rsyncing snapshots to $RSYNC_TARGET"
  if [[ ! -d "$BACKUP_DIR/snapshots" ]]; then
    log "WARN: $BACKUP_DIR/snapshots does not exist, skipping rsync"
  else
    RSYNC_OUTPUT=$(rsync -a --delete \
      -e "ssh -o ConnectTimeout=10 -o BatchMode=yes -o StrictHostKeyChecking=accept-new" \
      "$BACKUP_DIR/snapshots/" "$RSYNC_TARGET" 2>&1) && {
      log "Rsync complete"
    } || {
      log "WARN: rsync to $RSYNC_TARGET failed (offline or auth?): $RSYNC_OUTPUT"
    }
  fi
else
  log "Step 3: ATOCORE_BACKUP_RSYNC not set, skipping off-host copy"
fi

log "=== AtoCore daily backup complete ==="
deploy/dalidou/deploy.sh (Normal file, 349 lines)
@@ -0,0 +1,349 @@
#!/usr/bin/env bash
#
# deploy/dalidou/deploy.sh
# -------------------------
# One-shot deploy script for updating the running AtoCore container
# on Dalidou from the current Gitea main branch.
#
# The script is idempotent and safe to re-run. It handles both the
# first-time deploy (where /srv/storage/atocore/app may not yet be
# a git checkout) and the ongoing update case (where it is).
#
# Usage
# -----
#
#   # Normal update from main (most common)
#   bash deploy/dalidou/deploy.sh
#
#   # Deploy a specific branch or tag
#   ATOCORE_BRANCH=codex/some-feature bash deploy/dalidou/deploy.sh
#
#   # Dry-run: show what would happen without touching anything
#   ATOCORE_DEPLOY_DRY_RUN=1 bash deploy/dalidou/deploy.sh
#
# Environment variables
# ---------------------
#
#   ATOCORE_APP_DIR         default /srv/storage/atocore/app
#   ATOCORE_GIT_REMOTE      default http://127.0.0.1:3000/Antoine/ATOCore.git
#                           This is the local Dalidou gitea, reached
#                           via loopback. Override only when running
#                           the deploy from a remote host. The default
#                           is loopback (not the hostname "dalidou")
#                           because the hostname doesn't reliably
#                           resolve on the host itself — Dalidou
#                           Claude's first deploy had to work around
#                           exactly this.
#   ATOCORE_BRANCH          default main
#   ATOCORE_DEPLOY_DRY_RUN  if set to 1, report only, no mutations
#   ATOCORE_HEALTH_URL      default http://127.0.0.1:8100/health
#
# Safety rails
# ------------
#
# - If the app dir exists but is NOT a git repo, the script renames
#   it to <dir>.pre-git-<timestamp> before re-cloning, so you never
#   lose the pre-existing snapshot to a git clobber.
# - If the health check fails after restart, the script exits
#   non-zero and prints the container logs tail for diagnosis.
# - Dry-run mode is the default recommendation for the first deploy
#   on a new environment: it shows the planned git operations and
#   the compose command without actually running them.
#
# What this script does NOT do
# ----------------------------
#
# - Does not manage secrets / .env files. The caller is responsible
#   for placing deploy/dalidou/.env before running.
# - Does not run a backup before deploying. Run the backup endpoint
#   first if you want a pre-deploy snapshot.
# - Does not roll back on health-check failure. If deploy fails,
#   the previous container is already stopped; you need to redeploy
#   a known-good commit to recover.
# - Does not touch the database. The Phase 9 schema migrations in
#   src/atocore/models/database.py::_apply_migrations are idempotent
#   ALTER TABLE ADD COLUMN calls that run at service startup via the
#   lifespan handler. Stale pre-Phase-9 schema is upgraded in place.

set -euo pipefail

APP_DIR="${ATOCORE_APP_DIR:-/srv/storage/atocore/app}"
GIT_REMOTE="${ATOCORE_GIT_REMOTE:-http://127.0.0.1:3000/Antoine/ATOCore.git}"
BRANCH="${ATOCORE_BRANCH:-main}"
HEALTH_URL="${ATOCORE_HEALTH_URL:-http://127.0.0.1:8100/health}"
DRY_RUN="${ATOCORE_DEPLOY_DRY_RUN:-0}"
COMPOSE_DIR="$APP_DIR/deploy/dalidou"

log() { printf '==> %s\n' "$*"; }
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '    [dry-run] %s\n' "$*"
  else
    eval "$@"
  fi
}

log "AtoCore deploy starting"
log "  app dir:    $APP_DIR"
log "  git remote: $GIT_REMOTE"
log "  branch:     $BRANCH"
log "  health url: $HEALTH_URL"
log "  dry run:    $DRY_RUN"

# ---------------------------------------------------------------------
# Step 0: pre-flight permission check
# ---------------------------------------------------------------------
#
# If $APP_DIR exists but the current user cannot write to it (because
# a previous manual deploy left it root-owned, for example), the git
# fetch / reset in step 1 will fail with cryptic errors. Detect this
# up front and give the operator a clean remediation command instead
# of letting git produce half-state on partial failure. This was the
# exact workaround the 2026-04-08 Dalidou redeploy needed — pre-
# existing root ownership from the pre-phase9 manual schema fix.

if [ -d "$APP_DIR" ] && [ "$DRY_RUN" != "1" ]; then
  if [ ! -w "$APP_DIR" ] || [ ! -r "$APP_DIR/.git" ] 2>/dev/null; then
    log "WARNING: app dir exists but may not be writable by current user"
  fi
  current_owner="$(stat -c '%U:%G' "$APP_DIR" 2>/dev/null || echo unknown)"
  current_user="$(id -un 2>/dev/null || echo unknown)"
  current_uid_gid="$(id -u 2>/dev/null):$(id -g 2>/dev/null)"
  log "Step 0: permission check"
  log "  app dir owner: $current_owner"
  log "  current user:  $current_user ($current_uid_gid)"
  # Try to write a tiny marker file. If it fails, surface a clean
  # remediation message and exit before git produces confusing
  # half-state.
  marker="$APP_DIR/.deploy-permission-check"
  if ! ( : > "$marker" ) 2>/dev/null; then
    log "FATAL: cannot write to $APP_DIR as $current_user"
    log ""
    log "The app dir is owned by $current_owner and the current user"
    log "doesn't have write permission. This usually happens after a"
    log "manual workaround deploy that ran as root."
    log ""
    log "Remediation (pick the one that matches your setup):"
    log ""
    log "  # If you have passwordless sudo and gitea runs as UID 1000:"
    log "  sudo chown -R 1000:1000 $APP_DIR"
    log ""
    log "  # If you're running deploy.sh itself as root:"
    log "  sudo bash $0"
    log ""
    log "  # If neither works, do it via a throwaway container:"
    log "  docker run --rm -v $APP_DIR:/app alpine \\"
    log "    chown -R 1000:1000 /app"
    log ""
    log "Then re-run deploy.sh."
    exit 5
  fi
  rm -f "$marker" 2>/dev/null || true
fi

# ---------------------------------------------------------------------
# Step 1: make sure $APP_DIR is a proper git checkout of the branch
# ---------------------------------------------------------------------

if [ -d "$APP_DIR/.git" ]; then
  log "Step 1: app dir is already a git checkout; fetching latest"
  run "cd '$APP_DIR' && git fetch origin '$BRANCH'"
  run "cd '$APP_DIR' && git reset --hard 'origin/$BRANCH'"
else
  log "Step 1: app dir is NOT a git checkout; converting"
  if [ -d "$APP_DIR" ]; then
    BACKUP="${APP_DIR}.pre-git-$(date -u +%Y%m%dT%H%M%SZ)"
    log "  backing up existing snapshot to $BACKUP"
    run "mv '$APP_DIR' '$BACKUP'"
  fi
  log "  cloning $GIT_REMOTE -> $APP_DIR (branch: $BRANCH)"
  run "git clone --branch '$BRANCH' '$GIT_REMOTE' '$APP_DIR'"
fi

# ---------------------------------------------------------------------
# Step 1.5: self-update re-exec guard
# ---------------------------------------------------------------------
#
# When deploy.sh itself changes in the commit we just pulled, the bash
# process running this script is still executing the OLD deploy.sh
# from memory — git reset --hard updated the file on disk but our
# in-memory instructions are stale. That's exactly how the first
# 2026-04-09 Dalidou deploy silently wrote "unknown" build_sha: old
# Step 2 logic ran against fresh source. Detect the mismatch and
# re-exec into the fresh copy so every post-update run exercises the
# new script.
#
# Guard rails:
# - Only runs when $APP_DIR exists, holds a git checkout, and a
#   deploy.sh exists there (i.e. after Step 1 succeeded).
# - Uses a sentinel env var ATOCORE_DEPLOY_REEXECED=1 to make sure
#   we only re-exec once, never recurse.
# - Skipped in dry-run mode (no mutation).
# - Skipped if $0 isn't a readable file (bash -c pipe inputs, etc.).

if [ "$DRY_RUN" != "1" ] \
  && [ -z "${ATOCORE_DEPLOY_REEXECED:-}" ] \
  && [ -r "$0" ] \
  && [ -f "$APP_DIR/deploy/dalidou/deploy.sh" ]; then
  ON_DISK_HASH="$(sha1sum "$APP_DIR/deploy/dalidou/deploy.sh" 2>/dev/null | awk '{print $1}')"
  RUNNING_HASH="$(sha1sum "$0" 2>/dev/null | awk '{print $1}')"
  if [ -n "$ON_DISK_HASH" ] \
    && [ -n "$RUNNING_HASH" ] \
    && [ "$ON_DISK_HASH" != "$RUNNING_HASH" ]; then
    log "Step 1.5: deploy.sh changed in the pulled commit; re-exec'ing"
    log "  running script hash: $RUNNING_HASH"
    log "  on-disk script hash: $ON_DISK_HASH"
    log "  re-exec -> $APP_DIR/deploy/dalidou/deploy.sh"
    export ATOCORE_DEPLOY_REEXECED=1
    exec bash "$APP_DIR/deploy/dalidou/deploy.sh" "$@"
  fi
fi

# ---------------------------------------------------------------------
# Step 2: capture build provenance to pass to the container
# ---------------------------------------------------------------------
#
# We compute the full SHA, the short SHA, the UTC build timestamp,
# and the source branch. These get exported as env vars before
# `docker compose up -d --build` so the running container can read
# them at startup and report them via /health. The post-deploy
# verification step (Step 6) reads /health and compares the
# reported SHA against this value to detect any silent drift.

log "Step 2: capturing build provenance"
if [ "$DRY_RUN" != "1" ] && [ -d "$APP_DIR/.git" ]; then
  DEPLOYING_SHA_FULL="$(cd "$APP_DIR" && git rev-parse HEAD)"
  DEPLOYING_SHA="$(echo "$DEPLOYING_SHA_FULL" | cut -c1-7)"
  DEPLOYING_TIME="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  DEPLOYING_BRANCH="$BRANCH"
  log "  commit:   $DEPLOYING_SHA ($DEPLOYING_SHA_FULL)"
  log "  built at: $DEPLOYING_TIME"
  log "  branch:   $DEPLOYING_BRANCH"
  ( cd "$APP_DIR" && git log --oneline -1 ) | sed 's/^/    /'
  export ATOCORE_BUILD_SHA="$DEPLOYING_SHA_FULL"
  export ATOCORE_BUILD_TIME="$DEPLOYING_TIME"
  export ATOCORE_BUILD_BRANCH="$DEPLOYING_BRANCH"
else
  log "  [dry-run] would read git log from $APP_DIR"
  DEPLOYING_SHA="dry-run"
  DEPLOYING_SHA_FULL="dry-run"
fi

# ---------------------------------------------------------------------
# Step 3: preserve the .env file (it's not in git)
# ---------------------------------------------------------------------

ENV_FILE="$COMPOSE_DIR/.env"
if [ "$DRY_RUN" != "1" ] && [ ! -f "$ENV_FILE" ]; then
  log "Step 3: WARNING — $ENV_FILE does not exist"
  log "  the compose workflow needs this file to map mount points"
  log "  copy deploy/dalidou/.env.example to $ENV_FILE and edit it"
  log "  before re-running this script"
  exit 2
fi

# ---------------------------------------------------------------------
# Step 4: rebuild and restart the container
# ---------------------------------------------------------------------

log "Step 4: rebuilding and restarting the atocore container"
run "cd '$COMPOSE_DIR' && docker compose up -d --build"

if [ "$DRY_RUN" = "1" ]; then
  log "dry-run complete — no mutations performed"
  exit 0
fi

# ---------------------------------------------------------------------
# Step 5: wait for the service to come up and pass the health check
# ---------------------------------------------------------------------

log "Step 5: waiting for /health to respond"
for i in 1 2 3 4 5 6 7 8 9 10; do
  if curl -fsS "$HEALTH_URL" > /tmp/atocore-health.json 2>/dev/null; then
    log "  service is responding"
    break
  fi
  log "  not ready yet ($i/10); waiting 3s"
  sleep 3
done

if ! curl -fsS "$HEALTH_URL" > /tmp/atocore-health.json 2>/dev/null; then
  log "FATAL: service did not come up within 30 seconds"
  log "  container logs (last 50 lines):"
  cd "$COMPOSE_DIR" && docker compose logs --tail=50 atocore || true
  exit 3
fi

# ---------------------------------------------------------------------
# Step 6: verify the deployed build matches what we just shipped
# ---------------------------------------------------------------------
#
# Two layers of comparison:
#
# - code_version: matches src/atocore/__init__.py::__version__.
#   Coarse: any commit between version bumps reports the same value.
# - build_sha: full git SHA the container was built from. Set as
#   an env var by Step 2 above and read by /health from
#   ATOCORE_BUILD_SHA. This is the precise drift signal — if the
#   live build_sha doesn't match $DEPLOYING_SHA_FULL, the build
#   didn't pick up the new source.

log "Step 6: verifying deployed build"
log "  /health response:"
if command -v jq >/dev/null 2>&1; then
  jq . < /tmp/atocore-health.json | sed 's/^/    /'
  REPORTED_VERSION="$(jq -r '.code_version // .version' < /tmp/atocore-health.json)"
  REPORTED_SHA="$(jq -r '.build_sha // "unknown"' < /tmp/atocore-health.json)"
  REPORTED_BUILD_TIME="$(jq -r '.build_time // "unknown"' < /tmp/atocore-health.json)"
else
  sed 's/^/    /' /tmp/atocore-health.json
  echo
  REPORTED_VERSION="$(grep -o '"code_version":"[^"]*"' /tmp/atocore-health.json | head -1 | cut -d'"' -f4)"
  if [ -z "$REPORTED_VERSION" ]; then
    REPORTED_VERSION="$(grep -o '"version":"[^"]*"' /tmp/atocore-health.json | head -1 | cut -d'"' -f4)"
  fi
  REPORTED_SHA="$(grep -o '"build_sha":"[^"]*"' /tmp/atocore-health.json | head -1 | cut -d'"' -f4)"
  REPORTED_SHA="${REPORTED_SHA:-unknown}"
  REPORTED_BUILD_TIME="$(grep -o '"build_time":"[^"]*"' /tmp/atocore-health.json | head -1 | cut -d'"' -f4)"
  REPORTED_BUILD_TIME="${REPORTED_BUILD_TIME:-unknown}"
fi

EXPECTED_VERSION="$(grep -oE "__version__ = \"[^\"]+\"" "$APP_DIR/src/atocore/__init__.py" | head -1 | cut -d'"' -f2)"

log "  Layer 1 — coarse version:"
log "    expected code_version: $EXPECTED_VERSION (from src/atocore/__init__.py)"
log "    reported code_version: $REPORTED_VERSION (from live /health)"

if [ "$REPORTED_VERSION" != "$EXPECTED_VERSION" ]; then
  log "FATAL: code_version mismatch"
  log "  the container may not have picked up the new image"
  log "  try: docker compose down && docker compose up -d --build"
  exit 4
fi

log "  Layer 2 — precise build SHA:"
log "    expected build_sha: $DEPLOYING_SHA_FULL (from this deploy.sh run)"
log "    reported build_sha: $REPORTED_SHA (from live /health)"
log "    reported build_time: $REPORTED_BUILD_TIME"

if [ "$REPORTED_SHA" != "$DEPLOYING_SHA_FULL" ]; then
  log "FATAL: build_sha mismatch"
  log "  the live container is reporting a different commit than"
  log "  the one this deploy.sh run just shipped. Possible causes:"
  log "  - the container is using a cached image instead of the"
  log "    freshly-built one (try: docker compose build --no-cache)"
  log "  - the env vars didn't propagate (check that"
  log "    deploy/dalidou/docker-compose.yml has the environment"
  log "    section with ATOCORE_BUILD_SHA)"
  log "  - another process restarted the container between the"
  log "    build and the health check"
  exit 6
fi

log "Deploy complete."
log "  commit:       $DEPLOYING_SHA ($DEPLOYING_SHA_FULL)"
log "  code_version: $REPORTED_VERSION"
log "  build_sha:    $REPORTED_SHA"
log "  build_time:   $REPORTED_BUILD_TIME"
log "  health:       ok"
deploy/dalidou/docker-compose.yml (Normal file, 37 lines)
@@ -0,0 +1,37 @@
services:
  atocore:
    build:
      context: ../../
      dockerfile: Dockerfile
    container_name: atocore
    restart: unless-stopped
    ports:
      - "${ATOCORE_PORT:-8100}:8100"
    env_file:
      - .env
    environment:
      # Build provenance — set by deploy/dalidou/deploy.sh on each
      # rebuild so /health can report exactly which commit is live.
      # Defaults to 'unknown' for direct `docker compose up` runs that
      # bypass deploy.sh; in that case the operator should run
      # deploy.sh instead so the deployed SHA is recorded.
      ATOCORE_BUILD_SHA: "${ATOCORE_BUILD_SHA:-unknown}"
      ATOCORE_BUILD_TIME: "${ATOCORE_BUILD_TIME:-unknown}"
      ATOCORE_BUILD_BRANCH: "${ATOCORE_BUILD_BRANCH:-unknown}"
    volumes:
      - ${ATOCORE_DB_DIR}:${ATOCORE_DB_DIR}
      - ${ATOCORE_CHROMA_DIR}:${ATOCORE_CHROMA_DIR}
      - ${ATOCORE_CACHE_DIR}:${ATOCORE_CACHE_DIR}
      - ${ATOCORE_TMP_DIR}:${ATOCORE_TMP_DIR}
      - ${ATOCORE_LOG_DIR}:${ATOCORE_LOG_DIR}
      - ${ATOCORE_BACKUP_DIR}:${ATOCORE_BACKUP_DIR}
      - ${ATOCORE_RUN_DIR}:${ATOCORE_RUN_DIR}
      - ${ATOCORE_PROJECT_REGISTRY_DIR}:${ATOCORE_PROJECT_REGISTRY_DIR}
      - ${ATOCORE_VAULT_SOURCE_DIR}:${ATOCORE_VAULT_SOURCE_DIR}:ro
      - ${ATOCORE_DRIVE_SOURCE_DIR}:${ATOCORE_DRIVE_SOURCE_DIR}:ro
    healthcheck:
      test: ["CMD", "curl", "-fsS", "http://127.0.0.1:8100/health"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 20s
deploy/hooks/capture_stop.py (Normal file, 188 lines)
@@ -0,0 +1,188 @@
#!/usr/bin/env python3
"""Claude Code Stop hook: capture interaction to AtoCore.

Reads the Stop hook JSON from stdin, extracts the last user prompt
from the transcript JSONL, and POSTs to the AtoCore /interactions
endpoint with reinforcement enabled (no extraction).

Fail-open: always exits 0, logs errors to stderr only.

Environment variables:
    ATOCORE_URL               Base URL of the AtoCore instance (default: http://dalidou:8100)
    ATOCORE_CAPTURE_DISABLED  Set to "1" to disable capture (kill switch)

Usage in ~/.claude/settings.json:
    "Stop": [{
        "matcher": "",
        "hooks": [{
            "type": "command",
            "command": "python /path/to/capture_stop.py",
            "timeout": 15
        }]
    }]
"""

from __future__ import annotations

import json
import os
import sys
import urllib.error
import urllib.request

ATOCORE_URL = os.environ.get("ATOCORE_URL", "http://dalidou:8100")
TIMEOUT_SECONDS = 10

# Minimum prompt length to bother capturing. Single-word acks,
# slash commands, and empty lines aren't useful interactions.
MIN_PROMPT_LENGTH = 15

# Maximum response length to capture. Truncate very long assistant
# responses to keep the interactions table manageable.
MAX_RESPONSE_LENGTH = 50_000


def main() -> None:
    """Entry point. Always exits 0."""
    try:
        _capture()
    except Exception as exc:
        print(f"capture_stop: {exc}", file=sys.stderr)


def _capture() -> None:
    if os.environ.get("ATOCORE_CAPTURE_DISABLED") == "1":
        return

    raw = sys.stdin.read()
    if not raw.strip():
        return

    hook_data = json.loads(raw)

    session_id = hook_data.get("session_id", "")
    assistant_message = hook_data.get("last_assistant_message", "")
    transcript_path = hook_data.get("transcript_path", "")
    cwd = hook_data.get("cwd", "")

    prompt = _extract_last_user_prompt(transcript_path)
    if not prompt or len(prompt.strip()) < MIN_PROMPT_LENGTH:
        return

    response = assistant_message or ""
    if len(response) > MAX_RESPONSE_LENGTH:
        response = response[:MAX_RESPONSE_LENGTH] + "\n\n[truncated]"

    project = _infer_project(cwd)

    payload = {
        "prompt": prompt,
        "response": response,
        "client": "claude-code",
        "session_id": session_id,
        "project": project,
        "reinforce": True,
    }

    body = json.dumps(payload, ensure_ascii=True).encode("utf-8")
    req = urllib.request.Request(
        f"{ATOCORE_URL}/interactions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Context manager ensures the connection is closed promptly.
    with urllib.request.urlopen(req, timeout=TIMEOUT_SECONDS) as resp:
        result = json.loads(resp.read().decode("utf-8"))
    print(
        f"capture_stop: recorded interaction {result.get('id', '?')} "
        f"(project={project or 'none'}, prompt_chars={len(prompt)}, "
        f"response_chars={len(response)})",
        file=sys.stderr,
    )


def _extract_last_user_prompt(transcript_path: str) -> str:
    """Read the JSONL transcript and return the last real user prompt.

    Skips meta messages (isMeta=True) and system/command messages
    (content starting with '<').
    """
    if not transcript_path:
        return ""

    # Normalize path for the current OS
    path = os.path.normpath(transcript_path)
    if not os.path.isfile(path):
        return ""

    last_prompt = ""
    try:
        with open(path, encoding="utf-8", errors="replace") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                try:
                    entry = json.loads(line)
                except json.JSONDecodeError:
                    continue

                if entry.get("type") != "user":
                    continue
                if entry.get("isMeta", False):
                    continue

                msg = entry.get("message", {})
                if not isinstance(msg, dict):
                    continue

                content = msg.get("content", "")

                if isinstance(content, str):
                    text = content.strip()
                elif isinstance(content, list):
                    # Content blocks: extract text blocks
                    parts = []
                    for block in content:
                        if isinstance(block, str):
                            parts.append(block)
                        elif isinstance(block, dict) and block.get("type") == "text":
                            parts.append(block.get("text", ""))
                    text = "\n".join(parts).strip()
                else:
                    continue

                # Skip system/command XML and very short messages
                if text.startswith("<") or len(text) < MIN_PROMPT_LENGTH:
                    continue

                last_prompt = text
    except OSError:
        pass

    return last_prompt


# Project inference from working directory.
# Maps known repo paths to AtoCore project IDs. The user can extend
# this table or replace it with a registry lookup later.
_PROJECT_PATH_MAP: dict[str, str] = {
    # Add mappings as needed, e.g.:
    # "C:\\Users\\antoi\\gigabit": "p04-gigabit",
    # "C:\\Users\\antoi\\interferometer": "p05-interferometer",
}


def _infer_project(cwd: str) -> str:
    """Try to map the working directory to an AtoCore project."""
    if not cwd:
        return ""
    norm = os.path.normpath(cwd).lower()
    for path_prefix, project_id in _PROJECT_PATH_MAP.items():
        if norm.startswith(os.path.normpath(path_prefix).lower()):
            return project_id
    return ""


if __name__ == "__main__":
    main()
docs/architecture/conflict-model.md (Normal file, 332 lines)
@@ -0,0 +1,332 @@
# Conflict Model (how AtoCore handles contradictory facts)

## Why this document exists

Any system that accumulates facts from multiple sources — interactions,
ingested documents, repo history, PKM notes — will eventually see
contradictory facts about the same thing. AtoCore's operating model
already has the hard rule:

> **Bad memory is worse than no memory.**

The practical consequence of that rule is: AtoCore must never
silently merge contradictory facts, never silently pick a winner,
and never silently discard evidence. Every conflict must be
surfaced to a human reviewer with full audit context.

This document defines what "conflict" means in AtoCore, how
conflicts are detected, how they are represented, how they are
surfaced, and how they are resolved.

## What counts as a conflict

A conflict exists when two or more facts in the system claim
incompatible values for the same conceptual slot. More precisely:

A conflict is a set of two or more **active** rows (across memories,
entities, project_state) such that:

1. They share the same **target identity** — same entity type and
   same semantic key
2. Their **claimed values** are incompatible
3. They are all in an **active** status (not superseded, not
   invalid, not candidate)
|
||||
|
||||
- Two active `Decision` entities affecting the same `Subsystem`
|
||||
with contradictory values for the same decided field (e.g.
|
||||
lateral support material = GF-PTFE vs lateral support material = PEEK)
|
||||
- An active `preference` memory "prefers rebase workflow" and an
|
||||
active `preference` memory "prefers merge-commit workflow"
|
||||
- A `project_state` entry `p05 / decision / lateral_support_material = GF-PTFE`
|
||||
and an active `Decision` entity also claiming the lateral support
|
||||
material is PEEK (cross-layer conflict)
|
||||
|
||||
Examples that are NOT conflicts:
|
||||
|
||||
- Two active memories both saying "prefers small diffs" — same
|
||||
meaning, not contradictory
|
||||
- An active memory saying X and a candidate memory saying Y —
|
||||
candidates are not active, so this is part of the review queue
|
||||
flow, not the conflict flow
|
||||
- A superseded `Decision` saying X and an active `Decision` saying Y
|
||||
— supersession is a resolved history, not a conflict
|
||||
- Two active `Requirement` entities each constraining the same
|
||||
component in different but compatible ways (e.g. one caps mass,
|
||||
one caps heat flux) — different fields, no contradiction
|
||||
|
||||
## Detection triggers

Conflict detection must fire at every write that could create a new
active fact. That means the following hook points:

1. **`POST /memory` creating an active memory** (legacy path)
2. **`POST /memory/{id}/promote`** (candidate → active)
3. **`POST /entities` creating an active entity** (future)
4. **`POST /entities/{id}/promote`** (candidate → active, future)
5. **`POST /project/state`** (curating trusted state directly)
6. **`POST /memory/{id}/graduate`** (memory → entity graduation,
   future — the resulting entity could conflict with something)

Extraction passes do NOT trigger conflict detection at candidate
write time. Candidates are allowed to sit in the queue in an
apparently-conflicting state; the reviewer will see them during
promotion, and conflict detection fires at that moment.
## Detection strategy per layer

### Memory layer

For identity / preference / episodic memories (the ones that stay
in the memory layer):

- Matching key: `(memory_type, project, normalized_content_family)`
- `normalized_content_family` is not a hash of the content — that
  would require exact equality — but a slot identifier extracted
  by a small per-type rule set:
  - identity: slot is "role" / "background" / "credentials"
  - preference: slot is the first content word after "prefers" / "uses" / "likes",
    normalized to a lowercase noun stem, OR the rule id that extracted it
  - episodic: no slot — episodic entries are intrinsically tied to
    a moment in time and rarely conflict

A conflict is flagged when two active memories share a
`(memory_type, project, slot)` but have different content bodies.
### Entity layer (V1)

For each V1 entity type, the conflict key is a short tuple that
uniquely identifies the "slot" that entity is claiming:

| Entity type     | Conflict slot                                      |
|-----------------|----------------------------------------------------|
| Project         | `(project_id)`                                     |
| Subsystem       | `(project_id, subsystem_name)`                     |
| Component       | `(project_id, subsystem_name, component_name)`     |
| Requirement     | `(project_id, requirement_key)`                    |
| Constraint      | `(project_id, constraint_target, constraint_kind)` |
| Decision        | `(project_id, decision_target, decision_field)`    |
| Material        | `(project_id, component_id)`                       |
| Parameter       | `(project_id, parameter_scope, parameter_name)`    |
| AnalysisModel   | `(project_id, subsystem_id, model_name)`           |
| Result          | `(project_id, analysis_model_id, result_key)`      |
| ValidationClaim | `(project_id, claim_key)`                          |
| Artifact        | no conflict detection — artifacts are additive     |

A conflict is two active entities with the same slot but
different structural values. The exact "which fields count as
structural" list is per-type and lives in the entity schema doc
(not yet written — tracked as future `engineering-ontology-v1.md`
updates).
### Cross-layer (memory vs entity vs trusted project state)

Trusted project state trumps active entities, which in turn trump
active memories. This is the trust hierarchy from the operating model.

Cross-layer conflict detection works by a nightly job that walks
the three layers and flags any slot that has entries in more than
one layer with incompatible values:

- If trusted project state and an entity disagree: the entity is
  flagged; trusted state is assumed correct
- If an entity and a memory disagree: the memory is flagged; the
  entity is assumed correct
- If trusted state and a memory disagree: the memory is flagged;
  trusted state is assumed correct

In all three cases the lower-trust row gets a `conflicts_with`
reference pointing at the higher-trust row but does NOT auto-move
to superseded. The flag is an alert, not an action.
## Representation

Conflicts are represented as rows in a new `conflicts` table
(V1 schema, not yet shipped):

```sql
CREATE TABLE conflicts (
    id          TEXT PRIMARY KEY,
    detected_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    slot_kind   TEXT NOT NULL,      -- "memory_slot" or "entity_slot" or "cross_layer"
    slot_key    TEXT NOT NULL,      -- JSON-encoded tuple identifying the slot
    project     TEXT DEFAULT '',
    status      TEXT NOT NULL DEFAULT 'open',  -- open | resolved | dismissed
    resolved_at DATETIME,
    resolution  TEXT DEFAULT '',    -- free text from the reviewer
    -- links to conflicting rows live in conflict_members
    UNIQUE(slot_kind, slot_key, status)  -- ensures only one open conflict per slot
);

CREATE TABLE conflict_members (
    conflict_id        TEXT NOT NULL REFERENCES conflicts(id) ON DELETE CASCADE,
    member_kind        TEXT NOT NULL,     -- "memory" | "entity" | "project_state"
    member_id          TEXT NOT NULL,
    member_layer_trust INTEGER NOT NULL,  -- 1=memory, 2=entity, 3=project_state
    PRIMARY KEY (conflict_id, member_kind, member_id)
);
```

Constraint rationale:

- `UNIQUE(slot_kind, slot_key, status)` where status='open' prevents
  duplicate "conflict already open for this slot" rows. At most one
  open conflict exists per slot at a time; new conflicting rows are
  added as members to the existing conflict, not as a new conflict.
- `conflict_members.member_layer_trust` is denormalized so the
  conflict resolution UI can sort conflicting rows by trust tier
  without re-querying.
- `status='dismissed'` exists separately from `resolved` because
  "the reviewer looked at this and declared it not a real conflict"
  is a valid distinct outcome (the two rows really do describe
  different things and the detector was overfitting).
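One possible refinement, sketched here against SQLite: a plain `UNIQUE(slot_kind, slot_key, status)` also limits a slot to a single *resolved* row, so a slot that conflicts twice over its lifetime could not keep both resolution records. A partial unique index scoped to open rows enforces exactly the "one open conflict per slot" rule while leaving history unconstrained. This is an alternative sketch, not the schema above:

```python
import sqlite3

# In-memory demo of "at most one open conflict per slot" via a partial
# unique index (supported by SQLite 3.8+). Schema trimmed to the
# columns relevant to the constraint.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE conflicts (
    id        TEXT PRIMARY KEY,
    slot_kind TEXT NOT NULL,
    slot_key  TEXT NOT NULL,
    status    TEXT NOT NULL DEFAULT 'open'
);
CREATE UNIQUE INDEX one_open_per_slot
    ON conflicts(slot_kind, slot_key) WHERE status = 'open';
""")

con.execute("INSERT INTO conflicts VALUES ('c1', 'entity_slot', 'k', 'open')")
try:
    # Second open conflict on the same slot is rejected by the index.
    con.execute("INSERT INTO conflicts VALUES ('c2', 'entity_slot', 'k', 'open')")
    duplicate_open_allowed = True
except sqlite3.IntegrityError:
    duplicate_open_allowed = False

# Resolving c1 frees the slot for a future open conflict.
con.execute("UPDATE conflicts SET status = 'resolved' WHERE id = 'c1'")
con.execute("INSERT INTO conflicts VALUES ('c3', 'entity_slot', 'k', 'open')")
```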
## API shape

```
GET  /conflicts                             list open conflicts
GET  /conflicts?status=resolved             list resolved conflicts
GET  /conflicts?project=p05-interferometer  scope by project
GET  /conflicts/{id}                        full detail including all members

POST /conflicts/{id}/resolve                mark resolved with notes
     body: {
       "resolution_notes": "...",
       "winner_member_id": "...",    # optional: if specified,
                                     # other members are auto-superseded
       "action": "supersede_others"  # or "no_action" if reviewer
                                     # wants to resolve without touching rows
     }

POST /conflicts/{id}/dismiss                mark dismissed ("not a real conflict")
     body: {
       "reason": "..."
     }
```

Conflict detection must also surface in existing endpoints:

- `GET /memory/{id}` — response includes a `conflicts` array if
  the memory is a member of any open conflict
- `GET /entities/{type}/{id}` (future) — same
- `GET /health` — includes `open_conflicts_count` so the operator
  sees at a glance that review is pending
## Supersession as a conflict resolution tool

When the reviewer resolves a conflict with `action: "supersede_others"`,
the winner stays active and every other member is flipped to
status="superseded" with a `superseded_by` pointer to the winner.
This is the normal path: "we used to think X, now we know Y, flag
X as superseded so the audit trail keeps X visible but X no longer
influences context".

The conflict resolution audit record links back to all superseded
members, so the conflict history itself is queryable:

- "Show me every conflict that touched Subsystem X"
- "Show me every Decision that superseded another Decision because
  of a conflict"

These are entries in the V1 query catalog (see Q-014 decision history).
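The supersede-others path can be sketched as a pure state transition. Field names (`status`, `superseded_by`) follow the doc; the row shape is an assumption:

```python
def supersede_others(members: list[dict], winner_id: str) -> list[dict]:
    """Flip every non-winner member to superseded, pointing at the winner."""
    resolved = []
    for m in members:
        m = dict(m)  # don't mutate the caller's rows
        if m["id"] == winner_id:
            m["status"] = "active"
        else:
            m["status"] = "superseded"
            m["superseded_by"] = winner_id
        resolved.append(m)
    return resolved
```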
## Detection latency

Conflict detection runs at two latencies:

1. **Synchronous (at write time)** — every create/promote/update of
   an active row in a conflict-enabled type runs a synchronous
   same-layer detector. If a conflict is detected, the write still
   succeeds, but a row is inserted into `conflicts` and the API
   response includes a `conflict_id` field so the caller knows
   immediately.

2. **Asynchronous (nightly sweep)** — a scheduled job walks all
   three layers looking for cross-layer conflicts that slipped
   past write-time detection (e.g. a memory that was already
   active before an entity with the same slot was promoted). The
   sweep also looks for slot overlaps that the synchronous
   detector can't see because the slot key extraction rules have
   improved since the row was written.

Both paths write to the same `conflicts` table and both are
surfaced in the same review queue.
## The "flag, never block" rule

Detection **never** blocks writes. The operating rule is:

- If the write is otherwise valid (schema, permissions, trust
  hierarchy), accept it
- Log the conflict
- Surface it to the reviewer
- Let the system keep functioning with the conflict in place

The alternative — blocking writes on conflict — would mean that
one stale fact could prevent all future writes until manually
resolved, which in practice makes the system unusable for normal
work. The "flag, never block" rule keeps AtoCore responsive while
still making conflicts impossible to ignore (the `/health`
endpoint's `open_conflicts_count` makes them loud).

The one exception: writing to `project_state` (layer 3) when an
open conflict already exists on that slot will return a warning
in the response body. The write still happens, but the reviewer
is explicitly told "you just wrote to a slot that has an open
conflict". This is the highest-trust layer, so we want extra
friction there without actually blocking.
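The `project_state` exception can be sketched as a write path that always accepts and only decorates the response. The response shape is an assumption:

```python
def write_project_state(store: dict, open_conflicts: set,
                        slot: tuple, value: str) -> dict:
    """Accept the write unconditionally; warn if the slot is disputed."""
    store[slot] = value  # the write always happens ("flag, never block")
    response = {"ok": True, "slot": slot, "value": value}
    if slot in open_conflicts:
        response["warning"] = "slot has an open conflict"
    return response
```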
## Showing conflicts in the Human Mirror

When the Human Mirror template renders a project overview, any
open conflict in that project shows as a **"⚠ disputed"** marker
next to the affected field, with a link to the conflict detail.
This makes conflicts visible to anyone reading the derived
human-facing pages, not just to reviewers who think to check the
`/conflicts` endpoint.

The Human Mirror render rules (not yet written — tracked as future
`human-mirror-rules.md`) will specify exactly where and how the
disputed marker appears.

## What this document does NOT solve

1. **Automatic conflict resolution.** No policy will ever
   automatically promote one conflict member over another. The
   trust hierarchy is an *alert ordering* for reviewers, not an
   auto-resolve rule. The human signs off on every resolution.

2. **Cross-project conflicts.** If p04 and p06 both have
   entities claiming conflicting things about a shared component,
   that is currently out of scope because the V1 slot keys all
   include `project_id`. Cross-project conflict detection is a
   future concern that needs its own slot key strategy.

3. **Temporal conflicts with partial overlap.** If a fact was
   true during a time window and another fact is true in a
   different time window, that is not a conflict — it's history.
   Representing time-bounded facts is deferred to a future
   temporal-entities doc.

4. **Probabilistic "soft" conflicts.** If two entities claim the
   same slot with slightly different values (e.g. "4.8 kg" vs
   "4.82 kg"), is that a conflict? For V1, yes — the string
   values are unequal so they're flagged. Tolerance-aware
   numeric comparisons are a V2 concern.

## TL;DR

- Conflicts = two or more active rows claiming the same slot with
  incompatible values
- Detection fires on every active write AND in a nightly sweep
- Conflicts are stored in a dedicated `conflicts` table with a
  `conflict_members` join
- Resolution is always human (promote-winner / supersede-others
  / dismiss-as-not-a-conflict)
- "Flag, never block" — writes always succeed, conflicts are
  surfaced via `/conflicts`, `/health`, per-entity responses, and
  the Human Mirror
- Trusted project state is the top of the trust hierarchy and is
  assumed correct in any cross-layer conflict until the reviewer
  says otherwise
205 docs/architecture/engineering-knowledge-hybrid-architecture.md Normal file
@@ -0,0 +1,205 @@
# Engineering Knowledge Hybrid Architecture

## Purpose

This note defines how **AtoCore** can evolve into the machine foundation for a **living engineering project knowledge system** while remaining aligned with core AtoCore philosophy.

AtoCore remains:
- the trust engine
- the memory/context engine
- the retrieval/context assembly layer
- the runtime-facing augmentation layer

It does **not** become a generic wiki app or a PLM clone.

## Core Architectural Thesis

AtoCore should act as the **machine truth / context / memory substrate** for project knowledge systems.

That substrate can then support:
- engineering knowledge accumulation
- human-readable mirrors
- OpenClaw augmentation
- future engineering copilots
- project traceability across design, analysis, manufacturing, and operations
## Layer Model

### Layer 0 — Raw Artifact Layer
Examples:
- CAD exports
- FEM exports
- videos / transcripts
- screenshots
- PDFs
- source code
- spreadsheets
- reports
- test data

### Layer 1 — AtoCore Core Machine Layer
Canonical machine substrate.

Contains:
- source registry
- source chunks
- embeddings / vector retrieval
- structured memory
- trusted project state
- entity and relationship stores
- provenance and confidence metadata
- interactions / retrieval logs / context packs

### Layer 2 — Engineering Knowledge Layer
Domain-specific project model built on top of AtoCore.

Represents typed engineering objects such as:
- Project
- System
- Subsystem
- Component
- Interface
- Requirement
- Constraint
- Assumption
- Decision
- Material
- Parameter
- Equation
- Analysis Model
- Result
- Validation Claim
- Manufacturing Process
- Test
- Software Module
- Vendor
- Artifact

### Layer 3 — Human Mirror
Derived human-readable support surface.

Examples:
- project overview
- current state
- subsystem pages
- component pages
- decision log
- validation summary
- timeline
- open questions / risks

This layer is **derived** from structured state and approved synthesis. It is not canonical machine truth.

### Layer 4 — Runtime / Clients
Consumers such as:
- OpenClaw
- CLI tools
- dashboards
- future IDE integrations
- engineering copilots
- reporting systems
- Atomizer / optimization tooling
## Non-Negotiable Rule

**Human-readable pages are support artifacts. They are not the primary machine truth layer.**

Runtime trust order should remain:
1. trusted current project state
2. validated structured records
3. selected reviewed synthesis
4. retrieved source evidence
5. historical / low-confidence material
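The trust order above can be applied directly as a sort key when assembling context. A minimal sketch; the tier labels paraphrase the list and the record shape is an assumption:

```python
# Highest trust first, transcribed from the runtime trust order above.
TRUST_ORDER = [
    "trusted_project_state",
    "validated_structured_record",
    "reviewed_synthesis",
    "source_evidence",
    "historical",
]
_RANK = {tier: i for i, tier in enumerate(TRUST_ORDER)}


def by_trust(records: list[dict]) -> list[dict]:
    """Sort records most-trusted first; unknown tiers sort last."""
    return sorted(records, key=lambda r: _RANK.get(r["tier"], len(TRUST_ORDER)))
```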
## Responsibilities

### AtoCore core owns
- memory CRUD
- trusted project state CRUD
- retrieval orchestration
- context assembly
- provenance
- confidence / status
- conflict flags
- runtime APIs

### Engineering Knowledge Layer owns
- engineering object taxonomy
- engineering relationships
- domain adapters
- project-specific interpretation logic
- design / analysis / manufacturing / operations linkage

### Human Mirror owns
- readability
- navigation
- overview pages
- subsystem summaries
- decision digests
- human inspection / audit comfort
## Update Model

New artifacts should not directly overwrite trusted state.

Recommended update flow:
1. ingest source
2. parse / chunk / register artifact
3. extract candidate objects / claims / relationships
4. compare against current trusted state
5. flag conflicts or supersessions
6. promote updates only under explicit rules
7. regenerate affected human-readable pages
8. log history and provenance
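The flow above can be sketched end to end over toy data. Everything here is illustrative: candidates are `key=value` lines, and `allow` stands in for the explicit promotion rules of step 6:

```python
def run_update_flow(source_text: str, trusted_state: dict, allow) -> dict:
    """Toy version of the update flow: extract, compare, flag, gate-promote."""
    # steps 1-3: ingest, parse, extract candidate claims
    candidates = dict(
        line.split("=", 1) for line in source_text.splitlines() if "=" in line
    )
    # steps 4-5: compare against trusted state, flag conflicts
    conflicts = {k: (trusted_state[k], v)
                 for k, v in candidates.items()
                 if k in trusted_state and trusted_state[k] != v}
    # step 6: promote only under explicit rules; conflicting keys never
    # overwrite trusted state automatically
    promoted = {k: v for k, v in candidates.items()
                if k not in conflicts and allow(k, v)}
    trusted_state.update(promoted)
    # steps 7-8 (page regeneration, provenance logging) would follow here
    return {"conflicts": conflicts, "promoted": promoted}
```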
## Integration with Existing Knowledge Base System

The existing engineering Knowledge Base project can be treated as the first major domain adapter.

Bridge targets include:
- KB-CAD component and architecture pages
- KB-FEM models / results / validation pages
- generation history
- images / transcripts / session captures

AtoCore should absorb the structured value of that system, not replace it with plain retrieval.

## Suggested First Implementation Scope

1. stabilize current AtoCore core behavior
2. define engineering ontology v1
3. add minimal entity / relationship support
4. create a Knowledge Base bridge for existing project structures
5. generate Human Mirror v1 pages:
   - overview
   - current state
   - decision log
   - subsystem summary
6. add engineering-aware context assembly for OpenClaw

## Why This Is Aligned With AtoCore Philosophy

This architecture preserves the original core ideas:
- owned memory layer
- owned context assembly
- machine-human separation
- provenance and trust clarity
- portability across runtimes
- robustness before sophistication

## Long-Range Outcome

AtoCore can become the substrate for a **knowledge twin** of an engineering project:
- structure
- intent
- rationale
- validation
- manufacturing impact
- operational behavior
- change history
- evidence traceability

That is significantly more powerful than any of:
- a generic wiki
- plain document RAG
- an assistant with only chat memory
250 docs/architecture/engineering-ontology-v1.md Normal file
@@ -0,0 +1,250 @@
# Engineering Ontology V1

## Purpose

Define the first practical engineering ontology that can sit on top of AtoCore and represent a real engineering project as structured knowledge.

This ontology is intended to be:
- useful to machines
- inspectable by humans through derived views
- aligned with AtoCore trust / provenance rules
- expandable across mechanical, FEM, electrical, software, manufacturing, and operations

## Goal

Represent a project as a **system of objects and relationships**, not as a pile of notes.

The ontology should support queries such as:
- what is this subsystem?
- what requirements does this component satisfy?
- what result validates this claim?
- what changed recently?
- what interfaces are affected by a design change?
- what is active vs superseded?
## Object Families

### Project structure
- Project
- System
- Subsystem
- Assembly
- Component
- Interface

### Intent / design logic
- Requirement
- Constraint
- Assumption
- Decision
- Rationale
- Risk
- Issue
- Open Question
- Change Request

### Physical / technical definition
- Material
- Parameter
- Equation
- Configuration
- Geometry Artifact
- CAD Artifact
- Tolerance
- Operating Mode

### Analysis / validation
- Analysis Model
- Load Case
- Boundary Condition
- Solver Setup
- Result
- Validation Claim
- Test
- Correlation Record

### Manufacturing / delivery
- Manufacturing Process
- Vendor
- BOM Item
- Part Number
- Assembly Procedure
- Inspection Step
- Cost Driver

### Software / controls / electrical
- Software Module
- Control Function
- State Machine
- Signal
- Sensor
- Actuator
- Electrical Interface
- Firmware Artifact

### Evidence / provenance
- Source Document
- Transcript Segment
- Image / Screenshot
- Session
- Report
- External Reference
- Generated Summary
## Minimum Viable V1 Scope

Initial implementation should start with:
- Project
- Subsystem
- Component
- Requirement
- Constraint
- Decision
- Material
- Parameter
- Analysis Model
- Result
- Validation Claim
- Artifact

This is enough to represent meaningful project state without trying to model everything immediately.
## Core Relationship Types

### Structural
- `CONTAINS`
- `PART_OF`
- `INTERFACES_WITH`

### Intent / logic
- `SATISFIES`
- `CONSTRAINED_BY`
- `BASED_ON_ASSUMPTION`
- `AFFECTED_BY_DECISION`
- `SUPERSEDES`

### Validation
- `ANALYZED_BY`
- `VALIDATED_BY`
- `SUPPORTS`
- `CONFLICTS_WITH`
- `DEPENDS_ON`

### Artifact / provenance
- `DESCRIBED_BY`
- `UPDATED_BY_SESSION`
- `EVIDENCED_BY`
- `SUMMARIZED_IN`

## Example Statements

- `Subsystem:Lateral Support CONTAINS Component:Pivot Pin`
- `Component:Pivot Pin CONSTRAINED_BY Requirement:low lateral friction`
- `Decision:Use GF-PTFE pad AFFECTS Subsystem:Lateral Support`
- `AnalysisModel:M1 static model ANALYZES Subsystem:Reference Frame`
- `Result:deflection case 03 SUPPORTS ValidationClaim:vertical stiffness acceptable`
- `Artifact:NX assembly DESCRIBES Component:Reference Frame`
- `Session:gen-004 UPDATED_BY_SESSION Component:Vertical Support`
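The example statements can be held as plain (subject, predicate, object) triples. The prose examples use informal forward forms (`AFFECTS`, `ANALYZES`, `DESCRIBES`); the sketch below stores them under the canonical names from the relationship list, with subject and object swapped where the canonical name is the inverse. The store itself is an assumption:

```python
# Example statements above, normalized to canonical relationship names.
TRIPLES = [
    ("Subsystem:Lateral Support", "CONTAINS", "Component:Pivot Pin"),
    ("Component:Pivot Pin", "CONSTRAINED_BY", "Requirement:low lateral friction"),
    ("Subsystem:Lateral Support", "AFFECTED_BY_DECISION", "Decision:Use GF-PTFE pad"),
    ("Subsystem:Reference Frame", "ANALYZED_BY", "AnalysisModel:M1 static model"),
    ("Result:deflection case 03", "SUPPORTS",
     "ValidationClaim:vertical stiffness acceptable"),
    ("Component:Reference Frame", "DESCRIBED_BY", "Artifact:NX assembly"),
    ("Component:Vertical Support", "UPDATED_BY_SESSION", "Session:gen-004"),
]


def objects_of(subject: str, predicate: str) -> list[str]:
    """All objects linked from `subject` via `predicate`."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]
```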
## Shared Required Fields

Every major object should support fields equivalent to:
- `id`
- `type`
- `name`
- `project_id`
- `status`
- `confidence`
- `source_refs`
- `created_at`
- `updated_at`
- `notes` (optional)

## Suggested Status Lifecycle

For objects and claims:
- `candidate`
- `active`
- `superseded`
- `invalid`
- `needs_review`
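The shared field set and status lifecycle can be sketched as a single dataclass. Types and defaults are assumptions:

```python
from dataclasses import dataclass, field

# Status values from the suggested lifecycle above.
VALID_STATUSES = {"candidate", "active", "superseded", "invalid", "needs_review"}


@dataclass
class EngineeringObject:
    """Shared required fields for every major engineering object."""
    id: str
    type: str
    name: str
    project_id: str
    status: str = "candidate"
    confidence: float = 0.0
    source_refs: list[str] = field(default_factory=list)
    created_at: str = ""
    updated_at: str = ""
    notes: str = ""

    def __post_init__(self):
        if self.status not in VALID_STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")
```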
## Trust Rules

1. An object may exist before it becomes trusted.
2. A generated markdown summary is not canonical truth by default.
3. If evidence conflicts, prefer:
   1. trusted current project state
   2. validated structured records
   3. reviewed derived synthesis
   4. raw evidence
   5. historical notes
4. Conflicts should be surfaced, not silently blended.
## Mapping to the Existing Knowledge Base System

### KB-CAD can map to
- System
- Subsystem
- Component
- Material
- Decision
- Constraint
- Artifact

### KB-FEM can map to
- Analysis Model
- Load Case
- Boundary Condition
- Result
- Validation Claim
- Correlation Record

### Session generations can map to
- Session
- Generated Summary
- object update history
- provenance events

## Human Mirror Possibilities

Once the ontology exists, AtoCore can generate pages such as:
- project overview
- subsystem page
- component page
- decision log
- validation summary
- requirement trace page

These should remain **derived representations** of structured state.

## Recommended V1 Deliverables

1. minimal typed object registry
2. minimal typed relationship registry
3. evidence-linking support
4. practical query support for:
   - component summary
   - subsystem current state
   - requirement coverage
   - result-to-claim mapping
   - decision history

## What Not To Do In V1

- do not model every engineering concept immediately
- do not build a giant graph with no practical queries
- do not collapse structured objects back into only markdown
- do not let generated prose outrank structured truth
- do not auto-promote trusted state too aggressively

## Summary

Ontology V1 should be:
- small enough to implement
- rich enough to be useful
- aligned with AtoCore trust philosophy
- capable of absorbing the existing engineering Knowledge Base work

The first goal is not to model everything.
The first goal is to represent enough of a real project that AtoCore can reason over structure, not just notes.
380 docs/architecture/engineering-query-catalog.md Normal file
@@ -0,0 +1,380 @@
# Engineering Query Catalog (V1 driving target)

## Purpose

This document is the **single most important driver** of the engineering
layer V1 design. The ontology, the schema, the relationship types, and
the human mirror templates should all be designed *to answer the queries
in this catalog*. Anything in the ontology that does not serve at least
one of these queries is overdesign for V1.

The rule is:

> If we cannot describe what question a typed object or relationship
> lets us answer, that object or relationship is not in V1.

The catalog is also the **acceptance test** for the engineering layer.
"V1 is done" means: AtoCore can answer at least the V1-required queries
in this list against the active project set (`p04-gigabit`,
`p05-interferometer`, `p06-polisher`).

## Structure of each entry

Each query is documented as:

- **id**: stable identifier (`Q-001`, `Q-002`, ...)
- **question**: the natural-language question a human or LLM would ask
- **example invocation**: how a client would call AtoCore to ask it
- **expected result shape**: the structure of the answer (not real data)
- **objects required**: which engineering objects must exist
- **relationships required**: which relationships must exist
- **provenance requirement**: what evidence must be linkable
- **tier**: `v1-required` | `v1-stretch` | `v2`

## Tiering

- **v1-required** queries are the floor. The engineering layer cannot
  ship without all of them working.
- **v1-stretch** queries should be doable with V1 objects but may need
  additional adapters.
- **v2** queries are aspirational; they belong to a later wave of
  ontology work and are listed here only to make sure V1 does not
  paint us into a corner.
## V1 minimum object set (recap)

For reference, the V1 ontology includes:

- Project, Subsystem, Component
- Requirement, Constraint, Decision
- Material, Parameter
- AnalysisModel, Result, ValidationClaim
- Artifact

And the four relationship families:

- Structural: `CONTAINS`, `PART_OF`, `INTERFACES_WITH`
- Intent: `SATISFIES`, `CONSTRAINED_BY`, `BASED_ON_ASSUMPTION`,
  `AFFECTED_BY_DECISION`, `SUPERSEDES`
- Validation: `ANALYZED_BY`, `VALIDATED_BY`, `SUPPORTS`,
  `CONFLICTS_WITH`, `DEPENDS_ON`
- Provenance: `DESCRIBED_BY`, `UPDATED_BY_SESSION`, `EVIDENCED_BY`,
  `SUMMARIZED_IN`

Every query below is annotated with which of these it depends on, so
that the V1 implementation order is unambiguous.

---
## Tier 1: Structure queries

### Q-001 — What does this subsystem contain?
- **question**: "What components and child subsystems make up
  Subsystem `<name>`?"
- **invocation**: `GET /entities/Subsystem/<id>?expand=contains`
- **expected**: `{ subsystem, contains: [{ id, type, name, status }] }`
- **objects**: Subsystem, Component
- **relationships**: `CONTAINS`
- **provenance**: each child must link back to at least one Artifact or
  source chunk via `DESCRIBED_BY` / `EVIDENCED_BY`
- **tier**: v1-required

### Q-002 — What is this component a part of?
- **question**: "Which subsystem(s) does Component `<name>` belong to?"
- **invocation**: `GET /entities/Component/<id>?expand=parents`
- **expected**: `{ component, part_of: [{ id, type, name, status }] }`
- **objects**: Component, Subsystem
- **relationships**: `PART_OF` (inverse of `CONTAINS`)
- **provenance**: same as Q-001
- **tier**: v1-required

### Q-003 — What interfaces does this subsystem have, and to what?
- **question**: "What does Subsystem `<name>` interface with, and on
  which interfaces?"
- **invocation**: `GET /entities/Subsystem/<id>/interfaces`
- **expected**: `[{ interface_id, peer: { id, type, name }, role }]`
- **objects**: Subsystem (Interface object deferred to v2)
- **relationships**: `INTERFACES_WITH`
- **tier**: v1-required (with simplified Interface = string label;
  full Interface object becomes v2)

### Q-004 — What is the system map for this project right now?
- **question**: "Give me the current structural tree of Project `<id>`."
- **invocation**: `GET /projects/<id>/system-map`
- **expected**: nested tree of `{ id, type, name, status, children: [] }`
- **objects**: Project, Subsystem, Component
- **relationships**: `CONTAINS`, `PART_OF`
- **tier**: v1-required
---
|
||||
|
||||
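Q-004's nested tree is a straightforward recursive assembly over `CONTAINS` edges. A minimal in-memory sketch; the dict shapes here are illustrative, not the real store schema:

```python
def build_system_map(root_id, entities, contains_edges):
    """Assemble the nested {id, type, name, status, children} tree for
    Q-004 from flat entity rows and (parent, child) CONTAINS edges."""
    children_of = {}
    for parent, child in contains_edges:
        children_of.setdefault(parent, []).append(child)

    def node(eid):
        e = entities[eid]
        return {
            "id": eid, "type": e["type"], "name": e["name"],
            "status": e["status"],
            # sort children so the same inputs always yield the same tree
            "children": [node(c) for c in sorted(children_of.get(eid, []))],
        }

    return node(root_id)
```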
## Tier 2: Intent queries

### Q-005 — Which requirements does this component satisfy?
- **question**: "Which Requirements does Component `<name>` satisfy
  today?"
- **invocation**: `GET /entities/Component/<id>?expand=satisfies`
- **expected**: `[{ requirement_id, name, status, confidence }]`
- **objects**: Component, Requirement
- **relationships**: `SATISFIES`
- **provenance**: each `SATISFIES` edge must link to a Result or
  ValidationClaim that supports the satisfaction (or be flagged as
  `unverified`)
- **tier**: v1-required

### Q-006 — Which requirements are not satisfied by anything?
- **question**: "Show me orphan Requirements in Project `<id>` —
  requirements with no `SATISFIES` edge from any Component."
- **invocation**: `GET /projects/<id>/requirements?coverage=orphan`
- **expected**: `[{ requirement_id, name, status, last_updated }]`
- **objects**: Project, Requirement, Component
- **relationships**: absence of `SATISFIES`
- **tier**: v1-required (this is the killer correctness query — it's
  the engineering equivalent of "untested code")

### Q-007 — What constrains this component?
- **question**: "What Constraints apply to Component `<name>`?"
- **invocation**: `GET /entities/Component/<id>?expand=constraints`
- **expected**: `[{ constraint_id, name, value, source_decision_id? }]`
- **objects**: Component, Constraint
- **relationships**: `CONSTRAINED_BY`
- **tier**: v1-required

### Q-008 — Which decisions affect this subsystem or component?
- **question**: "Show me every Decision that affects `<entity>`."
- **invocation**: `GET /entities/<type>/<id>?expand=decisions`
- **expected**: `[{ decision_id, name, status, made_at, supersedes? }]`
- **objects**: Decision, plus the affected entity
- **relationships**: `AFFECTED_BY_DECISION`, `SUPERSEDES`
- **tier**: v1-required

### Q-009 — Which decisions are based on assumptions that are now flagged?
- **question**: "Are any active Decisions in Project `<id>` based on an
  Assumption that has been marked invalid or needs_review?"
- **invocation**: `GET /projects/<id>/decisions?assumption_status=needs_review,invalid`
- **expected**: `[{ decision_id, assumption_id, assumption_status }]`
- **objects**: Decision, Assumption
- **relationships**: `BASED_ON_ASSUMPTION`
- **tier**: v1-required (this is the second killer correctness query —
  catches fragile design)

---

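The absence-of-edge pattern behind Q-006 (and Q-011) is just a set difference. A minimal sketch over illustrative flat rows rather than the real tables:

```python
def orphan_requirements(requirements, satisfies_edges):
    """Q-006 sketch: active requirements with no SATISFIES edge from any
    Component. `satisfies_edges` is (component_id, requirement_id) pairs."""
    satisfied = {req_id for _, req_id in satisfies_edges}
    return [r for r in requirements
            if r["status"] == "active" and r["id"] not in satisfied]
```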
## Tier 3: Validation queries

### Q-010 — What result validates this claim?
- **question**: "Show me the Result(s) supporting ValidationClaim
  `<name>`."
- **invocation**: `GET /entities/ValidationClaim/<id>?expand=supports`
- **expected**: `[{ result_id, analysis_model_id, summary, confidence }]`
- **objects**: ValidationClaim, Result, AnalysisModel
- **relationships**: `SUPPORTS`, `ANALYZED_BY`
- **provenance**: every Result must link to its AnalysisModel and an
  Artifact via `DESCRIBED_BY`
- **tier**: v1-required

### Q-011 — Are there any active validation claims with no supporting result?
- **question**: "Which active ValidationClaims in Project `<id>` have
  no `SUPPORTS` edge from any Result?"
- **invocation**: `GET /projects/<id>/validation?coverage=unsupported`
- **expected**: `[{ claim_id, name, status, last_updated }]`
- **objects**: ValidationClaim, Result
- **relationships**: absence of `SUPPORTS`
- **tier**: v1-required (third killer correctness query — catches
  claims that are not yet evidenced)

### Q-012 — Are there conflicting results for the same claim?
- **question**: "Show me ValidationClaims where multiple Results
  disagree (one `SUPPORTS`, another `CONFLICTS_WITH`)."
- **invocation**: `GET /projects/<id>/validation?coverage=conflict`
- **expected**: `[{ claim_id, supporting_results, conflicting_results }]`
- **objects**: ValidationClaim, Result
- **relationships**: `SUPPORTS`, `CONFLICTS_WITH`
- **tier**: v1-required

---

## Tier 4: Change / time queries

### Q-013 — What changed in this project recently?
- **question**: "List entities in Project `<id>` whose `updated_at`
  is within the last `<window>`."
- **invocation**: `GET /projects/<id>/changes?since=<iso>`
- **expected**: `[{ id, type, name, status, updated_at, change_kind }]`
- **objects**: any
- **relationships**: any
- **tier**: v1-required

### Q-014 — What is the decision history for this subsystem?
- **question**: "Show me all Decisions affecting Subsystem `<id>` in
  chronological order, including superseded ones."
- **invocation**: `GET /entities/Subsystem/<id>/decision-log`
- **expected**: ordered list with supersession chain
- **objects**: Decision, Subsystem
- **relationships**: `AFFECTED_BY_DECISION`, `SUPERSEDES`
- **tier**: v1-required (this is what a human-readable decision log
  is generated from)

### Q-015 — What was the trusted state of this entity at time T?
- **question**: "Reconstruct the active fields of `<entity>` as of
  timestamp `<T>`."
- **invocation**: `GET /entities/<type>/<id>?as_of=<iso>`
- **expected**: the entity record as it would have been seen at T
- **objects**: any
- **relationships**: status lifecycle
- **tier**: v1-stretch (requires status history table — defer if
  baseline implementation runs long)

---

## Tier 5: Cross-cutting queries

### Q-016 — Which interfaces are affected by changing this component?
- **question**: "If Component `<name>` changes, which Interfaces and
  which peer subsystems are impacted?"
- **invocation**: `GET /entities/Component/<id>/impact`
- **expected**: `[{ interface_id, peer_id, peer_type, peer_name }]`
- **objects**: Component, Subsystem
- **relationships**: `PART_OF`, `INTERFACES_WITH`
- **tier**: v1-required (this is the change-impact-analysis query the
  whole engineering layer exists for)

### Q-017 — What evidence supports this fact?
- **question**: "Give me the source documents and chunks that support
  the current value of `<entity>.<field>`."
- **invocation**: `GET /entities/<type>/<id>/evidence?field=<field>`
- **expected**: `[{ source_file, chunk_id, heading_path, score }]`
- **objects**: any
- **relationships**: `EVIDENCED_BY`, `DESCRIBED_BY`
- **tier**: v1-required (without this the engineering layer cannot
  pass the AtoCore "trust + provenance" rule)

### Q-018 — What is active vs superseded for this concept?
- **question**: "Show me the current active record for `<key>` plus
  the chain of superseded versions."
- **invocation**: `GET /entities/<type>/<id>?include=superseded`
- **expected**: `{ active, superseded_chain: [...] }`
- **objects**: any
- **relationships**: `SUPERSEDES`
- **tier**: v1-required

### Q-019 — Which components depend on this material?
- **question**: "List every Component whose Material is `<material>`."
- **invocation**: `GET /entities/Material/<id>/components`
- **expected**: `[{ component_id, name, subsystem_id }]`
- **objects**: Component, Material
- **relationships**: derived from the Component.material field; no edge
  needed
- **tier**: v1-required

### Q-020 — What does this project look like as a project overview?
- **question**: "Generate the human-readable Project Overview for
  Project `<id>` from current trusted state."
- **invocation**: `GET /projects/<id>/mirror/overview`
- **expected**: formatted markdown derived from active entities
- **objects**: Project, Subsystem, Component, Decision, Requirement,
  ValidationClaim
- **relationships**: structural + intent
- **tier**: v1-required (this is the Layer 3 Human Mirror entry
  point — the moment the engineering layer becomes useful to humans
  who do not want to call APIs)

---

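Q-016's impact query is a two-hop traversal: `PART_OF` up to the owning subsystem(s), then `INTERFACES_WITH` out to the peers. A sketch with illustrative edge tuples, not the real graph store:

```python
def impact_of_component(component_id, part_of, interfaces_with):
    """Q-016 sketch. `part_of` is (component_id, subsystem_id) pairs;
    `interfaces_with` is (subsystem_a, subsystem_b, interface_label)
    triples, treated as undirected."""
    owners = [s for c, s in part_of if c == component_id]
    impacted = []
    for subsystem in owners:
        for a, b, iface in interfaces_with:
            if a == subsystem:
                impacted.append({"interface_id": iface, "peer_id": b})
            elif b == subsystem:
                impacted.append({"interface_id": iface, "peer_id": a})
    return impacted
```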
## v1-stretch (nice to have)

### Q-021 — Which parameters drive this analysis result?
- **objects**: AnalysisModel, Parameter, Result
- **relationships**: `ANALYZED_BY`, plus a new `DRIVEN_BY` edge

### Q-022 — Which decisions cite which prior decisions?
- **objects**: Decision
- **relationships**: `BASED_ON_DECISION` (new)

### Q-023 — Cross-project comparison
- **question**: "Are any Materials shared between p04, p05, and p06,
  and are their Constraints consistent?"
- **objects**: Project, Material, Constraint

---

## v2 (deferred)

### Q-024 — Cost rollup
- requires BOM Item, Cost Driver, Vendor — out of V1 scope

### Q-025 — Manufacturing readiness
- requires Manufacturing Process, Inspection Step, Assembly Procedure
  — out of V1 scope

### Q-026 — Software / control state
- requires Software Module, State Machine, Sensor, Actuator — out
  of V1 scope

### Q-027 — Test correlation across analyses
- requires Test, Correlation Record — out of V1 scope

---

## What this catalog implies for V1 implementation order

The 20 v1-required queries above tell us what to build first, in
roughly this order:

1. **Structural** (Q-001 to Q-004): need Project, Subsystem, Component
   and `CONTAINS` / `PART_OF` / `INTERFACES_WITH` (with Interface as a
   simple string label, not its own entity).
2. **Intent core** (Q-005 to Q-008): need Requirement, Constraint,
   Decision and `SATISFIES` / `CONSTRAINED_BY` / `AFFECTED_BY_DECISION`.
3. **Killer correctness queries** (Q-006, Q-009, Q-011): need the
   absence-of-edge query patterns and the Assumption object.
4. **Validation** (Q-010 to Q-012): need AnalysisModel, Result,
   ValidationClaim and `SUPPORTS` / `ANALYZED_BY` / `CONFLICTS_WITH`.
5. **Change/time** (Q-013, Q-014): need a write log per entity (the
   existing `updated_at` plus a status history if Q-015 is in scope).
6. **Cross-cutting** (Q-016 to Q-019): impact analysis is mostly a
   graph traversal once the structural and intent edges exist.
7. **Provenance** (Q-017): the entity store must always link to
   chunks/artifacts via `EVIDENCED_BY` / `DESCRIBED_BY`. This is
   non-negotiable and should be enforced at insert time, not later.
8. **Human Mirror** (Q-020): the markdown generator is the *last*
   thing built, not the first. It is derived from everything above.

## What is intentionally left out of V1

- BOM, manufacturing, vendor, cost objects (entire family deferred)
- Software, control, electrical objects (entire family deferred)
- Test correlation objects (entire family deferred)
- Full Interface as its own entity (string label is enough for V1)
- Time-travel queries beyond `since=<iso>` (Q-015 is stretch)
- Multi-project rollups (Q-023 is stretch)

## Open questions this catalog raises

These are the design questions that need to be answered in the next
planning docs (memory-vs-entities, conflict-model, promotion-rules):

- **Q-006, Q-011 (orphan / unsupported queries)**: do orphans get
  flagged at insert time, computed at query time, or both?
- **Q-009 (assumption-driven decisions)**: when an Assumption flips
  to `needs_review`, are all dependent Decisions auto-flagged or do
  they only show up when this query is run?
- **Q-012 (conflicting results)**: does AtoCore *block* a conflict
  from being saved, or always save and flag? (The trust rule says
  flag, never block — but the implementation needs the explicit nod.)
- **Q-017 (evidence)**: is `EVIDENCED_BY` mandatory at insert? If yes,
  how do we backfill entities extracted from older interactions where
  the source link is fuzzy?
- **Q-020 (Project Overview mirror)**: when does it regenerate?
  On every entity write? On a schedule? On demand?

These are the questions the next architecture docs in the planning
sprint should resolve before any code is written.

## Working rule

> If a v1-required query in this catalog cannot be answered against
> at least one of `p04-gigabit`, `p05-interferometer`, or
> `p06-polisher`, the engineering layer is not done.

This catalog is the contract.

434
docs/architecture/engineering-v1-acceptance.md
Normal file
@@ -0,0 +1,434 @@
# Engineering Layer V1 Acceptance Criteria

## Why this document exists

The engineering layer planning sprint produced 7 architecture
docs. None of them on its own says "you're done with V1, ship
it". This document does. It translates the planning into
measurable, falsifiable acceptance criteria so the implementation
sprint can know unambiguously when V1 is complete.

The acceptance criteria are organized into four categories:

1. **Functional** — what the system must be able to do
2. **Quality** — how well it must do it
3. **Operational** — what running it must look like
4. **Documentation** — what must be written down

V1 is "done" only when **every criterion in this document is met
against at least one of the three active projects** (`p04-gigabit`,
`p05-interferometer`, `p06-polisher`). The choice of which
project is the test bed is up to the implementer, but the same
project must satisfy all functional criteria.

## The single-sentence definition

> AtoCore Engineering Layer V1 is done when, against one chosen
> active project, every v1-required query in
> `engineering-query-catalog.md` returns a correct result, the
> Human Mirror renders a coherent project overview, and a real
> KB-CAD or KB-FEM export round-trips through the ingest →
> review queue → active entity flow without violating any
> conflict or trust invariant.

Everything below is the operational form of that sentence.

## Category 1 — Functional acceptance

### F-1: Entity store implemented per the V1 ontology

- The 12 V1 entity types from `engineering-ontology-v1.md` exist
  in the database with the schema described there
- The 4 relationship families (Structural, Intent, Validation,
  Provenance) are implemented as edges with the relationship
  types listed in the catalog
- Every entity has the shared header fields:
  `id, type, name, project_id, status, confidence, source_refs,
  created_at, updated_at, extractor_version, canonical_home`
- The status lifecycle matches the memory layer:
  `candidate → active → superseded | invalid`

### F-2: All v1-required queries return correct results

For the chosen test project, every query Q-001 through Q-020 in
`engineering-query-catalog.md` must:

- be implemented as an API endpoint with the shape specified in
  the catalog
- return the expected result shape against real data
- include the provenance chain when the catalog requires it
- handle the empty case (no matches) gracefully — empty array,
  not 500

The "killer correctness queries" — Q-006 (orphan requirements),
Q-009 (decisions on flagged assumptions), Q-011 (unsupported
validation claims) — are non-negotiable. If any of those three
returns wrong results, V1 is not done.

### F-3: Tool ingest endpoints are live

Both endpoints from `tool-handoff-boundaries.md` are implemented:

- `POST /ingest/kb-cad/export` accepts the documented JSON
  shape, validates it, and produces entity candidates
- `POST /ingest/kb-fem/export` ditto
- Both refuse exports with invalid schemas (4xx with a clear
  error)
- Both return a summary of created/dropped/failed counts
- Both never auto-promote anything; everything lands as
  `status="candidate"`
- Both carry source identifiers (exporter name, exporter version,
  source artifact id) into the candidate's provenance fields

A real KB-CAD export — even a hand-crafted one if the actual
exporter doesn't exist yet — must round-trip through the endpoint
and produce reviewable candidates for the test project.

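A minimal sketch of the F-3 ingest contract: validate the envelope, turn every row into a `status="candidate"` entity, never auto-promote, and carry source identifiers into provenance. The field names are assumptions for illustration, not the documented export schema:

```python
def ingest_kb_export(payload):
    """Validate an export payload and produce candidate entities.
    Returns a created/dropped summary like the endpoint would."""
    required = {"exporter", "exporter_version", "source_artifact_id", "entities"}
    missing = required - payload.keys()
    if missing:
        # 4xx path: refuse exports with an invalid envelope, clear error
        return {"ok": False, "error": f"missing fields: {sorted(missing)}"}
    candidates, dropped = [], 0
    for row in payload["entities"]:
        if "type" not in row or "name" not in row:
            dropped += 1  # count malformed rows instead of failing the batch
            continue
        candidates.append({
            **row,
            "status": "candidate",  # the review queue decides promotion
            "source_refs": {
                "source_artifact_id": payload["source_artifact_id"],
                "exporter": payload["exporter"],
                "exporter_version": payload["exporter_version"],
            },
        })
    return {"ok": True, "created": len(candidates), "dropped": dropped,
            "candidates": candidates}
```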
### F-4: Candidate review queue works end to end

Per `promotion-rules.md`:

- `GET /entities?status=candidate` lists the queue
- `POST /entities/{id}/promote` moves candidate → active
- `POST /entities/{id}/reject` moves candidate → invalid
- The same shapes work for memories (already shipped in Phase 9 C)
- The reviewer can edit a candidate's content via
  `PUT /entities/{id}` before promoting
- Every promote/reject is logged with timestamp and reason

### F-5: Conflict detection fires

Per `conflict-model.md`:

- The synchronous detector runs at every active write
  (create, promote, project_state set, KB import)
- A test must demonstrate that pushing a contradictory KB-CAD
  export creates a `conflicts` row with both members linked
- The reviewer can resolve the conflict via
  `POST /conflicts/{id}/resolve` with one of the supported
  actions (supersede_others, no_action, dismiss)
- Resolution updates the underlying entities according to the
  chosen action

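The "flag, never block" invariant behind F-5 can be sketched as: the write always lands, and a contradiction on the same slot key only appends a conflict row. The slot-key derivation below is a stand-in for the real one:

```python
def write_with_conflict_check(new_entity, active_entities, conflicts):
    """Append-style write with synchronous conflict detection.
    Entity dicts and the (project_id, type, name) slot key are
    illustrative, not the real schema."""
    slot = (new_entity["project_id"], new_entity["type"], new_entity["name"])
    for other in active_entities:
        other_slot = (other["project_id"], other["type"], other["name"])
        if other_slot == slot and other["value"] != new_entity["value"]:
            # flag the contradiction, but do not reject the write
            conflicts.append({"members": [other["id"], new_entity["id"]],
                              "status": "open"})
    active_entities.append(new_entity)  # the write is never blocked
    return new_entity
```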
### F-6: Human Mirror renders for the test project

Per `human-mirror-rules.md`:

- `GET /mirror/{project}/overview` returns rendered markdown
- `GET /mirror/{project}/decisions` returns rendered markdown
- `GET /mirror/{project}/subsystems/{subsystem}` returns
  rendered markdown for at least one subsystem
- `POST /mirror/{project}/regenerate` triggers regeneration on
  demand
- Generated files appear under `/srv/storage/atocore/data/mirror/`
  with the "do not edit" header banner
- Disputed markers appear inline when conflicts exist
- Project-state overrides display with the `(curated)` annotation
- Output is deterministic (the same inputs produce the same
  bytes, suitable for diffing)

### F-7: Memory-to-entity graduation works for at least one type

Per `memory-vs-entities.md`:

- `POST /memory/{id}/graduate` exists
- Graduating a memory of type `adaptation` produces a Decision
  entity candidate with the memory's content as a starting point
- The original memory row stays at `status="graduated"` (a new
  status added by the engineering layer migration)
- The graduated memory has a forward pointer to the entity
  candidate's id
- Promoting the entity candidate does NOT delete the original
  memory
- The same graduation flow works for `project` → Requirement
  and `knowledge` → Fact entity types (test the path; doesn't
  have to be exhaustive)

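The graduation shape described above can be sketched in a few lines, assuming illustrative field names: the memory survives with a forward pointer, and the new entity starts life as a candidate:

```python
def graduate_memory(memory, entity_type):
    """F-7 sketch: produce an entity candidate from a memory without
    destroying the memory row. Field names are illustrative."""
    candidate = {
        "id": f"ent-{memory['id']}",
        "type": entity_type,
        "name": memory["content"][:80],  # starting point for the reviewer
        "status": "candidate",
        "source_refs": {"source_memory_id": memory["id"]},
    }
    memory["status"] = "graduated"            # original row is kept
    memory["graduated_to"] = candidate["id"]  # forward pointer
    return candidate
```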
### F-8: Provenance chain is complete

For every active entity in the test project, the following must
be true:

- It links back to at least one source via `source_refs` (which
  is one or more of: source_chunk_id, source_interaction_id,
  source_artifact_id from KB import)
- The provenance chain can be walked from the entity to the
  underlying raw text (source_chunks) or external artifact
- Q-017 (the evidence query) returns at least one row for every
  active entity

If any active entity has no provenance, it's a bug — provenance
is mandatory at write time per the promotion rules.

## Category 2 — Quality acceptance

### Q-1: All existing tests still pass

The full pre-V1 test suite (currently 160 tests) must still
pass. The V1 implementation may add new tests but cannot regress
any existing test.

### Q-2: V1 has its own test coverage

For each of F-1 through F-8 above, at least one automated test
exists that:

- exercises the happy path
- covers at least one error path
- runs in CI in under 10 seconds (no real network, no real LLM)

The full V1 test suite should be under 30 seconds total runtime
to keep the development loop fast.

### Q-3: Conflict invariants are enforced by tests

Specific tests must demonstrate:

- Two contradictory KB exports produce a conflict (not silent
  overwrite)
- A reviewer can't accidentally promote both members of an open
  conflict to active without resolving the conflict first
- The "flag, never block" rule holds — writes still succeed
  even when they create a conflict

### Q-4: Trust hierarchy is enforced by tests

Specific tests must demonstrate:

- Entity candidates can never appear in context packs
- Reinforcement only touches active memories (already covered
  by Phase 9 Commit B tests, but the same property must hold
  for entities once they exist)
- Nothing automatically writes to project_state, ever
- Candidates can never satisfy Q-005 (only active entities count)

### Q-5: The Human Mirror is reproducible

A golden-file test exists for at least one Mirror page. Updating
the golden file is a normal part of template work (a single,
well-documented command). The test fails if the renderer
produces different bytes for the same input, catching
non-determinism.

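The determinism requirement mostly comes down to never iterating an unordered container while rendering. A sketch of the sorted-everywhere discipline, with illustrative input shapes:

```python
def render_overview(project, subsystems):
    """Q-5 sketch: same inputs must produce the same bytes, so every
    iteration happens over a sorted view."""
    lines = [
        f"# {project['name']} Project Overview",
        "<!-- generated by AtoCore; do not edit -->",
        "",
    ]
    for sub in sorted(subsystems, key=lambda s: s["name"]):
        lines.append(f"## {sub['name']} ({sub['status']})")
        for comp in sorted(sub["components"]):  # sets have no stable order
            lines.append(f"- {comp}")
        lines.append("")
    return "\n".join(lines)
```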
### Q-6: Killer correctness queries pass against real-ish data

The test bed for Q-006, Q-009, Q-011 is not synthetic. The
implementation must seed the test project with at least:

- One Requirement that has a satisfying Component (Q-006 should
  not flag it)
- One Requirement with no satisfying Component (Q-006 must flag it)
- One Decision based on an Assumption flagged as `needs_review`
  (Q-009 must flag the Decision)
- One ValidationClaim with at least one supporting Result
  (Q-011 should not flag it)
- One ValidationClaim with no supporting Result (Q-011 must flag it)

These five seed cases run as a single integration test that
exercises the killer correctness queries against actual
representative data.

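The five seed cases can be checked in one pass. A sketch with illustrative fixture shapes, evaluating the three killer queries naively rather than through the real endpoints:

```python
def seed_killer_query_cases():
    """Minimal fixture covering the flagged and non-flagged side of
    each killer correctness query. Shapes are illustrative."""
    return {
        "requirements": [{"id": "R-sat", "satisfied_by": ["C-1"]},
                         {"id": "R-orphan", "satisfied_by": []}],
        "decisions": [{"id": "D-fragile", "assumption_status": "needs_review"}],
        "claims": [{"id": "V-ok", "supported_by": ["RES-1"]},
                   {"id": "V-bare", "supported_by": []}],
    }

def flagged(seed):
    """Naive evaluation of Q-006 / Q-009 / Q-011 over the fixture."""
    return {
        "Q-006": [r["id"] for r in seed["requirements"]
                  if not r["satisfied_by"]],
        "Q-009": [d["id"] for d in seed["decisions"]
                  if d["assumption_status"] in ("needs_review", "invalid")],
        "Q-011": [c["id"] for c in seed["claims"] if not c["supported_by"]],
    }
```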
## Category 3 — Operational acceptance

### O-1: Migration is safe and reversible

The V1 schema migration (adding the `entities`, `relationships`,
`conflicts`, `conflict_members` tables, plus `mirror_regeneration_failures`)
must:

- run cleanly against a production-shape database
- be implemented via the same `_apply_migrations` pattern as
  Phase 9 (additive only, idempotent, safe to run twice)
- be tested by spinning up a fresh DB AND running against a
  copy of the live Dalidou DB taken from a backup

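The additive, idempotent migration style O-1 asks for can be sketched with SQLite DDL; the column lists below are illustrative, not the real schema:

```python
import sqlite3

def apply_v1_migration(conn):
    """Additive-only DDL in the _apply_migrations style: idempotent,
    so running it twice is a no-op."""
    for ddl in (
        "CREATE TABLE IF NOT EXISTS entities ("
        "id TEXT PRIMARY KEY, type TEXT, name TEXT, status TEXT)",
        "CREATE TABLE IF NOT EXISTS relationships ("
        "src TEXT, dst TEXT, rel TEXT)",
        "CREATE TABLE IF NOT EXISTS conflicts ("
        "id INTEGER PRIMARY KEY, status TEXT)",
        "CREATE TABLE IF NOT EXISTS conflict_members ("
        "conflict_id INTEGER, entity_id TEXT)",
    ):
        conn.execute(ddl)
    conn.commit()
```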
### O-2: Backup and restore still work

The backup endpoint must include the new tables. A restore drill
on the test project must:

- successfully back up the V1 entity state via
  `POST /admin/backup`
- successfully validate the snapshot
- successfully restore from the snapshot per
  `docs/backup-restore-procedure.md`
- pass post-restore verification including a Q-001 query against
  the test project

The drill must be performed once before V1 is declared done.

### O-3: Performance bounds

These are starting bounds; tune later if real usage shows
problems:

- Single-entity write (`POST /entities/...`): under 100ms p99
  on the production Dalidou hardware
- Single Q-001 / Q-005 / Q-008 query: under 500ms p99 against
  a project with up to 1000 entities
- Mirror regeneration of one project overview: under 5 seconds
  for a project with up to 1000 entities
- Conflict detector at write time: adds no more than 50ms p99
  to a write that doesn't actually produce a conflict

These bounds are not tested by automated benchmarks in V1 (that
would be over-engineering). They are sanity-checked by the
developer running the operations against the test project.

### O-4: No new manual ops burden

V1 should not introduce any new "you have to remember to run X
every day" requirement. Specifically:

- Mirror regeneration is automatic (debounced async + daily
  refresh), no manual cron entry needed
- Conflict detection is automatic at write time, no manual sweep
  needed in V1 (the nightly sweep is V2)
- Backup retention cleanup is **still** an open follow-up from
  the operational baseline; V1 does not block on it

### O-5: No regressions in Phase 9 reflection loop

The capture, reinforcement, and extraction loop from Phase 9
A/B/C must continue to work end to end with the engineering
layer in place. Specifically:

- Memories whose types are NOT in the engineering layer
  (identity, preference, episodic) keep working exactly as
  before
- Memories whose types ARE in the engineering layer (project,
  knowledge, adaptation) can still be created by hand or by
  extraction; the deprecation rule from `memory-vs-entities.md`
  ("no new writes after V1 ships") is implemented as a
  configurable warning, not a hard block, so existing
  workflows aren't disrupted

## Category 4 — Documentation acceptance

### D-1: Per-entity-type spec docs

Each of the 12 V1 entity types has a short spec doc under
`docs/architecture/entities/` covering:

- the entity's purpose
- its required and optional fields
- its lifecycle quirks (if any beyond the standard
  candidate/active/superseded/invalid)
- which queries it appears in (cross-reference to the catalog)
- which relationship types reference it

These docs can be terse — a page each, mostly bullet lists.
Their purpose is to make the entity model legible to a future
maintainer, not to be reference manuals.

### D-2: KB-CAD and KB-FEM export schema docs

`docs/architecture/kb-cad-export-schema.md` and
`docs/architecture/kb-fem-export-schema.md` are written and
match the implemented validators.

### D-3: V1 release notes

A `docs/v1-release-notes.md` summarizes:

- What V1 added (entities, relationships, conflicts, mirror,
  ingest endpoints)
- What V1 deferred (auto-promotion, BOM/cost/manufacturing
  entities, NX direct integration, cross-project rollups)
- The migration story for existing memories (graduation flow)
- Known limitations and the V2 roadmap pointers

### D-4: master-plan-status.md and current-state.md updated

Both top-level status docs reflect V1's completion:

- Phase 6 (AtoDrive) and the engineering layer are explicitly
  marked as separate tracks
- The engineering planning sprint section is marked complete
- Phase 9 stays at "baseline complete" (V1 doesn't change Phase 9)
- The engineering layer V1 is added as its own line item

## What V1 explicitly does NOT need to do

To prevent scope creep, here is the negative list. None of the
following are V1 acceptance criteria:

- **No LLM extractor.** The Phase 9 C rule-based extractor is
  the entity extractor for V1 too, just with new rules added for
  entity types.
- **No auto-promotion of candidates.** Per `promotion-rules.md`.
- **No write-back to KB-CAD or KB-FEM.** Per
  `tool-handoff-boundaries.md`.
- **No multi-user / per-reviewer auth.** Single-user assumed.
- **No real-time UI.** API + Mirror markdown is the V1 surface.
  A web UI is V2+.
- **No cross-project rollups.** Per `human-mirror-rules.md`.
- **No time-travel queries** (Q-015 stays v1-stretch).
- **No nightly conflict sweep.** Synchronous detection only in V1.
- **No incremental Chroma snapshots.** The current full-copy
  approach in `backup-restore-procedure.md` is fine for V1.
- **No retention cleanup script.** Still an open follow-up.
- **No backup encryption.** Still an open follow-up.
- **No off-Dalidou backup target.** Still an open follow-up.

## How to use this document during implementation

When the implementation sprint begins:

1. Read this doc once, top to bottom
2. Pick the test project (probably p05-interferometer, because
   the optical/structural domain has the cleanest entity model)
3. For each section, write the test or the implementation, in
   roughly the order: F-1 → F-2 → F-3 → F-4 → F-5 → F-6 → F-7 → F-8
4. Each acceptance criterion's test should be written **before
   or alongside** the implementation, not after
5. Run the full test suite at every commit
6. When every box is checked, write D-3 (release notes), update
   D-4 (status docs), and call V1 done

The implementation sprint should not touch anything outside the
scope listed here. If a desire arises to add something not in
this doc, that's a V2 conversation, not a V1 expansion.

## Anticipated friction points

These are the things I expect will be hard during implementation:

1. **The graduation flow (F-7)** is the most cross-cutting change because it touches the existing memory module. Worth doing it last so the memory module is stable for all the V1 entity work first.
2. **The Mirror's deterministic-output requirement (Q-5)** will bite if the implementer iterates over Python dicts without sorting. Plan to use `sorted()` literally everywhere.
3. **Conflict detection (F-5)** has subtle correctness traps: the slot key extraction must be stable, the dedup-of-existing-conflicts logic must be right, and the synchronous detector must not slow writes meaningfully (Q-3 / O-3 cover this, but watch).
4. **Provenance backfill** for entities that come from the existing memory layer via graduation (F-7) is the trickiest part: the original memory may not have had a strict `source_chunk_id`, in which case the graduated entity also doesn't have one. The implementation needs an "orphan provenance" allowance for graduated entities, with a warning surfaced in the Mirror.

These aren't blockers, just the parts of the V1 spec I'd attack with extra care.
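As a concrete illustration of friction point 2, here is a minimal sketch of deterministic rendering. The function name and data shape are hypothetical, not part of the V1 spec:

```python
def render_component_list(components: dict[str, str]) -> str:
    """Render a markdown bullet list deterministically.

    Iterating the dict directly would make the output depend on
    insertion order; sorting the items makes two renders of the same
    entity state byte-identical, which is what keeps a `git diff`
    between Mirror snapshots meaningful.
    """
    lines = [f"- **{name}**: {value}" for name, value in sorted(components.items())]
    return "\n".join(lines) + "\n"
```

The same `sorted()` discipline applies to every dict and set the renderer touches, not just this one.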
## TL;DR

- Engineering V1 is done when every box in this doc is checked against one chosen active project
- Functional: 8 criteria covering entities, queries, ingest, review queue, conflicts, mirror, graduation, provenance
- Quality: 6 criteria covering tests, golden files, killer correctness, trust enforcement
- Operational: 5 criteria covering migration safety, backup drill, performance bounds, no new manual ops, Phase 9 not regressed
- Documentation: 4 criteria covering entity specs, KB schema docs, release notes, top-level status updates
- Negative list: a clear set of things V1 deliberately does NOT need to do, to prevent scope creep
- The implementation sprint follows this doc as a checklist
384	docs/architecture/human-mirror-rules.md	Normal file
@@ -0,0 +1,384 @@
# Human Mirror Rules (Layer 3 → derived markdown views)

## Why this document exists

The engineering layer V1 stores facts as typed entities and relationships in a SQL database. That representation is excellent for queries, conflict detection, and automated reasoning, but it's terrible for the human reading experience. People want to read prose, not crawl JSON.

The Human Mirror is the layer that turns the typed entity store into human-readable markdown pages. It's strictly a derived view — nothing in the Human Mirror is canonical; every page is regenerated from current entity state on demand.

This document defines:

- what the Human Mirror generates
- when it regenerates
- how the human edits things they see in the Mirror
- how the canonical-vs-derived rule is enforced (so editing the derived markdown can't silently corrupt the entity store)

## The non-negotiable rule

> **The Human Mirror is read-only from the human's perspective.**
>
> If the human wants to change a fact they see in the Mirror, they change it in the canonical home (per `representation-authority.md`), NOT in the Mirror page. The next regeneration picks up the change.

This rule is what makes the whole derived-view approach safe. If the human is allowed to edit Mirror pages directly, the canonical-vs-derived split breaks and the Mirror becomes a second source of truth that disagrees with the entity store.

The technical enforcement is that every Mirror page carries a header banner saying "this file is generated from AtoCore entity state, do not edit", and the file is regenerated from the entity store on every change to its underlying entities. Manual edits will be silently overwritten on the next regeneration.
## What the Mirror generates in V1

Three template families, each producing one or more pages per project:

### 1. Project Overview

One page per registered project. Renders:

- Project header (id, aliases, description)
- Subsystem tree (from Q-001 / Q-004 in the query catalog)
- Active Decisions affecting this project (Q-008, ordered by date)
- Open Requirements with coverage status (Q-005, Q-006)
- Open ValidationClaims with support status (Q-010, Q-011)
- Currently flagged conflicts (from the conflict model)
- Recent changes (Q-013) — last 14 days

This is the most important Mirror page. It's the page someone opens when they want to know "what's the state of this project right now". It deliberately mirrors what `current-state.md` does for AtoCore itself, but generated entirely from typed state.

### 2. Decision Log

One page per project. Renders:

- All active Decisions in chronological order (newest first)
- Each Decision shows: id, what was decided, when, the affected Subsystem/Component, the supporting evidence (Q-014, Q-017)
- Superseded Decisions appear as collapsed "history" entries with a forward link to whatever superseded them
- Conflicting Decisions get a "⚠ disputed" marker

This is the human-readable form of the engineering query catalog's Q-014 query.

### 3. Subsystem Detail

One page per Subsystem (so a few per project). Renders:

- Subsystem header
- Components contained in this subsystem (Q-001)
- Interfaces this subsystem has (Q-003)
- Constraints applying to it (Q-007)
- Decisions affecting it (Q-008)
- Validation status: which Requirements are satisfied, which are open (Q-005, Q-006)
- Change history within this subsystem (Q-013, scoped)

Subsystem detail pages are what someone reads when they're working on a specific part of the system and want everything relevant in one place.
## What the Mirror does NOT generate in V1

Intentionally excluded so the V1 implementation stays scoped:

- **Per-component detail pages.** Components are listed in Subsystem pages but don't get their own pages. Reduces page count from hundreds to dozens.
- **Per-Decision detail pages.** Decisions appear inline in the Project Overview and Decision Log; their full text plus evidence chain is shown there, not on a separate page.
- **Cross-project rollup pages.** No "all projects at a glance" page in V1. Each project is its own report.
- **Time-series / historical pages.** The Mirror is always "current state". History is accessible via the Decision Log and superseded chains, but no "what was true on date X" page exists in V1 (Q-015 is v1-stretch in the query catalog for the same reason).
- **Diff pages between two timestamps.** Same reasoning.
- **Render of the conflict queue itself.** Conflicts appear inline in the relevant Mirror pages with the "⚠ disputed" marker and a link to `/conflicts/{id}`, but there's no Mirror page that lists all conflicts. Use `GET /conflicts`.
- **Per-memory pages.** Memories are not engineering entities; they appear in context packs and the review queue, not in the Human Mirror.
## Where Mirror pages live

Two options were considered. The chosen V1 path is option B:

**Option A — write Mirror pages back into the source vault.** Generate `/srv/storage/atocore/sources/vault/mirror/p05/overview.md` so the human reads them in their normal Obsidian / markdown viewer. **Rejected** because writing into the source vault violates the "sources are read-only" rule from `tool-handoff-boundaries.md` and the operating model.

**Option B (chosen) — write Mirror pages into a dedicated AtoCore output dir, served via the API.** Generate under `/srv/storage/atocore/data/mirror/p05/overview.md`. The human reads them via:

- the API endpoints `GET /mirror/{project}/overview`, `GET /mirror/{project}/decisions`, and `GET /mirror/{project}/subsystems/{subsystem}` (all return rendered markdown as text/markdown)
- a future "Mirror viewer" in the Claude Code slash command `/atocore-mirror <project>` that fetches the rendered markdown and displays it inline
- direct file access on Dalidou for power users: `cat /srv/storage/atocore/data/mirror/p05/overview.md`

The dedicated dir keeps the Mirror clearly separated from the canonical sources and makes regeneration safe (it's just a directory wipe + write).
## When the Mirror regenerates

Three triggers, in order from cheapest to most expensive:

### 1. On explicit human request

```
POST /mirror/{project}/regenerate
```

Returns the timestamp of the regeneration and the list of files written. This is the path the human takes when they've just curated something into project_state and want to see the Mirror reflect it immediately.

### 2. On entity write (debounced, async, per project)

When any entity in a project changes status (candidate → active, active → superseded), a regeneration of that project's Mirror is queued. The queue is debounced — multiple writes within a 30-second window trigger only one regeneration. This keeps the Mirror "close to current" without generating a Mirror update on every single API call.

The implementation is a simple dict of "next regeneration time" per project, checked by a background task. No cron, no message queue, no Celery. Just a `dict[str, datetime]` and a thread.
### 3. On scheduled refresh (daily)

Once per day at a quiet hour, every project's Mirror regenerates unconditionally. This catches any state drift from manual project_state edits that bypassed the entity write hooks, and provides a baseline guarantee that the Mirror is at most 24 hours stale.

The schedule runs from the same machinery as the future backup retention job, so we get one cron-equivalent system to maintain instead of two.
## What if regeneration fails

The Mirror has to be resilient. If regeneration fails for a project (e.g. a query catalog query crashes, or a template raises a rendering error), the existing Mirror files are **not** deleted. They stay in place, showing the last successful state, and the regeneration error is recorded in:

- the API response, if the trigger was explicit
- a log entry at warning level, for the async path
- a `mirror_regeneration_failures` table, for the daily refresh

This means the human can always read the Mirror, even if the last 5 minutes of changes haven't made it in yet. Stale is better than blank.
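One way to get the keep-the-last-good-state behavior is to render first and only then swap files in. A hedged sketch — `render_pages`, its return shape, and the per-file swap are assumptions, not the V1 design:

```python
import logging
from pathlib import Path

log = logging.getLogger("atocore.mirror")

def regenerate_project(project: str, render_pages, mirror_root: Path) -> bool:
    """Render all Mirror pages for one project.

    On any rendering failure, the previously generated files are left
    untouched and a warning is logged, so readers always see the last
    successful state rather than a blank or half-written page.

    `render_pages` is assumed to return {relative_filename: markdown_text}
    and to raise on query or template errors.
    """
    try:
        pages = render_pages(project)  # may raise before any file is touched
    except Exception:
        log.warning("mirror regeneration failed for %s; keeping last good state",
                    project, exc_info=True)
        return False
    out_dir = mirror_root / project
    out_dir.mkdir(parents=True, exist_ok=True)
    for rel, text in sorted(pages.items()):
        tmp = out_dir / (rel + ".tmp")
        tmp.write_text(text, encoding="utf-8")
        tmp.replace(out_dir / rel)  # per-file atomic swap
    return True
```

Rendering everything before writing anything is what makes a mid-render crash harmless.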
## How the human curates "around" the Mirror

The Mirror reflects the current entity state. If the human doesn't like what they see, the right edits go into one of:

| What you want to change | Where you change it |
|---|---|
| A Decision's text | `PUT /entities/Decision/{id}` (or `PUT /memory/{id}` if it's still memory-layer) |
| A Decision's status (active → superseded) | `POST /entities/Decision/{id}/supersede` (V1 entity API) |
| Whether a Component "satisfies" a Requirement | edit the relationship directly via the entity API (V1) |
| The current trusted next focus shown on the Project Overview | `POST /project/state` with `category=status, key=next_focus` |
| A typo in a generated heading or label | edit the **template**, not the rendered file. Templates live in `templates/mirror/` (V1 implementation) |
| Source of a fact ("this came from KB-CAD on day X") | not editable by hand — it's automatically populated from provenance |

The rule is consistent: edit the canonical home, regenerate (or let the auto-trigger fire), and see the change reflected in the Mirror.
## Templates

The Mirror uses Jinja2-style templates checked into the repo under `templates/mirror/`. Each template is a markdown file with placeholders that the renderer fills from query catalog results.

Template list for V1:

- `templates/mirror/project-overview.md.j2`
- `templates/mirror/decision-log.md.j2`
- `templates/mirror/subsystem-detail.md.j2`

Editing a template is a code change, reviewed via normal git PRs. The templates are deliberately small and readable so the human can tweak the output format without touching renderer code.

The renderer is a thin module:

```python
# src/atocore/mirror/renderer.py (V1, not yet implemented)

def render_project_overview(project: str) -> str:
    """Generate the project overview markdown for one project."""
    facts = collect_project_overview_facts(project)
    template = load_template("project-overview.md.j2")
    return template.render(**facts)
```
## The "do not edit" header

Every generated Mirror file starts with a fixed banner:

```markdown
<!--
This file is generated by AtoCore from current entity state.
DO NOT EDIT — manual changes will be silently overwritten on
the next regeneration.
Edit the canonical home instead. See:
https://docs.atocore.../representation-authority.md
Regenerated: 2026-04-07T12:34:56Z
Source entities: <commit-like checksum of input data>
-->
```

The checksum at the end lets the renderer skip work when nothing relevant has changed since the last regeneration. If the inputs match the previous run's checksum, the existing file is left untouched.
## Conflicts in the Mirror

Per the conflict model, any open conflict on a fact that appears in the Mirror gets a visible disputed marker:

```markdown
- Lateral support material: **GF-PTFE** ⚠ disputed
  - The KB-CAD import on 2026-04-07 reported PEEK; conflict #c-039.
```

The disputed marker is a hyperlink (in renderer terms; the markdown output is a relative link) to the conflict detail page in the API, or to the conflict id for direct lookup. The reviewer follows the link, resolves the conflict via `POST /conflicts/{id}/resolve`, and on the next regeneration the marker disappears.
## Project-state overrides in the Mirror

When a Mirror page would show a value derived from entities, but project_state has an override on the same key, **the Mirror shows the project_state value** with a small annotation noting the override:

```markdown
- Next focus: **Wave 2 trusted-operational ingestion** (curated)
```

The `(curated)` annotation tells the reader "this is from the trusted-state Layer 3, not from extracted entities". This makes the trust hierarchy visible in the human reading experience.
## The "Mirror diff" workflow (post-V1, but designed for)

A common workflow after V1 ships will be:

1. The reviewer has curated some new entities
2. They want to see what changed in the Mirror as a result
3. They want to share that diff with someone else as evidence

To support this, the Mirror generator writes its output deterministically (sorted iteration, stable timestamp formatting) so a `git diff` between two regenerated states is meaningful.

V1 doesn't add an explicit "diff between two Mirror snapshots" endpoint — that's deferred. But the deterministic-output property is a V1 requirement, so future diffing works without redesigning the renderer.
## What the Mirror enables

With the Mirror in place:

- **OpenClaw can read project state in human form.** The read-only AtoCore helper skill on the T420 already calls `/context/build`; in V1 it gains the option to call `/mirror/{project}/overview` to get a fully rendered markdown page instead of just retrieved chunks. This is much faster than crawling individual entities for general questions.
- **The human gets a daily-readable artifact.** Every morning, Antoine can `cat /srv/storage/atocore/data/mirror/p05/overview.md` and see the current state of p05 in his preferred reading format. No API calls, no JSON parsing.
- **Cross-collaborator sharing.** If you ever want to send someone a project overview without giving them AtoCore access, the Mirror file is a self-contained markdown document they can read in any markdown viewer.
- **Claude Code integration.** A future `/atocore-mirror <project>` slash command renders the Mirror inline, complementing the existing `/atocore-context` command with a human-readable view of "what does AtoCore think about this project right now".
## Open questions for V1 implementation

1. **What's the regeneration debounce window?** 30 seconds is the starting value, but it should be tuned with real usage.
2. **Does the daily refresh need a separate trigger mechanism, or is it just a long-period entry in the same in-process scheduler that handles the debounced async refreshes?** Probably the latter — keep it simple.
3. **How are templates tested?** Likely a small set of fixture project states plus golden output files, with a single test that asserts `render(fixture) == golden`. Updating golden files is a normal part of template work.
4. **Are Mirror pages discoverable via a directory listing endpoint?** `GET /mirror/{project}` returns the list of available pages for that project. Probably yes; cheap to add.
5. **How does the Mirror handle a project that has zero entities yet?** Render an empty-state page that says "no curated facts yet — add some via /memory or /entities/Decision". Better than a blank file.
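The golden-file scheme in open question 3 can be sketched as a small helper. The directory layout and the `update` flag are assumptions about how the real test suite might be wired, not the V1 design:

```python
from pathlib import Path

def assert_matches_golden(golden_dir: Path, name: str, rendered: str,
                          update: bool = False) -> None:
    """Compare rendered markdown against a checked-in golden file.

    Call with update=True (or wire it to a pytest flag) after an
    intentional template change to rewrite the golden file; the
    resulting diff then shows up in normal code review.
    """
    golden = golden_dir / f"{name}.md"
    if update:
        golden.parent.mkdir(parents=True, exist_ok=True)
        golden.write_text(rendered, encoding="utf-8")
        return
    assert rendered == golden.read_text(encoding="utf-8"), (
        f"{name} drifted from its golden file; rerun with update=True if intended"
    )
```

Because the renderer's output is deterministic, golden comparisons are exact byte equality with no normalization step.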
## TL;DR

- The Human Mirror generates 3 template families per project (Overview, Decision Log, Subsystem Detail) from current entity state
- It's strictly read-only from the human's perspective; edits go to the canonical home and the Mirror picks them up on regeneration
- Three regeneration triggers: explicit POST, debounced async-on-write, daily scheduled refresh
- Mirror files live in `/srv/storage/atocore/data/mirror/` (NOT in the source vault — sources stay read-only)
- Conflicts and project_state overrides are visible inline in the rendered markdown so the trust hierarchy shows through
- Templates are checked into the repo and edited via PR; the rendered files are derived and never canonical
- Deterministic output is a V1 requirement so future diffing works without rework
333	docs/architecture/llm-client-integration.md	Normal file
@@ -0,0 +1,333 @@
# LLM Client Integration (the layering)

## Why this document exists

AtoCore must be reachable from many different LLM client contexts:

- **OpenClaw** on the T420 (already integrated via the read-only helper skill at `/home/papa/clawd/skills/atocore-context/`)
- **Claude Code** on the laptop (via the slash command shipped in this repo at `.claude/commands/atocore-context.md`)
- **Codex** sessions (future)
- **Direct API consumers** — scripts, Python code, ad-hoc curl
- **The eventual MCP server**, when it's worth building

Without an explicit layering rule, every new client tends to reimplement the same routing logic (project detection, context build, retrieval audit, project-state inspection) in slightly different ways. That is exactly what almost happened in the first draft of the Claude Code slash command, which started as a curl + jq script that duplicated capabilities the existing operator client already had.

This document defines the layering so future clients don't repeat that mistake.
## The layering

Three layers, top to bottom:

```
+----------------------------------------------------+
|              Per-agent thin frontends              |
|                                                    |
|  - Claude Code slash command                       |
|    (.claude/commands/atocore-context.md)           |
|  - OpenClaw helper skill                           |
|    (/home/papa/clawd/skills/atocore-context/)      |
|  - Codex skill (future)                            |
|  - MCP server (future)                             |
+----------------------------------------------------+
                          |
                          | shells out to / imports
                          v
+----------------------------------------------------+
|               Shared operator client               |
|             scripts/atocore_client.py              |
|                                                    |
|  - subcommands for stable AtoCore operations       |
|  - fail-open on network errors                     |
|  - consistent JSON output across all subcommands   |
|  - environment-driven configuration                |
|    (ATOCORE_BASE_URL, ATOCORE_TIMEOUT_SECONDS,     |
|     ATOCORE_REFRESH_TIMEOUT_SECONDS,               |
|     ATOCORE_FAIL_OPEN)                             |
+----------------------------------------------------+
                          |
                          | HTTP
                          v
+----------------------------------------------------+
|                  AtoCore HTTP API                  |
|              src/atocore/api/routes.py             |
|                                                    |
|  - the universal interface to AtoCore              |
|  - everything else above is glue                   |
+----------------------------------------------------+
```
## The non-negotiable rules

These rules are what make the layering work.

### Rule 1 — every per-agent frontend is a thin wrapper

A per-agent frontend exists to do exactly two things:

1. **Translate the agent platform's command/skill format** into an invocation of the shared client (or a small sequence of them)
2. **Render the JSON response** into whatever shape the agent platform wants (markdown for Claude Code, plaintext for OpenClaw, an MCP tool result for an MCP server, etc.)

Everything else — talking to AtoCore, project detection, retrieval audit, fail-open behavior, configuration — is the **shared client's** job.

If a per-agent frontend grows logic beyond the two responsibilities above, that logic is in the wrong place. It belongs in the shared client, where every other frontend gets to use it.

### Rule 2 — the shared client never duplicates the API

The shared client is allowed to **compose** API calls (e.g. `auto-context` calls `detect-project` then `context-build`), but it never reimplements API logic. If a useful operation can't be expressed via the existing API endpoints, the right fix is to extend the API, not to embed the logic in the client.

This rule keeps the API as the single source of truth for what AtoCore can do.
### Rule 3 — the shared client only exposes stable operations

A subcommand only makes it into the shared client when:

- the API endpoint behind it has been exercised by at least one real workflow
- the request and response shapes are unlikely to change
- the operation is one that more than one frontend will plausibly want

This rule keeps the client surface stable so frontends don't have to chase changes. New endpoints land in the API first, get exercised in real use, and only then get a client subcommand.
## What's in scope for the shared client today

The currently shipped scope (per `scripts/atocore_client.py`):

### Stable operations (shipped since the client was introduced)

| Subcommand | Purpose | API endpoint(s) |
|---|---|---|
| `health` | service status, mount + source readiness | `GET /health` |
| `sources` | enabled source roots and their existence | `GET /sources` |
| `stats` | document/chunk/vector counts | `GET /stats` |
| `projects` | registered projects | `GET /projects` |
| `project-template` | starter shape for a new project | `GET /projects/template` |
| `propose-project` | preview a registration | `POST /projects/proposal` |
| `register-project` | persist a registration | `POST /projects/register` |
| `update-project` | update an existing registration | `PUT /projects/{name}` |
| `refresh-project` | re-ingest a project's roots | `POST /projects/{name}/refresh` |
| `project-state` | list trusted state for a project | `GET /project/state/{name}` |
| `project-state-set` | curate trusted state | `POST /project/state` |
| `project-state-invalidate` | supersede trusted state | `DELETE /project/state` |
| `query` | raw retrieval | `POST /query` |
| `context-build` | full context pack | `POST /context/build` |
| `auto-context` | detect-project, then context-build | composes `/projects` + `/context/build` |
| `detect-project` | match a prompt to a registered project | composes `/projects` + local regex |
| `audit-query` | retrieval-quality audit with classification | composes `/query` + local labelling |
| `debug-context` | last context pack inspection | `GET /debug/context` |
| `ingest-sources` | ingest configured source dirs | `POST /ingest/sources` |
### Phase 9 reflection loop (shipped after migration safety work)

These subcommands were explicitly deferred in earlier versions of this doc pending an "exercised workflow". The constraint was real — a premature API freeze would have made it harder to iterate on the ergonomics — but the deferral ran into a bootstrap problem: you can't exercise the workflow in real Claude Code sessions without a usable client surface to drive it from. The fix is to ship a minimal Phase 9 surface now and treat it as stable-but-refinable: adding new optional parameters is fine, renaming subcommands is not.

| Subcommand | Purpose | API endpoint(s) |
|---|---|---|
| `capture` | record one interaction round-trip | `POST /interactions` |
| `extract` | run the rule-based extractor (preview or persist) | `POST /interactions/{id}/extract` |
| `reinforce-interaction` | backfill reinforcement on an existing interaction | `POST /interactions/{id}/reinforce` |
| `list-interactions` | paginated list with filters | `GET /interactions` |
| `get-interaction` | fetch one interaction by id | `GET /interactions/{id}` |
| `queue` | list the candidate review queue | `GET /memory?status=candidate` |
| `promote` | move a candidate memory to active | `POST /memory/{id}/promote` |
| `reject` | mark a candidate memory invalid | `POST /memory/{id}/reject` |

All 8 Phase 9 subcommands have test coverage in `tests/test_atocore_client.py` via a mocked `request()`, including an end-to-end test that drives the full capture → extract → queue → promote/reject cycle through the client.
### Coverage summary

That covers everything in the "stable operations" set AND the full Phase 9 reflection loop: project lifecycle, ingestion, project-state curation, retrieval, context build, retrieval-quality audit, health and stats inspection, interaction capture, candidate extraction, and the candidate review queue.
## What's intentionally NOT in scope today

Two families of operations remain deferred:

### 1. Backup and restore admin operations

Phase 9 Commit B shipped these endpoints:

- `POST /admin/backup` (with `include_chroma`)
- `GET /admin/backup` (list)
- `GET /admin/backup/{stamp}/validate`

The backup endpoints are stable, but the documented operational procedure (`docs/backup-restore-procedure.md`) intentionally uses direct curl rather than the shared client. The reason is that backup operations are *administrative* and benefit from being explicit about which instance they're targeting, with no fail-open behavior. The shared client's fail-open default would hide a real backup failure.

If we later decide to add backup commands to the shared client, they would set `ATOCORE_FAIL_OPEN=false` for the duration of the call so the operator gets a real error on failure rather than a silent fail-open envelope.
### 2. Engineering layer entity operations

The engineering layer is in planning, not implementation. When V1 ships per `engineering-v1-acceptance.md`, the shared client will gain entity, relationship, conflict, and Mirror commands. None of those exist as stable contracts yet, so they are not in the shared client today.

## How a new agent platform integrates

When a new LLM client needs AtoCore (e.g. Codex, ChatGPT custom
GPT, a Cursor extension), the integration recipe is:

1. **Don't reimplement.** Don't write a new HTTP client. Use the
   shared client.
2. **Write a thin frontend** that translates the platform's
   command/skill format into a shell call to
   `python scripts/atocore_client.py <subcommand> <args...>`.
3. **Render the JSON response** in the platform's preferred shape.
4. **Inherit fail-open and env-var behavior** from the shared
   client. Don't override unless the platform explicitly needs
   to (e.g. an admin tool that wants to see real errors).
5. **If a needed capability is missing**, propose adding it to
   the shared client. If the underlying API endpoint also
   doesn't exist, propose adding it to the API first. Don't
   add the logic to your frontend.
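
Steps 2 and 3 amount to a few lines of glue. A minimal sketch of such a frontend, with an injectable runner so it can be exercised without the real script (the function name and error handling are illustrative, not a shipped API):

```python
import json
import subprocess
import sys


def run_atocore(subcommand: str, *args: str, runner=subprocess.run):
    """Shell out to the shared client and parse its JSON response.

    `runner` defaults to subprocess.run; a platform frontend calls
    this, then renders the returned dict in its own format.
    """
    cmd = [sys.executable, "scripts/atocore_client.py", subcommand, *args]
    completed = runner(cmd, capture_output=True, text=True)
    if completed.returncode != 0:
        raise RuntimeError(f"atocore client failed: {completed.stderr.strip()}")
    return json.loads(completed.stdout)
```

A Codex skill or MCP tool handler would be little more than a call to `run_atocore("auto-context", prompt)` plus platform-specific rendering.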

The Claude Code slash command in this repo is a worked example:
~50 lines of markdown that does argument parsing, calls the
shared client, and renders the result. It contains zero AtoCore
business logic of its own.

## How OpenClaw fits

OpenClaw's helper skill at `/home/papa/clawd/skills/atocore-context/`
on the T420 currently has its own implementation of `auto-context`,
`detect-project`, and the project lifecycle commands. It predates
this layering doc.

The right long-term shape is to **refactor the OpenClaw helper to
shell out to the shared client** instead of duplicating the
routing logic. This isn't urgent because:

- OpenClaw's helper works today and is in active use
- The duplication is on the OpenClaw side; AtoCore itself is not
  affected
- The shared client and the OpenClaw helper are in different
  repos (AtoCore vs OpenClaw clawd), so the refactor is a
  cross-repo coordination effort

The refactor is queued as a follow-up. Until then, **the OpenClaw
helper and the Claude Code slash command are parallel
implementations** of the same idea. The shared client is the
canonical backbone going forward; new clients should follow the
new pattern even though the existing OpenClaw helper still has
its own implementation.

## How this connects to the master plan

| Layer | Phase home | Status |
|---|---|---|
| AtoCore HTTP API | Phases 0/0.5/1/2/3/5/7/9 | shipped |
| Shared operator client (`scripts/atocore_client.py`) | implicitly Phase 8 (OpenClaw integration) infrastructure | shipped via codex/port-atocore-ops-client merge |
| OpenClaw helper skill (T420) | Phase 8 — partial | shipped (own implementation, refactor queued) |
| Claude Code slash command (this repo) | precursor to Phase 11 (multi-model) | shipped (refactored to use the shared client) |
| Codex skill | Phase 11 | future |
| MCP server | Phase 11 | future |
| Web UI / dashboard | Phase 11+ | future |

The shared client is the **substrate Phase 11 will build on**.
Every new client added in Phase 11 should be a thin frontend on
the shared client, not a fresh reimplementation.

## Versioning and stability

The shared client's subcommand surface is **stable**. Adding new
subcommands is non-breaking. Changing or removing existing
subcommands is breaking and would require a coordinated update
of every frontend that depends on them.

The current shared client has no explicit version constant; the
implicit contract is "the subcommands and JSON shapes documented
in this file". When the client surface meaningfully changes,
add a `CLIENT_VERSION = "x.y.z"` constant to
`scripts/atocore_client.py` and bump it per semver:

- patch: bug fixes, no surface change
- minor: new subcommands or new optional fields
- major: removed subcommands, renamed fields, changed defaults
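
The bump rules above are mechanical, so they can be written down once. A sketch (the constant's value and the helper name are assumptions; the v0.2.0 baseline is named in the follow-ups below):

```python
CLIENT_VERSION = "0.2.0"  # assumed Phase 9 baseline


def bump(version: str, change: str) -> str:
    """Return the next version for a change class:
    'patch'  - bug fixes, no surface change
    'minor'  - new subcommands or new optional fields
    'major'  - removed subcommands, renamed fields, changed defaults
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":
        return f"{major + 1}.0.0"
    if change == "minor":
        return f"{major}.{minor + 1}.0"
    if change == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change class: {change}")
```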

## Open follow-ups

1. **Refactor the OpenClaw helper** to shell out to the shared
   client. Cross-repo coordination, not blocking anything in
   AtoCore itself. With the Phase 9 subcommands now in the shared
   client, the OpenClaw refactor can reuse all the reflection-loop
   work instead of duplicating it.
2. **Real-usage validation of the Phase 9 loop**, now that the
   client surface exists. First capture → extract → review cycle
   against the live Dalidou instance, likely via the Claude Code
   slash command flow. Findings feed back into subcommand
   refinement (new optional flags are fine, renames require a
   semver bump).
3. **Add backup admin subcommands** if and when we decide the
   shared client should be the canonical backup operator
   interface (with fail-open disabled for admin commands).
4. **Add engineering-layer entity subcommands** as part of the
   engineering V1 implementation sprint, per
   `engineering-v1-acceptance.md`.
5. **Tag a `CLIENT_VERSION` constant** the next time the shared
   client surface meaningfully changes. Today's surface with the
   Phase 9 loop added is the v0.2.0 baseline (v0.1.0 was the
   stable-ops-only version).

## TL;DR

- AtoCore HTTP API is the universal interface
- `scripts/atocore_client.py` is the canonical shared Python
  backbone for stable AtoCore operations
- Per-agent frontends (Claude Code slash command, OpenClaw
  helper, future Codex skill, future MCP server) are thin
  wrappers that shell out to the shared client
- The shared client today covers project lifecycle, ingestion,
  retrieval, context build, project-state, retrieval audit, AND
  the full Phase 9 reflection loop (capture / extract /
  reinforce / list / queue / promote / reject)
- Backup admin and engineering-entity commands remain deferred
- The OpenClaw helper is currently a parallel implementation and
  the refactor to the shared client is a queued follow-up
- New LLM clients should never reimplement HTTP calls — they
  follow the shell-out pattern documented here

309
docs/architecture/memory-vs-entities.md
Normal file
@@ -0,0 +1,309 @@

# Memory vs Entities (Engineering Layer V1 boundary)

## Why this document exists

The engineering layer introduces a new representation — typed
entities with explicit relationships — alongside AtoCore's existing
memory system and its six memory types. The question that blocks
every other engineering-layer planning doc is:

> When we extract a fact from an interaction or a document, does it
> become a memory, an entity, or both? And if both, which one is
> canonical?

Without an answer, the rest of the engineering layer cannot be
designed. This document is the answer.

## The short version

- **Memories stay.** They are still the canonical home for
  *unstructured, attributed, personal, natural-language* facts.
- **Entities are new.** They are the canonical home for *structured,
  typed, relational, engineering-domain* facts.
- **No concept lives in both at full fidelity.** Every concept has
  exactly one canonical home. The other layer may hold a pointer or
  a rendered view, never a second source of truth.
- **The two layers share one review queue.** Candidates from
  extraction flow into the same `status=candidate` lifecycle
  regardless of whether they are memory-bound or entity-bound.
- **Memories can "graduate" into entities** when enough structure has
  accumulated, but the upgrade is an explicit, logged promotion, not
  a silent rewrite.

## The split per memory type

The six memory types from the current Phase 2 implementation each
map to exactly one outcome in V1:

| Memory type | V1 destination | Rationale |
|---|---|---|
| identity | **memory only** | Always about the human user. No engineering domain structure. Never gets entity-shaped. |
| preference | **memory only** | Always about the human user's working style. Same reasoning. |
| episodic | **memory only** | "What happened in this conversation / this day." Attribution and time are the point, not typed structure. |
| knowledge | **entity when possible**, memory otherwise | If the knowledge maps to a typed engineering object (material property, constant, tolerance), it becomes a Fact entity with provenance. If it's loose general knowledge, it stays a memory. |
| project | **entity** | Anything that belonged in the "project" memory type is really a Requirement, Constraint, Decision, Subsystem attribute, etc. It belongs in the engineering layer once entities exist. |
| adaptation | **entity (Decision)** | "We decided to X" is literally a Decision entity in the ontology. This is the clearest migration. |

**Practical consequence:** when the engineering layer V1 ships, the
`project`, `knowledge`, and `adaptation` memory types are deprecated
as a canonical home for new facts. Existing rows are not deleted —
they are backfilled as entities through the promotion-rules flow
(see `promotion-rules.md`), and the old memory rows become frozen
references pointing at their graduated entity.

The `identity`, `preference`, and `episodic` memory types continue
to exist exactly as they do today and do not interact with the
engineering layer at all.

## What "canonical home" actually means

A concept's canonical home is the single place where:

- its *current active value* is stored
- its *status lifecycle* is managed (active/superseded/invalid)
- its *confidence* is tracked
- its *provenance chain* is rooted
- edits, supersessions, and invalidations are applied
- conflict resolution is arbitrated

Everything else is a derived view of that canonical row.

If a `Decision` entity is the canonical home for "we switched to
GF-PTFE pads", then:

- there is no `adaptation` memory row with the same content; the
  extractor creates a `Decision` candidate directly
- the context builder, when asked to include relevant state, reaches
  into the entity store via the engineering layer, not the memory
  store
- if the user wants to see "recent decisions" they hit the entity
  API, never the memory API
- if they want to invalidate the decision, they do so via the entity
  API

The memory API remains the canonical home for `identity`,
`preference`, and `episodic` — same rules, just a different set of
types.

## Why not a unified table with a `kind` column?

It would be simpler to implement. It is rejected for three reasons:

1. **Different query shapes.** Memories are queried by type, project,
   confidence, recency. Entities are queried by type, relationships,
   graph traversal, coverage gaps ("orphan requirements"). Cramming
   both into one table forces the schema to be the union of both
   worlds and makes each query slower.

2. **Different lifecycles.** Memories have a simple four-state
   lifecycle (candidate/active/superseded/invalid). Entities have
   the same four states *plus* per-relationship supersession,
   per-field versioning for the killer correctness queries, and
   structured conflict flagging. The unified table would have to
   carry all entity apparatus for every memory row.

3. **Different provenance semantics.** A preference memory is
   provenanced by "the user told me" — one author, one time.
   An entity like a `Requirement` is provenanced by "this source
   chunk + this source document + these supporting Results" — a
   graph. The tables want to be different because their provenance
   models are different.

So: two tables, one review queue, one promotion flow, one trust
hierarchy.

## The shared review queue

Both the memory extractor (Phase 9 Commit C, already shipped) and
the future entity extractor write into the same conceptual queue:
everything lands at `status=candidate` in its own table, and the
human reviewer sees a unified list. The reviewer UI (future work)
shows candidates of all kinds side by side, grouped by source
interaction / source document, with the rule that fired.

From the data side this means:

- the memories table gets a `candidate` status (**already done in
  Phase 9 Commit B/C**)
- the future entities table will get the same `candidate` status
- both tables get the same `promote` / `reject` API shape: one verb
  per candidate, with an audit log entry

Implementation note: the API routes should evolve from
`POST /memory/{id}/promote` to `POST /candidates/{id}/promote` once
both tables exist, so the reviewer tooling can treat them
uniformly. The current memory-only route stays in place for
backward compatibility and is aliased by the unified route.
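
The "one verb per candidate, both tables" shape can be sketched as a single dispatch function with the legacy memory route aliasing into it. Everything here is hypothetical scaffolding (dict-backed tables, audit-log shape), not the shipped implementation:

```python
# Stand-ins for the real memories / entities tables: id -> row.
TABLES: dict[str, dict] = {"memory": {}, "entity": {}}
AUDIT_LOG: list[dict] = []


def promote_candidate(kind: str, candidate_id: str) -> dict:
    """Unified promote verb: works for memory and entity candidates
    alike, and appends an audit log entry."""
    row = TABLES[kind].get(candidate_id)
    if row is None or row["status"] != "candidate":
        raise LookupError(f"no {kind} candidate {candidate_id}")
    row["status"] = "active"
    AUDIT_LOG.append({"verb": "promote", "kind": kind, "id": candidate_id})
    return row


def promote_memory(candidate_id: str) -> dict:
    """Legacy POST /memory/{id}/promote behavior, aliased onto the
    unified verb so reviewer tooling stays uniform."""
    return promote_candidate("memory", candidate_id)
```

A matching `reject_candidate` would flip the status the other way with the same audit entry shape.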

## Memory-to-entity graduation

Even though the split is clean on paper, real usage will reveal
memories that deserve to be entities but started as plain text.
Four signals are good candidates for proposing graduation:

1. **Reference count crosses a threshold.** A memory that has been
   reinforced 5+ times across multiple interactions is a strong
   signal that it deserves structure.

2. **Memory content matches a known entity template.** If a
   `knowledge` memory's content matches the shape "X = value [unit]"
   it can be proposed as a `Fact` or `Parameter` entity.

3. **A user explicitly asks for promotion.** `POST /memory/{id}/graduate`
   is the simplest explicit path — it returns a proposal for an
   entity structured from the memory's content, which the user can
   accept or reject.

4. **Extraction pass proposes an entity that happens to match an
   existing memory.** The entity extractor, when scanning a new
   interaction, sees the same content already exists as a memory
   and proposes graduation as part of its candidate output.

The graduation flow is:

```
memory row (active, confidence C)
        |
        | propose_graduation()
        v
entity candidate row (candidate, confidence C)
        +
memory row gets status="graduated" and a forward pointer to the
entity candidate
        |
        | human promotes the candidate entity
        v
entity row (active)
        +
memory row stays "graduated" permanently (historical record)
```

The memory is never deleted. It becomes a frozen historical
pointer to the entity it became. This keeps the audit trail intact
and lets the Human Mirror show "this decision started life as a
memory on April 2, was graduated to an entity on April 15, now has
2 supporting ValidationClaims".

The `graduated` status is a new memory status that gets added when
the graduation flow is implemented. For now (Phase 9), the status
is unused: the three non-graduating types (identity/preference/episodic)
will never receive it, and the three graduating types stay in their
current memory-only state until the engineering layer ships.
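
The flow above, reduced to code: a minimal sketch with dict-shaped rows, where `propose_graduation` and `promote_entity` are hypothetical names rather than shipped API.

```python
def propose_graduation(memory: dict) -> dict:
    """Freeze the memory as a historical record and return an
    entity candidate carrying the same confidence, with a forward
    pointer from the memory to the candidate."""
    candidate = {
        "status": "candidate",
        "confidence": memory["confidence"],
        "content": memory["content"],
        "source_memory_id": memory["id"],  # back pointer for the audit trail
    }
    memory["status"] = "graduated"      # the memory is never deleted
    memory["graduated_to"] = candidate  # forward pointer
    return candidate


def promote_entity(candidate: dict) -> dict:
    """Human approval step: candidate entity becomes active."""
    candidate["status"] = "active"
    return candidate
```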

## Context pack assembly after the split

The context builder today (`src/atocore/context/builder.py`) pulls:

1. Trusted Project State
2. Identity + Preference memories
3. Retrieved chunks

After the split, it pulls:

1. Trusted Project State (unchanged)
2. **Identity + Preference memories** (unchanged — these stay memories)
3. **Engineering-layer facts relevant to the prompt**, queried through
   the entity API (new)
4. Retrieved chunks (unchanged, lowest trust)

Note the ordering: identity/preference memories stay above entities,
because personal style information is always more trusted than
extracted engineering facts. Entities sit below the personal layer
but above raw retrieval, because they have structured provenance
that raw chunks lack.

The budget allocation gains a new slot:

- trusted project state: 20% (unchanged, highest trust)
- identity memories: 5% (unchanged)
- preference memories: 5% (unchanged)
- **engineering entities: 15%** (new — pulls only V1-required
  objects relevant to the prompt)
- retrieval: 55% (reduced from 70% to make room)

These are starting numbers. After the engineering layer ships and
real usage tunes retrieval quality, these will be revisited.
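
The slot percentages above can be captured as a small allocation helper, which also makes the "shares must sum to 100%" invariant checkable. A sketch only; the real builder may divide its budget differently.

```python
# Starting shares from the list above; "engineering_entities" is the
# new slot, carved out of the old 70% retrieval share.
BUDGET_SLOTS = {
    "trusted_project_state": 0.20,
    "identity_memories": 0.05,
    "preference_memories": 0.05,
    "engineering_entities": 0.15,
    "retrieval": 0.55,
}


def allocate(total_tokens: int) -> dict[str, int]:
    """Split a token budget across the context-pack slots."""
    assert abs(sum(BUDGET_SLOTS.values()) - 1.0) < 1e-9
    return {slot: int(total_tokens * share)
            for slot, share in BUDGET_SLOTS.items()}
```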

## What the shipped memory types still mean after the split

| Memory type | Still accepts new writes? | V1 destination for new extractions |
|---|---|---|
| identity | **yes** | memory (no change) |
| preference | **yes** | memory (no change) |
| episodic | **yes** | memory (no change) |
| knowledge | yes, but only for loose facts | entity (Fact / Parameter) for structured things; memory is a fallback |
| project | **no new writes after engineering V1 ships** | entity (Requirement / Constraint / Subsystem attribute) |
| adaptation | **no new writes after engineering V1 ships** | entity (Decision) |

"No new writes" means the `create_memory` path will refuse to
create new `project` or `adaptation` memories once the engineering
layer V1 ships. Existing rows stay queryable and reinforceable but
new facts of those kinds must become entities. This keeps the
canonical-home rule clean going forward.

The deprecation is deferred: it does not happen until the engineering
layer V1 is demonstrably working against the active project set. Until
then, the existing memory types continue to accept writes so the
Phase 9 loop can be exercised without waiting on the engineering
layer.

## Consequences for Phase 9 (what we just built)

The capture loop, reinforcement, and extractor we shipped today
are *memory-facing*. They produce memory candidates, reinforce
memory confidence, and respect the memory status lifecycle. None
of that changes.

When the engineering layer V1 ships, the extractor in
`src/atocore/memory/extractor.py` gets a sibling in
`src/atocore/entities/extractor.py` that uses the same
interaction-scanning approach but produces entity candidates
instead. The `POST /interactions/{id}/extract` endpoint either:

- runs both extractors and returns a combined result, or
- gains a `?target=memory|entities|both` query parameter

and the decision between those two shapes can wait until the
entity extractor actually exists.

Until the entity layer is real, the memory extractor also has to
cover some things that will eventually move to entities (decisions,
constraints, requirements). **That overlap is temporary and
intentional.** Rather than leave those cues unextracted for months
while the entity layer is being built, the memory extractor
surfaces them as memory candidates. Later, a migration pass will
propose graduation on every active memory created by
`decision_heading`, `constraint_heading`, and `requirement_heading`
rules once the entity types exist to receive them.

So: **no rework in Phase 9, no wasted extraction, clean handoff
once the entity layer lands**.

## Open questions this document does NOT answer

These are deliberately deferred to later planning docs:

1. **When exactly does extraction fire?** (answered by
   `promotion-rules.md`)
2. **How are conflicts between a memory and an entity handled
   during graduation?** (answered by `conflict-model.md`)
3. **Does the context builder traverse the entity graph for
   relationship-rich queries, or does it only surface direct facts?**
   (answered by the context-builder spec in a future
   `engineering-context-integration.md` doc)
4. **What is the exact API shape of the unified candidate review
   queue?** (answered by a future `review-queue-api.md` doc when
   the entity extractor exists and both tables need one UI)

## TL;DR

- memories = user-facing unstructured facts, still own identity/preference/episodic
- entities = engineering-facing typed facts, own project/knowledge/adaptation
- one canonical home per concept, never both
- one shared candidate-review queue, same promote/reject shape
- graduated memories stay as frozen historical pointers
- Phase 9 stays memory-only and ships today; entity V1 follows the
  remaining architecture docs in this planning sprint
- no rework required when the entity layer lands; the current memory
  extractor's structural cues get migrated forward via explicit
  graduation

462
docs/architecture/project-identity-canonicalization.md
Normal file
@@ -0,0 +1,462 @@

# Project Identity Canonicalization

## Why this document exists

AtoCore identifies projects by name in many places: trusted state
rows, memories, captured interactions, query/context API parameters,
extractor candidates, future engineering entities. Without an
explicit rule, every callsite would have to remember to canonicalize
project names through the registry — and the recent codex review
caught exactly the bug class that follows when one of them forgets.

The fix landed in `fb6298a` and works correctly today. This document
exists to make the rule **explicit and discoverable** so the
engineering layer V1 implementation, future entity write paths, and
any new agent integration don't reintroduce the same fragmentation
when nobody is looking.

## The contract

> **Every read/write that takes a project name MUST canonicalize it
> through `resolve_project_name()` before the value crosses a service
> boundary.**

The boundary is wherever a project name becomes a database row, a
query filter, an attribute on a stored object, or a key for any
lookup. The canonicalization happens **once**, at that boundary,
before the underlying storage primitive is called.

Symbolically:

```
HTTP layer (raw user input)
        ↓
service entry point
        ↓
project_name = resolve_project_name(project_name)   ← ONLY canonical from this point
        ↓
storage / queries / further service calls
```

The rule is intentionally simple. There's no per-call exception,
no "trust me, the caller already canonicalized it" shortcut, no
opt-out flag. Every service-layer entry point applies the helper
the moment it receives a project name from outside the service.

## The helper

```python
# src/atocore/projects/registry.py

def resolve_project_name(name: str | None) -> str:
    """Canonicalize a project name through the registry.

    Returns the canonical project_id if the input matches any
    registered project's id or alias. Returns the input unchanged
    when it's empty or not in the registry — the second case keeps
    backwards compatibility with hand-curated state, memories, and
    interactions that predate the registry, or for projects that
    are intentionally not registered.
    """
    if not name:
        return name or ""
    project = get_registered_project(name)
    if project is not None:
        return project.project_id
    return name
```

Three behaviors worth keeping in mind:

1. **Empty / None input → empty string output.** Callers don't have
   to pre-check; passing `""` or `None` to a query filter still
   works as "no project scope".
2. **Registered alias → canonical project_id.** The helper does the
   case-insensitive lookup and returns the project's `id` field
   (e.g. `"p05" → "p05-interferometer"`).
3. **Unregistered name → input unchanged.** This is the
   backwards-compatibility path. Hand-curated state, memories, or
   interactions created under a name that isn't in the registry
   keep working. The retrieval is then "best effort" — the raw
   string is used as the SQL key, which still finds the row that
   was stored under the same raw string. This path exists so the
   engineering layer V1 doesn't have to also be a data migration.

## Where the helper is currently called

As of `fb6298a`, the helper is invoked at exactly these eight
service-layer entry points:

| Module | Function | What gets canonicalized |
|---|---|---|
| `src/atocore/context/builder.py` | `build_context` | the `project_hint` parameter, before the trusted state lookup |
| `src/atocore/context/project_state.py` | `set_state` | `project_name`, before `ensure_project()` |
| `src/atocore/context/project_state.py` | `get_state` | `project_name`, before the SQL lookup |
| `src/atocore/context/project_state.py` | `invalidate_state` | `project_name`, before the SQL lookup |
| `src/atocore/interactions/service.py` | `record_interaction` | `project`, before insert |
| `src/atocore/interactions/service.py` | `list_interactions` | `project` filter parameter, before WHERE clause |
| `src/atocore/memory/service.py` | `create_memory` | `project`, before insert |
| `src/atocore/memory/service.py` | `get_memories` | `project` filter parameter, before WHERE clause |

Every one of those is the **first** thing the function does after
input validation. There is no path through any of those eight
functions where a project name reaches storage without passing
through `resolve_project_name`.

## Where the helper is NOT called (and why that's correct)

These places intentionally do not canonicalize:

1. **`update_memory`'s project field.** The API does not allow
   changing a memory's project after creation, so there's no
   project to canonicalize. The function only updates `content`,
   `confidence`, and `status`.
2. **The retriever's `_project_match_boost` substring matcher.** It
   already calls `get_registered_project` internally to expand the
   hint into the candidate set (canonical id + all aliases + last
   path segments). It accepts the raw hint by design.
3. **`_rank_chunks`'s secondary substring boost in
   `builder.py`.** It still uses the raw hint. This is a
   multiplicative factor on top of correct retrieval, not a filter,
   so it cannot drop relevant chunks. Tracked as a future cleanup
   but not critical.
4. **Direct SQL queries for the projects table itself** (e.g.
   `ensure_project`'s lookup). These are intentional case-insensitive
   raw lookups against the column the canonical id is stored in.
   `set_state` already canonicalized before reaching `ensure_project`,
   so the value passed is the canonical id by definition.
5. **Hand-authored project names that aren't in the registry.**
   The helper returns those unchanged. This is the backwards-compat
   path mentioned above; it is *not* a violation of the rule, it's
   the rule applied to a name with no registry record.

## Why this is the trust hierarchy in action

The whole point of AtoCore is the trust hierarchy from the operating
model:

1. Trusted Project State (Layer 3) is the most authoritative layer
2. Memories (active) are second
3. Source chunks (raw retrieved content) are last

If a caller passes the alias `p05` and Layer 3 was written under
`p05-interferometer`, and the lookup fails to find the canonical
row, **the trust hierarchy collapses**. The most-authoritative
layer is silently invisible to the caller. The system would still
return *something* — namely, lower-trust retrieved chunks — and the
human would never know they got a degraded answer.

The canonicalization helper is what makes the trust hierarchy
**dependable**. Layer 3 is supposed to win every time. To win it
has to be findable. To be findable, the lookup key has to match
how the row was stored. And the only way to guarantee that match
across every entry point is to canonicalize at every boundary.

## Compatibility gap: legacy alias-keyed rows

The canonicalization rule fixes new writes going forward, but it
does NOT fix rows that were already written under a registered
alias before `fb6298a` landed. Those rows have a real, concrete
gap that must be closed by a one-time migration before the
engineering layer V1 ships.

The exact failure mode:

```
time T0 (before fb6298a):
  POST /project/state {project: "p05", ...}
    -> set_state("p05", ...)                 # no canonicalization
    -> ensure_project("p05")                 # creates a "p05" row
    -> writes state with project_id pointing at the "p05" row

time T1 (after fb6298a):
  POST /project/state {project: "p05", ...}  (or any read)
    -> set_state("p05", ...)
    -> resolve_project_name("p05") -> "p05-interferometer"
    -> ensure_project("p05-interferometer")  # creates a SECOND row
    -> writes new state under the canonical row
    -> the T0 state is still in the "p05" row, INVISIBLE to every
       canonicalized read
```

The unregistered-name fallback path saves you when the project was
never in the registry: a row stored under `"orphan-project"` is read
back via `"orphan-project"`, both pass through `resolve_project_name`
unchanged, and the strings line up. **It does not save you when the
name is a registered alias** — the helper rewrites the read key but
not the storage key, and the legacy row becomes invisible.

What is at risk on the live Dalidou DB:

1. **`projects` table**: any rows whose `name` column matches a
   registered alias (one row per alias actually written under
   before the fix landed). These shadow the canonical project row
   and silently fragment the projects namespace.
2. **`project_state` table**: any rows whose `project_id` points
   at one of those shadow project rows. **This is the highest-risk
   case** because it directly defeats the trust hierarchy: Layer 3
   trusted state becomes invisible to every canonicalized lookup.
3. **`memories` table**: any rows whose `project` column is a
   registered alias. Reinforcement and extraction queries will
   miss them.
4. **`interactions` table**: any rows whose `project` column is a
   registered alias. Listing and downstream reflection will miss
   them.
How to find out the actual blast radius on the live Dalidou DB:
|
||||
|
||||
```sql
|
||||
-- inspect the projects table for alias-shadow rows
|
||||
SELECT id, name FROM projects;
|
||||
|
||||
-- count alias-keyed memories per known alias
|
||||
SELECT project, COUNT(*) FROM memories
|
||||
WHERE project IN ('p04','p05','p06','gigabit','interferometer','polisher','ato core')
|
||||
GROUP BY project;
|
||||
|
||||
-- count alias-keyed interactions
|
||||
SELECT project, COUNT(*) FROM interactions
|
||||
WHERE project IN ('p04','p05','p06','gigabit','interferometer','polisher','ato core')
|
||||
GROUP BY project;
|
||||
|
||||
-- count alias-shadowed project_state rows by project name
|
||||
SELECT p.name, COUNT(*) FROM project_state ps
|
||||
JOIN projects p ON ps.project_id = p.id
|
||||
WHERE p.name IN ('p04','p05','p06','gigabit','interferometer','polisher','ato core');
|
||||
```

The migration that closes the gap has to:

1. For each registered project, find all `projects` rows whose
   name matches one of the project's aliases AND is not the
   canonical id itself. These are the "shadow" rows.
2. For each shadow row, MERGE its dependent state into the
   canonical project's row:
   - rekey `project_state.project_id` from shadow → canonical
   - if the merge would create a `(project_id, category, key)`
     collision (a state row already exists under the canonical
     id with the same category+key), the migration must surface
     the conflict via the existing conflict model and pause
     until the human resolves it
   - delete the now-empty shadow `projects` row
3. For `memories` and `interactions`, the fix is simpler because
   the alias appears as a string column (not a foreign key):
   `UPDATE memories SET project = canonical WHERE project = alias`,
   then the same for `interactions`.
4. The migration must run in dry-run mode first, printing the
   exact rows it would touch and the canonical destinations they
   would be merged into.
5. The migration must be idempotent — running it twice produces
   the same final state as running it once.
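
Steps 3–5 can be sketched for the `memories` table. This is a hypothetical sketch, not the real migration script (which does not exist yet); it assumes a `sqlite3` connection and a `{canonical_id: [aliases]}` mapping that the real script would load via the registry:

```python
import sqlite3

def rekey_memories(conn: sqlite3.Connection,
                   registry: dict[str, list[str]],
                   dry_run: bool = True) -> list[tuple[str, str, int]]:
    """Return the planned (alias, canonical, row_count) rekeys.

    With dry_run=True nothing is written (step 4); the UPDATE in the
    real pass is idempotent (step 5) because once rekeyed, the alias
    matches zero rows and a second run is a no-op.
    """
    planned = []
    for canonical, aliases in registry.items():
        for alias in aliases:
            if alias == canonical:
                continue
            (n,) = conn.execute(
                "SELECT COUNT(*) FROM memories WHERE project = ?", (alias,)
            ).fetchone()
            if n:
                planned.append((alias, canonical, n))
    if not dry_run:
        for alias, canonical, _ in planned:
            conn.execute(
                "UPDATE memories SET project = ? WHERE project = ?",
                (canonical, alias),
            )
        conn.commit()
    return planned
```

The `project_state` merge is the hard part the sketch omits: it is a foreign-key rekey with conflict-model handling, not a string `UPDATE`.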

This work is **required before the engineering layer V1 ships**
because V1 will add new `entities`, `relationships`, `conflicts`,
and `mirror_regeneration_failures` tables that all key on the
canonical project id. Any leaked alias-keyed rows in the existing
tables would show up in V1 reads as silently missing data, and
the killer-correctness queries from `engineering-query-catalog.md`
(orphan requirements, decisions on flagged assumptions,
unsupported claims) would report wrong results against any project
that has shadow rows.

The migration script does NOT exist yet. The open follow-ups
section below tracks it as the next concrete step.

## The rule for new entry points

When you add a new service-layer function that takes a project name,
follow this checklist:

1. **Does the function read or write a row keyed by project?** If
   yes, you must call `resolve_project_name`. If no (e.g. it only
   takes `project` as a label for logging), you may skip the
   canonicalization, but add a comment explaining why.
2. **Where does the canonicalization go?** As the first statement
   after input validation. Not later, not "before storage", not
   "in the helper that does the actual write". As the first
   statement, so every subsequent service call inside the function
   sees the canonical value.
3. **Add a regression test that uses an alias.** Use the
   `project_registry` fixture from `tests/conftest.py` to set up
   a temp registry with at least one project + aliases, then
   verify the new function works both when called with the alias
   and when called with the canonical id.
4. **If the function can be called with `None` or an empty string,
   verify that path too.** The helper handles it correctly, but
   the function under test might not.

## How the `project_registry` test fixture works

`tests/conftest.py::project_registry` returns a callable that
takes one or more `(project_id, [aliases])` tuples (or just a bare
`project_id` string), writes them into a temp registry file,
points `ATOCORE_PROJECT_REGISTRY_PATH` at it, and reloads
`config.settings`. Use it like:

```python
def test_my_new_thing_canonicalizes(project_registry):
    project_registry(("p05-interferometer", ["p05", "interferometer"]))

    # ... call your service function with "p05" ...
    # ... assert it works the same as if you'd passed "p05-interferometer" ...
```

The fixture is reused by all 12 alias-canonicalization regression
tests added in `fb6298a`. Following the same pattern for new
features is the cheapest way to keep the contract intact.

## What this rule does NOT cover

1. **Alias creation / management.** This document is about reading
   and writing project-keyed data. Adding new projects or new
   aliases is the registry's own write path
   (`POST /projects/register`, `PUT /projects/{name}`), which
   already enforces collision detection and atomic file writes.
2. **Registry hot-reloading.** The helper calls
   `load_project_registry()` on every invocation, which reads the
   JSON file each time. There is no in-process cache. If the
   registry file changes, the next call sees the new contents.
   Performance is fine at the current registry size, but if it
   becomes a bottleneck, add a versioned cache here, not at every
   call site.
3. **Cross-project deduplication.** If two different projects in
   the registry happen to share an alias, the registry's collision
   detection blocks the second one at registration time, so this
   case can't arise in practice. The helper does not handle it
   defensively.
4. **Time-bounded canonicalization.** A project's canonical id is
   stable. Aliases can be added or removed via
   `PUT /projects/{name}`, but the canonical `id` field never
   changes after registration. So a row written today under the
   canonical id will always remain findable under that id, even
   if the alias set evolves.
5. **Migration of legacy data.** If the live Dalidou DB has rows
   that were written under aliases before the canonicalization
   landed (e.g. a `memories` row with `project = "p05"` from
   before `fb6298a`), those rows are **NOT** automatically
   reachable from the canonicalized read path. The unregistered-
   name fallback only helps for project names that were never
   registered at all; it does **NOT** help for names that are
   registered as aliases. See the "Compatibility gap" section
   above for the exact failure mode and the migration path that
   has to run before the engineering layer V1 ships.

## What this enables for the engineering layer V1

When the engineering layer ships per `engineering-v1-acceptance.md`,
it adds at least these new project-keyed surfaces:

- `entities` table with a `project_id` column
- `relationships` table that joins entities, indirectly project-keyed
- `conflicts` table with a `project` column
- `mirror_regeneration_failures` table with a `project` column
- new endpoints: `POST /entities/...`, `POST /ingest/kb-cad/export`,
  `POST /ingest/kb-fem/export`, `GET /mirror/{project}/...`,
  `GET /conflicts?project=...`

**Every one of those write/read paths needs to call
`resolve_project_name` at its service-layer entry point**, following
the same pattern as the eight existing call sites listed above. The
implementation sprint should:

1. Apply the helper at each new service entry point as the first
   statement after input validation
2. Add a regression test using the `project_registry` fixture that
   exercises an alias against each new entry point
3. Treat any new service function that takes a project name without
   calling `resolve_project_name` as a code review failure

The pattern is simple enough to follow without thinking, which is
exactly the property we want for a contract that has to hold
across many independent additions.

## Open follow-ups

These are the rough edges the canonicalization story still has
open. Only the first is a blocker, and only for the engineering
layer V1; the rest are things to be aware of.

1. **Legacy alias data migration — REQUIRED before engineering V1
   ships, NOT optional.** If the live Dalidou DB has any rows
   written under aliases before `fb6298a` landed, they are
   silently invisible to the canonicalized read path (see the
   "Compatibility gap" section above for the exact failure mode).
   This is a real correctness issue, not a theoretical one: any
   trusted state, memory, or interaction stored under `p05`,
   `gigabit`, `polisher`, etc. before the fix landed is currently
   unreachable from any service-layer query. The migration script
   has to walk `projects`, `project_state`, `memories`, and
   `interactions`, merge shadow rows into their canonical
   counterparts (with conflict-model handling for any collisions),
   and run in dry-run mode first. Estimated cost: ~150 LOC for
   the migration script + ~50 LOC of tests + a one-time supervised
   run on the live Dalidou DB. **This migration is the next
   concrete pre-V1 step.**
2. **Registry file caching.** `load_project_registry()` reads the
   JSON file on every `resolve_project_name` call. With ~5
   projects this is fine; with 50+ it would warrant a versioned
   cache (cache key = file mtime + size). Defer until measured.
3. **Case sensitivity audit.** The helper uses
   `get_registered_project`, which lowercases for comparison. The
   stored canonical id keeps its original casing. No bug today
   because every test passes, but worth re-confirming when the
   engineering layer adds entity-side storage.
4. **`_rank_chunks`'s secondary substring boost.** Mentioned
   earlier; it still uses the raw hint. Replace it with the same
   helper-driven approach the retriever uses, OR delete it as
   redundant once we confirm the retriever's primary boost is
   sufficient.
5. **Documentation discoverability.** This doc lives under
   `docs/architecture/`. The contract is also restated in the
   docstring of `resolve_project_name` and referenced from each
   call site's comment. That redundancy is intentional — the
   contract is too easy to forget to live in only one place.

## Quick reference card

Copy-pasteable for new service functions:

```python
from atocore.projects.registry import resolve_project_name


def my_new_service_entry_point(
    project_name: str,
    other_args: ...,
) -> ...:
    # Validate inputs first
    if not project_name:
        raise ValueError("project_name is required")

    # Canonicalize through the registry as the first thing after
    # validation. Every subsequent operation in this function uses
    # the canonical id, so storage and queries are guaranteed
    # consistent across alias and canonical-id callers.
    project_name = resolve_project_name(project_name)

    # ... rest of the function ...
```

## TL;DR

- One helper, one rule: `resolve_project_name` at every service-layer
  entry point that takes a project name
- Currently called in 8 places across builder, project_state,
  interactions, and memory; all 8 are listed in this doc
- The backwards-compat path returns **unregistered** names unchanged
  (e.g. `"orphan-project"`); this does NOT cover **registered
  alias** names that were used as storage keys before `fb6298a`
- **Real compatibility gap**: any row whose `project` column is a
  registered alias from before the canonicalization landed is
  silently invisible to the new read path. A one-time migration
  is required before engineering V1 ships. See the "Compatibility
  gap" section.
- The trust hierarchy depends on this helper being applied
  everywhere — Layer 3 trusted state has to be findable for it to
  win the trust battle
- Use the `project_registry` test fixture to add regression tests
  for any new service function that takes a project name
- The engineering layer V1 implementation must follow the same
  pattern at every new service entry point
- Open follow-ups (in priority order): **legacy alias data
  migration (required pre-V1)**, redundant substring boost
  cleanup, registry caching when projects scale

---

**docs/architecture/promotion-rules.md** (new file, 343 lines)

# Promotion Rules (Layer 0 → Layer 2 pipeline)

## Purpose

AtoCore ingests raw human-authored content (markdown, repo notes,
interaction transcripts) and eventually must turn some of it into
typed engineering entities that the V1 query catalog can answer.
The path from raw text to typed entity has to be:

- **explicit**: every step has a named operation, a trigger, and an
  audit log
- **reversible**: every promotion can be undone without data loss
- **conservative**: no automatic movement into trusted state; a human
  (or later, a very confident policy) always signs off
- **traceable**: every typed entity must carry a back-pointer to
  the raw source that produced it

This document defines that path.

## The four layers

Promotion is described in terms of four layers (L1 and L2 each
split into candidate and active sub-states), all of which exist
simultaneously in the system once the engineering layer V1 ships:

| Layer | Name | Canonical storage | Trust | Who writes |
|-------|-------------------|------------------------------------------|-------|------------|
| L0 | Raw source | source_documents + source_chunks | low | ingestion pipeline |
| L1 | Memory candidate | memories (status="candidate") | low | extractor |
| L1' | Active memory | memories (status="active") | med | human promotion |
| L2 | Entity candidate | entities (status="candidate") | low | extractor + graduation |
| L2' | Active entity | entities (status="active") | high | human promotion |
| L3 | Trusted state | project_state | highest | human curation |

Layer 3 (trusted project state) is already implemented and stays
manually curated — automatic promotion into L3 is **never** allowed.

## The promotion graph

```
[L0] source chunks
  |
  | extraction (memory extractor, Phase 9 Commit C)
  v
[L1] memory candidate
  |
  | promote_memory()
  v
[L1'] active memory
  |
  | (optional) propose_graduation()
  v
[L2] entity candidate
  |
  | promote_entity()
  v
[L2'] active entity
  |
  | (manual curation, NEVER automatic)
  v
[L3] trusted project state
```

Short path (direct entity extraction, once the entity extractor
exists):

```
[L0] source chunks
  |
  | entity extractor
  v
[L2] entity candidate
  |
  | promote_entity()
  v
[L2'] active entity
```

A single fact can travel either path depending on what the
extractor saw. The graduation path exists for facts that started
life as memories before the entity layer existed, and for the
memory extractor's structural cues (decisions, constraints,
requirements), which are eventually entity-shaped.

## Triggers (when does extraction fire?)

Phase 9 already shipped one trigger: **on explicit API request**
(`POST /interactions/{id}/extract`). The V1 engineering layer adds
two more:

1. **On interaction capture (automatic)**
   - Same event that runs reinforcement today
   - Controlled by an `extract` boolean flag on the record request
     (default: `false` for the memory extractor, `true` once an
     engineering extractor exists and has been validated)
   - Output goes to the candidate queue; nothing auto-promotes

2. **On ingestion (batched, per wave)**
   - After a wave of markdown ingestion finishes, a batch extractor
     pass sweeps all newly-added source chunks and produces
     candidates from them
   - Batched per wave (not per chunk) to keep the review queue
     digestible and to let the reviewer see all candidates from a
     single ingestion in one place
   - Output: a report artifact plus a review queue entry per
     candidate

3. **On explicit human request (existing)**
   - `POST /interactions/{id}/extract` for a single interaction
   - Future: `POST /ingestion/wave/{id}/extract` for a whole wave
   - Future: `POST /memory/{id}/graduate` to propose graduation
     of one specific memory into an entity

Batch size rule: **extraction passes never write more than N
candidates per human review cycle, where N = 50 by default**. If
a pass produces more, it ranks by (rule confidence × content
length × novelty) and only writes the top N. The remaining
candidates are logged, not persisted. This protects the reviewer
from getting buried.
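
The cap rule above can be sketched as a ranking over the product (rule confidence × content length × novelty). The `Candidate` shape and the novelty score are illustrative stand-ins, not the extractor's real data model:

```python
from typing import NamedTuple

BATCH_CAP = 50  # N = 50 by default

class Candidate(NamedTuple):
    content: str
    rule_confidence: float
    novelty: float  # 0..1, e.g. 1 - similarity to existing rows

def apply_batch_cap(candidates: list[Candidate],
                    cap: int = BATCH_CAP) -> tuple[list[Candidate], list[Candidate]]:
    """Return (persisted, dropped) for one review cycle.

    The dropped tail is what the real pass would log but not persist.
    """
    ranked = sorted(
        candidates,
        key=lambda c: c.rule_confidence * len(c.content) * c.novelty,
        reverse=True,
    )
    return ranked[:cap], ranked[cap:]
```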

## Confidence and ranking of candidates

Each rule-based extraction rule carries a *prior confidence*
based on how specific its pattern is:

| Rule class | Prior | Rationale |
|---------------------------|-------|-----------|
| Heading with explicit type (`## Decision:`) | 0.7 | Very specific structural cue, intentional author marker |
| Typed list item (`- [Decision] ...`) | 0.65 | Explicit but often embedded in looser prose |
| Sentence pattern (`I prefer X`) | 0.5 | Moderate structure, more false positives |
| Regex pattern matching a value+unit (`X = 4.8 kg`) | 0.6 | Structural but prone to coincidence |
| LLM-based (future) | variable | Depends on the model's returned confidence |

The candidate's final confidence at write time is:

```
final = prior * structural_signal_multiplier * freshness_bonus
```

Where:

- `structural_signal_multiplier` is 1.1 if the source chunk path
  contains any of `_HIGH_SIGNAL_HINTS` from the retriever (status,
  decision, requirements, charter, ...) and 0.9 if it contains
  `_LOW_SIGNAL_HINTS` (`_archive`, `_history`, ...)
- `freshness_bonus` is 1.05 if the source chunk was updated in the
  last 30 days, else 1.0

This formula will be tuned later; the numbers are starting values.
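
The write-time formula can be sketched directly. The hint tuples here are abbreviated stand-ins for the retriever's `_HIGH_SIGNAL_HINTS` / `_LOW_SIGNAL_HINTS`; the real extractor would import them rather than redefine them:

```python
HIGH_SIGNAL_HINTS = ("status", "decision", "requirements", "charter")
LOW_SIGNAL_HINTS = ("_archive", "_history")

def final_confidence(prior: float, chunk_path: str,
                     days_since_update: int) -> float:
    # structural_signal_multiplier: 1.1 for high-signal paths,
    # 0.9 for low-signal paths, 1.0 otherwise
    multiplier = 1.0
    if any(h in chunk_path for h in HIGH_SIGNAL_HINTS):
        multiplier = 1.1
    elif any(h in chunk_path for h in LOW_SIGNAL_HINTS):
        multiplier = 0.9
    # freshness_bonus: 1.05 if updated in the last 30 days
    freshness = 1.05 if days_since_update <= 30 else 1.0
    return prior * multiplier * freshness
```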

## Review queue mechanics

### Queue population

- Each candidate writes one row into its target table
  (memories or entities) with `status="candidate"`
- Each candidate carries: `rule`, `source_span`, `source_chunk_id`,
  `source_interaction_id`, `extractor_version`
- No two candidates ever share the same (type, normalized_content,
  project) — if a second extraction pass produces a duplicate, it
  is dropped before being written

### Queue surfacing

- `GET /memory?status=candidate` lists memory candidates
- `GET /entities?status=candidate` (future) lists entity candidates
- `GET /candidates` (future unified route) lists both

### Reviewer actions

For each candidate, exactly one of:

- **promote**: `POST /memory/{id}/promote` or
  `POST /entities/{id}/promote`
  - sets `status="active"`
  - preserves the audit trail (source_chunk_id, rule, source_span)
- **reject**: `POST /memory/{id}/reject` or
  `POST /entities/{id}/reject`
  - sets `status="invalid"`
  - preserves the audit trail so repeat extractions don't re-propose
- **edit-then-promote**: `PUT /memory/{id}` to adjust content, then
  `POST /memory/{id}/promote`
  - every edit is logged, with the original content preserved in a
    `previous_content_log` column (schema addition deferred to
    the first implementation sprint)
- **defer**: no action; the candidate stays in the queue indefinitely
  (future: add a `pending_since` staleness indicator to the UI)

### Reviewer authentication

In V1 the review queue is single-user by convention. There is no
per-reviewer authorization. Every promote/reject call is logged
with the same default identity. Multi-user review is a V2 concern.

## Auto-promotion policies (deferred, but designed for)

The current V1 stance is: **no auto-promotion, ever**. All
promotions require a human reviewer.

The schema and API are designed so that automatic policies can be
added later without schema changes. The anticipated policies:

1. **Reference-count threshold**
   - If a candidate accumulates N+ references across multiple
     interactions within M days AND the reviewer hasn't seen it yet
     (indicating the system sees it often but the human hasn't
     gotten to it), propose auto-promotion
   - Starting thresholds: N=5, M=7 days. Never auto-promote
     entity candidates that affect validation claims or decisions
     without explicit human review — those are too consequential.

2. **Confidence threshold**
   - If `final_confidence >= 0.85` AND the rule is a heading
     rule (not a sentence rule), the candidate is eligible for
     auto-promotion

3. **Identity/preference lane**
   - Identity and preference memories extracted from an
     interaction where the user explicitly says "I am X" or
     "I prefer X", with a first-person subject and a high-signal
     verb, could auto-promote. This is the safest lane because
     the user is the authoritative source for their own identity.

None of these run in V1. The APIs and data shape are designed so
they can be added as a separate policy module without disrupting
existing tests.

## Reversibility

Every promotion step must be undoable:

| Operation | How to undo |
|---------------------------|-------------------------------------------------------|
| memory candidate written | delete the candidate row (low-risk, it was never in context) |
| memory candidate promoted | `PUT /memory/{id}` status=candidate (reverts to queue) |
| memory candidate rejected | `PUT /memory/{id}` status=candidate |
| memory graduated | memory stays as a frozen pointer; delete the entity candidate to undo |
| entity candidate promoted | `PUT /entities/{id}` status=candidate |
| entity promoted to active | supersede with a new active entity, or `PUT` back to candidate |

The only irreversible operation is manual curation into L3
(trusted project state). That is by design — L3 is small, curated,
and human-authored end to end.

## Provenance (what every candidate must carry)

Every candidate row, memory or entity, MUST have:

- `source_chunk_id` — if extracted from ingested content, the chunk it came from
- `source_interaction_id` — if extracted from a captured interaction, the interaction it came from
- `rule` — the extractor rule id that fired
- `extractor_version` — a semver-ish string the extractor module carries,
  so old candidates can be re-evaluated with a newer extractor

If both `source_chunk_id` and `source_interaction_id` are null, the
candidate was hand-authored (via `POST /memory` directly) and must
be flagged as such. Hand-authored candidates are allowed but
discouraged — the preference is to extract from real content, not
to dictate candidates directly.

The active rows inherit all of these fields from their candidate
row at promotion time. They are never overwritten.
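
The required fields can be pictured as a small record type. This dataclass is illustrative, not the real table or ORM model; only the field names come from the list above:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CandidateProvenance:
    rule: str                  # extractor rule id that fired
    extractor_version: str     # semver-ish, e.g. "0.1.0"
    source_chunk_id: Optional[int] = None        # set if from ingested content
    source_interaction_id: Optional[int] = None  # set if from an interaction

    @property
    def hand_authored(self) -> bool:
        # Both source pointers null => hand-authored via POST /memory,
        # which must be flagged (allowed but discouraged).
        return self.source_chunk_id is None and self.source_interaction_id is None
```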

## Extractor versioning

The extractor is going to change — new rules added, old rules
refined, precision/recall tuned over time. The promotion flow
must survive extractor changes:

- every extractor module exposes an `EXTRACTOR_VERSION = "0.1.0"`
  constant
- every candidate row records this version
- when the extractor version changes, the change log explains
  what the new rules do
- old candidates are NOT automatically re-evaluated by the new
  extractor — that would lose the auditable history of why the
  old candidate was created
- a future `POST /memory/{id}/re-extract` can optionally propose
  an updated candidate from the same source chunk with the new
  extractor, but it produces a *new* candidate alongside the old
  one, never a silent rewrite

## Ingestion-wave extraction semantics

When the batched extraction pass fires on an ingestion wave, it
produces a report artifact:

```
data/extraction-reports/<wave-id>/
├── report.json       # summary counts, rule distribution
├── candidates.ndjson # one JSON line per persisted candidate
├── dropped.ndjson    # one JSON line per candidate dropped
│                     # (over batch cap, duplicate, below
│                     # min content length, etc.)
└── errors.log        # any rule-level errors
```

The report artifact lives under the configured `data_dir` and is
retained per the backup retention policy. The ingestion-waves doc
(`docs/ingestion-waves.md`) is updated to include an "extract"
step after each wave, with the expectation that the human
reviews the candidates before the next wave fires.

## Candidate-to-candidate deduplication across passes

Two extraction passes over the same chunk (or two different
chunks containing the same fact) should not produce two identical
candidate rows. The deduplication key is:

```
(memory_type_or_entity_type, normalized_content, project, status)
```

Normalization strips whitespace variants, lowercases, and drops
trailing punctuation (the same rules as the extractor's `_clean_value`
function). If a second pass would produce a duplicate, it instead
increments a `re_extraction_count` column on the existing
candidate row and updates `last_re_extracted_at`. This gives the
reviewer a "saw this N times" signal without flooding the queue.

This column is a future schema addition — current candidates do
not track re-extraction. The promotion-rules implementation will
land the column as part of its first migration.
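
The normalization and the key can be sketched as follows. This mirrors the documented behavior (collapse whitespace, lowercase, drop trailing punctuation) but is a stand-in, not the actual `_clean_value` implementation:

```python
import re

def normalize_content(text: str) -> str:
    # Collapse whitespace variants, lowercase, drop trailing punctuation
    collapsed = re.sub(r"\s+", " ", text).strip()
    return collapsed.lower().rstrip(".,;:!")

def dedup_key(kind: str, content: str, project: str, status: str) -> tuple:
    # (memory_type_or_entity_type, normalized_content, project, status)
    return (kind, normalize_content(content), project, status)
```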

## The "never auto-promote into trusted state" invariant

Regardless of what auto-promotion policies might exist between
L0 → L2', **nothing ever moves into L3 (trusted project state)
without explicit human action via `POST /project/state`**. This
is the one hard line in the promotion graph, and it is enforced
by having no API endpoint that takes a candidate id and writes
to `project_state`.

## Summary

- Four layers: L0 raw, L1 memory candidate/active, L2 entity
  candidate/active, L3 trusted state
- Three triggers for extraction: on capture, on ingestion wave, on
  explicit request
- Per-rule prior confidence, tuned by structural signals at write time
- Shared candidate review queue; promote/reject/edit/defer actions
- No auto-promotion in V1 (but the schema allows it later)
- Every candidate carries full provenance and the extractor version
- Every promotion step is reversible except L3 curation
- L3 is never touched automatically

---

**docs/architecture/representation-authority.md** (new file, 273 lines)

# Representation Authority (canonical home matrix)

## Why this document exists

The same fact about an engineering project can show up in many
places: a markdown note in the PKM, a structured field in KB-CAD,
a commit message in a Gitea repo, an active memory in AtoCore, an
entity in the engineering layer, a row in trusted project state.
**Without an explicit rule about which representation is
authoritative for which kind of fact, the system will accumulate
contradictions and the human will lose trust in all of them.**

This document is the canonical-home matrix. Every kind of fact
that AtoCore handles has exactly one authoritative representation,
and every other place that holds a copy of that fact is, by
definition, a derived view that may be stale.

## The representations in scope

Six places where facts can live in this ecosystem:

| Layer | What it is | Who edits it | How it's structured |
|---|---|---|---|
| **PKM** | Antoine's Obsidian-style markdown vault under `/srv/storage/atocore/sources/vault/` | Antoine, by hand | unstructured markdown with optional frontmatter |
| **KB project** | the engineering Knowledge Base (KB-CAD / KB-FEM repos and any companion docs) | Antoine, semi-structured | per-tool typed records |
| **Gitea repos** | source code repos under `dalidou:3000/Antoine/*` (Fullum-Interferometer, polisher-sim, ATOCore itself, ...) | Antoine via git commits | code, READMEs, repo-specific markdown |
| **AtoCore memories** | rows in the `memories` table | hand-authored or extracted from interactions | typed (identity / preference / project / episodic / knowledge / adaptation) |
| **AtoCore entities** | rows in the `entities` table (V1, not yet built) | imported from KB exports or extracted from interactions | typed entities + relationships per the V1 ontology |
| **AtoCore project state** | rows in the `project_state` table (Layer 3, trusted) | hand-curated only, never automatic | category + key + value |

## The canonical home rule

> For each kind of fact, exactly one of the six representations is
> the authoritative source. The other five may hold derived
> copies, but they are not allowed to disagree with the
> authoritative one. When they disagree, the disagreement is a
> conflict and surfaces via the conflict model.

The matrix below assigns the authoritative representation per fact
kind. It is the practical answer to the question "where does this
fact actually live?" for daily decisions.
|
||||
|
||||
## The canonical-home matrix
|
||||
|
||||
| Fact kind | Canonical home | Why | How it gets into AtoCore |
|---|---|---|---|
| **CAD geometry** (the actual model) | NX (or successor CAD tool) | the only place that can render and validate it | not in AtoCore at all in V1 |
| **CAD-side structure** (subsystem tree, component list, materials, parameters) | KB-CAD | KB-CAD is the structured wrapper around NX | KB-CAD export → `/ingest/kb-cad/export` → entities |
| **FEM mesh & solver settings** | KB-FEM (wrapping the FEM tool) | only the solver representation can run | not in AtoCore at all in V1 |
| **FEM results & validation outcomes** | KB-FEM | KB-FEM owns the outcome records | KB-FEM export → `/ingest/kb-fem/export` → entities |
| **Source code** | Gitea repos | repos are version-controlled and reviewable | indirectly via repo markdown ingestion (Phase 1) |
| **Repo-level documentation** (READMEs, design docs in the repo) | Gitea repos | lives next to the code it documents | ingested as source chunks; never hand-edited in AtoCore |
| **Project-level prose notes** (decisions in long-form, journal-style entries, working notes) | PKM | the place Antoine actually writes when thinking | ingested as source chunks; the extractor proposes candidates from these for the review queue |
| **Identity** ("the user is a mechanical engineer running AtoCore") | AtoCore memories (`identity` type) | nowhere else holds personal identity | hand-authored via `POST /memory` or extracted from interactions |
| **Preference** ("prefers small reviewable diffs", "uses SI units") | AtoCore memories (`preference` type) | nowhere else holds personal preferences | hand-authored or extracted |
| **Episodic** ("on April 6 we debugged the EXDEV bug") | AtoCore memories (`episodic` type) | nowhere else has time-bound personal recall | extracted from captured interactions |
| **Decision** (a structured engineering decision) | AtoCore **entities** (Decision) once the engineering layer ships; AtoCore memories (`adaptation`) until then | needs structured supersession, audit trail, and link to affected components | extracted from PKM or interactions; promoted via review queue |
| **Requirement** | AtoCore **entities** (Requirement) | needs structured satisfaction tracking | extracted from PKM, KB-CAD, or interactions |
| **Constraint** | AtoCore **entities** (Constraint) | needs structured link to the entity it constrains | extracted from PKM, KB-CAD, or interactions |
| **Validation claim** | AtoCore **entities** (ValidationClaim) | needs structured link to supporting Result | extracted from KB-FEM exports or interactions |
| **Material** | KB-CAD if the material is on a real component; AtoCore entity (Material) if it's a project-wide material decision not yet attached to geometry | structured properties live in KB-CAD's material database | KB-CAD export, or hand-authored as a Material entity |
| **Parameter** | KB-CAD or KB-FEM depending on whether it's a geometry or solver parameter; AtoCore entity (Parameter) if it's a higher-level project parameter not in either tool | structured numeric values with units live in their tool of origin | KB export, or hand-authored |
| **Project status / current focus / next milestone** | AtoCore **project_state** (Layer 3) | the trust hierarchy says trusted state is the highest authority for "what is the current state of the project" | hand-curated via `POST /project/state` |
| **Architectural decision records (ADRs)** | depends on form: long-form ADR markdown lives in the repo; the structured fact about which ADR was selected lives in the AtoCore Decision entity | both representations are useful for different audiences | repo ingestion provides the prose; the entity is created by extraction or hand-authored |
| **Operational runbooks** | repo (next to the code they describe) | lives with the system it operates | not promoted into AtoCore entities — runbooks are reference material, not facts |
| **Backup metadata** (snapshot timestamps, integrity status) | the backup-metadata.json files under `/srv/storage/atocore/backups/` | each snapshot is its own self-describing record | not in AtoCore's database; queried via the `/admin/backup` endpoints |
| **Conversation history with AtoCore (interactions)** | AtoCore `interactions` table | nowhere else has the prompt + context pack + response triple | written by capture (Phase 9 Commit A) |
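One way to read the matrix is as a plain lookup from fact kind to authoritative layer. A minimal sketch, assuming illustrative key names; none of these identifiers exist in the codebase:

```python
# Hypothetical sketch only: the canonical-home matrix as a lookup table.
# Keys and layer names are illustrative, not AtoCore's actual code.
CANONICAL_HOME = {
    "cad_structure": "KB-CAD",
    "fem_result": "KB-FEM",
    "identity": "atocore-memories",
    "preference": "atocore-memories",
    "episodic": "atocore-memories",
    "decision": "atocore-entities",
    "requirement": "atocore-entities",
    "constraint": "atocore-entities",
    "validation_claim": "atocore-entities",
    "project_status": "atocore-project-state",
}

def canonical_home(fact_kind: str) -> str:
    """Return the single authoritative representation for a fact kind."""
    if fact_kind not in CANONICAL_HOME:
        # The matrix leaves some kinds (vendor/cost, temporal) out of V1 scope.
        raise ValueError(f"no canonical home assigned for {fact_kind!r}")
    return CANONICAL_HOME[fact_kind]
```

The useful property is the error branch: a fact kind with no assigned home is a scope question, not a default.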
## The supremacy rule for cross-layer facts

When the same fact has copies in multiple representations and they
disagree, the trust hierarchy applies in this order:

1. **AtoCore project_state** (Layer 3) is highest authority for any
   "current state of the project" question. This is why it requires
   manual curation and never gets touched by automatic processes.
2. **The tool-of-origin canonical home** is highest authority for
   facts that are tool-managed: KB-CAD wins over AtoCore entities
   for CAD-side structure facts; KB-FEM wins for FEM result facts.
3. **AtoCore entities** are highest authority for facts that are
   AtoCore-managed: Decisions, Requirements, Constraints,
   ValidationClaims (when the supporting Results are still loose).
4. **Active AtoCore memories** are highest authority for personal
   facts (identity, preference, episodic).
5. **Source chunks (PKM, repos, ingested docs)** are lowest
   authority — they are the raw substrate from which higher layers
   are extracted, but they may be stale, contradictory among
   themselves, or out of date.

This is the same hierarchy enforced by `conflict-model.md`. This
document just makes it explicit per fact kind.
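The five-level ordering reduces to a rank comparison. A hypothetical sketch, with made-up layer names standing in for the five levels above:

```python
# Hypothetical sketch: rank disagreeing copies of the same fact by the
# five-level trust hierarchy and pick the winner. Names are illustrative.
TRUST_RANK = {
    "project_state": 1,   # Layer 3, hand-curated, wins everything
    "tool_of_origin": 2,  # KB-CAD / KB-FEM for tool-managed facts
    "entity": 3,          # AtoCore-managed structured facts
    "memory": 4,          # active personal memories
    "source_chunk": 5,    # PKM / repo substrate, lowest authority
}

def resolve(copies: list[tuple[str, str]]) -> tuple[str, str]:
    """Return the (layer, value) pair from the most trusted layer."""
    return min(copies, key=lambda copy: TRUST_RANK[copy[0]])

# A stale PKM chunk loses to a curated project_state entry:
winner = resolve([("source_chunk", "PEEK"), ("project_state", "GF-PTFE")])
# winner == ("project_state", "GF-PTFE")
```

This is only the tie-break; whether the disagreement is surfaced or silently resolved is the conflict model's job, not this function's.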
## Examples

### Example 1 — "what material does the lateral support pad use?"

Possible representations:

- KB-CAD has the field `component.lateral-support-pad.material = "GF-PTFE"`
- A PKM note from last month says "considering PEEK for the
  lateral support, GF-PTFE was the previous choice"
- An AtoCore Material entity says `GF-PTFE`
- An AtoCore project_state entry says `p05 / decision /
  lateral_support_material = GF-PTFE`

Which one wins for the question "what's the current material"?

- **project_state wins** if the query is "what is the current
  trusted answer for p05's lateral support material" (Layer 3)
- **KB-CAD wins** if project_state has not been curated for this
  field yet, because KB-CAD is the canonical home for CAD-side
  structure
- **The Material entity** is a derived view from KB-CAD; if it
  disagrees with KB-CAD, the entity is wrong and a conflict is
  surfaced
- **The PKM note** is historical context, not authoritative for
  "current"
### Example 2 — "did we decide to merge the bind mounts?"

Possible representations:

- A working session interaction is captured in the `interactions`
  table with the response containing `## Decision: merge the two
  bind mounts into one`
- The Phase 9 Commit C extractor produced a candidate adaptation
  memory from that decision
- A reviewer promoted the candidate to active
- The AtoCore source repo has the actual code change in commit
  `d0ff8b5` and the docker-compose.yml is in its post-merge form

Which one wins for "is this decision real and current"?

- **The Gitea repo** wins for "is this decision implemented" —
  the docker-compose.yml is the canonical home for the actual
  bind mount configuration
- **The active adaptation memory** wins for "did we decide this"
  — that's exactly what the Commit C lifecycle is for
- **The interaction record** is the audit trail — it's
  authoritative for "when did this conversation happen and what
  did the LLM say", but not for "is this decision current"
- **The source chunks** from PKM are not relevant here because no
  PKM note about this decision exists yet (and that's fine —
  decisions don't have to live in PKM if they live in the repo
  and the AtoCore memory)
### Example 3 — "what's p05's current next focus?"

Possible representations:

- The PKM has a `current-status.md` note updated last week
- AtoCore project_state has `p05 / status / next_focus = "wave 2 ingestion"`
- A captured interaction from yesterday discussed the next focus
  at length

Which one wins?

- **project_state wins**, full stop. The trust hierarchy says
  Layer 3 is canonical for current state. This is exactly the
  reason project_state exists.
- The PKM note is historical context.
- The interaction is conversation history.
- If project_state and the PKM disagree, the human updates one or
  the other to bring them in line — usually by re-curating
  project_state if the conversation revealed a real change.
## What this means for the engineering layer V1 implementation

Several concrete consequences fall out of the matrix:

1. **The Material and Parameter entity types are mostly KB-CAD
   shadows in V1.** They exist in AtoCore so other entities
   (Decisions, Requirements) can reference them with structured
   links, but their authoritative values come from KB-CAD imports.
   If KB-CAD doesn't know about a material, the AtoCore entity is
   the canonical home only because nothing else is.
2. **Decisions / Requirements / Constraints / ValidationClaims
   are AtoCore-canonical.** These don't have a natural home in
   KB-CAD or KB-FEM. They live in AtoCore as first-class entities
   with full lifecycle and supersession.
3. **The PKM is never authoritative.** It is the substrate for
   extraction. The reviewer promotes things out of it; they don't
   point at PKM notes as the "current truth".
4. **project_state is the override layer.** Whenever the human
   wants to declare "the current truth is X regardless of what
   the entities and memories and KB exports say", they curate
   into project_state. Layer 3 is intentionally small and
   intentionally manual.
5. **The conflict model is the enforcement mechanism.** When two
   representations disagree on a fact whose canonical home rule
   should pick a winner, the conflict surfaces via the
   `/conflicts` endpoint and the reviewer resolves it. The
   matrix in this document tells the reviewer who is supposed
   to win in each scenario; they're not making the decision blind.
## What the matrix does NOT define

1. **Facts about people other than the user.** No "team member"
   entity, no per-collaborator preferences. AtoCore is
   single-user in V1.
2. **Facts about AtoCore itself as a project.** Those are project
   memories and project_state entries under `project=atocore`,
   same lifecycle as any other project's facts.
3. **Vendor / supplier / cost facts.** Out of V1 scope.
4. **Time-bounded facts** (a value that was true between two
   dates and may not be true now). The current matrix treats all
   active facts as currently-true and uses supersession to
   represent change. Temporal facts are a V2 concern.
5. **Cross-project shared facts** (a Material that is reused across
   p04, p05, and p06). Currently each project has its own copy.
   Cross-project deduplication is also a V2 concern.
## The "single canonical home" invariant in practice

The hard rule that every fact has exactly one canonical home is
the load-bearing invariant of this matrix. To enforce it
operationally:

- **Extraction never duplicates.** When the extractor scans an
  interaction or a source chunk and proposes a candidate, the
  candidate is dropped if it duplicates an already-active record
  in the canonical home (the existing extractor implementation
  already does this for memories; the entity extractor will
  follow the same pattern).
- **Imports never duplicate.** When KB-CAD pushes the same
  Component twice with the same value, the second push is
  recognized as identical and updates the `last_imported_at`
  timestamp without creating a new entity.
- **Imports surface drift as conflict.** When KB-CAD pushes the
  same Component with a different value, that's a conflict per
  the conflict model — never a silent overwrite.
- **Hand-curation into project_state always wins.** A
  project_state entry can disagree with an entity or a KB
  export; the project_state entry is correct by fiat (Layer 3
  trust), and the reviewer is responsible for bringing the lower
  layers in line if appropriate.
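The two import-side bullets can be sketched as one classifier over incoming records. A hedged sketch, not the real ingestion code: the field names (`id`, `value`, `exported_at`, `last_imported_at`) and in-memory store are assumptions:

```python
# Hypothetical sketch of the "never duplicate, surface drift" rule.
# Field names and the dict-based store are illustrative assumptions.
def apply_import(incoming: dict, active_by_id: dict, conflicts: list) -> str:
    """Classify one imported record as "new", "duplicate", or "conflict"."""
    existing = active_by_id.get(incoming["id"])
    if existing is None:
        active_by_id[incoming["id"]] = dict(incoming)  # first sight: new candidate
        return "new"
    if existing["value"] == incoming["value"]:
        # Identical re-push: refresh the timestamp, create nothing.
        existing["last_imported_at"] = incoming["exported_at"]
        return "duplicate"
    # Drift: leave the active record untouched, record a conflict row.
    conflicts.append({
        "entity_id": existing["id"],
        "active_value": existing["value"],
        "incoming_value": incoming["value"],
    })
    return "conflict"
```

The invariant shows up in what the function never does: it never overwrites `existing["value"]`.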
## Open questions for V1 implementation

1. **How does the reviewer see the canonical home for a fact in
   the UI?** Probably by including the fact's authoritative
   layer in the entity / memory detail view: "this Material is
   currently mirrored from KB-CAD; the canonical home is KB-CAD".
2. **Who owns running the KB-CAD / KB-FEM exporter?** The
   `tool-handoff-boundaries.md` doc lists this as an open
   question; the same answer applies here.
3. **Do we need an explicit `canonical_home` field on entity
   rows?** A field that records "this entity is canonical here"
   vs "this entity is a mirror of <external system>". Probably
   yes; deferred to the entity schema spec.
4. **How are project_state overrides surfaced in the engineering
   layer query results?** When a query (e.g. Q-001 "what does
   this subsystem contain?") would return entity rows, the result
   should also flag any project_state entries that contradict the
   entities — letting the reviewer see the override at query
   time, not just in the conflict queue.
## TL;DR

- Six representation layers: PKM, KB project, repos, AtoCore
  memories, AtoCore entities, AtoCore project_state
- Every fact kind has exactly one canonical home
- The trust hierarchy resolves cross-layer conflicts:
  project_state > tool-of-origin (KB-CAD/KB-FEM) > entities >
  active memories > source chunks
- Decisions / Requirements / Constraints / ValidationClaims are
  AtoCore-canonical (no other system has a natural home for them)
- Materials / Parameters / CAD-side structure are KB-CAD-canonical
- FEM results / validation outcomes are KB-FEM-canonical
- project_state is the human override layer, top of the
  hierarchy, manually curated only
- Conflicts surface via `/conflicts` and the reviewer applies the
  matrix to pick a winner
339
docs/architecture/tool-handoff-boundaries.md
Normal file
@@ -0,0 +1,339 @@
# Tool Hand-off Boundaries (KB-CAD / KB-FEM and friends)

## Why this document exists

The engineering layer V1 will accumulate typed entities about
projects, subsystems, components, materials, requirements,
constraints, decisions, parameters, analysis models, results, and
validation claims. Many of those concepts also live in real
external tools — CAD systems, FEM solvers, BOM managers, PLM
databases, vendor portals.

The first big design decision before writing any entity-layer code
is: **what is AtoCore's read/write relationship with each of those
external tools?**

The wrong answer in either direction is expensive:

- Too read-only: AtoCore becomes a stale shadow of the tools and
  loses the trust battle the moment a value drifts.
- Too bidirectional: AtoCore takes on responsibilities it can't
  reliably honor (live sync, conflict resolution against external
  schemas, write-back validation), and the project never ships.

This document picks a position for V1.
## The position

> **AtoCore is a one-way mirror in V1.** External tools push
> structured exports into AtoCore. AtoCore never pushes back.

That position has three corollaries:

1. **External tools remain the source of truth for everything they
   already manage.** A CAD model is canonical for geometry; a FEM
   project is canonical for meshes and solver settings; KB-CAD is
   canonical for whatever KB-CAD already calls canonical.
2. **AtoCore is the source of truth for the *AtoCore-shaped*
   record** of those facts: the Decision that selected the geometry,
   the Requirement the geometry satisfies, the ValidationClaim the
   FEM result supports. AtoCore does not duplicate the external
   tool's primary representation; it stores the structured *facts
   about* it.
3. **The boundary is enforced by absence.** No write endpoint in
   AtoCore ever generates a `.prt`, a `.fem`, an export to a PLM
   schema, or a vendor purchase order. If we find ourselves wanting
   to add such an endpoint in V1, we should stop and reconsider
   the V1 scope.
## Why one-way and not bidirectional

Bidirectional sync between independent systems is one of the
hardest problems in engineering software. The honest reasons we
are not attempting it in V1:

1. **Schema drift.** External tools evolve their schemas
   independently. A bidirectional sync would have to track every
   schema version of every external tool we touch. That is a
   permanent maintenance tax.
2. **Conflict semantics.** When AtoCore and an external tool
   disagree on the same field, "who wins" is a per-tool, per-field
   decision. There is no general rule. Bidirectional sync would
   require us to specify that decision exhaustively.
3. **Trust hierarchy.** AtoCore's whole point is the trust
   hierarchy: trusted project state > entities > memories. If we
   let entities push values back into the external tools, we
   silently elevate AtoCore's confidence to "high enough to write
   to a CAD model", which it almost never deserves.
4. **Velocity.** A bidirectional engineering layer is a
   multi-year project. A one-way mirror is a months-long project.
   The value-to-effort ratio favors one-way for V1 by an enormous
   margin.
5. **Reversibility.** We can always add bidirectional sync later
   on a per-tool basis once V1 has shown itself to be useful. We
   cannot easily walk back a half-finished bidirectional sync that
   has already corrupted data in someone's CAD model.
## Per-tool stance for V1

| External tool | V1 stance | What AtoCore reads in | What AtoCore writes back |
|---|---|---|---|
| **KB-CAD** (Antoine's CAD knowledge base) | one-way mirror | structured exports of subsystems, components, materials, parameters via a documented JSON or CSV shape | nothing |
| **KB-FEM** (Antoine's FEM knowledge base) | one-way mirror | structured exports of analysis models, results, validation claims | nothing |
| **NX / Siemens NX** (the CAD tool itself) | not connected in V1 | nothing direct — only what KB-CAD exports about NX projects | nothing |
| **PKM (Obsidian / markdown vault)** | already connected via the ingestion pipeline (Phase 1) | full markdown/text corpus per the ingestion-waves doc | nothing |
| **Gitea repos** | already connected via the ingestion pipeline | repo markdown/text per project | nothing |
| **OpenClaw** (the LLM agent) | already connected via the read-only helper skill on the T420 | nothing — OpenClaw reads from AtoCore | nothing — OpenClaw does not write into AtoCore |
| **AtoDrive** (operational truth layer, future) | future: bidirectional with AtoDrive itself, but AtoDrive is internal to AtoCore so this isn't an external tool boundary | n/a in V1 | n/a in V1 |
| **PLM / vendor portals / cost systems** | not in V1 scope | nothing | nothing |
## What "one-way mirror" actually looks like in code

AtoCore exposes an ingestion endpoint per external tool that
accepts a structured export and turns it into entity candidates.
The endpoint is read-side from AtoCore's perspective (it reads
from a file or HTTP body), even though the external tool is the
one initiating the call.

Proposed V1 ingestion endpoints:

```
POST /ingest/kb-cad/export    body: KB-CAD export JSON
POST /ingest/kb-fem/export    body: KB-FEM export JSON
```

Each endpoint:

1. Validates the export against the documented schema
2. Maps each export record to an entity candidate (status="candidate")
3. Carries the export's source identifier into the candidate's
   provenance fields (source_artifact_id, exporter_version, etc.)
4. Returns a summary: how many candidates were created, how many
   were dropped as duplicates, how many failed schema validation
5. Does NOT auto-promote anything

The KB-CAD and KB-FEM teams (which is to say, future-you) own the
exporter scripts that produce these JSON bodies. Those scripts
live in the KB-CAD / KB-FEM repos respectively, not in AtoCore.
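The five steps above could reduce to a handler body along these lines. A hedged sketch: the function name, dict-based store, and summary fields are all assumptions, and the real endpoint would sit behind the HTTP layer with full schema validation rather than this top-level check:

```python
# Hypothetical sketch of the /ingest/kb-cad/export handler body.
# The store shape, key tuple, and summary fields are assumptions.
REQUIRED_KEYS = {"exporter", "exporter_version", "exported_at",
                 "project", "subsystems"}

def handle_kb_cad_export(export: dict, store: dict) -> dict:
    # Step 1: validate against the documented schema (top level only here).
    missing = REQUIRED_KEYS - export.keys()
    if missing:
        return {"accepted": False, "missing_fields": sorted(missing)}
    created = dropped = 0
    for subsystem in export["subsystems"]:
        for component in subsystem.get("components", []):
            key = (export["project"], component["id"])
            if key in store:            # duplicate: dropped, counted for step 4
                dropped += 1
                continue
            store[key] = {
                "status": "candidate",  # step 2: candidate, never auto-promoted
                "payload": component,
                # Step 3: provenance carried from the export.
                "source_artifact_id": component.get("source_artifact"),
                "exporter_version": export["exporter_version"],
            }
            created += 1
    # Step 4: summary back to the exporter. Step 5: nothing was promoted.
    return {"accepted": True, "created": created, "duplicates_dropped": dropped}
```

Step 5 is enforced by omission: no code path sets any status other than `"candidate"`.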
## The export schemas (sketch, not final)

These are starting shapes, intentionally minimal. The schemas
will be refined in `kb-cad-export-schema.md` and
`kb-fem-export-schema.md` once the V1 ontology lands.

### KB-CAD export shape (starting sketch)

```json
{
  "exporter": "kb-cad",
  "exporter_version": "1.0.0",
  "exported_at": "2026-04-07T12:00:00Z",
  "project": "p05-interferometer",
  "subsystems": [
    {
      "id": "subsystem.optical-frame",
      "name": "Optical frame",
      "parent": null,
      "components": [
        {
          "id": "component.lateral-support-pad",
          "name": "Lateral support pad",
          "material": "GF-PTFE",
          "parameters": {
            "thickness_mm": 3.0,
            "preload_n": 12.0
          },
          "source_artifact": "kb-cad://p05/subsystems/optical-frame#lateral-support"
        }
      ]
    }
  ]
}
```
### KB-FEM export shape (starting sketch)

```json
{
  "exporter": "kb-fem",
  "exporter_version": "1.0.0",
  "exported_at": "2026-04-07T12:00:00Z",
  "project": "p05-interferometer",
  "analysis_models": [
    {
      "id": "model.optical-frame-modal",
      "name": "Optical frame modal analysis v3",
      "subsystem": "subsystem.optical-frame",
      "results": [
        {
          "id": "result.first-mode-frequency",
          "name": "First-mode frequency",
          "value": 187.4,
          "unit": "Hz",
          "supports_validation_claim": "claim.frame-rigidity-min-150hz",
          "source_artifact": "kb-fem://p05/models/optical-frame-modal#first-mode"
        }
      ]
    }
  ]
}
```

These shapes will evolve. The point of including them now is to
make the one-way mirror concrete: it is a small, well-defined
JSON shape, not "AtoCore reaches into KB-CAD's database".
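As a concrete reading of the KB-FEM shape, the Result → ValidationClaim support links could be pulled out like this. A sketch under the assumption that the export matches the JSON shape above; the function name is hypothetical:

```python
# Hypothetical sketch: extract (result_id, claim_id) support edges from
# a KB-FEM export shaped like the starting sketch above.
def claim_links(export: dict):
    """Yield one (result_id, claim_id) pair per supports_validation_claim."""
    for model in export.get("analysis_models", []):
        for result in model.get("results", []):
            claim = result.get("supports_validation_claim")
            if claim is not None:
                yield result["id"], claim
```

These pairs are what would back the ValidationClaim entity's "supported by Result" relationship on the AtoCore side.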
## What AtoCore is allowed to do with the imported records

After ingestion, the imported records become entity candidates
in AtoCore's own table. From that point forward they follow the
exact same lifecycle as any other candidate:

- they sit at status="candidate" until a human reviews them
- the reviewer promotes them to status="active" or rejects them
- the active entities are queryable via the engineering query
  catalog (Q-001 through Q-020)
- the active entities can be referenced from Decisions, Requirements,
  ValidationClaims, etc. via the V1 relationship types

The imported records are never automatically pushed into trusted
project state, never modified in place after import (they are
superseded by re-imports, not edited), and never written back to
the external tool.
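The shared lifecycle amounts to a small state machine. A sketch with status names assumed from the bullets above; the real table would enforce this in the service layer:

```python
# Hypothetical sketch of the candidate lifecycle as a state machine.
# Status names mirror the lifecycle described above.
ALLOWED_TRANSITIONS = {
    ("candidate", "active"),     # reviewer promotes
    ("candidate", "rejected"),   # reviewer rejects
    ("active", "superseded"),    # a re-import or newer record replaces it
}

def transition(record: dict, new_status: str) -> dict:
    """Return a copy of the record in the new status, or raise."""
    move = (record["status"], new_status)
    if move not in ALLOWED_TRANSITIONS:
        raise ValueError(f"illegal transition {move[0]} -> {move[1]}")
    return {**record, "status": new_status}
```

Returning a copy rather than mutating matches the "superseded by re-imports, not edited" rule: the old row stays for the audit trail.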
## What happens when KB-CAD changes a value AtoCore already has

This is the canonical "drift" scenario. The flow:

1. KB-CAD exports a fresh JSON. Component `component.lateral-support-pad`
   now has `material: "PEEK"` instead of `material: "GF-PTFE"`.
2. AtoCore's ingestion endpoint sees the same `id` and a different
   value.
3. The ingestion endpoint creates a new entity candidate with the
   new value, **does NOT delete or modify the existing active
   entity**, and creates a `conflicts` row linking the two members
   (per the conflict model doc).
4. The reviewer sees an open conflict on the next visit to
   `/conflicts`.
5. The reviewer either:
   - **promotes the new value** (the active is superseded, the
     candidate becomes the new active, the audit trail keeps both)
   - **rejects the new value** (the candidate is invalidated, the
     active stays — useful when the export was wrong)
   - **dismisses the conflict** (declares them not actually about
     the same thing, both stay active)

The reviewer never touches KB-CAD from AtoCore. If the resolution
implies a change in KB-CAD itself, the reviewer makes that change
in KB-CAD, then re-exports.
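The three resolution outcomes in step 5 can be sketched as a single dispatch. Function and field names are hypothetical; the real flow runs through the `/conflicts` endpoint:

```python
# Hypothetical sketch of the three reviewer resolutions for a drift
# conflict. `active` and `candidate` are the two conflicting records.
def resolve_conflict(action: str, active: dict, candidate: dict):
    """Return (records that stay active, audit note) for one resolution."""
    if action == "promote":
        # Candidate becomes the new active; the old active is superseded,
        # but both remain in the audit trail.
        return [candidate], {"superseded": active["id"]}
    if action == "reject":
        # The export was wrong: candidate invalidated, active stays.
        return [active], {"invalidated": candidate["id"]}
    if action == "dismiss":
        # Not actually the same fact: both stay active.
        return [active, candidate], {"kept_both": True}
    raise ValueError(f"unknown resolution {action!r}")
```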
## What about NX directly?

NX (Siemens NX) is the underlying CAD tool that KB-CAD wraps.
**NX is not connected to AtoCore in V1.** Any facts about NX
projects flow through KB-CAD as the structured intermediate. This
gives us:

- **One schema to maintain.** AtoCore only has to understand the
  KB-CAD export shape, not the NX API.
- **One ownership boundary.** KB-CAD owns the question of "what's
  in NX". AtoCore owns the question of "what's in the typed
  knowledge base".
- **Future flexibility.** When NX is replaced or upgraded, only
  KB-CAD has to adapt; AtoCore doesn't notice.

The same logic applies to FEM solvers (Nastran, Abaqus, ANSYS):
KB-FEM is the structured intermediate, and AtoCore never talks to
the solver directly.
## The hard-line invariants

These are the things V1 will not do, regardless of how convenient
they might seem:

1. **No writes to external tools.** No POST/PUT/PATCH to any
   external API, no file generation that gets written into a
   CAD/FEM project tree, no email/chat sends.
2. **No live polling.** AtoCore does not poll KB-CAD or KB-FEM on
   a schedule. Imports are explicit pushes from the external tool
   into AtoCore's ingestion endpoint.
3. **No silent merging.** Every value drift surfaces as a
   conflict for the reviewer (per the conflict model doc).
4. **No schema fan-out.** AtoCore does not store every field that
   KB-CAD knows about. Only fields that map to one of the V1
   entity types make it into AtoCore. Everything else is dropped
   at the import boundary.
5. **No external-tool-specific logic in entity types.** A
   `Component` in AtoCore is the same shape regardless of whether
   it came from KB-CAD, KB-FEM, the PKM, or a hand-curated
   project state entry. The source is recorded in provenance,
   not in the entity shape.
## What this enables

With the one-way mirror locked in, V1 implementation can focus on:

- The entity table and its lifecycle
- The two `/ingest/kb-cad/export` and `/ingest/kb-fem/export`
  endpoints with their JSON validators
- The candidate review queue extension (already designed in
  `promotion-rules.md`)
- The conflict model (already designed in `conflict-model.md`)
- The query catalog implementation (already designed in
  `engineering-query-catalog.md`)

None of those are unbounded. Each is a finite, well-defined
implementation task. The one-way mirror is the choice that makes
V1 finishable.
## What V2 might consider (deferred)

After V1 has been live and demonstrably useful for a quarter or
two, these questions become reasonable to revisit:

1. **Selective write-back to KB-CAD for low-risk fields.** For
   example, AtoCore could push back a "Decision id linked to this
   component" annotation that KB-CAD then displays without it
   being canonical there. Read-only annotations from AtoCore's
   perspective, advisory metadata from KB-CAD's perspective.
2. **Live polling for very small payloads.** A daily poll of
   "what subsystem ids exist in KB-CAD now" so AtoCore can flag
   subsystems that disappeared from KB-CAD without an explicit
   AtoCore invalidation.
3. **Direct NX integration** if the KB-CAD layer becomes a
   bottleneck — but only if the friction is real, not theoretical.
4. **Cost / vendor / PLM connections** for projects where the
   procurement cycle is part of the active engineering work.

None of these are V1 work; they are listed only so that the V1
design can intentionally leave room for them later.
## Open questions for the V1 implementation sprint

1. **Where do the export schemas live?** Probably in
   `docs/architecture/kb-cad-export-schema.md` and
   `docs/architecture/kb-fem-export-schema.md`, drafted during
   the implementation sprint.
2. **Who runs the exporter?** A scheduled job on the KB-CAD /
   KB-FEM hosts, a manual trigger by the human after a meaningful
   change, or both?
3. **Is the export incremental or full?** Full is simpler but
   more expensive. Incremental needs delta semantics. V1 starts
   with full and revisits when full becomes too slow.
4. **How is the exporter authenticated to AtoCore?** Probably
   the existing PAT model (one PAT per exporter, scoped to
   `write:engineering-import` once that scope exists). Worth a
   quick auth design pass before the endpoints exist.
## TL;DR

- AtoCore is a one-way mirror in V1: external tools push,
  AtoCore reads, AtoCore never writes back
- Two import endpoints for V1: KB-CAD and KB-FEM, each with a
  documented JSON export shape
- Drift surfaces as conflicts in the existing conflict model
- No NX, no FEM solvers, no PLM, no vendor portals, no
  cost/BOM systems in V1
- Bidirectional sync is reserved for V2+ on a per-tool basis,
  only after V1 demonstrates value
155
docs/atocore-ecosystem-and-hosting.md
Normal file
@@ -0,0 +1,155 @@
# AtoCore Ecosystem And Hosting

## Purpose

This document defines the intended boundaries between the Ato ecosystem layers
and the current hosting model.

## Ecosystem Roles

- `AtoCore`
  - runtime, ingestion, retrieval, memory, context builder, API
  - owns the machine-memory and context assembly system
- `AtoMind`
  - future intelligence layer
  - will own promotion, reflection, conflict handling, and trust decisions
- `AtoVault`
  - human-readable memory source
  - intended for Obsidian and manual inspection/editing
- `AtoDrive`
  - trusted operational project source
  - curated project truth with higher trust than general notes
## Trust Model

Current intended trust precedence:

1. Trusted Project State
2. AtoDrive artifacts
3. Recent validated memory
4. AtoVault summaries
5. PKM chunks
6. Historical or low-confidence material
## Storage Boundaries
|
||||
|
||||
Human-readable source layers and machine operational storage must remain
|
||||
separate.
|
||||
|
||||
- `AtoVault` is a source layer, not the live vector database
|
||||
- `AtoDrive` is a source layer, not the live vector database
|
||||
- machine operational state includes:
|
||||
- SQLite database
|
||||
- vector store
|
||||
- indexes
|
||||
- embeddings
|
||||
- runtime metadata
|
||||
- cache and temp artifacts
|
||||
|
||||
The machine database is derived operational state, not the primary
|
||||
human-readable source of truth.
|
||||
|
||||
## Source Snapshot Vs Machine Store
|
||||
|
||||
The human-readable files visible under `sources/vault` or `sources/drive` are
|
||||
not the final "smart storage" format of AtoCore.
|
||||
|
||||
They are source snapshots made visible to the canonical Dalidou instance so
|
||||
AtoCore can ingest them.
|
||||
|
||||
The actual machine-processed state lives in:
|
||||
|
||||
- `source_documents`
|
||||
- `source_chunks`
|
||||
- vector embeddings and indexes
|
||||
- project memories
|
||||
- trusted project state
|
||||
- context-builder output
|
||||
|
||||
This means the staged markdown can still look very similar to the original PKM
|
||||
or repo docs. That is normal.
|
||||
|
||||
The intelligence does not come from rewriting everything into a new markdown
|
||||
vault. It comes from ingesting selected source material into the machine store
|
||||
and then using that store for retrieval, trust-aware context assembly, and
|
||||
memory.
|
||||
|
||||
## Canonical Hosting Model
|
||||
|
||||
Dalidou is the canonical host for the AtoCore service and machine database.
|
||||
|
||||
OpenClaw on the T420 should consume AtoCore over API and network, ideally over
|
||||
Tailscale or another trusted internal network path.
|
||||
|
||||
The live SQLite and vector store must not be treated as a multi-node synced
|
||||
filesystem. The architecture should prefer one canonical running service over
|
||||
file replication of the live machine store.
|
||||
|
||||
## Canonical Dalidou Layout
|
||||
|
||||
```text
|
||||
/srv/storage/atocore/
|
||||
app/ # deployed AtoCore repository
|
||||
data/ # canonical machine state
|
||||
db/
|
||||
chroma/
|
||||
cache/
|
||||
tmp/
|
||||
sources/ # human-readable source inputs
|
||||
vault/
|
||||
drive/
|
||||
logs/
|
||||
backups/
|
||||
run/
|
||||
```
|
||||
|
||||
## Operational Rules
|
||||
|
||||
- source directories are treated as read-only by the AtoCore runtime
|
||||
- Dalidou holds the canonical machine DB
|
||||
- OpenClaw should use AtoCore as an additive context service
|
||||
- OpenClaw must continue to work if AtoCore is unavailable
|
||||
- write-back from OpenClaw into AtoCore is deferred until later phases
|
||||
|
||||
Current staging behavior:
|
||||
|
||||
- selected project docs may be copied into a readable staging area on Dalidou
|
||||
- AtoCore ingests from that staging area into the machine store
|
||||
- the staging area is not itself the durable intelligence layer
|
||||
- changes to the original PKM or repo source do not propagate automatically
|
||||
until a refresh or re-ingest happens
|
||||
|
||||
## Intended Daily Operating Model
|
||||
|
||||
The target workflow is:
|
||||
|
||||
- the human continues to work primarily in PKM project notes, Git/Gitea repos,
|
||||
Discord, and normal OpenClaw sessions
|
||||
- OpenClaw keeps its own runtime behavior and memory system
|
||||
- AtoCore acts as the durable external context layer that compiles trusted
|
||||
project state, retrieval, and long-lived machine-readable context
|
||||
- AtoCore improves prompt quality and robustness without replacing direct repo
|
||||
work, direct file reads, or OpenClaw's own memory
|
||||
|
||||
In other words:
|
||||
|
||||
- PKM and repos remain the human-authoritative project sources
|
||||
- OpenClaw remains the active operating environment
|
||||
- AtoCore remains the compiled context engine and machine-memory host
|
||||
|
||||
## Current Status
|
||||
|
||||
As of the current implementation pass:
|
||||
|
||||
- the AtoCore runtime is deployed on Dalidou
|
||||
- the canonical machine-data layout exists on Dalidou
|
||||
- the service is running from Dalidou
|
||||
- the T420/OpenClaw machine can reach AtoCore over network
|
||||
- a first read-only OpenClaw-side helper exists
|
||||
- the live corpus now includes initial AtoCore self-knowledge and a first
|
||||
curated batch for active projects
|
||||
- the long-term content corpus still needs broader project and vault ingestion
|
||||
|
||||
This means the platform is hosted on Dalidou now, the first cross-machine
|
||||
integration path exists, and the live content corpus is partially populated but
|
||||
not yet fully ingested.
|
||||
docs/backup-restore-procedure.md (new file, 442 lines)
# AtoCore Backup and Restore Procedure

## Scope

This document defines the operational procedure for backing up and
restoring AtoCore's machine state on the Dalidou deployment. It is
the practical companion to `docs/backup-strategy.md` (which defines
the strategy) and `src/atocore/ops/backup.py` (which implements the
mechanics).

The intent is that this procedure can be followed by anyone with
SSH access to Dalidou and the AtoCore admin endpoints.

## What gets backed up

A `create_runtime_backup` snapshot contains, in order of importance:

| Artifact | Source path on Dalidou | Backup destination | Always included |
|---|---|---|---|
| SQLite database | `/srv/storage/atocore/data/db/atocore.db` | `<backup_root>/db/atocore.db` | yes |
| Project registry JSON | `/srv/storage/atocore/config/project-registry.json` | `<backup_root>/config/project-registry.json` | yes (if file exists) |
| Backup metadata | (generated) | `<backup_root>/backup-metadata.json` | yes |
| Chroma vector store | `/srv/storage/atocore/data/chroma/` | `<backup_root>/chroma/` | only when `include_chroma=true` |

The SQLite snapshot uses the online `conn.backup()` API and is safe
to take while the database is in use. The Chroma snapshot is a cold
directory copy and is **only safe when no ingestion is running**;
the API endpoint enforces this by acquiring the ingestion lock for
the duration of the copy.

What is **not** in the backup:

- Source documents under `/srv/storage/atocore/sources/vault/` and
  `/srv/storage/atocore/sources/drive/`. These are read-only
  inputs and live in the user's PKM/Drive, which are backed up
  separately by their own systems.
- Application code. The container image is the source of truth for
  code; recovery means rebuilding the image, not restoring code from
  a backup.
- Logs under `/srv/storage/atocore/logs/`.
- Embeddings cache under `/srv/storage/atocore/data/cache/`.
- Temp files under `/srv/storage/atocore/data/tmp/`.

## Backup root layout

Each backup snapshot lives in its own timestamped directory:

```
/srv/storage/atocore/backups/snapshots/
├── 20260407T060000Z/
│   ├── backup-metadata.json
│   ├── db/
│   │   └── atocore.db
│   ├── config/
│   │   └── project-registry.json
│   └── chroma/          # only if include_chroma=true
│       └── ...
├── 20260408T060000Z/
│   └── ...
└── ...
```

The timestamp is UTC, format `YYYYMMDDTHHMMSSZ`.
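For scripting against the snapshot directory, a matching stamp can be produced with plain `date` (a minimal sketch, assuming a `date` that supports `-u` and these format specifiers):

```shell
# Produce a stamp matching the snapshot directory naming, e.g. 20260407T060000Z
stamp=$(date -u +%Y%m%dT%H%M%SZ)
echo "$stamp"
```

Because the stamp is zero-padded year-to-second, plain lexicographic sorting of snapshot names is also chronological.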
## Triggering a backup
|
||||
|
||||
### Option A — via the admin endpoint (preferred)
|
||||
|
||||
```bash
|
||||
# DB + registry only (fast, safe at any time)
|
||||
curl -fsS -X POST http://dalidou:8100/admin/backup \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"include_chroma": false}'
|
||||
|
||||
# DB + registry + Chroma (acquires ingestion lock)
|
||||
curl -fsS -X POST http://dalidou:8100/admin/backup \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"include_chroma": true}'
|
||||
```
|
||||
|
||||
The response is the backup metadata JSON. Save the `backup_root`
|
||||
field — that's the directory the snapshot was written to.
|
||||
|
||||
### Option B — via the standalone script (when the API is down)
|
||||
|
||||
```bash
|
||||
docker exec atocore python -m atocore.ops.backup
|
||||
```
|
||||
|
||||
This runs `create_runtime_backup()` directly, without going through
|
||||
the API or the ingestion lock. Use it only when the AtoCore service
|
||||
itself is unhealthy and you can't hit the admin endpoint.
|
||||
|
||||
### Option C — manual file copy (last resort)
|
||||
|
||||
If both the API and the standalone script are unusable:
|
||||
|
||||
```bash
|
||||
sudo systemctl stop atocore # or: docker compose stop atocore
|
||||
sudo cp /srv/storage/atocore/data/db/atocore.db \
|
||||
/srv/storage/atocore/backups/manual-$(date -u +%Y%m%dT%H%M%SZ).db
|
||||
sudo cp /srv/storage/atocore/config/project-registry.json \
|
||||
/srv/storage/atocore/backups/manual-$(date -u +%Y%m%dT%H%M%SZ).registry.json
|
||||
sudo systemctl start atocore
|
||||
```
|
||||
|
||||
This is a cold backup and requires brief downtime.
|
||||
|
||||
## Listing backups
|
||||
|
||||
```bash
|
||||
curl -fsS http://dalidou:8100/admin/backup
|
||||
```
|
||||
|
||||
Returns the configured `backup_dir` and a list of all snapshots
|
||||
under it, with their full metadata if available.
|
||||
|
||||
Or, on the host directly:
|
||||
|
||||
```bash
|
||||
ls -la /srv/storage/atocore/backups/snapshots/
|
||||
```
|
||||
|
||||
## Validating a backup
|
||||
|
||||
Before relying on a backup for restore, validate it:
|
||||
|
||||
```bash
|
||||
curl -fsS http://dalidou:8100/admin/backup/20260407T060000Z/validate
|
||||
```
|
||||
|
||||
The validator:
|
||||
- confirms the snapshot directory exists
|
||||
- opens the SQLite snapshot and runs `PRAGMA integrity_check`
|
||||
- parses the registry JSON
|
||||
- confirms the Chroma directory exists (if it was included)
|
||||
|
||||
A valid backup returns `"valid": true` and an empty `errors` array.
|
||||
A failing validation returns `"valid": false` with one or more
|
||||
specific error strings (e.g. `db_integrity_check_failed`,
|
||||
`registry_invalid_json`, `chroma_snapshot_missing`).
|
||||
|
||||
**Validate every backup at creation time.** A backup that has never
|
||||
been validated is not actually a backup — it's just a hopeful copy
|
||||
of bytes.
|
||||
|
||||
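The "validate every backup" rule is easy to script. A minimal sketch — `assert_backup_valid` is a hypothetical helper, not part of the AtoCore codebase — that reads the validator's JSON on stdin and fails unless `valid` is true and `errors` is empty (it shells out to `python3` for JSON parsing so it doesn't depend on `jq`):

```shell
# Hypothetical helper: fail unless the validator JSON on stdin says the backup is valid.
assert_backup_valid() {
  python3 -c '
import json, sys
d = json.load(sys.stdin)
sys.exit(0 if d.get("valid") is True and not d.get("errors") else 1)
'
}

# Example wiring (assumes the admin endpoint shown in Option A):
# curl -fsS http://dalidou:8100/admin/backup/$STAMP/validate | assert_backup_valid \
#   || { echo "backup $STAMP failed validation" >&2; exit 1; }
```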
## Restore procedure

Since 2026-04-09 the restore is implemented as a proper module
function plus CLI entry point: `restore_runtime_backup()` in
`src/atocore/ops/backup.py`, invoked as
`python -m atocore.ops.backup restore <STAMP> --confirm-service-stopped`.
It automatically takes a pre-restore safety snapshot (your rollback
anchor), handles SQLite WAL/SHM cleanly, restores the registry, and
runs `PRAGMA integrity_check` on the restored db. This replaces the
earlier manual `sudo cp` sequence.

The function refuses to run without `--confirm-service-stopped`.
This is deliberate: hot-restoring into a running service corrupts
SQLite state.

### Pre-flight (always)

1. Identify which snapshot you want to restore. List available
   snapshots and pick by timestamp:
   ```bash
   curl -fsS http://127.0.0.1:8100/admin/backup | jq '.backups[].stamp'
   ```
2. Validate it. Refuse to restore an invalid backup:
   ```bash
   STAMP=20260409T060000Z
   curl -fsS http://127.0.0.1:8100/admin/backup/$STAMP/validate | jq .
   ```
3. **Stop AtoCore.** SQLite cannot be hot-restored under a running
   process and Chroma will not pick up new files until the process
   restarts.
   ```bash
   cd /srv/storage/atocore/app/deploy/dalidou
   docker compose down
   docker compose ps   # atocore should be Exited/gone
   ```

### Run the restore

Use a one-shot container that reuses the live service's volume
mounts so every path (`db_path`, `chroma_path`, backup dir) resolves
to the same place the main service would see:

```bash
cd /srv/storage/atocore/app/deploy/dalidou
docker compose run --rm --entrypoint python atocore \
  -m atocore.ops.backup restore \
  $STAMP \
  --confirm-service-stopped
```

Output is a JSON document. The critical fields:

- `pre_restore_snapshot`: stamp of the safety snapshot of live
  state taken right before the restore. **Write this down.** If
  the restore was the wrong call, this is how you roll it back.
- `db_restored`: should be `true`
- `registry_restored`: `true` if the backup captured a registry
- `chroma_restored`: `true` if the backup captured a chroma tree
  and `include_chroma` resolved to true (default)
- `restored_integrity_ok`: **must be `true`** — if this is false,
  STOP and do not start the service; investigate the integrity
  error first. The restored file is still on disk but untrusted.
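These field checks can be automated. A hedged sketch (`check_restore_output` is a hypothetical helper, not in the repo): it reads the restore JSON on stdin, prints the `pre_restore_snapshot` stamp so it can be captured into a variable, and fails unless `db_restored` and `restored_integrity_ok` are both true:

```shell
# Hypothetical helper: verify restore JSON and echo the rollback anchor stamp.
check_restore_output() {
  python3 -c '
import json, sys
d = json.load(sys.stdin)
print(d.get("pre_restore_snapshot", ""))
ok = d.get("db_restored") is True and d.get("restored_integrity_ok") is True
sys.exit(0 if ok else 1)
'
}

# Example: capture the rollback anchor while failing fast on a bad restore.
# PRE=$(docker compose run ... restore $STAMP --confirm-service-stopped | check_restore_output) || exit 1
```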
### Controlling the restore

The CLI supports a few flags for finer control:

- `--no-pre-snapshot` skips the pre-restore safety snapshot. Use
  this only when you know you have another rollback path.
- `--no-chroma` restores only SQLite + registry, leaving the
  current Chroma dir alone. Useful if Chroma is consistent but
  SQLite needs a rollback.
- `--chroma` forces Chroma restoration even if the metadata
  doesn't clearly indicate the snapshot has it (rare).

### Chroma restore and bind-mounted volumes

The Chroma dir on Dalidou is a bind-mounted Docker volume. The
restore cannot `rmtree` the destination (you can't unlink a mount
point — it raises `OSError [Errno 16] Device or resource busy`),
so the function clears the dir's contents and uses
`copytree(dirs_exist_ok=True)` to copy the snapshot back in. The
regression test `test_restore_chroma_does_not_unlink_destination_directory`
in `tests/test_backup.py` captures the destination inode before
and after restore and asserts it's stable — the same invariant
that protects the bind mount.

This was discovered during the first real Dalidou restore drill
on 2026-04-09. If you see a new restore failure with
`Device or resource busy`, something has regressed this fix.

### Restart AtoCore

```bash
cd /srv/storage/atocore/app/deploy/dalidou
docker compose up -d
# Wait for /health to come up
for i in $(seq 1 10); do
  curl -fsS http://127.0.0.1:8100/health \
    && break || { echo "not ready ($i/10)"; sleep 3; }
done
```

**Note on build_sha after restore:** The one-shot `docker compose run`
container does not carry the build provenance env vars that `deploy.sh`
exports at deploy time. After a restore, `/health` will report
`build_sha: "unknown"` until you re-run `deploy.sh` or manually
re-deploy. This is cosmetic — the data is correctly restored — but if
you need `build_sha` to be accurate, run a redeploy after the restore:

```bash
cd /srv/storage/atocore/app
bash deploy/dalidou/deploy.sh
```

### Post-restore verification

```bash
# 1. Service is healthy
curl -fsS http://127.0.0.1:8100/health | jq .

# 2. Stats look right
curl -fsS http://127.0.0.1:8100/stats | jq .

# 3. Project registry loads
curl -fsS http://127.0.0.1:8100/projects | jq '.projects | length'

# 4. A known-good context query returns non-empty results
curl -fsS -X POST http://127.0.0.1:8100/context/build \
  -H "Content-Type: application/json" \
  -d '{"prompt": "what is p05 about", "project": "p05-interferometer"}' | jq '.chunks_used'
```

If any of these are wrong, the restore is bad. Roll back using the
pre-restore safety snapshot whose stamp you recorded from the
restore output. The rollback is the same procedure — stop the
service and restore that stamp:

```bash
docker compose down
docker compose run --rm --entrypoint python atocore \
  -m atocore.ops.backup restore \
  $PRE_RESTORE_SNAPSHOT_STAMP \
  --confirm-service-stopped \
  --no-pre-snapshot
docker compose up -d
```

(`--no-pre-snapshot` because the rollback itself doesn't need one;
you already have the original snapshot as a fallback if everything
goes sideways.)

### Restore drill

The restore is exercised at three levels:

1. **Unit tests.** `tests/test_backup.py` has seven restore tests
   (refuse-without-confirm, invalid backup, full round-trip,
   Chroma round-trip, inode-stability regression, WAL sidecar
   cleanup, skip-pre-snapshot). These run in CI on every commit.
2. **Module-level round-trip.**
   `test_restore_round_trip_reverses_post_backup_mutations` is
   the canonical drill in code form: seed baseline, snapshot,
   mutate, restore, assert mutation reversed + baseline survived
   + pre-restore snapshot captured the mutation.
3. **Live drill on Dalidou.** Periodically run the full procedure
   against the real service with a disposable drill-marker
   memory (created via `POST /memory` with `memory_type=episodic`
   and `project=drill`), following the sequence above and then
   verifying the marker is gone afterward via
   `GET /memory?project=drill`. The first such drill on
   2026-04-09 surfaced the bind-mount bug; future runs
   primarily exist to verify the fix stays fixed.

Run the live drill:

- **Before** enabling any new write-path automation (auto-capture,
  automated ingestion, reinforcement sweeps).
- **After** any change to `src/atocore/ops/backup.py` or to
  schema migrations in `src/atocore/models/database.py`.
- **After** a Dalidou OS upgrade or docker version bump.
- **At least once per quarter** as a standing operational check.
- **After any incident** that touched the storage layer.

Record each drill run (stamp, pre-restore snapshot stamp, pass/fail,
any surprises) somewhere durable — a line in the project journal
or a git commit message is enough. A drill you ran once and never
again is barely more than a drill you never ran.

## Retention policy

- **Last 7 daily backups**: kept verbatim
- **Last 4 weekly backups** (Sunday): kept verbatim
- **Last 6 monthly backups** (1st of month): kept verbatim
- **Anything older**: deleted

The retention job is **not yet implemented** and is tracked as a
follow-up. Until then, the snapshots directory grows monotonically.
A simple cron-based cleanup script is the next step:

```cron
0 4 * * * /srv/storage/atocore/scripts/cleanup-old-backups.sh
```
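Until `cleanup-old-backups.sh` exists, the retention rules above can be sketched as a dry-run shell function. This is a hypothetical implementation, not the shipped script: `prune_snapshots` prints the snapshot stamps that fall outside the 7-daily / 4-weekly / 6-monthly policy instead of deleting them (swap the final `echo` for `rm -rf --` only once you trust it). It assumes GNU `date -d` and the `YYYYMMDDTHHMMSSZ` stamp naming:

```shell
# Hypothetical dry-run of the retention policy: print stamps that would be deleted.
prune_snapshots() {
  local dir="$1" all daily weekly monthly keep s
  all=$(ls "$dir" | sort)                              # stamps sort chronologically
  daily=$(printf '%s\n' "$all" | tail -n 7)            # last 7 daily
  weekly=$(printf '%s\n' "$all" | while read -r s; do  # last 4 Sundays
    [ "$(date -u -d "${s:0:4}-${s:4:2}-${s:6:2}" +%u)" = 7 ] && printf '%s\n' "$s" || true
  done | tail -n 4)
  monthly=$(printf '%s\n' "$all" | awk 'substr($0,7,2)=="01"' | tail -n 6)  # last 6 month-firsts
  keep=$(printf '%s\n%s\n%s\n' "$daily" "$weekly" "$monthly" | sort -u)
  printf '%s\n' "$all" | while read -r s; do
    [ -n "$s" ] || continue
    printf '%s\n' "$keep" | grep -Fqx "$s" || echo "$s"   # not retained: would be deleted
  done
}

# Dry run against the real layout:
# prune_snapshots /srv/storage/atocore/backups/snapshots
```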
## Common failure modes and what to do about them

| Symptom | Likely cause | Action |
|---|---|---|
| `db_integrity_check_failed` on validation | SQLite snapshot copied while a write was in progress, or disk corruption | Take a fresh backup and validate again. If it fails twice, suspect the underlying disk. |
| `registry_invalid_json` | Registry was being edited at backup time | Take a fresh backup. The registry is small so this is cheap. |
| Restore: `restored_integrity_ok: false` | Source snapshot was itself corrupt (validation should have caught it — file a bug) or copy was interrupted mid-write | Do NOT start the service. Validate the snapshot directly with `python -m atocore.ops.backup validate <STAMP>`, try a different older snapshot, or roll back to the pre-restore safety snapshot. |
| Restore: `OSError [Errno 16] Device or resource busy` on Chroma | Old code tried to `rmtree` the Chroma mount point. Fixed on 2026-04-09 and guarded by `test_restore_chroma_does_not_unlink_destination_directory` | Ensure you're running a build from 2026-04-09 or later; if you need to work around an older build, use `--no-chroma` and restore Chroma contents manually. |
| `chroma_snapshot_missing` after a restore | Snapshot was DB-only | Either rebuild via fresh ingestion or restore an older snapshot that includes Chroma. |
| Service won't start after restore | Permissions wrong on the restored files | Re-run `chown 1000:1000` (or whatever the atocore container user is) on the data dir. |
| `/stats` returns 0 documents after restore | The SQL store was restored but the source paths in `source_documents` don't match the current Dalidou paths | The backup came from a different deployment. Don't trust this restore — it's pulling from the wrong layout. |
| Drill marker still present after restore | Wrong stamp, service still writing during `docker compose down`, or the restore JSON didn't report `db_restored: true` | Roll back via the pre-restore safety snapshot and retry with the correct source snapshot. |

## Open follow-ups (not yet implemented)

Tracked separately in `docs/next-steps.md` — the list below is the
backup-specific subset.

1. **Retention cleanup script**: see the cron entry above. The
   snapshots directory grows monotonically until this exists.
2. **Off-Dalidou backup target**: currently snapshots live on the
   same disk as the live data. A real disaster-recovery story
   needs at least one snapshot on a different physical machine.
   The simplest first step is a periodic `rsync` to the user's
   laptop or to another server.
3. **Backup encryption**: snapshots contain raw SQLite and JSON.
   Consider age/gpg encryption if backups will be shipped off-site.
4. **Automatic post-backup validation**: today the validator must
   be invoked manually. The `create_runtime_backup` function
   should call `validate_backup` on its own output and refuse to
   declare success if validation fails.
5. **Chroma backup is currently a full directory copy** every time.
   For large vector stores this gets expensive. A future
   improvement would be incremental snapshots via filesystem-level
   snapshotting (LVM, btrfs, ZFS).

**Done** (kept for historical reference):

- ~~Implement `restore_runtime_backup()` as a proper module
  function so the restore isn't a manual `sudo cp` dance~~ —
  landed 2026-04-09 in commit 3362080, followed by the
  Chroma bind-mount fix from the first real drill.

## Quickstart cheat sheet

```bash
# Daily backup (DB + registry only — fast)
curl -fsS -X POST http://127.0.0.1:8100/admin/backup \
  -H "Content-Type: application/json" -d '{}'

# Weekly backup (DB + registry + Chroma — slower, holds ingestion lock)
curl -fsS -X POST http://127.0.0.1:8100/admin/backup \
  -H "Content-Type: application/json" -d '{"include_chroma": true}'

# List backups
curl -fsS http://127.0.0.1:8100/admin/backup | jq '.backups[].stamp'

# Validate the most recent backup
LATEST=$(curl -fsS http://127.0.0.1:8100/admin/backup | jq -r '.backups[-1].stamp')
curl -fsS http://127.0.0.1:8100/admin/backup/$LATEST/validate | jq .

# Full restore (service must be stopped first)
cd /srv/storage/atocore/app/deploy/dalidou
docker compose down
docker compose run --rm --entrypoint python atocore \
  -m atocore.ops.backup restore $STAMP --confirm-service-stopped
docker compose up -d

# Live drill: exercise the full create -> mutate -> restore flow
# against the running service. The marker memory uses
# memory_type=episodic (valid types: identity, preference, project,
# episodic, knowledge, adaptation) and project=drill so it's easy
# to find via GET /memory?project=drill before and after.
#
# See the "Restore drill" section above for the full sequence.
STAMP=$(curl -fsS -X POST http://127.0.0.1:8100/admin/backup \
  -H 'Content-Type: application/json' \
  -d '{"include_chroma": true}' | jq -r '.backup_root' | awk -F/ '{print $NF}')

curl -fsS -X POST http://127.0.0.1:8100/memory \
  -H 'Content-Type: application/json' \
  -d '{"memory_type":"episodic","content":"DRILL-MARKER","project":"drill","confidence":1.0}'

cd /srv/storage/atocore/app/deploy/dalidou
docker compose down
docker compose run --rm --entrypoint python atocore \
  -m atocore.ops.backup restore $STAMP --confirm-service-stopped
docker compose up -d

# Marker should be gone:
curl -fsS 'http://127.0.0.1:8100/memory?project=drill' | jq .
```
docs/backup-strategy.md (new file, 80 lines)
# AtoCore Backup Strategy

## Purpose

This document describes the current backup baseline for the Dalidou-hosted
AtoCore machine store.

The immediate goal is not full disaster-proof automation yet. The goal is to
have one safe, repeatable way to snapshot the most important writable state.

## Current Backup Baseline

Today, the safest hot-backup target is:

- SQLite machine database
- project registry JSON
- backup metadata describing what was captured

This is now supported by:

- `python -m atocore.ops.backup`

## What The Script Captures

The backup command creates a timestamped snapshot under:

- `ATOCORE_BACKUP_DIR/snapshots/<timestamp>/`

It currently writes:

- `db/atocore.db`
  - created with SQLite's backup API
- `config/project-registry.json`
  - copied if it exists
- `backup-metadata.json`
  - timestamp, paths, and backup notes
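Because the timestamped stamp names sort chronologically, finding the most recent snapshot under that layout is a one-liner. A minimal sketch (`latest_snapshot` is illustrative, not part of the repo):

```shell
# Hypothetical helper: print the newest snapshot stamp under a backup root.
latest_snapshot() {
  ls "$1/snapshots" | sort | tail -n 1
}

# Usage on Dalidou: latest_snapshot /srv/storage/atocore/backups
```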
## What It Does Not Yet Capture

The current script does not hot-backup Chroma.

That is intentional.

For now, Chroma should be treated as one of:

- rebuildable derived state
- or something that needs a deliberate cold snapshot/export workflow

Until that workflow exists, do not rely on ad hoc live file copies of the
vector store while the service is actively writing.

## Dalidou Use

On Dalidou, the canonical machine paths are:

- DB: `/srv/storage/atocore/data/db/atocore.db`
- registry: `/srv/storage/atocore/config/project-registry.json`
- backups: `/srv/storage/atocore/backups`

So a normal backup run should happen on Dalidou itself, not from another
machine.

## Next Backup Improvements

1. decide Chroma policy clearly (rebuild vs cold snapshot vs export)
2. add a simple scheduled backup routine on Dalidou
3. add retention policy for old snapshots
4. optionally add a restore validation check

## Healthy Rule

Do not design around syncing the live machine DB/vector store between machines.

Back up the canonical Dalidou state. Restore from Dalidou state. Keep OpenClaw
as a client of AtoCore, not a storage peer.
docs/current-state.md (new file, 275 lines)
# AtoCore Current State
|
||||
|
||||
## Status Summary
|
||||
|
||||
AtoCore is no longer just a proof of concept. The local engine exists, the
|
||||
correctness pass is complete, Dalidou now hosts the canonical runtime and
|
||||
machine-storage location, and the T420/OpenClaw side now has a safe read-only
|
||||
path to consume AtoCore. The live corpus is no longer just self-knowledge: it
|
||||
now includes a first curated ingestion batch for the active projects.
|
||||
|
||||
## Phase Assessment
|
||||
|
||||
- completed
|
||||
- Phase 0
|
||||
- Phase 0.5
|
||||
- Phase 1
|
||||
- baseline complete
|
||||
- Phase 2
|
||||
- Phase 3
|
||||
- Phase 5
|
||||
- Phase 7
|
||||
- Phase 9 (Commits A/B/C: capture, reinforcement, extractor + review queue)
|
||||
- partial
|
||||
- Phase 4
|
||||
- Phase 8
|
||||
- not started
|
||||
- Phase 6
|
||||
- Phase 10
|
||||
- Phase 11
|
||||
- Phase 12
|
||||
- Phase 13
|
||||
|
||||
## What Exists Today
|
||||
|
||||
- ingestion pipeline
|
||||
- parser and chunker
|
||||
- SQLite-backed memory and project state
|
||||
- vector retrieval
|
||||
- context builder
|
||||
- API routes for query, context, health, and source status
|
||||
- project registry and per-project refresh foundation
|
||||
- project registration lifecycle:
|
||||
- template
|
||||
- proposal preview
|
||||
- approved registration
|
||||
- safe update of existing project registrations
|
||||
- refresh
|
||||
- implementation-facing architecture notes for:
|
||||
- engineering knowledge hybrid architecture
|
||||
- engineering ontology v1
|
||||
- env-driven storage and deployment paths
|
||||
- Dalidou Docker deployment foundation
|
||||
- initial AtoCore self-knowledge corpus ingested on Dalidou
|
||||
- T420/OpenClaw read-only AtoCore helper skill
|
||||
- full active-project markdown/text corpus wave for:
|
||||
- `p04-gigabit`
|
||||
- `p05-interferometer`
|
||||
- `p06-polisher`
|
||||
|
||||
## What Is True On Dalidou

- deployed repo location:
  - `/srv/storage/atocore/app`
- canonical machine DB location:
  - `/srv/storage/atocore/data/db/atocore.db`
- canonical vector store location:
  - `/srv/storage/atocore/data/chroma`
- source input locations:
  - `/srv/storage/atocore/sources/vault`
  - `/srv/storage/atocore/sources/drive`

The service and storage foundation are live on Dalidou.

The machine-data host is real and canonical.

The project registry is now also persisted in a canonical mounted config path on
Dalidou:

- `/srv/storage/atocore/config/project-registry.json`

The content corpus is partially populated.

The Dalidou instance already contains:

- AtoCore ecosystem and hosting docs
- current-state and OpenClaw integration docs
- Master Plan V3
- Build Spec V1
- trusted project-state entries for `atocore`
- full staged project markdown/text corpora for:
  - `p04-gigabit`
  - `p05-interferometer`
  - `p06-polisher`
- curated repo-context docs for:
  - `p05`: `Fullum-Interferometer`
  - `p06`: `polisher-sim`
- trusted project-state entries for:
  - `p04-gigabit`
  - `p05-interferometer`
  - `p06-polisher`

Current live stats after the full active-project wave are now far beyond the
initial seed stage:

- more than `1,100` source documents
- more than `20,000` chunks
- a matching vector count

The broader long-term corpus is not yet fully populated. Wider project and
vault ingestion remains a deliberate next step rather than something already
completed, but the corpus is now meaningfully seeded beyond AtoCore's own docs.

For human-readable quality review, the current staged project markdown corpus is
primarily visible under:

- `/srv/storage/atocore/sources/vault/incoming/projects`

This staged area is now useful for review because it contains the markdown/text
project docs that were actually ingested for the full active-project wave.

It is important to read this staged area correctly:

- it is a readable ingestion input layer
- it is not the final machine-memory representation itself
- seeing familiar PKM-style notes there is expected
- the machine-processed intelligence lives in the DB, chunks, vectors, memory,
  trusted project state, and context-builder outputs

## What Is True On The T420

- SSH access is working
- OpenClaw workspace inspected at `/home/papa/clawd`
- OpenClaw's own memory system remains unchanged
- a read-only AtoCore integration skill exists in the workspace:
  - `/home/papa/clawd/skills/atocore-context/`
- the T420 can successfully reach Dalidou AtoCore over network/Tailscale
- fail-open behavior has been verified for the helper path
- OpenClaw can now seed AtoCore in two distinct ways:
  - project-scoped memory entries
  - staged document ingestion into the retrieval corpus
- the helper now supports the practical registered-project lifecycle:
  - `projects`
  - `project-template`
  - `propose-project`
  - `register-project`
  - `update-project`
  - `refresh-project`
- the helper now also supports the first organic routing layer:
  - `detect-project "<prompt>"`
  - `auto-context "<prompt>" [budget] [project]`
- OpenClaw can now default to AtoCore for project-knowledge questions without
  requiring explicit helper commands from the human every time

## What Exists In Memory vs Corpus

These remain separate, and that is intentional.

In `/memory`:

- project-scoped curated memories now exist for:
  - `p04-gigabit`: 5 memories
  - `p05-interferometer`: 6 memories
  - `p06-polisher`: 8 memories

These are curated summaries and extracted stable project signals.

In `source_documents` / the retrieval corpus:

- full project markdown/text corpora are now present for the active project set
- retrieval is no longer limited to AtoCore self-knowledge
- the current corpus is broad enough that ranking quality matters more than
  corpus presence alone
- underspecified prompts can still pull in historical or archive material, so
  project-aware routing and better ranking remain important

The source refresh model now has a concrete foundation in code:

- a project registry file defines known project ids, aliases, and ingest roots
- the API can list registered projects
- the API can return a registration template
- the API can preview a registration without mutating state
- the API can persist an approved registration
- the API can update an existing registered project without changing its canonical id
- the API can refresh one registered project at a time

This lifecycle is now coherent end to end for normal use.

The first live update passes on existing registered projects have now been
verified against `p04-gigabit` and `p05-interferometer`:

- the registration description can be updated safely
- the canonical project id remains unchanged
- refresh still behaves cleanly after the update
- `context/build` still returns useful project-specific context afterward

## Reliability Baseline

The runtime has now been hardened in a few practical ways:

- SQLite connections use a configurable busy timeout
- SQLite uses WAL mode to reduce transient lock pain under normal concurrent use
- project registry writes are atomic file replacements rather than in-place rewrites
- a full runtime backup and restore path now exists and has been exercised on
  live Dalidou:
  - SQLite (hot online backup via `conn.backup()`)
  - project registry (file copy)
  - Chroma vector store (cold directory copy under `exclusive_ingestion()`)
  - backup metadata
  - `restore_runtime_backup()` with a CLI entry point
    (`python -m atocore.ops.backup restore <STAMP> --confirm-service-stopped`),
    a pre-restore safety snapshot for rollback, WAL/SHM sidecar cleanup, and
    `PRAGMA integrity_check` on the restored file
  - the first live drill on 2026-04-09 surfaced and fixed a Chroma restore bug
    on Docker bind-mounted volumes (`shutil.rmtree` on a mount point); a
    regression test now asserts the destination inode is stable across restore
- deploy provenance is visible end-to-end:
  - `/health` reports `build_sha`, `build_time`, and `build_branch` from env
    vars wired by `deploy.sh`
  - `deploy.sh` Step 6 verifies the live `build_sha` matches the just-built
    commit (exit code 6 on drift), so "live is current?" can be answered
    precisely, not just by `__version__`
  - `deploy.sh` Step 1.5 detects that the script itself changed in the pulled
    commit and re-execs into the fresh copy, so the deploy never silently runs
    the old script against new source

This does not eliminate every concurrency edge, but it materially improves the
current operational baseline.
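A few of these hardening moves can be sketched in plain Python. This is an illustrative sketch only — the function names (`open_db`, `write_registry`, `hot_backup`, `restore_dir_contents`) are hypothetical and do not reflect the real AtoCore module layout:

```python
import json
import os
import shutil
import sqlite3
import tempfile

def open_db(path: str) -> sqlite3.Connection:
    # Configurable busy timeout plus WAL journal mode.
    conn = sqlite3.connect(path, timeout=5.0)
    conn.execute("PRAGMA journal_mode=WAL")
    return conn

def write_registry(path: str, registry: dict) -> None:
    # Atomic replacement: write a sibling temp file, then os.replace(),
    # which is atomic on POSIX, so readers never see a partial file.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(registry, f, indent=2)
    os.replace(tmp, path)

def hot_backup(src_path: str, dest_path: str) -> None:
    # Hot online SQLite backup; works while the service holds src open.
    with sqlite3.connect(src_path) as src, sqlite3.connect(dest_path) as dst:
        src.backup(dst)
    # Verify the copy the same way a restore path would.
    ok = sqlite3.connect(dest_path).execute("PRAGMA integrity_check").fetchone()[0]
    assert ok == "ok"

def restore_dir_contents(src: str, dest: str) -> None:
    # Clear the *contents* of dest, never dest itself: rmtree on a Docker
    # bind-mount point is the class of bug the 2026-04-09 drill surfaced.
    # The dest inode stays stable across the restore.
    for name in os.listdir(dest):
        p = os.path.join(dest, name)
        (shutil.rmtree if os.path.isdir(p) else os.remove)(p)
    for name in os.listdir(src):
        s, d = os.path.join(src, name), os.path.join(dest, name)
        (shutil.copytree if os.path.isdir(s) else shutil.copy2)(s, d)
```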

In `Trusted Project State`:

- each active seeded project now has a conservative trusted-state set
- promoted facts cover:
  - summary
  - core architecture or boundary decision
  - key constraints
  - next focus

This separation is healthy:

- memory stores distilled project facts
- corpus stores the underlying retrievable documents

## Immediate Next Focus

1. ~~Re-run the full backup/restore drill~~ — DONE 2026-04-11, full pass
   (db, registry, chroma, integrity all true)
2. ~~Turn on auto-capture of Claude Code sessions in conservative mode~~ —
   DONE 2026-04-11, Stop hook wired via `deploy/hooks/capture_stop.py` →
   `POST /interactions` with `reinforce=false`; kill switch via
   `ATOCORE_CAPTURE_DISABLED=1`
3. Run a short real-use pilot with auto-capture on, verify interactions are
   landing in Dalidou, and review quality
4. Use the new T420-side organic routing layer in real OpenClaw workflows
5. Tighten retrieval quality for the now fully ingested active project corpora
6. Move to Wave 2 trusted-operational ingestion instead of blindly widening the
   raw corpus further
7. Keep the new engineering-knowledge architecture docs as implementation
   guidance while avoiding premature schema work
8. Expand the remaining boring operations baseline:
   - retention policy cleanup script
   - off-Dalidou backup target (rsync or similar)
9. Only later consider write-back, reflection, or deeper autonomous behaviors

See also:

- [ingestion-waves.md](C:/Users/antoi/ATOCore/docs/ingestion-waves.md)
- [master-plan-status.md](C:/Users/antoi/ATOCore/docs/master-plan-status.md)

## Guiding Constraints

- bad memory is worse than no memory
- trusted project state must remain highest priority
- human-readable sources and machine storage stay separate
- OpenClaw integration must not degrade OpenClaw baseline behavior
270
docs/dalidou-deployment.md
Normal file
@@ -0,0 +1,270 @@
# Dalidou Deployment

## Purpose

Deploy AtoCore on Dalidou as the canonical runtime and machine-memory host.

## Model

- Dalidou hosts the canonical AtoCore service.
- OpenClaw on the T420 consumes AtoCore over the network/Tailscale API.
- `sources/vault` and `sources/drive` are read-only inputs by convention.
- SQLite/Chroma machine state stays on Dalidou and is not treated as a sync peer.
- The app and machine-storage host can be live before the long-term content
  corpus is fully populated.

## Directory layout

```text
/srv/storage/atocore/
  app/        # deployed repo checkout
  data/
    db/
    chroma/
    cache/
    tmp/
  sources/
    vault/
    drive/
  logs/
  backups/
  run/
```

## Compose workflow

The compose definition lives in:

```text
deploy/dalidou/docker-compose.yml
```

The Dalidou environment file should be copied to:

```text
deploy/dalidou/.env
```

starting from:

```text
deploy/dalidou/.env.example
```

## First-time deployment steps

1. Place the repository under `/srv/storage/atocore/app` — ideally as a
   proper git clone so future updates can be pulled, not as a static
   snapshot:

   ```bash
   sudo git clone http://dalidou:3000/Antoine/ATOCore.git \
     /srv/storage/atocore/app
   ```

2. Create the canonical directories listed above.
3. Copy `deploy/dalidou/.env.example` to `deploy/dalidou/.env`.
4. Adjust the source paths if your AtoVault/AtoDrive mirrors live elsewhere.
5. Run:

   ```bash
   cd /srv/storage/atocore/app/deploy/dalidou
   docker compose up -d --build
   ```

6. Validate:

   ```bash
   curl http://127.0.0.1:8100/health
   curl http://127.0.0.1:8100/sources
   ```

## Updating a running deployment

**Use `deploy/dalidou/deploy.sh` for every code update.** It is the
one-shot sync script that:

- fetches the latest main from Gitea into `/srv/storage/atocore/app`
- (if the app dir is not a git checkout) backs it up as
  `<dir>.pre-git-<timestamp>` and re-clones
- rebuilds the container image
- restarts the container
- waits for `/health` to respond
- compares the reported `code_version` against the `__version__` in the
  freshly pulled source, and exits non-zero if they don't match
  (deployment drift detection)

```bash
# Normal update from main
bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh

# Deploy a specific branch or tag
ATOCORE_BRANCH=codex/some-feature \
  bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh

# Dry run: show what would happen without touching anything
ATOCORE_DEPLOY_DRY_RUN=1 \
  bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh

# Deploy from a remote host (e.g. the laptop) using the Tailscale
# or LAN address instead of loopback
ATOCORE_GIT_REMOTE=http://192.168.86.50:3000/Antoine/ATOCore.git \
  bash /srv/storage/atocore/app/deploy/dalidou/deploy.sh
```

The script is idempotent and safe to re-run. It never touches the
database directly — schema migrations are applied automatically at
service startup by the lifespan handler in `src/atocore/main.py`,
which calls `init_db()` (which in turn runs the ALTER TABLE
statements in `_apply_migrations`).

### Troubleshooting hostname resolution

`deploy.sh` defaults `ATOCORE_GIT_REMOTE` to
`http://127.0.0.1:3000/Antoine/ATOCore.git` (loopback) because the
hostname "dalidou" doesn't reliably resolve on the host itself —
the first real Dalidou deploy hit exactly this on 2026-04-08. If
you need to override (e.g. running deploy.sh from a laptop against
the Dalidou LAN), set `ATOCORE_GIT_REMOTE` explicitly.

The same applies to `scripts/atocore_client.py`: its default
`ATOCORE_BASE_URL` is `http://dalidou:8100` for remote callers, but
when running the client on Dalidou itself (or inside the container
via `docker exec`), override to loopback:

```bash
ATOCORE_BASE_URL=http://127.0.0.1:8100 \
  python scripts/atocore_client.py health
```

If you see `{"status": "unavailable", "fail_open": true}` from the
client, the first thing to check is whether the base URL resolves
from where you're running the client.
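The fail-open contract itself can be sketched in a few lines. This is a minimal illustrative sketch, assuming only the sentinel payload shown above; `fetch_health` is a hypothetical name and the real client's internals may differ:

```python
import json
import urllib.error
import urllib.request

def fetch_health(base_url: str, timeout: float = 3.0) -> dict:
    # Fail open: any network or decode problem degrades to the sentinel
    # payload instead of raising, so the calling agent keeps working
    # without AtoCore rather than crashing.
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, json.JSONDecodeError):
        return {"status": "unavailable", "fail_open": True}
```

The useful property is that callers never need a try/except of their own: an unreachable host and a healthy host both return a dict, and the caller just branches on `fail_open`.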

### The deploy.sh self-update race

When `deploy.sh` itself changes in the commit being pulled, the
first run after the update is still executing the *old* script from
the bash process's in-memory copy. `git reset --hard` updates the
file on disk, but the running bash has already loaded the
instructions. On 2026-04-09 this silently shipped an "unknown"
`build_sha` because the old Step 2 (which predated the env-var
export) ran against fresh source.

`deploy.sh` now detects this: Step 1.5 compares the sha1 of `$0`
(the running script) against the sha1 of
`$APP_DIR/deploy/dalidou/deploy.sh` (the on-disk copy) after the
git reset. If they differ, it sets `ATOCORE_DEPLOY_REEXECED=1` and
`exec`s the fresh copy so the rest of the deploy runs under the new
script. The sentinel env var prevents infinite recursion.

You'll see this in the logs as:

```text
==> Step 1.5: deploy.sh changed in the pulled commit; re-exec'ing
==> running script hash: <old>
==> on-disk script hash: <new>
==> re-exec -> /srv/storage/atocore/app/deploy/dalidou/deploy.sh
```

To opt out (for debugging, for example), pre-set
`ATOCORE_DEPLOY_REEXECED=1` before invoking `deploy.sh` and the
self-update guard will be skipped.
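The shape of that Step 1.5 guard, sketched in Python for clarity (the real guard is bash inside `deploy.sh`; `maybe_reexec` and `sha1_of` are hypothetical names):

```python
import hashlib
import os
import sys

def sha1_of(path: str) -> str:
    with open(path, "rb") as f:
        return hashlib.sha1(f.read()).hexdigest()

def maybe_reexec(running_script: str, on_disk_script: str) -> bool:
    # The sentinel env var is what prevents infinite recursion:
    # the re-exec'ed copy sees it set and falls through.
    if os.environ.get("ATOCORE_DEPLOY_REEXECED") == "1":
        return False
    if sha1_of(running_script) != sha1_of(on_disk_script):
        env = dict(os.environ, ATOCORE_DEPLOY_REEXECED="1")
        # Replace the current process with the fresh on-disk copy.
        os.execve(on_disk_script, [on_disk_script, *sys.argv[1:]], env)
    return False
```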

### Deployment drift detection

`/health` reports drift signals at three increasing levels of
precision:

| Field | Source | Precision | When to use |
|---|---|---|---|
| `version` / `code_version` | `atocore.__version__` (manual bump) | coarse — same value across many commits | quick smoke check that the right *release* is running |
| `build_sha` | `ATOCORE_BUILD_SHA` env var, set by `deploy.sh` per build | precise — changes per commit | the canonical drift signal |
| `build_time` / `build_branch` | same env-var path | per-build | forensics when multiple branches are in flight |

The **precise** check (run on the laptop or any host that can curl
the live service AND has the source repo at hand):

```bash
# What's actually running on Dalidou
LIVE_SHA=$(curl -fsS http://dalidou:8100/health | grep -o '"build_sha":"[^"]*"' | cut -d'"' -f4)

# What the deployed branch tip should be
EXPECTED_SHA=$(cd /srv/storage/atocore/app && git rev-parse HEAD)

# Compare
if [ "$LIVE_SHA" = "$EXPECTED_SHA" ]; then
  echo "live is current at $LIVE_SHA"
else
  echo "DRIFT: live $LIVE_SHA vs expected $EXPECTED_SHA"
  echo "run deploy.sh to sync"
fi
```

The `deploy.sh` script does exactly this comparison automatically
in its post-deploy verification step (Step 6) and exits non-zero
on mismatch. So the **simplest drift check** is just to run
`deploy.sh` — if there's nothing to deploy, it succeeds quickly;
if the live service is stale, it deploys and verifies.

If `/health` reports `build_sha: "unknown"`, the running container
was started without `deploy.sh` (probably via `docker compose up`
directly), and the build provenance was never recorded. Re-run
via `deploy.sh` to fix.

The coarse `code_version` check is still useful as a quick visual
sanity check — bumping `__version__` from `0.2.0` to `0.3.0`
signals a meaningful release boundary, even if the precise
`build_sha` is what tools should compare against:

```bash
# Quick sanity check (coarse)
curl -s http://127.0.0.1:8100/health | grep -o '"code_version":"[^"]*"'
grep '__version__' /srv/storage/atocore/app/src/atocore/__init__.py
```

### Schema migrations on redeploy

When updating from an older `__version__`, the first startup after
the redeploy runs the idempotent ALTER TABLE migrations in
`_apply_migrations`. For a pre-0.2.0 → 0.2.0 upgrade the migrations
add these columns to existing tables (all with safe defaults, so no
data is touched):

- `memories.project TEXT DEFAULT ''`
- `memories.last_referenced_at DATETIME`
- `memories.reference_count INTEGER DEFAULT 0`
- `interactions.response TEXT DEFAULT ''`
- `interactions.memories_used TEXT DEFAULT '[]'`
- `interactions.chunks_used TEXT DEFAULT '[]'`
- `interactions.client TEXT DEFAULT ''`
- `interactions.session_id TEXT DEFAULT ''`
- `interactions.project TEXT DEFAULT ''`

Plus new indexes on the new columns. No row data is modified. The
migration is safe to run against a database that already has the
columns — the `_column_exists` check makes each ALTER a no-op in
that case.
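The idempotent-ALTER pattern can be sketched like this. The real `_apply_migrations` and `_column_exists` live in AtoCore's source; the helper signatures below are assumptions for illustration:

```python
import sqlite3

def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    # PRAGMA table_info returns one row per column; index 1 is the name.
    return any(row[1] == column
               for row in conn.execute(f"PRAGMA table_info({table})"))

def add_column_if_missing(conn: sqlite3.Connection, table: str, column_ddl: str) -> None:
    # column_ddl is e.g. "project TEXT DEFAULT ''"; the first token is the name.
    name = column_ddl.split()[0]
    if not column_exists(conn, table, name):
        conn.execute(f"ALTER TABLE {table} ADD COLUMN {column_ddl}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY)")
add_column_if_missing(conn, "memories", "project TEXT DEFAULT ''")
add_column_if_missing(conn, "memories", "project TEXT DEFAULT ''")  # second run is a no-op
```

Because each ALTER is guarded by the existence check, re-running the whole migration list on an already-migrated database is harmless.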

Back up the database before any redeploy (via `POST /admin/backup`)
if you want a pre-upgrade snapshot. The migration is additive and
reversible by restoring the snapshot.

## Deferred

- backup automation
- restore/snapshot tooling
- reverse proxy / TLS exposure
- automated source ingestion job
- OpenClaw client wiring

## Current Reality Check

When this deployment is first brought up, the service may be healthy before the
real corpus has been ingested.

That means:

- AtoCore the system can already be hosted on Dalidou
- the canonical machine-data location can already be on Dalidou
- but the live knowledge/content corpus may still be empty or only partially
  loaded until source ingestion is run
61
docs/dalidou-storage-migration.md
Normal file
@@ -0,0 +1,61 @@
# Dalidou Storage Migration

## Goal

Establish Dalidou as the canonical AtoCore host while keeping human-readable
source layers separate from machine operational storage.

## Canonical layout

```text
/srv/atocore/
  app/        # git checkout of this repository
  data/       # machine operational state
    db/
      atocore.db
    chroma/
    cache/
    tmp/
  sources/
    vault/    # AtoVault input, read-only by convention
    drive/    # AtoDrive input, read-only by convention
  logs/
  backups/
  run/
  config/
    .env
```

## Environment variables

Suggested Dalidou values:

```bash
ATOCORE_ENV=production
ATOCORE_DATA_DIR=/srv/atocore/data
ATOCORE_DB_DIR=/srv/atocore/data/db
ATOCORE_CHROMA_DIR=/srv/atocore/data/chroma
ATOCORE_CACHE_DIR=/srv/atocore/data/cache
ATOCORE_TMP_DIR=/srv/atocore/data/tmp
ATOCORE_VAULT_SOURCE_DIR=/srv/atocore/sources/vault
ATOCORE_DRIVE_SOURCE_DIR=/srv/atocore/sources/drive
ATOCORE_LOG_DIR=/srv/atocore/logs
ATOCORE_BACKUP_DIR=/srv/atocore/backups
ATOCORE_RUN_DIR=/srv/atocore/run
```

## Migration notes

- Existing local installs remain backward-compatible.
- If `data/atocore.db` already exists, AtoCore continues using it.
- Fresh installs default to `data/db/atocore.db`.
- Source directories are inputs only; AtoCore should ingest from them but not
  treat them as writable runtime state.
- Avoid syncing live SQLite/Chroma state between Dalidou and other machines.
  Prefer one canonical running service and API access from OpenClaw.

## Deferred work

- service manager wiring
- backup/snapshot procedures
- automated source registration jobs
- OpenClaw integration
129
docs/ingestion-waves.md
Normal file
@@ -0,0 +1,129 @@
# AtoCore Ingestion Waves

## Purpose

This document tracks how the corpus should grow without losing signal quality.

The rule is:

- ingest in waves
- validate retrieval after each wave
- only then widen the source scope

## Wave 1 - Active Project Full Markdown Corpus

Status: complete

Projects:

- `p04-gigabit`
- `p05-interferometer`
- `p06-polisher`

What was ingested:

- the full markdown/text PKM stacks for the three active projects
- selected staged operational docs already under the Dalidou source roots
- selected repo markdown/text context for:
  - `Fullum-Interferometer`
  - `polisher-sim`
  - `Polisher-Toolhead` (when markdown exists)

What was intentionally excluded:

- binaries
- images
- PDFs
- generated outputs, unless they were plain-text reports
- dependency folders
- hidden runtime junk

Practical result:

- AtoCore moved from a curated-seed corpus to a real active-project corpus
- the live corpus now contains well over one thousand source documents and over
  twenty thousand chunks
- project-specific context building is materially stronger than before

Main lesson from Wave 1:

- full project ingestion is valuable
- but broad historical/archive material can dilute retrieval for underspecified
  prompts
- context quality now depends more strongly on good project hints and better
  ranking than on corpus size alone

## Wave 2 - Trusted Operational Layer Expansion

Status: next

Goal:

- expand `AtoDrive`-style operational truth for the active projects

Candidate inputs:

- current status dashboards
- decision logs
- milestone tracking
- curated requirements baselines
- explicit next-step plans

Why this matters:

- this raises the quality of the high-trust layer instead of only widening
  general retrieval

## Wave 3 - Broader Active Engineering References

Status: planned

Goal:

- ingest reusable engineering references that support the active project set
  without dumping the entire vault

Candidate inputs:

- interferometry reference notes directly tied to `p05`
- polishing physics references directly tied to `p06`
- mirror and structural reference material directly tied to `p04`

Rule:

- only bring in references with a clear connection to active work

## Wave 4 - Wider PKM Population

Status: deferred

Goal:

- widen beyond the active projects while preserving retrieval quality

Preconditions:

- stronger ranking
- better project-aware routing
- a stable operational restore path
- clearer promotion rules for trusted state

## Validation After Each Wave

After every ingestion wave, verify:

- `stats`
- project-specific `query`
- project-specific `context-build`
- `debug-context`
- whether trusted project state still dominates when it should
- whether cross-project bleed is getting worse or better

## Working Rule

The next wave should only happen when the current wave is:

- ingested
- inspected
- retrieval-tested
- operationally stable
196
docs/master-plan-status.md
Normal file
@@ -0,0 +1,196 @@
# AtoCore Master Plan Status

## Current Position

AtoCore is currently between **Phase 7** and **Phase 8**.

The platform is no longer just a proof of concept. The local engine exists, the
core correctness pass is complete, Dalidou hosts the canonical runtime and
machine database, and OpenClaw on the T420 can consume AtoCore safely in
read-only additive mode.

## Phase Status

### Completed

- Phase 0 - Foundation
- Phase 0.5 - Proof of Concept
- Phase 1 - Ingestion

### Baseline Complete

- Phase 2 - Memory Core
- Phase 3 - Retrieval
- Phase 5 - Project State
- Phase 7 - Context Builder
- Phase 9 - Reflection (all three foundation commits landed: A capture,
  B reinforcement, C candidate extraction + review queue). As of 2026-04-11
  the capture → reinforce half runs automatically on every Stop-hook capture
  (a length-aware token-overlap matcher handles paragraph-length memories),
  and project-scoped memories now reach the context pack via a dedicated
  `--- Project Memories ---` band between identity/preference and retrieved
  chunks. The extract half is still a manual / batch flow by design
  (`scripts/atocore_client.py batch-extract` + `triage`). The first live
  batch-extract run over 42 captured interactions produced 1 candidate (the
  rule extractor is conservative and keys on structural cues like
  `## Decision:` headings that rarely appear in conversational LLM
  responses) — extractor tuning is a known follow-up.

### Partial

- Phase 4 - Identity / Preferences
- Phase 8 - OpenClaw Integration

### Not Yet Complete In The Intended Sense

- Phase 6 - AtoDrive
- Phase 10 - Write-back
- Phase 11 - Multi-model
- Phase 12 - Evaluation
- Phase 13 - Hardening

### Engineering Layer Planning Sprint

**Status: complete.** All 8 architecture docs are drafted. The
engineering layer is now ready for V1 implementation against the
active project set.

- [engineering-query-catalog.md](architecture/engineering-query-catalog.md) —
  the 20 v1-required queries the engineering layer must answer
- [memory-vs-entities.md](architecture/memory-vs-entities.md) —
  the canonical home split between memory and entity tables
- [promotion-rules.md](architecture/promotion-rules.md) —
  the Layer 0 → Layer 2 pipeline, triggers, and review queue mechanics
- [conflict-model.md](architecture/conflict-model.md) —
  detection, representation, and resolution of contradictory facts
- [tool-handoff-boundaries.md](architecture/tool-handoff-boundaries.md) —
  the KB-CAD / KB-FEM one-way mirror stance, ingest endpoints, drift handling
- [representation-authority.md](architecture/representation-authority.md) —
  the canonical home matrix across PKM / KB / repos / AtoCore for 22 fact kinds
- [human-mirror-rules.md](architecture/human-mirror-rules.md) —
  templates, regeneration triggers, edit flow, "do not edit" enforcement
- [engineering-v1-acceptance.md](architecture/engineering-v1-acceptance.md) —
  a measurable done definition with 23 acceptance criteria
- [engineering-knowledge-hybrid-architecture.md](architecture/engineering-knowledge-hybrid-architecture.md) —
  the 5-layer model (from the previous planning wave)
- [engineering-ontology-v1.md](architecture/engineering-ontology-v1.md) —
  the initial V1 object and relationship inventory (previous wave)
- [project-identity-canonicalization.md](architecture/project-identity-canonicalization.md) —
  the helper-at-every-service-boundary contract that keeps the trust
  hierarchy dependable across alias and canonical-id callers; required
  reading before adding new project-keyed entity surfaces in the V1
  implementation sprint

The next concrete step is the V1 implementation sprint, which should
follow engineering-v1-acceptance.md as its checklist and must apply
the project-identity-canonicalization contract at every new
service-layer entry point.

### LLM Client Integration

A separate but related architectural concern: how AtoCore is reachable
from many different LLM client contexts (OpenClaw, Claude Code, future
Codex skills, a future MCP server). The layering rule is documented in:

- [llm-client-integration.md](architecture/llm-client-integration.md) —
  the three-layer shape: HTTP API → shared operator client
  (`scripts/atocore_client.py`) → per-agent thin frontends; the shared
  client is the canonical backbone every new client should shell out to
  instead of reimplementing HTTP calls

This sits implicitly between Phase 8 (OpenClaw) and Phase 11
(multi-model). Memory-review and engineering-entity commands are
deferred from the shared client until their workflows are exercised.

## What Is Real Today

- canonical AtoCore runtime on Dalidou
- canonical machine DB and vector store on Dalidou
- project registry with:
  - template
  - proposal preview
  - register
  - update
  - refresh
- read-only additive OpenClaw helper on the T420
- seeded project corpus for:
  - `p04-gigabit`
  - `p05-interferometer`
  - `p06-polisher`
- conservative Trusted Project State for those active projects
- first operational backup foundation for SQLite + project registry
- implementation-facing architecture notes for future engineering knowledge work
- first organic routing layer in OpenClaw via:
  - `detect-project`
  - `auto-context`

## Now

These are the current practical priorities.

1. Finish practical OpenClaw integration
   - make the helper lifecycle feel natural in daily use
   - use the new organic routing layer for project-knowledge questions
   - confirm fail-open behavior remains acceptable
   - keep AtoCore clearly additive
2. Tighten retrieval quality
   - reduce cross-project competition
   - improve ranking on short or ambiguous prompts
   - add only a few anchor docs where retrieval is still weak
3. Continue controlled ingestion
   - deepen active projects selectively
   - avoid noisy bulk corpus growth
4. Strengthen operational boringness
   - backup and restore procedure
   - Chroma rebuild / backup policy
   - retention and restore validation

## Next

These are the next major layers after the current practical pass.

1. Clarify AtoDrive as a real operational truth layer
2. Mature identity / preferences handling
3. Improve observability for:
   - retrieval quality
   - context-pack inspection
   - comparison of behavior with and without AtoCore

## Later

These are the deliberate future expansions already supported by the architecture
direction, but not yet ready for immediate implementation.

1. Minimal engineering knowledge layer
   - driven by `docs/architecture/engineering-knowledge-hybrid-architecture.md`
   - guided by `docs/architecture/engineering-ontology-v1.md`
2. Minimal typed objects and relationships
3. Evidence-linking and provenance-rich structured records
4. Human mirror generation from structured state

## Not Yet
|
||||
|
||||
These remain intentionally deferred.
|
||||
|
||||
- automatic write-back from OpenClaw into AtoCore
|
||||
- automatic memory promotion
|
||||
- ~~reflection loop integration~~ — baseline now in (capture→reinforce
|
||||
auto, extract batch/manual). Extractor tuning and scheduled batch
|
||||
extraction still open.
|
||||
- replacing OpenClaw's own memory system
|
||||
- live machine-DB sync between machines
|
||||
- full ontology / graph expansion before the current baseline is stable
|
||||
|
||||
## Working Rule
|
||||
|
||||
The next sensible implementation threshold for the engineering ontology work is:
|
||||
|
||||
- after the current ingestion, retrieval, registry, OpenClaw helper, organic
|
||||
routing, and backup baseline feels boring and dependable
|
||||
|
||||
Until then, the architecture docs should shape decisions, not force premature
|
||||
schema work.
|
||||
248 docs/next-steps.md (new file)
@@ -0,0 +1,248 @@
# AtoCore Next Steps

## Current Position

AtoCore now has:

- canonical runtime and machine storage on Dalidou
- separated source and machine-data boundaries
- initial self-knowledge ingested into the live instance
- trusted project-state entries for AtoCore itself
- a first read-only OpenClaw integration path on the T420
- a first real active-project corpus batch for:
  - `p04-gigabit`
  - `p05-interferometer`
  - `p06-polisher`

This working list should be read alongside:

- [master-plan-status.md](C:/Users/antoi/ATOCore/docs/master-plan-status.md)

## Immediate Next Steps

1. ~~Re-run the backup/restore drill~~ — DONE 2026-04-11, full pass
2. ~~Turn on auto-capture of Claude Code sessions~~ — DONE 2026-04-11,
   Stop hook via `deploy/hooks/capture_stop.py` → `POST /interactions`
   with `reinforce=false`; kill switch: `ATOCORE_CAPTURE_DISABLED=1`
2a. Run a short real-use pilot with auto-capture on
    - verify interactions are landing in Dalidou
    - check prompt/response quality and truncation
    - confirm fail-open: no user-visible impact when Dalidou is down
3. Use the T420 `atocore-context` skill and the new organic routing layer in
   real OpenClaw workflows
   - confirm `auto-context` feels natural
   - confirm project inference is good enough in practice
   - confirm the fail-open behavior remains acceptable in practice
4. Review retrieval quality after the first real project ingestion batch
   - check whether the top hits are useful
   - check whether trusted project state remains dominant
   - reduce cross-project competition and prompt ambiguity where needed
   - use `debug-context` to inspect the exact last AtoCore supplement
5. Treat the active-project full markdown/text wave as complete
   - `p04-gigabit`
   - `p05-interferometer`
   - `p06-polisher`
6. Define a cleaner source refresh model
   - make the difference between source truth, staged inputs, and machine
     store explicit
   - move toward a project source registry and refresh workflow
   - foundation now exists via the project registry + per-project refresh API
   - registration policy + template + proposal + approved registration are now
     the normal path for new projects
7. Move to Wave 2 trusted-operational ingestion
   - curated dashboards
   - decision logs
   - milestone/current-status views
   - operational truth, not just raw project notes
8. Integrate the new engineering architecture docs into active planning, not
   immediate schema code
   - keep `docs/architecture/engineering-knowledge-hybrid-architecture.md` as
     the target layer model
   - keep `docs/architecture/engineering-ontology-v1.md` as the V1
     structured-domain target
   - do not start entity/relationship persistence until the ingestion,
     retrieval, registry, and backup baseline feels boring and stable
9. Finish the boring operations baseline around backup
   - retention policy cleanup script (the snapshots dir grows
     monotonically today)
   - off-Dalidou backup target (at minimum an rsync to the laptop or
     another host, so a single-disk failure isn't terminal)
   - automatic post-backup validation (have `create_runtime_backup`
     call `validate_backup` on its own output and refuse to
     declare success if validation fails)
   - DONE in commits be40994 / 0382238 / 3362080 / this one:
     - `create_runtime_backup` + `list_runtime_backups` +
       `validate_backup` + `restore_runtime_backup` with CLI
     - `POST /admin/backup` with `include_chroma=true` under
       the ingestion lock
     - `/health` build_sha / build_time / build_branch provenance
     - `deploy.sh` self-update re-exec guard + build_sha drift
       verification
     - live drill procedure in `docs/backup-restore-procedure.md`
       with failure-mode table and the memory_type=episodic
       marker pattern from the 2026-04-09 drill
10. Keep deeper automatic runtime integration modest until the organic
    read-only model has proven value
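The Stop-hook capture path in step 2 can be sketched as below. This is a hedged illustration, not the real `deploy/hooks/capture_stop.py`: the payload field names and request shape are assumptions; only the `reinforce=false` flag, the `ATOCORE_CAPTURE_DISABLED=1` kill switch, and the fail-open requirement come from the notes above.

```python
import json
import os
import urllib.request

ATOCORE_URL = os.environ.get("ATOCORE_BASE_URL", "http://dalidou:8000")

def capture_interaction(prompt: str, response: str, post=None) -> bool:
    """Fail-open capture: never raise, never block the user's session."""
    if os.environ.get("ATOCORE_CAPTURE_DISABLED") == "1":
        return False  # kill switch: capture off, session continues untouched
    payload = json.dumps({
        "prompt": prompt,          # assumed field name
        "response": response,      # assumed field name
        "reinforce": False,        # capture only, no automatic reinforcement
    }).encode()
    try:
        if post is not None:       # injectable transport, used for testing
            return bool(post(payload))
        req = urllib.request.Request(
            f"{ATOCORE_URL}/interactions", data=payload,
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req, timeout=2)
        return True
    except Exception:
        return False  # fail open: a down Dalidou never surfaces to the user
```

The key property is the bare `except` around the network call: capture failure reports `False` but never propagates into the user's run.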
## Trusted State Status

The first conservative trusted-state promotion pass is now complete for:

- `p04-gigabit`
- `p05-interferometer`
- `p06-polisher`

Each project now has a small set of stable entries covering:

- summary
- architecture or boundary decision
- key constraints
- current next focus

This materially improves `context/build` quality for project-hinted prompts.

## Recommended Near-Term Project Work

The active-project full markdown/text wave is now in.

The near-term work is now:

1. strengthen retrieval quality
2. promote or refine trusted operational truth where the broad corpus is now too noisy
3. keep trusted project state concise and high-confidence
4. widen only through named ingestion waves

## Recommended Next Wave Inputs

Wave 2 should emphasize trusted operational truth, not bulk historical notes.

P04:

- current status dashboard
- current selected design path
- current frame interface truth
- current next-step milestone view

P05:

- selected vendor path
- current error-budget baseline
- current architecture freeze or open decisions
- current procurement / next-action view

P06:

- current system map
- current shared contracts baseline
- current calibration procedure truth
- current July / proving roadmap view

## Deferred On Purpose

- automatic write-back from OpenClaw into AtoCore
- automatic memory promotion
- ~~reflection loop integration~~ — baseline now landed (2026-04-11):
  the Stop hook runs reinforce automatically, project memories are folded
  into the context pack, and batch-extract and triage CLIs exist. What
  remains deferred: scheduled/automatic batch extraction and extractor
  rule tuning (the rule-based extractor produced 1 candidate from 42 real
  captures — it needs new cues for conversational LLM content).
- replacing OpenClaw's own memory system
- syncing the live machine DB between machines

## Success Criteria For The Next Batch

The next batch is successful if:

- OpenClaw can use AtoCore naturally when context is needed
- OpenClaw can infer registered projects and call AtoCore organically for
  project-knowledge questions
- the active-project full corpus wave can be inspected and used concretely
  through `auto-context`, `context-build`, and `debug-context`
- OpenClaw can also register a new project cleanly before refreshing it
- existing project registrations can be refined safely before refresh when the
  staged source set evolves
- AtoCore answers correctly for the active project set
- retrieval surfaces the seeded project docs instead of mostly AtoCore meta-docs
- trusted project state remains concise and high-confidence
- project ingestion remains controlled rather than noisy
- the canonical Dalidou instance stays stable
## Retrieval Quality Review — 2026-04-11

First sweep with real project-hinted queries on Dalidou. Used
`POST /context/build` against p04, p05, and p06 with representative
questions and inspected `formatted_context`.

Findings:

- **Trusted Project State is surfacing correctly.** The DECISION and
  REQUIREMENT categories appear at the top of the pack and include
  the expected key facts (e.g. p04 "Option B conical-back mirror
  architecture"). This is the strongest signal in the pack today.
- **Chunk retrieval is relevant on-topic but broad.** The top chunks for
  the p04 architecture query are the PDR intro, the CAD assembly overview,
  and the index — all on the right project, but none of them directly
  answer the "why was Option B chosen" question. The authoritative
  answer sits in Project State, not in the chunks.
- **Active memories are NOT reaching the pack.** The context builder
  surfaces Trusted Project State and retrieved chunks but does not
  include the 21 active project/knowledge memories. Reinforcement
  (Phase 9 Commit B) bumps memory confidence without the memory ever
  being read back into a prompt — the reflection loop has no outlet
  on the retrieval side. This is a design gap, not a bug: it needs a
  decision on whether memories should feed into context assembly,
  and if so at what trust level (below project_state, above chunks).
- **Cross-project bleed is low.** The p04 query did pull one p05
  chunk (CGH_Design_Input_for_AOM) as the bottom hit, but the top four
  were all p04.

Proposed follow-ups (not yet scheduled):

1. ~~Decide whether memories should be folded into `formatted_context`
   and under what section header.~~ DONE 2026-04-11 (commits 8ea53f4,
   5913da5, 1161645). A `--- Project Memories ---` band now sits
   between identity/preference and retrieved chunks, gated on a
   canonical project hint to prevent cross-project bleed. Budget
   ratio 0.25 (tuned empirically — paragraph memories are ~400 chars
   and the earlier 0.15 ratio starved the first entry by one char).
   Verified live: the p04 architecture query surfaces the Option B memory.
2. Re-run the same three queries after any builder change and compare
   `formatted_context` diffs — still open, and the natural entry
   point for the retrieval eval harness on the roadmap.
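The band layout and the 0.25 memory budget ratio described in follow-up 1 can be sketched roughly like this. It is an illustrative reconstruction, not the real builder: the band order (project state, then a gated memories band, then chunks) and the ratio come from the notes above, while the exact header strings other than `--- Project Memories ---` and the truncation details are assumptions.

```python
MEMORY_RATIO = 0.25  # tuned value from the 2026-04-11 review

def assemble_context(project_state, memories, chunks, budget, project_hint=None):
    """Assemble a budgeted context pack from trust-ordered bands."""
    parts = ["--- Trusted Project State ---"] + list(project_state)
    if project_hint:  # memories band is gated on a canonical project hint
        mem_budget = int(budget * MEMORY_RATIO)
        band, used = [], 0
        for m in memories:
            if used + len(m) > mem_budget:
                break  # stop before overflowing the memories budget
            band.append(m)
            used += len(m)
        if band:
            parts.append("--- Project Memories ---")
            parts.extend(band)
    parts.append("--- Retrieved Chunks ---")
    remaining = budget - sum(len(p) for p in parts)
    for c in chunks:
        if len(c) > remaining:
            break  # chunks take whatever budget the earlier bands left
        parts.append(c)
        remaining -= len(c)
    return "\n".join(parts)
```

With no project hint, the memories band is skipped entirely, which is the cross-project-bleed guard described above.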
## Reflection Loop Live Check — 2026-04-11

The first real run of `batch-extract` across 42 captured Claude Code
interactions on Dalidou produced exactly **1 candidate**, and that
candidate was a synthetic test capture from earlier in the session
(rejected). Findings:

- The rule-based extractor in `src/atocore/memory/extractor.py` keys
  on explicit structural cues (decision headings like
  `## Decision: ...`, preference sentences, etc.). Real Claude Code
  responses are conversational and almost never contain those cues.
- This means the capture → extract half of the reflection loop is
  effectively inert against organic LLM sessions until either the
  rules are broadened (new cue families: "we chose X because...",
  "the selected approach is...", etc.) or an LLM-assisted extraction
  path is added alongside the rule-based one.
- Capture → reinforce is working correctly on live data (the length-aware
  matcher was verified on a live paraphrase of a p04 memory).

Follow-up candidates (not yet scheduled):

1. Extractor rule expansion — add conversational-form rules so real
   session text has a chance of surfacing candidates.
2. LLM-assisted extractor as a separate rule family, guarded by
   confidence and always landing in `status=candidate` (never active).
3. Retrieval eval harness — a diffable scorecard of
   `formatted_context` across a fixed question set per active project.
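The proposed broadening toward conversational cue families could look something like this sketch. The regexes are illustrative candidates only; they are not rules that exist in `src/atocore/memory/extractor.py` today.

```python
import re

# Candidate conversational cue family: first-person decision phrasing
# that real LLM session text actually uses (assumed patterns).
CONVERSATIONAL_CUES = [
    ("decision_conversational", re.compile(
        r"\bwe (?:chose|selected|decided on) (.{10,280}?) because\b", re.I)),
    ("decision_conversational", re.compile(
        r"\bthe selected approach is (.{10,280})", re.I)),
]

def extract_conversational(text):
    """Return (rule, captured content) pairs for conversational decision cues."""
    hits = []
    for rule, pattern in CONVERSATIONAL_CUES:
        for m in pattern.finditer(text):
            hits.append((rule, m.group(1).strip()))
    return hits
```

Keeping these as a separate rule family preserves the property validated in the Phase 9 run: plain prose with no cue still produces zero candidates.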
## Long-Run Goal

The long-run target is:

- continue working normally inside PKM project stacks and Gitea repos
- let OpenClaw keep its own memory and runtime behavior
- let AtoCore supplement LLM work with stronger trusted context, retrieval, and
  context assembly

That means AtoCore should behave like a durable external context engine and
machine-memory layer, not a replacement for normal repo work or OpenClaw memory.
157 docs/openclaw-integration-contract.md (new file)
@@ -0,0 +1,157 @@
# OpenClaw Integration Contract

## Purpose

This document defines the first safe integration contract between OpenClaw and
AtoCore.

The goal is to let OpenClaw consume AtoCore as an external context service
without degrading OpenClaw's existing baseline behavior.

## Current Implemented State

The first safe integration foundation now exists on the T420 workspace:

- OpenClaw's own memory system is unchanged
- a local read-only helper skill exists at:
  - `/home/papa/clawd/skills/atocore-context/`
- the helper currently talks to the canonical Dalidou instance
- the helper has verified:
  - `health`
  - `project-state`
  - `query`
  - `detect-project`
  - `auto-context`
  - fail-open fallback when AtoCore is unavailable

This means the network and workflow foundation is working, and the first
organic routing layer now exists, even though deeper autonomous integration
into OpenClaw runtime behavior is still deferred.

## Integration Principles

- OpenClaw remains the runtime and orchestration layer
- AtoCore remains the context enrichment layer
- AtoCore is optional at runtime
- if AtoCore is unavailable, OpenClaw must continue operating normally
- initial integration is read-only
- OpenClaw should not automatically write memories, project state, or ingestion
  updates during the first integration batch

## First Safe Responsibilities

OpenClaw may use AtoCore for:

- health and readiness checks
- context building for contextual prompts
- retrieval/query support
- project-state lookup when a project is detected
- automatic project-context augmentation for project-knowledge questions

OpenClaw should not yet use AtoCore for:

- automatic memory write-back
- automatic reflection
- conflict-resolution decisions
- replacing OpenClaw's own memory system

## First API Surface

OpenClaw should treat these as the initial contract:

- `GET /health`
  - check service readiness
- `GET /sources`
  - inspect source registration state
- `POST /context/build`
  - ask AtoCore for a budgeted context pack
- `POST /query`
  - use retrieval when useful

Additional project-state inspection can be added if needed, but the first
integration should stay small and resilient.
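A minimal sketch of building a budgeted `POST /context/build` body. The field names (`prompt`, `project`, `budget`) are assumptions inferred from the helper's `context-build <prompt> [project] [budget]` command, not a confirmed request schema.

```python
def context_build_body(prompt, project=None, budget=3000):
    """Build the assumed request body for POST /context/build."""
    body = {"prompt": prompt, "budget": budget}
    if project is not None:
        body["project"] = project  # optional canonical project hint
    return body
```

Keeping the project hint optional mirrors the contract: the call must still be valid when project detection fails.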
## Current Helper Surface

The current helper script exposes:

- `health`
- `sources`
- `stats`
- `projects`
- `project-template`
- `detect-project <prompt>`
- `auto-context <prompt> [budget] [project]`
- `debug-context`
- `propose-project ...`
- `register-project ...`
- `update-project ...`
- `refresh-project <project>`
- `project-state <project>`
- `query <prompt> [top_k]`
- `context-build <prompt> [project] [budget]`
- `ingest-sources`

This means OpenClaw can now use the full practical registry lifecycle for known
projects without dropping down to raw API calls.

## Failure Behavior

OpenClaw must treat AtoCore as additive.

If AtoCore times out, returns an error, or is unavailable:

- OpenClaw should continue with its own normal baseline behavior
- no hard dependency should block the user's run
- no partially written AtoCore state should be assumed

## Suggested OpenClaw Configuration

OpenClaw should eventually expose configuration like:

- `ATOCORE_ENABLED`
- `ATOCORE_BASE_URL`
- `ATOCORE_TIMEOUT_MS`
- `ATOCORE_FAIL_OPEN`

Recommended first behavior:

- enabled only when configured
- low timeout
- fail open by default
- no write-back enabled
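A minimal fail-open client sketch under the configuration above, assuming the listed environment variables behave as their names suggest. This is one way OpenClaw might wire the knobs, not existing OpenClaw code.

```python
import os
import urllib.request

def atocore_get(path):
    """Return the raw response body, or None if AtoCore is off or unreachable."""
    if os.environ.get("ATOCORE_ENABLED", "0") != "1":
        return None  # enabled only when configured
    base = os.environ.get("ATOCORE_BASE_URL", "http://127.0.0.1:8000")
    timeout = int(os.environ.get("ATOCORE_TIMEOUT_MS", "1500")) / 1000.0
    try:
        with urllib.request.urlopen(base + path, timeout=timeout) as resp:
            return resp.read()
    except Exception:
        return None  # fail open: caller proceeds with baseline behavior
```

A `None` result is indistinguishable from "AtoCore disabled", which is exactly the additive contract: the caller has one code path either way.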
## Suggested Usage Pattern

1. OpenClaw receives a user request
2. If the prompt looks like project knowledge, OpenClaw should try:
   - `auto-context "<prompt>" 3000`
   - optionally `debug-context` immediately after, if a human wants to inspect
     the exact AtoCore supplement
3. If the prompt is clearly asking for trusted current truth, OpenClaw should
   prefer:
   - `project-state <project>`
4. If the user explicitly asked for source refresh or ingestion, OpenClaw
   should use:
   - `refresh-project <id>`
5. If AtoCore returns usable context, OpenClaw includes it
6. If AtoCore fails, returns `no_project_match`, or is unavailable, OpenClaw
   proceeds normally

## Deferred Work

- deeper automatic runtime wiring inside OpenClaw itself
- memory promotion rules
- identity and preference write flows
- reflection loop
- automatic ingestion requests from OpenClaw
- write-back policy
- conflict-resolution integration

## Precondition Before Wider Ingestion

Before bulk ingestion of projects or ecosystem notes:

- the AtoCore service should be reachable from the T420
- the OpenClaw failure fallback path should be confirmed
- the initial contract should be documented and stable
142 docs/operating-model.md (new file)
@@ -0,0 +1,142 @@
# AtoCore Operating Model

## Purpose

This document makes the intended day-to-day operating model explicit.

The goal is not to replace how work already happens. The goal is to make that
existing workflow stronger by adding a durable context engine.

## Core Idea

Normal work continues in:

- PKM project notes
- Gitea repositories
- Discord and OpenClaw workflows

OpenClaw keeps:

- its own memory
- its own runtime and orchestration behavior
- its own workspace and direct file/repo tooling

AtoCore adds:

- trusted project state
- retrievable cross-source context
- durable machine memory
- context assembly that improves prompt quality and robustness

## Layer Responsibilities

- PKM and repos
  - human-authoritative project sources
  - where knowledge is created, edited, reviewed, and maintained
- OpenClaw
  - active operating environment
  - orchestration, direct repo work, messaging, agent workflows, local memory
- AtoCore
  - compiled context engine
  - durable machine-memory host
  - retrieval and context assembly layer

## Why This Architecture Works

Each layer has different strengths and weaknesses.

- PKM and repos are rich but noisy and manual to search
- OpenClaw memory is useful but session-shaped and not the whole project record
- raw LLM repo work is powerful but can miss trusted broader context
- AtoCore can compile context across sources and provide a better prompt input

The result should be:

- stronger prompts
- more robust outputs
- less manual reconstruction
- better continuity across sessions and models

## What AtoCore Should Not Replace

AtoCore should not replace:

- normal file reads
- direct repo search
- direct PKM work
- OpenClaw's own memory
- OpenClaw's runtime and tool behavior

It should supplement those systems.

## What Healthy Usage Looks Like

When working on a project:

1. OpenClaw still uses local workspace/repo context
2. OpenClaw still uses its own memory
3. AtoCore adds:
   - trusted current project state
   - retrieved project documents
   - cross-source project context
   - context assembly for more robust model prompts

## Practical Rule

Think of AtoCore as the durable external context hard drive for LLM work:

- fast machine-readable context
- persistent project understanding
- stronger prompt inputs
- no need to replace the normal project workflow

That is the architecture target.

## Why The Staged Markdown Exists

The staged markdown on Dalidou is a source-input layer, not the end product of
the system.

In the current deployment model:

1. selected PKM, AtoDrive, or repo docs are copied or mirrored into a Dalidou
   source path
2. AtoCore ingests them
3. the machine store keeps the processed representation
4. retrieval and context building operate on that machine store

So if the staged docs look very similar to your original PKM notes, that is
expected. They are source material, not the compiled context layer itself.

## What Happens When A Source Changes

If you edit a PKM note or repo doc at the original source, AtoCore does not
magically know yet.

The current model is refresh-based:

1. update the human-authoritative source
2. refresh or re-stage the relevant project source set on Dalidou
3. run ingestion again
4. let AtoCore update the machine representation

This is still an intermediate workflow. The long-run target is a cleaner source
registry and refresh model so that commands like `refresh p05-interferometer`
become natural and reliable.
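Steps 2-4 of the refresh flow above can be sketched as a small orchestration helper. The per-project refresh endpoint path is an assumption (the notes only say a per-project refresh API exists), so the HTTP transport is injected rather than hard-coded.

```python
def refresh_project(project_id, post):
    """Refresh staged sources and re-ingest one project, failing open.

    `post` is any callable that issues the request and returns the
    response (or None on failure); the path below is a guess, not the
    documented API.
    """
    result = post(f"/projects/{project_id}/refresh")  # assumed endpoint path
    if result is None:
        return {"project": project_id, "status": "unavailable"}  # fail open
    return {"project": project_id, "status": "refreshed"}
```

In daily use the same flow is reachable through the helper as `refresh-project <project>` without touching the raw API.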
## Current Scope Of Ingestion

The current project corpus is intentionally selective, not exhaustive.

For active projects, the goal right now is to ingest:

- high-value anchor docs
- strong meeting notes with real decisions
- architecture and constraints docs
- selected repo context that explains the system shape

The goal is not to dump the entire PKM or the whole repo tree into AtoCore on
the first pass.

So if a project only has some curated notes and not the full project universe in
the staged area yet, that is normal for the current phase.
96 docs/operations.md (new file)
@@ -0,0 +1,96 @@
# AtoCore Operations

Current operating order for improving AtoCore:

1. Retrieval-quality pass
2. Wave 2 trusted-operational ingestion
3. AtoDrive clarification
4. Restore and ops validation

## Retrieval-Quality Pass

Current live behavior:

- broad prompts like `gigabit` and `polisher` can surface archive/history noise
- meaningful project prompts perform much better
- ranking quality now matters more than raw corpus growth

Use the operator client to audit retrieval:

```bash
python scripts/atocore_client.py audit-query "gigabit" 5
python scripts/atocore_client.py audit-query "polisher" 5
python scripts/atocore_client.py audit-query "mirror frame stiffness requirements and selected architecture" 5 p04-gigabit
python scripts/atocore_client.py audit-query "interferometer error budget and vendor selection constraints" 5 p05-interferometer
python scripts/atocore_client.py audit-query "polisher system map shared contracts and calibration workflow" 5 p06-polisher
```

What to improve:

- reduce `_archive`, `pre-cleanup`, `pre-migration`, and `History` prominence
- prefer current-status, decision, requirement, architecture-freeze, and milestone docs
- prefer trusted project-state when it expresses current truth
- avoid letting broad single-word prompts drift into stale chunks

## Wave 2 Trusted-Operational Ingestion

Do not ingest the whole PKM vault next.

Prioritize, for each active project:

- current status
- current decisions
- requirements baseline
- architecture freeze / current baseline
- milestone plan
- next actions

Useful commands:

```bash
python scripts/atocore_client.py project-state p04-gigabit
python scripts/atocore_client.py project-state p05-interferometer
python scripts/atocore_client.py project-state p06-polisher
python scripts/atocore_client.py refresh-project p04-gigabit
python scripts/atocore_client.py refresh-project p05-interferometer
python scripts/atocore_client.py refresh-project p06-polisher
```

## AtoDrive Clarification

Treat AtoDrive as a curated trusted-operational source, not a generic dump.

Good candidates:

- current dashboards
- approved baselines
- architecture freezes
- decision logs
- milestone and next-step views

Avoid by default:

- duplicated exports
- stale snapshots
- generic archives
- exploratory notes that are not designated current truth

## Restore and Ops Validation

Backups are not enough until restore has been tested.

Validate:

- SQLite metadata restore
- Chroma restore or rebuild
- project registry restore
- project refresh after recovery
- retrieval audit before and after recovery

Baseline capture:

```bash
python scripts/atocore_client.py health
python scripts/atocore_client.py stats
python scripts/atocore_client.py projects
```
321 docs/phase9-first-real-use.md (new file)
@@ -0,0 +1,321 @@
|
||||
# Phase 9 First Real Use Report
|
||||
|
||||
## What this is
|
||||
|
||||
The first empirical exercise of the Phase 9 reflection loop after
|
||||
Commits A, B, and C all landed. The goal is to find out where the
|
||||
extractor and the reinforcement matcher actually behave well versus
|
||||
where their behaviour drifts from the design intent.
|
||||
|
||||
The validation is reproducible. To re-run:
|
||||
|
||||
```bash
|
||||
python scripts/phase9_first_real_use.py
|
||||
```
|
||||
|
||||
This writes an isolated SQLite + Chroma store under
|
||||
`data/validation/phase9-first-use/` (gitignored), seeds three active
|
||||
memories, then runs eight sample interactions through the full
|
||||
capture → reinforce → extract pipeline.
|
||||
|
||||
## What we ran
|
||||
|
||||
Eight synthetic interactions, each paraphrased from a real working
|
||||
session about AtoCore itself or the active engineering projects:
|
||||
|
||||
| # | Label | Project | Expected |
|
||||
|---|--------------------------------------|----------------------|---------------------------|
|
||||
| 1 | exdev-mount-merge-decision | atocore | 1 decision_heading |
|
||||
| 2 | ownership-was-the-real-fix | atocore | 1 fact_heading |
|
||||
| 3 | memory-vs-entity-canonical-home | atocore | 1 decision_heading (long) |
|
||||
| 4 | auto-promotion-deferred | atocore | 1 decision_heading |
|
||||
| 5 | preference-rebase-workflow | atocore | 1 preference_sentence |
|
||||
| 6 | constraint-from-doc-cite | p05-interferometer | 1 constraint_heading |
|
||||
| 7 | prose-only-no-cues | atocore | 0 candidates |
|
||||
| 8 | multiple-cues-in-one-interaction | p06-polisher | 3 distinct rules |
|
||||
|
||||
Plus 3 seed memories were inserted before the run:
|
||||
|
||||
- `pref_rebase`: "prefers rebase-based workflows because history stays linear" (preference, 0.6)
|
||||
- `pref_concise`: "writes commit messages focused on the why, not the what" (preference, 0.6)
|
||||
- `identity_runs_atocore`: "mechanical engineer who runs AtoCore for context engineering" (identity, 0.9)

## What happened — extraction (the good news)

**Every extraction expectation was met exactly.** All eight samples
produced the predicted candidate count and the predicted rule
classifications:

| Sample                           | Expected | Got   | Pass |
|----------------------------------|----------|-------|------|
| exdev-mount-merge-decision       | 1        | 1     | ✅   |
| ownership-was-the-real-fix       | 1        | 1     | ✅   |
| memory-vs-entity-canonical-home  | 1        | 1     | ✅   |
| auto-promotion-deferred          | 1        | 1     | ✅   |
| preference-rebase-workflow       | 1        | 1     | ✅   |
| constraint-from-doc-cite         | 1        | 1     | ✅   |
| prose-only-no-cues               | **0**    | **0** | ✅   |
| multiple-cues-in-one-interaction | 3        | 3     | ✅   |

**Total: 9 candidates from 8 interactions, 0 false positives, 0 misses
on heading patterns or sentence patterns.**

The extractor's strictness is well-tuned for the kinds of structural
cues we actually use. Things worth noting:

- **Sample 7 (`prose-only-no-cues`) produced zero candidates as
  designed.** This is the most important sanity check — it confirms
  the extractor won't fill the review queue with general prose when
  there's no structural intent.
- **Sample 3's long content was preserved without truncation.** The
  280-char max wasn't hit, and the content kept its full meaning.
- **Sample 8 produced three distinct rules in one interaction**
  (decision_heading, constraint_heading, requirement_heading) without
  the dedup key collapsing them. The dedup key is
  `(memory_type, normalized_content, rule)`, and the three are all
  different on at least one axis, so they coexist as expected.
- **The prose around each heading was correctly ignored.** Sample 6
  has a second sentence ("the error budget allocates 6 nm to the
  laser source...") that does NOT have a structural cue, and the
  extractor correctly didn't fire on it.

## What happened — reinforcement (the empirical finding)

**Reinforcement matched zero seeded memories across all 8 samples,
even when the response clearly echoed the seed.**

Sample 5's response was:

> *"I prefer rebase-based workflows because the history stays linear
> and reviewers have an easier time."*

The seeded `pref_rebase` memory was:

> *"prefers rebase-based workflows because history stays linear"*

A human reading both says these are the same fact. The reinforcement
matcher disagrees. After all 8 interactions:

```
pref_rebase:            confidence=0.6000 refs=0 last=-
pref_concise:           confidence=0.6000 refs=0 last=-
identity_runs_atocore:  confidence=0.9000 refs=0 last=-
```

**Nothing moved.** This is the most important finding from this
validation pass.

### Why the matcher missed it

The current `_memory_matches` rule (in
`src/atocore/memory/reinforcement.py`) does a normalized substring
match: it lowercases both sides, collapses whitespace, then asks
"does the leading 80-char window of the memory content appear as a
substring in the response?"

For the rebase example:

- needle (normalized): `prefers rebase-based workflows because history stays linear`
- haystack (normalized): `i prefer rebase-based workflows because the history stays linear and reviewers have an easier time.`

The needle starts with `prefers` (with the trailing `s`), and the
haystack has `prefer` (without the `s`, because of the first-person
voice). And the needle has `because history stays linear`, while the
haystack has `because the history stays linear`. **Two small natural
paraphrases, and the substring fails.**
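The failure is easy to reproduce in isolation. Below is a minimal sketch of the normalized substring rule as described above; the function name and signature are illustrative, not the real `_memory_matches`:

```python
import re


def matches_substring(memory_content: str, response: str, window: int = 80) -> bool:
    """Lowercase both sides, collapse whitespace, then check whether the
    leading `window` chars of the memory appear verbatim in the response."""
    def normalize(text: str) -> str:
        return re.sub(r"\s+", " ", text.lower()).strip()
    return normalize(memory_content)[:window] in normalize(response)


# The rebase pair from above: a human sees one fact, the substring sees none.
needle = "prefers rebase-based workflows because history stays linear"
haystack = ("I prefer rebase-based workflows because the history stays "
            "linear and reviewers have an easier time.")
print(matches_substring(needle, haystack))  # False: "prefers" vs "prefer"
```

One changed verb form at the very start of the needle is enough to sink the whole comparison, which is the brittleness the rest of this section describes.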

This isn't a bug in the matcher's implementation — it's doing
exactly what it was specified to do. It's a design limitation: the
substring rule is too brittle for real prose, where the same fact
gets re-stated with different verb forms, articles, and word order.

### Severity

**Medium-high.** Reinforcement is the entire point of Commit B.
A reinforcement matcher that never fires on natural paraphrases
will leave seeded memories with stale confidence forever. The
reflection loop runs, but it doesn't actually reinforce anything.
That hollows out the value of having reinforcement at all.

It is not a critical bug because:

- Nothing breaks. The pipeline still runs cleanly.
- Reinforcement is supposed to be a *signal*, not the only path to
  high confidence — humans can still curate confidence directly.
- The candidate-extraction path (Commit C) is unaffected and works
  perfectly.

But it does need to be addressed before Phase 9 can be considered
operationally complete.

## Recommended fix (deferred to a follow-up commit)

Replace the substring matcher with a token-overlap matcher. The
specification:

1. Tokenize both memory content and response into lowercase words
   of length >= 3, dropping a small stop list (`the`, `a`, `an`,
   `and`, `or`, `of`, `to`, `is`, `was`, `that`, `this`, `with`,
   `for`, `from`, `into`).
2. Stem aggressively (or at minimum, fold trailing `s` and `ed`
   so `prefers`/`prefer`/`preferred` collapse to one token).
3. A match exists if **at least 70% of the memory's content
   tokens** appear in the response token set.
4. Memory content must still be at least `_MIN_MEMORY_CONTENT_LENGTH`
   characters to be considered.

This is more permissive than the substring rule but still tight
enough to avoid spurious matches on generic words. It would have
caught the rebase example because:

- memory tokens (after stop-list and stemming):
  `{prefer, rebase-bas, workflow, because, history, stay, linear}`
- response tokens:
  `{prefer, rebase-bas, workflow, because, history, stay, linear,
  reviewer, easi, time}`
- overlap: 7 / 7 memory tokens = 100% > 70% threshold → match

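As a concreteness check on the specification above, one possible shape for the token-overlap matcher is sketched below. The function name, the tokenizer regex, and the crude suffix-folding (weaker than real stemming) are all assumptions here; the real implementation still needs its own design review:

```python
import re

_STOP = {"the", "a", "an", "and", "or", "of", "to", "is", "was",
         "that", "this", "with", "for", "from", "into"}


def _tokens(text: str) -> set[str]:
    """Lowercase words of length >= 3, stop-listed, with s/ed folded off."""
    out: set[str] = set()
    for word in re.findall(r"[a-z0-9][a-z0-9-]*", text.lower()):
        if len(word) < 3 or word in _STOP:
            continue
        if word.endswith("ed"):
            word = word[:-2]
        elif word.endswith("s"):
            word = word[:-1]
        out.add(word)
    return out


def matches_tokens(memory_content: str, response: str, threshold: float = 0.7) -> bool:
    needle = _tokens(memory_content)
    if not needle:
        return False
    # fraction of MEMORY tokens found in the response token set
    return len(needle & _tokens(response)) / len(needle) >= threshold
```

On the rebase pair, all 7 memory tokens survive into the response token set, so the 70% threshold is cleared.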
### Why not fix it in this report

Three reasons:

1. The validation report is supposed to be evidence, not a fix
   spec. A separate commit will introduce the new matcher with
   its own tests.
2. The token-overlap matcher needs its own design review for edge
   cases (very long memories, very short responses, technical
   abbreviations, code snippets in responses).
3. Mixing the report and the fix into one commit would muddle the
   audit trail. The report is the empirical evidence; the fix is
   the response.

The fix is queued as the next Phase 9 maintenance commit and is
flagged in the next-steps section below.

## Other observations

### Extraction is conservative on purpose, and that's working

Sample 7 is the most important data point in the whole run.
A natural prose response with no structural cues produced zero
candidates. **This is exactly the design intent** — the extractor
should be loud about explicit decisions/constraints/requirements
and quiet about everything else. If the extractor were too loose,
the review queue would fill up with low-value items and the human
would stop reviewing.

After this run I have measurably more confidence that the V0 rule
set is the right starting point. Future rules can be added one at
a time as we see specific patterns the extractor misses, instead of
guessing at what might be useful.

### Confidence on candidates

All extracted candidates landed at the default `confidence=0.5`,
which is what the extractor is currently hardcoded to do. The
`promotion-rules.md` doc proposes a per-rule prior with a
structural-signal multiplier and freshness bonus. None of that is
implemented yet. The validation didn't reveal any urgency around
this — humans review the candidates either way — but it confirms
that the priors-and-multipliers refinement is a reasonable next
step rather than a critical one.
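For orientation only, the proposed prior-multiplier-bonus shape could compose like the sketch below. None of this is implemented, and the clamping and argument names are invented here, not taken from `promotion-rules.md`:

```python
def candidate_confidence(rule_prior: float,
                         structural_multiplier: float = 1.0,
                         freshness_bonus: float = 0.0) -> float:
    # per-rule prior, scaled by structural signal, nudged by freshness,
    # clamped into [0, 1]; today the extractor just hardcodes 0.5
    return max(0.0, min(1.0, rule_prior * structural_multiplier + freshness_bonus))


print(candidate_confidence(0.5))  # 0.5, i.e. today's hardcoded default
```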

### Multiple cues in one interaction

Sample 8 confirmed an important property: **three structural
cues in the same response do not collide in dedup**. The dedup
key is `(memory_type, normalized_content, rule)`, and since each
cue produced a distinct (type, content, rule) tuple, all three
landed cleanly.

This matters because real working sessions naturally bundle
multiple decisions/constraints/requirements into one summary.
The extractor handles those bundles correctly.
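The dedup behaviour is small enough to illustrate directly. The key shape is the one named above; the normalization step and the sample contents are invented for illustration:

```python
def dedup_key(memory_type: str, content: str, rule: str) -> tuple[str, str, str]:
    # lowercase + whitespace-collapse stands in for the real normalization
    return (memory_type, " ".join(content.lower().split()), rule)


# three cues from one interaction: distinct on at least one axis,
# so three distinct keys survive deduplication
keys = {
    dedup_key("decision", "Use the EXDEV fallback", "decision_heading"),
    dedup_key("constraint", "Stay under 280 chars", "constraint_heading"),
    dedup_key("requirement", "Persist candidates atomically", "requirement_heading"),
}
print(len(keys))  # 3
```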

### Project scoping

Each candidate carries the `project` from the source interaction
into its own `project` field. Sample 6 (p05) and sample 8 (p06)
both produced candidates with the right project. This is
non-obvious because the extractor module never explicitly looks
at project — it inherits from the interaction it's scanning. Worth
keeping in mind when the entity extractor is built: the same pattern
should apply.

## What this validates and what it doesn't

### Validates

- The Phase 9 Commit C extractor's rule set is well-tuned for
  hand-written structural cues
- The dedup logic does the right thing across multiple cues
- The "drop candidates that match an existing active memory" filter
  works (it would have been visible if any seeded memory had matched
  one of the heading texts — none did, but the code path is the
  same one that's covered in `tests/test_extractor.py`)
- The `prose-only-no-cues` no-fire case is solid
- Long content is preserved without truncation
- Project scoping flows through the pipeline

### Does NOT validate

- The reinforcement matcher (clearly, since it caught nothing)
- The behaviour against very long documents (each sample was
  under 700 chars; real interaction responses can be 10× that)
- The behaviour against responses that contain code blocks (the
  extractor's regex rules don't handle code-block fenced sections
  specially)
- Cross-interaction promotion-to-active flow (no candidate was
  promoted in this run; the lifecycle is covered by the unit tests
  but not by this empirical exercise)
- The behaviour at scale: 8 interactions is a one-shot. We need
  to see the queue after 50+ before judging reviewer ergonomics.

### Recommended next empirical exercises

1. **Real conversation capture**, using a slash command from a
   real Claude Code session against either a local or Dalidou
   AtoCore instance. The synthetic responses in this script are
   honest paraphrases, but they're still hand-curated.
2. **Bulk capture from existing PKM**, ingesting a few real
   project notes through the extractor as if they were
   interactions. This stresses the rules against documents that
   weren't written with the extractor in mind.
3. **Reinforcement matcher rerun** after the token-overlap
   matcher lands.

## Action items from this report

- [ ] **Fix reinforcement matcher** with the token-overlap rule
      described in the "Recommended fix" section above. Owner:
      next session. Severity: medium-high.
- [x] **Document the extractor's V0 strictness** as a working
      property, not a limitation. Sample 7 makes the case.
- [ ] **Build the slash command** so the next validation run
      can use real (not synthetic) interactions. Tracked in
      Session 2 of the current planning sprint.
- [ ] **Run a 50+ interaction batch** to evaluate reviewer
      ergonomics. Deferred until the slash command exists.

## Reproducibility

The script is deterministic. Re-running it will produce
identical results because:

- the data dir is wiped on every run
- the sample interactions are constants
- the memory uuid generation is non-deterministic, but the fields
  that matter (content, type, count, rule) are deterministic
- the `data/validation/phase9-first-use/` directory is gitignored,
  so no state leaks across runs

To reproduce this exact report:

```bash
python scripts/phase9_first_real_use.py
```

To get JSON output for downstream tooling:

```bash
python scripts/phase9_first_real_use.py --json
```

129 docs/project-registration-policy.md Normal file
@@ -0,0 +1,129 @@

# AtoCore Project Registration Policy

## Purpose

This document defines the normal path for adding a new project to AtoCore and
for safely updating an existing registration later.

The goal is to make `register + refresh` the standard workflow instead of
relying on long custom ingestion prompts every time.

## What Registration Means

Registering a project does not ingest it by itself.

Registration means:

- the project gets a canonical AtoCore id
- known aliases are recorded
- the staged source roots for that project are defined
- AtoCore and OpenClaw can later refresh that project consistently

Updating a project means:

- aliases can be corrected or expanded
- the short registry description can be improved
- ingest roots can be adjusted deliberately
- the canonical project id remains stable

## Required Fields

Each project registry entry must include:

- `id`
  - stable canonical project id
  - prefer lowercase kebab-case
  - examples:
    - `p04-gigabit`
    - `p05-interferometer`
    - `p06-polisher`
- `aliases`
  - short common names or abbreviations
  - examples:
    - `p05`
    - `interferometer`
- `description`
  - short explanation of what the registered source set represents
- `ingest_roots`
  - one or more staged roots under configured source layers

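Put together, one registry entry has roughly this shape. It is shown here as a Python literal for readability; the real registry file is JSON, and the `description`, `subpath`, and `label` values below are invented for illustration:

```python
entry = {
    "id": "p05-interferometer",
    "aliases": ["p05", "interferometer"],
    "description": "Curated interferometer anchor docs staged for ingestion",
    "ingest_roots": [
        # "source" must be one of the allowed root names
        {"source": "vault", "subpath": "projects/p05-interferometer", "label": "anchor-docs"},
    ],
}
print(entry["id"])  # p05-interferometer
```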
## Allowed Source Roots

Current allowed `source` values are:

- `vault`
- `drive`

These map to the configured Dalidou source boundaries.

## Recommended Registration Rules

1. Prefer one canonical project id
2. Keep aliases short and practical
3. Start with the smallest useful staged roots
4. Prefer curated high-signal docs before broad corpora
5. Keep repo context selective at first
6. Avoid registering noisy or generated trees
7. Use `drive` for trusted operational material when available
8. Use `vault` for curated staged PKM and repo-doc snapshots

## Normal Workflow

For a new project:

1. stage the initial source docs on Dalidou
2. inspect the expected shape with:
   - `GET /projects/template`
   - or `atocore.sh project-template`
3. preview the entry without mutating state:
   - `POST /projects/proposal`
   - or `atocore.sh propose-project ...`
4. register the approved entry:
   - `POST /projects/register`
   - or `atocore.sh register-project ...`
5. verify the entry with:
   - `GET /projects`
   - or the T420 helper `atocore.sh projects`
6. refresh it with:
   - `POST /projects/{id}/refresh`
   - or `atocore.sh refresh-project <id>`
7. verify retrieval and context quality
8. only later promote stable facts into Trusted Project State

For an existing registered project:

1. inspect the current entry with:
   - `GET /projects`
   - or `atocore.sh projects`
2. update the registration if aliases, description, or roots need refinement:
   - `PUT /projects/{id}`
3. verify the updated entry
4. refresh the project again
5. verify retrieval and context quality did not regress

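For scripting, the new-project sequence reduces to an ordered list of API calls. This sketch only encodes the endpoint paths listed above; the helper name is invented:

```python
def new_project_calls(project_id: str) -> list[tuple[str, str]]:
    # steps 2-6 of the new-project workflow, as (method, path) pairs
    return [
        ("GET", "/projects/template"),
        ("POST", "/projects/proposal"),
        ("POST", "/projects/register"),
        ("GET", "/projects"),
        ("POST", f"/projects/{project_id}/refresh"),
    ]


for method, path in new_project_calls("p06-polisher"):
    print(method, path)
```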
## What Not To Do

Do not:

- register giant noisy trees blindly
- treat registration as equivalent to trusted state
- dump the full PKM by default
- rely on aliases that collide across projects
- use the live machine DB as a source root

## Template

Use:

- [project-registry.example.json](C:/Users/antoi/ATOCore/config/project-registry.example.json)

And the API template endpoint:

- `GET /projects/template`

Other lifecycle endpoints:

- `POST /projects/proposal`
- `POST /projects/register`
- `PUT /projects/{id}`
- `POST /projects/{id}/refresh`

103 docs/source-refresh-model.md Normal file
@@ -0,0 +1,103 @@

# AtoCore Source Refresh Model

## Purpose

This document explains how human-authored project material should flow into the
Dalidou-hosted AtoCore machine store.

It exists to make one distinction explicit:

- source markdown is not the same thing as the machine-memory layer
- source refresh is how changes in PKM or repos become visible to AtoCore

## Current Model

Today, the flow is:

1. human-authoritative project material exists in PKM, AtoDrive, and repos
2. selected high-value files are staged into Dalidou source paths
3. AtoCore ingests those source files
4. AtoCore stores the processed representation in:
   - document records
   - chunks
   - vectors
   - project memory
   - trusted project state
5. retrieval and context assembly use the machine store, not the staged folder

## Why This Feels Redundant

The staged source files can look almost identical to the original PKM notes or
repo docs because they are still source material.

That is expected.

The staged source area exists because the canonical AtoCore instance on Dalidou
needs a server-visible path to ingest from.

## What Happens When A Project Source Changes

If you edit a note in PKM or a doc in a repo:

- the original source changes immediately
- the staged Dalidou copy does not change automatically
- the AtoCore machine store also does not change automatically

To refresh AtoCore:

1. select the updated project source set
2. copy or mirror the new version into the Dalidou source area
3. run ingestion again
4. verify that retrieval and context reflect the new material

## Current Intentional Limits

The current active-project ingestion strategy is selective.

That means:

- not every note from a project is staged
- not every repo file is staged
- the goal is to start with high-value anchor docs
- broader ingestion comes later if needed

This is why the staged source area for a project may look partial or uneven at
this stage.

## Long-Run Target

The long-run workflow should become much more natural:

- each project has a registered source map
  - PKM root
  - AtoDrive root
  - repo root
  - preferred docs
  - excluded noisy paths
- a command like `refresh p06-polisher` resolves the right sources
- AtoCore refreshes the machine representation cleanly
- OpenClaw consumes the improved context over API

## Current Foundation

The first concrete foundation for this now exists in AtoCore:

- a project registry file records known project ids, aliases, and ingest roots
- the API can list those registered projects
- the API can return a registration template for new projects
- the API can preview a proposed registration before writing it
- the API can persist an approved registration to the registry
- the API can refresh a single registered project from its configured roots

This is not full source automation yet, but it gives the refresh model a real
home in the system.

## Healthy Mental Model

Use this distinction:

- PKM / AtoDrive / repos = human-authoritative sources
- staged Dalidou markdown = server-visible ingestion inputs
- AtoCore DB/vector state = compiled machine context layer

That separation is intentional and healthy.

36 pyproject.toml Normal file
@@ -0,0 +1,36 @@

[build-system]
requires = ["setuptools>=68.0", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "atocore"
version = "0.2.0"
description = "Personal context engine for LLM interactions"
requires-python = ">=3.11"
dependencies = [
    "fastapi>=0.110.0",
    "uvicorn[standard]>=0.27.0",
    "python-frontmatter>=1.1.0",
    "chromadb>=0.4.22",
    "sentence-transformers>=2.5.0",
    "pydantic>=2.6.0",
    "pydantic-settings>=2.1.0",
    "structlog>=24.1.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=8.0.0",
    "pytest-cov>=4.1.0",
    "httpx>=0.27.0",
    "pyyaml>=6.0.0",
]

[tool.setuptools.packages.find]
where = ["src"]

[tool.pytest.ini_options]
testpaths = ["tests"]
python_files = ["test_*.py"]
python_functions = ["test_*"]
addopts = "-v"

5 requirements-dev.txt Normal file
@@ -0,0 +1,5 @@

-r requirements.txt
pytest>=8.0.0
pytest-cov>=4.1.0
httpx>=0.27.0
pyyaml>=6.0.0

8 requirements.txt Normal file
@@ -0,0 +1,8 @@

fastapi>=0.110.0
uvicorn[standard]>=0.27.0
python-frontmatter>=1.1.0
chromadb>=0.4.22
sentence-transformers>=2.5.0
pydantic>=2.6.0
pydantic-settings>=2.1.0
structlog>=24.1.0

630 scripts/atocore_client.py Normal file
@@ -0,0 +1,630 @@

"""Operator-facing API client for live AtoCore instances.

This script is intentionally external to the app runtime. It is for admins
and operators who want a convenient way to inspect live project state,
refresh projects, audit retrieval quality, manage trusted project-state
entries, and drive the Phase 9 reflection loop (capture, extract, queue,
promote, reject).

Environment variables
---------------------

ATOCORE_BASE_URL
    Base URL of the AtoCore service (default: ``http://dalidou:8100``).

    When running ON the Dalidou host itself or INSIDE the Dalidou
    container, override this with loopback or the real IP::

        ATOCORE_BASE_URL=http://127.0.0.1:8100 \\
        python scripts/atocore_client.py health

    The default hostname "dalidou" is meant for cases where the
    caller is a remote machine (laptop, T420/OpenClaw, etc.) with
    "dalidou" in its /etc/hosts or resolvable via Tailscale. It does
    NOT reliably resolve on the host itself or inside the container,
    and when it fails the client returns
    ``{"status": "unavailable", "fail_open": true}`` — the right
    diagnosis when that happens is to set ATOCORE_BASE_URL explicitly
    to 127.0.0.1:8100 and retry.

ATOCORE_TIMEOUT_SECONDS
    Request timeout for most operations (default: 30).

ATOCORE_REFRESH_TIMEOUT_SECONDS
    Longer timeout for project refresh operations which can be slow
    (default: 1800).

ATOCORE_FAIL_OPEN
    When "true" (default), network errors return a small fail-open
    envelope instead of raising. Set to "false" for admin operations
    where you need the real error.
"""

from __future__ import annotations

import argparse
import json
import os
import re
import sys
import urllib.error
import urllib.parse
import urllib.request
from typing import Any


BASE_URL = os.environ.get("ATOCORE_BASE_URL", "http://dalidou:8100").rstrip("/")
TIMEOUT = int(os.environ.get("ATOCORE_TIMEOUT_SECONDS", "30"))
REFRESH_TIMEOUT = int(os.environ.get("ATOCORE_REFRESH_TIMEOUT_SECONDS", "1800"))
FAIL_OPEN = os.environ.get("ATOCORE_FAIL_OPEN", "true").lower() == "true"

# Bumped when the subcommand surface or JSON output shapes meaningfully
# change. See docs/architecture/llm-client-integration.md for the
# semver rules. History:
#   0.1.0  initial stable-ops-only client
#   0.2.0  Phase 9 reflection loop added: capture, extract,
#          reinforce-interaction, list-interactions, get-interaction,
#          queue, promote, reject
CLIENT_VERSION = "0.2.0"

def print_json(payload: Any) -> None:
    print(json.dumps(payload, ensure_ascii=True, indent=2))


def fail_open_payload() -> dict[str, Any]:
    return {"status": "unavailable", "source": "atocore", "fail_open": True}


def request(
    method: str,
    path: str,
    data: dict[str, Any] | None = None,
    timeout: int | None = None,
) -> Any:
    url = f"{BASE_URL}{path}"
    headers = {"Content-Type": "application/json"} if data is not None else {}
    payload = json.dumps(data).encode("utf-8") if data is not None else None
    req = urllib.request.Request(url, data=payload, headers=headers, method=method)
    try:
        with urllib.request.urlopen(req, timeout=timeout or TIMEOUT) as response:
            body = response.read().decode("utf-8")
    except urllib.error.HTTPError as exc:
        body = exc.read().decode("utf-8")
        if body:
            print(body)
        raise SystemExit(22) from exc
    except (urllib.error.URLError, TimeoutError, OSError):
        if FAIL_OPEN:
            print_json(fail_open_payload())
            raise SystemExit(0)
        raise

    if not body.strip():
        return {}
    return json.loads(body)


def parse_aliases(aliases_csv: str) -> list[str]:
    return [alias.strip() for alias in aliases_csv.split(",") if alias.strip()]


def detect_project(prompt: str) -> dict[str, Any]:
    payload = request("GET", "/projects")
    prompt_lower = prompt.lower()
    best_project = None
    best_alias = None
    best_score = -1

    for project in payload.get("projects", []):
        candidates = [project.get("id", ""), *project.get("aliases", [])]
        for candidate in candidates:
            candidate = (candidate or "").strip()
            if not candidate:
                continue
            pattern = rf"(?<![a-z0-9]){re.escape(candidate.lower())}(?![a-z0-9])"
            matched = re.search(pattern, prompt_lower) is not None
            if not matched and candidate.lower() not in prompt_lower:
                continue
            score = len(candidate)
            if score > best_score:
                best_project = project.get("id")
                best_alias = candidate
                best_score = score

    return {"matched_project": best_project, "matched_alias": best_alias}


def classify_result(result: dict[str, Any]) -> dict[str, Any]:
    source_file = (result.get("source_file") or "").lower()
    heading = (result.get("heading_path") or "").lower()
    title = (result.get("title") or "").lower()
    text = " ".join([source_file, heading, title])

    labels: list[str] = []
    if any(token in text for token in ["_archive", "/archive", "archive/", "pre-cleanup", "pre-migration", "history"]):
        labels.append("archive_or_history")
    if any(token in text for token in ["status", "dashboard", "current-state", "current state", "next-steps", "next steps"]):
        labels.append("current_status")
    if any(token in text for token in ["decision", "adr", "tradeoff", "selected architecture", "selection"]):
        labels.append("decision")
    if any(token in text for token in ["requirement", "spec", "constraints", "baseline", "cdr", "sow"]):
        labels.append("requirements")
    if any(token in text for token in ["roadmap", "milestone", "plan", "workflow", "calibration", "contract"]):
        labels.append("execution_plan")
    if not labels:
        labels.append("reference")

    return {
        "score": result.get("score"),
        "title": result.get("title"),
        "heading_path": result.get("heading_path"),
        "source_file": result.get("source_file"),
        "labels": labels,
        "is_noise_risk": "archive_or_history" in labels,
    }


def audit_query(prompt: str, top_k: int, project: str | None) -> dict[str, Any]:
    response = request(
        "POST",
        "/query",
        {"prompt": prompt, "top_k": top_k, "project": project or None},
    )
    classifications = [classify_result(result) for result in response.get("results", [])]
    broad_prompt = len(prompt.split()) <= 2
    noise_hits = sum(1 for item in classifications if item["is_noise_risk"])
    current_hits = sum(1 for item in classifications if "current_status" in item["labels"])
    decision_hits = sum(1 for item in classifications if "decision" in item["labels"])
    requirements_hits = sum(1 for item in classifications if "requirements" in item["labels"])

    recommendations: list[str] = []
    if broad_prompt:
        recommendations.append("Prompt is broad; prefer a project-specific question with intent, artifact type, or constraint language.")
    if noise_hits:
        recommendations.append("Archive/history noise is present; prefer current-status, decision, requirements, and baseline docs in the next ingestion/ranking pass.")
    if current_hits == 0:
        recommendations.append("No current-status docs surfaced in the top results; Wave 2 should ingest or strengthen trusted operational truth.")
    if decision_hits == 0:
        recommendations.append("No decision docs surfaced in the top results; add or freeze decision logs for the active project.")
    if requirements_hits == 0:
        recommendations.append("No requirements/baseline docs surfaced in the top results; prioritize baseline and architecture-freeze material.")
    if not recommendations:
        recommendations.append("Ranking looks healthy for this prompt.")

    return {
        "prompt": prompt,
        "project": project,
        "top_k": top_k,
        "broad_prompt": broad_prompt,
        "noise_hits": noise_hits,
        "current_status_hits": current_hits,
        "decision_hits": decision_hits,
        "requirements_hits": requirements_hits,
        "results": classifications,
        "recommendations": recommendations,
    }


def project_payload(
    project_id: str,
    aliases_csv: str,
    source: str,
    subpath: str,
    description: str,
    label: str,
) -> dict[str, Any]:
    return {
        "project_id": project_id,
        "aliases": parse_aliases(aliases_csv),
        "description": description,
        "ingest_roots": [{"source": source, "subpath": subpath, "label": label}],
    }


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="AtoCore live API client")
    sub = parser.add_subparsers(dest="command", required=True)

    for name in ["health", "sources", "stats", "projects", "project-template", "debug-context", "ingest-sources"]:
        sub.add_parser(name)

    p = sub.add_parser("detect-project")
    p.add_argument("prompt")

    p = sub.add_parser("auto-context")
    p.add_argument("prompt")
    p.add_argument("budget", nargs="?", type=int, default=3000)
    p.add_argument("project", nargs="?", default="")

    for name in ["propose-project", "register-project"]:
        p = sub.add_parser(name)
        p.add_argument("project_id")
        p.add_argument("aliases_csv")
        p.add_argument("source")
        p.add_argument("subpath")
        p.add_argument("description", nargs="?", default="")
        p.add_argument("label", nargs="?", default="")

    p = sub.add_parser("update-project")
    p.add_argument("project")
    p.add_argument("description")
    p.add_argument("aliases_csv", nargs="?", default="")

    p = sub.add_parser("refresh-project")
    p.add_argument("project")
    p.add_argument("purge_deleted", nargs="?", default="false")

    p = sub.add_parser("project-state")
    p.add_argument("project")
    p.add_argument("category", nargs="?", default="")

    p = sub.add_parser("project-state-set")
    p.add_argument("project")
    p.add_argument("category")
    p.add_argument("key")
    p.add_argument("value")
    p.add_argument("source", nargs="?", default="")
    p.add_argument("confidence", nargs="?", type=float, default=1.0)

    p = sub.add_parser("project-state-invalidate")
    p.add_argument("project")
    p.add_argument("category")
    p.add_argument("key")

    p = sub.add_parser("query")
    p.add_argument("prompt")
    p.add_argument("top_k", nargs="?", type=int, default=5)
    p.add_argument("project", nargs="?", default="")

    p = sub.add_parser("context-build")
    p.add_argument("prompt")
    p.add_argument("project", nargs="?", default="")
    p.add_argument("budget", nargs="?", type=int, default=3000)

    p = sub.add_parser("audit-query")
    p.add_argument("prompt")
    p.add_argument("top_k", nargs="?", type=int, default=5)
    p.add_argument("project", nargs="?", default="")

    # --- Phase 9 reflection loop surface --------------------------------
    #
    # capture: record one interaction (prompt + response + context used).
    # Mirrors POST /interactions. response is positional so shell
    # callers can pass it via $(cat file.txt) or heredoc. project,
    # client, and session_id are optional positionals with empty
    # defaults, matching the existing script's style.
    p = sub.add_parser("capture")
    p.add_argument("prompt")
    p.add_argument("response", nargs="?", default="")
    p.add_argument("project", nargs="?", default="")
    p.add_argument("client", nargs="?", default="")
    p.add_argument("session_id", nargs="?", default="")
    p.add_argument("reinforce", nargs="?", default="true")

    # extract: run the Phase 9 rule-based extractor against an
    # already-captured interaction. persist='true' writes the
    # candidates as status='candidate' memories; default is
    # preview-only.
    p = sub.add_parser("extract")
    p.add_argument("interaction_id")
    p.add_argument("persist", nargs="?", default="false")

    # reinforce: backfill reinforcement on an already-captured interaction.
    p = sub.add_parser("reinforce-interaction")
    p.add_argument("interaction_id")

    # list-interactions: paginated listing with filters.
    p = sub.add_parser("list-interactions")
    p.add_argument("project", nargs="?", default="")
    p.add_argument("session_id", nargs="?", default="")
    p.add_argument("client", nargs="?", default="")
    p.add_argument("since", nargs="?", default="")
    p.add_argument("limit", nargs="?", type=int, default=50)

    # get-interaction: fetch one by id.
    p = sub.add_parser("get-interaction")
    p.add_argument("interaction_id")

    # queue: list the candidate review queue.
    p = sub.add_parser("queue")
    p.add_argument("memory_type", nargs="?", default="")
    p.add_argument("project", nargs="?", default="")
    p.add_argument("limit", nargs="?", type=int, default=50)

    # promote: candidate -> active.
    p = sub.add_parser("promote")
    p.add_argument("memory_id")

    # reject: candidate -> invalid.
    p = sub.add_parser("reject")
    p.add_argument("memory_id")

    # batch-extract: fan out /interactions/{id}/extract?persist=true across
    # recent interactions. Idempotent — the extractor create_memory path
    # silently skips duplicates, so re-running is safe.
    p = sub.add_parser("batch-extract")
    p.add_argument("since", nargs="?", default="")
    p.add_argument("project", nargs="?", default="")
    p.add_argument("limit", nargs="?", type=int, default=100)
    p.add_argument("persist", nargs="?", default="true")

    # triage: interactive candidate review loop. Fetches the queue, shows
    # each candidate, accepts p/r/s (promote / reject / skip) / q (quit).
    p = sub.add_parser("triage")
    p.add_argument("memory_type", nargs="?", default="")
    p.add_argument("project", nargs="?", default="")
    p.add_argument("limit", nargs="?", type=int, default=50)

    return parser

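The subcommands above lean on one argparse convention throughout: required positionals first, then optional trailing positionals with empty-string defaults, with booleans passed as strings and checked against a truthy set. A minimal sketch of that pattern (the `demo` subcommand here is hypothetical, not part of the script; the `TRUTHY` set mirrors the literals the client checks):

```python
import argparse

# Truthy literals mirroring the client's string-boolean convention
# (assumption: same set the script uses for persist/reinforce flags).
TRUTHY = {"1", "true", "yes", "y"}

parser = argparse.ArgumentParser(description="optional-positional sketch")
sub = parser.add_subparsers(dest="command", required=True)
p = sub.add_parser("demo")                             # hypothetical subcommand
p.add_argument("prompt")                               # required positional
p.add_argument("project", nargs="?", default="")       # optional positional
p.add_argument("persist", nargs="?", default="false")  # string boolean

args = parser.parse_args(["demo", "hello"])
print(args.persist.lower() in TRUTHY)  # prints False: default "false"

args = parser.parse_args(["demo", "hello", "atocore", "true"])
print(args.persist.lower() in TRUTHY)  # prints True
```

This keeps every argument shell-friendly: callers never need `--flag` syntax, and omitted trailing arguments quietly fall back to empty strings the handlers treat as "not provided".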
def main() -> int:
    args = build_parser().parse_args()
    cmd = args.command

    if cmd == "health":
        print_json(request("GET", "/health"))
    elif cmd == "sources":
        print_json(request("GET", "/sources"))
    elif cmd == "stats":
        print_json(request("GET", "/stats"))
    elif cmd == "projects":
        print_json(request("GET", "/projects"))
    elif cmd == "project-template":
        print_json(request("GET", "/projects/template"))
    elif cmd == "debug-context":
        print_json(request("GET", "/debug/context"))
    elif cmd == "ingest-sources":
        print_json(request("POST", "/ingest/sources", {}))
    elif cmd == "detect-project":
        print_json(detect_project(args.prompt))
    elif cmd == "auto-context":
        project = args.project or detect_project(args.prompt).get("matched_project") or ""
        if not project:
            print_json({"status": "no_project_match", "source": "atocore", "mode": "auto-context"})
        else:
            print_json(request("POST", "/context/build", {"prompt": args.prompt, "project": project, "budget": args.budget}))
    elif cmd in {"propose-project", "register-project"}:
        path = "/projects/proposal" if cmd == "propose-project" else "/projects/register"
        print_json(request("POST", path, project_payload(args.project_id, args.aliases_csv, args.source, args.subpath, args.description, args.label)))
    elif cmd == "update-project":
        payload: dict[str, Any] = {"description": args.description}
        if args.aliases_csv.strip():
            payload["aliases"] = parse_aliases(args.aliases_csv)
        print_json(request("PUT", f"/projects/{urllib.parse.quote(args.project)}", payload))
    elif cmd == "refresh-project":
        purge_deleted = args.purge_deleted.lower() in {"1", "true", "yes", "y"}
        path = f"/projects/{urllib.parse.quote(args.project)}/refresh?purge_deleted={str(purge_deleted).lower()}"
        print_json(request("POST", path, {}, timeout=REFRESH_TIMEOUT))
    elif cmd == "project-state":
        suffix = f"?category={urllib.parse.quote(args.category)}" if args.category else ""
        print_json(request("GET", f"/project/state/{urllib.parse.quote(args.project)}{suffix}"))
    elif cmd == "project-state-set":
        print_json(request("POST", "/project/state", {
            "project": args.project,
            "category": args.category,
            "key": args.key,
            "value": args.value,
            "source": args.source,
            "confidence": args.confidence,
        }))
    elif cmd == "project-state-invalidate":
        print_json(request("DELETE", "/project/state", {"project": args.project, "category": args.category, "key": args.key}))
    elif cmd == "query":
        print_json(request("POST", "/query", {"prompt": args.prompt, "top_k": args.top_k, "project": args.project or None}))
    elif cmd == "context-build":
        print_json(request("POST", "/context/build", {"prompt": args.prompt, "project": args.project or None, "budget": args.budget}))
    elif cmd == "audit-query":
        print_json(audit_query(args.prompt, args.top_k, args.project or None))
    # --- Phase 9 reflection loop surface ------------------------------
    elif cmd == "capture":
        body: dict[str, Any] = {
            "prompt": args.prompt,
            "response": args.response,
            "project": args.project,
            "client": args.client or "atocore-client",
            "session_id": args.session_id,
            "reinforce": args.reinforce.lower() in {"1", "true", "yes", "y"},
        }
        print_json(request("POST", "/interactions", body))
    elif cmd == "extract":
        persist = args.persist.lower() in {"1", "true", "yes", "y"}
        print_json(
            request(
                "POST",
                f"/interactions/{urllib.parse.quote(args.interaction_id, safe='')}/extract",
                {"persist": persist},
            )
        )
    elif cmd == "reinforce-interaction":
        print_json(
            request(
                "POST",
                f"/interactions/{urllib.parse.quote(args.interaction_id, safe='')}/reinforce",
                {},
            )
        )
    elif cmd == "list-interactions":
        query_parts: list[str] = []
        if args.project:
            query_parts.append(f"project={urllib.parse.quote(args.project)}")
        if args.session_id:
            query_parts.append(f"session_id={urllib.parse.quote(args.session_id)}")
        if args.client:
            query_parts.append(f"client={urllib.parse.quote(args.client)}")
        if args.since:
            query_parts.append(f"since={urllib.parse.quote(args.since)}")
        query_parts.append(f"limit={int(args.limit)}")
        query = "?" + "&".join(query_parts)
        print_json(request("GET", f"/interactions{query}"))
    elif cmd == "get-interaction":
        print_json(
            request(
                "GET",
                f"/interactions/{urllib.parse.quote(args.interaction_id, safe='')}",
            )
        )
    elif cmd == "queue":
        query_parts = ["status=candidate"]
        if args.memory_type:
            query_parts.append(f"memory_type={urllib.parse.quote(args.memory_type)}")
        if args.project:
            query_parts.append(f"project={urllib.parse.quote(args.project)}")
        query_parts.append(f"limit={int(args.limit)}")
        query = "?" + "&".join(query_parts)
        print_json(request("GET", f"/memory{query}"))
    elif cmd == "promote":
        print_json(
            request(
                "POST",
                f"/memory/{urllib.parse.quote(args.memory_id, safe='')}/promote",
                {},
            )
        )
    elif cmd == "reject":
        print_json(
            request(
                "POST",
                f"/memory/{urllib.parse.quote(args.memory_id, safe='')}/reject",
                {},
            )
        )
    elif cmd == "batch-extract":
        print_json(run_batch_extract(args.since, args.project, args.limit, args.persist))
    elif cmd == "triage":
        return run_triage(args.memory_type, args.project, args.limit)
    else:
        return 1
    return 0

def run_batch_extract(since: str, project: str, limit: int, persist_flag: str) -> dict:
    """Fetch recent interactions and run the extractor against each one.

    Returns an aggregated summary. Safe to re-run: the server-side
    persist path catches ValueError on duplicates and the endpoint
    reports per-interaction candidate counts either way.
    """
    persist = persist_flag.lower() in {"1", "true", "yes", "y"}
    query_parts: list[str] = []
    if project:
        query_parts.append(f"project={urllib.parse.quote(project)}")
    if since:
        query_parts.append(f"since={urllib.parse.quote(since)}")
    query_parts.append(f"limit={int(limit)}")
    query = "?" + "&".join(query_parts)

    listing = request("GET", f"/interactions{query}")
    interactions = listing.get("interactions", []) if isinstance(listing, dict) else []

    processed = 0
    total_candidates = 0
    total_persisted = 0
    errors: list[dict] = []
    per_interaction: list[dict] = []

    for item in interactions:
        iid = item.get("id") or ""
        if not iid:
            continue
        try:
            result = request(
                "POST",
                f"/interactions/{urllib.parse.quote(iid, safe='')}/extract",
                {"persist": persist},
            )
        except Exception as exc:  # pragma: no cover - network errors land here
            errors.append({"interaction_id": iid, "error": str(exc)})
            continue
        processed += 1
        count = int(result.get("candidate_count", 0) or 0)
        persisted_ids = result.get("persisted_ids") or []
        total_candidates += count
        total_persisted += len(persisted_ids)
        if count:
            per_interaction.append(
                {
                    "interaction_id": iid,
                    "candidate_count": count,
                    "persisted_count": len(persisted_ids),
                    "project": item.get("project") or "",
                }
            )

    return {
        "processed": processed,
        "total_candidates": total_candidates,
        "total_persisted": total_persisted,
        "persist": persist,
        "errors": errors,
        "interactions_with_candidates": per_interaction,
    }

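The aggregation in `run_batch_extract` is a plain fold over per-interaction extractor responses. The counting logic can be isolated as a pure function (a hypothetical helper shown only to illustrate the summary shape; the real function accumulates inline while making HTTP calls):

```python
def summarize(results: list[dict]) -> dict:
    # Fold extractor responses into the same totals run_batch_extract reports.
    # Tolerates missing/None fields the same way: `or 0` and `or []`.
    total_candidates = sum(int(r.get("candidate_count", 0) or 0) for r in results)
    total_persisted = sum(len(r.get("persisted_ids") or []) for r in results)
    return {
        "processed": len(results),
        "total_candidates": total_candidates,
        "total_persisted": total_persisted,
    }


print(summarize([
    {"candidate_count": 2, "persisted_ids": ["a", "b"]},
    {"candidate_count": 0, "persisted_ids": []},
]))
# -> {'processed': 2, 'total_candidates': 2, 'total_persisted': 2}
```

Because the server skips duplicate memories on persist, re-running the fold over the same interactions changes `total_candidates` but not the set of stored memories.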
def run_triage(memory_type: str, project: str, limit: int) -> int:
    """Interactive review of candidate memories.

    Loads the queue once, walks through entries, prompts for
    (p)romote / (r)eject / (s)kip / (q)uit. Stateless between runs —
    re-running picks up whatever is still status=candidate.
    """
    query_parts = ["status=candidate"]
    if memory_type:
        query_parts.append(f"memory_type={urllib.parse.quote(memory_type)}")
    if project:
        query_parts.append(f"project={urllib.parse.quote(project)}")
    query_parts.append(f"limit={int(limit)}")
    listing = request("GET", "/memory?" + "&".join(query_parts))
    memories = listing.get("memories", []) if isinstance(listing, dict) else []

    if not memories:
        print_json({"status": "empty_queue", "count": 0})
        return 0

    promoted = 0
    rejected = 0
    skipped = 0
    stopped_early = False

    print(f"Triage queue: {len(memories)} candidate(s)\n", file=sys.stderr)
    for idx, mem in enumerate(memories, 1):
        mid = mem.get("id", "")
        print(f"[{idx}/{len(memories)}] {mem.get('memory_type', '?')} project={mem.get('project', '')} conf={mem.get('confidence', '?')}", file=sys.stderr)
        print(f"  id: {mid}", file=sys.stderr)
        print(f"  {mem.get('content', '')}", file=sys.stderr)
        try:
            choice = input("  (p)romote / (r)eject / (s)kip / (q)uit > ").strip().lower()
        except EOFError:
            stopped_early = True
            break
        if choice in {"q", "quit"}:
            stopped_early = True
            break
        if choice in {"p", "promote"}:
            request("POST", f"/memory/{urllib.parse.quote(mid, safe='')}/promote", {})
            promoted += 1
            print("  -> promoted", file=sys.stderr)
        elif choice in {"r", "reject"}:
            request("POST", f"/memory/{urllib.parse.quote(mid, safe='')}/reject", {})
            rejected += 1
            print("  -> rejected", file=sys.stderr)
        else:
            skipped += 1
            print("  -> skipped", file=sys.stderr)

    print_json(
        {
            "reviewed": promoted + rejected + skipped,
            "promoted": promoted,
            "rejected": rejected,
            "skipped": skipped,
            "stopped_early": stopped_early,
            "remaining_in_queue": len(memories) - (promoted + rejected + skipped) - (1 if stopped_early else 0),
        }
    )
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
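The per-item branch in `run_triage` reduces to a small decision table over the normalized input. A sketch of that mapping as a pure function (hypothetical helper for illustration; the script itself branches inline and issues the HTTP call directly):

```python
def triage_action(choice: str) -> str:
    # Map a raw input string to the action run_triage would take.
    # Anything unrecognized falls through to "skip", matching the loop's else.
    c = choice.strip().lower()
    if c in {"q", "quit"}:
        return "quit"
    if c in {"p", "promote"}:
        return "promote"
    if c in {"r", "reject"}:
        return "reject"
    return "skip"


print(triage_action("P"))   # prints promote
print(triage_action(""))    # prints skip
```

Factoring the decision out like this is mainly useful for testing: the queue-walking loop stays interactive, while the mapping itself can be exercised without stdin.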
54 scripts/ingest_folder.py Normal file
@@ -0,0 +1,54 @@
|
||||
"""CLI script to ingest a folder of markdown files."""
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
# Add src to path
|
||||
sys.path.insert(0, str(Path(__file__).parent.parent / "src"))
|
||||
|
||||
from atocore.ingestion.pipeline import ingest_folder
|
||||
from atocore.models.database import init_db
|
||||
from atocore.observability.logger import setup_logging
|
||||
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser(description="Ingest markdown files into AtoCore")
|
||||
parser.add_argument("--path", required=True, help="Path to folder with markdown files")
|
||||
args = parser.parse_args()
|
||||
|
||||
setup_logging()
|
||||
init_db()
|
||||
|
||||
folder = Path(args.path)
|
||||
if not folder.is_dir():
|
||||
print(f"Error: {folder} is not a directory")
|
||||
sys.exit(1)
|
||||
|
||||
results = ingest_folder(folder)
|
||||
|
||||
# Summary
|
||||
ingested = sum(1 for r in results if r["status"] == "ingested")
|
||||
skipped = sum(1 for r in results if r["status"] == "skipped")
|
||||
errors = sum(1 for r in results if r["status"] == "error")
|
||||
total_chunks = sum(r.get("chunks", 0) for r in results)
|
||||
|
||||
print(f"\n{'='*50}")
|
||||
print(f"Ingestion complete:")
|
||||
print(f" Files processed: {len(results)}")
|
||||
print(f" Ingested: {ingested}")
|
||||
print(f" Skipped (unchanged): {skipped}")
|
||||
print(f" Errors: {errors}")
|
||||
print(f" Total chunks created: {total_chunks}")
|
||||
print(f"{'='*50}")
|
||||
|
||||
if errors:
|
||||
print("\nErrors:")
|
||||
for r in results:
|
||||
if r["status"] == "error":
|
||||
print(f" {r['file']}: {r['error']}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
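The per-status tally at the end of the script walks the results list once per status. An equivalent single-pass sketch with `collections.Counter` (the same pattern, not the script's code; the sample `results` rows are hypothetical):

```python
from collections import Counter

# Hypothetical result rows in the shape ingest_folder returns.
results = [
    {"status": "ingested", "chunks": 4},
    {"status": "skipped"},
    {"status": "ingested", "chunks": 2},
    {"status": "error", "error": "bad frontmatter", "file": "a.md"},
]

counts = Counter(r["status"] for r in results)       # one pass over statuses
total_chunks = sum(r.get("chunks", 0) for r in results)

print(counts["ingested"], counts["skipped"], counts["error"], total_chunks)
# -> 2 1 1 6
```

Either form is fine at this scale; `Counter` mostly pays off when the set of statuses grows.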
1018 scripts/migrate_legacy_aliases.py Normal file
File diff suppressed because it is too large
393 scripts/phase9_first_real_use.py Normal file
@@ -0,0 +1,393 @@
|
||||
"""Phase 9 first-real-use validation script.
|
||||
|
||||
Captures a small set of representative interactions drawn from a real
|
||||
working session, runs the full Phase 9 loop (capture -> reinforce ->
|
||||
extract) over them, and prints what each step produced. The intent is
|
||||
to generate empirical evidence about the extractor's behaviour against
|
||||
prose that wasn't written to make the test pass.
|
||||
|
||||
Usage:
|
||||
python scripts/phase9_first_real_use.py [--data-dir PATH]
|
||||
|
||||
The script writes a fresh isolated SQLite + Chroma store under the
|
||||
given data dir (default: ./data/validation/phase9-first-use). The
|
||||
data dir is gitignored so the script can be re-run cleanly.
|
||||
|
||||
Each interaction is printed with:
|
||||
- the captured interaction id
|
||||
- the reinforcement results (which seeded memories were echoed)
|
||||
- the extraction results (which candidates were proposed and why)
|
||||
- notes on what the extractor MISSED (manually annotated below)
|
||||
|
||||
The output is intentionally human-readable so the run can be saved as
|
||||
the body of docs/phase9-first-real-use.md.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import os
|
||||
import shutil
|
||||
import sys
|
||||
from dataclasses import dataclass
|
||||
from pathlib import Path
|
||||
|
||||
# Make the package importable when run directly from the repo root.
|
||||
_REPO_ROOT = Path(__file__).resolve().parent.parent
|
||||
sys.path.insert(0, str(_REPO_ROOT / "src"))
|
||||
|
||||
|
||||
@dataclass
|
||||
class SampleInteraction:
|
||||
label: str
|
||||
prompt: str
|
||||
response: str
|
||||
project: str
|
||||
notes_for_human: str
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Sample interactions, paraphrased from real working sessions on AtoCore
|
||||
# itself. Each one is a real-feeling exchange with at least one structural
|
||||
# cue the extractor should catch, plus prose the extractor should NOT
|
||||
# accidentally catch.
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
SAMPLES: list[SampleInteraction] = [
|
||||
SampleInteraction(
|
||||
label="exdev-mount-merge-decision",
|
||||
project="atocore",
|
||||
prompt="The TMPDIR fix didn't work, the LOCAL_COPY_PATH change didn't work, what now?",
|
||||
response=(
|
||||
"We've been chasing this through three layers without success. "
|
||||
"Time to stop guessing and apply the simple fix Dalidou Claude "
|
||||
"proposed in the very first diagnosis.\n\n"
|
||||
"## Decision: merge the two bind mounts into one\n\n"
|
||||
"Both /srv/storage/repos and /srv/storage/gitea/data live on "
|
||||
"/dev/sdb1, so the move is a same-filesystem rename — instant, "
|
||||
"atomic, no data copy. After the move, the EXDEV class of bug "
|
||||
"becomes structurally impossible because there is only one "
|
||||
"mount inside the container."
|
||||
),
|
||||
notes_for_human=(
|
||||
"Should fire decision_heading rule. Should NOT fire requirement "
|
||||
"or constraint rules. The 'becomes structurally impossible' line "
|
||||
"is prose, not a candidate."
|
||||
),
|
||||
),
|
||||
SampleInteraction(
|
||||
label="ownership-was-the-real-fix",
|
||||
project="atocore",
|
||||
prompt="Still failing after the mount merge. What's actually wrong?",
|
||||
response=(
|
||||
"Same rejection even after the mounts were merged, which kills "
|
||||
"the EXDEV theory. The error 'unable to migrate objects to "
|
||||
"permanent storage' is git's catch-all for any failure in "
|
||||
"tmp_objdir_migrate, and the real errno is being thrown away.\n\n"
|
||||
"## Fact: the actual root cause was ownership, not cross-device link\n\n"
|
||||
"After mv on the host, the moved directory had perms git could "
|
||||
"navigate for some subdirs but not others. Recursive chown "
|
||||
"git:git plus chmod u+rwX,g+rwX fixed it on the first try."
|
||||
),
|
||||
notes_for_human=(
|
||||
"Should fire fact_heading rule, target type knowledge. The "
|
||||
"'unable to migrate' string is inside prose and should NOT "
|
||||
"trigger any rule."
|
||||
),
|
||||
),
|
||||
SampleInteraction(
|
||||
label="memory-vs-entity-canonical-home",
|
||||
project="atocore",
|
||||
prompt="Does a Decision live in memories or entities?",
|
||||
response=(
|
||||
"Both layers exist for different reasons. After thinking through "
|
||||
"the ontology question, I'm landing on a clean rule: one "
|
||||
"canonical home per concept, with explicit graduation when a "
|
||||
"memory eventually deserves structure.\n\n"
|
||||
"## Decision: project, knowledge, and adaptation memory types graduate to entities\n\n"
|
||||
"The identity, preference, and episodic types stay in the memory "
|
||||
"layer because they're personal and unstructured. The other "
|
||||
"three become entities once the engineering layer V1 ships. The "
|
||||
"original memory rows are NEVER deleted — they become frozen "
|
||||
"historical pointers to whichever entity they graduated into."
|
||||
),
|
||||
notes_for_human=(
|
||||
"Should fire decision_heading rule. Long content (over 100 "
|
||||
"chars) — verifies the max-length cap of 280 doesn't truncate."
|
||||
),
|
||||
),
|
||||
SampleInteraction(
|
||||
label="auto-promotion-deferred",
|
||||
project="atocore",
|
||||
prompt="Should the engineering layer auto-promote candidates?",
|
||||
response=(
|
||||
"The trust hierarchy is the whole point of AtoCore. If we let "
|
||||
"the system auto-promote, we trade away the 'bad memory is "
|
||||
"worse than no memory' invariant for a marginal speed gain.\n\n"
|
||||
"## Decision: no auto-promotion in V1, ever\n\n"
|
||||
"The schema and API are designed so policies can be added "
|
||||
"later without migration, but the V1 stance is that every "
|
||||
"promotion needs a human signoff. The hard line: nothing ever "
|
||||
"moves into trusted project state automatically, even with the "
|
||||
"engineering layer running."
|
||||
),
|
||||
notes_for_human=(
|
||||
"Should fire decision_heading rule. Has 'invariant' jargon — "
|
||||
"verify the extractor doesn't get distracted."
|
||||
),
|
||||
),
|
||||
SampleInteraction(
|
||||
label="preference-rebase-workflow",
|
||||
project="atocore",
|
||||
prompt="Should we use rebase or merge commits for this branch?",
|
||||
response=(
|
||||
"I prefer rebase-based workflows because the history stays "
|
||||
"linear and reviewers have an easier time. For the dalidou "
|
||||
"storage-foundation branch we'll rebase before merging into "
|
||||
"main so the timeline reads as one coherent narrative."
|
||||
),
|
||||
notes_for_human=(
|
||||
"Should fire preference_sentence rule. Note: the rule looks "
|
||||
"for 'I prefer' which is exactly the wording used."
|
||||
),
|
||||
),
|
||||
SampleInteraction(
|
||||
label="constraint-from-doc-cite",
|
||||
project="p05-interferometer",
|
||||
prompt="What's the budget situation for p05 right now?",
|
||||
response=(
|
||||
"Looking at the latest design notes, the constraint is fairly "
|
||||
"tight on the optical path side.\n\n"
|
||||
"### Constraint: total optical path stability must stay within 10 nm RMS at 1 kHz\n\n"
|
||||
"The error budget allocates 6 nm to the laser source and 4 nm "
|
||||
"to the mechanical mounts and frame combined."
|
||||
),
|
||||
notes_for_human=(
|
||||
"Should fire constraint_heading rule with type=project. The "
|
||||
"second sentence ('error budget allocates 6 nm') is descriptive "
|
||||
"and should NOT fire any rule because it doesn't have a "
|
||||
"structural cue."
|
||||
),
|
||||
),
|
||||
SampleInteraction(
|
||||
label="prose-only-no-cues",
|
||||
project="atocore",
|
||||
prompt="What should I work on next?",
|
||||
response=(
|
||||
"Looking at the current state of AtoCore, the next natural "
|
||||
"step is to validate Phase 9 against real interactions before "
|
||||
"starting the engineering layer implementation. Empirical "
|
||||
"evidence is missing and the planning docs would benefit from "
|
||||
"real signal."
|
||||
),
|
||||
notes_for_human=(
|
||||
"Should produce ZERO candidates. Pure recommendation prose, no "
|
||||
"structural cues. If this fires anything the extractor is too "
|
||||
"loose."
|
||||
),
|
||||
),
|
||||
SampleInteraction(
|
||||
label="multiple-cues-in-one-interaction",
|
||||
project="p06-polisher",
|
||||
prompt="Summarize today's polisher session",
|
||||
response=(
|
||||
"We worked through three things in the polisher session today.\n\n"
|
||||
"## Decision: defer the laser interlock redesign to after the July milestone\n\n"
|
||||
"## Constraint: the calibration routine must complete in under 90 seconds for production use\n\n"
|
||||
"## Requirement: the polisher must hold position to within 0.5 micron at 1 g loading\n\n"
|
||||
"Action items captured for the next sync."
|
||||
),
|
||||
notes_for_human=(
|
||||
"Three rules should fire on the same interaction: "
|
||||
"decision_heading -> adaptation, constraint_heading -> project, "
|
||||
"requirement_heading -> project. Verify dedup doesn't merge them."
|
||||
),
|
||||
),
|
||||
]
|
||||
|
||||
|
||||
def setup_environment(data_dir: Path) -> None:
|
||||
"""Configure AtoCore to use an isolated data directory for this run."""
|
||||
if data_dir.exists():
|
||||
shutil.rmtree(data_dir)
|
||||
data_dir.mkdir(parents=True, exist_ok=True)
|
||||
os.environ["ATOCORE_DATA_DIR"] = str(data_dir)
|
||||
os.environ.setdefault("ATOCORE_DEBUG", "true")
|
||||
# Reset cached settings so the new env vars take effect
|
||||
import atocore.config as config
|
||||
|
||||
config.settings = config.Settings()
|
||||
import atocore.retrieval.vector_store as vs
|
||||
|
||||
vs._store = None
|
||||
|
||||
|
||||
def seed_memories() -> dict[str, str]:
|
||||
"""Insert a small set of seed active memories so reinforcement has
|
||||
something to match against."""
|
||||
from atocore.memory.service import create_memory
|
||||
|
||||
seeded: dict[str, str] = {}
|
||||
seeded["pref_rebase"] = create_memory(
|
||||
memory_type="preference",
|
||||
content="prefers rebase-based workflows because history stays linear",
|
||||
confidence=0.6,
|
||||
).id
|
||||
seeded["pref_concise"] = create_memory(
|
||||
memory_type="preference",
|
||||
content="writes commit messages focused on the why, not the what",
|
||||
confidence=0.6,
|
||||
).id
|
||||
seeded["identity_runs_atocore"] = create_memory(
|
||||
memory_type="identity",
|
||||
content="mechanical engineer who runs AtoCore for context engineering",
|
||||
confidence=0.9,
|
||||
).id
|
||||
return seeded
|
||||
|
||||
|
||||
def run_sample(sample: SampleInteraction) -> dict:
|
||||
"""Capture one sample, run extraction, return a result dict."""
|
||||
from atocore.interactions.service import record_interaction
|
||||
from atocore.memory.extractor import extract_candidates_from_interaction
|
||||
|
||||
interaction = record_interaction(
|
||||
prompt=sample.prompt,
|
||||
response=sample.response,
|
||||
project=sample.project,
|
||||
client="phase9-first-real-use",
|
||||
session_id="first-real-use",
|
||||
reinforce=True,
|
||||
)
|
||||
candidates = extract_candidates_from_interaction(interaction)
|
||||
|
||||
return {
|
||||
"label": sample.label,
|
||||
"project": sample.project,
|
||||
"interaction_id": interaction.id,
|
||||
"expected_notes": sample.notes_for_human,
|
||||
"candidate_count": len(candidates),
|
||||
"candidates": [
|
||||
{
|
||||
"memory_type": c.memory_type,
|
||||
"rule": c.rule,
|
||||
"content": c.content,
|
||||
"source_span": c.source_span[:120],
|
||||
}
|
||||
for c in candidates
|
||||
],
|
||||
}
|
||||
|
||||
|
||||
def report_seed_memory_state(seeded_ids: dict[str, str]) -> dict:
|
||||
from atocore.memory.service import get_memories
|
||||
|
||||
state = {}
|
||||
for label, mid in seeded_ids.items():
|
||||
rows = [m for m in get_memories(limit=200) if m.id == mid]
|
||||
if not rows:
|
||||
state[label] = None
|
||||
continue
|
||||
m = rows[0]
|
||||
state[label] = {
|
||||
"id": m.id,
|
||||
"memory_type": m.memory_type,
|
||||
"content_preview": m.content[:80],
|
||||
"confidence": round(m.confidence, 4),
|
||||
"reference_count": m.reference_count,
|
||||
"last_referenced_at": m.last_referenced_at,
|
||||
}
|
||||
return state
|
||||
|
||||
|
||||
def main() -> int:
|
||||
parser = argparse.ArgumentParser()
|
||||
parser.add_argument(
|
||||
"--data-dir",
|
||||
default=str(_REPO_ROOT / "data" / "validation" / "phase9-first-use"),
|
||||
help="Isolated data directory to use for this validation run",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--json",
|
||||
action="store_true",
|
||||
help="Emit machine-readable JSON instead of human prose",
|
||||
)
|
||||
args = parser.parse_args()
|
||||
|
||||
data_dir = Path(args.data_dir).resolve()
|
||||
setup_environment(data_dir)
|
||||
|
||||
from atocore.models.database import init_db
|
||||
from atocore.context.project_state import init_project_state_schema
|
||||
|
||||
init_db()
|
||||
init_project_state_schema()
|
||||
|
||||
seeded = seed_memories()
|
||||
sample_results = [run_sample(s) for s in SAMPLES]
|
||||
final_seed_state = report_seed_memory_state(seeded)
|
||||
|
||||
if args.json:
|
||||
json.dump(
|
||||
{
|
||||
"data_dir": str(data_dir),
|
||||
"seeded_memories_initial": list(seeded.keys()),
|
||||
"samples": sample_results,
|
||||
"seed_memory_state_after_run": final_seed_state,
|
||||
},
|
||||
sys.stdout,
|
||||
indent=2,
|
||||
default=str,
|
||||
)
|
||||
return 0
|
||||
|
||||
print("=" * 78)
|
||||
print("Phase 9 first-real-use validation run")
|
||||
print("=" * 78)
|
||||
print(f"Isolated data dir: {data_dir}")
|
||||
print()
|
||||
print("Seeded the memory store with 3 active memories:")
|
||||
for label, mid in seeded.items():
|
||||
print(f" - {label} ({mid[:8]})")
|
||||
print()
|
||||
print("-" * 78)
|
||||
print(f"Running {len(SAMPLES)} sample interactions ...")
|
||||
print("-" * 78)
|
||||
|
||||
for result in sample_results:
|
||||
print()
|
||||
print(f"## {result['label']} [project={result['project']}]")
|
||||
print(f" interaction_id={result['interaction_id'][:8]}")
|
||||
print(f" expected: {result['expected_notes']}")
|
||||
print(f" candidates produced: {result['candidate_count']}")
|
||||
for i, cand in enumerate(result["candidates"], 1):
|
||||
print(
|
||||
f" [{i}] type={cand['memory_type']:11s} "
|
||||
f"rule={cand['rule']:21s} "
|
||||
f"content={cand['content']!r}"
|
||||
)
|
||||
|
||||
print()
|
||||
print("-" * 78)
|
||||
print("Reinforcement state on seeded memories AFTER all interactions:")
|
||||
print("-" * 78)
|
||||
for label, state in final_seed_state.items():
|
||||
if state is None:
|
||||
print(f" {label}: <missing>")
|
||||
continue
|
||||
print(
|
||||
f" {label}: confidence={state['confidence']:.4f} "
|
||||
f"refs={state['reference_count']} "
|
||||
f"last={state['last_referenced_at'] or '-'}"
|
||||
)
|
||||
|
||||
print()
|
||||
print("=" * 78)
|
||||
print("Run complete. Data written to:", data_dir)
|
||||
print("=" * 78)
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
raise SystemExit(main())
|
||||
76 scripts/query_test.py Normal file
@@ -0,0 +1,76 @@
"""CLI script to run test prompts and compare baseline vs enriched."""

import argparse
import sys
from pathlib import Path

import yaml

# Add src to path
sys.path.insert(0, str(Path(__file__).parent.parent / "src"))

from atocore.context.builder import build_context
from atocore.models.database import init_db
from atocore.observability.logger import setup_logging


def main():
    parser = argparse.ArgumentParser(description="Run test prompts against AtoCore")
    parser.add_argument(
        "--prompts",
        default=str(Path(__file__).parent.parent / "tests" / "test_prompts" / "prompts.yaml"),
        help="Path to prompts YAML file",
    )
    args = parser.parse_args()

    setup_logging()
    init_db()

    prompts_path = Path(args.prompts)
    if not prompts_path.exists():
        print(f"Error: {prompts_path} not found")
        sys.exit(1)

    with open(prompts_path) as f:
        data = yaml.safe_load(f)

    prompts = data.get("prompts", [])
    print(f"Running {len(prompts)} test prompts...\n")

    for p in prompts:
        prompt_id = p["id"]
        prompt_text = p["prompt"]
        project = p.get("project")
        expected = p.get("expected", "")

        print(f"{'='*60}")
        print(f"[{prompt_id}] {prompt_text}")
        print(f"Project: {project or 'none'}")
        print(f"Expected: {expected}")
        print("-" * 60)

        pack = build_context(
            user_prompt=prompt_text,
            project_hint=project,
        )

        print(f"Chunks retrieved: {len(pack.chunks_used)}")
        print(f"Total chars: {pack.total_chars} / {pack.budget}")
        print(f"Duration: {pack.duration_ms}ms")
        print()

        for i, chunk in enumerate(pack.chunks_used[:5]):
            print(f" [{i+1}] Score: {chunk.score:.2f} | {chunk.source_file}")
            print(f" Section: {chunk.heading_path}")
            print(f" Preview: {chunk.content[:120]}...")
            print()

        print(f"Full prompt length: {len(pack.full_prompt)} chars")
        print()

    print(f"{'='*60}")
    print("Done. Review output above to assess retrieval quality.")


if __name__ == "__main__":
    main()
194 scripts/retrieval_eval.py Normal file
@@ -0,0 +1,194 @@
"""Retrieval quality eval harness.

Runs a fixed set of project-hinted questions against
``POST /context/build`` on a live AtoCore instance and scores the
resulting ``formatted_context`` against per-question expectations.
The goal is a diffable scorecard that tells you, run-to-run,
whether a retrieval / builder / ingestion change moved the needle.

Design notes
------------
- Fixtures live in ``scripts/retrieval_eval_fixtures.json`` so new
  questions can be added without touching Python. Each fixture
  names the project, the prompt, and a checklist of substrings that
  MUST appear in ``formatted_context`` (``expect_present``) and
  substrings that MUST NOT appear (``expect_absent``). The absent
  list catches cross-project bleed and stale content.
- The checklist is deliberately substring-based (not regex, not
  embedding-similarity) so a failure is always a trivially
  reproducible "this string is not in that string". Richer scoring
  can come later once we know the harness is useful.
- The harness is external to the app runtime and talks to AtoCore
  over HTTP, so it works against dev, staging, or prod. It follows
  the same environment-variable contract as ``atocore_client.py``
  (``ATOCORE_BASE_URL``, ``ATOCORE_TIMEOUT_SECONDS``).
- Exit code 0 on all-pass, 1 on any fixture failure. Intended for
  manual runs today; a future cron / CI hook can consume the
  JSON output via ``--json``.

Usage
-----

python scripts/retrieval_eval.py          # human-readable report
python scripts/retrieval_eval.py --json   # machine-readable
python scripts/retrieval_eval.py --fixtures path/to/custom.json
"""

from __future__ import annotations

import argparse
import json
import os
import sys
import urllib.error
import urllib.parse
import urllib.request
from dataclasses import dataclass, field
from pathlib import Path

DEFAULT_BASE_URL = os.environ.get("ATOCORE_BASE_URL", "http://dalidou:8100")
DEFAULT_TIMEOUT = int(os.environ.get("ATOCORE_TIMEOUT_SECONDS", "30"))
DEFAULT_BUDGET = 3000
DEFAULT_FIXTURES = Path(__file__).parent / "retrieval_eval_fixtures.json"


@dataclass
class Fixture:
    name: str
    project: str
    prompt: str
    budget: int = DEFAULT_BUDGET
    expect_present: list[str] = field(default_factory=list)
    expect_absent: list[str] = field(default_factory=list)
    notes: str = ""


@dataclass
class FixtureResult:
    fixture: Fixture
    ok: bool
    missing_present: list[str]
    unexpected_absent: list[str]
    total_chars: int
    error: str = ""


def load_fixtures(path: Path) -> list[Fixture]:
    data = json.loads(path.read_text(encoding="utf-8"))
    if not isinstance(data, list):
        raise ValueError(f"{path} must contain a JSON array of fixtures")
    fixtures: list[Fixture] = []
    for i, raw in enumerate(data):
        if not isinstance(raw, dict):
            raise ValueError(f"fixture {i} is not an object")
        fixtures.append(
            Fixture(
                name=raw["name"],
                project=raw.get("project", ""),
                prompt=raw["prompt"],
                budget=int(raw.get("budget", DEFAULT_BUDGET)),
                expect_present=list(raw.get("expect_present", [])),
                expect_absent=list(raw.get("expect_absent", [])),
                notes=raw.get("notes", ""),
            )
        )
    return fixtures


def run_fixture(fixture: Fixture, base_url: str, timeout: int) -> FixtureResult:
    payload = {
        "prompt": fixture.prompt,
        "project": fixture.project or None,
        "budget": fixture.budget,
    }
    req = urllib.request.Request(
        url=f"{base_url}/context/build",
        method="POST",
        headers={"Content-Type": "application/json"},
        data=json.dumps(payload).encode("utf-8"),
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            body = json.loads(resp.read().decode("utf-8"))
    except urllib.error.URLError as exc:
        return FixtureResult(
            fixture=fixture,
            ok=False,
            missing_present=list(fixture.expect_present),
            unexpected_absent=[],
            total_chars=0,
            error=f"http_error: {exc}",
        )

    formatted = body.get("formatted_context") or ""
    missing = [s for s in fixture.expect_present if s not in formatted]
    unexpected = [s for s in fixture.expect_absent if s in formatted]
    return FixtureResult(
        fixture=fixture,
        ok=not missing and not unexpected,
        missing_present=missing,
        unexpected_absent=unexpected,
        total_chars=len(formatted),
    )


def print_human_report(results: list[FixtureResult]) -> None:
    total = len(results)
    passed = sum(1 for r in results if r.ok)
    print(f"Retrieval eval: {passed}/{total} fixtures passed")
    print()
    for r in results:
        marker = "PASS" if r.ok else "FAIL"
        print(f"[{marker}] {r.fixture.name} project={r.fixture.project} chars={r.total_chars}")
        if r.error:
            print(f" error: {r.error}")
        for miss in r.missing_present:
            print(f" missing expected: {miss!r}")
        for bleed in r.unexpected_absent:
            print(f" unexpected present: {bleed!r}")
        if r.fixture.notes and not r.ok:
            print(f" notes: {r.fixture.notes}")


def print_json_report(results: list[FixtureResult]) -> None:
    payload = {
        "total": len(results),
        "passed": sum(1 for r in results if r.ok),
        "fixtures": [
            {
                "name": r.fixture.name,
                "project": r.fixture.project,
                "ok": r.ok,
                "total_chars": r.total_chars,
                "missing_present": r.missing_present,
                "unexpected_absent": r.unexpected_absent,
                "error": r.error,
            }
            for r in results
        ],
    }
    json.dump(payload, sys.stdout, indent=2)
    sys.stdout.write("\n")


def main() -> int:
    parser = argparse.ArgumentParser(description="AtoCore retrieval quality eval harness")
    parser.add_argument("--base-url", default=DEFAULT_BASE_URL)
    parser.add_argument("--timeout", type=int, default=DEFAULT_TIMEOUT)
    parser.add_argument("--fixtures", type=Path, default=DEFAULT_FIXTURES)
    parser.add_argument("--json", action="store_true", help="emit machine-readable JSON")
    args = parser.parse_args()

    fixtures = load_fixtures(args.fixtures)
    results = [run_fixture(f, args.base_url, args.timeout) for f in fixtures]

    if args.json:
        print_json_report(results)
    else:
        print_human_report(results)

    return 0 if all(r.ok for r in results) else 1


if __name__ == "__main__":
    raise SystemExit(main())
86 scripts/retrieval_eval_fixtures.json Normal file
@@ -0,0 +1,86 @@
[
  {
    "name": "p04-architecture-decision",
    "project": "p04-gigabit",
    "prompt": "what mirror architecture was selected for GigaBIT M1 and why",
    "expect_present": [
      "--- Trusted Project State ---",
      "Option B",
      "conical",
      "--- Project Memories ---"
    ],
    "expect_absent": [
      "p06-polisher",
      "folded-beam"
    ],
    "notes": "Canonical p04 decision — should surface both Trusted Project State (selected_mirror_architecture) and the project-memory band with the Option B memory"
  },
  {
    "name": "p04-constraints",
    "project": "p04-gigabit",
    "prompt": "what are the key GigaBIT M1 program constraints",
    "expect_present": [
      "--- Trusted Project State ---",
      "Zerodur",
      "1.2"
    ],
    "expect_absent": [
      "polisher suite"
    ],
    "notes": "Key constraints are in Trusted Project State (key_constraints) and in the mission-framing memory"
  },
  {
    "name": "p05-configuration",
    "project": "p05-interferometer",
    "prompt": "what is the selected interferometer configuration",
    "expect_present": [
      "folded-beam",
      "CGH"
    ],
    "expect_absent": [
      "Option B",
      "conical back",
      "polisher suite"
    ],
    "notes": "P05 architecture memory covers folded-beam + CGH. GigaBIT M1 is the mirror under test and legitimately appears in p05 source docs (the interferometer measures it), so we only flag genuinely p04-only decisions like the mirror architecture choice."
  },
  {
    "name": "p05-vendor-signal",
    "project": "p05-interferometer",
    "prompt": "what is the current vendor signal for the interferometer procurement",
    "expect_present": [
      "4D",
      "Zygo"
    ],
    "expect_absent": [
      "polisher"
    ],
    "notes": "Vendor memory mentions 4D as strongest technical candidate and Zygo Verifire SV as value path"
  },
  {
    "name": "p06-suite-split",
    "project": "p06-polisher",
    "prompt": "how is the polisher software suite split across layers",
    "expect_present": [
      "polisher-sim",
      "polisher-post",
      "polisher-control"
    ],
    "expect_absent": [
      "GigaBIT"
    ],
    "notes": "The three-layer split is in multiple p06 memories; check all three names surface together"
  },
  {
    "name": "p06-control-rule",
    "project": "p06-polisher",
    "prompt": "what is the polisher control design rule",
    "expect_present": [
      "interlocks"
    ],
    "expect_absent": [
      "interferometer"
    ],
    "notes": "Control design rule memory mentions interlocks and state transitions"
  }
]
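The substring checks these fixtures drive can be sketched in isolation. This is a minimal standalone illustration (not AtoCore code) of how `run_fixture` scores a returned `formatted_context` against `expect_present` / `expect_absent`; the sample `formatted` string below is invented for the example.

```python
# Sketch of the fixture scoring rule: every expect_present substring must
# appear in formatted_context, every expect_absent substring must not.
def score(formatted: str, expect_present: list[str], expect_absent: list[str]) -> dict:
    missing = [s for s in expect_present if s not in formatted]        # expected but absent
    unexpected = [s for s in expect_absent if s in formatted]          # cross-project bleed
    return {"ok": not missing and not unexpected,
            "missing": missing, "unexpected": unexpected}

# Invented context: both expected substrings present, no bleed.
result = score(
    "--- Trusted Project State ---\nselected_mirror_architecture: Option B (conical back)",
    expect_present=["Option B", "conical"],
    expect_absent=["folded-beam"],
)
print(result["ok"])  # True
```

A failure is therefore always reproducible by hand: grep the saved `formatted_context` for the offending substring.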
15 src/atocore/__init__.py Normal file
@@ -0,0 +1,15 @@
"""AtoCore — Personal Context Engine."""

# Bumped when a commit meaningfully changes the API surface, schema, or
# user-visible behavior. The /health endpoint reports this value so
# deployment drift is immediately visible: if the running service's
# /health reports an older version than the main branch's __version__,
# the deployment is stale and needs a redeploy (see
# docs/dalidou-deployment.md and deploy/dalidou/deploy.sh).
#
# History:
# 0.1.0 Phase 0/0.5/1/2/3/5/7 baseline
# 0.2.0 Phase 9 reflection loop (capture/reinforce/extract + review
#       queue), shared client v0.2.0, project identity
#       canonicalization at every service-layer entry point
__version__ = "0.2.0"
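The drift check described in the comment above can be reduced to a version comparison. This is a hypothetical helper, not part of AtoCore: it assumes `/health` reports a plain `MAJOR.MINOR.PATCH` string and that the expected value is the `__version__` on main.

```python
# Hypothetical deployment-drift check: a deployed version older than the
# expected __version__ means the service needs a redeploy.
def is_stale(deployed: str, expected: str) -> bool:
    def parse(v: str) -> tuple[int, ...]:
        # Assumes simple dotted numeric versions like "0.2.0".
        return tuple(int(part) for part in v.split("."))
    return parse(deployed) < parse(expected)

print(is_stale("0.1.0", "0.2.0"))  # True -> running service is stale
print(is_stale("0.2.0", "0.2.0"))  # False -> deployment matches main
```

Tuple comparison makes this correct for multi-digit components (`0.10.0` > `0.9.0`), which naive string comparison would get wrong.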
0 src/atocore/api/__init__.py Normal file
845 src/atocore/api/routes.py Normal file
@@ -0,0 +1,845 @@
"""FastAPI route definitions."""

from pathlib import Path

from fastapi import APIRouter, HTTPException
from pydantic import BaseModel

import atocore.config as _config
from atocore.context.builder import (
    build_context,
    get_last_context_pack,
    _pack_to_dict,
)
from atocore.context.project_state import (
    CATEGORIES,
    get_state,
    invalidate_state,
    set_state,
)
from atocore.ingestion.pipeline import (
    exclusive_ingestion,
    get_ingestion_stats,
    get_source_status,
    ingest_configured_sources,
    ingest_file,
    ingest_folder,
)
from atocore.interactions.service import (
    get_interaction,
    list_interactions,
    record_interaction,
)
from atocore.memory.extractor import (
    EXTRACTOR_VERSION,
    MemoryCandidate,
    extract_candidates_from_interaction,
)
from atocore.memory.reinforcement import reinforce_from_interaction
from atocore.memory.service import (
    MEMORY_STATUSES,
    MEMORY_TYPES,
    create_memory,
    get_memories,
    invalidate_memory,
    promote_memory,
    reject_candidate_memory,
    supersede_memory,
    update_memory,
)
from atocore.observability.logger import get_logger
from atocore.ops.backup import (
    cleanup_old_backups,
    create_runtime_backup,
    list_runtime_backups,
    validate_backup,
)
from atocore.projects.registry import (
    build_project_registration_proposal,
    get_project_registry_template,
    list_registered_projects,
    register_project,
    refresh_registered_project,
    update_project,
)
from atocore.retrieval.retriever import retrieve
from atocore.retrieval.vector_store import get_vector_store

router = APIRouter()
log = get_logger("api")


# --- Request/Response models ---


class IngestRequest(BaseModel):
    path: str


class IngestResponse(BaseModel):
    results: list[dict]


class IngestSourcesResponse(BaseModel):
    results: list[dict]


class ProjectRefreshResponse(BaseModel):
    project: str
    aliases: list[str]
    description: str
    purge_deleted: bool
    status: str
    roots_ingested: int
    roots_skipped: int
    roots: list[dict]


class ProjectRegistrationProposalRequest(BaseModel):
    project_id: str
    aliases: list[str] = []
    description: str = ""
    ingest_roots: list[dict]


class ProjectUpdateRequest(BaseModel):
    aliases: list[str] | None = None
    description: str | None = None
    ingest_roots: list[dict] | None = None


class QueryRequest(BaseModel):
    prompt: str
    top_k: int = 10
    filter_tags: list[str] | None = None
    project: str | None = None


class QueryResponse(BaseModel):
    results: list[dict]


class ContextBuildRequest(BaseModel):
    prompt: str
    project: str | None = None
    budget: int | None = None


class ContextBuildResponse(BaseModel):
    formatted_context: str
    full_prompt: str
    chunks_used: int
    total_chars: int
    budget: int
    budget_remaining: int
    duration_ms: int
    chunks: list[dict]


class MemoryCreateRequest(BaseModel):
    memory_type: str
    content: str
    project: str = ""
    confidence: float = 1.0


class MemoryUpdateRequest(BaseModel):
    content: str | None = None
    confidence: float | None = None
    status: str | None = None


class ProjectStateSetRequest(BaseModel):
    project: str
    category: str
    key: str
    value: str
    source: str = ""
    confidence: float = 1.0


class ProjectStateGetRequest(BaseModel):
    project: str
    category: str | None = None


class ProjectStateInvalidateRequest(BaseModel):
    project: str
    category: str
    key: str


# --- Endpoints ---

@router.post("/ingest", response_model=IngestResponse)
def api_ingest(req: IngestRequest) -> IngestResponse:
    """Ingest a markdown file or folder."""
    target = Path(req.path)
    try:
        with exclusive_ingestion():
            if target.is_file():
                results = [ingest_file(target)]
            elif target.is_dir():
                results = ingest_folder(target)
            else:
                raise HTTPException(status_code=404, detail=f"Path not found: {req.path}")
    except HTTPException:
        raise
    except Exception as e:
        log.error("ingest_failed", path=req.path, error=str(e))
        raise HTTPException(status_code=500, detail=f"Ingestion failed: {e}")
    return IngestResponse(results=results)


@router.post("/ingest/sources", response_model=IngestSourcesResponse)
def api_ingest_sources() -> IngestSourcesResponse:
    """Ingest enabled configured source directories."""
    try:
        with exclusive_ingestion():
            results = ingest_configured_sources()
    except Exception as e:
        log.error("ingest_sources_failed", error=str(e))
        raise HTTPException(status_code=500, detail=f"Configured source ingestion failed: {e}")
    return IngestSourcesResponse(results=results)


@router.get("/projects")
def api_projects() -> dict:
    """Return registered projects and their resolved ingest roots."""
    return {
        "projects": list_registered_projects(),
        "registry_path": str(_config.settings.resolved_project_registry_path),
    }


@router.get("/projects/template")
def api_projects_template() -> dict:
    """Return a starter template for project registry entries."""
    return {
        "template": get_project_registry_template(),
        "registry_path": str(_config.settings.resolved_project_registry_path),
        "allowed_sources": ["vault", "drive"],
    }


@router.post("/projects/proposal")
def api_project_registration_proposal(req: ProjectRegistrationProposalRequest) -> dict:
    """Return a normalized project registration proposal without writing it."""
    try:
        return build_project_registration_proposal(
            project_id=req.project_id,
            aliases=req.aliases,
            description=req.description,
            ingest_roots=req.ingest_roots,
        )
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))


@router.post("/projects/register")
def api_project_registration(req: ProjectRegistrationProposalRequest) -> dict:
    """Persist a validated project registration to the registry file."""
    try:
        return register_project(
            project_id=req.project_id,
            aliases=req.aliases,
            description=req.description,
            ingest_roots=req.ingest_roots,
        )
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))


@router.put("/projects/{project_name}")
def api_project_update(project_name: str, req: ProjectUpdateRequest) -> dict:
    """Update an existing project registration."""
    try:
        return update_project(
            project_name=project_name,
            aliases=req.aliases,
            description=req.description,
            ingest_roots=req.ingest_roots,
        )
    except ValueError as e:
        detail = str(e)
        if detail.startswith("Unknown project"):
            raise HTTPException(status_code=404, detail=detail)
        raise HTTPException(status_code=400, detail=detail)


@router.post("/projects/{project_name}/refresh", response_model=ProjectRefreshResponse)
def api_refresh_project(project_name: str, purge_deleted: bool = False) -> ProjectRefreshResponse:
    """Refresh one registered project from its configured ingest roots."""
    try:
        with exclusive_ingestion():
            result = refresh_registered_project(project_name, purge_deleted=purge_deleted)
    except ValueError as e:
        raise HTTPException(status_code=404, detail=str(e))
    except Exception as e:
        log.error("project_refresh_failed", project=project_name, error=str(e))
        raise HTTPException(status_code=500, detail=f"Project refresh failed: {e}")
    return ProjectRefreshResponse(**result)


@router.post("/query", response_model=QueryResponse)
def api_query(req: QueryRequest) -> QueryResponse:
    """Retrieve relevant chunks for a prompt."""
    try:
        chunks = retrieve(
            req.prompt,
            top_k=req.top_k,
            filter_tags=req.filter_tags,
            project_hint=req.project,
        )
    except Exception as e:
        log.error("query_failed", prompt=req.prompt[:100], error=str(e))
        raise HTTPException(status_code=500, detail=f"Query failed: {e}")
    return QueryResponse(
        results=[
            {
                "chunk_id": c.chunk_id,
                "content": c.content,
                "score": c.score,
                "heading_path": c.heading_path,
                "source_file": c.source_file,
                "title": c.title,
            }
            for c in chunks
        ]
    )

@router.post("/context/build", response_model=ContextBuildResponse)
def api_build_context(req: ContextBuildRequest) -> ContextBuildResponse:
    """Build a full context pack for a prompt."""
    try:
        pack = build_context(
            user_prompt=req.prompt,
            project_hint=req.project,
            budget=req.budget,
        )
    except Exception as e:
        log.error("context_build_failed", prompt=req.prompt[:100], error=str(e))
        raise HTTPException(status_code=500, detail=f"Context build failed: {e}")
    pack_dict = _pack_to_dict(pack)
    return ContextBuildResponse(
        formatted_context=pack.formatted_context,
        full_prompt=pack.full_prompt,
        chunks_used=len(pack.chunks_used),
        total_chars=pack.total_chars,
        budget=pack.budget,
        budget_remaining=pack.budget_remaining,
        duration_ms=pack.duration_ms,
        chunks=pack_dict["chunks"],
    )


@router.post("/memory")
def api_create_memory(req: MemoryCreateRequest) -> dict:
    """Create a new memory entry."""
    try:
        mem = create_memory(
            memory_type=req.memory_type,
            content=req.content,
            project=req.project,
            confidence=req.confidence,
        )
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    return {"status": "ok", "id": mem.id, "memory_type": mem.memory_type}


@router.get("/memory")
def api_get_memories(
    memory_type: str | None = None,
    project: str | None = None,
    active_only: bool = True,
    min_confidence: float = 0.0,
    limit: int = 50,
    status: str | None = None,
) -> dict:
    """List memories, optionally filtered.

    When ``status`` is given explicitly it overrides ``active_only`` so
    the Phase 9 Commit C review queue can be listed via
    ``GET /memory?status=candidate``.
    """
    try:
        memories = get_memories(
            memory_type=memory_type,
            project=project,
            active_only=active_only,
            min_confidence=min_confidence,
            limit=limit,
            status=status,
        )
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    return {
        "memories": [
            {
                "id": m.id,
                "memory_type": m.memory_type,
                "content": m.content,
                "project": m.project,
                "confidence": m.confidence,
                "status": m.status,
                "reference_count": m.reference_count,
                "last_referenced_at": m.last_referenced_at,
                "updated_at": m.updated_at,
            }
            for m in memories
        ],
        "types": MEMORY_TYPES,
        "statuses": MEMORY_STATUSES,
    }


@router.put("/memory/{memory_id}")
def api_update_memory(memory_id: str, req: MemoryUpdateRequest) -> dict:
    """Update an existing memory."""
    try:
        success = update_memory(
            memory_id=memory_id,
            content=req.content,
            confidence=req.confidence,
            status=req.status,
        )
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    if not success:
        raise HTTPException(status_code=404, detail="Memory not found")
    return {"status": "updated", "id": memory_id}


@router.delete("/memory/{memory_id}")
def api_invalidate_memory(memory_id: str) -> dict:
    """Invalidate a memory (error correction)."""
    success = invalidate_memory(memory_id)
    if not success:
        raise HTTPException(status_code=404, detail="Memory not found")
    return {"status": "invalidated", "id": memory_id}


@router.post("/memory/{memory_id}/promote")
def api_promote_memory(memory_id: str) -> dict:
    """Promote a candidate memory to active (Phase 9 Commit C)."""
    try:
        success = promote_memory(memory_id)
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    if not success:
        raise HTTPException(
            status_code=404,
            detail=f"Memory not found or not a candidate: {memory_id}",
        )
    return {"status": "promoted", "id": memory_id}


@router.post("/memory/{memory_id}/reject")
def api_reject_candidate_memory(memory_id: str) -> dict:
    """Reject a candidate memory (Phase 9 Commit C review queue)."""
    success = reject_candidate_memory(memory_id)
    if not success:
        raise HTTPException(
            status_code=404,
            detail=f"Memory not found or not a candidate: {memory_id}",
        )
    return {"status": "rejected", "id": memory_id}

@router.post("/project/state")
def api_set_project_state(req: ProjectStateSetRequest) -> dict:
    """Set or update a trusted project state entry."""
    try:
        entry = set_state(
            project_name=req.project,
            category=req.category,
            key=req.key,
            value=req.value,
            source=req.source,
            confidence=req.confidence,
        )
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    except Exception as e:
        log.error("set_state_failed", error=str(e))
        raise HTTPException(status_code=500, detail=f"Failed to set state: {e}")
    return {"status": "ok", "id": entry.id, "category": entry.category, "key": entry.key}


@router.get("/project/state/{project_name}")
def api_get_project_state(project_name: str, category: str | None = None) -> dict:
    """Get trusted project state entries."""
    entries = get_state(project_name, category=category)
    return {
        "project": project_name,
        "entries": [
            {
                "id": e.id,
                "category": e.category,
                "key": e.key,
                "value": e.value,
                "source": e.source,
                "confidence": e.confidence,
                "status": e.status,
                "updated_at": e.updated_at,
            }
            for e in entries
        ],
        "categories": CATEGORIES,
    }


@router.delete("/project/state")
def api_invalidate_project_state(req: ProjectStateInvalidateRequest) -> dict:
    """Invalidate (supersede) a project state entry."""
    success = invalidate_state(req.project, req.category, req.key)
    if not success:
        raise HTTPException(status_code=404, detail="State entry not found or already invalidated")
    return {"status": "invalidated", "project": req.project, "category": req.category, "key": req.key}


class InteractionRecordRequest(BaseModel):
    prompt: str
    response: str = ""
    response_summary: str = ""
    project: str = ""
    client: str = ""
    session_id: str = ""
    memories_used: list[str] = []
    chunks_used: list[str] = []
    context_pack: dict | None = None
    reinforce: bool = True
    extract: bool = False


@router.post("/interactions")
def api_record_interaction(req: InteractionRecordRequest) -> dict:
    """Capture one interaction (prompt + response + what was used).

    This is the foundation of the AtoCore reflection loop. It records
    what the system fed to an LLM and what came back. If ``reinforce``
    is true (default) and there is response content, the Phase 9
    Commit B reinforcement pass runs automatically, bumping the
    confidence of any active memory echoed in the response. Nothing is
    ever promoted into trusted state automatically.
    """
    try:
        interaction = record_interaction(
            prompt=req.prompt,
            response=req.response,
            response_summary=req.response_summary,
            project=req.project,
            client=req.client,
            session_id=req.session_id,
            memories_used=req.memories_used,
            chunks_used=req.chunks_used,
            context_pack=req.context_pack,
            reinforce=req.reinforce,
            extract=req.extract,
        )
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    return {
        "status": "recorded",
        "id": interaction.id,
        "created_at": interaction.created_at,
    }


@router.post("/interactions/{interaction_id}/reinforce")
def api_reinforce_interaction(interaction_id: str) -> dict:
    """Run the reinforcement pass on an already-captured interaction.

    Useful for backfilling reinforcement over historical interactions,
    or for retrying after a transient failure in the automatic pass
    that runs inside ``POST /interactions``.
    """
    interaction = get_interaction(interaction_id)
    if interaction is None:
        raise HTTPException(status_code=404, detail=f"Interaction not found: {interaction_id}")
    results = reinforce_from_interaction(interaction)
    return {
        "interaction_id": interaction_id,
        "reinforced_count": len(results),
        "reinforced": [
            {
                "memory_id": r.memory_id,
                "memory_type": r.memory_type,
                "old_confidence": round(r.old_confidence, 4),
                "new_confidence": round(r.new_confidence, 4),
            }
            for r in results
        ],
    }


class InteractionExtractRequest(BaseModel):
    persist: bool = False


@router.post("/interactions/{interaction_id}/extract")
def api_extract_from_interaction(
    interaction_id: str,
    req: InteractionExtractRequest | None = None,
) -> dict:
    """Extract candidate memories from a captured interaction.

    Phase 9 Commit C. The extractor is rule-based and deliberately
    conservative — it only surfaces candidates that matched an explicit
    structural cue (decision heading, preference sentence, etc.). By
    default the candidates are returned *without* being persisted so a
    caller can preview them before committing to a review queue. Pass
    ``persist: true`` to immediately create candidate memories for
    each extraction result.
    """
    interaction = get_interaction(interaction_id)
    if interaction is None:
        raise HTTPException(status_code=404, detail=f"Interaction not found: {interaction_id}")
    payload = req or InteractionExtractRequest()
    candidates: list[MemoryCandidate] = extract_candidates_from_interaction(interaction)
persisted_ids: list[str] = []
|
||||
if payload.persist:
|
||||
for candidate in candidates:
|
||||
try:
|
||||
mem = create_memory(
|
||||
memory_type=candidate.memory_type,
|
||||
content=candidate.content,
|
||||
project=candidate.project,
|
||||
confidence=candidate.confidence,
|
||||
status="candidate",
|
||||
)
|
||||
persisted_ids.append(mem.id)
|
||||
except ValueError as e:
|
||||
log.error(
|
||||
"extract_persist_failed",
|
||||
interaction_id=interaction_id,
|
||||
rule=candidate.rule,
|
||||
error=str(e),
|
||||
)
|
||||
|
||||
return {
|
||||
"interaction_id": interaction_id,
|
||||
"candidate_count": len(candidates),
|
||||
"persisted": payload.persist,
|
||||
"persisted_ids": persisted_ids,
|
||||
"extractor_version": EXTRACTOR_VERSION,
|
||||
"candidates": [
|
||||
{
|
||||
"memory_type": c.memory_type,
|
||||
"content": c.content,
|
||||
"project": c.project,
|
||||
"confidence": c.confidence,
|
||||
"rule": c.rule,
|
||||
"source_span": c.source_span,
|
||||
"extractor_version": c.extractor_version,
|
||||
}
|
||||
for c in candidates
|
||||
],
|
||||
}
|
||||
|
||||
|
||||
@router.get("/interactions")
|
||||
def api_list_interactions(
|
||||
project: str | None = None,
|
||||
session_id: str | None = None,
|
||||
client: str | None = None,
|
||||
since: str | None = None,
|
||||
limit: int = 50,
|
||||
) -> dict:
|
||||
"""List captured interactions, optionally filtered by project, session,
|
||||
client, or creation time. Hard-capped at 500 entries per call."""
|
||||
interactions = list_interactions(
|
||||
project=project,
|
||||
session_id=session_id,
|
||||
client=client,
|
||||
since=since,
|
||||
limit=limit,
|
||||
)
|
||||
return {
|
||||
"count": len(interactions),
|
||||
"interactions": [
|
||||
{
|
||||
"id": i.id,
|
||||
"prompt": i.prompt,
|
||||
"response_summary": i.response_summary,
|
||||
"response_chars": len(i.response),
|
||||
"project": i.project,
|
||||
"client": i.client,
|
||||
"session_id": i.session_id,
|
||||
"memories_used": i.memories_used,
|
||||
"chunks_used": i.chunks_used,
|
||||
"created_at": i.created_at,
|
||||
}
|
||||
for i in interactions
|
||||
],
|
||||
}
|
||||
|
||||
|
||||
@router.get("/interactions/{interaction_id}")
|
||||
def api_get_interaction(interaction_id: str) -> dict:
|
||||
"""Fetch a single interaction with the full response and context pack."""
|
||||
interaction = get_interaction(interaction_id)
|
||||
if interaction is None:
|
||||
raise HTTPException(status_code=404, detail=f"Interaction not found: {interaction_id}")
|
||||
return {
|
||||
"id": interaction.id,
|
||||
"prompt": interaction.prompt,
|
||||
"response": interaction.response,
|
||||
"response_summary": interaction.response_summary,
|
||||
"project": interaction.project,
|
||||
"client": interaction.client,
|
||||
"session_id": interaction.session_id,
|
||||
"memories_used": interaction.memories_used,
|
||||
"chunks_used": interaction.chunks_used,
|
||||
"context_pack": interaction.context_pack,
|
||||
"created_at": interaction.created_at,
|
||||
}
|
||||
|
||||
|
||||
class BackupCreateRequest(BaseModel):
|
||||
include_chroma: bool = False
|
||||
|
||||
|
||||
@router.post("/admin/backup")
|
||||
def api_create_backup(req: BackupCreateRequest | None = None) -> dict:
|
||||
"""Create a runtime backup snapshot.
|
||||
|
||||
When ``include_chroma`` is true the call holds the ingestion lock so a
|
||||
safe cold copy of the vector store can be taken without racing against
|
||||
refresh or ingest endpoints.
|
||||
"""
|
||||
payload = req or BackupCreateRequest()
|
||||
try:
|
||||
if payload.include_chroma:
|
||||
with exclusive_ingestion():
|
||||
metadata = create_runtime_backup(include_chroma=True)
|
||||
else:
|
||||
metadata = create_runtime_backup(include_chroma=False)
|
||||
except Exception as e:
|
||||
log.error("admin_backup_failed", error=str(e))
|
||||
raise HTTPException(status_code=500, detail=f"Backup failed: {e}")
|
||||
return metadata
|
||||
|
||||
|
||||
@router.get("/admin/backup")
|
||||
def api_list_backups() -> dict:
|
||||
"""List all runtime backups under the configured backup directory."""
|
||||
return {
|
||||
"backup_dir": str(_config.settings.resolved_backup_dir),
|
||||
"backups": list_runtime_backups(),
|
||||
}
|
||||
|
||||
|
||||
class BackupCleanupRequest(BaseModel):
|
||||
confirm: bool = False
|
||||
|
||||
|
||||
@router.post("/admin/backup/cleanup")
|
||||
def api_cleanup_backups(req: BackupCleanupRequest | None = None) -> dict:
|
||||
"""Apply retention policy to old backup snapshots.
|
||||
|
||||
Dry-run by default. Pass ``confirm: true`` to actually delete.
|
||||
Retention: last 7 daily, last 4 weekly (Sundays), last 6 monthly (1st).
|
||||
"""
|
||||
payload = req or BackupCleanupRequest()
|
||||
try:
|
||||
return cleanup_old_backups(confirm=payload.confirm)
|
||||
except Exception as e:
|
||||
log.error("admin_cleanup_failed", error=str(e))
|
||||
raise HTTPException(status_code=500, detail=f"Cleanup failed: {e}")
|
||||
|
||||
|
||||
@router.get("/admin/backup/{stamp}/validate")
|
||||
def api_validate_backup(stamp: str) -> dict:
|
||||
"""Validate that a previously created backup is structurally usable."""
|
||||
result = validate_backup(stamp)
|
||||
if not result.get("exists", False):
|
||||
raise HTTPException(status_code=404, detail=f"Backup not found: {stamp}")
|
||||
return result
|
||||
|
||||
|
||||
@router.get("/health")
|
||||
def api_health() -> dict:
|
||||
"""Health check.
|
||||
|
||||
Three layers of version reporting, in increasing precision:
|
||||
|
||||
- ``version`` / ``code_version``: ``atocore.__version__`` (e.g.
|
||||
"0.2.0"). Bumped manually on commits that change the API
|
||||
surface, schema, or user-visible behavior. Coarse — any
|
||||
number of commits can land between bumps without changing
|
||||
this value.
|
||||
- ``build_sha``: full git SHA of the commit the running
|
||||
container was built from. Set by ``deploy/dalidou/deploy.sh``
|
||||
via the ``ATOCORE_BUILD_SHA`` env var on every rebuild.
|
||||
Reports ``"unknown"`` for builds that bypass deploy.sh
|
||||
(direct ``docker compose up`` etc.). This is the precise
|
||||
drift signal: if the live ``build_sha`` doesn't match the
|
||||
tip of the deployed branch on Gitea, the service is stale
|
||||
regardless of what ``code_version`` says.
|
||||
- ``build_time`` / ``build_branch``: when and from which branch
|
||||
the live container was built. Useful for forensics when
|
||||
multiple branches are in flight or when build_sha is
|
||||
ambiguous (e.g. a force-push to the same SHA).
|
||||
|
||||
The deploy.sh post-deploy verification step compares the live
|
||||
``build_sha`` to the SHA it just set, and exits non-zero on
|
||||
mismatch.
|
||||
"""
|
||||
import os
|
||||
|
||||
from atocore import __version__
|
||||
|
||||
store = get_vector_store()
|
||||
source_status = get_source_status()
|
||||
return {
|
||||
"status": "ok",
|
||||
"version": __version__,
|
||||
"code_version": __version__,
|
||||
"build_sha": os.environ.get("ATOCORE_BUILD_SHA", "unknown"),
|
||||
"build_time": os.environ.get("ATOCORE_BUILD_TIME", "unknown"),
|
||||
"build_branch": os.environ.get("ATOCORE_BUILD_BRANCH", "unknown"),
|
||||
"vectors_count": store.count,
|
||||
"env": _config.settings.env,
|
||||
"machine_paths": {
|
||||
"db_path": str(_config.settings.db_path),
|
||||
"chroma_path": str(_config.settings.chroma_path),
|
||||
"log_dir": str(_config.settings.resolved_log_dir),
|
||||
"backup_dir": str(_config.settings.resolved_backup_dir),
|
||||
"run_dir": str(_config.settings.resolved_run_dir),
|
||||
},
|
||||
"sources_ready": all(
|
||||
(not source["enabled"]) or (source["exists"] and source["is_dir"])
|
||||
for source in source_status
|
||||
),
|
||||
"source_status": source_status,
|
||||
}
|
||||
|
||||
|
||||
@router.get("/sources")
|
||||
def api_sources() -> dict:
|
||||
"""Return configured ingestion source directories and readiness."""
|
||||
return {
|
||||
"sources": get_source_status(),
|
||||
"vault_enabled": _config.settings.source_vault_enabled,
|
||||
"drive_enabled": _config.settings.source_drive_enabled,
|
||||
}
|
||||
|
||||
|
||||
@router.get("/stats")
|
||||
def api_stats() -> dict:
|
||||
"""Ingestion statistics."""
|
||||
return get_ingestion_stats()
|
||||
|
||||
|
||||
@router.get("/debug/context")
|
||||
def api_debug_context() -> dict:
|
||||
"""Inspect the last assembled context pack."""
|
||||
pack = get_last_context_pack()
|
||||
if pack is None:
|
||||
return {"message": "No context pack built yet."}
|
||||
return _pack_to_dict(pack)
|
||||
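The cleanup endpoint's retention rule ("last 7 daily, last 4 weekly (Sundays), last 6 monthly (1st)") reduces to pure date arithmetic. The sketch below is illustrative only — `stamps_to_keep` and the `YYYY-MM-DD` stamp format are assumptions, not the actual `cleanup_old_backups` implementation:

```python
from datetime import date


def stamps_to_keep(stamps: list[str]) -> set[str]:
    """Hypothetical selector: which YYYY-MM-DD backup stamps survive
    'last 7 daily, last 4 weekly (Sundays), last 6 monthly (1st)'."""
    newest_first = sorted(stamps, reverse=True)
    keep = set(newest_first[:7])  # last 7 daily snapshots
    sundays = [s for s in newest_first if date.fromisoformat(s).weekday() == 6]
    keep.update(sundays[:4])      # last 4 weekly (Sunday) snapshots
    firsts = [s for s in newest_first if date.fromisoformat(s).day == 1]
    keep.update(firsts[:6])       # last 6 monthly (1st-of-month) snapshots
    return keep
```

Under this reading, everything outside the returned set would be eligible for deletion once `confirm: true` is passed.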
157 src/atocore/config.py Normal file
@@ -0,0 +1,157 @@
"""AtoCore configuration via environment variables."""

from pathlib import Path

from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    env: str = "development"
    debug: bool = False
    log_level: str = "INFO"
    data_dir: Path = Path("./data")
    db_dir: Path | None = None
    chroma_dir: Path | None = None
    cache_dir: Path | None = None
    tmp_dir: Path | None = None
    vault_source_dir: Path = Path("./sources/vault")
    drive_source_dir: Path = Path("./sources/drive")
    source_vault_enabled: bool = True
    source_drive_enabled: bool = True
    log_dir: Path = Path("./logs")
    backup_dir: Path = Path("./backups")
    run_dir: Path = Path("./run")
    project_registry_path: Path = Path("./config/project-registry.json")
    host: str = "127.0.0.1"
    port: int = 8100
    db_busy_timeout_ms: int = 5000

    # Embedding
    embedding_model: str = (
        "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
    )

    # Chunking
    chunk_max_size: int = 800
    chunk_overlap: int = 100
    chunk_min_size: int = 50

    # Context
    context_budget: int = 3000
    context_top_k: int = 15

    # Retrieval ranking weights (tunable per environment).
    # All multipliers default to the values used since Wave 1; tighten or
    # loosen them via ATOCORE_* env vars without touching code.
    rank_project_match_boost: float = 2.0
    rank_query_token_step: float = 0.08
    rank_query_token_cap: float = 1.32
    rank_path_high_signal_boost: float = 1.18
    rank_path_low_signal_penalty: float = 0.72

    model_config = {"env_prefix": "ATOCORE_"}

    @property
    def db_path(self) -> Path:
        legacy_path = self.resolved_data_dir / "atocore.db"
        if self.db_dir is not None:
            return self.resolved_db_dir / "atocore.db"
        if legacy_path.exists():
            return legacy_path
        return self.resolved_db_dir / "atocore.db"

    @property
    def chroma_path(self) -> Path:
        return self._resolve_path(self.chroma_dir or (self.resolved_data_dir / "chroma"))

    @property
    def cache_path(self) -> Path:
        return self._resolve_path(self.cache_dir or (self.resolved_data_dir / "cache"))

    @property
    def tmp_path(self) -> Path:
        return self._resolve_path(self.tmp_dir or (self.resolved_data_dir / "tmp"))

    @property
    def resolved_data_dir(self) -> Path:
        return self._resolve_path(self.data_dir)

    @property
    def resolved_db_dir(self) -> Path:
        return self._resolve_path(self.db_dir or (self.resolved_data_dir / "db"))

    @property
    def resolved_vault_source_dir(self) -> Path:
        return self._resolve_path(self.vault_source_dir)

    @property
    def resolved_drive_source_dir(self) -> Path:
        return self._resolve_path(self.drive_source_dir)

    @property
    def resolved_log_dir(self) -> Path:
        return self._resolve_path(self.log_dir)

    @property
    def resolved_backup_dir(self) -> Path:
        return self._resolve_path(self.backup_dir)

    @property
    def resolved_run_dir(self) -> Path:
        if self.run_dir == Path("./run"):
            return self._resolve_path(self.resolved_data_dir.parent / "run")
        return self._resolve_path(self.run_dir)

    @property
    def resolved_project_registry_path(self) -> Path:
        return self._resolve_path(self.project_registry_path)

    @property
    def machine_dirs(self) -> list[Path]:
        return [
            self.db_path.parent,
            self.chroma_path,
            self.cache_path,
            self.tmp_path,
            self.resolved_log_dir,
            self.resolved_backup_dir,
            self.resolved_run_dir,
            self.resolved_project_registry_path.parent,
        ]

    @property
    def source_specs(self) -> list[dict[str, object]]:
        return [
            {
                "name": "vault",
                "enabled": self.source_vault_enabled,
                "path": self.resolved_vault_source_dir,
                "read_only": True,
            },
            {
                "name": "drive",
                "enabled": self.source_drive_enabled,
                "path": self.resolved_drive_source_dir,
                "read_only": True,
            },
        ]

    @property
    def source_dirs(self) -> list[Path]:
        return [spec["path"] for spec in self.source_specs if spec["enabled"]]

    def _resolve_path(self, path: Path) -> Path:
        return path.expanduser().resolve(strict=False)


settings = Settings()


def ensure_runtime_dirs() -> None:
    """Create writable runtime directories for machine state and logs.

    Source directories are intentionally excluded because they are treated as
    read-only ingestion inputs by convention.
    """
    for directory in settings.machine_dirs:
        directory.mkdir(parents=True, exist_ok=True)
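The `model_config = {"env_prefix": "ATOCORE_"}` line means any field can be overridden by an `ATOCORE_`-prefixed environment variable. A minimal sketch of that lookup rule, using a hypothetical `load_setting` helper rather than pydantic-settings itself:

```python
import os
from pathlib import Path


def load_setting(name: str, default: str, prefix: str = "ATOCORE_") -> str:
    # An env var (e.g. ATOCORE_PORT) wins over the field default,
    # mirroring pydantic-settings' env_prefix behavior.
    return os.environ.get(prefix + name.upper(), default)


os.environ["ATOCORE_PORT"] = "9200"
port = int(load_setting("port", "8100"))       # overridden by the env var
host = load_setting("host", "127.0.0.1")       # no env var set, default wins
# Paths resolve the same way Settings._resolve_path does:
data_dir = Path(load_setting("data_dir", "./data")).expanduser().resolve(strict=False)
```

This is how a deployment can retune, say, the ranking multipliers (`ATOCORE_RANK_PROJECT_MATCH_BOOST=1.5`) without touching code.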
0 src/atocore/context/__init__.py Normal file
424 src/atocore/context/builder.py Normal file
@@ -0,0 +1,424 @@
"""Context pack assembly: retrieve, rank, budget, format.

Trust precedence (per Master Plan):
1. Trusted Project State → always included first, highest authority
2. Identity + Preference memories → included next
3. Retrieved chunks → ranked, deduplicated, budget-constrained
"""

import time
from dataclasses import dataclass, field
from pathlib import Path

import atocore.config as _config
from atocore.context.project_state import format_project_state, get_state
from atocore.memory.service import get_memories_for_context
from atocore.observability.logger import get_logger
from atocore.projects.registry import resolve_project_name
from atocore.retrieval.retriever import ChunkResult, retrieve

log = get_logger("context_builder")

SYSTEM_PREFIX = (
    "You have access to the following personal context from the user's knowledge base.\n"
    "Use it to inform your answer. If the context is not relevant, ignore it.\n"
    "Do not mention the context system unless asked.\n"
    "When project state is provided, treat it as the most authoritative source."
)

# Budget allocation (per Master Plan section 9):
# identity: 5%, preferences: 5%, project state: 20%, retrieval: 60%+
PROJECT_STATE_BUDGET_RATIO = 0.20
MEMORY_BUDGET_RATIO = 0.10  # 5% identity + 5% preference
# Project-scoped memories (project/knowledge/episodic) are the outlet
# for the Phase 9 reflection loop on the retrieval side. Budget sits
# between identity/preference and retrieved chunks so a reinforced
# memory can actually reach the model.
PROJECT_MEMORY_BUDGET_RATIO = 0.25
PROJECT_MEMORY_TYPES = ["project", "knowledge", "episodic"]

# Last built context pack for debug inspection
_last_context_pack: "ContextPack | None" = None


@dataclass
class ContextChunk:
    content: str
    source_file: str
    heading_path: str
    score: float
    char_count: int


@dataclass
class ContextPack:
    chunks_used: list[ContextChunk] = field(default_factory=list)
    project_state_text: str = ""
    project_state_chars: int = 0
    memory_text: str = ""
    memory_chars: int = 0
    project_memory_text: str = ""
    project_memory_chars: int = 0
    total_chars: int = 0
    budget: int = 0
    budget_remaining: int = 0
    formatted_context: str = ""
    full_prompt: str = ""
    query: str = ""
    project_hint: str = ""
    duration_ms: int = 0


def build_context(
    user_prompt: str,
    project_hint: str | None = None,
    budget: int | None = None,
) -> ContextPack:
    """Build a context pack for a user prompt.

    Trust precedence applied:
    1. Project state is injected first (highest trust)
    2. Identity + preference memories (second trust level)
    3. Retrieved chunks fill the remaining budget
    """
    global _last_context_pack
    start = time.time()
    budget = _config.settings.context_budget if budget is None else max(budget, 0)

    # 1. Get Trusted Project State (highest precedence)
    project_state_text = ""
    project_state_chars = 0
    project_state_budget = min(
        budget,
        max(0, int(budget * PROJECT_STATE_BUDGET_RATIO)),
    )

    # Canonicalize the project hint through the registry so callers
    # can pass an alias (`p05`, `gigabit`) and still find trusted
    # state stored under the canonical project id. The same helper
    # is used everywhere a project name crosses a trust boundary
    # (project_state, memories, interactions). When the registry has
    # no entry the helper returns the input unchanged so hand-curated
    # state that predates the registry still works.
    canonical_project = resolve_project_name(project_hint) if project_hint else ""
    if canonical_project:
        state_entries = get_state(canonical_project)
        if state_entries:
            project_state_text = format_project_state(state_entries)
            project_state_text, project_state_chars = _truncate_text_block(
                project_state_text,
                project_state_budget or budget,
            )

    # 2. Get identity + preference memories (second precedence)
    memory_budget = min(int(budget * MEMORY_BUDGET_RATIO), max(budget - project_state_chars, 0))
    memory_text, memory_chars = get_memories_for_context(
        memory_types=["identity", "preference"],
        budget=memory_budget,
        query=user_prompt,
    )

    # 2b. Get project-scoped memories (third precedence). Only
    # populated when a canonical project is in scope — cross-project
    # memory bleed would rot the pack. Active-only filtering is
    # handled by the shared min_confidence=0.5 gate inside
    # get_memories_for_context.
    project_memory_text = ""
    project_memory_chars = 0
    if canonical_project:
        project_memory_budget = min(
            int(budget * PROJECT_MEMORY_BUDGET_RATIO),
            max(budget - project_state_chars - memory_chars, 0),
        )
        project_memory_text, project_memory_chars = get_memories_for_context(
            memory_types=PROJECT_MEMORY_TYPES,
            project=canonical_project,
            budget=project_memory_budget,
            header="--- Project Memories ---",
            footer="--- End Project Memories ---",
            query=user_prompt,
        )

    # 3. Calculate remaining budget for retrieval
    retrieval_budget = budget - project_state_chars - memory_chars - project_memory_chars

    # 4. Retrieve candidates
    candidates = (
        retrieve(
            user_prompt,
            top_k=_config.settings.context_top_k,
            project_hint=project_hint,
        )
        if retrieval_budget > 0
        else []
    )

    # 5. Score and rank
    scored = _rank_chunks(candidates, project_hint)

    # 6. Select within remaining budget
    selected = _select_within_budget(scored, max(retrieval_budget, 0))

    # 7. Format full context
    formatted = _format_full_context(
        project_state_text, memory_text, project_memory_text, selected
    )
    if len(formatted) > budget:
        formatted, selected = _trim_context_to_budget(
            project_state_text,
            memory_text,
            project_memory_text,
            selected,
            budget,
        )

    # 8. Build full prompt
    full_prompt = f"{SYSTEM_PREFIX}\n\n{formatted}\n\n{user_prompt}"

    project_state_chars = len(project_state_text)
    memory_chars = len(memory_text)
    project_memory_chars = len(project_memory_text)
    retrieval_chars = sum(c.char_count for c in selected)
    total_chars = len(formatted)
    duration_ms = int((time.time() - start) * 1000)

    pack = ContextPack(
        chunks_used=selected,
        project_state_text=project_state_text,
        project_state_chars=project_state_chars,
        memory_text=memory_text,
        memory_chars=memory_chars,
        project_memory_text=project_memory_text,
        project_memory_chars=project_memory_chars,
        total_chars=total_chars,
        budget=budget,
        budget_remaining=budget - total_chars,
        formatted_context=formatted,
        full_prompt=full_prompt,
        query=user_prompt,
        project_hint=project_hint or "",
        duration_ms=duration_ms,
    )

    _last_context_pack = pack

    log.info(
        "context_built",
        chunks_used=len(selected),
        project_state_chars=project_state_chars,
        memory_chars=memory_chars,
        project_memory_chars=project_memory_chars,
        retrieval_chars=retrieval_chars,
        total_chars=total_chars,
        budget_remaining=budget - total_chars,
        duration_ms=duration_ms,
    )
    log.debug("context_pack_detail", pack=_pack_to_dict(pack))

    return pack


def get_last_context_pack() -> ContextPack | None:
    """Return the last built context pack for debug inspection."""
    return _last_context_pack


def _rank_chunks(
    candidates: list[ChunkResult],
    project_hint: str | None,
) -> list[tuple[float, ChunkResult]]:
    """Rank candidates with boosting for project match."""
    scored = []
    seen_content: set[str] = set()

    for chunk in candidates:
        # Deduplicate by content prefix (first 200 chars)
        content_key = chunk.content[:200]
        if content_key in seen_content:
            continue
        seen_content.add(content_key)

        # Base score from similarity
        final_score = chunk.score

        # Project boost
        if project_hint:
            tags_str = chunk.tags.lower() if chunk.tags else ""
            source_str = chunk.source_file.lower()
            title_str = chunk.title.lower() if chunk.title else ""
            hint_lower = project_hint.lower()

            if hint_lower in tags_str or hint_lower in source_str or hint_lower in title_str:
                final_score *= 1.3

        scored.append((final_score, chunk))

    # Sort by score descending
    scored.sort(key=lambda x: x[0], reverse=True)
    return scored


def _select_within_budget(
    scored: list[tuple[float, ChunkResult]],
    budget: int,
) -> list[ContextChunk]:
    """Select top chunks that fit within the character budget."""
    selected = []
    used = 0

    for score, chunk in scored:
        chunk_len = len(chunk.content)
        if used + chunk_len > budget:
            continue
        selected.append(
            ContextChunk(
                content=chunk.content,
                source_file=_shorten_path(chunk.source_file),
                heading_path=chunk.heading_path,
                score=score,
                char_count=chunk_len,
            )
        )
        used += chunk_len

    return selected


def _format_full_context(
    project_state_text: str,
    memory_text: str,
    project_memory_text: str,
    chunks: list[ContextChunk],
) -> str:
    """Format project state + memories + retrieved chunks into full context block."""
    parts = []

    # 1. Project state first (highest trust)
    if project_state_text:
        parts.append(project_state_text)
        parts.append("")

    # 2. Identity + preference memories (second trust level)
    if memory_text:
        parts.append(memory_text)
        parts.append("")

    # 3. Project-scoped memories (third trust level)
    if project_memory_text:
        parts.append(project_memory_text)
        parts.append("")

    # 4. Retrieved chunks (lowest trust)
    if chunks:
        parts.append("--- AtoCore Retrieved Context ---")
        if project_state_text:
            parts.append("If retrieved context conflicts with Trusted Project State above, trust the Trusted Project State.")
        for chunk in chunks:
            parts.append(
                f"[Source: {chunk.source_file} | Section: {chunk.heading_path} | Score: {chunk.score:.2f}]"
            )
            parts.append(chunk.content)
            parts.append("")
        parts.append("--- End Context ---")
    elif not project_state_text and not memory_text and not project_memory_text:
        parts.append("--- AtoCore Context ---\nNo relevant context found.\n--- End Context ---")

    return "\n".join(parts)


def _shorten_path(path: str) -> str:
    """Shorten an absolute path to a relative-like display."""
    p = Path(path)
    parts = p.parts
    if len(parts) > 3:
        return str(Path(*parts[-3:]))
    return str(p)


def _pack_to_dict(pack: ContextPack) -> dict:
    """Convert a context pack to a JSON-serializable dict."""
    return {
        "query": pack.query,
        "project_hint": pack.project_hint,
        "project_state_chars": pack.project_state_chars,
        "memory_chars": pack.memory_chars,
        "project_memory_chars": pack.project_memory_chars,
        "chunks_used": len(pack.chunks_used),
        "total_chars": pack.total_chars,
        "budget": pack.budget,
        "budget_remaining": pack.budget_remaining,
        "duration_ms": pack.duration_ms,
        "has_project_state": bool(pack.project_state_text),
        "has_memories": bool(pack.memory_text),
        "has_project_memories": bool(pack.project_memory_text),
        "chunks": [
            {
                "source_file": c.source_file,
                "heading_path": c.heading_path,
                "score": c.score,
                "char_count": c.char_count,
                "content_preview": c.content[:100],
            }
            for c in pack.chunks_used
        ],
    }


def _truncate_text_block(text: str, budget: int) -> tuple[str, int]:
    """Trim a formatted text block so trusted tiers cannot exceed the total budget."""
    if budget <= 0 or not text:
        return "", 0
    if len(text) <= budget:
        return text, len(text)
    if budget <= 3:
        trimmed = text[:budget]
    else:
        trimmed = f"{text[: budget - 3].rstrip()}..."
    return trimmed, len(trimmed)


def _trim_context_to_budget(
    project_state_text: str,
    memory_text: str,
    project_memory_text: str,
    chunks: list[ContextChunk],
    budget: int,
) -> tuple[str, list[ContextChunk]]:
    """Trim retrieval → project memories → identity/preference → project state."""
    kept_chunks = list(chunks)
    formatted = _format_full_context(
        project_state_text, memory_text, project_memory_text, kept_chunks
    )
    while len(formatted) > budget and kept_chunks:
        kept_chunks.pop()
        formatted = _format_full_context(
            project_state_text, memory_text, project_memory_text, kept_chunks
        )

    if len(formatted) <= budget:
        return formatted, kept_chunks

    # Drop project memories next (they were the most recently added
    # tier and carry less trust than identity/preference).
    project_memory_text, _ = _truncate_text_block(
        project_memory_text,
        max(budget - len(project_state_text) - len(memory_text), 0),
    )
    formatted = _format_full_context(
        project_state_text, memory_text, project_memory_text, kept_chunks
    )
    if len(formatted) <= budget:
        return formatted, kept_chunks

    memory_text, _ = _truncate_text_block(memory_text, max(budget - len(project_state_text), 0))
    formatted = _format_full_context(
        project_state_text, memory_text, project_memory_text, kept_chunks
    )
    if len(formatted) <= budget:
        return formatted, kept_chunks

    project_state_text, _ = _truncate_text_block(project_state_text, budget)
    formatted = _format_full_context(project_state_text, "", "", [])
    if len(formatted) > budget:
        formatted, _ = _truncate_text_block(formatted, budget)
    return formatted, []
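The tier caps in `build_context` reduce to simple arithmetic over the character budget: 20% for project state, 10% for identity/preference, 25% for project memories, and whatever remains for retrieval, with each later tier only seeing what earlier tiers left behind. `allocate_budget` below is an illustrative condensation of that logic, not a function in the module:

```python
def allocate_budget(
    budget: int,
    state_chars: int,
    memory_chars: int,
    project_memory_chars: int,
) -> dict[str, int]:
    """Illustrative sketch of the tier caps build_context applies."""
    # Caps mirror PROJECT_STATE_BUDGET_RATIO / MEMORY_BUDGET_RATIO /
    # PROJECT_MEMORY_BUDGET_RATIO; retrieval gets the remainder,
    # clamped so it is never negative.
    state_cap = int(budget * 0.20)
    memory_cap = min(int(budget * 0.10), max(budget - state_chars, 0))
    project_memory_cap = min(
        int(budget * 0.25),
        max(budget - state_chars - memory_chars, 0),
    )
    retrieval = budget - state_chars - memory_chars - project_memory_chars
    return {
        "state_cap": state_cap,
        "memory_cap": memory_cap,
        "project_memory_cap": project_memory_cap,
        "retrieval_budget": max(retrieval, 0),
    }
```

With the default 3000-character budget and tiers actually using 500/250/600 characters, retrieval keeps 1650 characters, which is why the Master Plan can promise retrieval "60%+" of the budget in the common case where trusted tiers underfill their caps.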
254 src/atocore/context/project_state.py Normal file
@@ -0,0 +1,254 @@
"""Trusted Project State — the highest-priority context source.

Per the Master Plan trust precedence:
1. Trusted Project State (this module)
2. AtoDrive artifacts
3. Recent validated memory
4. AtoVault summaries
5. PKM chunks
6. Historical / low-confidence

Project state is manually curated or explicitly confirmed facts about a project.
It always wins over retrieval-based context when there's a conflict.
"""

import uuid
from dataclasses import dataclass
from datetime import datetime, timezone

from atocore.models.database import get_connection
from atocore.observability.logger import get_logger
from atocore.projects.registry import resolve_project_name

log = get_logger("project_state")

# DB schema extension for project state
PROJECT_STATE_SCHEMA = """
CREATE TABLE IF NOT EXISTS project_state (
    id TEXT PRIMARY KEY,
    project_id TEXT NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    category TEXT NOT NULL,
    key TEXT NOT NULL,
    value TEXT NOT NULL,
    source TEXT DEFAULT '',
    confidence REAL DEFAULT 1.0,
    status TEXT DEFAULT 'active',
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    UNIQUE(project_id, category, key)
);

CREATE INDEX IF NOT EXISTS idx_project_state_project ON project_state(project_id);
CREATE INDEX IF NOT EXISTS idx_project_state_category ON project_state(category);
CREATE INDEX IF NOT EXISTS idx_project_state_status ON project_state(status);
"""

# Valid categories for project state entries
CATEGORIES = [
    "status",       # current project status, phase, blockers
    "decision",     # confirmed design/engineering decisions
    "requirement",  # key requirements and constraints
    "contact",      # key people, vendors, stakeholders
    "milestone",    # dates, deadlines, deliverables
    "fact",         # verified technical facts
    "config",       # project configuration, parameters
]


@dataclass
class ProjectStateEntry:
    id: str
    project_id: str
    category: str
    key: str
    value: str
    source: str = ""
    confidence: float = 1.0
    status: str = "active"
    created_at: str = ""
    updated_at: str = ""


def init_project_state_schema() -> None:
    """Create the project_state table if it doesn't exist."""
    with get_connection() as conn:
        conn.executescript(PROJECT_STATE_SCHEMA)
|
||||
log.info("project_state_schema_initialized")
|
||||
|
||||
|
||||
def ensure_project(name: str, description: str = "") -> str:
|
||||
"""Get or create a project by name. Returns project_id."""
|
||||
with get_connection() as conn:
|
||||
row = conn.execute(
|
||||
"SELECT id FROM projects WHERE lower(name) = lower(?)", (name,)
|
||||
).fetchone()
|
||||
if row:
|
||||
return row["id"]
|
||||
|
||||
project_id = str(uuid.uuid4())
|
||||
conn.execute(
|
||||
"INSERT INTO projects (id, name, description) VALUES (?, ?, ?)",
|
||||
(project_id, name, description),
|
||||
)
|
||||
log.info("project_created", name=name, project_id=project_id)
|
||||
return project_id
|
||||
|
||||
|
||||
def set_state(
|
||||
project_name: str,
|
||||
category: str,
|
||||
key: str,
|
||||
value: str,
|
||||
source: str = "",
|
||||
confidence: float = 1.0,
|
||||
) -> ProjectStateEntry:
|
||||
"""Set or update a project state entry. Upsert semantics.
|
||||
|
||||
The ``project_name`` is canonicalized through the registry so a
|
||||
caller passing an alias (``p05``) ends up writing into the same
|
||||
row as the canonical id (``p05-interferometer``). Without this
|
||||
step, alias and canonical names would create two parallel
|
||||
project rows and fragmented state.
|
||||
"""
|
||||
if category not in CATEGORIES:
|
||||
raise ValueError(f"Invalid category '{category}'. Must be one of: {CATEGORIES}")
|
||||
_validate_confidence(confidence)
|
||||
|
||||
project_name = resolve_project_name(project_name)
|
||||
project_id = ensure_project(project_name)
|
||||
entry_id = str(uuid.uuid4())
|
||||
now = datetime.now(timezone.utc).isoformat()
|
||||
|
||||
with get_connection() as conn:
|
||||
# Check if entry exists
|
||||
existing = conn.execute(
|
||||
"SELECT id FROM project_state WHERE project_id = ? AND category = ? AND key = ?",
|
||||
(project_id, category, key),
|
||||
).fetchone()
|
||||
|
||||
if existing:
|
||||
entry_id = existing["id"]
|
||||
conn.execute(
|
||||
"UPDATE project_state SET value = ?, source = ?, confidence = ?, "
|
||||
"status = 'active', updated_at = CURRENT_TIMESTAMP "
|
||||
"WHERE id = ?",
|
||||
(value, source, confidence, entry_id),
|
||||
)
|
||||
log.info("project_state_updated", project=project_name, category=category, key=key)
|
||||
else:
|
||||
conn.execute(
|
||||
"INSERT INTO project_state (id, project_id, category, key, value, source, confidence) "
|
||||
"VALUES (?, ?, ?, ?, ?, ?, ?)",
|
||||
(entry_id, project_id, category, key, value, source, confidence),
|
||||
)
|
||||
log.info("project_state_created", project=project_name, category=category, key=key)
|
||||
|
||||
return ProjectStateEntry(
|
||||
id=entry_id,
|
||||
project_id=project_id,
|
||||
category=category,
|
||||
key=key,
|
||||
value=value,
|
||||
source=source,
|
||||
confidence=confidence,
|
||||
status="active",
|
||||
created_at=now,
|
||||
updated_at=now,
|
||||
)
|
||||
|
||||
|
||||
def get_state(
|
||||
project_name: str,
|
||||
category: str | None = None,
|
||||
active_only: bool = True,
|
||||
) -> list[ProjectStateEntry]:
|
||||
"""Get project state entries, optionally filtered by category.
|
||||
|
||||
The lookup is canonicalized through the registry so an alias hint
|
||||
finds the same rows as the canonical id.
|
||||
"""
|
||||
project_name = resolve_project_name(project_name)
|
||||
with get_connection() as conn:
|
||||
project = conn.execute(
|
||||
"SELECT id FROM projects WHERE lower(name) = lower(?)", (project_name,)
|
||||
).fetchone()
|
||||
if not project:
|
||||
return []
|
||||
|
||||
query = "SELECT * FROM project_state WHERE project_id = ?"
|
||||
params: list = [project["id"]]
|
||||
|
||||
if category:
|
||||
query += " AND category = ?"
|
||||
params.append(category)
|
||||
if active_only:
|
||||
query += " AND status = 'active'"
|
||||
|
||||
query += " ORDER BY category, key"
|
||||
rows = conn.execute(query, params).fetchall()
|
||||
|
||||
return [
|
||||
ProjectStateEntry(
|
||||
id=r["id"],
|
||||
project_id=r["project_id"],
|
||||
category=r["category"],
|
||||
key=r["key"],
|
||||
value=r["value"],
|
||||
source=r["source"],
|
||||
confidence=r["confidence"],
|
||||
status=r["status"],
|
||||
created_at=r["created_at"],
|
||||
updated_at=r["updated_at"],
|
||||
)
|
||||
for r in rows
|
||||
]
|
||||
|
||||
|
||||
def invalidate_state(project_name: str, category: str, key: str) -> bool:
|
||||
"""Mark a project state entry as superseded.
|
||||
|
||||
The lookup is canonicalized through the registry so an alias is
|
||||
treated as the canonical project for the invalidation lookup.
|
||||
"""
|
||||
project_name = resolve_project_name(project_name)
|
||||
with get_connection() as conn:
|
||||
project = conn.execute(
|
||||
"SELECT id FROM projects WHERE lower(name) = lower(?)", (project_name,)
|
||||
).fetchone()
|
||||
if not project:
|
||||
return False
|
||||
|
||||
result = conn.execute(
|
||||
"UPDATE project_state SET status = 'superseded', updated_at = CURRENT_TIMESTAMP "
|
||||
"WHERE project_id = ? AND category = ? AND key = ? AND status = 'active'",
|
||||
(project["id"], category, key),
|
||||
)
|
||||
if result.rowcount > 0:
|
||||
log.info("project_state_invalidated", project=project_name, category=category, key=key)
|
||||
return True
|
||||
return False
|
||||
|
||||
|
||||
def format_project_state(entries: list[ProjectStateEntry]) -> str:
|
||||
"""Format project state entries for context injection."""
|
||||
if not entries:
|
||||
return ""
|
||||
|
||||
lines = ["--- Trusted Project State ---"]
|
||||
current_category = ""
|
||||
|
||||
for entry in entries:
|
||||
if entry.category != current_category:
|
||||
current_category = entry.category
|
||||
lines.append(f"\n[{current_category.upper()}]")
|
||||
lines.append(f" {entry.key}: {entry.value}")
|
||||
if entry.source:
|
||||
lines.append(f" (source: {entry.source})")
|
||||
|
||||
lines.append("\n--- End Project State ---")
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
def _validate_confidence(confidence: float) -> None:
|
||||
if not 0.0 <= confidence <= 1.0:
|
||||
raise ValueError("Confidence must be between 0.0 and 1.0")
|
||||
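`set_state` gets its upsert semantics from the check-then-update path backed by the `UNIQUE(project_id, category, key)` constraint. A self-contained sketch of the same pattern against an in-memory SQLite database (table trimmed to the relevant columns; `upsert_state` is an illustrative stand-in, not the module's function):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    "CREATE TABLE project_state ("
    " id TEXT PRIMARY KEY, project_id TEXT, category TEXT, key TEXT,"
    " value TEXT, UNIQUE(project_id, category, key))"
)

def upsert_state(entry_id, project_id, category, key, value):
    # Check-then-update mirrors set_state: an existing (project, category,
    # key) row is updated in place and keeps its original id.
    existing = conn.execute(
        "SELECT id FROM project_state WHERE project_id=? AND category=? AND key=?",
        (project_id, category, key),
    ).fetchone()
    if existing:
        conn.execute("UPDATE project_state SET value=? WHERE id=?", (value, existing["id"]))
        return existing["id"]
    conn.execute(
        "INSERT INTO project_state (id, project_id, category, key, value) VALUES (?,?,?,?,?)",
        (entry_id, project_id, category, key, value),
    )
    return entry_id

first = upsert_state("e1", "p05", "status", "phase", "design")
second = upsert_state("e2", "p05", "status", "phase", "build")  # updates, no new row
```

The second call reuses the first row's id and overwrites its value, so the table holds exactly one row per (project, category, key).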
src/atocore/ingestion/__init__.py (new file, empty)

src/atocore/ingestion/chunker.py (new file, 150 lines)

```python
"""Heading-aware recursive markdown chunking."""

import re
from dataclasses import dataclass, field

import atocore.config as _config


@dataclass
class Chunk:
    content: str
    chunk_index: int
    heading_path: str
    char_count: int
    metadata: dict = field(default_factory=dict)


def chunk_markdown(
    body: str,
    base_metadata: dict | None = None,
    max_size: int | None = None,
    overlap: int | None = None,
    min_size: int | None = None,
) -> list[Chunk]:
    """Split markdown body into chunks using heading-aware strategy.

    1. Split on H2 boundaries
    2. If section > max_size, split on H3
    3. If still > max_size, split on paragraph breaks
    4. If still > max_size, hard split with overlap
    """
    max_size = max_size or _config.settings.chunk_max_size
    overlap = overlap or _config.settings.chunk_overlap
    min_size = min_size or _config.settings.chunk_min_size
    base_metadata = base_metadata or {}

    sections = _split_by_heading(body, level=2)
    raw_chunks: list[tuple[str, str]] = []  # (heading_path, content)

    for heading, content in sections:
        if len(content) <= max_size:
            raw_chunks.append((heading, content))
        else:
            # Try splitting on H3
            subsections = _split_by_heading(content, level=3)
            for sub_heading, sub_content in subsections:
                full_path = (
                    f"{heading} > {sub_heading}" if heading and sub_heading else heading or sub_heading
                )
                if len(sub_content) <= max_size:
                    raw_chunks.append((full_path, sub_content))
                else:
                    # Split on paragraphs
                    para_chunks = _split_by_paragraphs(
                        sub_content, max_size, overlap
                    )
                    for pc in para_chunks:
                        raw_chunks.append((full_path, pc))

    # Build final chunks, filtering out too-small ones
    chunks = []
    idx = 0
    for heading_path, content in raw_chunks:
        content = content.strip()
        if len(content) < min_size:
            continue
        chunks.append(
            Chunk(
                content=content,
                chunk_index=idx,
                heading_path=heading_path,
                char_count=len(content),
                metadata={**base_metadata},
            )
        )
        idx += 1

    return chunks


def _split_by_heading(text: str, level: int) -> list[tuple[str, str]]:
    """Split text by heading level. Returns (heading_text, section_content) pairs."""
    pattern = rf"^({'#' * level})\s+(.+)$"
    parts: list[tuple[str, str]] = []
    current_heading = ""
    current_lines: list[str] = []

    for line in text.split("\n"):
        match = re.match(pattern, line)
        if match:
            # Save previous section
            if current_lines:
                parts.append((current_heading, "\n".join(current_lines)))
            current_heading = match.group(2).strip()
            current_lines = []
        else:
            current_lines.append(line)

    # Save last section
    if current_lines:
        parts.append((current_heading, "\n".join(current_lines)))

    return parts


def _split_by_paragraphs(
    text: str, max_size: int, overlap: int
) -> list[str]:
    """Split text by paragraph breaks, then hard-split if needed."""
    paragraphs = re.split(r"\n\n+", text)
    chunks: list[str] = []
    current = ""

    for para in paragraphs:
        para = para.strip()
        if not para:
            continue

        if len(current) + len(para) + 2 <= max_size:
            current = f"{current}\n\n{para}" if current else para
        else:
            if current:
                chunks.append(current)
            # If single paragraph exceeds max, hard split
            if len(para) > max_size:
                chunks.extend(_hard_split(para, max_size, overlap))
            else:
                current = para
                continue
            current = ""

    if current:
        chunks.append(current)

    return chunks


def _hard_split(text: str, max_size: int, overlap: int) -> list[str]:
    """Hard split text at max_size with overlap."""
    # Prevent infinite loop: overlap must be less than max_size
    if overlap >= max_size:
        overlap = max_size // 4

    chunks = []
    start = 0
    while start < len(text):
        end = start + max_size
        chunks.append(text[start:end])
        start = end - overlap
    return chunks
```
src/atocore/ingestion/parser.py (new file, 65 lines)

```python
"""Markdown file parsing with frontmatter extraction."""

import re
from dataclasses import dataclass, field
from pathlib import Path

import frontmatter


@dataclass
class ParsedDocument:
    file_path: str
    title: str
    body: str
    tags: list[str] = field(default_factory=list)
    frontmatter: dict = field(default_factory=dict)
    headings: list[tuple[int, str]] = field(default_factory=list)


def parse_markdown(file_path: Path, text: str | None = None) -> ParsedDocument:
    """Parse a markdown file, extracting frontmatter and structure."""
    raw_text = text if text is not None else file_path.read_text(encoding="utf-8")
    post = frontmatter.loads(raw_text)

    meta = dict(post.metadata) if post.metadata else {}
    body = post.content.strip()

    # Extract title: first H1, or filename
    title = _extract_title(body, file_path)

    # Extract tags from frontmatter
    tags = meta.get("tags", [])
    if isinstance(tags, str):
        tags = [t.strip() for t in tags.split(",") if t.strip()]
    tags = tags or []

    # Extract heading structure
    headings = _extract_headings(body)

    return ParsedDocument(
        file_path=str(file_path.resolve()),
        title=title,
        body=body,
        tags=tags,
        frontmatter=meta,
        headings=headings,
    )


def _extract_title(body: str, file_path: Path) -> str:
    """Get title from first H1 or fallback to filename."""
    match = re.search(r"^#\s+(.+)$", body, re.MULTILINE)
    if match:
        return match.group(1).strip()
    return file_path.stem.replace("_", " ").replace("-", " ").title()


def _extract_headings(body: str) -> list[tuple[int, str]]:
    """Extract all headings with their level."""
    headings = []
    for match in re.finditer(r"^(#{1,4})\s+(.+)$", body, re.MULTILINE):
        level = len(match.group(1))
        text = match.group(2).strip()
        headings.append((level, text))
    return headings
```
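The two extraction helpers come down to a pair of multiline regexes: the first H1 becomes the title, and any H1 through H4 becomes a `(level, text)` pair. A quick self-contained run of those same patterns on a sample body:

```python
import re

body = "# My Note\n\nIntro text.\n\n## Setup\nDetails.\n\n### Steps\nMore."

# First H1 wins as the title, as in _extract_title.
m = re.search(r"^#\s+(.+)$", body, re.MULTILINE)
title = m.group(1).strip() if m else "untitled"

# All headings up to H4 with their level, as in _extract_headings.
headings = [
    (len(h.group(1)), h.group(2).strip())
    for h in re.finditer(r"^(#{1,4})\s+(.+)$", body, re.MULTILINE)
]
```

Note that the H1 also appears in the heading list; the title is just the first match of the narrower pattern.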
src/atocore/ingestion/pipeline.py (new file, 321 lines)

```python
"""Ingestion pipeline: parse → chunk → embed → store."""

import hashlib
import json
import threading
import time
import uuid
from contextlib import contextmanager
from pathlib import Path

import atocore.config as _config
from atocore.ingestion.chunker import chunk_markdown
from atocore.ingestion.parser import parse_markdown
from atocore.models.database import get_connection
from atocore.observability.logger import get_logger
from atocore.retrieval.vector_store import get_vector_store

log = get_logger("ingestion")

# Encodings to try when reading markdown files
_ENCODINGS = ["utf-8", "utf-8-sig", "latin-1", "cp1252"]
_INGESTION_LOCK = threading.Lock()


@contextmanager
def exclusive_ingestion():
    """Serialize long-running ingestion operations across API requests."""
    _INGESTION_LOCK.acquire()
    try:
        yield
    finally:
        _INGESTION_LOCK.release()


def ingest_file(file_path: Path) -> dict:
    """Ingest a single markdown file. Returns stats."""
    start = time.time()
    file_path = file_path.resolve()

    if not file_path.exists():
        raise FileNotFoundError(f"File not found: {file_path}")
    if file_path.suffix.lower() not in (".md", ".markdown"):
        raise ValueError(f"Not a markdown file: {file_path}")

    # Read with encoding fallback
    raw_content = _read_file_safe(file_path)
    file_hash = hashlib.sha256(raw_content.encode("utf-8")).hexdigest()

    # Check if already ingested and unchanged
    with get_connection() as conn:
        existing = conn.execute(
            "SELECT id, file_hash FROM source_documents WHERE file_path = ?",
            (str(file_path),),
        ).fetchone()

    if existing and existing["file_hash"] == file_hash:
        log.info("file_skipped_unchanged", file_path=str(file_path))
        return {"file": str(file_path), "status": "skipped", "reason": "unchanged"}

    # Parse
    parsed = parse_markdown(file_path, text=raw_content)

    # Chunk
    base_meta = {
        "source_file": str(file_path),
        "tags": parsed.tags,
        "title": parsed.title,
    }
    chunks = chunk_markdown(parsed.body, base_metadata=base_meta)

    # Store in DB and vector store
    doc_id = str(uuid.uuid4())
    vector_store = get_vector_store()
    old_chunk_ids: list[str] = []
    new_chunk_ids: list[str] = []

    try:
        with get_connection() as conn:
            # Remove old data if re-ingesting
            if existing:
                doc_id = existing["id"]
                old_chunk_ids = [
                    row["id"]
                    for row in conn.execute(
                        "SELECT id FROM source_chunks WHERE document_id = ?",
                        (doc_id,),
                    ).fetchall()
                ]
                conn.execute(
                    "DELETE FROM source_chunks WHERE document_id = ?", (doc_id,)
                )
                conn.execute(
                    "UPDATE source_documents SET file_hash = ?, title = ?, tags = ?, updated_at = CURRENT_TIMESTAMP WHERE id = ?",
                    (file_hash, parsed.title, json.dumps(parsed.tags), doc_id),
                )
            else:
                conn.execute(
                    "INSERT INTO source_documents (id, file_path, file_hash, title, doc_type, tags) VALUES (?, ?, ?, ?, ?, ?)",
                    (doc_id, str(file_path), file_hash, parsed.title, "markdown", json.dumps(parsed.tags)),
                )

            if not chunks:
                log.warning("no_chunks_created", file_path=str(file_path))
            else:
                # Insert chunks
                chunk_contents = []
                chunk_metadatas = []

                for chunk in chunks:
                    chunk_id = str(uuid.uuid4())
                    new_chunk_ids.append(chunk_id)
                    chunk_contents.append(chunk.content)
                    chunk_metadatas.append({
                        "document_id": doc_id,
                        "heading_path": chunk.heading_path,
                        "source_file": str(file_path),
                        "tags": json.dumps(parsed.tags),
                        "title": parsed.title,
                    })

                    conn.execute(
                        "INSERT INTO source_chunks (id, document_id, chunk_index, content, heading_path, char_count, metadata) VALUES (?, ?, ?, ?, ?, ?, ?)",
                        (
                            chunk_id,
                            doc_id,
                            chunk.chunk_index,
                            chunk.content,
                            chunk.heading_path,
                            chunk.char_count,
                            json.dumps(chunk.metadata),
                        ),
                    )

                # Add new vectors before commit so DB can still roll back on failure.
                vector_store.add(new_chunk_ids, chunk_contents, chunk_metadatas)
    except Exception:
        if new_chunk_ids:
            vector_store.delete(new_chunk_ids)
        raise

    # Delete stale vectors only after the DB transaction committed.
    if old_chunk_ids:
        vector_store.delete(old_chunk_ids)

    duration_ms = int((time.time() - start) * 1000)
    if chunks:
        log.info(
            "file_ingested",
            file_path=str(file_path),
            chunks_created=len(chunks),
            duration_ms=duration_ms,
        )
    else:
        log.info(
            "file_ingested_empty",
            file_path=str(file_path),
            duration_ms=duration_ms,
        )

    return {
        "file": str(file_path),
        "status": "ingested" if chunks else "empty",
        "chunks": len(chunks),
        "duration_ms": duration_ms,
    }


def ingest_folder(folder_path: Path, purge_deleted: bool = True) -> list[dict]:
    """Ingest all markdown files in a folder recursively.

    Args:
        folder_path: Directory to scan for .md files.
        purge_deleted: If True, remove DB/vector entries for files
            that no longer exist on disk.
    """
    folder_path = folder_path.resolve()
    if not folder_path.is_dir():
        raise NotADirectoryError(f"Not a directory: {folder_path}")

    results = []
    md_files = sorted(
        list(folder_path.rglob("*.md")) + list(folder_path.rglob("*.markdown"))
    )
    current_paths = {str(f.resolve()) for f in md_files}
    log.info("ingestion_started", folder=str(folder_path), file_count=len(md_files))

    # Ingest new/changed files
    for md_file in md_files:
        try:
            result = ingest_file(md_file)
            results.append(result)
        except Exception as e:
            log.error("ingestion_error", file_path=str(md_file), error=str(e))
            results.append({"file": str(md_file), "status": "error", "error": str(e)})

    # Purge entries for deleted files
    if purge_deleted:
        deleted = _purge_deleted_files(folder_path, current_paths)
        if deleted:
            log.info("purged_deleted_files", count=deleted)
            results.append({"status": "purged", "deleted_count": deleted})

    return results


def get_source_status() -> list[dict]:
    """Describe configured source directories and their readiness."""
    sources = []
    for spec in _config.settings.source_specs:
        path = spec["path"]
        assert isinstance(path, Path)
        sources.append(
            {
                "name": spec["name"],
                "enabled": spec["enabled"],
                "path": str(path),
                "exists": path.exists(),
                "is_dir": path.is_dir(),
                "read_only": spec["read_only"],
            }
        )
    return sources


def ingest_configured_sources(purge_deleted: bool = False) -> list[dict]:
    """Ingest enabled source directories declared in config.

    Purge is disabled by default here because sources are intended to be
    read-only inputs and should not be treated as the primary writable state.
    """
    results = []
    for source in get_source_status():
        if not source["enabled"]:
            results.append({"source": source["name"], "status": "disabled", "path": source["path"]})
            continue
        if not source["exists"] or not source["is_dir"]:
            results.append({"source": source["name"], "status": "missing", "path": source["path"]})
            continue

        folder_results = ingest_folder(Path(source["path"]), purge_deleted=purge_deleted)
        results.append(
            {
                "source": source["name"],
                "status": "ingested",
                "path": source["path"],
                "results": folder_results,
            }
        )
    return results


def get_ingestion_stats() -> dict:
    """Return ingestion statistics."""
    with get_connection() as conn:
        docs = conn.execute("SELECT COUNT(*) as c FROM source_documents").fetchone()
        chunks = conn.execute("SELECT COUNT(*) as c FROM source_chunks").fetchone()
        recent = conn.execute(
            "SELECT file_path, title, ingested_at FROM source_documents "
            "ORDER BY updated_at DESC LIMIT 5"
        ).fetchall()

    vector_store = get_vector_store()
    return {
        "total_documents": docs["c"],
        "total_chunks": chunks["c"],
        "total_vectors": vector_store.count,
        "recent_documents": [
            {"file_path": r["file_path"], "title": r["title"], "ingested_at": r["ingested_at"]}
            for r in recent
        ],
    }


def _read_file_safe(file_path: Path) -> str:
    """Read a file with encoding fallback."""
    for encoding in _ENCODINGS:
        try:
            return file_path.read_text(encoding=encoding)
        except (UnicodeDecodeError, ValueError):
            continue
    # Last resort: read with errors replaced
    return file_path.read_text(encoding="utf-8", errors="replace")


def _purge_deleted_files(folder_path: Path, current_paths: set[str]) -> int:
    """Remove DB/vector entries for files under folder_path that no longer exist."""
    folder_str = str(folder_path)
    deleted_count = 0
    vector_store = get_vector_store()
    chunk_ids_to_delete: list[str] = []

    with get_connection() as conn:
        rows = conn.execute(
            "SELECT id, file_path FROM source_documents"
        ).fetchall()

        for row in rows:
            doc_path = Path(row["file_path"])
            try:
                doc_path.relative_to(folder_path)
            except ValueError:
                continue

            if row["file_path"] not in current_paths:
                doc_id = row["id"]
                chunk_ids_to_delete.extend(
                    r["id"]
                    for r in conn.execute(
                        "SELECT id FROM source_chunks WHERE document_id = ?",
                        (doc_id,),
                    ).fetchall()
                )
                conn.execute("DELETE FROM source_chunks WHERE document_id = ?", (doc_id,))
                conn.execute("DELETE FROM source_documents WHERE id = ?", (doc_id,))
                log.info("purged_deleted_file", file_path=row["file_path"])
                deleted_count += 1

    if chunk_ids_to_delete:
        vector_store.delete(chunk_ids_to_delete)

    return deleted_count
```
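The pipeline skips unchanged files by comparing a SHA-256 of the raw content against the hash stored in `source_documents.file_hash`. A self-contained sketch of that change-detection check (the `content_hash` helper name is illustrative):

```python
import hashlib

def content_hash(raw: str) -> str:
    # Same digest shape the pipeline stores for each ingested document.
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

stored = content_hash("# Note\n\nbody\n")

# On re-ingest: identical content means skip, any edit means re-chunk/re-embed.
unchanged = content_hash("# Note\n\nbody\n") == stored
changed = content_hash("# Note\n\nedited\n") == stored
```

Hashing the decoded text (rather than the raw bytes) means a file re-saved in a different encoding but with identical content still counts as unchanged.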
src/atocore/interactions/__init__.py (new file, 27 lines)

```python
"""Interactions: capture loop for AtoCore.

This module is the foundation for Phase 9 (Reflection) and Phase 10
(Write-back). It records what AtoCore fed to an LLM and what came back,
so that later phases can:

- reinforce active memories that the LLM actually relied on
- extract candidate memories / project state from real conversations
- inspect the audit trail of any answer the system helped produce

Nothing here automatically promotes information into trusted state.
The capture loop is intentionally read-only with respect to trust.
"""

from atocore.interactions.service import (
    Interaction,
    get_interaction,
    list_interactions,
    record_interaction,
)

__all__ = [
    "Interaction",
    "get_interaction",
    "list_interactions",
    "record_interaction",
]
```
src/atocore/interactions/service.py (new file, 329 lines)

```python
"""Interaction capture service.

An *interaction* is one round-trip of:
- a user prompt
- the AtoCore context pack that was assembled for it
- the LLM response (full text or a summary, caller's choice)
- which memories and chunks were actually used in the pack
- a client identifier (e.g. ``openclaw``, ``claude-code``, ``manual``)
- an optional session identifier so multi-turn conversations can be
  reconstructed later

The capture is intentionally additive: it never modifies memories,
project state, or chunks. Reflection (Phase 9 Commit B/C) and
write-back (Phase 10) are layered on top of this audit trail without
violating the AtoCore trust hierarchy.
"""

from __future__ import annotations

import json
import re
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone

from atocore.models.database import get_connection
from atocore.observability.logger import get_logger
from atocore.projects.registry import resolve_project_name

log = get_logger("interactions")

# Stored timestamps use 'YYYY-MM-DD HH:MM:SS' (no timezone offset, UTC by
# convention) so they sort lexically and compare cleanly with the SQLite
# CURRENT_TIMESTAMP default. The since filter accepts ISO 8601 strings
# (with 'T', optional 'Z' or +offset, optional fractional seconds) and
# normalizes them to the storage format before the SQL comparison.
_STORAGE_TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


@dataclass
class Interaction:
    id: str
    prompt: str
    response: str
    response_summary: str
    project: str
    client: str
    session_id: str
    memories_used: list[str] = field(default_factory=list)
    chunks_used: list[str] = field(default_factory=list)
    context_pack: dict = field(default_factory=dict)
    created_at: str = ""


def record_interaction(
    prompt: str,
    response: str = "",
    response_summary: str = "",
    project: str = "",
    client: str = "",
    session_id: str = "",
    memories_used: list[str] | None = None,
    chunks_used: list[str] | None = None,
    context_pack: dict | None = None,
    reinforce: bool = True,
    extract: bool = False,
) -> Interaction:
    """Persist a single interaction to the audit trail.

    The only required field is ``prompt`` so this can be called even when
    the caller is in the middle of a partial turn (for example to record
    that AtoCore was queried even before the LLM response is back).

    When ``reinforce`` is True (default) and the interaction has response
    content, the Phase 9 Commit B reinforcement pass runs automatically
    against the active memory set. This bumps the confidence of any
    memory whose content is echoed in the response. Set ``reinforce`` to
    False to capture the interaction without touching memory confidence,
    which is useful for backfill and for tests that want to isolate the
    audit trail from the reinforcement loop.
    """
    if not prompt or not prompt.strip():
        raise ValueError("Interaction prompt must be non-empty")

    # Canonicalize the project through the registry so an alias and
    # the canonical id store under the same bucket. Without this,
    # reinforcement and extraction (which both query by raw
    # interaction.project) would silently miss memories and create
    # candidates in the wrong project.
    project = resolve_project_name(project)

    interaction_id = str(uuid.uuid4())
    # Store created_at explicitly so the same string lives in both the DB
    # column and the returned dataclass. SQLite's CURRENT_TIMESTAMP uses
    # 'YYYY-MM-DD HH:MM:SS' which would not compare cleanly against ISO
    # timestamps with 'T' and tz offset, breaking the `since` filter on
    # list_interactions.
    now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    memories_used = list(memories_used or [])
    chunks_used = list(chunks_used or [])
    context_pack_payload = context_pack or {}

    with get_connection() as conn:
        conn.execute(
            """
            INSERT INTO interactions (
                id, prompt, context_pack, response_summary, response,
                memories_used, chunks_used, client, session_id, project,
                created_at
            ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
            """,
            (
                interaction_id,
                prompt,
                json.dumps(context_pack_payload, ensure_ascii=True),
                response_summary,
                response,
                json.dumps(memories_used, ensure_ascii=True),
                json.dumps(chunks_used, ensure_ascii=True),
                client,
                session_id,
                project,
                now,
            ),
        )

    log.info(
        "interaction_recorded",
        interaction_id=interaction_id,
        project=project,
        client=client,
        session_id=session_id,
        memories_used=len(memories_used),
        chunks_used=len(chunks_used),
        response_chars=len(response),
    )

    interaction = Interaction(
        id=interaction_id,
        prompt=prompt,
        response=response,
        response_summary=response_summary,
        project=project,
        client=client,
        session_id=session_id,
        memories_used=memories_used,
        chunks_used=chunks_used,
        context_pack=context_pack_payload,
        created_at=now,
    )

    if reinforce and (response or response_summary):
        # Import inside the function to avoid a circular import between
        # the interactions service and the reinforcement module which
        # depends on it.
        try:
            from atocore.memory.reinforcement import reinforce_from_interaction

            reinforce_from_interaction(interaction)
        except Exception as exc:  # pragma: no cover - reinforcement must never block capture
            log.error(
                "reinforcement_failed_on_capture",
                interaction_id=interaction_id,
                error=str(exc),
            )

    if extract and (response or response_summary):
        try:
            from atocore.memory.extractor import extract_candidates_from_interaction
            from atocore.memory.service import create_memory

            candidates = extract_candidates_from_interaction(interaction)
            for candidate in candidates:
```
try:
|
||||
create_memory(
|
||||
memory_type=candidate.memory_type,
|
||||
content=candidate.content,
|
||||
project=candidate.project,
|
||||
confidence=candidate.confidence,
|
||||
status="candidate",
|
||||
)
|
||||
except ValueError:
|
||||
pass # duplicate or validation error — skip silently
|
||||
except Exception as exc: # pragma: no cover - extraction must never block capture
|
||||
log.error(
|
||||
"extraction_failed_on_capture",
|
||||
interaction_id=interaction_id,
|
||||
error=str(exc),
|
||||
)
|
||||
|
||||
return interaction
|
||||
|
||||
|
||||
def list_interactions(
|
||||
project: str | None = None,
|
||||
session_id: str | None = None,
|
||||
client: str | None = None,
|
||||
since: str | None = None,
|
||||
limit: int = 50,
|
||||
) -> list[Interaction]:
|
||||
"""List captured interactions, optionally filtered.
|
||||
|
||||
``since`` accepts an ISO 8601 timestamp string (with ``T``, an
|
||||
optional ``Z`` or numeric offset, optional fractional seconds).
|
||||
The value is normalized to the storage format (UTC,
|
||||
``YYYY-MM-DD HH:MM:SS``) before the SQL comparison so external
|
||||
callers can pass any of the common ISO shapes without filter
|
||||
drift. ``project`` is canonicalized through the registry so an
|
||||
alias finds rows stored under the canonical project id.
|
||||
``limit`` is hard-capped at 500 to keep casual API listings cheap.
|
||||
"""
|
||||
if limit <= 0:
|
||||
return []
|
||||
limit = min(limit, 500)
|
||||
|
||||
query = "SELECT * FROM interactions WHERE 1=1"
|
||||
params: list = []
|
||||
|
||||
if project:
|
||||
query += " AND project = ?"
|
||||
params.append(resolve_project_name(project))
|
||||
if session_id:
|
||||
query += " AND session_id = ?"
|
||||
params.append(session_id)
|
||||
if client:
|
||||
query += " AND client = ?"
|
||||
params.append(client)
|
||||
if since:
|
||||
query += " AND created_at >= ?"
|
||||
params.append(_normalize_since(since))
|
||||
|
||||
query += " ORDER BY created_at DESC LIMIT ?"
|
||||
params.append(limit)
|
||||
|
||||
with get_connection() as conn:
|
||||
rows = conn.execute(query, params).fetchall()
|
||||
|
||||
return [_row_to_interaction(row) for row in rows]
|
||||
|
||||
|
||||
def get_interaction(interaction_id: str) -> Interaction | None:
|
||||
"""Fetch one interaction by id, or return None if it does not exist."""
|
||||
if not interaction_id:
|
||||
return None
|
||||
with get_connection() as conn:
|
||||
row = conn.execute(
|
||||
"SELECT * FROM interactions WHERE id = ?", (interaction_id,)
|
||||
).fetchone()
|
||||
if row is None:
|
||||
return None
|
||||
return _row_to_interaction(row)
|
||||
|
||||
|
||||
def _row_to_interaction(row) -> Interaction:
|
||||
return Interaction(
|
||||
id=row["id"],
|
||||
prompt=row["prompt"],
|
||||
response=row["response"] or "",
|
||||
response_summary=row["response_summary"] or "",
|
||||
project=row["project"] or "",
|
||||
client=row["client"] or "",
|
||||
session_id=row["session_id"] or "",
|
||||
memories_used=_safe_json_list(row["memories_used"]),
|
||||
chunks_used=_safe_json_list(row["chunks_used"]),
|
||||
context_pack=_safe_json_dict(row["context_pack"]),
|
||||
created_at=row["created_at"] or "",
|
||||
)
|
||||
|
||||
|
||||
def _safe_json_list(raw: str | None) -> list[str]:
|
||||
if not raw:
|
||||
return []
|
||||
try:
|
||||
value = json.loads(raw)
|
||||
except json.JSONDecodeError:
|
||||
return []
|
||||
if not isinstance(value, list):
|
||||
return []
|
||||
return [str(item) for item in value]
|
||||
|
||||
|
||||
def _safe_json_dict(raw: str | None) -> dict:
|
||||
if not raw:
|
||||
return {}
|
||||
try:
|
||||
value = json.loads(raw)
|
||||
except json.JSONDecodeError:
|
||||
return {}
|
||||
if not isinstance(value, dict):
|
||||
return {}
|
||||
return value
|
||||
|
||||
|
||||
def _normalize_since(since: str) -> str:
|
||||
"""Normalize an ISO 8601 ``since`` filter to the storage format.
|
||||
|
||||
Stored ``created_at`` values are ``YYYY-MM-DD HH:MM:SS`` (no
|
||||
timezone, UTC by convention). External callers naturally pass
|
||||
ISO 8601 with ``T`` separator, optional ``Z`` suffix, optional
|
||||
fractional seconds, and optional ``+HH:MM`` offsets. A naive
|
||||
string comparison between the two formats fails on the same
|
||||
day because the lexically-greater ``T`` makes any ISO value
|
||||
sort after any space-separated value.
|
||||
|
||||
This helper accepts the common ISO shapes plus the bare
|
||||
storage format and returns the storage format. On a parse
|
||||
failure it returns the input unchanged so the SQL comparison
|
||||
fails open (no rows match) instead of raising and breaking
|
||||
the listing endpoint.
|
||||
"""
|
||||
if not since:
|
||||
return since
|
||||
candidate = since.strip()
|
||||
# Python's fromisoformat understands trailing 'Z' from 3.11+ but
|
||||
# we replace it explicitly for safety against earlier shapes.
|
||||
if candidate.endswith("Z"):
|
||||
candidate = candidate[:-1] + "+00:00"
|
||||
try:
|
||||
dt = datetime.fromisoformat(candidate)
|
||||
except ValueError:
|
||||
# Already in storage format, or unparseable: best-effort
|
||||
# match the storage format with a regex; if that fails too,
|
||||
# return the raw input.
|
||||
if re.fullmatch(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}", since):
|
||||
return since
|
||||
return since
|
||||
if dt.tzinfo is not None:
|
||||
dt = dt.astimezone(timezone.utc).replace(tzinfo=None)
|
||||
return dt.strftime(_STORAGE_TIMESTAMP_FORMAT)
|
||||
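The normalization contract above (accept the common ISO 8601 shapes, emit the space-separated UTC storage format, fail open on unparseable input) can be exercised standalone. A minimal sketch, assuming the module's `_STORAGE_TIMESTAMP_FORMAT` is `"%Y-%m-%d %H:%M:%S"` (the format string matching the stored `created_at` shape):

```python
from datetime import datetime, timezone

STORAGE_FMT = "%Y-%m-%d %H:%M:%S"  # assumed value of _STORAGE_TIMESTAMP_FORMAT

def normalize_since(since: str) -> str:
    """Coerce an ISO 8601 timestamp to the 'YYYY-MM-DD HH:MM:SS' UTC shape."""
    candidate = since.strip()
    if candidate.endswith("Z"):              # pre-3.11 fromisoformat rejects 'Z'
        candidate = candidate[:-1] + "+00:00"
    try:
        dt = datetime.fromisoformat(candidate)
    except ValueError:
        return since                         # fail open: return input unchanged
    if dt.tzinfo is not None:
        dt = dt.astimezone(timezone.utc).replace(tzinfo=None)
    return dt.strftime(STORAGE_FMT)

print(normalize_since("2026-04-06T14:30:00Z"))       # 2026-04-06 14:30:00
print(normalize_since("2026-04-06T16:30:00+02:00"))  # 2026-04-06 14:30:00
print(normalize_since("not a timestamp"))            # not a timestamp
```

Note that the bare storage format also round-trips unchanged, because `fromisoformat` accepts a space as the date/time separator.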
62
src/atocore/main.py
Normal file
@@ -0,0 +1,62 @@
"""AtoCore — FastAPI application entry point."""

from contextlib import asynccontextmanager

from fastapi import FastAPI

from atocore import __version__
from atocore.api.routes import router
import atocore.config as _config
from atocore.context.project_state import init_project_state_schema
from atocore.ingestion.pipeline import get_source_status
from atocore.models.database import init_db
from atocore.observability.logger import get_logger, setup_logging


log = get_logger("main")


@asynccontextmanager
async def lifespan(app: FastAPI):
    """Run setup before the first request and teardown after shutdown.

    Replaces the deprecated ``@app.on_event("startup")`` hook with the
    modern ``lifespan`` context manager. Setup runs synchronously (the
    underlying calls are blocking I/O) so no await is needed; the
    function must still be async per the FastAPI contract.
    """
    setup_logging()
    _config.ensure_runtime_dirs()
    init_db()
    init_project_state_schema()
    log.info(
        "startup_ready",
        env=_config.settings.env,
        db_path=str(_config.settings.db_path),
        chroma_path=str(_config.settings.chroma_path),
        source_status=get_source_status(),
    )
    yield
    # No teardown work needed today; SQLite connections are short-lived
    # and the Chroma client cleans itself up on process exit.


app = FastAPI(
    title="AtoCore",
    description="Personal Context Engine for LLM interactions",
    version=__version__,
    lifespan=lifespan,
)

app.include_router(router)


if __name__ == "__main__":
    import uvicorn

    uvicorn.run(
        "atocore.main:app",
        host=_config.settings.host,
        port=_config.settings.port,
        reload=True,
    )
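The lifespan hook above follows the standard `asynccontextmanager` pattern: setup runs before `yield`, teardown after. A minimal stdlib-only sketch of the same control flow, with no FastAPI dependency (the `events` list and `serve` driver are illustrative stand-ins for the framework's serving loop):

```python
import asyncio
from contextlib import asynccontextmanager

events: list[str] = []

@asynccontextmanager
async def lifespan(app: object):
    events.append("setup")      # runs before the app serves its first request
    yield
    events.append("teardown")   # runs after shutdown

async def serve():
    # FastAPI drives the context manager the same way, wrapped
    # around the whole serving lifetime of the application.
    async with lifespan(app=None):
        events.append("handling requests")

asyncio.run(serve())
print(events)  # ['setup', 'handling requests', 'teardown']
```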
0
src/atocore/memory/__init__.py
Normal file
242
src/atocore/memory/extractor.py
Normal file
@@ -0,0 +1,242 @@
"""Rule-based candidate-memory extraction from captured interactions.

Phase 9 Commit C. This module reads an interaction's response text and
produces a list of *candidate* memories that a human can later review
and either promote to active or reject. Nothing extracted here is ever
automatically promoted into trusted state — the AtoCore trust rule is
that bad memory is worse than no memory, so the extractor is
conservative on purpose.

Design rules for V0
-------------------
1. Rule-based only. No LLM calls. The extractor should be fast, cheap,
   fully explainable, and produce the same output for the same input
   across runs.
2. Patterns match obvious, high-signal structures and are intentionally
   narrow. False positives are more harmful than false negatives because
   every candidate means review work for a human.
3. Every extracted candidate records which pattern fired and which text
   span it came from, so a reviewer can audit the extractor's reasoning.
4. Patterns should feel like idioms the user already writes in their
   PKM and interaction notes:
   * ``## Decision: ...`` and variants
   * ``## Constraint: ...`` and variants
   * ``I prefer <X>`` / ``the user prefers <X>``
   * ``decided to <X>``
   * ``<X> is a requirement`` / ``requirement: <X>``
5. Candidates are de-duplicated against already-active memories of the
   same type+project so review queues don't fill up with things the
   user has already curated.

The extractor produces ``MemoryCandidate`` objects. The caller decides
whether to persist them via ``create_memory(..., status="candidate")``.
Persistence is kept out of the extractor itself so it can be tested
without touching the database and so future extractors (LLM-based,
structural, ontology-driven) can be swapped in cleanly.
"""

from __future__ import annotations

import re
from dataclasses import dataclass

from atocore.interactions.service import Interaction
from atocore.memory.service import MEMORY_TYPES, get_memories
from atocore.observability.logger import get_logger

log = get_logger("extractor")


# Bumped whenever the rule set, regex shapes, or post-processing
# semantics change in a way that could affect candidate output. The
# promotion-rules doc requires every candidate to record the version
# of the extractor that produced it so old candidates can be re-evaluated
# (or kept as-is) when the rules evolve.
#
# History:
#   0.1.0 - initial Phase 9 Commit C rule set (Apr 6, 2026)
EXTRACTOR_VERSION = "0.1.0"


# Every candidate is attributed to the rule that fired so reviewers can
# audit why it was proposed.
@dataclass
class MemoryCandidate:
    memory_type: str
    content: str
    rule: str
    source_span: str
    project: str = ""
    confidence: float = 0.5  # default review-queue confidence
    source_interaction_id: str = ""
    extractor_version: str = EXTRACTOR_VERSION


# ---------------------------------------------------------------------------
# Pattern definitions
# ---------------------------------------------------------------------------
#
# Each rule tuple holds, in order:
#   - a short human-readable rule id
#   - the memory type the candidate should land in
#   - a compiled regex over the response text
#
# Regexes are intentionally anchored to obvious structural cues so random
# prose doesn't light them up. All are case-insensitive; the heading
# rules are additionally MULTILINE so they anchor to individual lines.

_RULES: list[tuple[str, str, re.Pattern]] = [
    (
        "decision_heading",
        "adaptation",
        re.compile(
            r"^[ \t]*#{1,6}[ \t]*decision[ \t]*[:\-\u2014][ \t]*(?P<value>.+?)$",
            re.IGNORECASE | re.MULTILINE,
        ),
    ),
    (
        "constraint_heading",
        "project",
        re.compile(
            r"^[ \t]*#{1,6}[ \t]*constraint[ \t]*[:\-\u2014][ \t]*(?P<value>.+?)$",
            re.IGNORECASE | re.MULTILINE,
        ),
    ),
    (
        "requirement_heading",
        "project",
        re.compile(
            r"^[ \t]*#{1,6}[ \t]*requirement[ \t]*[:\-\u2014][ \t]*(?P<value>.+?)$",
            re.IGNORECASE | re.MULTILINE,
        ),
    ),
    (
        "fact_heading",
        "knowledge",
        re.compile(
            r"^[ \t]*#{1,6}[ \t]*fact[ \t]*[:\-\u2014][ \t]*(?P<value>.+?)$",
            re.IGNORECASE | re.MULTILINE,
        ),
    ),
    (
        "preference_sentence",
        "preference",
        re.compile(
            r"(?:^|[\s\.])(?:I|the user)\s+prefer(?:s)?\s+(?P<value>[^\n\.\!]{6,200})",
            re.IGNORECASE,
        ),
    ),
    (
        "decided_to_sentence",
        "adaptation",
        re.compile(
            r"(?:^|[\s\.])(?:I|we|the user)\s+decided\s+to\s+(?P<value>[^\n\.\!]{6,200})",
            re.IGNORECASE,
        ),
    ),
    (
        "requirement_sentence",
        "project",
        re.compile(
            r"(?:^|[\s\.])(?:the[ \t]+)?requirement\s+(?:is|was)\s+(?P<value>[^\n\.\!]{6,200})",
            re.IGNORECASE,
        ),
    ),
]

# A minimum content length after trimming stops silly one-word candidates.
_MIN_CANDIDATE_LENGTH = 8
# A maximum content length keeps candidates reviewable at a glance.
_MAX_CANDIDATE_LENGTH = 280


def extract_candidates_from_interaction(
    interaction: Interaction,
) -> list[MemoryCandidate]:
    """Return a list of candidate memories for human review.

    The returned candidates are not persisted. The caller can iterate
    over the result and call ``create_memory(..., status="candidate")``
    for each one it wants to land.
    """
    text = _combined_response_text(interaction)
    if not text:
        return []

    raw_candidates: list[MemoryCandidate] = []
    seen_spans: set[tuple[str, str, str]] = set()  # (type, normalized_value, rule)

    for rule_id, memory_type, pattern in _RULES:
        for match in pattern.finditer(text):
            value = _clean_value(match.group("value"))
            if len(value) < _MIN_CANDIDATE_LENGTH or len(value) > _MAX_CANDIDATE_LENGTH:
                continue
            normalized = value.lower()
            dedup_key = (memory_type, normalized, rule_id)
            if dedup_key in seen_spans:
                continue
            seen_spans.add(dedup_key)
            raw_candidates.append(
                MemoryCandidate(
                    memory_type=memory_type,
                    content=value,
                    rule=rule_id,
                    source_span=match.group(0).strip(),
                    project=interaction.project or "",
                    confidence=0.5,
                    source_interaction_id=interaction.id,
                )
            )

    # Drop anything that duplicates an already-active memory of the
    # same type and project so reviewers aren't asked to re-curate
    # things they already promoted.
    filtered = [c for c in raw_candidates if not _matches_existing_active(c)]

    if filtered:
        log.info(
            "extraction_produced_candidates",
            interaction_id=interaction.id,
            candidate_count=len(filtered),
            dropped_as_duplicate=len(raw_candidates) - len(filtered),
        )
    return filtered


def _combined_response_text(interaction: Interaction) -> str:
    parts: list[str] = []
    if interaction.response:
        parts.append(interaction.response)
    if interaction.response_summary:
        parts.append(interaction.response_summary)
    return "\n".join(parts).strip()


def _clean_value(raw: str) -> str:
    """Trim whitespace, strip trailing punctuation, collapse inner spaces."""
    cleaned = re.sub(r"\s+", " ", raw).strip()
    # Strip punctuation that commonly trails sentences but is not part
    # of the fact itself.
    cleaned = cleaned.rstrip(".;,!?\u2014-")
    return cleaned.strip()


def _matches_existing_active(candidate: MemoryCandidate) -> bool:
    """Return True if an identical active memory already exists."""
    if candidate.memory_type not in MEMORY_TYPES:
        return False
    try:
        existing = get_memories(
            memory_type=candidate.memory_type,
            project=candidate.project or None,
            active_only=True,
            limit=200,
        )
    except Exception as exc:  # pragma: no cover - defensive
        log.error("extractor_existing_lookup_failed", error=str(exc))
        return False
    needle = candidate.content.lower()
    for mem in existing:
        if mem.content.lower() == needle:
            return True
    return False
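The heading rules above can be exercised in isolation. This sketch inlines one compiled pattern with the same shape as the `decision_heading` rule and runs it against a sample response (the sample text is illustrative, not from the test suite):

```python
import re

# Same shape as the extractor's decision_heading rule.
DECISION = re.compile(
    r"^[ \t]*#{1,6}[ \t]*decision[ \t]*[:\-\u2014][ \t]*(?P<value>.+?)$",
    re.IGNORECASE | re.MULTILINE,
)

response = """\
Some narrative prose that should not match.

## Decision: store all timestamps in UTC
### decision - keep the extractor rule-based in V0
"""

values = [m.group("value").strip() for m in DECISION.finditer(response)]
print(values)
# ['store all timestamps in UTC', 'keep the extractor rule-based in V0']
```

Because the pattern requires a heading marker plus a `Decision:`-style label before capturing, ordinary prose never fires it, which is exactly the false-positive-averse behavior rule 2 calls for.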
241
src/atocore/memory/reinforcement.py
Normal file
@@ -0,0 +1,241 @@
"""Reinforce active memories from captured interactions (Phase 9 Commit B).

When an interaction is captured with a non-empty response, this module
scans the response text against currently-active memories and bumps the
confidence of any memory whose content appears in the response. The
intent is to surface a weak signal that the LLM actually relied on a
given memory, without ever promoting anything new into trusted state.

Design notes
------------
- Matching uses token overlap: tokenize both sides (lowercase, stem,
  drop stop words), then check whether >= 70% of the memory's content
  tokens appear in the response token set. This handles natural
  paraphrases (e.g. "prefers" vs "prefer", "because history" vs
  "because the history") that substring matching missed.
- Candidates and invalidated memories are NEVER considered — reinforcement
  must not revive history.
- Reinforcement is capped at 1.0 and monotonically non-decreasing.
- The function is idempotent with respect to a single call but will
  accumulate confidence across multiple calls; that is intentional — if
  the same memory is mentioned in 10 separate conversations it is, by
  definition, more confidently useful.
"""

from __future__ import annotations

import re
from dataclasses import dataclass

from atocore.interactions.service import Interaction
from atocore.memory.service import (
    Memory,
    get_memories,
    reinforce_memory,
)
from atocore.observability.logger import get_logger

log = get_logger("reinforcement")

# Minimum memory content length to consider for matching. Too-short
# memories (e.g. "use SI") would otherwise fire on almost every response
# and generate noise. 12 characters is long enough to require real
# semantic content but short enough to match one-liner identity
# memories like "prefers Python".
_MIN_MEMORY_CONTENT_LENGTH = 12

# Token-overlap matching constants.
_STOP_WORDS: frozenset[str] = frozenset({
    "the", "a", "an", "and", "or", "of", "to", "is", "was",
    "that", "this", "with", "for", "from", "into",
})
_MATCH_THRESHOLD = 0.70

# Long memories can't realistically hit 70% overlap through organic
# paraphrase — a 40-token memory would need 28 stemmed tokens echoed
# verbatim. Above this token count the matcher switches to an absolute
# overlap floor plus a softer fraction floor so paragraph-length memories
# still reinforce when the response genuinely uses them.
_LONG_MEMORY_TOKEN_COUNT = 15
_LONG_MODE_MIN_OVERLAP = 12
_LONG_MODE_MIN_FRACTION = 0.35

DEFAULT_CONFIDENCE_DELTA = 0.02


@dataclass
class ReinforcementResult:
    memory_id: str
    memory_type: str
    old_confidence: float
    new_confidence: float


def reinforce_from_interaction(
    interaction: Interaction,
    confidence_delta: float = DEFAULT_CONFIDENCE_DELTA,
) -> list[ReinforcementResult]:
    """Scan an interaction's response for active-memory mentions.

    Returns the list of memories that were reinforced. An empty list is
    returned if the interaction has no response content, if no memories
    match, or if the interaction has no project scope and the global
    active set is empty.
    """
    response_text = _combined_response_text(interaction)
    if not response_text:
        return []

    normalized_response = _normalize(response_text)
    if not normalized_response:
        return []

    # Fetch the candidate pool of active memories. We cast a wide net
    # here: project-scoped memories for the interaction's project first,
    # plus identity and preference memories, which are global by nature.
    candidate_pool: list[Memory] = []
    seen_ids: set[str] = set()

    def _add_batch(batch: list[Memory]) -> None:
        for mem in batch:
            if mem.id in seen_ids:
                continue
            seen_ids.add(mem.id)
            candidate_pool.append(mem)

    if interaction.project:
        _add_batch(get_memories(project=interaction.project, active_only=True, limit=200))
    _add_batch(get_memories(memory_type="identity", active_only=True, limit=50))
    _add_batch(get_memories(memory_type="preference", active_only=True, limit=50))

    reinforced: list[ReinforcementResult] = []
    for memory in candidate_pool:
        if not _memory_matches(memory.content, normalized_response):
            continue
        applied, old_conf, new_conf = reinforce_memory(
            memory.id, confidence_delta=confidence_delta
        )
        if not applied:
            continue
        reinforced.append(
            ReinforcementResult(
                memory_id=memory.id,
                memory_type=memory.memory_type,
                old_confidence=old_conf,
                new_confidence=new_conf,
            )
        )

    if reinforced:
        log.info(
            "reinforcement_applied",
            interaction_id=interaction.id,
            project=interaction.project,
            reinforced_count=len(reinforced),
        )
    return reinforced


def _combined_response_text(interaction: Interaction) -> str:
    """Pick the best available response text from an interaction."""
    parts: list[str] = []
    if interaction.response:
        parts.append(interaction.response)
    if interaction.response_summary:
        parts.append(interaction.response_summary)
    return "\n".join(parts).strip()


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace before tokenization."""
    if not text:
        return ""
    lowered = text.lower()
    # Collapse any run of whitespace (including newlines and tabs) to
    # a single space so multi-line responses match single-line memories.
    collapsed = re.sub(r"\s+", " ", lowered)
    return collapsed.strip()


def _stem(word: str) -> str:
    """Aggressive suffix folding so inflected forms collapse.

    Handles trailing ``ing``, ``ed``, and ``s`` — good enough for
    reinforcement matching without pulling in nltk/snowball.
    """
    # Order matters: try the longest suffix first.
    if word.endswith("ing") and len(word) >= 6:
        return word[:-3]
    if word.endswith("ed") and len(word) > 4:
        stem = word[:-2]
        # "preferred" → "preferr" → "prefer" (doubled consonant before -ed)
        if len(stem) >= 3 and stem[-1] == stem[-2]:
            stem = stem[:-1]
        return stem
    if word.endswith("s") and len(word) > 3:
        return word[:-1]
    return word


def _tokenize(text: str) -> set[str]:
    """Split normalized text into a stemmed token set.

    Strips punctuation, drops words shorter than 3 chars and stop
    words. Hyphenated and slash-separated identifiers
    (``polisher-control``, ``twyman-green``, ``2-projects/interferometer``)
    produce both the full form AND each sub-token, so a query for
    "polisher control" can match a memory that wrote
    "polisher-control" without forcing callers to guess the exact
    hyphenation.
    """
    tokens: set[str] = set()
    for raw in text.split():
        word = raw.strip(".,;:!?\"'()[]{}-/")
        if not word:
            continue
        _add_token(tokens, word)
        # Also add sub-tokens split on internal '-' or '/' so
        # hyphenated identifiers match queries that don't hyphenate.
        if "-" in word or "/" in word:
            for sub in re.split(r"[-/]+", word):
                _add_token(tokens, sub)
    return tokens


def _add_token(tokens: set[str], word: str) -> None:
    if len(word) < 3:
        return
    if word in _STOP_WORDS:
        return
    tokens.add(_stem(word))


def _memory_matches(memory_content: str, normalized_response: str) -> bool:
    """Return True if enough of the memory's tokens appear in the response.

    Dual-mode token overlap:
    - Short memories (<= _LONG_MEMORY_TOKEN_COUNT stems): require
      >= 70% of memory tokens echoed.
    - Long memories (paragraphs): require an absolute floor of
      _LONG_MODE_MIN_OVERLAP distinct stems echoed AND a softer
      fraction of _LONG_MODE_MIN_FRACTION, so organic paraphrase
      of a real project memory can reinforce without the response
      quoting the paragraph verbatim.
    """
    if not memory_content:
        return False
    normalized_memory = _normalize(memory_content)
    if len(normalized_memory) < _MIN_MEMORY_CONTENT_LENGTH:
        return False
    memory_tokens = _tokenize(normalized_memory)
    if not memory_tokens:
        return False
    response_tokens = _tokenize(normalized_response)
    overlap = memory_tokens & response_tokens
    fraction = len(overlap) / len(memory_tokens)
    if len(memory_tokens) <= _LONG_MEMORY_TOKEN_COUNT:
        return fraction >= _MATCH_THRESHOLD
    return (
        len(overlap) >= _LONG_MODE_MIN_OVERLAP
        and fraction >= _LONG_MODE_MIN_FRACTION
    )
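The short-memory branch of the overlap test above reduces to simple set arithmetic. A self-contained sketch with the same suffix rules as `_stem` and the 70% threshold, on a toy memory and response (both strings are illustrative):

```python
STOP = {"the", "a", "an", "and", "or", "of", "to", "is", "was",
        "that", "this", "with", "for", "from", "into"}

def stem(w: str) -> str:
    # Longest suffix first, mirroring the module's _stem rules.
    if w.endswith("ing") and len(w) >= 6:
        return w[:-3]
    if w.endswith("ed") and len(w) > 4:
        s = w[:-2]
        if len(s) >= 3 and s[-1] == s[-2]:
            s = s[:-1]          # "preferred" -> "prefer"
        return s
    if w.endswith("s") and len(w) > 3:
        return w[:-1]
    return w

def tokenize(text: str) -> set[str]:
    out: set[str] = set()
    for raw in text.lower().split():
        w = raw.strip(".,;:!?\"'()[]{}-/")
        if len(w) >= 3 and w not in STOP:
            out.add(stem(w))
    return out

memory = "prefers Python for scripting"
response = "I went with Python here because the user clearly prefers scripting in Python."

mem, resp = tokenize(memory), tokenize(response)
fraction = len(mem & resp) / len(mem)
print(fraction >= 0.70)  # True: enough of the memory's stems are echoed
```

Here "prefers" and "scripting" stem to "prefer" and "script" on both sides, so every memory token survives the paraphrase even though no exact substring of the memory appears in the response.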
478
src/atocore/memory/service.py
Normal file
@@ -0,0 +1,478 @@
|
||||
"""Memory Core — structured memory management.
|
||||
|
||||
Memory types (per Master Plan):
|
||||
- identity: who the user is, role, background
|
||||
- preference: how they like to work, style, tools
|
||||
- project: project-specific knowledge and context
|
||||
- episodic: what happened, conversations, events
|
||||
- knowledge: verified facts, technical knowledge
|
||||
- adaptation: learned corrections, behavioral adjustments
|
||||
|
||||
Memories have:
|
||||
- confidence (0.0–1.0): how certain we are
|
||||
- status: lifecycle state, one of MEMORY_STATUSES
|
||||
* candidate: extracted from an interaction, awaiting human review
|
||||
(Phase 9 Commit C). Candidates are NEVER included in
|
||||
context packs.
|
||||
* active: promoted/curated, visible to retrieval and context
|
||||
* superseded: replaced by a newer entry
|
||||
* invalid: rejected / error-corrected
|
||||
- last_referenced_at / reference_count: reinforcement signal
|
||||
(Phase 9 Commit B). Bumped whenever a captured interaction's
|
||||
response content echoes this memory.
|
||||
- optional link to source chunk: traceability
|
||||
"""
|
||||
|
||||
import uuid
|
||||
from dataclasses import dataclass
|
||||
from datetime import datetime, timezone
|
||||
|
||||
from atocore.models.database import get_connection
|
||||
from atocore.observability.logger import get_logger
|
||||
from atocore.projects.registry import resolve_project_name
|
||||
|
||||
log = get_logger("memory")
|
||||
|
||||
MEMORY_TYPES = [
|
||||
"identity",
|
||||
"preference",
|
||||
"project",
|
||||
"episodic",
|
||||
"knowledge",
|
||||
"adaptation",
|
||||
]
|
||||
|
||||
MEMORY_STATUSES = [
|
||||
"candidate",
|
||||
"active",
|
||||
"superseded",
|
||||
"invalid",
|
||||
]
|
||||
|
||||
|
||||
@dataclass
|
||||
class Memory:
|
||||
id: str
|
||||
memory_type: str
|
||||
content: str
|
||||
project: str
|
||||
source_chunk_id: str
|
||||
confidence: float
|
||||
status: str
|
||||
created_at: str
|
||||
updated_at: str
|
||||
last_referenced_at: str = ""
|
||||
reference_count: int = 0
|
||||
|
||||
|
||||
def create_memory(
    memory_type: str,
    content: str,
    project: str = "",
    source_chunk_id: str = "",
    confidence: float = 1.0,
    status: str = "active",
) -> Memory:
    """Create a new memory entry.

    ``status`` defaults to ``active`` for backward compatibility. Pass
    ``candidate`` when the memory is being proposed by the Phase 9 Commit C
    extractor and still needs human review before it can influence context.
    """
    if memory_type not in MEMORY_TYPES:
        raise ValueError(f"Invalid memory type '{memory_type}'. Must be one of: {MEMORY_TYPES}")
    if status not in MEMORY_STATUSES:
        raise ValueError(f"Invalid status '{status}'. Must be one of: {MEMORY_STATUSES}")
    _validate_confidence(confidence)

    # Canonicalize the project through the registry so an alias and
    # the canonical id store under the same bucket. This keeps
    # reinforcement queries (which use the interaction's project) and
    # context retrieval (which uses the registry-canonicalized hint)
    # consistent with how memories are created.
    project = resolve_project_name(project)

    memory_id = str(uuid.uuid4())
    now = datetime.now(timezone.utc).isoformat()

    # Check for duplicate content within the same type+project at the same status.
    # Scoping by status keeps active curation separate from the candidate
    # review queue: a candidate and an active memory with identical text can
    # legitimately coexist if the candidate is a fresh extraction of something
    # already curated.
    with get_connection() as conn:
        existing = conn.execute(
            "SELECT id FROM memories "
            "WHERE memory_type = ? AND content = ? AND project = ? AND status = ?",
            (memory_type, content, project, status),
        ).fetchone()
        if existing:
            log.info(
                "memory_duplicate_skipped",
                memory_type=memory_type,
                status=status,
                content_preview=content[:80],
            )
            return _row_to_memory(
                conn.execute("SELECT * FROM memories WHERE id = ?", (existing["id"],)).fetchone()
            )

        conn.execute(
            "INSERT INTO memories (id, memory_type, content, project, source_chunk_id, confidence, status) "
            "VALUES (?, ?, ?, ?, ?, ?, ?)",
            (memory_id, memory_type, content, project, source_chunk_id or None, confidence, status),
        )

    log.info(
        "memory_created",
        memory_type=memory_type,
        status=status,
        content_preview=content[:80],
    )

    return Memory(
        id=memory_id,
        memory_type=memory_type,
        content=content,
        project=project,
        source_chunk_id=source_chunk_id,
        confidence=confidence,
        status=status,
        created_at=now,
        updated_at=now,
        last_referenced_at="",
        reference_count=0,
    )


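The duplicate check in `create_memory` is scoped by `(memory_type, content, project, status)`, so identical text may exist once per status. A minimal standalone sketch of that lookup against a throwaway in-memory table (the table and helper here are illustrative, not the project's real schema or API):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories (id TEXT, memory_type TEXT, content TEXT, "
    "project TEXT, status TEXT)"
)

def exists(memory_type: str, content: str, project: str, status: str) -> bool:
    # Same WHERE clause shape as create_memory's duplicate probe:
    # scoping by status keeps candidate and active rows separate.
    return conn.execute(
        "SELECT id FROM memories "
        "WHERE memory_type = ? AND content = ? AND project = ? AND status = ?",
        (memory_type, content, project, status),
    ).fetchone() is not None

conn.execute("INSERT INTO memories VALUES ('m1', 'preference', 'tabs', '', 'active')")
print(exists("preference", "tabs", "", "active"))     # True: insert would be skipped
print(exists("preference", "tabs", "", "candidate"))  # False: a candidate may coexist
```

This is why a fresh candidate extraction of an already-curated fact does not collide with the active copy.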
def get_memories(
    memory_type: str | None = None,
    project: str | None = None,
    active_only: bool = True,
    min_confidence: float = 0.0,
    limit: int = 50,
    status: str | None = None,
) -> list[Memory]:
    """Retrieve memories, optionally filtered.

    When ``status`` is provided explicitly, it takes precedence over
    ``active_only`` so callers can list the candidate review queue via
    ``get_memories(status='candidate')``. When ``status`` is omitted the
    legacy ``active_only`` behaviour still applies.
    """
    if status is not None and status not in MEMORY_STATUSES:
        raise ValueError(f"Invalid status '{status}'. Must be one of: {MEMORY_STATUSES}")

    query = "SELECT * FROM memories WHERE 1=1"
    params: list = []

    if memory_type:
        query += " AND memory_type = ?"
        params.append(memory_type)
    if project is not None:
        # Canonicalize on the read side so a caller passing an alias
        # finds rows that were stored under the canonical id (and
        # vice versa). resolve_project_name returns the input
        # unchanged for unregistered names so empty-string queries
        # for "no project scope" still work.
        query += " AND project = ?"
        params.append(resolve_project_name(project))
    if status is not None:
        query += " AND status = ?"
        params.append(status)
    elif active_only:
        query += " AND status = 'active'"
    if min_confidence > 0:
        query += " AND confidence >= ?"
        params.append(min_confidence)

    query += " ORDER BY confidence DESC, updated_at DESC LIMIT ?"
    params.append(limit)

    with get_connection() as conn:
        rows = conn.execute(query, params).fetchall()

    return [_row_to_memory(r) for r in rows]


def update_memory(
    memory_id: str,
    content: str | None = None,
    confidence: float | None = None,
    status: str | None = None,
) -> bool:
    """Update an existing memory."""
    with get_connection() as conn:
        existing = conn.execute("SELECT * FROM memories WHERE id = ?", (memory_id,)).fetchone()
        if existing is None:
            return False

        next_content = content if content is not None else existing["content"]
        next_status = status if status is not None else existing["status"]
        if confidence is not None:
            _validate_confidence(confidence)

        if next_status == "active":
            duplicate = conn.execute(
                "SELECT id FROM memories "
                "WHERE memory_type = ? AND content = ? AND project = ? AND status = 'active' AND id != ?",
                (existing["memory_type"], next_content, existing["project"] or "", memory_id),
            ).fetchone()
            if duplicate:
                raise ValueError("Update would create a duplicate active memory")

        updates = []
        params: list = []

        if content is not None:
            updates.append("content = ?")
            params.append(content)
        if confidence is not None:
            updates.append("confidence = ?")
            params.append(confidence)
        if status is not None:
            if status not in MEMORY_STATUSES:
                raise ValueError(f"Invalid status '{status}'. Must be one of: {MEMORY_STATUSES}")
            updates.append("status = ?")
            params.append(status)

        if not updates:
            return False

        updates.append("updated_at = CURRENT_TIMESTAMP")
        params.append(memory_id)

        result = conn.execute(
            f"UPDATE memories SET {', '.join(updates)} WHERE id = ?",
            params,
        )

        if result.rowcount > 0:
            log.info("memory_updated", memory_id=memory_id)
            return True
        return False


def invalidate_memory(memory_id: str) -> bool:
    """Mark a memory as invalid (error correction)."""
    return update_memory(memory_id, status="invalid")


def supersede_memory(memory_id: str) -> bool:
    """Mark a memory as superseded (replaced by newer info)."""
    return update_memory(memory_id, status="superseded")


def promote_memory(memory_id: str) -> bool:
    """Promote a candidate memory to active (Phase 9 Commit C review queue).

    Returns False if the memory does not exist or is not currently a
    candidate. Raises ValueError only if the promotion would create a
    duplicate active memory (delegates to update_memory's existing check).
    """
    with get_connection() as conn:
        row = conn.execute(
            "SELECT status FROM memories WHERE id = ?", (memory_id,)
        ).fetchone()
        if row is None:
            return False
        if row["status"] != "candidate":
            return False
    return update_memory(memory_id, status="active")


def reject_candidate_memory(memory_id: str) -> bool:
    """Reject a candidate memory (Phase 9 Commit C).

    Sets the candidate's status to ``invalid`` so it drops out of the
    review queue without polluting the active set. Returns False if the
    memory does not exist or is not currently a candidate.
    """
    with get_connection() as conn:
        row = conn.execute(
            "SELECT status FROM memories WHERE id = ?", (memory_id,)
        ).fetchone()
        if row is None:
            return False
        if row["status"] != "candidate":
            return False
    return update_memory(memory_id, status="invalid")


def reinforce_memory(
    memory_id: str,
    confidence_delta: float = 0.02,
) -> tuple[bool, float, float]:
    """Bump a memory's confidence and reference count (Phase 9 Commit B).

    Returns a 3-tuple ``(applied, old_confidence, new_confidence)``.
    ``applied`` is False if the memory does not exist or is not in the
    ``active`` state — reinforcement only touches live memories so the
    candidate queue and invalidated history are never silently revived.

    Confidence is capped at 1.0. last_referenced_at is set to the current
    UTC time in SQLite-comparable format. reference_count is incremented
    by one per call (not per delta amount).
    """
    if confidence_delta < 0:
        raise ValueError("confidence_delta must be non-negative for reinforcement")
    now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    with get_connection() as conn:
        row = conn.execute(
            "SELECT confidence, status FROM memories WHERE id = ?", (memory_id,)
        ).fetchone()
        if row is None or row["status"] != "active":
            return False, 0.0, 0.0
        old_confidence = float(row["confidence"])
        new_confidence = min(1.0, old_confidence + confidence_delta)
        conn.execute(
            "UPDATE memories SET confidence = ?, last_referenced_at = ?, "
            "reference_count = COALESCE(reference_count, 0) + 1 "
            "WHERE id = ?",
            (new_confidence, now, memory_id),
        )
    log.info(
        "memory_reinforced",
        memory_id=memory_id,
        old_confidence=round(old_confidence, 4),
        new_confidence=round(new_confidence, 4),
    )
    return True, old_confidence, new_confidence


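The reinforcement arithmetic above is just a capped additive bump. A minimal standalone sketch (hypothetical helper name, not part of the module):

```python
def reinforce(confidence: float, delta: float = 0.02) -> float:
    """Additive reinforcement with a hard 1.0 ceiling."""
    if delta < 0:
        raise ValueError("delta must be non-negative")
    return min(1.0, confidence + delta)

# A memory near the ceiling saturates rather than overflowing:
print(reinforce(0.99, 0.05))  # 1.0
```

With the default delta of 0.02, a memory needs many distinct references to climb meaningfully, which keeps one noisy interaction from pinning a weak memory at full confidence.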
def get_memories_for_context(
    memory_types: list[str] | None = None,
    project: str | None = None,
    budget: int = 500,
    header: str = "--- AtoCore Memory ---",
    footer: str = "--- End Memory ---",
    query: str | None = None,
) -> tuple[str, int]:
    """Get formatted memories for context injection.

    Returns (formatted_text, char_count).

    Budget allocation per Master Plan section 9:
    identity: 5%, preference: 5%, rest from retrieval budget

    The caller can override ``header`` / ``footer`` to distinguish
    multiple memory blocks in the same pack (e.g. identity/preference
    vs project/knowledge memories).

    When ``query`` is provided, candidates within each memory type
    are ranked by lexical overlap against the query (stemmed token
    intersection, ties broken by confidence). Without a query,
    candidates fall through in the order ``get_memories`` returns
    them — which is effectively "by confidence desc".
    """
    if memory_types is None:
        memory_types = ["identity", "preference"]

    if budget <= 0:
        return "", 0
    wrapper_chars = len(header) + len(footer) + 2
    if budget <= wrapper_chars:
        return "", 0

    available = budget - wrapper_chars
    selected_entries: list[str] = []
    used = 0

    # Pre-tokenize the query once. ``_rank_memories_for_query`` is a
    # free function below that reuses the reinforcement tokenizer so
    # lexical scoring here matches the reinforcement matcher.
    query_tokens: set[str] | None = None
    if query:
        from atocore.memory.reinforcement import _normalize, _tokenize

        query_tokens = _tokenize(_normalize(query))
        if not query_tokens:
            query_tokens = None

    # Collect ALL candidates across the requested types into one
    # pool, then rank globally before the budget walk. Ranking per
    # type and walking types in order would starve later types when
    # the first type's candidates filled the budget — even if a
    # later-type candidate matched the query perfectly. Type order
    # is preserved as a stable tiebreaker inside
    # ``_rank_memories_for_query`` via Python's stable sort.
    pool: list[Memory] = []
    seen_ids: set[str] = set()
    for mtype in memory_types:
        for mem in get_memories(
            memory_type=mtype,
            project=project,
            min_confidence=0.5,
            limit=30,
        ):
            if mem.id in seen_ids:
                continue
            seen_ids.add(mem.id)
            pool.append(mem)

    if query_tokens is not None:
        pool = _rank_memories_for_query(pool, query_tokens)

    for mem in pool:
        entry = f"[{mem.memory_type}] {mem.content}"
        entry_len = len(entry) + 1
        if entry_len > available - used:
            continue
        selected_entries.append(entry)
        used += entry_len

    if not selected_entries:
        return "", 0

    lines = [header, *selected_entries, footer]
    text = "\n".join(lines)

    log.info("memories_for_context", count=len(selected_entries), chars=len(text))
    return text, len(text)


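The budget walk in `get_memories_for_context` is greedy: each entry costs its length plus one joining newline, and an entry that does not fit is skipped rather than truncated, so smaller later entries can still land. A hypothetical standalone sketch of just that loop:

```python
def pack_entries(entries: list[str], available: int) -> list[str]:
    """Greedily select entries that fit in an `available`-char budget."""
    selected: list[str] = []
    used = 0
    for entry in entries:
        cost = len(entry) + 1  # +1 for the newline used when joining
        if cost > available - used:
            continue  # skip, but keep scanning for smaller entries
        selected.append(entry)
        used += cost
    return selected

print(pack_entries(["aaaa", "bb", "c"], 6))  # ['aaaa'] — 'bb' and 'c' no longer fit
print(pack_entries(["aaaa", "bb", "c"], 9))  # ['aaaa', 'bb']
```

Note this is first-fit over a ranked list, not an optimal knapsack: ranking quality, not packing density, is what the real function optimizes for.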
def _rank_memories_for_query(
    memories: list["Memory"],
    query_tokens: set[str],
) -> list["Memory"]:
    """Rerank a memory list by lexical overlap with a pre-tokenized query.

    Ordering key: (overlap_count DESC, confidence DESC). When a query
    shares no tokens with a memory, overlap is zero and confidence
    acts as the sole tiebreaker — which matches the pre-query
    behaviour and keeps no-query calls stable.
    """
    from atocore.memory.reinforcement import _normalize, _tokenize

    scored: list[tuple[int, float, Memory]] = []
    for mem in memories:
        mem_tokens = _tokenize(_normalize(mem.content))
        overlap = len(mem_tokens & query_tokens) if mem_tokens else 0
        scored.append((overlap, mem.confidence, mem))
    scored.sort(key=lambda t: (t[0], t[1]), reverse=True)
    return [mem for _, _, mem in scored]


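A standalone sketch of the overlap reranking above, with a toy whitespace tokenizer standing in for the project's real stemmed `_tokenize`/`_normalize` pair (which is not shown here):

```python
def rank(items: list[tuple[str, float]], query: str) -> list[str]:
    """Order (text, confidence) pairs by (overlap desc, confidence desc)."""
    q = set(query.lower().split())
    scored = []
    for text, confidence in items:
        tokens = set(text.lower().split())
        overlap = len(tokens & q)
        scored.append((overlap, confidence, text))
    # Python's sort is stable, so items with equal (overlap, confidence)
    # keep their input order even with reverse=True.
    scored.sort(key=lambda t: (t[0], t[1]), reverse=True)
    return [text for _, _, text in scored]

print(rank([("likes rust", 0.9), ("prefers python typing", 0.6)],
           "python typing style"))
# ['prefers python typing', 'likes rust'] — overlap beats raw confidence
```

Sorting only the first two tuple fields matters: including the payload in the comparison would crash on non-comparable objects, which is why the real function sorts on `(t[0], t[1])` rather than the whole tuple.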
def _row_to_memory(row) -> Memory:
|
||||
"""Convert a DB row to Memory dataclass."""
|
||||
keys = row.keys() if hasattr(row, "keys") else []
|
||||
last_ref = row["last_referenced_at"] if "last_referenced_at" in keys else None
|
||||
ref_count = row["reference_count"] if "reference_count" in keys else 0
|
||||
return Memory(
|
||||
id=row["id"],
|
||||
memory_type=row["memory_type"],
|
||||
content=row["content"],
|
||||
project=row["project"] or "",
|
||||
source_chunk_id=row["source_chunk_id"] or "",
|
||||
confidence=row["confidence"],
|
||||
status=row["status"],
|
||||
created_at=row["created_at"],
|
||||
updated_at=row["updated_at"],
|
||||
last_referenced_at=last_ref or "",
|
||||
reference_count=int(ref_count or 0),
|
||||
)
|
||||
|
||||
|
||||
def _validate_confidence(confidence: float) -> None:
|
||||
if not 0.0 <= confidence <= 1.0:
|
||||
raise ValueError("Confidence must be between 0.0 and 1.0")
|
||||
0 src/atocore/models/__init__.py Normal file
175 src/atocore/models/database.py Normal file
@@ -0,0 +1,175 @@
"""SQLite database schema and connection management."""

import sqlite3
from contextlib import contextmanager
from pathlib import Path
from typing import Generator

import atocore.config as _config
from atocore.observability.logger import get_logger

log = get_logger("database")

SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS source_documents (
    id TEXT PRIMARY KEY,
    file_path TEXT UNIQUE NOT NULL,
    file_hash TEXT NOT NULL,
    title TEXT,
    doc_type TEXT DEFAULT 'markdown',
    tags TEXT DEFAULT '[]',
    ingested_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE IF NOT EXISTS source_chunks (
    id TEXT PRIMARY KEY,
    document_id TEXT NOT NULL REFERENCES source_documents(id) ON DELETE CASCADE,
    chunk_index INTEGER NOT NULL,
    content TEXT NOT NULL,
    heading_path TEXT DEFAULT '',
    char_count INTEGER NOT NULL,
    metadata TEXT DEFAULT '{}',
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE IF NOT EXISTS memories (
    id TEXT PRIMARY KEY,
    memory_type TEXT NOT NULL,
    content TEXT NOT NULL,
    project TEXT DEFAULT '',
    source_chunk_id TEXT REFERENCES source_chunks(id),
    confidence REAL DEFAULT 1.0,
    status TEXT DEFAULT 'active',
    last_referenced_at DATETIME,
    reference_count INTEGER DEFAULT 0,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE IF NOT EXISTS projects (
    id TEXT PRIMARY KEY,
    name TEXT UNIQUE NOT NULL,
    description TEXT DEFAULT '',
    status TEXT DEFAULT 'active',
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE IF NOT EXISTS interactions (
    id TEXT PRIMARY KEY,
    prompt TEXT NOT NULL,
    context_pack TEXT DEFAULT '{}',
    response_summary TEXT DEFAULT '',
    response TEXT DEFAULT '',
    memories_used TEXT DEFAULT '[]',
    chunks_used TEXT DEFAULT '[]',
    client TEXT DEFAULT '',
    session_id TEXT DEFAULT '',
    project TEXT DEFAULT '',
    project_id TEXT REFERENCES projects(id),
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

-- Indexes that reference columns guaranteed to exist since the first
-- release ship here. Indexes that reference columns added by later
-- migrations (memories.project, interactions.project,
-- interactions.session_id) are created inside _apply_migrations AFTER
-- the corresponding ALTER TABLE, NOT here. Creating them here would
-- fail on upgrade from a pre-migration schema because CREATE TABLE
-- IF NOT EXISTS is a no-op on an existing table, so the new columns
-- wouldn't be added before the CREATE INDEX runs.
CREATE INDEX IF NOT EXISTS idx_chunks_document ON source_chunks(document_id);
CREATE INDEX IF NOT EXISTS idx_memories_type ON memories(memory_type);
CREATE INDEX IF NOT EXISTS idx_memories_status ON memories(status);
CREATE INDEX IF NOT EXISTS idx_interactions_project ON interactions(project_id);
"""


def _ensure_data_dir() -> None:
    _config.ensure_runtime_dirs()


def init_db() -> None:
    """Initialize the database with schema."""
    _ensure_data_dir()
    with get_connection() as conn:
        conn.executescript(SCHEMA_SQL)
        _apply_migrations(conn)
    log.info("database_initialized", path=str(_config.settings.db_path))


def _apply_migrations(conn: sqlite3.Connection) -> None:
    """Apply lightweight schema migrations for existing local databases."""
    if not _column_exists(conn, "memories", "project"):
        conn.execute("ALTER TABLE memories ADD COLUMN project TEXT DEFAULT ''")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_memories_project ON memories(project)")

    # Phase 9 Commit B: reinforcement columns.
    # last_referenced_at records when a memory was most recently referenced
    # in a captured interaction; reference_count is a monotonically
    # increasing counter bumped on every reference. Together they let
    # Reflection (Commit C) and decay (deferred) reason about which
    # memories are actually being used versus which have gone cold.
    if not _column_exists(conn, "memories", "last_referenced_at"):
        conn.execute("ALTER TABLE memories ADD COLUMN last_referenced_at DATETIME")
    if not _column_exists(conn, "memories", "reference_count"):
        conn.execute("ALTER TABLE memories ADD COLUMN reference_count INTEGER DEFAULT 0")
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_memories_last_referenced ON memories(last_referenced_at)"
    )

    # Phase 9 Commit A: capture loop columns on the interactions table.
    # The original schema only carried prompt + project_id + a context_pack
    # JSON blob. To make interactions a real audit trail of what AtoCore fed
    # the LLM and what came back, we record the full response, the chunk
    # and memory ids that were actually used, plus client + session metadata.
    if not _column_exists(conn, "interactions", "response"):
        conn.execute("ALTER TABLE interactions ADD COLUMN response TEXT DEFAULT ''")
    if not _column_exists(conn, "interactions", "memories_used"):
        conn.execute("ALTER TABLE interactions ADD COLUMN memories_used TEXT DEFAULT '[]'")
    if not _column_exists(conn, "interactions", "chunks_used"):
        conn.execute("ALTER TABLE interactions ADD COLUMN chunks_used TEXT DEFAULT '[]'")
    if not _column_exists(conn, "interactions", "client"):
        conn.execute("ALTER TABLE interactions ADD COLUMN client TEXT DEFAULT ''")
    if not _column_exists(conn, "interactions", "session_id"):
        conn.execute("ALTER TABLE interactions ADD COLUMN session_id TEXT DEFAULT ''")
    if not _column_exists(conn, "interactions", "project"):
        conn.execute("ALTER TABLE interactions ADD COLUMN project TEXT DEFAULT ''")
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_interactions_session ON interactions(session_id)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_interactions_project_name ON interactions(project)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_interactions_created_at ON interactions(created_at)"
    )


def _column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row["name"] == column for row in rows)


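The `PRAGMA table_info` probe that drives these migrations can be exercised end to end against a throwaway in-memory database; this sketch reimplements the helper standalone rather than importing the project module:

```python
import sqlite3

def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    # PRAGMA table_info yields one row per column; "name" holds the
    # column name, which is what the migration checks before ALTER.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row["name"] == column for row in rows)

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # required so rows support row["name"]
conn.execute("CREATE TABLE memories (id TEXT PRIMARY KEY, content TEXT)")
print(column_exists(conn, "memories", "content"))   # True
print(column_exists(conn, "memories", "project"))   # False, pre-migration
conn.execute("ALTER TABLE memories ADD COLUMN project TEXT DEFAULT ''")
print(column_exists(conn, "memories", "project"))   # True, post-migration
```

Guarding each ALTER this way makes the migration idempotent: rerunning it against an already-migrated database is a no-op.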
@contextmanager
def get_connection() -> Generator[sqlite3.Connection, None, None]:
    """Get a database connection with row factory."""
    _ensure_data_dir()
    conn = sqlite3.connect(
        str(_config.settings.db_path),
        timeout=_config.settings.db_busy_timeout_ms / 1000,
    )
    conn.row_factory = sqlite3.Row
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(f"PRAGMA busy_timeout = {_config.settings.db_busy_timeout_ms}")
    conn.execute("PRAGMA journal_mode = WAL")
    conn.execute("PRAGMA synchronous = NORMAL")
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()
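The commit-on-success, rollback-on-error shape of `get_connection` can be shown in isolation. This sketch borrows an existing connection instead of opening and closing one (and skips the WAL/busy-timeout pragmas, which `:memory:` does not need), so it is an analogue, not the module's actual API:

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def tx(conn: sqlite3.Connection):
    """Commit the implicit transaction on success, roll back on error."""
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise

conn = sqlite3.connect(":memory:")
with tx(conn):
    conn.execute("CREATE TABLE t (x INTEGER)")
try:
    with tx(conn):
        conn.execute("INSERT INTO t VALUES (1)")
        raise RuntimeError("boom")  # simulate a mid-transaction failure
except RuntimeError:
    pass
# The failed INSERT was rolled back, so the table is still empty:
print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])
```

Re-raising after rollback matters: callers still see the original exception, but the database is left in its pre-transaction state.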
0 src/atocore/observability/__init__.py Normal file
43 src/atocore/observability/logger.py Normal file
@@ -0,0 +1,43 @@
"""Structured logging for AtoCore."""

import logging

import atocore.config as _config
import structlog

_LOG_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


def setup_logging() -> None:
    """Configure structlog with JSON output."""
    log_level = "DEBUG" if _config.settings.debug else "INFO"
    renderer = (
        structlog.dev.ConsoleRenderer()
        if _config.settings.debug
        else structlog.processors.JSONRenderer()
    )

    structlog.configure(
        processors=[
            structlog.contextvars.merge_contextvars,
            structlog.processors.add_log_level,
            structlog.processors.TimeStamper(fmt="iso"),
            renderer,
        ],
        wrapper_class=structlog.make_filtering_bound_logger(
            _LOG_LEVELS.get(log_level, logging.INFO)
        ),
        context_class=dict,
        logger_factory=structlog.PrintLoggerFactory(),
        cache_logger_on_first_use=True,
    )


def get_logger(name: str) -> structlog.BoundLogger:
    """Get a named logger."""
    return structlog.get_logger(name)
1 src/atocore/ops/__init__.py Normal file
@@ -0,0 +1 @@
"""Operational utilities for running AtoCore safely."""
636 src/atocore/ops/backup.py Normal file
@@ -0,0 +1,636 @@
"""Create safe runtime backups for the AtoCore machine store.

This module is intentionally conservative:

- The SQLite snapshot uses the online ``conn.backup()`` API and is safe to
  call while the database is in use.
- The project registry snapshot is a simple file copy of the canonical
  registry JSON.
- The Chroma snapshot is a *cold* directory copy. To stay safe it must be
  taken while no ingestion is running. The recommended pattern from the API
  layer is to acquire ``exclusive_ingestion()`` for the duration of the
  backup so refreshes and ingestions cannot run concurrently with the copy.

The backup metadata file records what was actually included so restore
tooling does not have to guess.
"""

from __future__ import annotations

import json
import shutil
import sqlite3
from datetime import datetime, UTC
from pathlib import Path

import atocore.config as _config
from atocore.models.database import init_db
from atocore.observability.logger import get_logger

log = get_logger("backup")


def create_runtime_backup(
    timestamp: datetime | None = None,
    include_chroma: bool = False,
) -> dict:
    """Create a hot SQLite backup plus registry/config metadata.

    When ``include_chroma`` is true the Chroma persistence directory is also
    snapshotted as a cold directory copy. The caller is responsible for
    ensuring no ingestion is running concurrently. The HTTP layer enforces
    this by holding ``exclusive_ingestion()`` around the call.
    """
    init_db()
    now = timestamp or datetime.now(UTC)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")

    backup_root = _config.settings.resolved_backup_dir / "snapshots" / stamp
    db_backup_dir = backup_root / "db"
    config_backup_dir = backup_root / "config"
    chroma_backup_dir = backup_root / "chroma"
    metadata_path = backup_root / "backup-metadata.json"

    db_backup_dir.mkdir(parents=True, exist_ok=True)
    config_backup_dir.mkdir(parents=True, exist_ok=True)

    db_snapshot_path = db_backup_dir / _config.settings.db_path.name
    _backup_sqlite_db(_config.settings.db_path, db_snapshot_path)

    registry_snapshot = None
    registry_path = _config.settings.resolved_project_registry_path
    if registry_path.exists():
        registry_snapshot = config_backup_dir / registry_path.name
        registry_snapshot.write_text(
            registry_path.read_text(encoding="utf-8"), encoding="utf-8"
        )

    chroma_snapshot_path = ""
    chroma_files_copied = 0
    chroma_bytes_copied = 0
    if include_chroma:
        source_chroma = _config.settings.chroma_path
        if source_chroma.exists() and source_chroma.is_dir():
            chroma_backup_dir.mkdir(parents=True, exist_ok=True)
            chroma_files_copied, chroma_bytes_copied = _copy_directory_tree(
                source_chroma, chroma_backup_dir
            )
            chroma_snapshot_path = str(chroma_backup_dir)
        else:
            log.info(
                "chroma_snapshot_skipped_missing",
                path=str(source_chroma),
            )

    metadata = {
        "created_at": now.isoformat(),
        "backup_root": str(backup_root),
        "db_snapshot_path": str(db_snapshot_path),
        "db_size_bytes": db_snapshot_path.stat().st_size,
        "registry_snapshot_path": str(registry_snapshot) if registry_snapshot else "",
        "chroma_snapshot_path": chroma_snapshot_path,
        "chroma_snapshot_bytes": chroma_bytes_copied,
        "chroma_snapshot_files": chroma_files_copied,
        "chroma_snapshot_included": include_chroma,
        "vector_store_note": (
            "Chroma snapshot included as cold directory copy."
            if include_chroma and chroma_snapshot_path
            else "Chroma hot backup is not included; rerun with include_chroma=True under exclusive_ingestion()."
        ),
    }
    metadata_path.write_text(
        json.dumps(metadata, indent=2, ensure_ascii=True) + "\n",
        encoding="utf-8",
    )

    # Automatic post-backup validation. Failures log a warning but do
    # not raise — the backup files are still on disk and may be useful.
    validation = validate_backup(stamp)
    validated = validation.get("valid", False)
    validation_errors = validation.get("errors", [])
    if not validated:
        log.warning(
            "post_backup_validation_failed",
            backup_root=str(backup_root),
            errors=validation_errors,
        )
    metadata["validated"] = validated
    metadata["validation_errors"] = validation_errors

    log.info(
        "runtime_backup_created",
        backup_root=str(backup_root),
        db_snapshot=str(db_snapshot_path),
        chroma_included=include_chroma,
        chroma_bytes=chroma_bytes_copied,
        validated=validated,
    )
    return metadata


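The hot-snapshot step above delegates to a `_backup_sqlite_db` helper defined elsewhere in this file. The underlying mechanism is stdlib: `sqlite3.Connection.backup` copies a live database page by page into a self-contained destination file without blocking writers. A standalone sketch of that pattern (hypothetical helper name, temp paths only):

```python
import sqlite3
import tempfile
from pathlib import Path

def backup_sqlite(source: Path, dest: Path) -> None:
    """Online snapshot of `source` into `dest` via the backup API."""
    src = sqlite3.connect(str(source))
    dst = sqlite3.connect(str(dest))
    try:
        src.backup(dst)  # safe while the source db is in use
    finally:
        dst.close()
        src.close()

with tempfile.TemporaryDirectory() as tmp:
    live = Path(tmp) / "live.db"
    snap = Path(tmp) / "snapshot.db"
    conn = sqlite3.connect(str(live))
    conn.execute("CREATE TABLE memories (id TEXT)")
    conn.execute("INSERT INTO memories VALUES ('m1')")
    conn.commit()
    backup_sqlite(live, snap)
    rows = sqlite3.connect(str(snap)).execute("SELECT id FROM memories").fetchall()
    conn.close()
print(rows)  # [('m1',)]
```

Because the result is a single main-file image, the snapshot carries no WAL/SHM sidecars, which is exactly why the restore path below has to clear stale sidecars before copying it back.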
def list_runtime_backups() -> list[dict]:
    """List all runtime backups under the configured backup directory."""
    snapshots_root = _config.settings.resolved_backup_dir / "snapshots"
    if not snapshots_root.exists() or not snapshots_root.is_dir():
        return []

    entries: list[dict] = []
    for snapshot_dir in sorted(snapshots_root.iterdir()):
        if not snapshot_dir.is_dir():
            continue
        metadata_path = snapshot_dir / "backup-metadata.json"
        entry: dict = {
            "stamp": snapshot_dir.name,
            "path": str(snapshot_dir),
            "has_metadata": metadata_path.exists(),
        }
        if metadata_path.exists():
            try:
                entry["metadata"] = json.loads(metadata_path.read_text(encoding="utf-8"))
            except json.JSONDecodeError:
                entry["metadata"] = None
                entry["metadata_error"] = "invalid_json"
        entries.append(entry)
    return entries


def validate_backup(stamp: str) -> dict:
    """Validate that a previously created backup is structurally usable.

    Checks:
    - the snapshot directory exists
    - the SQLite snapshot is openable and ``PRAGMA integrity_check`` returns ok
    - the registry snapshot, if recorded, parses as JSON
    - the chroma snapshot directory, if recorded, exists
    """
    snapshot_dir = _config.settings.resolved_backup_dir / "snapshots" / stamp
    result: dict = {
        "stamp": stamp,
        "path": str(snapshot_dir),
        "exists": snapshot_dir.exists(),
        "db_ok": False,
        "registry_ok": None,
        "chroma_ok": None,
        "errors": [],
    }
    if not snapshot_dir.exists():
        result["errors"].append("snapshot_directory_missing")
        return result

    metadata_path = snapshot_dir / "backup-metadata.json"
    if not metadata_path.exists():
        result["errors"].append("metadata_missing")
        return result

    try:
        metadata = json.loads(metadata_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        result["errors"].append(f"metadata_invalid_json: {exc}")
        return result
    result["metadata"] = metadata

    db_path = Path(metadata.get("db_snapshot_path", ""))
    if not db_path.exists():
        result["errors"].append("db_snapshot_missing")
    else:
        try:
            with sqlite3.connect(str(db_path)) as conn:
                row = conn.execute("PRAGMA integrity_check").fetchone()
                result["db_ok"] = bool(row and row[0] == "ok")
                if not result["db_ok"]:
                    result["errors"].append(
                        f"db_integrity_check_failed: {row[0] if row else 'no_row'}"
                    )
        except sqlite3.DatabaseError as exc:
            result["errors"].append(f"db_open_failed: {exc}")

    registry_snapshot_path = metadata.get("registry_snapshot_path", "")
    if registry_snapshot_path:
        registry_path = Path(registry_snapshot_path)
        if not registry_path.exists():
            result["registry_ok"] = False
            result["errors"].append("registry_snapshot_missing")
        else:
            try:
                json.loads(registry_path.read_text(encoding="utf-8"))
                result["registry_ok"] = True
            except json.JSONDecodeError as exc:
                result["registry_ok"] = False
                result["errors"].append(f"registry_invalid_json: {exc}")

    chroma_snapshot_path = metadata.get("chroma_snapshot_path", "")
    if chroma_snapshot_path:
        chroma_dir = Path(chroma_snapshot_path)
        if chroma_dir.exists() and chroma_dir.is_dir():
            result["chroma_ok"] = True
        else:
            result["chroma_ok"] = False
            result["errors"].append("chroma_snapshot_missing")

    result["valid"] = not result["errors"]
    return result


def restore_runtime_backup(
    stamp: str,
    *,
    include_chroma: bool | None = None,
    pre_restore_snapshot: bool = True,
    confirm_service_stopped: bool = False,
) -> dict:
    """Restore a previously captured runtime backup.

    CRITICAL: the AtoCore service MUST be stopped before calling this.
    Overwriting a live SQLite database corrupts state and can break
    the running container's open connections. The caller must pass
    ``confirm_service_stopped=True`` as an explicit acknowledgment —
    otherwise this function refuses to run.

    The restore procedure:

    1. Validate the backup via ``validate_backup``; refuse on any error.
    2. (default) Create a pre-restore safety snapshot of the CURRENT
       state so the restore itself is reversible. The snapshot stamp
       is returned in the result for the operator to record.
    3. Remove stale SQLite WAL/SHM sidecar files next to the target db
       before copying — the snapshot is a self-contained main-file
       image from ``conn.backup()``, and leftover WAL/SHM from the old
       live db would desync against the restored main file.
    4. Copy the snapshot db over the target db path.
    5. Restore the project registry file if the snapshot captured one.
    6. Restore the Chroma directory if ``include_chroma`` resolves to
       true. When ``include_chroma is None`` the function defers to
       whether the snapshot captured Chroma (the common case).
    7. Run ``PRAGMA integrity_check`` on the restored db and report
       the result.

    Returns a dict describing what was restored. On refused restore
    (service still running, validation failed) raises ``RuntimeError``.
    """
    if not confirm_service_stopped:
        raise RuntimeError(
            "restore_runtime_backup refuses to run without "
            "confirm_service_stopped=True — stop the AtoCore container "
            "first (e.g. `docker compose down` from deploy/dalidou) "
            "before calling this function"
        )

    validation = validate_backup(stamp)
    if not validation.get("valid"):
        raise RuntimeError(
            f"backup {stamp} failed validation: {validation.get('errors')}"
        )
    metadata = validation.get("metadata") or {}

    pre_snapshot_stamp: str | None = None
    if pre_restore_snapshot:
        pre = create_runtime_backup(include_chroma=False)
        pre_snapshot_stamp = Path(pre["backup_root"]).name

    target_db = _config.settings.db_path
    source_db = Path(metadata.get("db_snapshot_path", ""))
    if not source_db.exists():
        raise RuntimeError(
            f"db snapshot not found at {source_db} — backup "
            f"metadata may be stale"
        )

    # Force sqlite to flush any lingering WAL into the main file and
    # release OS-level file handles on -wal/-shm before we swap the
    # main file. Passing through conn.backup() in the pre-restore
    # snapshot can leave sidecars momentarily locked on Windows;
    # an explicit checkpoint(TRUNCATE) is the reliable way to flush
    # and release. Best-effort: if the target db can't be opened
    # (missing, corrupt), fall through and trust the copy step.
    if target_db.exists():
        try:
            with sqlite3.connect(str(target_db)) as checkpoint_conn:
                checkpoint_conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
        except sqlite3.DatabaseError as exc:
            log.warning(
                "restore_pre_checkpoint_failed",
                target_db=str(target_db),
                error=str(exc),
            )

    # Remove stale WAL/SHM sidecars from the old live db so SQLite
    # can't read inconsistent state on next open. Tolerant to
    # Windows file-lock races — the subsequent copy replaces the
    # main file anyway, and the integrity check afterward is the
    # actual correctness signal.
    wal_path = target_db.with_name(target_db.name + "-wal")
    shm_path = target_db.with_name(target_db.name + "-shm")
    for stale in (wal_path, shm_path):
        if stale.exists():
            try:
                stale.unlink()
            except OSError as exc:
                log.warning(
                    "restore_sidecar_unlink_failed",
                    path=str(stale),
                    error=str(exc),
                )

    target_db.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source_db, target_db)

    registry_restored = False
    registry_snapshot_path = metadata.get("registry_snapshot_path", "")
    if registry_snapshot_path:
        src_reg = Path(registry_snapshot_path)
        if src_reg.exists():
            dst_reg = _config.settings.resolved_project_registry_path
            dst_reg.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src_reg, dst_reg)
            registry_restored = True

    chroma_snapshot_path = metadata.get("chroma_snapshot_path", "")
    if include_chroma is None:
        include_chroma = bool(chroma_snapshot_path)
    chroma_restored = False
    if include_chroma and chroma_snapshot_path:
        src_chroma = Path(chroma_snapshot_path)
        if src_chroma.exists() and src_chroma.is_dir():
            dst_chroma = _config.settings.chroma_path
            # Do NOT rmtree the destination itself: in a Dockerized
            # deployment the chroma dir is a bind-mounted volume, and
            # unlinking a mount point raises
            # OSError [Errno 16] Device or resource busy.
            # Instead, clear the directory's CONTENTS and copytree into
            # it with dirs_exist_ok=True. This is equivalent to an
            # rmtree+copytree for restore purposes but stays inside the
            # mount boundary. Discovered during the first real restore
            # drill on Dalidou (2026-04-09).
            dst_chroma.mkdir(parents=True, exist_ok=True)
            for item in dst_chroma.iterdir():
                if item.is_dir() and not item.is_symlink():
                    shutil.rmtree(item)
                else:
                    item.unlink()
            shutil.copytree(src_chroma, dst_chroma, dirs_exist_ok=True)
            chroma_restored = True

    restored_integrity_ok = False
    integrity_error: str | None = None
    try:
        with sqlite3.connect(str(target_db)) as conn:
            row = conn.execute("PRAGMA integrity_check").fetchone()
            restored_integrity_ok = bool(row and row[0] == "ok")
            if not restored_integrity_ok:
                integrity_error = row[0] if row else "no_row"
    except sqlite3.DatabaseError as exc:
        integrity_error = f"db_open_failed: {exc}"

    result: dict = {
        "stamp": stamp,
        "pre_restore_snapshot": pre_snapshot_stamp,
        "target_db": str(target_db),
        "db_restored": True,
        "registry_restored": registry_restored,
        "chroma_restored": chroma_restored,
        "restored_integrity_ok": restored_integrity_ok,
    }
    if integrity_error:
        result["integrity_error"] = integrity_error

    log.info(
        "runtime_backup_restored",
        stamp=stamp,
        pre_restore_snapshot=pre_snapshot_stamp,
        registry_restored=registry_restored,
        chroma_restored=chroma_restored,
        integrity_ok=restored_integrity_ok,
    )
    return result


def cleanup_old_backups(*, confirm: bool = False) -> dict:
    """Apply retention policy and remove old snapshots.

    Retention keeps:
    - Last 7 daily snapshots (most recent per calendar day)
    - Last 4 weekly snapshots (most recent on each Sunday)
    - Last 6 monthly snapshots (most recent on the 1st of each month)

    All other snapshots are candidates for deletion. Runs as dry-run by
    default; pass ``confirm=True`` to actually delete.

    Returns a dict with kept/deleted counts and any errors.
    """
    snapshots_root = _config.settings.resolved_backup_dir / "snapshots"
    if not snapshots_root.exists() or not snapshots_root.is_dir():
        return {"kept": 0, "deleted": 0, "would_delete": 0, "dry_run": not confirm, "errors": []}

    # Parse all stamp directories into (datetime, dir_path) pairs.
    stamps: list[tuple[datetime, Path]] = []
    unparseable: list[str] = []
    for entry in sorted(snapshots_root.iterdir()):
        if not entry.is_dir():
            continue
        try:
            dt = datetime.strptime(entry.name, "%Y%m%dT%H%M%SZ").replace(tzinfo=UTC)
            stamps.append((dt, entry))
        except ValueError:
            unparseable.append(entry.name)

    if not stamps:
        return {
            "kept": 0, "deleted": 0, "would_delete": 0,
            "dry_run": not confirm, "errors": [],
            "unparseable": unparseable,
        }

    # Sort newest first so "most recent per bucket" is a simple first-seen.
    stamps.sort(key=lambda t: t[0], reverse=True)

    keep_set: set[Path] = set()

    # Last 7 daily: most recent snapshot per calendar day.
    seen_days: set[str] = set()
    for dt, path in stamps:
        day_key = dt.strftime("%Y-%m-%d")
        if day_key not in seen_days:
            seen_days.add(day_key)
            keep_set.add(path)
        if len(seen_days) >= 7:
            break

    # Last 4 weekly: most recent snapshot that falls on a Sunday.
    seen_weeks: set[str] = set()
    for dt, path in stamps:
        if dt.weekday() == 6:  # Sunday
            week_key = dt.strftime("%Y-W%W")
            if week_key not in seen_weeks:
                seen_weeks.add(week_key)
                keep_set.add(path)
            if len(seen_weeks) >= 4:
                break

    # Last 6 monthly: most recent snapshot on the 1st of a month.
    seen_months: set[str] = set()
    for dt, path in stamps:
        if dt.day == 1:
            month_key = dt.strftime("%Y-%m")
            if month_key not in seen_months:
                seen_months.add(month_key)
                keep_set.add(path)
            if len(seen_months) >= 6:
                break

    to_delete = [path for _, path in stamps if path not in keep_set]

    errors: list[str] = []
    deleted_count = 0
    if confirm:
        for path in to_delete:
            try:
                shutil.rmtree(path)
                deleted_count += 1
            except OSError as exc:
                errors.append(f"{path.name}: {exc}")

    result: dict = {
        "kept": len(keep_set),
        "dry_run": not confirm,
        "errors": errors,
    }
    if confirm:
        result["deleted"] = deleted_count
    else:
        result["would_delete"] = len(to_delete)
    if unparseable:
        result["unparseable"] = unparseable

    log.info(
        "cleanup_old_backups",
        kept=len(keep_set),
        deleted=deleted_count if confirm else 0,
        would_delete=len(to_delete) if not confirm else 0,
        dry_run=not confirm,
    )
    return result


def _backup_sqlite_db(source_path: Path, dest_path: Path) -> None:
    source_conn = sqlite3.connect(str(source_path))
    dest_conn = sqlite3.connect(str(dest_path))
    try:
        source_conn.backup(dest_conn)
    finally:
        dest_conn.close()
        source_conn.close()


def _copy_directory_tree(source: Path, dest: Path) -> tuple[int, int]:
    """Copy a directory tree and return (file_count, total_bytes)."""
    if dest.exists():
        shutil.rmtree(dest)
    shutil.copytree(source, dest)

    file_count = 0
    total_bytes = 0
    for path in dest.rglob("*"):
        if path.is_file():
            file_count += 1
            total_bytes += path.stat().st_size
    return file_count, total_bytes


def main() -> None:
    """CLI entry point for the backup module.

    Supports five subcommands:

    - ``create``    run ``create_runtime_backup`` (default if none given)
    - ``list``      list all runtime backup snapshots
    - ``validate``  validate a specific snapshot by stamp
    - ``cleanup``   apply the retention policy (dry-run unless ``--confirm``)
    - ``restore``   restore a specific snapshot by stamp

    The restore subcommand is the one used by the backup/restore drill
    and MUST be run only when the AtoCore service is stopped. It takes
    ``--confirm-service-stopped`` as an explicit acknowledgment.
    """
    import argparse

    parser = argparse.ArgumentParser(
        prog="python -m atocore.ops.backup",
        description="AtoCore runtime backup create/list/validate/cleanup/restore",
    )
    sub = parser.add_subparsers(dest="command")

    p_create = sub.add_parser("create", help="create a new runtime backup")
    p_create.add_argument(
        "--chroma",
        action="store_true",
        help="also snapshot the Chroma vector store (cold copy)",
    )

    sub.add_parser("list", help="list runtime backup snapshots")

    p_validate = sub.add_parser("validate", help="validate a snapshot by stamp")
    p_validate.add_argument("stamp", help="snapshot stamp (e.g. 20260409T010203Z)")

    p_cleanup = sub.add_parser("cleanup", help="remove old snapshots per retention policy")
    p_cleanup.add_argument(
        "--confirm",
        action="store_true",
        help="actually delete (default is dry-run)",
    )

    p_restore = sub.add_parser(
        "restore",
        help="restore a snapshot by stamp (service must be stopped)",
    )
    p_restore.add_argument("stamp", help="snapshot stamp to restore")
    p_restore.add_argument(
        "--confirm-service-stopped",
        action="store_true",
        help="explicit acknowledgment that the AtoCore container is stopped",
    )
    p_restore.add_argument(
        "--no-pre-snapshot",
        action="store_true",
        help="skip the pre-restore safety snapshot of current state",
    )
    chroma_group = p_restore.add_mutually_exclusive_group()
    chroma_group.add_argument(
        "--chroma",
        dest="include_chroma",
        action="store_true",
        default=None,
        help="force-restore the Chroma snapshot",
    )
    chroma_group.add_argument(
        "--no-chroma",
        dest="include_chroma",
        action="store_false",
        help="skip the Chroma snapshot even if it was captured",
    )

    args = parser.parse_args()
    command = args.command or "create"

    if command == "create":
        include_chroma = getattr(args, "chroma", False)
        result = create_runtime_backup(include_chroma=include_chroma)
    elif command == "list":
        result = {"backups": list_runtime_backups()}
    elif command == "validate":
        result = validate_backup(args.stamp)
    elif command == "cleanup":
        result = cleanup_old_backups(confirm=getattr(args, "confirm", False))
    elif command == "restore":
        result = restore_runtime_backup(
            args.stamp,
            include_chroma=args.include_chroma,
            pre_restore_snapshot=not args.no_pre_snapshot,
            confirm_service_stopped=args.confirm_service_stopped,
        )
    else:  # pragma: no cover — argparse guards this
        parser.error(f"unknown command: {command}")

    print(json.dumps(result, indent=2, ensure_ascii=True))


if __name__ == "__main__":
    main()
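The retention bucketing in ``cleanup_old_backups`` (sort newest first, then keep the first snapshot seen per calendar-day bucket) can be exercised in isolation. The sketch below is illustrative, not part of the module: the function name and the synthetic stamps are made up, but the parsing format and first-seen-wins logic mirror the code above.

```python
from datetime import datetime, timezone


def pick_daily_keeps(stamp_names, limit=7):
    # Parse stamps the way the module does, sort newest first, then keep
    # the first stamp seen for each calendar day until `limit` days are
    # covered. Later stamps on an already-seen day are skipped.
    parsed = sorted(
        (datetime.strptime(s, "%Y%m%dT%H%M%SZ").replace(tzinfo=timezone.utc), s)
        for s in stamp_names
    )
    parsed.reverse()
    seen_days, keep = set(), []
    for dt, name in parsed:
        day = dt.strftime("%Y-%m-%d")
        if day not in seen_days:
            seen_days.add(day)
            keep.append(name)
        if len(seen_days) >= limit:
            break
    return keep


# Two snapshots on the same day: only the later one survives.
print(pick_daily_keeps(["20260408T010000Z", "20260408T120000Z", "20260407T010000Z"]))
# → ['20260408T120000Z', '20260407T010000Z']
```

The same first-seen pattern is what makes the weekly and monthly passes cheap: because the list is already newest-first, "most recent snapshot in bucket" reduces to "first snapshot whose bucket key is unseen".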
1  src/atocore/projects/__init__.py  Normal file
@@ -0,0 +1 @@
"""Project registry and source refresh helpers."""
461  src/atocore/projects/registry.py  Normal file
@@ -0,0 +1,461 @@
"""Registered project source metadata and refresh helpers."""

from __future__ import annotations

import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path

import atocore.config as _config
from atocore.ingestion.pipeline import ingest_folder


@dataclass(frozen=True)
class ProjectSourceRef:
    source: str
    subpath: str
    label: str = ""


@dataclass(frozen=True)
class RegisteredProject:
    project_id: str
    aliases: tuple[str, ...]
    description: str
    ingest_roots: tuple[ProjectSourceRef, ...]


def get_project_registry_template() -> dict:
    """Return a minimal template for registering a new project."""
    return {
        "projects": [
            {
                "id": "p07-example",
                "aliases": ["p07", "example-project"],
                "description": "Short description of the project and staged corpus.",
                "ingest_roots": [
                    {
                        "source": "vault",
                        "subpath": "incoming/projects/p07-example",
                        "label": "Primary staged project docs",
                    }
                ],
            }
        ]
    }


def build_project_registration_proposal(
    project_id: str,
    aliases: list[str] | tuple[str, ...] | None = None,
    description: str = "",
    ingest_roots: list[dict] | tuple[dict, ...] | None = None,
) -> dict:
    """Build a normalized project registration proposal without mutating state."""
    normalized_id = project_id.strip()
    if not normalized_id:
        raise ValueError("Project id must be non-empty")

    normalized_aliases = _normalize_aliases(aliases or [])
    normalized_roots = _normalize_ingest_roots(ingest_roots or [])
    if not normalized_roots:
        raise ValueError("At least one ingest root is required")

    collisions = _find_name_collisions(normalized_id, normalized_aliases)
    resolved_roots = []
    for root in normalized_roots:
        source_ref = ProjectSourceRef(
            source=root["source"],
            subpath=root["subpath"],
            label=root.get("label", ""),
        )
        resolved_path = _resolve_ingest_root(source_ref)
        resolved_roots.append(
            {
                **root,
                "path": str(resolved_path),
                "exists": resolved_path.exists(),
                "is_dir": resolved_path.is_dir(),
            }
        )

    return {
        "project": {
            "id": normalized_id,
            "aliases": normalized_aliases,
            "description": description.strip(),
            "ingest_roots": normalized_roots,
        },
        "resolved_ingest_roots": resolved_roots,
        "collisions": collisions,
        "registry_path": str(_config.settings.resolved_project_registry_path),
        "valid": not collisions,
    }


def register_project(
    project_id: str,
    aliases: list[str] | tuple[str, ...] | None = None,
    description: str = "",
    ingest_roots: list[dict] | tuple[dict, ...] | None = None,
) -> dict:
    """Persist a validated project registration to the registry file."""
    proposal = build_project_registration_proposal(
        project_id=project_id,
        aliases=aliases,
        description=description,
        ingest_roots=ingest_roots,
    )
    if not proposal["valid"]:
        collision_names = ", ".join(collision["name"] for collision in proposal["collisions"])
        raise ValueError(f"Project registration has collisions: {collision_names}")

    registry_path = _config.settings.resolved_project_registry_path
    payload = _load_registry_payload(registry_path)
    payload.setdefault("projects", []).append(proposal["project"])
    _write_registry_payload(registry_path, payload)

    return {
        **proposal,
        "status": "registered",
    }


def update_project(
    project_name: str,
    aliases: list[str] | tuple[str, ...] | None = None,
    description: str | None = None,
    ingest_roots: list[dict] | tuple[dict, ...] | None = None,
) -> dict:
    """Update an existing project registration in the registry file."""
    existing = get_registered_project(project_name)
    if existing is None:
        raise ValueError(f"Unknown project: {project_name}")

    final_aliases = _normalize_aliases(aliases) if aliases is not None else list(existing.aliases)
    final_description = description.strip() if description is not None else existing.description
    final_roots = (
        _normalize_ingest_roots(ingest_roots)
        if ingest_roots is not None
        else [asdict(root) for root in existing.ingest_roots]
    )
    if not final_roots:
        raise ValueError("At least one ingest root is required")

    collisions = _find_name_collisions(
        existing.project_id,
        final_aliases,
        exclude_project_id=existing.project_id,
    )
    if collisions:
        collision_names = ", ".join(collision["name"] for collision in collisions)
        raise ValueError(f"Project update has collisions: {collision_names}")

    updated_entry = {
        "id": existing.project_id,
        "aliases": final_aliases,
        "description": final_description,
        "ingest_roots": final_roots,
    }

    resolved_roots = []
    for root in final_roots:
        source_ref = ProjectSourceRef(
            source=root["source"],
            subpath=root["subpath"],
            label=root.get("label", ""),
        )
        resolved_path = _resolve_ingest_root(source_ref)
        resolved_roots.append(
            {
                **root,
                "path": str(resolved_path),
                "exists": resolved_path.exists(),
                "is_dir": resolved_path.is_dir(),
            }
        )

    registry_path = _config.settings.resolved_project_registry_path
    payload = _load_registry_payload(registry_path)
    payload["projects"] = [
        updated_entry if str(entry.get("id", "")).strip() == existing.project_id else entry
        for entry in payload.get("projects", [])
    ]
    _write_registry_payload(registry_path, payload)

    return {
        "project": updated_entry,
        "resolved_ingest_roots": resolved_roots,
        "collisions": [],
        "registry_path": str(registry_path),
        "valid": True,
        "status": "updated",
    }


def load_project_registry() -> list[RegisteredProject]:
    """Load project registry entries from JSON config."""
    registry_path = _config.settings.resolved_project_registry_path
    payload = _load_registry_payload(registry_path)
    entries = payload.get("projects", [])
    projects: list[RegisteredProject] = []

    for entry in entries:
        project_id = str(entry["id"]).strip()
        if not project_id:
            raise ValueError("Project registry entry is missing a non-empty id")
        aliases = tuple(
            alias.strip()
            for alias in entry.get("aliases", [])
            if isinstance(alias, str) and alias.strip()
        )
        description = str(entry.get("description", "")).strip()
        ingest_roots = tuple(
            ProjectSourceRef(
                source=str(root["source"]).strip(),
                subpath=str(root["subpath"]).strip(),
                label=str(root.get("label", "")).strip(),
            )
            for root in entry.get("ingest_roots", [])
            if str(root.get("source", "")).strip()
            and str(root.get("subpath", "")).strip()
        )
        if not ingest_roots:
            raise ValueError(f"Project registry entry '{project_id}' has no ingest_roots")
        projects.append(
            RegisteredProject(
                project_id=project_id,
                aliases=aliases,
                description=description,
                ingest_roots=ingest_roots,
            )
        )

    _validate_unique_project_names(projects)
    return projects


def list_registered_projects() -> list[dict]:
    """Return registry entries with resolved source readiness."""
    return [_project_to_dict(project) for project in load_project_registry()]


def get_registered_project(project_name: str) -> RegisteredProject | None:
    """Resolve a registry entry by id or alias."""
    needle = project_name.strip().lower()
    if not needle:
        return None

    for project in load_project_registry():
        candidates = {project.project_id.lower(), *(alias.lower() for alias in project.aliases)}
        if needle in candidates:
            return project
    return None


def resolve_project_name(name: str | None) -> str:
    """Canonicalize a project name through the registry.

    Returns the canonical ``project_id`` if the input matches any
    registered project's id or alias. Returns the input unchanged
    when it's empty or not in the registry — the second case keeps
    backwards compatibility with hand-curated state, memories, and
    interactions that predate the registry, or for projects that
    are intentionally not registered.

    This helper is the single canonicalization boundary for project
    names across the trust hierarchy. Every read/write that takes a
    project name should pass it through ``resolve_project_name``
    before storing or querying. The contract is documented in
    ``docs/architecture/representation-authority.md``.
    """
    if not name:
        return name or ""
    project = get_registered_project(name)
    if project is not None:
        return project.project_id
    return name


def refresh_registered_project(project_name: str, purge_deleted: bool = False) -> dict:
    """Ingest all configured source roots for a registered project.

    The returned dict carries an overall ``status`` so callers can tell at a
    glance whether the refresh was fully successful, partial, or did nothing
    at all because every configured root was missing or not a directory:

    - ``ingested``: every root was a real directory and was ingested
    - ``partial``: at least one root ingested and at least one was unusable
    - ``nothing_to_ingest``: no roots were usable
    """
    project = get_registered_project(project_name)
    if project is None:
        raise ValueError(f"Unknown project: {project_name}")

    roots = []
    ingested_count = 0
    skipped_count = 0
    for source_ref in project.ingest_roots:
        resolved = _resolve_ingest_root(source_ref)
        root_result = {
            "source": source_ref.source,
            "subpath": source_ref.subpath,
            "label": source_ref.label,
            "path": str(resolved),
        }
        if not resolved.exists():
            roots.append({**root_result, "status": "missing"})
            skipped_count += 1
            continue
        if not resolved.is_dir():
            roots.append({**root_result, "status": "not_directory"})
            skipped_count += 1
            continue

        roots.append(
            {
                **root_result,
                "status": "ingested",
                "results": ingest_folder(resolved, purge_deleted=purge_deleted),
            }
        )
        ingested_count += 1

    if ingested_count == 0:
        overall_status = "nothing_to_ingest"
    elif skipped_count == 0:
        overall_status = "ingested"
    else:
        overall_status = "partial"

    return {
        "project": project.project_id,
        "aliases": list(project.aliases),
        "description": project.description,
        "purge_deleted": purge_deleted,
        "status": overall_status,
        "roots_ingested": ingested_count,
        "roots_skipped": skipped_count,
        "roots": roots,
    }


def _normalize_aliases(aliases: list[str] | tuple[str, ...]) -> list[str]:
    deduped: list[str] = []
    seen: set[str] = set()
    for alias in aliases:
        candidate = alias.strip()
        if not candidate:
            continue
        key = candidate.lower()
        if key in seen:
            continue
        seen.add(key)
        deduped.append(candidate)
    return deduped


def _normalize_ingest_roots(ingest_roots: list[dict] | tuple[dict, ...]) -> list[dict]:
    normalized: list[dict] = []
    for root in ingest_roots:
        source = str(root.get("source", "")).strip()
        subpath = str(root.get("subpath", "")).strip()
        label = str(root.get("label", "")).strip()
        if not source or not subpath:
            continue
        if source not in {"vault", "drive"}:
            raise ValueError(f"Unsupported source root: {source}")
        normalized.append({"source": source, "subpath": subpath, "label": label})
    return normalized


def _project_to_dict(project: RegisteredProject) -> dict:
    return {
        "id": project.project_id,
        "aliases": list(project.aliases),
        "description": project.description,
        "ingest_roots": [
            {
                **asdict(source_ref),
                "path": str(_resolve_ingest_root(source_ref)),
                "exists": _resolve_ingest_root(source_ref).exists(),
                "is_dir": _resolve_ingest_root(source_ref).is_dir(),
            }
            for source_ref in project.ingest_roots
        ],
    }


def _resolve_ingest_root(source_ref: ProjectSourceRef) -> Path:
    base_map = {
        "vault": _config.settings.resolved_vault_source_dir,
        "drive": _config.settings.resolved_drive_source_dir,
    }
    try:
        base_dir = base_map[source_ref.source]
    except KeyError as exc:
        raise ValueError(f"Unsupported source root: {source_ref.source}") from exc

    return (base_dir / source_ref.subpath).resolve(strict=False)


def _validate_unique_project_names(projects: list[RegisteredProject]) -> None:
    seen: dict[str, str] = {}
    for project in projects:
        names = [project.project_id, *project.aliases]
        for name in names:
            key = name.lower()
            if key in seen and seen[key] != project.project_id:
                raise ValueError(
                    f"Project registry name collision: '{name}' is used by both "
                    f"'{seen[key]}' and '{project.project_id}'"
                )
            seen[key] = project.project_id


def _find_name_collisions(
    project_id: str,
    aliases: list[str],
    exclude_project_id: str | None = None,
) -> list[dict]:
    collisions: list[dict] = []
    existing = load_project_registry()
    requested_names = [project_id, *aliases]
    for requested in requested_names:
        requested_key = requested.lower()
        for project in existing:
            if exclude_project_id is not None and project.project_id == exclude_project_id:
                continue
            project_names = [project.project_id, *project.aliases]
            if requested_key in {name.lower() for name in project_names}:
                collisions.append(
                    {
                        "name": requested,
                        "existing_project": project.project_id,
                    }
                )
                break
    return collisions


def _load_registry_payload(registry_path: Path) -> dict:
    if not registry_path.exists():
        return {"projects": []}
    return json.loads(registry_path.read_text(encoding="utf-8"))


def _write_registry_payload(registry_path: Path, payload: dict) -> None:
    registry_path.parent.mkdir(parents=True, exist_ok=True)
    rendered = json.dumps(payload, indent=2, ensure_ascii=True) + "\n"
    with tempfile.NamedTemporaryFile(
        mode="w",
        encoding="utf-8",
        dir=registry_path.parent,
        prefix=f"{registry_path.stem}.",
        suffix=".tmp",
        delete=False,
    ) as tmp_file:
        tmp_file.write(rendered)
        temp_path = Path(tmp_file.name)
    temp_path.replace(registry_path)
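``_write_registry_payload`` uses the classic atomic-replace pattern: render the full JSON to a temp file in the same directory, then swap it into place with ``Path.replace`` so a reader never observes a half-written registry. A minimal standalone sketch of the same pattern (the function name and file name here are illustrative, not from the module):

```python
import json
import tempfile
from pathlib import Path


def atomic_write_json(path: Path, payload: dict) -> None:
    # Write to a sibling temp file first. Path.replace maps to os.replace,
    # which is atomic on POSIX when source and destination are on the
    # same filesystem — hence dir=path.parent, never the default tmp dir.
    path.parent.mkdir(parents=True, exist_ok=True)
    rendered = json.dumps(payload, indent=2) + "\n"
    with tempfile.NamedTemporaryFile(
        mode="w", encoding="utf-8", dir=path.parent, suffix=".tmp", delete=False
    ) as tmp:
        tmp.write(rendered)
    Path(tmp.name).replace(path)


target = Path(tempfile.mkdtemp()) / "registry.json"
atomic_write_json(target, {"projects": []})
print(json.loads(target.read_text(encoding="utf-8")))
# → {'projects': []}
```

Keeping the temp file in the destination directory is the load-bearing detail: a cross-filesystem rename is not atomic, so writing the temp file to a system-wide temp dir would quietly lose the guarantee.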
0  src/atocore/retrieval/__init__.py  Normal file
32  src/atocore/retrieval/embeddings.py  Normal file
@@ -0,0 +1,32 @@
"""Embedding model management."""
|
||||
|
||||
import atocore.config as _config
|
||||
from sentence_transformers import SentenceTransformer
|
||||
|
||||
from atocore.observability.logger import get_logger
|
||||
|
||||
log = get_logger("embeddings")
|
||||
|
||||
_model: SentenceTransformer | None = None
|
||||
|
||||
|
||||
def get_model() -> SentenceTransformer:
|
||||
"""Load and cache the embedding model."""
|
||||
global _model
|
||||
if _model is None:
|
||||
log.info("loading_embedding_model", model=_config.settings.embedding_model)
|
||||
_model = SentenceTransformer(_config.settings.embedding_model)
|
||||
log.info("embedding_model_loaded", model=_config.settings.embedding_model)
|
||||
return _model
|
||||
|
||||
|
||||
def embed_texts(texts: list[str]) -> list[list[float]]:
|
||||
"""Generate embeddings for a list of texts."""
|
||||
model = get_model()
|
||||
embeddings = model.encode(texts, show_progress_bar=False, normalize_embeddings=True)
|
||||
return embeddings.tolist()
|
||||
|
||||
|
||||
def embed_query(query: str) -> list[float]:
|
||||
"""Generate embedding for a single query."""
|
||||
return embed_texts([query])[0]
|
||||
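The module-level `_model` cache above is a plain lazy singleton. A model-free sketch of the same pattern (the `ExpensiveModel` class is a stand-in for `SentenceTransformer`, not part of AtoCore) shows the behavior it buys: construction happens once, on first use, and every later call reuses the cached instance.

    class ExpensiveModel:
        """Stand-in for a costly-to-construct model; counts constructions."""

        instances = 0

        def __init__(self) -> None:
            ExpensiveModel.instances += 1


    _model: "ExpensiveModel | None" = None


    def get_model() -> ExpensiveModel:
        """Construct on first call, then return the cached instance."""
        global _model
        if _model is None:
            _model = ExpensiveModel()
        return _model

Note this is the same reason the test fixtures later reset `vs._store = None`: a module-level singleton survives between tests unless explicitly cleared.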
236  src/atocore/retrieval/retriever.py  Normal file
@@ -0,0 +1,236 @@
"""Retrieval: query to ranked chunks."""

import re
import time
from dataclasses import dataclass

import atocore.config as _config
from atocore.models.database import get_connection
from atocore.observability.logger import get_logger
from atocore.projects.registry import get_registered_project
from atocore.retrieval.embeddings import embed_query
from atocore.retrieval.vector_store import get_vector_store

log = get_logger("retriever")

_STOP_TOKENS = {
    "about",
    "and",
    "current",
    "for",
    "from",
    "into",
    "like",
    "project",
    "shared",
    "system",
    "that",
    "the",
    "this",
    "what",
    "with",
}

_HIGH_SIGNAL_HINTS = (
    "status",
    "decision",
    "requirements",
    "requirement",
    "roadmap",
    "charter",
    "system-map",
    "system_map",
    "contracts",
    "schema",
    "architecture",
    "workflow",
    "error-budget",
    "comparison-matrix",
    "selection-decision",
)

_LOW_SIGNAL_HINTS = (
    "/_archive/",
    "\\_archive\\",
    "/archive/",
    "\\archive\\",
    "_history",
    "history",
    "pre-cleanup",
    "pre-migration",
    "reviews/",
)


@dataclass
class ChunkResult:
    chunk_id: str
    content: str
    score: float
    heading_path: str
    source_file: str
    tags: str
    title: str
    document_id: str


def retrieve(
    query: str,
    top_k: int | None = None,
    filter_tags: list[str] | None = None,
    project_hint: str | None = None,
) -> list[ChunkResult]:
    """Retrieve the most relevant chunks for a query."""
    top_k = top_k or _config.settings.context_top_k
    start = time.time()

    query_embedding = embed_query(query)
    store = get_vector_store()

    where = None
    if filter_tags:
        if len(filter_tags) == 1:
            where = {"tags": {"$contains": f'"{filter_tags[0]}"'}}
        else:
            where = {
                "$and": [
                    {"tags": {"$contains": f'"{tag}"'}}
                    for tag in filter_tags
                ]
            }

    results = store.query(
        query_embedding=query_embedding,
        top_k=top_k,
        where=where,
    )

    chunks = []
    if results and results["ids"] and results["ids"][0]:
        existing_ids = _existing_chunk_ids(results["ids"][0])
        for i, chunk_id in enumerate(results["ids"][0]):
            if chunk_id not in existing_ids:
                continue

            distance = results["distances"][0][i] if results["distances"] else 0
            score = 1.0 - distance
            meta = results["metadatas"][0][i] if results["metadatas"] else {}
            content = results["documents"][0][i] if results["documents"] else ""

            score *= _query_match_boost(query, meta)
            score *= _path_signal_boost(meta)
            if project_hint:
                score *= _project_match_boost(project_hint, meta)

            chunks.append(
                ChunkResult(
                    chunk_id=chunk_id,
                    content=content,
                    score=round(score, 4),
                    heading_path=meta.get("heading_path", ""),
                    source_file=meta.get("source_file", ""),
                    tags=meta.get("tags", "[]"),
                    title=meta.get("title", ""),
                    document_id=meta.get("document_id", ""),
                )
            )

    duration_ms = int((time.time() - start) * 1000)
    chunks.sort(key=lambda chunk: chunk.score, reverse=True)

    log.info(
        "retrieval_done",
        query=query[:100],
        top_k=top_k,
        results_count=len(chunks),
        duration_ms=duration_ms,
    )

    return chunks


def _project_match_boost(project_hint: str, metadata: dict) -> float:
    """Return a project-aware relevance multiplier for raw retrieval."""
    hint_lower = project_hint.strip().lower()
    if not hint_lower:
        return 1.0

    source_file = str(metadata.get("source_file", "")).lower()
    title = str(metadata.get("title", "")).lower()
    tags = str(metadata.get("tags", "")).lower()
    searchable = " ".join([source_file, title, tags])

    project = get_registered_project(project_hint)
    candidate_names = {hint_lower}
    if project is not None:
        candidate_names.add(project.project_id.lower())
        candidate_names.update(alias.lower() for alias in project.aliases)
        candidate_names.update(
            source_ref.subpath.replace("\\", "/").strip("/").split("/")[-1].lower()
            for source_ref in project.ingest_roots
            if source_ref.subpath.strip("/\\")
        )

    for candidate in candidate_names:
        if candidate and candidate in searchable:
            return _config.settings.rank_project_match_boost

    return 1.0


def _query_match_boost(query: str, metadata: dict) -> float:
    """Boost chunks whose path/title/headings echo the query's high-signal terms."""
    tokens = [
        token
        for token in re.findall(r"[a-z0-9][a-z0-9_-]{2,}", query.lower())
        if token not in _STOP_TOKENS
    ]
    if not tokens:
        return 1.0

    searchable = " ".join(
        [
            str(metadata.get("source_file", "")).lower(),
            str(metadata.get("title", "")).lower(),
            str(metadata.get("heading_path", "")).lower(),
        ]
    )
    matches = sum(1 for token in set(tokens) if token in searchable)
    if matches <= 0:
        return 1.0
    return min(
        1.0 + matches * _config.settings.rank_query_token_step,
        _config.settings.rank_query_token_cap,
    )


def _path_signal_boost(metadata: dict) -> float:
    """Prefer current high-signal docs and gently down-rank archival noise."""
    searchable = " ".join(
        [
            str(metadata.get("source_file", "")).lower(),
            str(metadata.get("title", "")).lower(),
            str(metadata.get("heading_path", "")).lower(),
        ]
    )

    multiplier = 1.0
    if any(hint in searchable for hint in _LOW_SIGNAL_HINTS):
        multiplier *= _config.settings.rank_path_low_signal_penalty
    if any(hint in searchable for hint in _HIGH_SIGNAL_HINTS):
        multiplier *= _config.settings.rank_path_high_signal_boost
    return multiplier


def _existing_chunk_ids(chunk_ids: list[str]) -> set[str]:
    """Filter out stale vector entries whose chunk rows no longer exist."""
    if not chunk_ids:
        return set()

    placeholders = ", ".join("?" for _ in chunk_ids)
    with get_connection() as conn:
        rows = conn.execute(
            f"SELECT id FROM source_chunks WHERE id IN ({placeholders})",
            chunk_ids,
        ).fetchall()
    return {row["id"] for row in rows}
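The multiplicative ranking above can be checked numerically in isolation. Below is a self-contained sketch of the `_query_match_boost` logic with the step and cap hard-coded (0.05 and 1.25 are illustrative assumptions standing in for `rank_query_token_step` and `rank_query_token_cap`, not the real settings values), and a trimmed stop-token set:

    import re

    _STOP_TOKENS = {"the", "what", "project", "for", "with"}


    def query_match_boost(query: str, searchable: str,
                          step: float = 0.05, cap: float = 1.25) -> float:
        """Return 1.0 plus `step` per distinct query token (3+ chars, not a
        stop word) found in the searchable metadata text, capped at `cap`."""
        tokens = [
            token
            for token in re.findall(r"[a-z0-9][a-z0-9_-]{2,}", query.lower())
            if token not in _STOP_TOKENS
        ]
        if not tokens:
            return 1.0
        matches = sum(1 for token in set(tokens) if token in searchable.lower())
        if matches <= 0:
            return 1.0
        return min(1.0 + matches * step, cap)

Because the boost multiplies the base cosine score rather than replacing it, a chunk that matches nothing keeps its raw similarity score, and the cap keeps keyword echoes from overwhelming semantic relevance.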
77   src/atocore/retrieval/vector_store.py  Normal file
@@ -0,0 +1,77 @@
"""ChromaDB vector store wrapper."""

import chromadb

import atocore.config as _config
from atocore.observability.logger import get_logger
from atocore.retrieval.embeddings import embed_texts

log = get_logger("vector_store")

COLLECTION_NAME = "atocore_chunks"

_store: "VectorStore | None" = None


class VectorStore:
    """Wrapper around ChromaDB for chunk storage and retrieval."""

    def __init__(self) -> None:
        _config.settings.chroma_path.mkdir(parents=True, exist_ok=True)
        self._client = chromadb.PersistentClient(path=str(_config.settings.chroma_path))
        self._collection = self._client.get_or_create_collection(
            name=COLLECTION_NAME,
            metadata={"hnsw:space": "cosine"},
        )
        log.info("vector_store_initialized", path=str(_config.settings.chroma_path))

    def add(
        self,
        ids: list[str],
        documents: list[str],
        metadatas: list[dict],
    ) -> None:
        """Add chunks with embeddings to the store."""
        embeddings = embed_texts(documents)
        self._collection.add(
            ids=ids,
            embeddings=embeddings,
            documents=documents,
            metadatas=metadatas,
        )
        log.debug("vectors_added", count=len(ids))

    def query(
        self,
        query_embedding: list[float],
        top_k: int = 10,
        where: dict | None = None,
    ) -> dict:
        """Query the store for similar chunks."""
        kwargs: dict = {
            "query_embeddings": [query_embedding],
            "n_results": top_k,
            "include": ["documents", "metadatas", "distances"],
        }
        if where:
            kwargs["where"] = where

        return self._collection.query(**kwargs)

    def delete(self, ids: list[str]) -> None:
        """Delete chunks by IDs."""
        if ids:
            self._collection.delete(ids=ids)
            log.debug("vectors_deleted", count=len(ids))

    @property
    def count(self) -> int:
        return self._collection.count()


def get_vector_store() -> VectorStore:
    """Get or create the singleton vector store."""
    global _store
    if _store is None:
        _store = VectorStore()
    return _store
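The `where` filter that `retrieve` passes into `VectorStore.query` is plain dict construction and can be sketched without ChromaDB at all. This helper mirrors the three shapes used above: `None` when no tags are requested, a single `$contains` clause for one tag, and an `$and` of clauses when several tags must all match.

    def build_tag_filter(filter_tags: list[str] | None) -> dict | None:
        """Build the metadata filter shape retrieve() hands to the store.

        Tags are stored as a JSON-encoded list string, so each clause
        matches the quoted tag substring, e.g. '"atocore"'.
        """
        if not filter_tags:
            return None
        if len(filter_tags) == 1:
            return {"tags": {"$contains": f'"{filter_tags[0]}"'}}
        return {
            "$and": [{"tags": {"$contains": f'"{tag}"'}} for tag in filter_tags]
        }

Matching the quoted substring (`'"atocore"'` rather than `'atocore'`) is what keeps a filter for `core` from accidentally matching `atocore` inside the serialized tag list.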
0    tests/__init__.py  Normal file
158  tests/conftest.py  Normal file
@@ -0,0 +1,158 @@
"""pytest configuration and shared fixtures."""

import json
import os
import sys
import tempfile
from pathlib import Path

import pytest

sys.path.insert(0, str(Path(__file__).resolve().parents[1] / "src"))

# Default test data directory — overridden per-test by fixtures
_default_test_dir = tempfile.mkdtemp(prefix="atocore_test_")
os.environ["ATOCORE_DATA_DIR"] = _default_test_dir
os.environ["ATOCORE_DEBUG"] = "true"


@pytest.fixture
def tmp_data_dir(tmp_path):
    """Provide a temporary data directory for tests."""
    os.environ["ATOCORE_DATA_DIR"] = str(tmp_path)
    # Reset singletons
    from atocore import config
    config.settings = config.Settings()

    import atocore.retrieval.vector_store as vs
    vs._store = None

    return tmp_path


@pytest.fixture
def project_registry(tmp_path, monkeypatch):
    """Stand up an isolated project registry pointing at a temp file.

    Returns a callable that takes one or more (project_id, [aliases])
    tuples and writes them into the registry, then forces the in-process
    settings singleton to re-resolve. Use this when a test needs the
    canonicalization helpers (resolve_project_name, get_registered_project)
    to recognize aliases.
    """
    registry_path = tmp_path / "test-project-registry.json"

    def _set(*projects):
        payload = {"projects": []}
        for entry in projects:
            if isinstance(entry, str):
                project_id, aliases = entry, []
            else:
                project_id, aliases = entry
            payload["projects"].append(
                {
                    "id": project_id,
                    "aliases": list(aliases),
                    "description": f"test project {project_id}",
                    "ingest_roots": [
                        {"source": "vault", "subpath": f"incoming/projects/{project_id}"}
                    ],
                }
            )
        registry_path.write_text(json.dumps(payload), encoding="utf-8")
        monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
        from atocore import config

        config.settings = config.Settings()
        return registry_path

    return _set


@pytest.fixture
def sample_markdown(tmp_path) -> Path:
    """Create a sample markdown file for testing."""
    md_file = tmp_path / "test_note.md"
    md_file.write_text(
        """---
tags:
  - atocore
  - architecture
date: 2026-04-05
---
# AtoCore Architecture

## Overview

AtoCore is a personal context engine that enriches LLM interactions
with durable memory, structured context, and project knowledge.

## Layers

The system has these layers:

1. Main PKM (human, messy, exploratory)
2. AtoVault (system mirror)
3. AtoDrive (trusted project truth)
4. Structured Memory (DB)
5. Semantic Retrieval (vector DB)

## Memory Types

AtoCore supports these memory types:

- Identity
- Preferences
- Project Memory
- Episodic Memory
- Knowledge Objects
- Adaptation Memory
- Trusted Project State

## Trust Precedence

When sources conflict:

1. Trusted Project State wins
2. AtoDrive overrides PKM
3. Most recent confirmed wins
4. Higher confidence wins
5. Equal → flag conflict

No silent merging.
""",
        encoding="utf-8",
    )
    return md_file


@pytest.fixture
def sample_folder(tmp_path, sample_markdown) -> Path:
    """Create a folder with multiple markdown files."""
    # Already has test_note.md from sample_markdown
    second = tmp_path / "second_note.md"
    second.write_text(
        """---
tags:
  - chunking
---
# Chunking Strategy

## Approach

Heading-aware recursive splitting:

1. Split on H2 boundaries first
2. If section > 800 chars, split on H3
3. If still > 800 chars, split on paragraphs
4. Hard split at 800 chars with 100 char overlap

## Parameters

- max_chunk_size: 800 characters
- overlap: 100 characters
- min_chunk_size: 50 characters
""",
        encoding="utf-8",
    )
    return tmp_path
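The `project_registry` fixture accepts either bare project-id strings or `(project_id, aliases)` tuples. Its normalization step can be sketched standalone; the helper below follows the fixture's payload shape but is an illustration, not the fixture itself.

    def build_registry_payload(*projects) -> dict:
        """Accept 'p01' or ('p01', ['alias']) entries and emit the
        project-registry JSON shape the fixture writes to disk."""
        payload = {"projects": []}
        for entry in projects:
            if isinstance(entry, str):
                project_id, aliases = entry, []
            else:
                project_id, aliases = entry
            payload["projects"].append(
                {
                    "id": project_id,
                    "aliases": list(aliases),
                    "description": f"test project {project_id}",
                    "ingest_roots": [
                        {"source": "vault", "subpath": f"incoming/projects/{project_id}"}
                    ],
                }
            )
        return payload

Accepting both forms keeps simple tests terse (`project_registry("p01")`) while still letting alias-resolution tests spell out `("p01", ["one"])`.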
636  tests/test_api_storage.py  Normal file
@@ -0,0 +1,636 @@
"""Tests for storage-related API readiness endpoints."""

from contextlib import contextmanager

from fastapi.testclient import TestClient

import atocore.config as config
from atocore.main import app


def test_sources_endpoint_reports_configured_sources(tmp_data_dir, monkeypatch):
    vault_dir = tmp_data_dir / "vault-source"
    drive_dir = tmp_data_dir / "drive-source"
    vault_dir.mkdir()
    drive_dir.mkdir()

    monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(vault_dir))
    monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(drive_dir))
    config.settings = config.Settings()

    client = TestClient(app)
    response = client.get("/sources")

    assert response.status_code == 200
    body = response.json()
    assert body["vault_enabled"] is True
    assert body["drive_enabled"] is True
    assert len(body["sources"]) == 2
    assert all(source["read_only"] for source in body["sources"])


def test_health_endpoint_exposes_machine_paths_and_source_readiness(tmp_data_dir, monkeypatch):
    vault_dir = tmp_data_dir / "vault-source"
    drive_dir = tmp_data_dir / "drive-source"
    vault_dir.mkdir()
    drive_dir.mkdir()

    monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(vault_dir))
    monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(drive_dir))
    config.settings = config.Settings()

    client = TestClient(app)
    response = client.get("/health")

    assert response.status_code == 200
    body = response.json()
    assert body["status"] == "ok"
    assert body["sources_ready"] is True
    assert "db_path" in body["machine_paths"]
    assert "run_dir" in body["machine_paths"]


def test_health_endpoint_reports_code_version_from_module(tmp_data_dir):
    """The /health response must include code_version reflecting
    atocore.__version__, so deployment drift detection works."""
    from atocore import __version__

    client = TestClient(app)
    response = client.get("/health")

    assert response.status_code == 200
    body = response.json()
    assert body["version"] == __version__
    assert body["code_version"] == __version__


def test_health_endpoint_reports_build_metadata_from_env(tmp_data_dir, monkeypatch):
    """The /health response must include build_sha, build_time, and
    build_branch from the ATOCORE_BUILD_* env vars, so deploy.sh can
    detect precise drift via SHA comparison instead of relying on
    the coarse code_version field.

    Regression test for the codex finding from 2026-04-08:
    code_version 0.2.0 is too coarse to trust as a 'live is current'
    signal because it only changes on manual bumps. The build_sha
    field changes per commit and is set by deploy.sh.
    """
    monkeypatch.setenv("ATOCORE_BUILD_SHA", "abc1234567890fedcba0987654321")
    monkeypatch.setenv("ATOCORE_BUILD_TIME", "2026-04-09T01:23:45Z")
    monkeypatch.setenv("ATOCORE_BUILD_BRANCH", "main")

    client = TestClient(app)
    response = client.get("/health")

    assert response.status_code == 200
    body = response.json()
    assert body["build_sha"] == "abc1234567890fedcba0987654321"
    assert body["build_time"] == "2026-04-09T01:23:45Z"
    assert body["build_branch"] == "main"


def test_health_endpoint_reports_unknown_when_build_env_unset(tmp_data_dir, monkeypatch):
    """When deploy.sh hasn't set the build env vars (e.g. someone
    ran `docker compose up` directly), /health reports 'unknown'
    for all three build fields. This is a clear signal to the
    operator that the deploy provenance is missing and they should
    re-run via deploy.sh."""
    monkeypatch.delenv("ATOCORE_BUILD_SHA", raising=False)
    monkeypatch.delenv("ATOCORE_BUILD_TIME", raising=False)
    monkeypatch.delenv("ATOCORE_BUILD_BRANCH", raising=False)

    client = TestClient(app)
    response = client.get("/health")

    assert response.status_code == 200
    body = response.json()
    assert body["build_sha"] == "unknown"
    assert body["build_time"] == "unknown"
    assert body["build_branch"] == "unknown"


def test_projects_endpoint_reports_registered_projects(tmp_data_dir, monkeypatch):
    vault_dir = tmp_data_dir / "vault-source"
    drive_dir = tmp_data_dir / "drive-source"
    config_dir = tmp_data_dir / "config"
    project_dir = vault_dir / "incoming" / "projects" / "p04-gigabit"
    project_dir.mkdir(parents=True)
    drive_dir.mkdir()
    config_dir.mkdir()

    registry_path = config_dir / "project-registry.json"
    registry_path.write_text(
        """
{
  "projects": [
    {
      "id": "p04-gigabit",
      "aliases": ["p04"],
      "description": "P04 docs",
      "ingest_roots": [
        {"source": "vault", "subpath": "incoming/projects/p04-gigabit"}
      ]
    }
  ]
}
""".strip(),
        encoding="utf-8",
    )

    monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(vault_dir))
    monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(drive_dir))
    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
    config.settings = config.Settings()

    client = TestClient(app)
    response = client.get("/projects")

    assert response.status_code == 200
    body = response.json()
    assert body["projects"][0]["id"] == "p04-gigabit"
    assert body["projects"][0]["ingest_roots"][0]["exists"] is True


def test_project_refresh_endpoint_uses_registered_roots(tmp_data_dir, monkeypatch):
    vault_dir = tmp_data_dir / "vault-source"
    drive_dir = tmp_data_dir / "drive-source"
    config_dir = tmp_data_dir / "config"
    project_dir = vault_dir / "incoming" / "projects" / "p05-interferometer"
    project_dir.mkdir(parents=True)
    drive_dir.mkdir()
    config_dir.mkdir()

    registry_path = config_dir / "project-registry.json"
    registry_path.write_text(
        """
{
  "projects": [
    {
      "id": "p05-interferometer",
      "aliases": ["p05"],
      "description": "P05 docs",
      "ingest_roots": [
        {"source": "vault", "subpath": "incoming/projects/p05-interferometer"}
      ]
    }
  ]
}
""".strip(),
        encoding="utf-8",
    )

    calls = []

    def fake_refresh_registered_project(project_name, purge_deleted=False):
        calls.append((project_name, purge_deleted))
        return {
            "project": "p05-interferometer",
            "aliases": ["p05"],
            "description": "P05 docs",
            "purge_deleted": purge_deleted,
            "status": "ingested",
            "roots_ingested": 1,
            "roots_skipped": 0,
            "roots": [
                {
                    "source": "vault",
                    "subpath": "incoming/projects/p05-interferometer",
                    "path": str(project_dir),
                    "status": "ingested",
                    "results": [],
                }
            ],
        }

    monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(vault_dir))
    monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(drive_dir))
    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
    config.settings = config.Settings()
    monkeypatch.setattr("atocore.api.routes.refresh_registered_project", fake_refresh_registered_project)

    client = TestClient(app)
    response = client.post("/projects/p05/refresh")

    assert response.status_code == 200
    assert calls == [("p05", False)]
    assert response.json()["project"] == "p05-interferometer"


def test_project_refresh_endpoint_serializes_ingestion(tmp_data_dir, monkeypatch):
    config.settings = config.Settings()
    events = []

    @contextmanager
    def fake_lock():
        events.append("enter")
        try:
            yield
        finally:
            events.append("exit")

    def fake_refresh_registered_project(project_name, purge_deleted=False):
        events.append(("refresh", project_name, purge_deleted))
        return {
            "project": "p05-interferometer",
            "aliases": ["p05"],
            "description": "P05 docs",
            "purge_deleted": purge_deleted,
            "status": "nothing_to_ingest",
            "roots_ingested": 0,
            "roots_skipped": 0,
            "roots": [],
        }

    monkeypatch.setattr("atocore.api.routes.exclusive_ingestion", fake_lock)
    monkeypatch.setattr("atocore.api.routes.refresh_registered_project", fake_refresh_registered_project)

    client = TestClient(app)
    response = client.post("/projects/p05/refresh")

    assert response.status_code == 200
    assert events == ["enter", ("refresh", "p05", False), "exit"]


def test_projects_template_endpoint_returns_template(tmp_data_dir, monkeypatch):
    config.settings = config.Settings()

    client = TestClient(app)
    response = client.get("/projects/template")

    assert response.status_code == 200
    body = response.json()
    assert body["allowed_sources"] == ["vault", "drive"]
    assert body["template"]["projects"][0]["id"] == "p07-example"


def test_project_proposal_endpoint_returns_normalized_preview(tmp_data_dir, monkeypatch):
    vault_dir = tmp_data_dir / "vault-source"
    drive_dir = tmp_data_dir / "drive-source"
    config_dir = tmp_data_dir / "config"
    staged = vault_dir / "incoming" / "projects" / "p07-example"
    staged.mkdir(parents=True)
    drive_dir.mkdir()
    config_dir.mkdir()

    registry_path = config_dir / "project-registry.json"
    registry_path.write_text('{"projects": []}', encoding="utf-8")

    monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(vault_dir))
    monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(drive_dir))
    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
    config.settings = config.Settings()

    client = TestClient(app)
    response = client.post(
        "/projects/proposal",
        json={
            "project_id": "p07-example",
            "aliases": ["p07", "example-project", "p07"],
            "description": "Example project",
            "ingest_roots": [
                {
                    "source": "vault",
                    "subpath": "incoming/projects/p07-example",
                    "label": "Primary docs",
                }
            ],
        },
    )

    assert response.status_code == 200
    body = response.json()
    assert body["project"]["aliases"] == ["p07", "example-project"]
    assert body["resolved_ingest_roots"][0]["exists"] is True
    assert body["valid"] is True


def test_project_register_endpoint_persists_entry(tmp_data_dir, monkeypatch):
    vault_dir = tmp_data_dir / "vault-source"
    drive_dir = tmp_data_dir / "drive-source"
    config_dir = tmp_data_dir / "config"
    staged = vault_dir / "incoming" / "projects" / "p07-example"
    staged.mkdir(parents=True)
    drive_dir.mkdir()
    config_dir.mkdir()

    registry_path = config_dir / "project-registry.json"
    registry_path.write_text('{"projects": []}', encoding="utf-8")

    monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(vault_dir))
    monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(drive_dir))
    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
    config.settings = config.Settings()

    client = TestClient(app)
    response = client.post(
        "/projects/register",
        json={
            "project_id": "p07-example",
            "aliases": ["p07", "example-project"],
            "description": "Example project",
            "ingest_roots": [
                {
                    "source": "vault",
                    "subpath": "incoming/projects/p07-example",
                    "label": "Primary docs",
                }
            ],
        },
    )

    assert response.status_code == 200
    body = response.json()
    assert body["status"] == "registered"
    assert body["project"]["id"] == "p07-example"
    assert '"p07-example"' in registry_path.read_text(encoding="utf-8")


def test_project_register_endpoint_rejects_collisions(tmp_data_dir, monkeypatch):
    vault_dir = tmp_data_dir / "vault-source"
    drive_dir = tmp_data_dir / "drive-source"
    config_dir = tmp_data_dir / "config"
    vault_dir.mkdir()
    drive_dir.mkdir()
    config_dir.mkdir()

    registry_path = config_dir / "project-registry.json"
    registry_path.write_text(
        """
{
  "projects": [
    {
      "id": "p05-interferometer",
      "aliases": ["p05", "interferometer"],
      "ingest_roots": [
        {"source": "vault", "subpath": "incoming/projects/p05-interferometer"}
      ]
    }
  ]
}
""".strip(),
        encoding="utf-8",
    )

    monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(vault_dir))
    monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(drive_dir))
    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
    config.settings = config.Settings()

    client = TestClient(app)
    response = client.post(
        "/projects/register",
        json={
            "project_id": "p07-example",
            "aliases": ["interferometer"],
            "ingest_roots": [
                {
                    "source": "vault",
                    "subpath": "incoming/projects/p07-example",
                }
            ],
        },
    )

    assert response.status_code == 400
    assert "collisions" in response.json()["detail"]


def test_project_update_endpoint_persists_changes(tmp_data_dir, monkeypatch):
    vault_dir = tmp_data_dir / "vault-source"
    drive_dir = tmp_data_dir / "drive-source"
    config_dir = tmp_data_dir / "config"
    project_dir = vault_dir / "incoming" / "projects" / "p04-gigabit"
    project_dir.mkdir(parents=True)
    drive_dir.mkdir()
    config_dir.mkdir()

    registry_path = config_dir / "project-registry.json"
    registry_path.write_text(
        """
{
  "projects": [
    {
      "id": "p04-gigabit",
      "aliases": ["p04", "gigabit"],
      "description": "Old description",
      "ingest_roots": [
        {"source": "vault", "subpath": "incoming/projects/p04-gigabit"}
      ]
    }
  ]
}
""".strip(),
        encoding="utf-8",
    )

    monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(vault_dir))
    monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(drive_dir))
    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
    config.settings = config.Settings()

    client = TestClient(app)
    response = client.put(
        "/projects/p04",
        json={
            "aliases": ["p04", "gigabit", "gigabit-project"],
            "description": "Updated P04 docs",
        },
    )

    assert response.status_code == 200
    body = response.json()
    assert body["status"] == "updated"
    assert body["project"]["aliases"] == ["p04", "gigabit", "gigabit-project"]
    assert body["project"]["description"] == "Updated P04 docs"


def test_project_update_endpoint_rejects_collisions(tmp_data_dir, monkeypatch):
    vault_dir = tmp_data_dir / "vault-source"
    drive_dir = tmp_data_dir / "drive-source"
    config_dir = tmp_data_dir / "config"
    vault_dir.mkdir()
    drive_dir.mkdir()
    config_dir.mkdir()

    registry_path = config_dir / "project-registry.json"
    registry_path.write_text(
        """
{
  "projects": [
    {
      "id": "p04-gigabit",
      "aliases": ["p04", "gigabit"],
      "ingest_roots": [
        {"source": "vault", "subpath": "incoming/projects/p04-gigabit"}
      ]
    },
    {
      "id": "p05-interferometer",
      "aliases": ["p05", "interferometer"],
      "ingest_roots": [
        {"source": "vault", "subpath": "incoming/projects/p05-interferometer"}
      ]
    }
  ]
}
""".strip(),
        encoding="utf-8",
    )

    monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(vault_dir))
    monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(drive_dir))
    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
    config.settings = config.Settings()

    client = TestClient(app)
    response = client.put(
        "/projects/p04",
        json={
            "aliases": ["p04", "interferometer"],
        },
    )

    assert response.status_code == 400
    assert "collisions" in response.json()["detail"]


def test_admin_backup_create_without_chroma(tmp_data_dir, monkeypatch):
    config.settings = config.Settings()
    captured = {}

    def fake_create_runtime_backup(timestamp=None, include_chroma=False):
        captured["include_chroma"] = include_chroma
        return {
            "created_at": "2026-04-06T23:00:00+00:00",
            "backup_root": "/tmp/fake",
            "db_snapshot_path": "/tmp/fake/db/atocore.db",
            "db_size_bytes": 0,
            "registry_snapshot_path": "",
            "chroma_snapshot_path": "",
            "chroma_snapshot_bytes": 0,
            "chroma_snapshot_files": 0,
            "chroma_snapshot_included": False,
            "vector_store_note": "skipped",
        }

    monkeypatch.setattr("atocore.api.routes.create_runtime_backup", fake_create_runtime_backup)

    client = TestClient(app)
    response = client.post("/admin/backup", json={})

    assert response.status_code == 200
    assert captured == {"include_chroma": False}
    body = response.json()
    assert body["chroma_snapshot_included"] is False


def test_admin_backup_create_with_chroma_holds_lock(tmp_data_dir, monkeypatch):
    config.settings = config.Settings()
    events = []

    @contextmanager
    def fake_lock():
        events.append("enter")
        try:
            yield
        finally:
            events.append("exit")

    def fake_create_runtime_backup(timestamp=None, include_chroma=False):
|
||||
events.append(("backup", include_chroma))
|
||||
return {
|
||||
"created_at": "2026-04-06T23:30:00+00:00",
|
||||
"backup_root": "/tmp/fake",
|
||||
"db_snapshot_path": "/tmp/fake/db/atocore.db",
|
||||
"db_size_bytes": 0,
|
||||
"registry_snapshot_path": "",
|
||||
"chroma_snapshot_path": "/tmp/fake/chroma",
|
||||
"chroma_snapshot_bytes": 4,
|
||||
"chroma_snapshot_files": 1,
|
||||
"chroma_snapshot_included": True,
|
||||
"vector_store_note": "included",
|
||||
}
|
||||
|
||||
monkeypatch.setattr("atocore.api.routes.exclusive_ingestion", fake_lock)
|
||||
monkeypatch.setattr("atocore.api.routes.create_runtime_backup", fake_create_runtime_backup)
|
||||
|
||||
client = TestClient(app)
|
||||
response = client.post("/admin/backup", json={"include_chroma": True})
|
||||
|
||||
assert response.status_code == 200
|
||||
assert events == ["enter", ("backup", True), "exit"]
|
||||
assert response.json()["chroma_snapshot_included"] is True
|
||||
|
||||
|
||||
def test_admin_backup_list_and_validate_endpoints(tmp_data_dir, monkeypatch):
|
||||
config.settings = config.Settings()
|
||||
|
||||
def fake_list_runtime_backups():
|
||||
return [
|
||||
{
|
||||
"stamp": "20260406T220000Z",
|
||||
"path": "/tmp/fake/snapshots/20260406T220000Z",
|
||||
"has_metadata": True,
|
||||
"metadata": {"db_snapshot_path": "/tmp/fake/snapshots/20260406T220000Z/db/atocore.db"},
|
||||
}
|
||||
]
|
||||
|
||||
def fake_validate_backup(stamp):
|
||||
if stamp == "missing":
|
||||
return {
|
||||
"stamp": stamp,
|
||||
"path": f"/tmp/fake/snapshots/{stamp}",
|
||||
"exists": False,
|
||||
"errors": ["snapshot_directory_missing"],
|
||||
}
|
||||
return {
|
||||
"stamp": stamp,
|
||||
"path": f"/tmp/fake/snapshots/{stamp}",
|
||||
"exists": True,
|
||||
"db_ok": True,
|
||||
"registry_ok": True,
|
||||
"chroma_ok": None,
|
||||
"valid": True,
|
||||
"errors": [],
|
||||
}
|
||||
|
||||
monkeypatch.setattr("atocore.api.routes.list_runtime_backups", fake_list_runtime_backups)
|
||||
monkeypatch.setattr("atocore.api.routes.validate_backup", fake_validate_backup)
|
||||
|
||||
client = TestClient(app)
|
||||
|
||||
listing = client.get("/admin/backup")
|
||||
assert listing.status_code == 200
|
||||
listing_body = listing.json()
|
||||
assert "backup_dir" in listing_body
|
||||
assert listing_body["backups"][0]["stamp"] == "20260406T220000Z"
|
||||
|
||||
valid = client.get("/admin/backup/20260406T220000Z/validate")
|
||||
assert valid.status_code == 200
|
||||
assert valid.json()["valid"] is True
|
||||
|
||||
missing = client.get("/admin/backup/missing/validate")
|
||||
assert missing.status_code == 404
|
||||
|
||||
|
||||
def test_query_endpoint_accepts_project_hint(monkeypatch):
|
||||
def fake_retrieve(prompt, top_k=10, filter_tags=None, project_hint=None):
|
||||
assert prompt == "architecture"
|
||||
assert top_k == 3
|
||||
assert project_hint == "p04-gigabit"
|
||||
return []
|
||||
|
||||
monkeypatch.setattr("atocore.api.routes.retrieve", fake_retrieve)
|
||||
|
||||
client = TestClient(app)
|
||||
response = client.post(
|
||||
"/query",
|
||||
json={
|
||||
"prompt": "architecture",
|
||||
"top_k": 3,
|
||||
"project": "p04-gigabit",
|
||||
},
|
||||
)
|
||||
|
||||
assert response.status_code == 200
|
||||
assert response.json()["results"] == []
|
||||
313
tests/test_atocore_client.py
Normal file
@@ -0,0 +1,313 @@
"""Tests for scripts/atocore_client.py — the shared operator CLI.

Specifically covers the Phase 9 reflection-loop subcommands added
after codex's sequence-step-3 review: ``capture``, ``extract``,
``reinforce-interaction``, ``list-interactions``, ``get-interaction``,
``queue``, ``promote``, ``reject``.

The tests mock the client's ``request()`` helper and verify each
subcommand:

- calls the correct HTTP method and path
- builds the correct JSON body (or the correct query string)
- passes the right subset of CLI arguments through

This is the same "wiring test" shape used by tests/test_api_storage.py:
we don't exercise the live HTTP stack; we verify the client builds
the request correctly. The server side is already covered by its
own route tests.
"""

from __future__ import annotations

import json
import sys
from pathlib import Path

import pytest

# Make scripts/ importable
_REPO_ROOT = Path(__file__).resolve().parent.parent
sys.path.insert(0, str(_REPO_ROOT / "scripts"))

import atocore_client as client  # noqa: E402


# ---------------------------------------------------------------------------
# Request capture helper
# ---------------------------------------------------------------------------


class _RequestCapture:
    """Drop-in replacement for client.request() that records calls."""

    def __init__(self, response: dict | None = None):
        self.calls: list[dict] = []
        self._response = response if response is not None else {"ok": True}

    def __call__(self, method, path, data=None, timeout=None):
        self.calls.append(
            {"method": method, "path": path, "data": data, "timeout": timeout}
        )
        return self._response


@pytest.fixture
def capture_requests(monkeypatch):
    """Replace client.request with a recording stub and return it."""
    stub = _RequestCapture()
    monkeypatch.setattr(client, "request", stub)
    return stub


def _run_client(monkeypatch, argv: list[str]) -> int:
    """Simulate a CLI invocation with the given argv."""
    monkeypatch.setattr(sys, "argv", ["atocore_client.py", *argv])
    return client.main()


# ---------------------------------------------------------------------------
# capture
# ---------------------------------------------------------------------------


def test_capture_posts_to_interactions_endpoint(capture_requests, monkeypatch):
    _run_client(
        monkeypatch,
        [
            "capture",
            "what is p05's current focus",
            "The current focus is wave 2 operational ingestion.",
            "p05-interferometer",
            "claude-code-test",
            "session-abc",
        ],
    )
    assert len(capture_requests.calls) == 1
    call = capture_requests.calls[0]
    assert call["method"] == "POST"
    assert call["path"] == "/interactions"
    body = call["data"]
    assert body["prompt"] == "what is p05's current focus"
    assert body["response"].startswith("The current focus")
    assert body["project"] == "p05-interferometer"
    assert body["client"] == "claude-code-test"
    assert body["session_id"] == "session-abc"
    assert body["reinforce"] is True  # default


def test_capture_sets_default_client_when_omitted(capture_requests, monkeypatch):
    _run_client(
        monkeypatch,
        ["capture", "hi", "hello"],
    )
    call = capture_requests.calls[0]
    assert call["data"]["client"] == "atocore-client"
    assert call["data"]["project"] == ""
    assert call["data"]["reinforce"] is True


def test_capture_accepts_reinforce_false(capture_requests, monkeypatch):
    _run_client(
        monkeypatch,
        ["capture", "prompt", "response", "p05", "claude", "sess", "false"],
    )
    call = capture_requests.calls[0]
    assert call["data"]["reinforce"] is False


# ---------------------------------------------------------------------------
# extract
# ---------------------------------------------------------------------------


def test_extract_default_is_preview(capture_requests, monkeypatch):
    _run_client(monkeypatch, ["extract", "abc-123"])
    call = capture_requests.calls[0]
    assert call["method"] == "POST"
    assert call["path"] == "/interactions/abc-123/extract"
    assert call["data"] == {"persist": False}


def test_extract_persist_true(capture_requests, monkeypatch):
    _run_client(monkeypatch, ["extract", "abc-123", "true"])
    call = capture_requests.calls[0]
    assert call["data"] == {"persist": True}


def test_extract_url_encodes_interaction_id(capture_requests, monkeypatch):
    _run_client(monkeypatch, ["extract", "abc/def"])
    call = capture_requests.calls[0]
    assert call["path"] == "/interactions/abc%2Fdef/extract"


# ---------------------------------------------------------------------------
# reinforce-interaction
# ---------------------------------------------------------------------------


def test_reinforce_interaction_posts_to_correct_path(capture_requests, monkeypatch):
    _run_client(monkeypatch, ["reinforce-interaction", "int-xyz"])
    call = capture_requests.calls[0]
    assert call["method"] == "POST"
    assert call["path"] == "/interactions/int-xyz/reinforce"
    assert call["data"] == {}


# ---------------------------------------------------------------------------
# list-interactions
# ---------------------------------------------------------------------------


def test_list_interactions_no_filters(capture_requests, monkeypatch):
    _run_client(monkeypatch, ["list-interactions"])
    call = capture_requests.calls[0]
    assert call["method"] == "GET"
    assert call["path"] == "/interactions?limit=50"


def test_list_interactions_with_project_filter(capture_requests, monkeypatch):
    _run_client(monkeypatch, ["list-interactions", "p05-interferometer"])
    call = capture_requests.calls[0]
    assert "project=p05-interferometer" in call["path"]
    assert "limit=50" in call["path"]


def test_list_interactions_full_filter_set(capture_requests, monkeypatch):
    _run_client(
        monkeypatch,
        [
            "list-interactions",
            "p05",
            "sess-1",
            "claude-code",
            "2026-04-07T00:00:00Z",
            "20",
        ],
    )
    call = capture_requests.calls[0]
    path = call["path"]
    assert "project=p05" in path
    assert "session_id=sess-1" in path
    assert "client=claude-code" in path
    # since is URL-encoded — the : and + chars get escaped
    assert "since=2026-04-07" in path
    assert "limit=20" in path


# ---------------------------------------------------------------------------
# get-interaction
# ---------------------------------------------------------------------------


def test_get_interaction_fetches_by_id(capture_requests, monkeypatch):
    _run_client(monkeypatch, ["get-interaction", "int-42"])
    call = capture_requests.calls[0]
    assert call["method"] == "GET"
    assert call["path"] == "/interactions/int-42"


# ---------------------------------------------------------------------------
# queue
# ---------------------------------------------------------------------------


def test_queue_always_filters_by_candidate_status(capture_requests, monkeypatch):
    _run_client(monkeypatch, ["queue"])
    call = capture_requests.calls[0]
    assert call["method"] == "GET"
    assert call["path"].startswith("/memory?")
    assert "status=candidate" in call["path"]
    assert "limit=50" in call["path"]


def test_queue_with_memory_type_and_project(capture_requests, monkeypatch):
    _run_client(monkeypatch, ["queue", "adaptation", "p05-interferometer", "10"])
    call = capture_requests.calls[0]
    path = call["path"]
    assert "status=candidate" in path
    assert "memory_type=adaptation" in path
    assert "project=p05-interferometer" in path
    assert "limit=10" in path


def test_queue_limit_coercion(capture_requests, monkeypatch):
    """limit is typed as int by argparse, so the string '25' becomes 25."""
    _run_client(monkeypatch, ["queue", "", "", "25"])
    call = capture_requests.calls[0]
    assert "limit=25" in call["path"]


# ---------------------------------------------------------------------------
# promote / reject
# ---------------------------------------------------------------------------


def test_promote_posts_to_memory_promote_path(capture_requests, monkeypatch):
    _run_client(monkeypatch, ["promote", "mem-abc"])
    call = capture_requests.calls[0]
    assert call["method"] == "POST"
    assert call["path"] == "/memory/mem-abc/promote"
    assert call["data"] == {}


def test_reject_posts_to_memory_reject_path(capture_requests, monkeypatch):
    _run_client(monkeypatch, ["reject", "mem-xyz"])
    call = capture_requests.calls[0]
    assert call["method"] == "POST"
    assert call["path"] == "/memory/mem-xyz/reject"
    assert call["data"] == {}


def test_promote_url_encodes_memory_id(capture_requests, monkeypatch):
    _run_client(monkeypatch, ["promote", "mem/with/slashes"])
    call = capture_requests.calls[0]
    assert "mem%2Fwith%2Fslashes" in call["path"]


# ---------------------------------------------------------------------------
# end-to-end: ensure the Phase 9 loop can be driven entirely through
# the client
# ---------------------------------------------------------------------------


def test_phase9_full_loop_via_client_shape(capture_requests, monkeypatch):
    """Simulate the full capture -> extract -> queue -> promote cycle.

    This doesn't exercise real HTTP — each call is intercepted by
    the mock request. But it proves every step of the Phase 9 loop
    is reachable through the shared client, which is the whole point
    of the codex-step-3 work.
    """
    # Step 1: capture
    _run_client(
        monkeypatch,
        [
            "capture",
            "what about GF-PTFE for lateral support",
            "## Decision: use GF-PTFE pads for thermal stability",
            "p05-interferometer",
        ],
    )
    # Step 2: extract candidates (preview)
    _run_client(monkeypatch, ["extract", "fake-interaction-id"])
    # Step 3: extract and persist
    _run_client(monkeypatch, ["extract", "fake-interaction-id", "true"])
    # Step 4: list the review queue
    _run_client(monkeypatch, ["queue"])
    # Step 5: promote a candidate
    _run_client(monkeypatch, ["promote", "fake-memory-id"])
    # Step 6: reject another
    _run_client(monkeypatch, ["reject", "fake-memory-id-2"])

    methods_and_paths = [
        (c["method"], c["path"]) for c in capture_requests.calls
    ]
    assert methods_and_paths == [
        ("POST", "/interactions"),
        ("POST", "/interactions/fake-interaction-id/extract"),
        ("POST", "/interactions/fake-interaction-id/extract"),
        ("GET", "/memory?status=candidate&limit=50"),
        ("POST", "/memory/fake-memory-id/promote"),
        ("POST", "/memory/fake-memory-id-2/reject"),
    ]
690
tests/test_backup.py
Normal file
@@ -0,0 +1,690 @@
"""Tests for runtime backup creation, restore, and retention cleanup."""
|
||||
|
||||
import json
|
||||
import sqlite3
|
||||
from datetime import UTC, datetime, timedelta
|
||||
|
||||
import pytest
|
||||
|
||||
import atocore.config as config
|
||||
from atocore.models.database import init_db
|
||||
from atocore.ops.backup import (
|
||||
cleanup_old_backups,
|
||||
create_runtime_backup,
|
||||
list_runtime_backups,
|
||||
restore_runtime_backup,
|
||||
validate_backup,
|
||||
)
|
||||
|
||||
|
||||
def test_create_runtime_backup_copies_db_and_registry(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
|
||||
monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
|
||||
monkeypatch.setenv(
|
||||
"ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
|
||||
)
|
||||
|
||||
registry_path = tmp_path / "config" / "project-registry.json"
|
||||
registry_path.parent.mkdir(parents=True)
|
||||
registry_path.write_text('{"projects":[{"id":"p01-example","aliases":[],"ingest_roots":[{"source":"vault","subpath":"incoming/projects/p01-example"}]}]}\n', encoding="utf-8")
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
init_db()
|
||||
with sqlite3.connect(str(config.settings.db_path)) as conn:
|
||||
conn.execute("INSERT INTO projects (id, name) VALUES (?, ?)", ("p01", "P01 Example"))
|
||||
conn.commit()
|
||||
|
||||
result = create_runtime_backup(datetime(2026, 4, 6, 18, 0, 0, tzinfo=UTC))
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
db_snapshot = tmp_path / "backups" / "snapshots" / "20260406T180000Z" / "db" / "atocore.db"
|
||||
registry_snapshot = (
|
||||
tmp_path / "backups" / "snapshots" / "20260406T180000Z" / "config" / "project-registry.json"
|
||||
)
|
||||
metadata_path = (
|
||||
tmp_path / "backups" / "snapshots" / "20260406T180000Z" / "backup-metadata.json"
|
||||
)
|
||||
|
||||
assert result["db_snapshot_path"] == str(db_snapshot)
|
||||
assert db_snapshot.exists()
|
||||
assert registry_snapshot.exists()
|
||||
assert metadata_path.exists()
|
||||
|
||||
with sqlite3.connect(str(db_snapshot)) as conn:
|
||||
row = conn.execute("SELECT name FROM projects WHERE id = ?", ("p01",)).fetchone()
|
||||
assert row[0] == "P01 Example"
|
||||
|
||||
metadata = json.loads(metadata_path.read_text(encoding="utf-8"))
|
||||
assert metadata["registry_snapshot_path"] == str(registry_snapshot)
|
||||
|
||||
|
||||
def test_create_runtime_backup_includes_chroma_when_requested(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
|
||||
monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
|
||||
monkeypatch.setenv(
|
||||
"ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
|
||||
)
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
init_db()
|
||||
|
||||
# Create a fake chroma directory tree with a couple of files.
|
||||
chroma_dir = config.settings.chroma_path
|
||||
(chroma_dir / "collection-a").mkdir(parents=True, exist_ok=True)
|
||||
(chroma_dir / "collection-a" / "data.bin").write_bytes(b"\x00\x01\x02\x03")
|
||||
(chroma_dir / "metadata.json").write_text('{"ok":true}', encoding="utf-8")
|
||||
|
||||
result = create_runtime_backup(
|
||||
datetime(2026, 4, 6, 20, 0, 0, tzinfo=UTC),
|
||||
include_chroma=True,
|
||||
)
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
chroma_snapshot_root = (
|
||||
tmp_path / "backups" / "snapshots" / "20260406T200000Z" / "chroma"
|
||||
)
|
||||
assert result["chroma_snapshot_included"] is True
|
||||
assert result["chroma_snapshot_path"] == str(chroma_snapshot_root)
|
||||
assert result["chroma_snapshot_files"] >= 2
|
||||
assert result["chroma_snapshot_bytes"] > 0
|
||||
assert (chroma_snapshot_root / "collection-a" / "data.bin").exists()
|
||||
assert (chroma_snapshot_root / "metadata.json").exists()
|
||||
|
||||
|
||||
def test_list_and_validate_runtime_backups(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
|
||||
monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
|
||||
monkeypatch.setenv(
|
||||
"ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
|
||||
)
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
init_db()
|
||||
first = create_runtime_backup(datetime(2026, 4, 6, 21, 0, 0, tzinfo=UTC))
|
||||
second = create_runtime_backup(datetime(2026, 4, 6, 22, 0, 0, tzinfo=UTC))
|
||||
|
||||
listing = list_runtime_backups()
|
||||
first_validation = validate_backup("20260406T210000Z")
|
||||
second_validation = validate_backup("20260406T220000Z")
|
||||
missing_validation = validate_backup("20260101T000000Z")
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
assert len(listing) == 2
|
||||
assert {entry["stamp"] for entry in listing} == {
|
||||
"20260406T210000Z",
|
||||
"20260406T220000Z",
|
||||
}
|
||||
for entry in listing:
|
||||
assert entry["has_metadata"] is True
|
||||
assert entry["metadata"]["db_snapshot_path"]
|
||||
|
||||
assert first_validation["valid"] is True
|
||||
assert first_validation["db_ok"] is True
|
||||
assert first_validation["errors"] == []
|
||||
|
||||
assert second_validation["valid"] is True
|
||||
|
||||
assert missing_validation["exists"] is False
|
||||
assert "snapshot_directory_missing" in missing_validation["errors"]
|
||||
|
||||
# both metadata paths are reachable on disk
|
||||
assert json.loads(
|
||||
(tmp_path / "backups" / "snapshots" / "20260406T210000Z" / "backup-metadata.json")
|
||||
.read_text(encoding="utf-8")
|
||||
)["db_snapshot_path"] == first["db_snapshot_path"]
|
||||
assert second["db_snapshot_path"].endswith("atocore.db")
|
||||
|
||||
|
||||
def test_create_runtime_backup_handles_missing_registry(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
|
||||
monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
|
||||
monkeypatch.setenv(
|
||||
"ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
|
||||
)
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
init_db()
|
||||
result = create_runtime_backup(datetime(2026, 4, 6, 19, 0, 0, tzinfo=UTC))
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
assert result["registry_snapshot_path"] == ""
|
||||
|
||||
|
||||
def test_restore_refuses_without_confirm_service_stopped(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
|
||||
monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
|
||||
monkeypatch.setenv(
|
||||
"ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
|
||||
)
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
init_db()
|
||||
create_runtime_backup(datetime(2026, 4, 9, 10, 0, 0, tzinfo=UTC))
|
||||
|
||||
with pytest.raises(RuntimeError, match="confirm_service_stopped"):
|
||||
restore_runtime_backup("20260409T100000Z")
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
|
||||
def test_restore_raises_on_invalid_backup(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
|
||||
monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
|
||||
monkeypatch.setenv(
|
||||
"ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
|
||||
)
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
init_db()
|
||||
with pytest.raises(RuntimeError, match="failed validation"):
|
||||
restore_runtime_backup(
|
||||
"20250101T000000Z", confirm_service_stopped=True
|
||||
)
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
|
||||
def test_restore_round_trip_reverses_post_backup_mutations(tmp_path, monkeypatch):
|
||||
"""Canonical drill: snapshot -> mutate -> restore -> mutation gone."""
|
||||
monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
|
||||
monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
|
||||
monkeypatch.setenv(
|
||||
"ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
|
||||
)
|
||||
|
||||
registry_path = tmp_path / "config" / "project-registry.json"
|
||||
registry_path.parent.mkdir(parents=True)
|
||||
registry_path.write_text(
|
||||
'{"projects":[{"id":"p01-example","aliases":[],'
|
||||
'"ingest_roots":[{"source":"vault","subpath":"incoming/projects/p01-example"}]}]}\n',
|
||||
encoding="utf-8",
|
||||
)
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
init_db()
|
||||
|
||||
# 1. Seed baseline state that should SURVIVE the restore.
|
||||
with sqlite3.connect(str(config.settings.db_path)) as conn:
|
||||
conn.execute(
|
||||
"INSERT INTO projects (id, name) VALUES (?, ?)",
|
||||
("p01", "Baseline Project"),
|
||||
)
|
||||
conn.commit()
|
||||
|
||||
# 2. Create the backup we're going to restore to.
|
||||
create_runtime_backup(datetime(2026, 4, 9, 11, 0, 0, tzinfo=UTC))
|
||||
stamp = "20260409T110000Z"
|
||||
|
||||
# 3. Mutate live state AFTER the backup — this is what the
|
||||
# restore should reverse.
|
||||
with sqlite3.connect(str(config.settings.db_path)) as conn:
|
||||
conn.execute(
|
||||
"INSERT INTO projects (id, name) VALUES (?, ?)",
|
||||
("p99", "Post Backup Mutation"),
|
||||
)
|
||||
conn.commit()
|
||||
|
||||
# Confirm the mutation is present before restore.
|
||||
with sqlite3.connect(str(config.settings.db_path)) as conn:
|
||||
row = conn.execute(
|
||||
"SELECT name FROM projects WHERE id = ?", ("p99",)
|
||||
).fetchone()
|
||||
assert row is not None and row[0] == "Post Backup Mutation"
|
||||
|
||||
# 4. Restore — the drill procedure. Explicit confirm_service_stopped.
|
||||
result = restore_runtime_backup(
|
||||
stamp, confirm_service_stopped=True
|
||||
)
|
||||
|
||||
# 5. Verify restore report
|
||||
assert result["stamp"] == stamp
|
||||
assert result["db_restored"] is True
|
||||
assert result["registry_restored"] is True
|
||||
assert result["restored_integrity_ok"] is True
|
||||
assert result["pre_restore_snapshot"] is not None
|
||||
|
||||
# 6. Verify live state reflects the restore: baseline survived,
|
||||
# post-backup mutation is gone.
|
||||
with sqlite3.connect(str(config.settings.db_path)) as conn:
|
||||
baseline = conn.execute(
|
||||
"SELECT name FROM projects WHERE id = ?", ("p01",)
|
||||
).fetchone()
|
||||
mutation = conn.execute(
|
||||
"SELECT name FROM projects WHERE id = ?", ("p99",)
|
||||
).fetchone()
|
||||
assert baseline is not None and baseline[0] == "Baseline Project"
|
||||
assert mutation is None
|
||||
|
||||
# 7. Pre-restore safety snapshot DOES contain the mutation —
|
||||
# it captured current state before overwriting. This is the
|
||||
# reversibility guarantee: the operator can restore back to
|
||||
# it if the restore itself was a mistake.
|
||||
pre_stamp = result["pre_restore_snapshot"]
|
||||
pre_validation = validate_backup(pre_stamp)
|
||||
assert pre_validation["valid"] is True
|
||||
pre_db_path = pre_validation["metadata"]["db_snapshot_path"]
|
||||
with sqlite3.connect(pre_db_path) as conn:
|
||||
pre_mutation = conn.execute(
|
||||
"SELECT name FROM projects WHERE id = ?", ("p99",)
|
||||
).fetchone()
|
||||
assert pre_mutation is not None and pre_mutation[0] == "Post Backup Mutation"
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
|
||||
def test_restore_round_trip_with_chroma(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
|
||||
monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
|
||||
monkeypatch.setenv(
|
||||
"ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
|
||||
)
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
init_db()
|
||||
|
||||
# Seed baseline chroma state that should survive restore.
|
||||
chroma_dir = config.settings.chroma_path
|
||||
(chroma_dir / "coll-a").mkdir(parents=True, exist_ok=True)
|
||||
(chroma_dir / "coll-a" / "baseline.bin").write_bytes(b"baseline")
|
||||
|
||||
create_runtime_backup(
|
||||
datetime(2026, 4, 9, 12, 0, 0, tzinfo=UTC), include_chroma=True
|
||||
)
|
||||
stamp = "20260409T120000Z"
|
||||
|
||||
# Mutate chroma after backup: add a file + remove baseline.
|
||||
(chroma_dir / "coll-a" / "post_backup.bin").write_bytes(b"post")
|
||||
(chroma_dir / "coll-a" / "baseline.bin").unlink()
|
||||
|
||||
result = restore_runtime_backup(
|
||||
stamp, confirm_service_stopped=True
|
||||
)
|
||||
|
||||
assert result["chroma_restored"] is True
|
||||
assert (chroma_dir / "coll-a" / "baseline.bin").exists()
|
||||
assert not (chroma_dir / "coll-a" / "post_backup.bin").exists()
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
|
||||
def test_restore_chroma_does_not_unlink_destination_directory(tmp_path, monkeypatch):
    """Regression: restore must not rmtree the chroma dir itself.

    In a Dockerized deployment the chroma dir is a bind-mounted
    volume. Calling shutil.rmtree on a mount point raises
    ``OSError [Errno 16] Device or resource busy``, which broke the
    first real Dalidou drill on 2026-04-09. The fix clears the
    directory's CONTENTS and copytree(dirs_exist_ok=True) into it,
    keeping the directory inode (and any bind mount) intact.

    This test captures the inode of the destination directory before
    and after restore and asserts they match — that's what a
    bind-mounted chroma dir would also see.
    """
    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
    monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
    monkeypatch.setenv(
        "ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
    )

    original_settings = config.settings
    try:
        config.settings = config.Settings()
        init_db()

        chroma_dir = config.settings.chroma_path
        (chroma_dir / "coll-a").mkdir(parents=True, exist_ok=True)
        (chroma_dir / "coll-a" / "baseline.bin").write_bytes(b"baseline")

        create_runtime_backup(
            datetime(2026, 4, 9, 15, 0, 0, tzinfo=UTC), include_chroma=True
        )

        # Capture the destination directory's stat signature before restore.
        chroma_stat_before = chroma_dir.stat()

        # Add a file post-backup so restore has work to do.
        (chroma_dir / "coll-a" / "post_backup.bin").write_bytes(b"post")

        restore_runtime_backup(
            "20260409T150000Z", confirm_service_stopped=True
        )

        # Directory still exists (would have failed on mount point) and
        # its st_ino matches — the mount itself wasn't unlinked.
        assert chroma_dir.exists()
        chroma_stat_after = chroma_dir.stat()
        assert chroma_stat_before.st_ino == chroma_stat_after.st_ino, (
            "chroma directory inode changed — restore recreated the "
            "directory instead of clearing its contents; this would "
            "fail on a Docker bind-mounted volume"
        )
        # And the contents did actually get restored.
        assert (chroma_dir / "coll-a" / "baseline.bin").exists()
        assert not (chroma_dir / "coll-a" / "post_backup.bin").exists()
    finally:
        config.settings = original_settings

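The regression above hinges on clearing the destination's contents rather than removing the directory itself. A minimal sketch of that pattern, with a hypothetical helper name (`restore_dir_contents` is not the actual AtoCore restore code):

```python
import shutil
from pathlib import Path


def restore_dir_contents(src: Path, dest: Path) -> None:
    """Replace dest's contents with src's without unlinking dest itself.

    Removing entries one by one keeps dest's inode (and any bind
    mount on it) intact, unlike shutil.rmtree(dest) + copytree,
    which would fail with EBUSY on a Docker bind-mounted volume.
    """
    for entry in dest.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    # dirs_exist_ok=True (Python 3.8+) copies into the existing dir.
    shutil.copytree(src, dest, dirs_exist_ok=True)
```

The key design choice is that `dest` is only ever read and written into, never unlinked, so its `st_ino` is the same before and after.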
def test_restore_skips_pre_snapshot_when_requested(tmp_path, monkeypatch):
    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
    monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
    monkeypatch.setenv(
        "ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
    )

    original_settings = config.settings
    try:
        config.settings = config.Settings()
        init_db()
        create_runtime_backup(datetime(2026, 4, 9, 13, 0, 0, tzinfo=UTC))

        before_count = len(list_runtime_backups())

        result = restore_runtime_backup(
            "20260409T130000Z",
            confirm_service_stopped=True,
            pre_restore_snapshot=False,
        )

        after_count = len(list_runtime_backups())
        assert result["pre_restore_snapshot"] is None
        assert after_count == before_count
    finally:
        config.settings = original_settings

def test_create_backup_includes_validation_fields(tmp_path, monkeypatch):
    """Task B: create_runtime_backup auto-validates and reports result."""
    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
    monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
    monkeypatch.setenv(
        "ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
    )

    original_settings = config.settings
    try:
        config.settings = config.Settings()
        init_db()
        result = create_runtime_backup(datetime(2026, 4, 11, 10, 0, 0, tzinfo=UTC))
    finally:
        config.settings = original_settings

    assert "validated" in result
    assert "validation_errors" in result
    assert result["validated"] is True
    assert result["validation_errors"] == []

def test_create_backup_validation_failure_does_not_raise(tmp_path, monkeypatch):
    """Task B: if post-backup validation fails, backup still returns metadata."""
    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
    monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
    monkeypatch.setenv(
        "ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
    )

    def _broken_validate(stamp):
        return {"valid": False, "errors": ["db_missing", "metadata_missing"]}

    original_settings = config.settings
    try:
        config.settings = config.Settings()
        init_db()
        monkeypatch.setattr("atocore.ops.backup.validate_backup", _broken_validate)
        result = create_runtime_backup(datetime(2026, 4, 11, 11, 0, 0, tzinfo=UTC))
    finally:
        config.settings = original_settings

    # Should NOT have raised — backup still returned metadata
    assert result["validated"] is False
    assert result["validation_errors"] == ["db_missing", "metadata_missing"]
    # Core backup fields still present
    assert "db_snapshot_path" in result
    assert "created_at" in result

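The two tests above pin down a contract: validation runs after every backup, and a failed validation is folded into the returned metadata instead of raising. A sketch of that wrapper shape, assuming a `stamp` key for illustration (only the `validated` / `validation_errors` field names come from the tests; the rest is hypothetical):

```python
def attach_validation(metadata: dict, validate) -> dict:
    """Run post-backup validation and fold the result into metadata.

    Never raises: a failing (or crashing) validator is reported via
    the ``validated`` / ``validation_errors`` fields that callers
    can inspect, so the backup itself is never lost.
    """
    try:
        report = validate(metadata.get("stamp"))
    except Exception as exc:  # a crashing validator must not kill the backup
        report = {"valid": False, "errors": [f"validator_error: {exc}"]}
    metadata["validated"] = bool(report.get("valid"))
    metadata["validation_errors"] = list(report.get("errors", []))
    return metadata
```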
def test_restore_cleans_stale_wal_sidecars(tmp_path, monkeypatch):
    """Stale WAL/SHM sidecars must not carry bytes past the restore.

    Note: after restore runs, PRAGMA integrity_check reopens the
    restored db which may legitimately recreate a fresh -wal. So we
    assert that the STALE byte marker no longer appears in either
    sidecar, not that the files are absent.
    """
    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
    monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
    monkeypatch.setenv(
        "ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
    )

    original_settings = config.settings
    try:
        config.settings = config.Settings()
        init_db()
        create_runtime_backup(datetime(2026, 4, 9, 14, 0, 0, tzinfo=UTC))

        # Write fake stale WAL/SHM next to the live db with an
        # unmistakable marker.
        target_db = config.settings.db_path
        wal = target_db.with_name(target_db.name + "-wal")
        shm = target_db.with_name(target_db.name + "-shm")
        stale_marker = b"STALE-SIDECAR-MARKER-DO-NOT-SURVIVE"
        wal.write_bytes(stale_marker)
        shm.write_bytes(stale_marker)
        assert wal.exists() and shm.exists()

        restore_runtime_backup(
            "20260409T140000Z", confirm_service_stopped=True
        )

        # The restored db must pass integrity check (tested elsewhere);
        # here we just confirm that no file next to it still contains
        # the stale marker from the old live process.
        for sidecar in (wal, shm):
            if sidecar.exists():
                assert stale_marker not in sidecar.read_bytes(), (
                    f"{sidecar.name} still carries stale marker"
                )
    finally:
        config.settings = original_settings

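The docstring above explains why the sidecars matter: leftover `-wal` bytes from the old live process could otherwise be replayed into the freshly restored database. A minimal sketch of the cleanup step, with a hypothetical helper name (not the actual restore implementation):

```python
from pathlib import Path


def remove_stale_sidecars(db_path: Path) -> list[str]:
    """Delete leftover SQLite -wal/-shm files next to db_path.

    Intended to run before copying a restored database into place,
    so stale write-ahead-log pages cannot be replayed into it.
    Returns the names of the files that were removed.
    """
    removed = []
    for suffix in ("-wal", "-shm"):
        sidecar = db_path.with_name(db_path.name + suffix)
        if sidecar.exists():
            sidecar.unlink()
            removed.append(sidecar.name)
    return removed
```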
# ---------------------------------------------------------------------------
# Task C: Backup retention cleanup
# ---------------------------------------------------------------------------


def _setup_cleanup_env(tmp_path, monkeypatch):
    """Helper: configure env, init db, return snapshots_root."""
    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
    monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
    monkeypatch.setenv(
        "ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
    )
    original = config.settings
    config.settings = config.Settings()
    init_db()
    snapshots_root = config.settings.resolved_backup_dir / "snapshots"
    snapshots_root.mkdir(parents=True, exist_ok=True)
    return original, snapshots_root


def _seed_snapshots(snapshots_root, dates):
    """Create minimal valid snapshot dirs for the given datetimes."""
    for dt in dates:
        stamp = dt.strftime("%Y%m%dT%H%M%SZ")
        snap_dir = snapshots_root / stamp
        db_dir = snap_dir / "db"
        db_dir.mkdir(parents=True, exist_ok=True)
        db_path = db_dir / "atocore.db"
        conn = sqlite3.connect(str(db_path))
        conn.execute("CREATE TABLE IF NOT EXISTS _marker (id INTEGER)")
        conn.close()
        metadata = {
            "created_at": dt.isoformat(),
            "backup_root": str(snap_dir),
            "db_snapshot_path": str(db_path),
            "db_size_bytes": db_path.stat().st_size,
            "registry_snapshot_path": "",
            "chroma_snapshot_path": "",
            "chroma_snapshot_bytes": 0,
            "chroma_snapshot_files": 0,
            "chroma_snapshot_included": False,
            "vector_store_note": "",
        }
        (snap_dir / "backup-metadata.json").write_text(
            json.dumps(metadata, indent=2) + "\n", encoding="utf-8"
        )

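The tests that follow exercise a daily/weekly/monthly retention policy: the newest 7 days are kept, older Sundays are kept as weeklies, and older 1sts-of-the-month are kept as monthlies. A sketch of that classification under those assumptions (the real `cleanup_old_backups` may differ in details such as one-snapshot-per-day deduplication):

```python
from datetime import datetime


def classify_keep(stamps: list[datetime]) -> set[datetime]:
    """Return the snapshots to keep under a 7-daily/Sunday/1st policy.

    Keeps the 7 newest snapshots as dailies, then any older snapshot
    that falls on a Sunday (weekly) or on the 1st of a month (monthly).
    Everything else is a deletion candidate.
    """
    ordered = sorted(stamps, reverse=True)
    keep = set(ordered[:7])  # newest 7 kept as daily
    for dt in ordered[7:]:
        if dt.weekday() == 6 or dt.day == 1:  # Sunday weekly / 1st monthly
            keep.add(dt)
    return keep
```

Running this against the seeded dates in the tests below reproduces their expected kept/deleted counts.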
def test_cleanup_empty_dir(tmp_path, monkeypatch):
    original, _ = _setup_cleanup_env(tmp_path, monkeypatch)
    try:
        result = cleanup_old_backups()
        assert result["kept"] == 0
        assert result["would_delete"] == 0
        assert result["dry_run"] is True
    finally:
        config.settings = original


def test_cleanup_dry_run_identifies_old_snapshots(tmp_path, monkeypatch):
    original, snapshots_root = _setup_cleanup_env(tmp_path, monkeypatch)
    try:
        # 10 daily snapshots Apr 2-11 (avoiding Apr 1 which is monthly).
        base = datetime(2026, 4, 2, 12, 0, 0, tzinfo=UTC)
        dates = [base + timedelta(days=i) for i in range(10)]
        _seed_snapshots(snapshots_root, dates)

        result = cleanup_old_backups()
        assert result["dry_run"] is True
        # The newest 7 days (Apr 5-11) are kept as daily. Apr 5 is a
        # Sunday, but it is already counted in the daily set. The
        # remaining Apr 2, 3, 4 are neither Sundays nor 1sts of a
        # month, so all 3 are flagged for deletion.
        assert result["kept"] == 7
        assert result["would_delete"] == 3
        assert len(list(snapshots_root.iterdir())) == 10
    finally:
        config.settings = original


def test_cleanup_confirm_deletes(tmp_path, monkeypatch):
    original, snapshots_root = _setup_cleanup_env(tmp_path, monkeypatch)
    try:
        base = datetime(2026, 4, 2, 12, 0, 0, tzinfo=UTC)
        dates = [base + timedelta(days=i) for i in range(10)]
        _seed_snapshots(snapshots_root, dates)

        result = cleanup_old_backups(confirm=True)
        assert result["dry_run"] is False
        assert result["deleted"] == 3
        assert result["kept"] == 7
        assert len(list(snapshots_root.iterdir())) == 7
    finally:
        config.settings = original


def test_cleanup_keeps_last_7_daily(tmp_path, monkeypatch):
    """Exactly 7 snapshots on different days → all kept."""
    original, snapshots_root = _setup_cleanup_env(tmp_path, monkeypatch)
    try:
        base = datetime(2026, 4, 5, 12, 0, 0, tzinfo=UTC)
        dates = [base + timedelta(days=i) for i in range(7)]
        _seed_snapshots(snapshots_root, dates)

        result = cleanup_old_backups()
        assert result["kept"] == 7
        assert result["would_delete"] == 0
    finally:
        config.settings = original


def test_cleanup_keeps_sunday_weekly(tmp_path, monkeypatch):
    """Snapshots on Sundays outside the 7-day window are kept as weekly."""
    original, snapshots_root = _setup_cleanup_env(tmp_path, monkeypatch)
    try:
        # 7 daily snapshots covering Apr 5-11
        base = datetime(2026, 4, 5, 12, 0, 0, tzinfo=UTC)
        daily = [base + timedelta(days=i) for i in range(7)]

        # 2 older Sunday snapshots
        sun1 = datetime(2026, 3, 29, 12, 0, 0, tzinfo=UTC)  # Sunday
        sun2 = datetime(2026, 3, 22, 12, 0, 0, tzinfo=UTC)  # Sunday
        # A non-Sunday old snapshot that should be deleted
        wed = datetime(2026, 3, 25, 12, 0, 0, tzinfo=UTC)  # Wednesday

        _seed_snapshots(snapshots_root, daily + [sun1, sun2, wed])

        result = cleanup_old_backups()
        # 7 daily + 2 Sunday weekly = 9 kept, 1 Wednesday deleted
        assert result["kept"] == 9
        assert result["would_delete"] == 1
    finally:
        config.settings = original


def test_cleanup_keeps_monthly_first(tmp_path, monkeypatch):
    """Snapshots on the 1st of a month outside daily+weekly are kept as monthly."""
    original, snapshots_root = _setup_cleanup_env(tmp_path, monkeypatch)
    try:
        # 7 daily in April 2026
        base = datetime(2026, 4, 5, 12, 0, 0, tzinfo=UTC)
        daily = [base + timedelta(days=i) for i in range(7)]

        # Old monthly 1st snapshots
        m1 = datetime(2026, 1, 1, 12, 0, 0, tzinfo=UTC)
        m2 = datetime(2025, 12, 1, 12, 0, 0, tzinfo=UTC)
        # Old non-1st, non-Sunday snapshot — should be deleted
        old = datetime(2026, 1, 15, 12, 0, 0, tzinfo=UTC)

        _seed_snapshots(snapshots_root, daily + [m1, m2, old])

        result = cleanup_old_backups()
        # 7 daily + 2 monthly = 9 kept, 1 deleted
        assert result["kept"] == 9
        assert result["would_delete"] == 1
    finally:
        config.settings = original


def test_cleanup_unparseable_stamp_skipped(tmp_path, monkeypatch):
    """Directories with unparseable names are ignored, not deleted."""
    original, snapshots_root = _setup_cleanup_env(tmp_path, monkeypatch)
    try:
        base = datetime(2026, 4, 5, 12, 0, 0, tzinfo=UTC)
        _seed_snapshots(snapshots_root, [base])

        bad_dir = snapshots_root / "not-a-timestamp"
        bad_dir.mkdir()

        result = cleanup_old_backups(confirm=True)
        assert result.get("unparseable") == ["not-a-timestamp"]
        assert bad_dir.exists()
        assert result["kept"] == 1
    finally:
        config.settings = original
249 tests/test_capture_stop.py Normal file
@@ -0,0 +1,249 @@
"""Tests for deploy/hooks/capture_stop.py — Claude Code Stop hook."""

from __future__ import annotations

import json
import os
import sys
import tempfile
import textwrap
from io import StringIO
from pathlib import Path
from unittest import mock

import pytest

# The hook script lives outside of the normal package tree, so import
# it by manipulating sys.path.
_HOOK_DIR = str(Path(__file__).resolve().parent.parent / "deploy" / "hooks")
if _HOOK_DIR not in sys.path:
    sys.path.insert(0, _HOOK_DIR)

import capture_stop  # noqa: E402


# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------

def _write_transcript(tmp: Path, entries: list[dict]) -> str:
    """Write a JSONL transcript and return the path."""
    path = tmp / "transcript.jsonl"
    with open(path, "w", encoding="utf-8") as f:
        for entry in entries:
            f.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return str(path)


def _user_entry(content: str, *, is_meta: bool = False) -> dict:
    return {
        "type": "user",
        "isMeta": is_meta,
        "message": {"role": "user", "content": content},
    }


def _assistant_entry() -> dict:
    return {
        "type": "assistant",
        "message": {
            "role": "assistant",
            "content": [{"type": "text", "text": "Sure, here's the answer."}],
        },
    }


def _system_entry() -> dict:
    return {"type": "system", "message": {"role": "system", "content": "system init"}}


# ---------------------------------------------------------------------------
# _extract_last_user_prompt
# ---------------------------------------------------------------------------

class TestExtractLastUserPrompt:
    def test_returns_last_real_prompt(self, tmp_path):
        path = _write_transcript(tmp_path, [
            _user_entry("First prompt that is long enough to capture"),
            _assistant_entry(),
            _user_entry("Second prompt that should be the one we capture"),
            _assistant_entry(),
        ])
        result = capture_stop._extract_last_user_prompt(path)
        assert result == "Second prompt that should be the one we capture"

    def test_skips_meta_messages(self, tmp_path):
        path = _write_transcript(tmp_path, [
            _user_entry("Real prompt that is definitely long enough"),
            _user_entry("<local-command>some system stuff</local-command>"),
            _user_entry("Meta message that looks real enough", is_meta=True),
        ])
        result = capture_stop._extract_last_user_prompt(path)
        assert result == "Real prompt that is definitely long enough"

    def test_skips_xml_content(self, tmp_path):
        path = _write_transcript(tmp_path, [
            _user_entry("Actual prompt from a real human user"),
            _user_entry("<command-name>/help</command-name>"),
        ])
        result = capture_stop._extract_last_user_prompt(path)
        assert result == "Actual prompt from a real human user"

    def test_skips_short_messages(self, tmp_path):
        path = _write_transcript(tmp_path, [
            _user_entry("This prompt is long enough to be captured"),
            _user_entry("yes"),  # too short
        ])
        result = capture_stop._extract_last_user_prompt(path)
        assert result == "This prompt is long enough to be captured"

    def test_handles_content_blocks(self, tmp_path):
        entry = {
            "type": "user",
            "message": {
                "role": "user",
                "content": [
                    {"type": "text", "text": "First paragraph of the prompt."},
                    {"type": "text", "text": "Second paragraph continues here."},
                ],
            },
        }
        path = _write_transcript(tmp_path, [entry])
        result = capture_stop._extract_last_user_prompt(path)
        assert "First paragraph" in result
        assert "Second paragraph" in result

    def test_empty_transcript(self, tmp_path):
        path = _write_transcript(tmp_path, [])
        result = capture_stop._extract_last_user_prompt(path)
        assert result == ""

    def test_missing_file(self):
        result = capture_stop._extract_last_user_prompt("/nonexistent/path.jsonl")
        assert result == ""

    def test_empty_path(self):
        result = capture_stop._extract_last_user_prompt("")
        assert result == ""
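Taken together, the tests above describe the extraction heuristic: walk the transcript, keep only non-meta user messages, skip command/XML-ish content and very short replies, join content blocks, and return the last survivor. A sketch under those assumptions (the real `capture_stop._extract_last_user_prompt` reads a file path and may use different thresholds):

```python
import json


def extract_last_user_prompt(jsonl_text: str, min_len: int = 20) -> str:
    """Last non-meta, non-command, sufficiently long user message.

    Mirrors the behavior the tests pin down: meta entries and
    messages that look like injected commands (leading '<') or are
    too short are ignored; list-of-blocks content is joined.
    """
    last = ""
    for line in jsonl_text.splitlines():
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if entry.get("type") != "user" or entry.get("isMeta"):
            continue
        content = entry.get("message", {}).get("content", "")
        if isinstance(content, list):
            content = "\n".join(
                b.get("text", "") for b in content if b.get("type") == "text"
            )
        text = content.strip()
        if text.startswith("<") or len(text) < min_len:
            continue
        last = text
    return last
```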
# ---------------------------------------------------------------------------
# _infer_project
# ---------------------------------------------------------------------------

class TestInferProject:
    def test_empty_cwd(self):
        assert capture_stop._infer_project("") == ""

    def test_unknown_path(self):
        assert capture_stop._infer_project("C:\\Users\\antoi\\random") == ""

    def test_mapped_path(self):
        with mock.patch.dict(capture_stop._PROJECT_PATH_MAP, {
            "C:\\Users\\antoi\\gigabit": "p04-gigabit",
        }):
            result = capture_stop._infer_project("C:\\Users\\antoi\\gigabit\\src")
            assert result == "p04-gigabit"


# ---------------------------------------------------------------------------
# _capture (integration-style, mocking HTTP)
# ---------------------------------------------------------------------------

class TestCapture:
    def _hook_input(self, *, transcript_path: str = "", **overrides) -> str:
        data = {
            "session_id": "test-session-123",
            "transcript_path": transcript_path,
            "cwd": "C:\\Users\\antoi\\ATOCore",
            "permission_mode": "default",
            "hook_event_name": "Stop",
            "last_assistant_message": "Here is the answer to your question about the code.",
            "turn_number": 3,
        }
        data.update(overrides)
        return json.dumps(data)

    @mock.patch("capture_stop.urllib.request.urlopen")
    def test_posts_to_atocore(self, mock_urlopen, tmp_path):
        transcript = _write_transcript(tmp_path, [
            _user_entry("Please explain how the backup system works in detail"),
            _assistant_entry(),
        ])
        mock_resp = mock.MagicMock()
        mock_resp.read.return_value = json.dumps({"id": "int-001", "status": "recorded"}).encode()
        mock_urlopen.return_value = mock_resp

        with mock.patch("sys.stdin", StringIO(self._hook_input(transcript_path=transcript))):
            capture_stop._capture()

        mock_urlopen.assert_called_once()
        req = mock_urlopen.call_args[0][0]
        body = json.loads(req.data.decode())
        assert body["prompt"] == "Please explain how the backup system works in detail"
        assert body["client"] == "claude-code"
        assert body["session_id"] == "test-session-123"
        assert body["reinforce"] is True

    @mock.patch("capture_stop.urllib.request.urlopen")
    def test_skips_when_disabled(self, mock_urlopen, tmp_path):
        transcript = _write_transcript(tmp_path, [
            _user_entry("A prompt that would normally be captured"),
        ])
        with mock.patch.dict(os.environ, {"ATOCORE_CAPTURE_DISABLED": "1"}):
            with mock.patch("sys.stdin", StringIO(self._hook_input(transcript_path=transcript))):
                capture_stop._capture()
        mock_urlopen.assert_not_called()

    @mock.patch("capture_stop.urllib.request.urlopen")
    def test_skips_short_prompt(self, mock_urlopen, tmp_path):
        transcript = _write_transcript(tmp_path, [
            _user_entry("yes"),
        ])
        with mock.patch("sys.stdin", StringIO(self._hook_input(transcript_path=transcript))):
            capture_stop._capture()
        mock_urlopen.assert_not_called()

    @mock.patch("capture_stop.urllib.request.urlopen")
    def test_truncates_long_response(self, mock_urlopen, tmp_path):
        transcript = _write_transcript(tmp_path, [
            _user_entry("Tell me everything about the entire codebase architecture"),
        ])
        long_response = "x" * 60_000
        mock_resp = mock.MagicMock()
        mock_resp.read.return_value = json.dumps({"id": "int-002"}).encode()
        mock_urlopen.return_value = mock_resp

        with mock.patch("sys.stdin", StringIO(
            self._hook_input(transcript_path=transcript, last_assistant_message=long_response)
        )):
            capture_stop._capture()

        req = mock_urlopen.call_args[0][0]
        body = json.loads(req.data.decode())
        assert len(body["response"]) <= capture_stop.MAX_RESPONSE_LENGTH + 20
        assert body["response"].endswith("[truncated]")

    def test_main_never_raises(self):
        """main() must always exit 0, even on garbage input."""
        with mock.patch("sys.stdin", StringIO("not json at all")):
            # Should not raise
            capture_stop.main()

    @mock.patch("capture_stop.urllib.request.urlopen")
    def test_uses_atocore_url_env(self, mock_urlopen, tmp_path):
        transcript = _write_transcript(tmp_path, [
            _user_entry("Please help me with this particular problem in the code"),
        ])
        mock_resp = mock.MagicMock()
        mock_resp.read.return_value = json.dumps({"id": "int-003"}).encode()
        mock_urlopen.return_value = mock_resp

        with mock.patch.dict(os.environ, {"ATOCORE_URL": "http://localhost:9999"}):
            # Re-read the env var
            with mock.patch.object(capture_stop, "ATOCORE_URL", "http://localhost:9999"):
                with mock.patch("sys.stdin", StringIO(self._hook_input(transcript_path=transcript))):
                    capture_stop._capture()

        req = mock_urlopen.call_args[0][0]
        assert req.full_url == "http://localhost:9999/interactions"
73 tests/test_chunker.py Normal file
@@ -0,0 +1,73 @@
"""Tests for the markdown chunker."""

from atocore.ingestion.chunker import chunk_markdown


def test_basic_chunking():
    """Test that markdown is split into chunks."""
    body = """## Section One

This is the first section with some content that is long enough to pass the minimum chunk size filter applied by the chunker.

## Section Two

This is the second section with different content that is also long enough to pass the minimum chunk size threshold.
"""
    chunks = chunk_markdown(body)
    assert len(chunks) >= 2
    assert all(c.char_count > 0 for c in chunks)
    assert all(c.chunk_index >= 0 for c in chunks)


def test_heading_path_preserved():
    """Test that heading paths are captured."""
    body = """## Architecture

### Layers

The system has multiple layers organized in a clear hierarchy for separation of concerns and maintainability.
"""
    chunks = chunk_markdown(body)
    assert len(chunks) >= 1
    # At least one chunk should have heading info
    has_heading = any(c.heading_path for c in chunks)
    assert has_heading


def test_small_chunks_filtered():
    """Test that very small chunks are discarded."""
    body = """## A

Hi

## B

This is a real section with enough content to pass the minimum size threshold.
"""
    chunks = chunk_markdown(body, min_size=50)
    # "Hi" should be filtered out
    for c in chunks:
        assert c.char_count >= 50


def test_large_section_split():
    """Test that large sections are split further."""
    large_content = "Word " * 200  # ~1000 chars
    body = f"## Big Section\n\n{large_content}"
    chunks = chunk_markdown(body, max_size=400)
    assert len(chunks) >= 2


def test_metadata_passed_through():
    """Test that base metadata is included in chunks."""
    body = "## Test\n\nSome content here that is long enough."
    meta = {"source_file": "/test/file.md", "tags": ["test"]}
    chunks = chunk_markdown(body, base_metadata=meta)
    if chunks:
        assert chunks[0].metadata.get("source_file") == "/test/file.md"


def test_empty_body():
    """Test chunking an empty body."""
    chunks = chunk_markdown("")
    assert chunks == []
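The chunker tests above imply a heading-driven splitter: break on `##`/`###` headings, track the heading path, and drop sections below a minimum size. A simplified sketch of that idea (the real `chunk_markdown` also splits oversized sections and carries metadata, which this omits):

```python
import re
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    heading_path: list[str] = field(default_factory=list)

    @property
    def char_count(self) -> int:
        return len(self.text)


def split_by_headings(body: str, min_size: int = 50) -> list[Chunk]:
    """Split markdown on ##..###### headings, dropping tiny sections."""
    chunks: list[Chunk] = []
    path: list[str] = []
    current: list[str] = []

    def flush() -> None:
        text = "\n".join(current).strip()
        if len(text) >= min_size:
            chunks.append(Chunk(text=text, heading_path=list(path)))
        current.clear()

    for line in body.splitlines():
        m = re.match(r"^(#{2,6})\s+(.*)", line)
        if m:
            flush()
            level = len(m.group(1))
            # '##' is depth 0 of the path, '###' depth 1, and so on.
            path = path[: level - 2] + [m.group(2).strip()]
        else:
            current.append(line)
    flush()
    return chunks
```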
89 tests/test_config.py Normal file
@@ -0,0 +1,89 @@
"""Tests for configuration and canonical path boundaries."""

import os
from pathlib import Path

import atocore.config as config


def test_settings_resolve_canonical_directories(tmp_path, monkeypatch):
    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
    monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(tmp_path / "vault-source"))
    monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(tmp_path / "drive-source"))
    monkeypatch.setenv("ATOCORE_LOG_DIR", str(tmp_path / "logs"))
    monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
    monkeypatch.setenv(
        "ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
    )

    settings = config.Settings()

    assert settings.db_path == (tmp_path / "data" / "db" / "atocore.db").resolve()
    assert settings.chroma_path == (tmp_path / "data" / "chroma").resolve()
    assert settings.cache_path == (tmp_path / "data" / "cache").resolve()
    assert settings.tmp_path == (tmp_path / "data" / "tmp").resolve()
    assert settings.resolved_vault_source_dir == (tmp_path / "vault-source").resolve()
    assert settings.resolved_drive_source_dir == (tmp_path / "drive-source").resolve()
    assert settings.resolved_log_dir == (tmp_path / "logs").resolve()
    assert settings.resolved_backup_dir == (tmp_path / "backups").resolve()
    assert settings.resolved_run_dir == (tmp_path / "run").resolve()
    assert settings.resolved_project_registry_path == (
        tmp_path / "config" / "project-registry.json"
    ).resolve()


def test_settings_keep_legacy_db_path_when_present(tmp_path, monkeypatch):
    data_dir = tmp_path / "data"
    data_dir.mkdir()
    legacy_db = data_dir / "atocore.db"
    legacy_db.write_text("", encoding="utf-8")
    monkeypatch.setenv("ATOCORE_DATA_DIR", str(data_dir))

    settings = config.Settings()

    assert settings.db_path == legacy_db.resolve()


def test_ranking_weights_are_tunable_via_env(monkeypatch):
    monkeypatch.setenv("ATOCORE_RANK_PROJECT_MATCH_BOOST", "3.5")
    monkeypatch.setenv("ATOCORE_RANK_QUERY_TOKEN_STEP", "0.12")
    monkeypatch.setenv("ATOCORE_RANK_QUERY_TOKEN_CAP", "1.5")
    monkeypatch.setenv("ATOCORE_RANK_PATH_HIGH_SIGNAL_BOOST", "1.25")
    monkeypatch.setenv("ATOCORE_RANK_PATH_LOW_SIGNAL_PENALTY", "0.5")

    settings = config.Settings()

    assert settings.rank_project_match_boost == 3.5
    assert settings.rank_query_token_step == 0.12
    assert settings.rank_query_token_cap == 1.5
    assert settings.rank_path_high_signal_boost == 1.25
    assert settings.rank_path_low_signal_penalty == 0.5


def test_ensure_runtime_dirs_creates_machine_dirs_only(tmp_path, monkeypatch):
    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
    monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(tmp_path / "vault-source"))
    monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(tmp_path / "drive-source"))
    monkeypatch.setenv("ATOCORE_LOG_DIR", str(tmp_path / "logs"))
    monkeypatch.setenv("ATOCORE_BACKUP_DIR", str(tmp_path / "backups"))
    monkeypatch.setenv(
        "ATOCORE_PROJECT_REGISTRY_PATH", str(tmp_path / "config" / "project-registry.json")
    )

    original_settings = config.settings
    try:
        config.settings = config.Settings()
        config.ensure_runtime_dirs()

        assert config.settings.db_path.parent.exists()
        assert config.settings.chroma_path.exists()
        assert config.settings.cache_path.exists()
        assert config.settings.tmp_path.exists()
        assert config.settings.resolved_log_dir.exists()
        assert config.settings.resolved_backup_dir.exists()
        assert config.settings.resolved_run_dir.exists()
        assert config.settings.resolved_project_registry_path.parent.exists()
        assert not config.settings.resolved_vault_source_dir.exists()
        assert not config.settings.resolved_drive_source_dir.exists()
    finally:
        config.settings = original_settings
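The ranking-weight test above exercises a common pattern: numeric tunables read from the environment with code defaults. A minimal sketch of that pattern (hypothetical helper, not the actual `Settings` implementation; in particular, the real code may raise on an unparseable value rather than fall back):

```python
import os


def env_float(name: str, default: float) -> float:
    """Read a tunable float from the environment, else use the default.

    Falls back to the default on a missing or unparseable value; this
    fallback-on-invalid behavior is an assumption for illustration.
    """
    raw = os.environ.get(name)
    if raw is None:
        return default
    try:
        return float(raw)
    except ValueError:
        return default
```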
348 tests/test_context_builder.py Normal file
@@ -0,0 +1,348 @@
"""Tests for the context builder."""

import json

import atocore.config as config
from atocore.context.builder import build_context, get_last_context_pack
from atocore.context.project_state import init_project_state_schema, set_state
from atocore.ingestion.pipeline import ingest_file
from atocore.models.database import init_db


def test_build_context_returns_pack(tmp_data_dir, sample_markdown):
    """Test that context builder returns a valid pack."""
    init_db()
    init_project_state_schema()
    ingest_file(sample_markdown)

    pack = build_context("What is AtoCore?")
    assert pack.total_chars > 0
    assert len(pack.chunks_used) > 0
    assert pack.budget_remaining >= 0
    assert "--- End Context ---" in pack.formatted_context


def test_context_respects_budget(tmp_data_dir, sample_markdown):
    """Test that context builder respects character budget."""
    init_db()
    init_project_state_schema()
    ingest_file(sample_markdown)

    pack = build_context("What is AtoCore?", budget=500)
    assert pack.total_chars <= 500
    assert len(pack.formatted_context) <= 500
def test_context_with_project_hint(tmp_data_dir, sample_markdown):
|
||||
"""Test that project hint boosts relevant chunks."""
|
||||
init_db()
|
||||
init_project_state_schema()
|
||||
ingest_file(sample_markdown)
|
||||
|
||||
pack = build_context("What is the architecture?", project_hint="atocore")
|
||||
assert len(pack.chunks_used) > 0
|
||||
assert pack.total_chars > 0
|
||||
|
||||
|
||||
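The budget behavior these tests pin down (both `total_chars` and the formatted string staying under the cap) can be sketched as a simple greedy character-budget trim. This is a minimal sketch, not the actual builder code: the helper name and the greedy strategy are assumptions, and the real builder also reserves room for trusted project state before spending the remainder on retrieved chunks.

```python
def trim_to_budget(chunks: list[str], budget: int, sep: str = "\n\n") -> str:
    """Greedily pack chunks until the next one would exceed the budget.

    Hypothetical sketch: stop before the first chunk that would push
    the joined result past `budget`, so len(result) <= budget holds.
    """
    parts: list[str] = []
    used = 0
    for chunk in chunks:
        # Separator only costs characters once there is a preceding chunk.
        cost = len(chunk) + (len(sep) if parts else 0)
        if used + cost > budget:
            break
        parts.append(chunk)
        used += cost
    return sep.join(parts)
```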
def test_context_builder_passes_project_hint_to_retrieval(monkeypatch):
    init_db()
    init_project_state_schema()

    calls = []

    def fake_retrieve(query, top_k=None, filter_tags=None, project_hint=None):
        calls.append((query, project_hint))
        return []

    monkeypatch.setattr("atocore.context.builder.retrieve", fake_retrieve)

    build_context("architecture", project_hint="p05-interferometer", budget=300)

    assert calls == [("architecture", "p05-interferometer")]


def test_last_context_pack_stored(tmp_data_dir, sample_markdown):
    """Test that last context pack is stored for debug."""
    init_db()
    init_project_state_schema()
    ingest_file(sample_markdown)

    build_context("test prompt")
    last = get_last_context_pack()
    assert last is not None
    assert last.query == "test prompt"


def test_full_prompt_structure(tmp_data_dir, sample_markdown):
    """Test that the full prompt has correct structure."""
    init_db()
    init_project_state_schema()
    ingest_file(sample_markdown)

    pack = build_context("What are memory types?")
    assert "knowledge base" in pack.full_prompt.lower()
    assert "What are memory types?" in pack.full_prompt


def test_project_state_included_in_context(tmp_data_dir, sample_markdown):
    """Test that trusted project state is injected into context."""
    init_db()
    init_project_state_schema()
    ingest_file(sample_markdown)

    # Set some project state
    set_state("atocore", "status", "phase", "Phase 0.5 complete")
    set_state("atocore", "decision", "database", "SQLite for structured data")

    pack = build_context("What is AtoCore?", project_hint="atocore")

    # Project state should appear in context
    assert "--- Trusted Project State ---" in pack.formatted_context
    assert "Phase 0.5 complete" in pack.formatted_context
    assert "SQLite for structured data" in pack.formatted_context
    assert pack.project_state_chars > 0


def test_trusted_state_precedence_is_restated_in_retrieved_context(tmp_data_dir, sample_markdown):
    """When trusted state and retrieval coexist, the context should restate precedence explicitly."""
    init_db()
    init_project_state_schema()
    ingest_file(sample_markdown)

    set_state("atocore", "status", "phase", "Phase 2")
    pack = build_context("What is AtoCore?", project_hint="atocore")

    assert "If retrieved context conflicts with Trusted Project State above" in pack.formatted_context


def test_project_state_takes_priority_budget(tmp_data_dir, sample_markdown):
    """Test that project state is included even with tight budget."""
    init_db()
    init_project_state_schema()
    ingest_file(sample_markdown)

    set_state("atocore", "status", "phase", "Phase 1 in progress")

    # Small budget: project state should still be included
    pack = build_context("status?", project_hint="atocore", budget=500)
    assert "Phase 1 in progress" in pack.formatted_context


def test_project_state_respects_total_budget(tmp_data_dir, sample_markdown):
    """Trusted state should still fit within the total context budget."""
    init_db()
    init_project_state_schema()
    ingest_file(sample_markdown)

    set_state("atocore", "status", "notes", "x" * 400)
    set_state("atocore", "decision", "details", "y" * 400)

    pack = build_context("status?", project_hint="atocore", budget=120)
    assert pack.total_chars <= 120
    assert pack.budget_remaining >= 0
    assert len(pack.formatted_context) <= 120


def test_project_hint_matches_state_case_insensitively(tmp_data_dir, sample_markdown):
    """Project state lookup should not depend on exact casing."""
    init_db()
    init_project_state_schema()
    ingest_file(sample_markdown)

    set_state("AtoCore", "status", "phase", "Phase 2")
    pack = build_context("status?", project_hint="atocore")
    assert "Phase 2" in pack.formatted_context


def test_no_project_state_without_hint(tmp_data_dir, sample_markdown):
    """Test that project state is not included without project hint."""
    init_db()
    init_project_state_schema()
    ingest_file(sample_markdown)

    set_state("atocore", "status", "phase", "Phase 1")

    pack = build_context("What is AtoCore?")
    assert pack.project_state_chars == 0
    assert "--- Trusted Project State ---" not in pack.formatted_context
def test_alias_hint_resolves_through_registry(tmp_data_dir, sample_markdown, monkeypatch):
    """An alias hint like 'p05' should find project state stored under 'p05-interferometer'.

    This is the regression test for the P1 finding from codex's review:
    /context/build was previously doing an exact-name lookup that
    silently dropped trusted project state when the caller passed an
    alias instead of the canonical project id.
    """
    init_db()
    init_project_state_schema()
    ingest_file(sample_markdown)

    # Stand up a minimal project registry that knows the aliases.
    # The registry lives in a JSON file pointed to by
    # ATOCORE_PROJECT_REGISTRY_PATH; the dataclass-driven loader picks
    # it up on every call (no in-process cache to invalidate).
    registry_path = tmp_data_dir / "project-registry.json"
    registry_path.write_text(
        json.dumps(
            {
                "projects": [
                    {
                        "id": "p05-interferometer",
                        "aliases": ["p05", "interferometer"],
                        "description": "P05 alias-resolution regression test",
                        "ingest_roots": [
                            {"source": "vault", "subpath": "incoming/projects/p05"}
                        ],
                    }
                ]
            }
        ),
        encoding="utf-8",
    )
    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
    config.settings = config.Settings()

    # Trusted state is stored under the canonical id (the way the
    # /project/state endpoint always writes it).
    set_state(
        "p05-interferometer",
        "status",
        "next_focus",
        "Wave 2 trusted-operational ingestion",
    )

    # The bug: a pack built with an alias hint used to silently miss the state.
    pack_with_alias = build_context("status?", project_hint="p05", budget=2000)
    assert "Wave 2 trusted-operational ingestion" in pack_with_alias.formatted_context
    assert pack_with_alias.project_state_chars > 0

    # The canonical id should still work the same way.
    pack_with_canonical = build_context(
        "status?", project_hint="p05-interferometer", budget=2000
    )
    assert "Wave 2 trusted-operational ingestion" in pack_with_canonical.formatted_context

    # A second alias should also resolve.
    pack_with_other_alias = build_context(
        "status?", project_hint="interferometer", budget=2000
    )
    assert "Wave 2 trusted-operational ingestion" in pack_with_other_alias.formatted_context


def test_unknown_hint_falls_back_to_raw_lookup(tmp_data_dir, sample_markdown, monkeypatch):
    """A hint that isn't in the registry should still try the raw name.

    This preserves backwards compatibility with hand-curated
    project_state entries that predate the project registry.
    """
    init_db()
    init_project_state_schema()
    ingest_file(sample_markdown)

    # Empty registry: the hint won't resolve through it.
    registry_path = tmp_data_dir / "project-registry.json"
    registry_path.write_text('{"projects": []}', encoding="utf-8")
    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
    config.settings = config.Settings()

    set_state("orphan-project", "status", "phase", "Solo run")

    pack = build_context("status?", project_hint="orphan-project", budget=2000)
    assert "Solo run" in pack.formatted_context
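The lookup order these two tests pin down (registry alias first, raw name as fallback) can be sketched as below. The function and field names are illustrative assumptions, not the actual builder internals; only the registry JSON shape (`projects`, `id`, `aliases`) comes from the tests.

```python
def resolve_project_id(hint: str, registry: dict) -> str:
    """Resolve a project hint to a canonical project id via the registry.

    Hypothetical sketch: match the canonical id or any alias
    case-insensitively; fall back to the raw hint so hand-curated
    project_state rows that predate the registry keep working.
    """
    needle = hint.strip().lower()
    for project in registry.get("projects", []):
        names = [project["id"], *project.get("aliases", [])]
        if needle in (name.lower() for name in names):
            return project["id"]
    return hint  # unknown hint: raw-name fallback
```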
def test_project_memories_included_in_pack(tmp_data_dir, sample_markdown):
    """Active project-scoped memories for the target project should
    land in a dedicated '--- Project Memories ---' band so the
    Phase 9 reflection loop has a retrieval outlet."""
    from atocore.memory.service import create_memory

    init_db()
    init_project_state_schema()
    ingest_file(sample_markdown)

    mem = create_memory(
        memory_type="project",
        content="the mirror architecture is Option B conical back for p04-gigabit",
        project="p04-gigabit",
        confidence=0.9,
    )
    # A sibling memory for a different project must NOT leak into the pack.
    create_memory(
        memory_type="project",
        content="polisher suite splits into sim, post, control, contracts",
        project="p06-polisher",
        confidence=0.9,
    )

    pack = build_context(
        "remind me about the mirror architecture",
        project_hint="p04-gigabit",
        budget=3000,
    )
    assert "--- Project Memories ---" in pack.formatted_context
    assert "Option B conical back" in pack.formatted_context
    assert "polisher suite splits" not in pack.formatted_context
    assert pack.project_memory_chars > 0
    assert mem.project == "p04-gigabit"


def test_project_memories_absent_without_project_hint(tmp_data_dir, sample_markdown):
    """Without a project hint, project memories stay out of the pack;
    cross-project bleed would rot the signal."""
    from atocore.memory.service import create_memory

    init_db()
    init_project_state_schema()
    ingest_file(sample_markdown)

    create_memory(
        memory_type="project",
        content="scoped project knowledge that should not leak globally",
        project="p04-gigabit",
        confidence=0.9,
    )

    pack = build_context("tell me something", budget=3000)
    assert "--- Project Memories ---" not in pack.formatted_context
    assert pack.project_memory_chars == 0


def test_project_memories_query_relevance_ordering(tmp_data_dir, sample_markdown):
    """When the budget only fits one memory, query-relevance ordering
    should pick the one the query is actually about, even if another
    memory has higher confidence.

    Regression for the 2026-04-11 p05-vendor-signal harness failure:
    memory selection was fixed-order by confidence, so a lower-ranked
    vendor memory got starved out of the budget when a query was
    specifically about vendors.
    """
    from atocore.memory.service import create_memory

    init_db()
    init_project_state_schema()
    ingest_file(sample_markdown)

    create_memory(
        memory_type="project",
        content="the folded-beam interferometer uses a CGH stage and fold mirror",
        project="p05-interferometer",
        confidence=0.97,
    )
    create_memory(
        memory_type="knowledge",
        content="vendor signal: Zygo Verifire SV is the strongest value path for the interferometer",
        project="p05-interferometer",
        confidence=0.85,
    )

    pack = build_context(
        "what is the current vendor signal for the interferometer",
        project_hint="p05-interferometer",
        budget=1200,  # tight enough that only one project memory fits
    )
    assert "Zygo Verifire SV" in pack.formatted_context
    assert pack.project_memory_chars > 0
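A minimal sketch of the query-relevance ordering the last test demands, scoring by token overlap with confidence as the tie-break. This is an assumption for illustration: the real service may use embedding similarity, but token overlap is enough to show why a lower-confidence vendor memory can outrank a higher-confidence one for a vendor-specific query.

```python
def order_memories_by_relevance(memories: list[dict], query: str) -> list[dict]:
    """Order memories by query-token overlap, then by confidence.

    Hypothetical sketch: a memory mentioning the query's own words
    ("vendor", "signal") beats a higher-confidence memory that does not.
    """
    query_tokens = set(query.lower().split())

    def score(memory: dict) -> tuple[int, float]:
        overlap = len(query_tokens & set(memory["content"].lower().split()))
        return (overlap, memory.get("confidence", 0.0))

    return sorted(memories, key=score, reverse=True)
```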
tests/test_database.py (new file, 184 lines)
@@ -0,0 +1,184 @@
"""Tests for SQLite connection pragmas and runtime behavior."""

import sqlite3

import atocore.config as config
from atocore.models.database import get_connection, init_db


def test_get_connection_applies_busy_timeout_and_wal(tmp_path, monkeypatch):
    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
    monkeypatch.setenv("ATOCORE_DB_BUSY_TIMEOUT_MS", "7000")

    original_settings = config.settings
    try:
        config.settings = config.Settings()
        init_db()
        with get_connection() as conn:
            busy_timeout = conn.execute("PRAGMA busy_timeout").fetchone()[0]
            journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
            foreign_keys = conn.execute("PRAGMA foreign_keys").fetchone()[0]
    finally:
        config.settings = original_settings

    assert busy_timeout == 7000
    assert str(journal_mode).lower() == "wal"
    assert foreign_keys == 1


def test_get_connection_uses_configured_timeout_value(tmp_path, monkeypatch):
    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
    monkeypatch.setenv("ATOCORE_DB_BUSY_TIMEOUT_MS", "2500")

    original_settings = config.settings
    original_connect = sqlite3.connect
    calls = []

    def fake_connect(*args, **kwargs):
        calls.append(kwargs.get("timeout"))
        return original_connect(*args, **kwargs)

    try:
        config.settings = config.Settings()
        monkeypatch.setattr("atocore.models.database.sqlite3.connect", fake_connect)
        init_db()
    finally:
        config.settings = original_settings

    assert calls
    assert calls[0] == 2.5
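The pragmas these two tests assert can be sketched as a connection factory; the ms-to-seconds conversion mirrors the 2500 ms becoming a 2.5 s `connect` timeout above. The helper name is an assumption, not the actual `atocore.models.database.get_connection` code.

```python
import sqlite3


def open_connection(db_path: str, busy_timeout_ms: int = 5000) -> sqlite3.Connection:
    """Open a SQLite connection with WAL, foreign keys, and a busy timeout.

    Hypothetical sketch: sqlite3.connect takes its timeout in seconds,
    so the configured millisecond value is divided by 1000, and the
    same value is also applied as PRAGMA busy_timeout.
    """
    conn = sqlite3.connect(db_path, timeout=busy_timeout_ms / 1000)
    conn.execute(f"PRAGMA busy_timeout = {busy_timeout_ms}")
    conn.execute("PRAGMA journal_mode = WAL")
    conn.execute("PRAGMA foreign_keys = ON")
    return conn
```

Note that `PRAGMA journal_mode = WAL` persists in the database file, which is why the test can read it back as `"wal"` on a fresh connection.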
def test_init_db_upgrades_pre_phase9_schema_without_failing(tmp_path, monkeypatch):
    """Regression test for the schema init ordering bug caught during
    the first real Dalidou deploy (report from 2026-04-08).

    Before the fix, SCHEMA_SQL contained CREATE INDEX statements that
    referenced columns (memories.project, interactions.project,
    interactions.session_id) added by _apply_migrations later in
    init_db. On a fresh install this worked because CREATE TABLE
    created the tables with the new columns before the CREATE INDEX
    ran, but on UPGRADE from a pre-Phase-9 schema the CREATE TABLE
    IF NOT EXISTS was a no-op and the CREATE INDEX hit
    OperationalError: no such column.

    This test seeds the tables with the OLD pre-Phase-9 shape, then
    calls init_db() and verifies that:

    - init_db does not raise
    - the new columns were added via _apply_migrations
    - the new indexes exist

    If the bug is reintroduced by moving a CREATE INDEX for a
    migration column back into SCHEMA_SQL, this test will fail
    with OperationalError before reaching the assertions.
    """
    monkeypatch.setenv("ATOCORE_DATA_DIR", str(tmp_path / "data"))
    original_settings = config.settings
    try:
        config.settings = config.Settings()

        # Step 1: create the data dir and open a direct connection
        config.ensure_runtime_dirs()
        db_path = config.settings.db_path

        # Step 2: seed the DB with the old pre-Phase-9 shape. No
        # project/last_referenced_at/reference_count on memories; no
        # project/client/session_id/response/memories_used/chunks_used
        # on interactions. We also need the prerequisite tables
        # (projects, source_documents, source_chunks) because the
        # memories table has an FK to source_chunks.
        with sqlite3.connect(str(db_path)) as conn:
            conn.executescript(
                """
                CREATE TABLE source_documents (
                    id TEXT PRIMARY KEY,
                    file_path TEXT UNIQUE NOT NULL,
                    file_hash TEXT NOT NULL,
                    title TEXT,
                    doc_type TEXT DEFAULT 'markdown',
                    tags TEXT DEFAULT '[]',
                    ingested_at DATETIME DEFAULT CURRENT_TIMESTAMP,
                    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
                );

                CREATE TABLE source_chunks (
                    id TEXT PRIMARY KEY,
                    document_id TEXT NOT NULL REFERENCES source_documents(id) ON DELETE CASCADE,
                    chunk_index INTEGER NOT NULL,
                    content TEXT NOT NULL,
                    heading_path TEXT DEFAULT '',
                    char_count INTEGER NOT NULL,
                    metadata TEXT DEFAULT '{}',
                    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
                );

                CREATE TABLE memories (
                    id TEXT PRIMARY KEY,
                    memory_type TEXT NOT NULL,
                    content TEXT NOT NULL,
                    source_chunk_id TEXT REFERENCES source_chunks(id),
                    confidence REAL DEFAULT 1.0,
                    status TEXT DEFAULT 'active',
                    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
                    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
                );

                CREATE TABLE projects (
                    id TEXT PRIMARY KEY,
                    name TEXT UNIQUE NOT NULL,
                    description TEXT DEFAULT '',
                    status TEXT DEFAULT 'active',
                    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
                    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
                );

                CREATE TABLE interactions (
                    id TEXT PRIMARY KEY,
                    prompt TEXT NOT NULL,
                    context_pack TEXT DEFAULT '{}',
                    response_summary TEXT DEFAULT '',
                    project_id TEXT REFERENCES projects(id),
                    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
                );
                """
            )
            conn.commit()
        # Step 3: call init_db; this used to raise on the upgrade
        # path. After the fix it should succeed.
        init_db()

        # Step 4: verify the migrations ran; Phase 9 columns present
        with sqlite3.connect(str(db_path)) as conn:
            conn.row_factory = sqlite3.Row
            memories_cols = {
                row["name"] for row in conn.execute("PRAGMA table_info(memories)")
            }
            interactions_cols = {
                row["name"]
                for row in conn.execute("PRAGMA table_info(interactions)")
            }

            assert "project" in memories_cols
            assert "last_referenced_at" in memories_cols
            assert "reference_count" in memories_cols

            assert "project" in interactions_cols
            assert "client" in interactions_cols
            assert "session_id" in interactions_cols
            assert "response" in interactions_cols
            assert "memories_used" in interactions_cols
            assert "chunks_used" in interactions_cols

            # Step 5: verify the indexes on migration columns exist
            index_rows = conn.execute(
                "SELECT name FROM sqlite_master WHERE type='index' AND tbl_name IN ('memories','interactions')"
            ).fetchall()
            index_names = {row["name"] for row in index_rows}

            assert "idx_memories_project" in index_names
            assert "idx_interactions_project_name" in index_names
            assert "idx_interactions_session" in index_names
    finally:
        config.settings = original_settings
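The ordering fix the regression test above protects can be sketched as: run the base CREATE TABLE statements first, then column migrations, and only then create indexes that reference migrated columns. The schema and helper names below are a simplified assumption, not AtoCore's actual SCHEMA_SQL or _apply_migrations.

```python
import sqlite3

# Hypothetical sketch of the init ordering the regression test protects.
BASE_SCHEMA = "CREATE TABLE IF NOT EXISTS memories (id TEXT PRIMARY KEY, content TEXT NOT NULL)"
MIGRATIONS = ["ALTER TABLE memories ADD COLUMN project TEXT DEFAULT ''"]
# Indexes on migrated columns must run AFTER the migrations,
# never inside BASE_SCHEMA, or the upgrade path hits "no such column".
POST_MIGRATION_INDEXES = [
    "CREATE INDEX IF NOT EXISTS idx_memories_project ON memories(project)"
]


def init_db_sketch(conn: sqlite3.Connection) -> None:
    conn.execute(BASE_SCHEMA)
    existing = {row[1] for row in conn.execute("PRAGMA table_info(memories)")}
    for migration in MIGRATIONS:
        column = migration.split("ADD COLUMN ")[1].split()[0]
        if column not in existing:  # upgrade path: table pre-exists without the column
            conn.execute(migration)
    for index_sql in POST_MIGRATION_INDEXES:
        conn.execute(index_sql)
```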
tests/test_extractor.py (new file, 374 lines)
@@ -0,0 +1,374 @@
"""Tests for Phase 9 Commit C rule-based candidate extractor."""

from fastapi.testclient import TestClient

from atocore.interactions.service import record_interaction
from atocore.main import app
from atocore.memory.extractor import (
    MemoryCandidate,
    extract_candidates_from_interaction,
)
from atocore.memory.service import (
    create_memory,
    get_memories,
    promote_memory,
    reject_candidate_memory,
)
from atocore.models.database import init_db


def _capture(**fields):
    return record_interaction(
        prompt=fields.get("prompt", "unused"),
        response=fields.get("response", ""),
        response_summary=fields.get("response_summary", ""),
        project=fields.get("project", ""),
        reinforce=False,
    )


# --- extractor: heading patterns ------------------------------------------


def test_extractor_finds_decision_heading(tmp_data_dir):
    init_db()
    interaction = _capture(
        response=(
            "We talked about the frame.\n\n"
            "## Decision: switch the lateral supports to GF-PTFE pads\n\n"
            "Rationale: thermal stability."
        ),
    )
    results = extract_candidates_from_interaction(interaction)
    assert len(results) == 1
    assert results[0].memory_type == "adaptation"
    assert "GF-PTFE" in results[0].content
    assert results[0].rule == "decision_heading"


def test_extractor_finds_constraint_and_requirement_headings(tmp_data_dir):
    init_db()
    interaction = _capture(
        response=(
            "### Constraint: total mass must stay under 4.8 kg\n"
            "## Requirement: survives 12g shock in any axis\n"
        ),
    )
    results = extract_candidates_from_interaction(interaction)
    rules = {r.rule for r in results}
    assert "constraint_heading" in rules
    assert "requirement_heading" in rules
    constraint = next(r for r in results if r.rule == "constraint_heading")
    requirement = next(r for r in results if r.rule == "requirement_heading")
    assert constraint.memory_type == "project"
    assert requirement.memory_type == "project"
    assert "4.8 kg" in constraint.content
    assert "12g" in requirement.content
def test_extractor_finds_fact_heading(tmp_data_dir):
|
||||
init_db()
|
||||
interaction = _capture(
|
||||
response="## Fact: the polisher sim uses floating-point deltas in microns\n",
|
||||
)
|
||||
results = extract_candidates_from_interaction(interaction)
|
||||
assert len(results) == 1
|
||||
assert results[0].memory_type == "knowledge"
|
||||
assert results[0].rule == "fact_heading"
|
||||
|
||||
|
||||
def test_extractor_heading_separator_variants(tmp_data_dir):
|
||||
"""Decision headings should match with `:`, `-`, or em-dash."""
|
||||
init_db()
|
||||
for sep in (":", "-", "\u2014"):
|
||||
interaction = _capture(
|
||||
response=f"## Decision {sep} adopt option B for the mount interface\n",
|
||||
)
|
||||
results = extract_candidates_from_interaction(interaction)
|
||||
assert len(results) == 1, f"sep={sep!r}"
|
||||
assert "option B" in results[0].content
|
||||
|
||||
|
||||
# --- extractor: sentence patterns -----------------------------------------
|
||||
|
||||
|
||||
def test_extractor_finds_preference_sentence(tmp_data_dir):
|
||||
init_db()
|
||||
interaction = _capture(
|
||||
response=(
|
||||
"I prefer rebase-based workflows because history stays linear "
|
||||
"and reviewers have an easier time."
|
||||
),
|
||||
)
|
||||
results = extract_candidates_from_interaction(interaction)
|
||||
pref_matches = [r for r in results if r.rule == "preference_sentence"]
|
||||
assert len(pref_matches) == 1
|
||||
assert pref_matches[0].memory_type == "preference"
|
||||
assert "rebase" in pref_matches[0].content.lower()
|
||||
|
||||
|
||||
def test_extractor_finds_decided_to_sentence(tmp_data_dir):
|
||||
init_db()
|
||||
interaction = _capture(
|
||||
response=(
|
||||
"After going through the options we decided to keep the legacy "
|
||||
"calibration routine for the July milestone."
|
||||
),
|
||||
)
|
||||
results = extract_candidates_from_interaction(interaction)
|
||||
decision_matches = [r for r in results if r.rule == "decided_to_sentence"]
|
||||
assert len(decision_matches) == 1
|
||||
assert decision_matches[0].memory_type == "adaptation"
|
||||
assert "legacy calibration" in decision_matches[0].content.lower()
|
||||
|
||||
|
||||
def test_extractor_finds_requirement_sentence(tmp_data_dir):
|
||||
init_db()
|
||||
interaction = _capture(
|
||||
response=(
|
||||
"One of the findings: the requirement is that the interferometer "
|
||||
"must resolve 50 picometer displacements at 1 kHz bandwidth."
|
||||
),
|
||||
)
|
||||
results = extract_candidates_from_interaction(interaction)
|
||||
req_matches = [r for r in results if r.rule == "requirement_sentence"]
|
||||
assert len(req_matches) == 1
|
||||
assert req_matches[0].memory_type == "project"
|
||||
assert "picometer" in req_matches[0].content.lower()
|
||||
|
||||
|
||||
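A regex in the spirit of the separator behavior exercised by test_extractor_heading_separator_variants above, covering `:`, `-`, and the em-dash at any heading level. This is a sketch; the exact pattern in atocore.memory.extractor may differ.

```python
import re

# Hypothetical sketch of a decision-heading pattern: matches
# '## Decision: ...', '## Decision - ...', and the em-dash variant.
DECISION_HEADING = re.compile(
    r"^#{1,6}\s*Decision\s*[:\-\u2014]\s*(?P<body>.+)$",
    re.MULTILINE,
)


def find_decision_headings(text: str) -> list[str]:
    """Return the cleaned body of every decision heading in the text,
    with surrounding whitespace and trailing periods stripped."""
    return [m.group("body").strip().rstrip(".") for m in DECISION_HEADING.finditer(text)]
```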
# --- extractor: content rules ---------------------------------------------


def test_extractor_rejects_too_short_matches(tmp_data_dir):
    init_db()
    interaction = _capture(response="## Decision: yes\n")  # too short after clean
    results = extract_candidates_from_interaction(interaction)
    assert results == []


def test_extractor_deduplicates_identical_matches(tmp_data_dir):
    init_db()
    interaction = _capture(
        response=(
            "## Decision: use the modular frame variant for prototyping\n"
            "## Decision: use the modular frame variant for prototyping\n"
        ),
    )
    results = extract_candidates_from_interaction(interaction)
    assert len(results) == 1


def test_extractor_strips_trailing_punctuation(tmp_data_dir):
    init_db()
    interaction = _capture(
        response="## Decision: defer the laser redesign to Q3.\n",
    )
    results = extract_candidates_from_interaction(interaction)
    assert len(results) == 1
    assert results[0].content.endswith("Q3")


def test_extractor_includes_project_and_source_interaction_id(tmp_data_dir):
    init_db()
    interaction = _capture(
        project="p05-interferometer",
        response="## Decision: freeze the optical path for the prototype run\n",
    )
    results = extract_candidates_from_interaction(interaction)
    assert len(results) == 1
    assert results[0].project == "p05-interferometer"
    assert results[0].source_interaction_id == interaction.id


def test_extractor_drops_candidates_matching_existing_active(tmp_data_dir):
    init_db()
    # Seed an active memory that the extractor would otherwise re-propose
    create_memory(
        memory_type="preference",
        content="prefers small reviewable diffs",
    )
    interaction = _capture(
        response="Remember that I prefer small reviewable diffs because they merge faster.",
    )
    results = extract_candidates_from_interaction(interaction)
    # The only candidate would have been the preference, now dropped
    assert not any(r.content.lower() == "small reviewable diffs" for r in results)


def test_extractor_returns_empty_for_no_patterns(tmp_data_dir):
    init_db()
    interaction = _capture(response="Nothing structural here, just prose.")
    results = extract_candidates_from_interaction(interaction)
    assert results == []
# --- service: candidate lifecycle -----------------------------------------


def test_candidate_and_active_can_coexist(tmp_data_dir):
    init_db()
    active = create_memory(
        memory_type="preference",
        content="logs every config change to the change log",
        status="active",
    )
    candidate = create_memory(
        memory_type="preference",
        content="logs every config change to the change log",
        status="candidate",
    )
    # The two are distinct rows because status is part of the dedup key
    assert active.id != candidate.id


def test_promote_memory_moves_candidate_to_active(tmp_data_dir):
    init_db()
    candidate = create_memory(
        memory_type="adaptation",
        content="moved the staging scripts into deploy/staging",
        status="candidate",
    )
    ok = promote_memory(candidate.id)
    assert ok is True

    active_list = get_memories(memory_type="adaptation", status="active")
    assert any(m.id == candidate.id for m in active_list)


def test_promote_memory_on_non_candidate_returns_false(tmp_data_dir):
    init_db()
    active = create_memory(
        memory_type="adaptation",
        content="already active adaptation entry",
        status="active",
    )
    assert promote_memory(active.id) is False


def test_reject_candidate_moves_it_to_invalid(tmp_data_dir):
    init_db()
    candidate = create_memory(
        memory_type="knowledge",
        content="the calibration uses barometric pressure compensation",
        status="candidate",
    )
    ok = reject_candidate_memory(candidate.id)
    assert ok is True

    invalid_list = get_memories(memory_type="knowledge", status="invalid")
    assert any(m.id == candidate.id for m in invalid_list)


def test_reject_on_non_candidate_returns_false(tmp_data_dir):
    init_db()
    active = create_memory(memory_type="preference", content="always uses structured logging")
    assert reject_candidate_memory(active.id) is False


def test_get_memories_filters_by_candidate_status(tmp_data_dir):
    init_db()
    create_memory(memory_type="preference", content="active one", status="active")
    create_memory(memory_type="preference", content="candidate one", status="candidate")
    create_memory(memory_type="preference", content="another candidate", status="candidate")
    candidates = get_memories(status="candidate", memory_type="preference")
    assert len(candidates) == 2
    assert all(c.status == "candidate" for c in candidates)
# --- API: extract / promote / reject / list -------------------------------


def test_api_extract_interaction_without_persist(tmp_data_dir):
    init_db()
    interaction = record_interaction(
        prompt="review",
        response="## Decision: flip the default budget to 4000 for p05\n",
        reinforce=False,
    )
    client = TestClient(app)
    response = client.post(f"/interactions/{interaction.id}/extract", json={})
    assert response.status_code == 200
    body = response.json()
    assert body["candidate_count"] == 1
    assert body["persisted"] is False
    assert body["persisted_ids"] == []
    # The candidate should NOT have been written to the memory table
    queue = get_memories(status="candidate")
    assert queue == []


def test_api_extract_interaction_with_persist(tmp_data_dir):
    init_db()
    interaction = record_interaction(
        prompt="review",
        response=(
            "## Decision: pin the embedding model to v2.3 for Wave 2\n"
            "## Constraint: context budget must stay under 4000 chars\n"
        ),
        reinforce=False,
    )
    client = TestClient(app)
    response = client.post(
        f"/interactions/{interaction.id}/extract", json={"persist": True}
    )
    assert response.status_code == 200
    body = response.json()
    assert body["candidate_count"] == 2
    assert body["persisted"] is True
    assert len(body["persisted_ids"]) == 2

    queue = get_memories(status="candidate", limit=50)
    assert len(queue) == 2


def test_api_extract_returns_404_for_missing_interaction(tmp_data_dir):
    init_db()
    client = TestClient(app)
    response = client.post("/interactions/nope/extract", json={})
    assert response.status_code == 404


def test_api_promote_and_reject_endpoints(tmp_data_dir):
    init_db()
    candidate = create_memory(
        memory_type="adaptation",
        content="restructured the ingestion pipeline into layered stages",
        status="candidate",
    )
    client = TestClient(app)

    promote_response = client.post(f"/memory/{candidate.id}/promote")
    assert promote_response.status_code == 200
    assert promote_response.json()["status"] == "promoted"

    # Promoting it again should 404 because it's no longer a candidate
    second_promote = client.post(f"/memory/{candidate.id}/promote")
    assert second_promote.status_code == 404

    reject_response = client.post("/memory/does-not-exist/reject")
    assert reject_response.status_code == 404


def test_api_get_memory_candidate_status_filter(tmp_data_dir):
    init_db()
    create_memory(memory_type="preference", content="prefers explicit types", status="active")
    create_memory(
        memory_type="preference",
        content="prefers pull requests sized by diff lines not files",
        status="candidate",
    )
    client = TestClient(app)
    response = client.get("/memory", params={"status": "candidate"})
    assert response.status_code == 200
    body = response.json()
    assert "candidate" in body["statuses"]
    assert len(body["memories"]) == 1
    assert body["memories"][0]["status"] == "candidate"


def test_api_get_memory_invalid_status_returns_400(tmp_data_dir):
    init_db()
    client = TestClient(app)
    response = client.get("/memory", params={"status": "not-a-status"})
    assert response.status_code == 400
170
tests/test_ingestion.py
Normal file
@@ -0,0 +1,170 @@
"""Tests for the ingestion pipeline."""

from atocore.ingestion.parser import parse_markdown
from atocore.models.database import get_connection, init_db
from atocore.ingestion.pipeline import ingest_file, ingest_folder


def test_parse_markdown(sample_markdown):
    """Test markdown parsing with frontmatter."""
    parsed = parse_markdown(sample_markdown)
    assert parsed.title == "AtoCore Architecture"
    assert "atocore" in parsed.tags
    assert "architecture" in parsed.tags
    assert len(parsed.body) > 0
    assert len(parsed.headings) > 0


def test_parse_extracts_headings(sample_markdown):
    """Test that headings are extracted correctly."""
    parsed = parse_markdown(sample_markdown)
    heading_texts = [h[1] for h in parsed.headings]
    assert "AtoCore Architecture" in heading_texts
    assert "Overview" in heading_texts


def test_ingest_file(tmp_data_dir, sample_markdown):
    """Test ingesting a single file."""
    init_db()
    result = ingest_file(sample_markdown)
    assert result["status"] == "ingested"
    assert result["chunks"] > 0

    # Verify the file was stored in DB
    with get_connection() as conn:
        doc = conn.execute(
            "SELECT COUNT(*) as c FROM source_documents WHERE file_path = ?",
            (str(sample_markdown.resolve()),),
        ).fetchone()
        assert doc["c"] == 1

        chunks = conn.execute(
            "SELECT COUNT(*) as c FROM source_chunks sc "
            "JOIN source_documents sd ON sc.document_id = sd.id "
            "WHERE sd.file_path = ?",
            (str(sample_markdown.resolve()),),
        ).fetchone()
        assert chunks["c"] > 0


def test_ingest_skips_unchanged(tmp_data_dir, sample_markdown):
    """Test that re-ingesting unchanged file is skipped."""
    init_db()
    ingest_file(sample_markdown)
    result = ingest_file(sample_markdown)
    assert result["status"] == "skipped"


def test_ingest_updates_changed(tmp_data_dir, sample_markdown):
    """Test that changed files are re-ingested."""
    init_db()
    ingest_file(sample_markdown)

    # Modify the file
    sample_markdown.write_text(
        sample_markdown.read_text(encoding="utf-8") + "\n\n## New Section\n\nNew content added.",
        encoding="utf-8",
    )
    result = ingest_file(sample_markdown)
    assert result["status"] == "ingested"


def test_parse_markdown_uses_supplied_text(sample_markdown):
    """Parsing should be able to reuse pre-read content from ingestion."""
    latin_text = """---\ntags: parser\n---\n# Parser Title\n\nBody text."""
    parsed = parse_markdown(sample_markdown, text=latin_text)
    assert parsed.title == "Parser Title"
    assert "parser" in parsed.tags


def test_reingest_empty_replaces_stale_chunks(tmp_data_dir, sample_markdown, monkeypatch):
    """Re-ingesting a file with no chunks should clear stale DB/vector state."""
    init_db()

    class FakeVectorStore:
        def __init__(self):
            self.deleted_ids = []

        def add(self, ids, documents, metadatas):
            return None

        def delete(self, ids):
            self.deleted_ids.extend(ids)

    fake_store = FakeVectorStore()
    monkeypatch.setattr("atocore.ingestion.pipeline.get_vector_store", lambda: fake_store)

    first = ingest_file(sample_markdown)
    assert first["status"] == "ingested"

    sample_markdown.write_text("# Changed\n\nThis update should now produce no chunks after monkeypatching.", encoding="utf-8")
    monkeypatch.setattr("atocore.ingestion.pipeline.chunk_markdown", lambda *args, **kwargs: [])
    second = ingest_file(sample_markdown)
    assert second["status"] == "empty"

    with get_connection() as conn:
        chunk_count = conn.execute("SELECT COUNT(*) AS c FROM source_chunks").fetchone()
        assert chunk_count["c"] == 0

    assert fake_store.deleted_ids


def test_ingest_folder_includes_markdown_extension(tmp_data_dir, sample_folder, monkeypatch):
    """Folder ingestion should include both .md and .markdown files."""
    init_db()
    markdown_file = sample_folder / "third_note.markdown"
    markdown_file.write_text("# Third Note\n\nThis file should be discovered during folder ingestion.", encoding="utf-8")

    class FakeVectorStore:
        def add(self, ids, documents, metadatas):
            return None

        def delete(self, ids):
            return None

        @property
        def count(self):
            return 0

    monkeypatch.setattr("atocore.ingestion.pipeline.get_vector_store", lambda: FakeVectorStore())
    results = ingest_folder(sample_folder)
    files = {result["file"] for result in results if "file" in result}
    assert str(markdown_file.resolve()) in files


def test_purge_deleted_files_does_not_match_sibling_prefix(tmp_data_dir, sample_folder, monkeypatch):
    """Purging one folder should not delete entries from a sibling folder with the same prefix."""
    init_db()

    class FakeVectorStore:
        def add(self, ids, documents, metadatas):
            return None

        def delete(self, ids):
            return None

        @property
        def count(self):
            return 0

    monkeypatch.setattr("atocore.ingestion.pipeline.get_vector_store", lambda: FakeVectorStore())

    kept_folder = tmp_data_dir / "notes"
    kept_folder.mkdir()
    kept_file = kept_folder / "keep.md"
    kept_file.write_text("# Keep\n\nThis document should survive purge.", encoding="utf-8")
    ingest_file(kept_file)

    purge_folder = tmp_data_dir / "notes-project"
    purge_folder.mkdir()
    purge_file = purge_folder / "gone.md"
    purge_file.write_text("# Gone\n\nThis document will be purged.", encoding="utf-8")
    ingest_file(purge_file)
    purge_file.unlink()

    ingest_folder(purge_folder, purge_deleted=True)

    with get_connection() as conn:
        rows = conn.execute("SELECT file_path FROM source_documents").fetchall()
        remaining_paths = {row["file_path"] for row in rows}
        assert str(kept_file.resolve()) in remaining_paths
304
tests/test_interactions.py
Normal file
@@ -0,0 +1,304 @@
"""Tests for the Phase 9 Commit A interaction capture loop."""

import time

import pytest
from fastapi.testclient import TestClient

from atocore.interactions.service import (
    get_interaction,
    list_interactions,
    record_interaction,
)
from atocore.main import app
from atocore.models.database import init_db


# --- Service-level tests --------------------------------------------------


def test_record_interaction_persists_all_fields(tmp_data_dir):
    init_db()
    interaction = record_interaction(
        prompt="What is the lateral support material for p05?",
        response="The current lateral support uses GF-PTFE pads per Decision D-024.",
        response_summary="lateral support: GF-PTFE per D-024",
        project="p05-interferometer",
        client="claude-code",
        session_id="sess-001",
        memories_used=["mem-aaa", "mem-bbb"],
        chunks_used=["chunk-111", "chunk-222", "chunk-333"],
        context_pack={"budget": 3000, "chunks": 3},
    )

    assert interaction.id
    assert interaction.created_at

    fetched = get_interaction(interaction.id)
    assert fetched is not None
    assert fetched.prompt.startswith("What is the lateral support")
    assert fetched.response.startswith("The current lateral support")
    assert fetched.response_summary == "lateral support: GF-PTFE per D-024"
    assert fetched.project == "p05-interferometer"
    assert fetched.client == "claude-code"
    assert fetched.session_id == "sess-001"
    assert fetched.memories_used == ["mem-aaa", "mem-bbb"]
    assert fetched.chunks_used == ["chunk-111", "chunk-222", "chunk-333"]
    assert fetched.context_pack == {"budget": 3000, "chunks": 3}


def test_record_interaction_minimum_fields(tmp_data_dir):
    init_db()
    interaction = record_interaction(prompt="ping")
    assert interaction.id
    assert interaction.prompt == "ping"
    assert interaction.response == ""
    assert interaction.memories_used == []
    assert interaction.chunks_used == []


def test_record_interaction_rejects_empty_prompt(tmp_data_dir):
    init_db()
    with pytest.raises(ValueError):
        record_interaction(prompt="")
    with pytest.raises(ValueError):
        record_interaction(prompt=" ")


def test_get_interaction_returns_none_for_unknown_id(tmp_data_dir):
    init_db()
    assert get_interaction("does-not-exist") is None
    assert get_interaction("") is None


def test_list_interactions_filters_by_project(tmp_data_dir):
    init_db()
    record_interaction(prompt="p04 question", project="p04-gigabit")
    record_interaction(prompt="p05 question", project="p05-interferometer")
    record_interaction(prompt="another p05", project="p05-interferometer")

    p05 = list_interactions(project="p05-interferometer")
    p04 = list_interactions(project="p04-gigabit")

    assert len(p05) == 2
    assert len(p04) == 1
    assert all(i.project == "p05-interferometer" for i in p05)
    assert p04[0].prompt == "p04 question"


def test_list_interactions_filters_by_session_and_client(tmp_data_dir):
    init_db()
    record_interaction(prompt="a", session_id="sess-A", client="openclaw")
    record_interaction(prompt="b", session_id="sess-A", client="claude-code")
    record_interaction(prompt="c", session_id="sess-B", client="openclaw")

    sess_a = list_interactions(session_id="sess-A")
    openclaw = list_interactions(client="openclaw")

    assert len(sess_a) == 2
    assert len(openclaw) == 2
    assert {i.client for i in sess_a} == {"openclaw", "claude-code"}


def test_list_interactions_orders_newest_first_and_respects_limit(tmp_data_dir):
    init_db()
    # created_at has 1-second resolution; sleep enough to keep ordering
    # deterministic regardless of insert speed.
    for index in range(5):
        record_interaction(prompt=f"prompt-{index}")
        time.sleep(1.05)

    items = list_interactions(limit=3)
    assert len(items) == 3
    # Newest first: prompt-4, prompt-3, prompt-2
    assert items[0].prompt == "prompt-4"
    assert items[1].prompt == "prompt-3"
    assert items[2].prompt == "prompt-2"


def test_list_interactions_respects_since_filter(tmp_data_dir):
    init_db()
    first = record_interaction(prompt="early")
    time.sleep(1.05)
    second = record_interaction(prompt="late")

    after_first = list_interactions(since=first.created_at)
    ids_after_first = {item.id for item in after_first}
    assert second.id in ids_after_first
    assert first.id in ids_after_first  # cutoff is inclusive

    after_second = list_interactions(since=second.created_at)
    ids_after_second = {item.id for item in after_second}
    assert second.id in ids_after_second
    assert first.id not in ids_after_second


def test_list_interactions_zero_limit_returns_empty(tmp_data_dir):
    init_db()
    record_interaction(prompt="ping")
    assert list_interactions(limit=0) == []


# --- API-level tests ------------------------------------------------------


def test_post_interactions_endpoint_records_interaction(tmp_data_dir):
    init_db()
    client = TestClient(app)
    response = client.post(
        "/interactions",
        json={
            "prompt": "What changed in p06 this week?",
            "response": "Polisher kinematic frame parameters updated to v0.3.",
            "response_summary": "p06 frame parameters bumped to v0.3",
            "project": "p06-polisher",
            "client": "claude-code",
            "session_id": "sess-xyz",
            "memories_used": ["mem-1"],
            "chunks_used": ["chunk-a", "chunk-b"],
            "context_pack": {"chunks": 2},
        },
    )
    assert response.status_code == 200
    body = response.json()
    assert body["status"] == "recorded"
    interaction_id = body["id"]

    # Round-trip via the GET endpoint
    fetched = client.get(f"/interactions/{interaction_id}")
    assert fetched.status_code == 200
    fetched_body = fetched.json()
    assert fetched_body["prompt"].startswith("What changed in p06")
    assert fetched_body["response"].startswith("Polisher kinematic frame")
    assert fetched_body["project"] == "p06-polisher"
    assert fetched_body["chunks_used"] == ["chunk-a", "chunk-b"]
    assert fetched_body["context_pack"] == {"chunks": 2}


def test_post_interactions_rejects_empty_prompt(tmp_data_dir):
    init_db()
    client = TestClient(app)
    response = client.post("/interactions", json={"prompt": ""})
    assert response.status_code == 400


def test_get_unknown_interaction_returns_404(tmp_data_dir):
    init_db()
    client = TestClient(app)
    response = client.get("/interactions/does-not-exist")
    assert response.status_code == 404


def test_list_interactions_endpoint_returns_summaries(tmp_data_dir):
    init_db()
    client = TestClient(app)
    client.post(
        "/interactions",
        json={"prompt": "alpha", "project": "p04-gigabit", "response": "x" * 10},
    )
    client.post(
        "/interactions",
        json={"prompt": "beta", "project": "p05-interferometer", "response": "y" * 50},
    )

    response = client.get("/interactions", params={"project": "p05-interferometer"})
    assert response.status_code == 200
    body = response.json()
    assert body["count"] == 1
    assert body["interactions"][0]["prompt"] == "beta"
    assert body["interactions"][0]["response_chars"] == 50
    # The list endpoint never includes the full response body
    assert "response" not in body["interactions"][0]


# --- alias canonicalization on interaction capture/list -------------------


def test_record_interaction_canonicalizes_project(project_registry):
    """Capturing under an alias should store the canonical project id.

    Regression for codex's P2 finding: reinforcement and extraction
    query memories by interaction.project; if the captured project is
    a raw alias they would silently miss memories stored under the
    canonical id.
    """
    init_db()
    project_registry(("p05-interferometer", ["p05", "interferometer"]))

    interaction = record_interaction(
        prompt="quick capture", response="response body", project="p05", reinforce=False
    )
    assert interaction.project == "p05-interferometer"

    fetched = get_interaction(interaction.id)
    assert fetched.project == "p05-interferometer"


def test_list_interactions_canonicalizes_project_filter(project_registry):
    init_db()
    project_registry(("p06-polisher", ["p06", "polisher"]))

    record_interaction(prompt="a", response="ra", project="p06-polisher", reinforce=False)
    record_interaction(prompt="b", response="rb", project="polisher", reinforce=False)
    record_interaction(prompt="c", response="rc", project="atocore", reinforce=False)

    # Query by an alias should still find both p06 captures
    via_alias = list_interactions(project="p06")
    via_canonical = list_interactions(project="p06-polisher")
    assert len(via_alias) == 2
    assert len(via_canonical) == 2
    assert {i.prompt for i in via_alias} == {"a", "b"}


# --- since filter format normalization ------------------------------------


def test_list_interactions_since_accepts_iso_with_t_separator(tmp_data_dir):
    init_db()
    record_interaction(prompt="early", response="r", reinforce=False)
    time.sleep(1.05)
    pivot = record_interaction(prompt="late", response="r", reinforce=False)

    # pivot.created_at is in storage format 'YYYY-MM-DD HH:MM:SS'.
    # Build the equivalent ISO 8601 with 'T' that an external client
    # would naturally send.
    iso_with_t = pivot.created_at.replace(" ", "T")
    items = list_interactions(since=iso_with_t)
    assert any(i.id == pivot.id for i in items)
    # The early row must also be excluded if its timestamp is strictly
    # before the pivot, since the cutoff is inclusive
    early_ids = {i.id for i in items if i.prompt == "early"}
    assert early_ids == set() or len(items) >= 1


def test_list_interactions_since_accepts_z_suffix(tmp_data_dir):
    init_db()
    pivot = record_interaction(prompt="pivot", response="r", reinforce=False)
    time.sleep(1.05)
    after = record_interaction(prompt="after", response="r", reinforce=False)

    iso_with_z = pivot.created_at.replace(" ", "T") + "Z"
    items = list_interactions(since=iso_with_z)
    ids = {i.id for i in items}
    assert pivot.id in ids
    assert after.id in ids


def test_list_interactions_since_accepts_offset(tmp_data_dir):
    init_db()
    pivot = record_interaction(prompt="pivot", response="r", reinforce=False)
    time.sleep(1.05)
    after = record_interaction(prompt="after", response="r", reinforce=False)

    iso_with_offset = pivot.created_at.replace(" ", "T") + "+00:00"
    items = list_interactions(since=iso_with_offset)
    assert any(i.id == after.id for i in items)


def test_list_interactions_since_storage_format_still_works(tmp_data_dir):
    """The bare storage format must still work for backwards compatibility."""
    init_db()
    pivot = record_interaction(prompt="pivot", response="r", reinforce=False)

    items = list_interactions(since=pivot.created_at)
    assert any(i.id == pivot.id for i in items)
18
tests/test_logging.py
Normal file
@@ -0,0 +1,18 @@
"""Tests for logging configuration."""

from types import SimpleNamespace

import atocore.config as config
from atocore.observability.logger import setup_logging


def test_setup_logging_uses_dynamic_settings_without_name_error():
    original_settings = config.settings
    try:
        config.settings = SimpleNamespace(debug=False)
        setup_logging()

        config.settings = SimpleNamespace(debug=True)
        setup_logging()
    finally:
        config.settings = original_settings
188
tests/test_memory.py
Normal file
@@ -0,0 +1,188 @@
"""Tests for Memory Core."""

import os
import tempfile

import pytest

import atocore.config as _config
from atocore.models.database import init_db


@pytest.fixture(autouse=True)
def isolated_db():
    """Give each test a completely isolated database."""
    tmpdir = tempfile.mkdtemp()
    os.environ["ATOCORE_DATA_DIR"] = tmpdir

    # Replace the global settings so all modules see the new data_dir
    _config.settings = _config.Settings()

    # Also reset any module-level references to the old settings
    import atocore.models.database
    # database.py now uses _config.settings dynamically, so no patch needed

    init_db()
    yield tmpdir


def test_create_memory(isolated_db):
    from atocore.memory.service import create_memory
    mem = create_memory("identity", "User is a mechanical engineer specializing in optics")
    assert mem.memory_type == "identity"
    assert mem.status == "active"
    assert mem.confidence == 1.0


def test_create_memory_invalid_type(isolated_db):
    from atocore.memory.service import create_memory
    with pytest.raises(ValueError, match="Invalid memory type"):
        create_memory("invalid_type", "some content")


def test_create_memory_dedup(isolated_db):
    from atocore.memory.service import create_memory
    m1 = create_memory("identity", "User is an engineer")
    m2 = create_memory("identity", "User is an engineer")
    assert m1.id == m2.id


def test_create_memory_dedup_is_project_scoped(isolated_db):
    from atocore.memory.service import create_memory
    m1 = create_memory("project", "Uses SQLite for local state", project="atocore")
    m2 = create_memory("project", "Uses SQLite for local state", project="openclaw")
    assert m1.id != m2.id


def test_project_is_persisted_and_filterable(isolated_db):
    from atocore.memory.service import create_memory, get_memories
    create_memory("project", "Uses SQLite for local state", project="atocore")
    create_memory("project", "Uses Postgres in production", project="openclaw")

    atocore_memories = get_memories(memory_type="project", project="atocore")
    assert len(atocore_memories) == 1
    assert atocore_memories[0].project == "atocore"


def test_get_memories_all(isolated_db):
    from atocore.memory.service import create_memory, get_memories
    create_memory("identity", "User is an engineer")
    create_memory("preference", "Prefers Python with type hints")
    create_memory("knowledge", "Zerodur has near-zero thermal expansion")

    mems = get_memories()
    assert len(mems) == 3


def test_get_memories_by_type(isolated_db):
    from atocore.memory.service import create_memory, get_memories
    create_memory("identity", "User is an engineer")
    create_memory("preference", "Prefers concise code")
    create_memory("preference", "Uses FastAPI for APIs")

    mems = get_memories(memory_type="preference")
    assert len(mems) == 2


def test_get_memories_active_only(isolated_db):
    from atocore.memory.service import create_memory, get_memories, invalidate_memory
    m = create_memory("knowledge", "Fact about optics")
    invalidate_memory(m.id)

    assert len(get_memories(active_only=True)) == 0
    assert len(get_memories(active_only=False)) == 1


def test_get_memories_min_confidence(isolated_db):
    from atocore.memory.service import create_memory, get_memories
    create_memory("knowledge", "High confidence fact", confidence=0.9)
    create_memory("knowledge", "Low confidence fact", confidence=0.3)

    high = get_memories(min_confidence=0.5)
    assert len(high) == 1
    assert high[0].confidence == 0.9


def test_update_memory(isolated_db):
    from atocore.memory.service import create_memory, get_memories, update_memory
    mem = create_memory("knowledge", "Initial fact")
    update_memory(mem.id, content="Updated fact", confidence=0.8)

    mems = get_memories()
    assert len(mems) == 1
    assert mems[0].content == "Updated fact"
    assert mems[0].confidence == 0.8


def test_update_memory_rejects_duplicate_active_memory(isolated_db):
    from atocore.memory.service import create_memory, update_memory

    first = create_memory("knowledge", "Canonical fact", project="atocore")
    second = create_memory("knowledge", "Different fact", project="atocore")

    with pytest.raises(ValueError, match="duplicate active memory"):
        update_memory(second.id, content="Canonical fact")


def test_create_memory_validates_confidence(isolated_db):
    from atocore.memory.service import create_memory

    with pytest.raises(ValueError, match="Confidence must be between 0.0 and 1.0"):
        create_memory("knowledge", "Out of range", confidence=1.5)


def test_invalidate_memory(isolated_db):
    from atocore.memory.service import create_memory, get_memories, invalidate_memory
    mem = create_memory("knowledge", "Wrong fact")
    invalidate_memory(mem.id)
    assert len(get_memories(active_only=True)) == 0


def test_supersede_memory(isolated_db):
    from atocore.memory.service import create_memory, get_memories, supersede_memory
    mem = create_memory("knowledge", "Old fact")
    supersede_memory(mem.id)

    mems = get_memories(active_only=False)
    assert len(mems) == 1
    assert mems[0].status == "superseded"


def test_memories_for_context(isolated_db):
    from atocore.memory.service import create_memory, get_memories_for_context
    create_memory("identity", "User is a senior mechanical engineer")
    create_memory("preference", "Prefers Python with type hints")

    text, chars = get_memories_for_context(memory_types=["identity", "preference"], budget=500)
    assert "--- AtoCore Memory ---" in text
    assert "[identity]" in text
    assert "[preference]" in text
    assert chars > 0


def test_memories_for_context_reserves_room_for_each_type(isolated_db):
    from atocore.memory.service import create_memory, get_memories_for_context
    create_memory("identity", "Identity entry that is intentionally long so it could consume the whole budget on its own")
    create_memory("preference", "Preference entry that should still appear")

    text, _ = get_memories_for_context(memory_types=["identity", "preference"], budget=120)
    assert "[preference]" in text


def test_memories_for_context_respects_actual_serialized_budget(isolated_db):
    from atocore.memory.service import create_memory, get_memories_for_context
    create_memory("identity", "Identity text that should fit the wrapper-aware memory budget calculation")
    create_memory("preference", "Preference text that should also fit")

    text, chars = get_memories_for_context(memory_types=["identity", "preference"], budget=140)
    assert chars == len(text)
    assert chars <= 140


def test_memories_for_context_empty(isolated_db):
    from atocore.memory.service import get_memories_for_context
    text, chars = get_memories_for_context()
    assert text == ""
    assert chars == 0
802
tests/test_migrate_legacy_aliases.py
Normal file
@@ -0,0 +1,802 @@
|
||||
"""Tests for scripts/migrate_legacy_aliases.py.
|
||||
|
||||
The migration script closes the compatibility gap documented in
|
||||
docs/architecture/project-identity-canonicalization.md. These tests
|
||||
cover:
|
||||
|
||||
- empty/clean database behavior
|
||||
- shadow projects detection
|
||||
- state rekey without collisions
|
||||
- state collision detection + apply refusal
|
||||
- memory rekey + supersession of duplicates
|
||||
- interaction rekey
|
||||
- end-to-end apply on a realistic shadow
|
||||
- idempotency (running twice produces the same final state)
|
||||
- report artifact is written
|
||||
- the pre-fix regression gap is actually closed after migration
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import sqlite3
|
||||
import sys
|
||||
import uuid
|
||||
from pathlib import Path
|
||||
|
||||
import pytest
|
||||
|
||||
from atocore.context.project_state import (
|
||||
get_state,
|
||||
init_project_state_schema,
|
||||
)
|
||||
from atocore.models.database import init_db
|
||||
|
||||
# Make scripts/ importable
|
||||
_REPO_ROOT = Path(__file__).resolve().parent.parent
|
||||
sys.path.insert(0, str(_REPO_ROOT / "scripts"))
|
||||
|
||||
import migrate_legacy_aliases as mig # noqa: E402
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Helpers that seed "legacy" rows the way they would have looked before fb6298a
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def _open_db_connection():
    """Open a direct SQLite connection to the test data dir's DB."""
    import atocore.config as config

    conn = sqlite3.connect(str(config.settings.db_path))
    conn.row_factory = sqlite3.Row
    conn.execute("PRAGMA foreign_keys = ON")
    return conn


def _seed_shadow_project(
    conn: sqlite3.Connection, shadow_name: str
) -> str:
    """Insert a projects row keyed under an alias, like the old set_state would have."""
    project_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO projects (id, name, description) VALUES (?, ?, ?)",
        (project_id, shadow_name, f"shadow row for {shadow_name}"),
    )
    conn.commit()
    return project_id


def _seed_state_row(
    conn: sqlite3.Connection,
    project_id: str,
    category: str,
    key: str,
    value: str,
    status: str = "active",
) -> str:
    row_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO project_state "
        "(id, project_id, category, key, value, source, confidence, status) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (row_id, project_id, category, key, value, "legacy-test", 1.0, status),
    )
    conn.commit()
    return row_id


def _seed_memory_row(
    conn: sqlite3.Connection,
    memory_type: str,
    content: str,
    project: str,
    status: str = "active",
) -> str:
    row_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO memories "
        "(id, memory_type, content, project, source_chunk_id, confidence, status) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (row_id, memory_type, content, project, None, 1.0, status),
    )
    conn.commit()
    return row_id


def _seed_interaction_row(
    conn: sqlite3.Connection, prompt: str, project: str
) -> str:
    row_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO interactions "
        "(id, prompt, context_pack, response_summary, response, "
        " memories_used, chunks_used, client, session_id, project, created_at) "
        "VALUES (?, ?, '{}', '', '', '[]', '[]', 'legacy-test', '', ?, '2026-04-01 12:00:00')",
        (row_id, prompt, project),
    )
    conn.commit()
    return row_id


# ---------------------------------------------------------------------------
# plan-building tests
# ---------------------------------------------------------------------------
@pytest.fixture(autouse=True)
def _setup(tmp_data_dir):
    init_db()
    init_project_state_schema()


def test_dry_run_on_empty_registry_reports_empty_plan(tmp_data_dir):
    """Empty registry -> empty alias map -> empty plan."""
    registry_path = tmp_data_dir / "empty-registry.json"
    registry_path.write_text('{"projects": []}', encoding="utf-8")

    conn = _open_db_connection()
    try:
        plan = mig.build_plan(conn, registry_path)
    finally:
        conn.close()

    assert plan.alias_map == {}
    assert plan.is_empty
    assert not plan.has_collisions
    assert plan.counts() == {
        "shadow_projects": 0,
        "state_rekey_rows": 0,
        "state_collisions": 0,
        "state_historical_drops": 0,
        "memory_rekey_rows": 0,
        "memory_supersede_rows": 0,
        "interaction_rekey_rows": 0,
    }
def test_dry_run_on_clean_registered_db_reports_empty_plan(project_registry):
    """A registry with projects but no legacy rows -> empty plan."""
    registry_path = project_registry(
        ("p05-interferometer", ["p05", "interferometer"])
    )

    conn = _open_db_connection()
    try:
        plan = mig.build_plan(conn, registry_path)
    finally:
        conn.close()

    assert plan.alias_map != {}
    assert plan.is_empty


def test_dry_run_finds_shadow_project(project_registry):
    registry_path = project_registry(
        ("p05-interferometer", ["p05", "interferometer"])
    )

    conn = _open_db_connection()
    try:
        _seed_shadow_project(conn, "p05")
        plan = mig.build_plan(conn, registry_path)
    finally:
        conn.close()

    assert len(plan.shadow_projects) == 1
    assert plan.shadow_projects[0].shadow_name == "p05"
    assert plan.shadow_projects[0].canonical_project_id == "p05-interferometer"
def test_dry_run_plans_state_rekey_without_collisions(project_registry):
    registry_path = project_registry(
        ("p05-interferometer", ["p05", "interferometer"])
    )

    conn = _open_db_connection()
    try:
        shadow_id = _seed_shadow_project(conn, "p05")
        _seed_state_row(conn, shadow_id, "status", "next_focus", "Wave 1 ingestion")
        _seed_state_row(conn, shadow_id, "decision", "lateral_support", "GF-PTFE")
        plan = mig.build_plan(conn, registry_path)
    finally:
        conn.close()

    assert len(plan.state_plans) == 1
    sp = plan.state_plans[0]
    assert len(sp.rows_to_rekey) == 2
    assert sp.collisions == []
    assert not plan.has_collisions


def test_dry_run_detects_state_collision(project_registry):
    """Shadow and canonical both have state under the same (category, key) with different values."""
    registry_path = project_registry(
        ("p05-interferometer", ["p05", "interferometer"])
    )

    conn = _open_db_connection()
    try:
        shadow_id = _seed_shadow_project(conn, "p05")
        canonical_id = _seed_shadow_project(conn, "p05-interferometer")
        _seed_state_row(conn, shadow_id, "status", "next_focus", "Wave 1")
        _seed_state_row(
            conn, canonical_id, "status", "next_focus", "Wave 2"
        )
        plan = mig.build_plan(conn, registry_path)
    finally:
        conn.close()

    assert plan.has_collisions
    collision = plan.state_plans[0].collisions[0]
    assert collision["shadow"]["value"] == "Wave 1"
    assert collision["canonical"]["value"] == "Wave 2"
def test_dry_run_plans_memory_rekey_and_supersession(project_registry):
    registry_path = project_registry(
        ("p04-gigabit", ["p04", "gigabit"])
    )

    conn = _open_db_connection()
    try:
        # A clean memory under the alias that will just be rekeyed
        _seed_memory_row(conn, "project", "clean rekey memory", "p04")
        # A memory that collides with an existing canonical memory
        _seed_memory_row(conn, "project", "duplicate content", "p04")
        _seed_memory_row(conn, "project", "duplicate content", "p04-gigabit")
        plan = mig.build_plan(conn, registry_path)
    finally:
        conn.close()

    # There's exactly one memory plan (one alias matched)
    assert len(plan.memory_plans) == 1
    mp = plan.memory_plans[0]
    # Two rows are candidates for rekey or supersession — one clean,
    # one duplicate. The duplicate is handled via to_supersede; the
    # other via rows_to_rekey.
    total_affected = len(mp.rows_to_rekey) + len(mp.to_supersede)
    assert total_affected == 2


def test_dry_run_plans_interaction_rekey(project_registry):
    registry_path = project_registry(
        ("p06-polisher", ["p06", "polisher"])
    )

    conn = _open_db_connection()
    try:
        _seed_interaction_row(conn, "quick capture under alias", "polisher")
        _seed_interaction_row(conn, "another alias-keyed row", "p06")
        plan = mig.build_plan(conn, registry_path)
    finally:
        conn.close()

    total = sum(len(p.rows_to_rekey) for p in plan.interaction_plans)
    assert total == 2


# ---------------------------------------------------------------------------
# apply tests
# ---------------------------------------------------------------------------
def test_apply_refuses_on_state_collision(project_registry):
    registry_path = project_registry(
        ("p05-interferometer", ["p05", "interferometer"])
    )

    conn = _open_db_connection()
    try:
        shadow_id = _seed_shadow_project(conn, "p05")
        canonical_id = _seed_shadow_project(conn, "p05-interferometer")
        _seed_state_row(conn, shadow_id, "status", "next_focus", "Wave 1")
        _seed_state_row(conn, canonical_id, "status", "next_focus", "Wave 2")

        plan = mig.build_plan(conn, registry_path)
        assert plan.has_collisions

        with pytest.raises(mig.MigrationRefused):
            mig.apply_plan(conn, plan)
    finally:
        conn.close()


def test_apply_migrates_clean_shadow_end_to_end(project_registry):
    """The happy path: one shadow project with clean state rows, rekey into a
    freshly-created canonical row, verify reachability via get_state."""
    registry_path = project_registry(
        ("p05-interferometer", ["p05", "interferometer"])
    )

    conn = _open_db_connection()
    try:
        shadow_id = _seed_shadow_project(conn, "p05")
        _seed_state_row(
            conn, shadow_id, "status", "next_focus", "Wave 1 ingestion"
        )
        _seed_state_row(
            conn, shadow_id, "decision", "lateral_support", "GF-PTFE"
        )

        plan = mig.build_plan(conn, registry_path)
        assert not plan.has_collisions
        summary = mig.apply_plan(conn, plan)
    finally:
        conn.close()

    assert summary["state_rows_rekeyed"] == 2
    assert summary["shadow_projects_deleted"] == 1
    assert summary["canonical_rows_created"] == 1

    # The regression gap is now closed: the service layer can see
    # the state under the canonical id via either the alias OR the
    # canonical.
    via_alias = get_state("p05")
    via_canonical = get_state("p05-interferometer")
    assert len(via_alias) == 2
    assert len(via_canonical) == 2
    values = {entry.value for entry in via_canonical}
    assert values == {"Wave 1 ingestion", "GF-PTFE"}
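# The migration internals aren't part of this diff. As a rough,
# hypothetical sketch (the table layout mirrors the seed helpers above,
# not the real module), a state "rekey" is just an UPDATE that moves
# rows from the shadow project id onto the canonical one:
def _sketch_rekey_state(conn, shadow_project_id, canonical_project_id):
    """Move every project_state row from the shadow id to the canonical id."""
    cur = conn.execute(
        "UPDATE project_state SET project_id = ? WHERE project_id = ?",
        (canonical_project_id, shadow_project_id),
    )
    return cur.rowcount  # how many rows were rekeyed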
def test_apply_drops_shadow_state_duplicate_without_collision(project_registry):
    """Shadow and canonical both have the same (category, key, value) — shadow
    gets marked superseded rather than hitting the UNIQUE constraint."""
    registry_path = project_registry(
        ("p05-interferometer", ["p05", "interferometer"])
    )

    conn = _open_db_connection()
    try:
        shadow_id = _seed_shadow_project(conn, "p05")
        canonical_id = _seed_shadow_project(conn, "p05-interferometer")
        _seed_state_row(
            conn, shadow_id, "status", "next_focus", "Wave 1 ingestion"
        )
        _seed_state_row(
            conn, canonical_id, "status", "next_focus", "Wave 1 ingestion"
        )

        plan = mig.build_plan(conn, registry_path)
        assert not plan.has_collisions
        summary = mig.apply_plan(conn, plan)
    finally:
        conn.close()

    assert summary["state_rows_merged_as_duplicate"] == 1

    via_canonical = get_state("p05-interferometer")
    # Exactly one active row survives
    assert len(via_canonical) == 1
    assert via_canonical[0].value == "Wave 1 ingestion"
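# Why the duplicate can't simply be rekeyed: with a UNIQUE constraint on
# (project_id, category, key), moving the shadow row onto the canonical id
# would collide with the canonical's existing row, so the shadow copy must
# be superseded instead. A hypothetical helper expressing that check:
def _sketch_rekey_would_collide(conn, canonical_id, category, key):
    """True when the canonical project already holds a row at this (category, key)."""
    row = conn.execute(
        "SELECT 1 FROM project_state WHERE project_id = ? AND category = ? AND key = ?",
        (canonical_id, category, key),
    ).fetchone()
    return row is not None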
def test_apply_preserves_superseded_shadow_state_when_no_collision(project_registry):
    """Regression test for the codex-flagged data-loss bug.

    Before the fix, plan_state_migration only selected status='active'
    rows. Any superseded or invalid row on the shadow project was
    invisible to the plan and got silently cascade-deleted when the
    shadow projects row was dropped at the end of apply. That's
    exactly the kind of audit loss a cleanup migration must not cause.

    This test seeds a shadow project with a superseded state row on
    a triple the canonical project doesn't have, runs the migration,
    and verifies the row survived and is now attached to the
    canonical project (still with status='superseded').
    """
    registry_path = project_registry(
        ("p05-interferometer", ["p05", "interferometer"])
    )

    conn = _open_db_connection()
    try:
        shadow_id = _seed_shadow_project(conn, "p05")
        # Superseded row on a triple the canonical won't have
        _seed_state_row(
            conn,
            shadow_id,
            "status",
            "historical_phase",
            "Phase 0 legacy",
            status="superseded",
        )

        plan = mig.build_plan(conn, registry_path)
        assert not plan.has_collisions
        summary = mig.apply_plan(conn, plan)
    finally:
        conn.close()

    # The superseded row should have been rekeyed, not dropped
    assert summary["state_rows_rekeyed"] == 1
    assert summary["state_rows_historical_dropped"] == 0

    # Verify via raw SQL that the row is now attached to the canonical
    # projects row and still has status='superseded'
    conn = _open_db_connection()
    try:
        row = conn.execute(
            "SELECT ps.status, ps.value, p.name "
            "FROM project_state ps JOIN projects p ON ps.project_id = p.id "
            "WHERE ps.category = ? AND ps.key = ?",
            ("status", "historical_phase"),
        ).fetchone()
    finally:
        conn.close()

    assert row is not None, "superseded shadow row was lost during migration"
    assert row["status"] == "superseded"
    assert row["value"] == "Phase 0 legacy"
    assert row["name"] == "p05-interferometer"
def test_apply_drops_shadow_inactive_row_when_canonical_holds_same_triple(project_registry):
    """Shadow is inactive (superseded) and collides with an active canonical row.

    The canonical wins by definition of the UPSERT schema. The shadow
    row is recorded as a historical_drop in the plan so the operator
    sees the audit loss, and the apply cascade-deletes it via the
    shadow projects row. This is the unavoidable data-loss case
    documented in the migration module docstring.
    """
    registry_path = project_registry(
        ("p05-interferometer", ["p05", "interferometer"])
    )

    conn = _open_db_connection()
    try:
        shadow_id = _seed_shadow_project(conn, "p05")
        canonical_id = _seed_shadow_project(conn, "p05-interferometer")

        # Shadow has a superseded value on a triple where the canonical
        # has a different active value. Can't preserve both: UNIQUE
        # allows only one row per triple.
        _seed_state_row(
            conn,
            shadow_id,
            "status",
            "next_focus",
            "Old wave 1",
            status="superseded",
        )
        _seed_state_row(
            conn,
            canonical_id,
            "status",
            "next_focus",
            "Wave 2 trusted-operational",
            status="active",
        )

        plan = mig.build_plan(conn, registry_path)
        assert not plan.has_collisions  # not an active-vs-active collision
        assert plan.counts()["state_historical_drops"] == 1

        summary = mig.apply_plan(conn, plan)
    finally:
        conn.close()

    assert summary["state_rows_historical_dropped"] == 1

    # The canonical's active row survives unchanged
    via_canonical = get_state("p05-interferometer")
    active_next_focus = [
        e
        for e in via_canonical
        if e.category == "status" and e.key == "next_focus"
    ]
    assert len(active_next_focus) == 1
    assert active_next_focus[0].value == "Wave 2 trusted-operational"
def test_apply_replaces_inactive_canonical_with_active_shadow(project_registry):
    """Shadow is active, canonical has an inactive row at the same triple.

    The shadow wins: canonical inactive row is deleted, shadow is
    rekeyed into canonical's project_id. This covers the
    cross-contamination case where the old alias path was used for
    the live value while the canonical path had a stale row.
    """
    registry_path = project_registry(
        ("p06-polisher", ["p06", "polisher"])
    )

    conn = _open_db_connection()
    try:
        shadow_id = _seed_shadow_project(conn, "p06")
        canonical_id = _seed_shadow_project(conn, "p06-polisher")

        # Canonical has a stale invalid row; shadow has the live value.
        _seed_state_row(
            conn,
            canonical_id,
            "decision",
            "frame",
            "Old frame (no longer current)",
            status="invalid",
        )
        _seed_state_row(
            conn,
            shadow_id,
            "decision",
            "frame",
            "kinematic mount frame",
            status="active",
        )

        plan = mig.build_plan(conn, registry_path)
        assert not plan.has_collisions
        assert plan.counts()["state_historical_drops"] == 0

        summary = mig.apply_plan(conn, plan)
    finally:
        conn.close()

    assert summary["state_rows_replaced_inactive_canonical"] == 1

    # The active shadow value now lives on the canonical row
    via_canonical = get_state("p06-polisher")
    frame_entries = [
        e for e in via_canonical if e.category == "decision" and e.key == "frame"
    ]
    assert len(frame_entries) == 1
    assert frame_entries[0].value == "kinematic mount frame"

    # Confirm via raw SQL that the previously-inactive canonical row
    # no longer exists
    conn = _open_db_connection()
    try:
        stale = conn.execute(
            "SELECT COUNT(*) AS c FROM project_state WHERE value = ?",
            ("Old frame (no longer current)",),
        ).fetchone()
    finally:
        conn.close()
    assert stale["c"] == 0
def test_apply_migrates_memories(project_registry):
    registry_path = project_registry(
        ("p04-gigabit", ["p04", "gigabit"])
    )

    conn = _open_db_connection()
    try:
        _seed_memory_row(conn, "project", "lateral support uses GF-PTFE", "p04")
        _seed_memory_row(conn, "preference", "I prefer descriptive commits", "gigabit")
        plan = mig.build_plan(conn, registry_path)
        summary = mig.apply_plan(conn, plan)
    finally:
        conn.close()

    assert summary["memory_rows_rekeyed"] == 2

    # Both memories should now read as living under the canonical id
    from atocore.memory.service import get_memories

    rows = get_memories(project="p04-gigabit", limit=50)
    contents = {m.content for m in rows}
    assert "lateral support uses GF-PTFE" in contents
    assert "I prefer descriptive commits" in contents


def test_apply_migrates_interactions(project_registry):
    registry_path = project_registry(
        ("p06-polisher", ["p06", "polisher"])
    )

    conn = _open_db_connection()
    try:
        _seed_interaction_row(conn, "alias-keyed 1", "polisher")
        _seed_interaction_row(conn, "alias-keyed 2", "p06")
        plan = mig.build_plan(conn, registry_path)
        summary = mig.apply_plan(conn, plan)
    finally:
        conn.close()

    assert summary["interaction_rows_rekeyed"] == 2

    from atocore.interactions.service import list_interactions

    rows = list_interactions(project="p06-polisher", limit=50)
    prompts = {i.prompt for i in rows}
    assert prompts == {"alias-keyed 1", "alias-keyed 2"}
def test_apply_is_idempotent(project_registry):
    """Running apply twice produces the same final state as running it once."""
    registry_path = project_registry(
        ("p05-interferometer", ["p05", "interferometer"])
    )

    conn = _open_db_connection()
    try:
        shadow_id = _seed_shadow_project(conn, "p05")
        _seed_state_row(conn, shadow_id, "status", "next_focus", "Wave 1")
        _seed_memory_row(conn, "project", "m1", "p05")
        _seed_interaction_row(conn, "i1", "p05")

        # first apply
        plan_a = mig.build_plan(conn, registry_path)
        summary_a = mig.apply_plan(conn, plan_a)

        # second apply: plan should be empty
        plan_b = mig.build_plan(conn, registry_path)
        assert plan_b.is_empty

        # forcing a second apply on the empty plan via the function
        # directly should also succeed as a no-op (caller normally
        # has to pass --allow-empty through the CLI, but apply_plan
        # itself doesn't enforce that — the refusal is in run())
        summary_b = mig.apply_plan(conn, plan_b)
    finally:
        conn.close()

    assert summary_a["state_rows_rekeyed"] == 1
    assert summary_a["memory_rows_rekeyed"] == 1
    assert summary_a["interaction_rows_rekeyed"] == 1
    assert summary_b["state_rows_rekeyed"] == 0
    assert summary_b["memory_rows_rekeyed"] == 0
    assert summary_b["interaction_rows_rekeyed"] == 0
def test_apply_refuses_with_integrity_errors(project_registry):
    """If the projects table has two case-variant rows for the canonical id, refuse.

    The projects.name column has a case-sensitive UNIQUE constraint,
    so exact duplicates can't exist. But case-variant rows
    ``p05-interferometer`` and ``P05-Interferometer`` can both
    survive the UNIQUE constraint while both matching the
    case-insensitive ``lower(name) = lower(?)`` lookup that the
    migration uses to find the canonical row. That ambiguity
    (which canonical row should dependents rekey into?) is exactly
    the integrity failure the migration is guarding against.
    """
    registry_path = project_registry(
        ("p05-interferometer", ["p05", "interferometer"])
    )

    conn = _open_db_connection()
    try:
        _seed_shadow_project(conn, "p05-interferometer")
        _seed_shadow_project(conn, "P05-Interferometer")
        plan = mig.build_plan(conn, registry_path)
        assert plan.integrity_errors
        with pytest.raises(mig.MigrationRefused):
            mig.apply_plan(conn, plan)
    finally:
        conn.close()

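# A self-contained illustration of the ambiguity described above
# (hypothetical one-column table, not the real schema): SQLite's default
# BINARY collation makes UNIQUE case-sensitive, so both case variants
# insert cleanly, yet both match a lower() lookup.
def _sketch_case_variant_ambiguity():
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE projects (name TEXT UNIQUE)")
    conn.execute("INSERT INTO projects (name) VALUES ('p05-interferometer')")
    conn.execute("INSERT INTO projects (name) VALUES ('P05-Interferometer')")
    matches = conn.execute(
        "SELECT name FROM projects WHERE lower(name) = lower(?)",
        ("p05-interferometer",),
    ).fetchall()
    return len(matches)  # both rows match: the lookup is ambiguous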

# ---------------------------------------------------------------------------
# reporting tests
# ---------------------------------------------------------------------------
def test_plan_to_json_dict_is_serializable(project_registry):
    registry_path = project_registry(
        ("p05-interferometer", ["p05", "interferometer"])
    )

    conn = _open_db_connection()
    try:
        shadow_id = _seed_shadow_project(conn, "p05")
        _seed_state_row(conn, shadow_id, "status", "next_focus", "Wave 1")
        plan = mig.build_plan(conn, registry_path)
    finally:
        conn.close()

    payload = mig.plan_to_json_dict(plan)
    # Must be JSON-serializable
    json_str = json.dumps(payload, default=str)
    assert "p05-interferometer" in json_str
    assert payload["counts"]["state_rekey_rows"] == 1


def test_write_report_creates_file(tmp_path, project_registry):
    registry_path = project_registry(
        ("p05-interferometer", ["p05", "interferometer"])
    )

    conn = _open_db_connection()
    try:
        plan = mig.build_plan(conn, registry_path)
    finally:
        conn.close()

    report_dir = tmp_path / "reports"
    report_path = mig.write_report(
        plan,
        summary=None,
        db_path=Path("/tmp/fake.db"),
        registry_path=registry_path,
        mode="dry-run",
        report_dir=report_dir,
    )
    assert report_path.exists()
    payload = json.loads(report_path.read_text(encoding="utf-8"))
    assert payload["mode"] == "dry-run"
    assert "plan" in payload
def test_render_plan_text_on_empty_plan(project_registry):
    registry_path = project_registry()  # empty
    conn = _open_db_connection()
    try:
        plan = mig.build_plan(conn, registry_path)
    finally:
        conn.close()

    text = mig.render_plan_text(plan)
    assert "nothing to plan" in text.lower()


def test_render_plan_text_on_collision(project_registry):
    registry_path = project_registry(
        ("p05-interferometer", ["p05"])
    )

    conn = _open_db_connection()
    try:
        shadow_id = _seed_shadow_project(conn, "p05")
        canonical_id = _seed_shadow_project(conn, "p05-interferometer")
        _seed_state_row(conn, shadow_id, "status", "phase", "A")
        _seed_state_row(conn, canonical_id, "status", "phase", "B")
        plan = mig.build_plan(conn, registry_path)
    finally:
        conn.close()

    text = mig.render_plan_text(plan)
    assert "COLLISION" in text.upper()
    # case-insensitive check ("REFUSE" in text.upper() would be equivalent)
    assert "refuse" in text.lower()


# ---------------------------------------------------------------------------
# gap-closed companion test — the flip side of
# test_legacy_alias_keyed_state_is_invisible_until_migrated in
# test_project_state.py. After running this migration, the legacy row
# IS reachable via the canonical id.
# ---------------------------------------------------------------------------
def test_legacy_alias_gap_is_closed_after_migration(project_registry):
    """End-to-end regression test for the canonicalization gap.

    Simulates the exact scenario from
    test_legacy_alias_keyed_state_is_invisible_until_migrated in
    test_project_state.py — a shadow projects row with a state row
    pointing at it. Runs the migration. Verifies the state is now
    reachable via the canonical id.
    """
    registry_path = project_registry(
        ("p05-interferometer", ["p05", "interferometer"])
    )

    conn = _open_db_connection()
    try:
        shadow_id = _seed_shadow_project(conn, "p05")
        _seed_state_row(
            conn, shadow_id, "status", "legacy_focus", "Wave 1 ingestion"
        )

        # Before migration: the legacy row is invisible to get_state
        # (this is the documented gap, covered in test_project_state.py)
        assert all(
            entry.value != "Wave 1 ingestion" for entry in get_state("p05")
        )
        assert all(
            entry.value != "Wave 1 ingestion"
            for entry in get_state("p05-interferometer")
        )

        # Run the migration
        plan = mig.build_plan(conn, registry_path)
        mig.apply_plan(conn, plan)
    finally:
        conn.close()

    # After migration: the row is reachable via canonical AND alias
    via_canonical = get_state("p05-interferometer")
    via_alias = get_state("p05")
    assert any(e.value == "Wave 1 ingestion" for e in via_canonical)
    assert any(e.value == "Wave 1 ingestion" for e in via_alias)
596 tests/test_project_registry.py Normal file
@@ -0,0 +1,596 @@
"""Tests for project registry resolution and refresh behavior."""

import json

import atocore.config as config
from atocore.projects.registry import (
    build_project_registration_proposal,
    get_registered_project,
    get_project_registry_template,
    list_registered_projects,
    register_project,
    refresh_registered_project,
    update_project,
)
def test_project_registry_lists_projects_with_resolved_roots(tmp_path, monkeypatch):
    vault_dir = tmp_path / "vault"
    drive_dir = tmp_path / "drive"
    config_dir = tmp_path / "config"
    vault_dir.mkdir()
    drive_dir.mkdir()
    config_dir.mkdir()
    (vault_dir / "incoming" / "projects" / "p04-gigabit").mkdir(parents=True)

    registry_path = config_dir / "project-registry.json"
    registry_path.write_text(
        json.dumps(
            {
                "projects": [
                    {
                        "id": "p04-gigabit",
                        "aliases": ["p04", "gigabit"],
                        "description": "P04 docs",
                        "ingest_roots": [
                            {
                                "source": "vault",
                                "subpath": "incoming/projects/p04-gigabit",
                                "label": "P04 staged docs",
                            }
                        ],
                    }
                ]
            }
        ),
        encoding="utf-8",
    )

    monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(vault_dir))
    monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(drive_dir))
    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))

    original_settings = config.settings
    try:
        config.settings = config.Settings()
        projects = list_registered_projects()
    finally:
        config.settings = original_settings

    assert len(projects) == 1
    assert projects[0]["id"] == "p04-gigabit"
    assert projects[0]["ingest_roots"][0]["exists"] is True
def test_project_registry_resolves_alias(tmp_path, monkeypatch):
    vault_dir = tmp_path / "vault"
    drive_dir = tmp_path / "drive"
    config_dir = tmp_path / "config"
    vault_dir.mkdir()
    drive_dir.mkdir()
    config_dir.mkdir()

    registry_path = config_dir / "project-registry.json"
    registry_path.write_text(
        json.dumps(
            {
                "projects": [
                    {
                        "id": "p05-interferometer",
                        "aliases": ["p05", "interferometer"],
                        "ingest_roots": [
                            {"source": "vault", "subpath": "incoming/projects/p05-interferometer"}
                        ],
                    }
                ]
            }
        ),
        encoding="utf-8",
    )

    monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(vault_dir))
    monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(drive_dir))
    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))

    original_settings = config.settings
    try:
        config.settings = config.Settings()
        project = get_registered_project("p05")
    finally:
        config.settings = original_settings

    assert project is not None
    assert project.project_id == "p05-interferometer"
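# How an id-or-alias lookup like get_registered_project("p05") can resolve
# in principle; a hypothetical sketch over the registry JSON shape used in
# these tests, not the registry module's actual implementation:
def _sketch_resolve_alias(registry, name):
    """Return the canonical project id for a name that may be an id or an alias."""
    for project in registry.get("projects", []):
        if name == project["id"] or name in project.get("aliases", []):
            return project["id"]
    return None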
def test_refresh_registered_project_ingests_registered_roots(tmp_path, monkeypatch):
    vault_dir = tmp_path / "vault"
    drive_dir = tmp_path / "drive"
    config_dir = tmp_path / "config"
    project_dir = vault_dir / "incoming" / "projects" / "p06-polisher"
    project_dir.mkdir(parents=True)
    drive_dir.mkdir()
    config_dir.mkdir()

    registry_path = config_dir / "project-registry.json"
    registry_path.write_text(
        json.dumps(
            {
                "projects": [
                    {
                        "id": "p06-polisher",
                        "aliases": ["p06", "polisher"],
                        "description": "P06 docs",
                        "ingest_roots": [
                            {"source": "vault", "subpath": "incoming/projects/p06-polisher"}
                        ],
                    }
                ]
            }
        ),
        encoding="utf-8",
    )

    calls = []

    def fake_ingest_folder(path, purge_deleted=True):
        calls.append((str(path), purge_deleted))
        return [{"file": str(path / "README.md"), "status": "ingested"}]

    monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(vault_dir))
    monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(drive_dir))
    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))

    original_settings = config.settings
    try:
        config.settings = config.Settings()
        monkeypatch.setattr("atocore.projects.registry.ingest_folder", fake_ingest_folder)
        result = refresh_registered_project("polisher")
    finally:
        config.settings = original_settings

    assert result["project"] == "p06-polisher"
    assert len(calls) == 1
    assert calls[0][0].endswith("p06-polisher")
    assert calls[0][1] is False
    assert result["roots"][0]["status"] == "ingested"
    assert result["status"] == "ingested"
    assert result["roots_ingested"] == 1
    assert result["roots_skipped"] == 0
def test_refresh_registered_project_reports_nothing_to_ingest_when_all_missing(
    tmp_path, monkeypatch
):
    vault_dir = tmp_path / "vault"
    drive_dir = tmp_path / "drive"
    config_dir = tmp_path / "config"
    vault_dir.mkdir()
    drive_dir.mkdir()
    config_dir.mkdir()

    registry_path = config_dir / "project-registry.json"
    registry_path.write_text(
        json.dumps(
            {
                "projects": [
                    {
                        "id": "p07-ghost",
                        "aliases": ["ghost"],
                        "description": "Project whose roots do not exist on disk",
                        "ingest_roots": [
                            {"source": "vault", "subpath": "incoming/projects/p07-ghost"}
                        ],
                    }
                ]
            }
        ),
        encoding="utf-8",
    )

    def fail_ingest_folder(path, purge_deleted=True):
        raise AssertionError(f"ingest_folder should not be called for missing root: {path}")

    monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(vault_dir))
    monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(drive_dir))
    monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))

    original_settings = config.settings
    try:
        config.settings = config.Settings()
        monkeypatch.setattr("atocore.projects.registry.ingest_folder", fail_ingest_folder)
        result = refresh_registered_project("ghost")
    finally:
        config.settings = original_settings

    assert result["status"] == "nothing_to_ingest"
    assert result["roots_ingested"] == 0
    assert result["roots_skipped"] == 1
    assert result["roots"][0]["status"] == "missing"
def test_refresh_registered_project_reports_partial_status(tmp_path, monkeypatch):
|
||||
vault_dir = tmp_path / "vault"
|
||||
drive_dir = tmp_path / "drive"
|
||||
config_dir = tmp_path / "config"
|
||||
real_root = vault_dir / "incoming" / "projects" / "p08-mixed"
|
||||
real_root.mkdir(parents=True)
|
||||
drive_dir.mkdir()
|
||||
config_dir.mkdir()
|
||||
|
||||
registry_path = config_dir / "project-registry.json"
|
||||
registry_path.write_text(
|
||||
json.dumps(
|
||||
{
|
||||
"projects": [
|
||||
{
|
||||
"id": "p08-mixed",
|
||||
"aliases": ["mixed"],
|
||||
"description": "One root present, one missing",
|
||||
"ingest_roots": [
|
||||
{"source": "vault", "subpath": "incoming/projects/p08-mixed"},
|
||||
{"source": "vault", "subpath": "incoming/projects/p08-mixed-missing"},
|
||||
],
|
||||
}
|
||||
]
|
||||
}
|
||||
),
|
||||
encoding="utf-8",
|
||||
)
|
||||
|
||||
def fake_ingest_folder(path, purge_deleted=True):
|
||||
return [{"file": str(path / "README.md"), "status": "ingested"}]
|
||||
|
||||
monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(vault_dir))
|
||||
monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(drive_dir))
|
||||
monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
monkeypatch.setattr("atocore.projects.registry.ingest_folder", fake_ingest_folder)
|
||||
result = refresh_registered_project("mixed")
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
assert result["status"] == "partial"
|
||||
assert result["roots_ingested"] == 1
|
||||
assert result["roots_skipped"] == 1
|
||||
statuses = sorted(root["status"] for root in result["roots"])
|
||||
assert statuses == ["ingested", "missing"]
|
||||
|
||||
|
||||
def test_project_registry_template_has_expected_shape():
|
||||
template = get_project_registry_template()
|
||||
assert "projects" in template
|
||||
assert template["projects"][0]["id"] == "p07-example"
|
||||
assert template["projects"][0]["ingest_roots"][0]["source"] == "vault"
|
||||
|
||||
|
||||
def test_project_registry_rejects_alias_collision(tmp_path, monkeypatch):
|
||||
vault_dir = tmp_path / "vault"
|
||||
drive_dir = tmp_path / "drive"
|
||||
config_dir = tmp_path / "config"
|
||||
vault_dir.mkdir()
|
||||
drive_dir.mkdir()
|
||||
config_dir.mkdir()
|
||||
|
||||
registry_path = config_dir / "project-registry.json"
|
||||
registry_path.write_text(
|
||||
json.dumps(
|
||||
{
|
||||
"projects": [
|
||||
{
|
||||
"id": "p04-gigabit",
|
||||
"aliases": ["shared"],
|
||||
"ingest_roots": [
|
||||
{"source": "vault", "subpath": "incoming/projects/p04-gigabit"}
|
||||
],
|
||||
},
|
||||
{
|
||||
"id": "p05-interferometer",
|
||||
"aliases": ["shared"],
|
||||
"ingest_roots": [
|
||||
{"source": "vault", "subpath": "incoming/projects/p05-interferometer"}
|
||||
],
|
||||
},
|
||||
]
|
||||
}
|
||||
),
|
||||
encoding="utf-8",
|
||||
)
|
||||
|
||||
monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(vault_dir))
|
||||
monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(drive_dir))
|
||||
monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
try:
|
||||
list_registered_projects()
|
||||
except ValueError as exc:
|
||||
assert "collision" in str(exc)
|
||||
else:
|
||||
raise AssertionError("Expected project registry collision to raise")
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
|
||||
def test_project_registration_proposal_normalizes_and_resolves_paths(tmp_path, monkeypatch):
|
||||
vault_dir = tmp_path / "vault"
|
||||
drive_dir = tmp_path / "drive"
|
||||
config_dir = tmp_path / "config"
|
||||
staged = vault_dir / "incoming" / "projects" / "p07-example"
|
||||
staged.mkdir(parents=True)
|
||||
drive_dir.mkdir()
|
||||
config_dir.mkdir()
|
||||
registry_path = config_dir / "project-registry.json"
|
||||
registry_path.write_text(json.dumps({"projects": []}), encoding="utf-8")
|
||||
|
||||
monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(vault_dir))
|
||||
monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(drive_dir))
|
||||
monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
proposal = build_project_registration_proposal(
|
||||
project_id="p07-example",
|
||||
aliases=["p07", "example-project", "p07"],
|
||||
description="Example project",
|
||||
ingest_roots=[
|
||||
{
|
||||
"source": "vault",
|
||||
"subpath": "incoming/projects/p07-example",
|
||||
"label": "Primary docs",
|
||||
}
|
||||
],
|
||||
)
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
assert proposal["project"]["aliases"] == ["p07", "example-project"]
|
||||
assert proposal["resolved_ingest_roots"][0]["exists"] is True
|
||||
assert proposal["valid"] is True
|
||||
|
||||
|
||||
def test_project_registration_proposal_reports_collisions(tmp_path, monkeypatch):
|
||||
vault_dir = tmp_path / "vault"
|
||||
drive_dir = tmp_path / "drive"
|
||||
config_dir = tmp_path / "config"
|
||||
vault_dir.mkdir()
|
||||
drive_dir.mkdir()
|
||||
config_dir.mkdir()
|
||||
registry_path = config_dir / "project-registry.json"
|
||||
registry_path.write_text(
|
||||
json.dumps(
|
||||
{
|
||||
"projects": [
|
||||
{
|
||||
"id": "p05-interferometer",
|
||||
"aliases": ["p05", "interferometer"],
|
||||
"ingest_roots": [
|
||||
{"source": "vault", "subpath": "incoming/projects/p05-interferometer"}
|
||||
],
|
||||
}
|
||||
]
|
||||
}
|
||||
),
|
||||
encoding="utf-8",
|
||||
)
|
||||
|
||||
monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(vault_dir))
|
||||
monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(drive_dir))
|
||||
monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
proposal = build_project_registration_proposal(
|
||||
project_id="p08-example",
|
||||
aliases=["interferometer"],
|
||||
ingest_roots=[
|
||||
{"source": "vault", "subpath": "incoming/projects/p08-example"}
|
||||
],
|
||||
)
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
assert proposal["valid"] is False
|
||||
assert proposal["collisions"][0]["existing_project"] == "p05-interferometer"
|
||||
|
||||
|
||||
def test_register_project_persists_new_entry(tmp_path, monkeypatch):
|
||||
vault_dir = tmp_path / "vault"
|
||||
drive_dir = tmp_path / "drive"
|
||||
config_dir = tmp_path / "config"
|
||||
staged = vault_dir / "incoming" / "projects" / "p07-example"
|
||||
staged.mkdir(parents=True)
|
||||
drive_dir.mkdir()
|
||||
config_dir.mkdir()
|
||||
registry_path = config_dir / "project-registry.json"
|
||||
registry_path.write_text(json.dumps({"projects": []}), encoding="utf-8")
|
||||
|
||||
monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(vault_dir))
|
||||
monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(drive_dir))
|
||||
monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
result = register_project(
|
||||
project_id="p07-example",
|
||||
aliases=["p07", "example-project"],
|
||||
description="Example project",
|
||||
ingest_roots=[
|
||||
{
|
||||
"source": "vault",
|
||||
"subpath": "incoming/projects/p07-example",
|
||||
"label": "Primary docs",
|
||||
}
|
||||
],
|
||||
)
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
assert result["status"] == "registered"
|
||||
payload = json.loads(registry_path.read_text(encoding="utf-8"))
|
||||
assert payload["projects"][0]["id"] == "p07-example"
|
||||
assert payload["projects"][0]["aliases"] == ["p07", "example-project"]
|
||||
|
||||
|
||||
def test_register_project_rejects_collisions(tmp_path, monkeypatch):
|
||||
vault_dir = tmp_path / "vault"
|
||||
drive_dir = tmp_path / "drive"
|
||||
config_dir = tmp_path / "config"
|
||||
vault_dir.mkdir()
|
||||
drive_dir.mkdir()
|
||||
config_dir.mkdir()
|
||||
registry_path = config_dir / "project-registry.json"
|
||||
registry_path.write_text(
|
||||
json.dumps(
|
||||
{
|
||||
"projects": [
|
||||
{
|
||||
"id": "p05-interferometer",
|
||||
"aliases": ["p05", "interferometer"],
|
||||
"ingest_roots": [
|
||||
{"source": "vault", "subpath": "incoming/projects/p05-interferometer"}
|
||||
],
|
||||
}
|
||||
]
|
||||
}
|
||||
),
|
||||
encoding="utf-8",
|
||||
)
|
||||
|
||||
monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(vault_dir))
|
||||
monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(drive_dir))
|
||||
monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
try:
|
||||
register_project(
|
||||
project_id="p07-example",
|
||||
aliases=["interferometer"],
|
||||
ingest_roots=[
|
||||
{"source": "vault", "subpath": "incoming/projects/p07-example"}
|
||||
],
|
||||
)
|
||||
except ValueError as exc:
|
||||
assert "collisions" in str(exc)
|
||||
else:
|
||||
raise AssertionError("Expected collision to prevent project registration")
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
|
||||
def test_update_project_persists_description_and_aliases(tmp_path, monkeypatch):
|
||||
vault_dir = tmp_path / "vault"
|
||||
drive_dir = tmp_path / "drive"
|
||||
config_dir = tmp_path / "config"
|
||||
staged = vault_dir / "incoming" / "projects" / "p04-gigabit"
|
||||
staged.mkdir(parents=True)
|
||||
drive_dir.mkdir()
|
||||
config_dir.mkdir()
|
||||
registry_path = config_dir / "project-registry.json"
|
||||
registry_path.write_text(
|
||||
json.dumps(
|
||||
{
|
||||
"projects": [
|
||||
{
|
||||
"id": "p04-gigabit",
|
||||
"aliases": ["p04", "gigabit"],
|
||||
"description": "Old description",
|
||||
"ingest_roots": [
|
||||
{
|
||||
"source": "vault",
|
||||
"subpath": "incoming/projects/p04-gigabit",
|
||||
"label": "Primary docs",
|
||||
}
|
||||
],
|
||||
}
|
||||
]
|
||||
}
|
||||
),
|
||||
encoding="utf-8",
|
||||
)
|
||||
|
||||
monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(vault_dir))
|
||||
monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(drive_dir))
|
||||
monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
result = update_project(
|
||||
"p04",
|
||||
aliases=["p04", "gigabit", "gigabit-project"],
|
||||
description="Updated P04 project docs",
|
||||
)
|
||||
finally:
|
||||
config.settings = original_settings
|
||||
|
||||
assert result["status"] == "updated"
|
||||
assert result["project"]["id"] == "p04-gigabit"
|
||||
assert result["project"]["aliases"] == ["p04", "gigabit", "gigabit-project"]
|
||||
assert result["project"]["description"] == "Updated P04 project docs"
|
||||
|
||||
payload = json.loads(registry_path.read_text(encoding="utf-8"))
|
||||
assert payload["projects"][0]["aliases"] == ["p04", "gigabit", "gigabit-project"]
|
||||
assert payload["projects"][0]["description"] == "Updated P04 project docs"
|
||||
|
||||
|
||||
def test_update_project_rejects_colliding_aliases(tmp_path, monkeypatch):
|
||||
vault_dir = tmp_path / "vault"
|
||||
drive_dir = tmp_path / "drive"
|
||||
config_dir = tmp_path / "config"
|
||||
vault_dir.mkdir()
|
||||
drive_dir.mkdir()
|
||||
config_dir.mkdir()
|
||||
registry_path = config_dir / "project-registry.json"
|
||||
registry_path.write_text(
|
||||
json.dumps(
|
||||
{
|
||||
"projects": [
|
||||
{
|
||||
"id": "p04-gigabit",
|
||||
"aliases": ["p04", "gigabit"],
|
||||
"ingest_roots": [
|
||||
{"source": "vault", "subpath": "incoming/projects/p04-gigabit"}
|
||||
],
|
||||
},
|
||||
{
|
||||
"id": "p05-interferometer",
|
||||
"aliases": ["p05", "interferometer"],
|
||||
"ingest_roots": [
|
||||
{"source": "vault", "subpath": "incoming/projects/p05-interferometer"}
|
||||
],
|
||||
},
|
||||
]
|
||||
}
|
||||
),
|
||||
encoding="utf-8",
|
||||
)
|
||||
|
||||
monkeypatch.setenv("ATOCORE_VAULT_SOURCE_DIR", str(vault_dir))
|
||||
monkeypatch.setenv("ATOCORE_DRIVE_SOURCE_DIR", str(drive_dir))
|
||||
monkeypatch.setenv("ATOCORE_PROJECT_REGISTRY_PATH", str(registry_path))
|
||||
|
||||
original_settings = config.settings
|
||||
try:
|
||||
config.settings = config.Settings()
|
||||
try:
|
||||
update_project(
|
||||
"p04-gigabit",
|
||||
aliases=["p04", "interferometer"],
|
||||
)
|
||||
except ValueError as exc:
|
||||
assert "collisions" in str(exc)
|
||||
else:
|
||||
raise AssertionError("Expected collision to prevent project update")
|
||||
finally:
|
||||
config.settings = original_settings
|
||||