bfa7dba4de4fb4a650f5e53e258e14c161e2edae
Adds structural metadata that the LLM triage was already implicitly
reasoning about ("stale snapshot" → reject). Phase 3 captures that
reasoning as fields so it can DRIVE retrieval, not just rejection.
Schema (src/atocore/models/database.py):
- domain_tags TEXT DEFAULT '[]' JSON array of lowercase topic keywords
- valid_until DATETIME ISO date; null = permanent
- idx_memories_valid_until index for efficient expiry queries
Memory service (src/atocore/memory/service.py):
- Memory dataclass gains domain_tags + valid_until
- create_memory, update_memory accept/persist both
- _row_to_memory safely reads both (JSON-decode + null handling)
- _normalize_tags helper: lowercase, dedup, strip, cap at 10
- get_memories_for_context filters expired (valid_until < today UTC)
- _rank_memories_for_query adds tag-boost: memories whose domain_tags
appear as substrings in query text rank higher (tertiary key after
content-overlap density + absolute overlap, before confidence)
LLM extractor (_llm_prompt.py → llm-0.5.0):
- SYSTEM_PROMPT documents domain_tags (2-5 keywords) + valid_until
(time-bounded facts get expiry dates; durable facts stay null)
- normalize_candidate_item parses both fields from model output with
graceful fallback for string/null/missing
LLM triage (scripts/auto_triage.py):
- TRIAGE_SYSTEM_PROMPT documents same two fields
- parse_verdict extracts them from verdict JSON
- On promote: PUT /memory/{id} with tags + valid_until BEFORE
POST /memory/{id}/promote, so active memories carry them
API (src/atocore/api/routes.py):
- MemoryCreateRequest: adds domain_tags, valid_until
- MemoryUpdateRequest: adds domain_tags, valid_until, memory_type
- GET /memory response exposes domain_tags + valid_until + created_at
Triage UI (src/atocore/engineering/triage_ui.py):
- Renders existing tags as colored badges
- Adds inline text field for tags (comma-separated) + date picker for
valid_until on every candidate card
- Save&Promote button persists edits via PUT then promotes
- Plain Promote (and Y shortcut) also saves tags/expiry if edited
Wiki (src/atocore/engineering/wiki.py):
- Search now matches memory content OR domain_tags
- Search results render tags as clickable badges linking to
/wiki/search?q=<tag> for cross-project navigation
- valid_until shown as amber "valid until YYYY-MM-DD" hint
Tests: 303 → 308 (5 new for Phase 3 behavior):
- test_create_memory_with_tags_and_valid_until
- test_create_memory_normalizes_tags
- test_update_memory_sets_tags_and_valid_until
- test_get_memories_for_context_excludes_expired
- test_context_builder_tag_boost_orders_results
Deferred (explicitly): temporal_scope enum, source_refs memory graph,
HDBSCAN clustering, memory detail wiki page, backfill of existing
actives. See docs/MASTER-BRAIN-PLAN.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
AtoCore
Personal context engine that enriches LLM interactions with durable memory, structured context, and project knowledge.
Quick Start
pip install -e .
uvicorn src.atocore.main:app --port 8100
Usage
# Ingest markdown files
curl -X POST http://localhost:8100/ingest \
-H "Content-Type: application/json" \
-d '{"path": "/path/to/notes"}'
# Build enriched context for a prompt
curl -X POST http://localhost:8100/context/build \
-H "Content-Type: application/json" \
-d '{"prompt": "What is the project status?", "project": "myproject"}'
# CLI ingestion
python scripts/ingest_folder.py --path /path/to/notes
# Live operator client
python scripts/atocore_client.py health
python scripts/atocore_client.py audit-query "gigabit" 5
API Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /ingest | Ingest markdown file or folder |
| POST | /query | Retrieve relevant chunks |
| POST | /context/build | Build full context pack |
| GET | /health | Health check |
| GET | /debug/context | Inspect last context pack |
Architecture
FastAPI (port 8100)
|- Ingestion: markdown -> parse -> chunk -> embed -> store
|- Retrieval: query -> embed -> vector search -> rank
|- Context Builder: retrieve -> boost -> budget -> format
|- SQLite (documents, chunks, memories, projects, interactions)
'- ChromaDB (vector embeddings)
Configuration
Set via environment variables (prefix ATOCORE_):
| Variable | Default | Description |
|---|---|---|
| ATOCORE_DEBUG | false | Enable debug logging |
| ATOCORE_PORT | 8100 | Server port |
| ATOCORE_CHUNK_MAX_SIZE | 800 | Max chunk size (chars) |
| ATOCORE_CONTEXT_BUDGET | 3000 | Context pack budget (chars) |
| ATOCORE_EMBEDDING_MODEL | paraphrase-multilingual-MiniLM-L12-v2 | Embedding model |
Testing
pip install -e ".[dev]"
pytest
Operations
scripts/atocore_client.pyprovides a live API client for project refresh, project-state inspection, and retrieval-quality audits.docs/operations.mdcaptures the current operational priority order: retrieval quality, Wave 2 trusted-operational ingestion, AtoDrive scoping, and restore validation.
Architecture Notes
Implementation-facing architecture notes live under docs/architecture/.
Current additions:
docs/architecture/engineering-knowledge-hybrid-architecture.md— 5-layer hybrid modeldocs/architecture/engineering-ontology-v1.md— V1 object and relationship inventorydocs/architecture/engineering-query-catalog.md— 20 v1-required queriesdocs/architecture/memory-vs-entities.md— canonical home splitdocs/architecture/promotion-rules.md— Layer 0 to Layer 2 pipelinedocs/architecture/conflict-model.md— contradictory facts detection and resolution
Description
Languages
Python
94.8%
Shell
3.9%
JavaScript
0.9%
PowerShell
0.3%