1a2ee5e07f1103041e6ed293aedc17a842ce8a8d
scripts/auto_triage.py: fetches candidate memories, asks a triage model (claude -p, default sonnet) to classify each as promote / reject / needs_human, and executes the verdict via the API. Trust model: - Auto-promote: model says promote AND confidence >= 0.8 AND dedup-checked against existing active memories for the project - Auto-reject: model says reject - needs_human: everything else stays in queue for manual review The triage model receives both the candidate content AND a summary of existing active memories for the same project, so it can detect duplicates and near-duplicates. The system prompt explicitly lists the rejection categories learned from the first two manual triage passes (stale snapshots, impl details, planned-not-implemented, process rules that belong in ledger not memory). deploy/dalidou/batch-extract.sh now runs extraction (Step A) then auto-triage (Step B) in sequence. The nightly cron at 03:00 UTC will run the full pipeline: backup → cleanup → rsync → extract → triage. Only needs_human candidates reach the human. Supports --dry-run for preview without executing. Supports --model override for multi-model triage (e.g. opus for higher-quality review, or a future Gemini/Ollama backend). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
AtoCore
Personal context engine that enriches LLM interactions with durable memory, structured context, and project knowledge.
Quick Start
pip install -e .
uvicorn src.atocore.main:app --port 8100
Usage
# Ingest markdown files
curl -X POST http://localhost:8100/ingest \
-H "Content-Type: application/json" \
-d '{"path": "/path/to/notes"}'
# Build enriched context for a prompt
curl -X POST http://localhost:8100/context/build \
-H "Content-Type: application/json" \
-d '{"prompt": "What is the project status?", "project": "myproject"}'
# CLI ingestion
python scripts/ingest_folder.py --path /path/to/notes
# Live operator client
python scripts/atocore_client.py health
python scripts/atocore_client.py audit-query "gigabit" 5
API Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /ingest | Ingest markdown file or folder |
| POST | /query | Retrieve relevant chunks |
| POST | /context/build | Build full context pack |
| GET | /health | Health check |
| GET | /debug/context | Inspect last context pack |
Architecture
FastAPI (port 8100)
|- Ingestion: markdown -> parse -> chunk -> embed -> store
|- Retrieval: query -> embed -> vector search -> rank
|- Context Builder: retrieve -> boost -> budget -> format
|- SQLite (documents, chunks, memories, projects, interactions)
'- ChromaDB (vector embeddings)
Configuration
Set via environment variables (prefix ATOCORE_):
| Variable | Default | Description |
|---|---|---|
| ATOCORE_DEBUG | false | Enable debug logging |
| ATOCORE_PORT | 8100 | Server port |
| ATOCORE_CHUNK_MAX_SIZE | 800 | Max chunk size (chars) |
| ATOCORE_CONTEXT_BUDGET | 3000 | Context pack budget (chars) |
| ATOCORE_EMBEDDING_MODEL | paraphrase-multilingual-MiniLM-L12-v2 | Embedding model |
Testing
pip install -e ".[dev]"
pytest
Operations
scripts/atocore_client.pyprovides a live API client for project refresh, project-state inspection, and retrieval-quality audits.docs/operations.mdcaptures the current operational priority order: retrieval quality, Wave 2 trusted-operational ingestion, AtoDrive scoping, and restore validation.
Architecture Notes
Implementation-facing architecture notes live under docs/architecture/.
Current additions:
docs/architecture/engineering-knowledge-hybrid-architecture.md— 5-layer hybrid modeldocs/architecture/engineering-ontology-v1.md— V1 object and relationship inventorydocs/architecture/engineering-query-catalog.md— 20 v1-required queriesdocs/architecture/memory-vs-entities.md— canonical home splitdocs/architecture/promotion-rules.md— Layer 0 to Layer 2 pipelinedocs/architecture/conflict-model.md— contradictory facts detection and resolution
Description
Languages
Python
96.2%
Shell
3.3%
JavaScript
0.4%