Anto01 531c560db7 feat: Phase 1 ingestion hardening + Phase 5 Trusted Project State
Phase 1 - Ingestion hardening:
- Encoding fallback (UTF-8/UTF-8-sig/Latin-1/CP1252)
- Delete detection: purge DB/vector entries for removed files
- Ingestion stats endpoint (GET /stats)

Phase 5 - Trusted Project State:
- project_state table with categories (status, decision, requirement, contact, milestone, fact, config)
- CRUD API: POST/GET/DELETE /project/state
- Upsert semantics, invalidation (supersede) support
- Context builder integrates project state at highest trust precedence
- Project state gets 20% budget allocation, appears first in context
- Trust precedence: Project State > Retrieved Chunks (per Master Plan)

33/33 tests passing. Validated end-to-end with GigaBIT M1 project data.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:41:59 -04:00

AtoCore

Personal context engine that enriches LLM interactions with durable memory, structured context, and project knowledge.

Quick Start

pip install -e .
uvicorn src.atocore.main:app --port 8100

Usage

# Ingest markdown files
curl -X POST http://localhost:8100/ingest \
  -H "Content-Type: application/json" \
  -d '{"path": "/path/to/notes"}'

# Build enriched context for a prompt
curl -X POST http://localhost:8100/context/build \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is the project status?", "project": "myproject"}'

# CLI ingestion
python scripts/ingest_folder.py --path /path/to/notes

API Endpoints

Method Path Description
POST /ingest Ingest markdown file or folder
POST /query Retrieve relevant chunks
POST /context/build Build full context pack
GET /health Health check
GET /debug/context Inspect last context pack

Architecture

FastAPI (port 8100)
  ├── Ingestion: markdown → parse → chunk → embed → store
  ├── Retrieval: query → embed → vector search → rank
  ├── Context Builder: retrieve → boost → budget → format
  ├── SQLite (documents, chunks, memories, projects, interactions)
  └── ChromaDB (vector embeddings)

Configuration

Set via environment variables (prefix ATOCORE_):

Variable Default Description
ATOCORE_DEBUG false Enable debug logging
ATOCORE_PORT 8100 Server port
ATOCORE_CHUNK_MAX_SIZE 800 Max chunk size (chars)
ATOCORE_CONTEXT_BUDGET 3000 Context pack budget (chars)
ATOCORE_EMBEDDING_MODEL paraphrase-multilingual-MiniLM-L12-v2 Embedding model

Testing

pip install -e ".[dev]"
pytest
Description
ATODrive project repository
Readme 1.7 MiB
Languages
Python 96.2%
Shell 3.3%
JavaScript 0.4%