# AtoCore

Personal context engine that enriches LLM interactions with durable memory, structured context, and project knowledge.

## Quick Start

```bash
# Install the package in editable mode
pip install -e .
# Start the API server on port 8100
uvicorn atocore.main:app --port 8100
```

## Usage

```bash
# Ingest markdown files
curl -X POST http://localhost:8100/ingest \
  -H "Content-Type: application/json" \
  -d '{"path": "/path/to/notes"}'

# Build enriched context for a prompt
curl -X POST http://localhost:8100/context/build \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is the project status?", "project": "myproject"}'

# CLI ingestion
python scripts/ingest_folder.py --path /path/to/notes

# Live operator client
python scripts/atocore_client.py health
python scripts/atocore_client.py audit-query "gigabit" 5
```

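
The two `curl` calls above can also be issued from Python. The following is a minimal sketch using only the standard library. The helper names `build_request` and `post_json` are hypothetical and not part of AtoCore, the base URL assumes the server from Quick Start, and the response body is assumed to be JSON.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8100"  # assumption: server started as in Quick Start


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for an AtoCore endpoint (hypothetical helper)."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict, timeout: float = 30.0) -> dict:
    """Send the request and decode the reply, assuming the server answers with JSON."""
    with urllib.request.urlopen(build_request(path, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `post_json("/ingest", {"path": "/path/to/notes"})` mirrors the first `curl` example, and `post_json("/context/build", {"prompt": "What is the project status?", "project": "myproject"})` mirrors the second.
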
## API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| POST | /ingest | Ingest markdown file or folder |
| POST | /query | Retrieve relevant chunks |
| POST | /context/build | Build full context pack |
| POST | /interactions | Capture prompt/response interactions |
| GET/POST | /memory | List/create durable memories |
| GET/POST | /entities | Engineering entity graph surface |
| GET | /admin/dashboard | Operator dashboard |
| GET | /health | Health check |
| GET | /debug/context | Inspect last context pack |

## API versioning

The public contract for external clients (AKC, OpenClaw, future tools) is
served under a `/v1` prefix. Every public endpoint is available at both its
unversioned path and under `/v1`; for example, `POST /entities` and
`POST /v1/entities` route to the same handler.

Rules:

- New public endpoints land at the latest version prefix.
- Backwards-compatible additions stay on the current version.
- Breaking schema changes to an existing endpoint bump the prefix (`/v2/...`)
  and leave the prior version in place until clients migrate.
- Unversioned paths are retained for internal callers (hooks, scripts, the
  wiki/admin UI). External clients should not rely on them; use `/v1` instead.

The authoritative list of versioned paths is in `src/atocore/main.py`
(`_V1_PUBLIC_PATHS`). `GET /openapi.json` reflects both the versioned and
unversioned forms.

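
The dual-path behavior described above can be pictured with a plain route table. This is only an illustrative sketch, not how `src/atocore/main.py` registers routes: `add_v1_aliases`, `PUBLIC_PATHS`, both handlers, and the `/internal/hook` path are hypothetical, and the real `_V1_PUBLIC_PATHS` may be shaped differently.

```python
from typing import Callable, Dict, Iterable

Handler = Callable[[], str]

# Assumption: public paths are a flat collection of path strings. The real
# `_V1_PUBLIC_PATHS` in src/atocore/main.py may be shaped differently.
PUBLIC_PATHS = ("/entities",)


def add_v1_aliases(
    routes: Dict[str, Handler],
    public_paths: Iterable[str],
    prefix: str = "/v1",
) -> Dict[str, Handler]:
    """Return a copy of `routes` where each public path is also served under `prefix`."""
    aliased = dict(routes)
    for path in public_paths:
        if path in routes:
            aliased[prefix + path] = routes[path]  # same handler object, two paths
    return aliased


def create_entity() -> str:  # hypothetical public handler
    return "entity created"


def internal_hook() -> str:  # hypothetical internal-only handler
    return "hook ran"


routes = add_v1_aliases(
    {"/entities": create_entity, "/internal/hook": internal_hook},
    PUBLIC_PATHS,
)
```

In this sketch `routes["/v1/entities"]` and `routes["/entities"]` are the same function, while the internal path gets no `/v1` alias.
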
## Architecture

```text
FastAPI (port 8100)
|- Ingestion: markdown -> parse -> chunk -> embed -> store
|- Retrieval: query -> embed -> vector search -> rank
|- Context Builder: project state -> memories -> entities -> retrieval -> budget
|- Reflection: capture -> reinforce -> extract -> triage -> promote/expire
|- Engineering: typed entities, relationships, conflicts, wiki/mirror
|- SQLite (documents, chunks, memories, projects, interactions, entities)
'- ChromaDB (vector embeddings)
```

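
The Context Builder line in the diagram ends with a budget step. One way to picture that step is a greedy fill in priority order, sketched below. The function `pack_context` and the truncation policy are assumptions for illustration. The actual builder may allocate per-section budgets or cut on chunk boundaries. The default of 3000 characters matches `ATOCORE_CONTEXT_BUDGET`.

```python
from typing import Iterable, List, Tuple


def pack_context(
    sections: Iterable[Tuple[str, str]],
    budget: int = 3000,
) -> List[Tuple[str, str]]:
    """Greedily keep (name, text) sections in priority order within a character budget."""
    packed: List[Tuple[str, str]] = []
    remaining = budget
    for name, text in sections:
        if remaining <= 0:
            break  # budget spent, drop all lower-priority sections
        if not text:
            continue  # nothing to add for this section
        kept = text[:remaining]  # truncate the section that crosses the budget
        packed.append((name, kept))
        remaining -= len(kept)
    return packed


# Sections listed in the same order as the diagram (contents are placeholders).
pack = pack_context(
    [
        ("project_state", "s" * 1000),
        ("memories", "m" * 1500),
        ("entities", "e" * 1000),
        ("retrieval", "r" * 500),
    ]
)
```

Here the first two sections fit whole, `entities` is cut to the remaining 500 characters, and `retrieval` is dropped because nothing is left.
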
## Configuration

Set via environment variables (prefix `ATOCORE_`):

| Variable | Default | Description |
|----------|---------|-------------|
| ATOCORE_DEBUG | false | Enable debug logging |
| ATOCORE_PORT | 8100 | Server port |
| ATOCORE_CHUNK_MAX_SIZE | 800 | Max chunk size (chars) |
| ATOCORE_CONTEXT_BUDGET | 3000 | Context pack budget (chars) |
| ATOCORE_EMBEDDING_MODEL | paraphrase-multilingual-MiniLM-L12-v2 | Embedding model |
| ATOCORE_RANK_PROJECT_MATCH_BOOST | 2.0 | Soft boost for chunks whose metadata matches the project hint |
| ATOCORE_RANK_PROJECT_SCOPE_FILTER | true | Filter project-hinted retrieval away from other registered project corpora |
| ATOCORE_RANK_PROJECT_SCOPE_CANDIDATE_MULTIPLIER | 4 | Widen candidate pull before project-scope filtering |
| ATOCORE_RANK_QUERY_TOKEN_STEP | 0.08 | Per-token boost when query terms appear in high-signal metadata |
| ATOCORE_RANK_QUERY_TOKEN_CAP | 1.32 | Maximum query-token boost multiplier |
| ATOCORE_RANK_PATH_HIGH_SIGNAL_BOOST | 1.18 | Boost current decision/status/requirements-like paths |
| ATOCORE_RANK_PATH_LOW_SIGNAL_PENALTY | 0.72 | Down-rank archive/history-like paths |

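
To make the ranking knobs in the table concrete, here is a small sketch that reads four of them from the environment and turns them into multipliers. It is an assumption that the query-token boost starts at 1.0 and grows linearly by the step until it reaches the cap. With the defaults that would mean the cap is hit at four matched tokens. `RankSettings`, `query_token_boost`, and `path_multiplier` are hypothetical names, not AtoCore's own.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class RankSettings:
    """Ranking knobs with the defaults from the configuration table."""

    query_token_step: float = 0.08
    query_token_cap: float = 1.32
    path_high_signal_boost: float = 1.18
    path_low_signal_penalty: float = 0.72

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "RankSettings":
        def read(name: str, default: float) -> float:
            # Fall back to the documented default when the variable is unset.
            return float(env.get(f"ATOCORE_RANK_{name}", default))

        return cls(
            query_token_step=read("QUERY_TOKEN_STEP", 0.08),
            query_token_cap=read("QUERY_TOKEN_CAP", 1.32),
            path_high_signal_boost=read("PATH_HIGH_SIGNAL_BOOST", 1.18),
            path_low_signal_penalty=read("PATH_LOW_SIGNAL_PENALTY", 0.72),
        )


def query_token_boost(matched_tokens: int, settings: RankSettings) -> float:
    """Assumed formula: 1.0 plus one step per matched token, never above the cap."""
    return min(1.0 + settings.query_token_step * matched_tokens, settings.query_token_cap)


def path_multiplier(kind: str, settings: RankSettings) -> float:
    """Multiplier for a path classified as 'high' or 'low' signal; 1.0 otherwise."""
    if kind == "high":
        return settings.path_high_signal_boost
    if kind == "low":
        return settings.path_low_signal_penalty
    return 1.0
```

Passing a plain dict to `RankSettings.from_env` instead of `os.environ` makes it easy to try alternative values without touching the process environment.
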
`ATOCORE_RANK_PROJECT_SCOPE_FILTER` gates the hard cross-project filter only.
`ATOCORE_RANK_PROJECT_MATCH_BOOST` remains the separate soft-ranking knob.

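
The difference between the hard filter and the soft boost can be sketched as two separate steps. Everything here is an assumption made for illustration: the `Chunk` shape, the function names, and the reading that the filter drops chunks owned by a different registered project while leaving unowned chunks alone. The actual retrieval code may differ.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Chunk:
    text: str
    project: Optional[str]  # owning project, or None if the chunk is unowned
    score: float


def scope_filter(
    chunks: Iterable[Chunk],
    hint: str,
    registered: Set[str],
    enabled: bool = True,
) -> List[Chunk]:
    """Hard step: remove chunks that belong to another registered project."""
    if not enabled:
        return list(chunks)
    return [c for c in chunks if c.project == hint or c.project not in registered]


def match_boost(chunks: Iterable[Chunk], hint: str, boost: float = 2.0) -> List[Chunk]:
    """Soft step: multiply matching chunks' scores, then sort best first."""
    boosted = [
        Chunk(c.text, c.project, c.score * boost if c.project == hint else c.score)
        for c in chunks
    ]
    return sorted(boosted, key=lambda c: c.score, reverse=True)


candidates = [
    Chunk("alpha status", "alpha", 0.5),
    Chunk("beta status", "beta", 0.9),
    Chunk("general note", None, 0.6),
]
registered = {"alpha", "beta"}

with_filter = match_boost(scope_filter(candidates, "alpha", registered), "alpha")
without_filter = match_boost(
    scope_filter(candidates, "alpha", registered, enabled=False), "alpha"
)
```

With the filter on, the `beta` chunk is gone regardless of its score. With the filter off it stays in the results and the boost alone decides the order.
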
## Testing

```bash
pip install -e ".[dev]"
pytest
```

## Operations

- `scripts/atocore_client.py` provides a live API client for project refresh, project-state inspection, and retrieval-quality audits.
- `scripts/retrieval_eval.py` runs the live retrieval/context harness, separates blocking failures from known content gaps, and stamps JSON output with target/build metadata.
- `scripts/live_status.py` renders a compact read-only status report from `/health`, `/stats`, `/projects`, and `/admin/dashboard`; set `ATOCORE_AUTH_TOKEN` or `--auth-token` when those endpoints are gated.
- `docs/operations.md` captures the current operational priority order: retrieval quality, Wave 2 trusted-operational ingestion, AtoDrive scoping, and restore validation.
- `DEV-LEDGER.md` is the fast-moving source of operational truth during active development; copy claims into docs only after checking the live service.

## Architecture Notes

Implementation-facing architecture notes live under `docs/architecture/`.

Current additions:
- `docs/architecture/engineering-knowledge-hybrid-architecture.md`: 5-layer hybrid model
- `docs/architecture/engineering-ontology-v1.md`: V1 object and relationship inventory
- `docs/architecture/engineering-query-catalog.md`: 20 v1-required queries
- `docs/architecture/memory-vs-entities.md`: canonical home split
- `docs/architecture/promotion-rules.md`: Layer 0 to Layer 2 pipeline
- `docs/architecture/conflict-model.md`: contradictory facts detection and resolution