Anto01 cbf9e03ab9 feat(engineering): V1-0 write-time invariants (F-1 + F-5 hook + F-8)
Phase V1-0 of the Engineering V1 Completion Plan. Establishes the
write-time invariants every later phase depends on so no later phase
can leak invalid state into the entity store.

F-1 shared-header fields per engineering-v1-acceptance.md:45:
  - entities.extractor_version (default "", EXTRACTOR_VERSION="v1.0.0"
    written by service.create_entity)
  - entities.canonical_home (default "entity")
  - entities.hand_authored (default 0, INTEGER boolean)
  Idempotent ALTERs in both _apply_migrations (database.py) and
  init_engineering_schema (service.py). CREATE TABLE also carries the
  columns for fresh DBs. _row_to_entity tolerates old rows without
  them so tests that predate V1-0 keep passing.
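
A minimal sketch of what an idempotent column add can look like in SQLite,
assuming the existence check goes through PRAGMA table_info. The helper name
add_column_if_missing and the TEXT column types are illustrative assumptions,
not the code in database.py or service.py; only the column names and defaults
come from the notes above.

```python
import sqlite3

def add_column_if_missing(conn, table, column, ddl):
    """Add `column` to `table` unless it already exists (hypothetical helper)."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl}")
    return True

# Stand-in for a pre-V1-0 database; the real entities table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (id TEXT PRIMARY KEY, name TEXT)")

F1_COLUMNS = [
    ("extractor_version", "TEXT DEFAULT ''"),      # TEXT is an assumption
    ("canonical_home", "TEXT DEFAULT 'entity'"),   # TEXT is an assumption
    ("hand_authored", "INTEGER DEFAULT 0"),
]

# Running the migration twice must be safe: the second pass adds nothing.
first_pass = [add_column_if_missing(conn, "entities", c, d) for c, d in F1_COLUMNS]
second_pass = [add_column_if_missing(conn, "entities", c, d) for c, d in F1_COLUMNS]
```

Checking the schema before each ALTER is what lets the same migration run from
two entry points without failing on a duplicate column.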

F-8 provenance enforcement per promotion-rules.md:243:
  create_entity raises ValueError when source_refs is empty and
  hand_authored is False. New kwargs hand_authored and
  extractor_version threaded through the API (EntityCreateRequest)
  and the /wiki/new form body (human wiki writes set hand_authored
  true by definition). The non-negotiable invariant: every row either
  carries provenance or is explicitly flagged as hand-authored.
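
The F-8 rule can be sketched as a standalone guard. The function name
check_provenance and the error message are hypothetical; in the commit the
check lives inside create_entity, whose full signature is not shown here.

```python
def check_provenance(source_refs, hand_authored=False):
    """Raise ValueError when a row has neither provenance nor the hand-authored flag."""
    if not source_refs and not hand_authored:
        raise ValueError("entity needs source_refs or hand_authored=True")

# Extracted entity with provenance: accepted.
check_provenance(["notes/design.md"])

# Human-authored entity without sources: accepted because of the explicit flag.
check_provenance([], hand_authored=True)

# Neither provenance nor flag: rejected.
try:
    check_provenance([])
    rejected = False
except ValueError:
    rejected = True
```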

F-5 synchronous conflict-detection hook on active create per
engineering-v1-acceptance.md:99:
  create_entity(status="active") now runs detect_conflicts_for_entity
  with fail-open per conflict-model.md:256. Detector errors log a
  warning but never 4xx-block the write (Q-3 "flag, never block").
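
A minimal sketch of the fail-open behaviour described above, assuming the
detector is passed in as a callable. The function name run_conflict_hook, the
dict-shaped entity, and the logger name are illustrative assumptions, not the
actual service code.

```python
import logging

log = logging.getLogger("conflict_hook_sketch")  # hypothetical logger name

def run_conflict_hook(entity, detector):
    """Run the detector for active entities only; never let it block the write."""
    if entity.get("status") != "active":
        return []  # candidates skip the hook entirely
    try:
        return list(detector(entity))
    except Exception as exc:
        # Fail open: log a warning and carry on with the write.
        log.warning("conflict detection failed for %s: %s", entity.get("id"), exc)
        return []

def broken_detector(entity):
    raise RuntimeError("detector unavailable")

# The write path still gets a result even though the detector raised.
conflicts = run_conflict_hook({"id": "e1", "status": "active"}, broken_detector)
```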

Doc note added to engineering-ontology-v1.md recording that `project`
IS the `project_id` per "fields equivalent to" wording. No storage
rename.

Backfill script scripts/v1_0_backfill_provenance.py reports and
optionally flags existing active entities that lack provenance.
Idempotent. Supports --dry-run and --invalidate-instead.

Tests: 10 new in test_v1_0_write_invariants.py covering F-1 fields,
F-8 raise + bypass, F-5 hook on active + no-hook on candidate, Q-3
fail-open, Q-4 partial scope_only=active excludes candidates.

Three pre-existing conflict tests adapted to read list_open_conflicts
rather than re-run the detector (which now dedups because the hook
already fired at create-time). One API test adds hand_authored=true
since its fixture has no source_refs.

conftest.py wraps create_entity so tests that don't pass source_refs
or hand_authored default to hand_authored=True (tests author their
own fixture data — reasonable default). Production paths (API route,
wiki form, graduation scripts) all pass explicit values and are
unaffected.

Test count: 533 -> 543 (+10). Full suite green in 77.86s.

Pending: Codex review on the branch before squash-merge to main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 14:39:30 -04:00

AtoCore

Personal context engine that enriches LLM interactions with durable memory, structured context, and project knowledge.

Quick Start

pip install -e .
uvicorn src.atocore.main:app --port 8100

Usage

# Ingest markdown files
curl -X POST http://localhost:8100/ingest \
  -H "Content-Type: application/json" \
  -d '{"path": "/path/to/notes"}'

# Build enriched context for a prompt
curl -X POST http://localhost:8100/context/build \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is the project status?", "project": "myproject"}'

# CLI ingestion
python scripts/ingest_folder.py --path /path/to/notes

# Live operator client
python scripts/atocore_client.py health
python scripts/atocore_client.py audit-query "gigabit" 5
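
The same /context/build call can be made from Python with only the standard
library. This is a sketch: the helper name build_context_request is
hypothetical, and the response shape is not documented here, so the example
stops at building the request.

```python
import json
import urllib.request

def build_context_request(prompt, project, base_url="http://localhost:8100"):
    """Build a POST request mirroring the curl example (hypothetical helper)."""
    body = json.dumps({"prompt": prompt, "project": project}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/context/build",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_context_request("What is the project status?", "myproject")

# To send it against a running server (assuming the response body is JSON):
#   with urllib.request.urlopen(req) as resp:
#       pack = json.load(resp)
```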

API Endpoints

Method  Path            Description
POST    /ingest         Ingest markdown file or folder
POST    /query          Retrieve relevant chunks
POST    /context/build  Build full context pack
GET     /health         Health check
GET     /debug/context  Inspect last context pack

API versioning

The public contract for external clients (AKC, OpenClaw, future tools) is served under a /v1 prefix. Every public endpoint is available at both its unversioned path and under /v1 — e.g. POST /entities and POST /v1/entities route to the same handler.

Rules:

  • New public endpoints land at the latest version prefix.
  • Backwards-compatible additions stay on the current version.
  • Breaking schema changes to an existing endpoint bump the prefix (/v2/...) and leave the prior version in place until clients migrate.
  • Unversioned paths are retained for internal callers (hooks, scripts, the wiki/admin UI). Do not rely on them from external clients — use /v1.

The authoritative list of versioned paths is in src/atocore/main.py (_V1_PUBLIC_PATHS). GET /openapi.json reflects both the versioned and unversioned forms.
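
The dual-path rule can be illustrated without any web framework. This sketch
uses a plain dict as a route table; the names register_with_v1_alias and
create_entity_handler are hypothetical, and it does not reproduce how
src/atocore/main.py actually wires _V1_PUBLIC_PATHS.

```python
def register_with_v1_alias(routes, method, path, handler, prefix="/v1"):
    """Register a handler at its unversioned path and under the version prefix."""
    routes[(method, path)] = handler
    routes[(method, prefix + path)] = handler

def create_entity_handler(payload):
    # Placeholder handler body for the illustration.
    return {"created": payload.get("name")}

routes = {}
register_with_v1_alias(routes, "POST", "/entities", create_entity_handler)

# Both paths resolve to the very same handler object.
same_handler = routes[("POST", "/entities")] is routes[("POST", "/v1/entities")]
```

A breaking change would register a new handler under a /v2 prefix while
leaving the /v1 entry untouched, which is the migration rule listed above.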

Architecture

FastAPI (port 8100)
  |- Ingestion: markdown -> parse -> chunk -> embed -> store
  |- Retrieval: query -> embed -> vector search -> rank
  |- Context Builder: retrieve -> boost -> budget -> format
  |- SQLite (documents, chunks, memories, projects, interactions)
  '- ChromaDB (vector embeddings)
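
The "budget" step of the Context Builder can be sketched as a greedy packer
over scored chunks. The greedy strategy, the function name pack_within_budget,
and the (score, text) tuple shape are assumptions made for illustration; the
actual builder may rank and trim differently.

```python
def pack_within_budget(chunks, budget_chars=3000):
    """Pick the highest-scoring chunks whose combined length fits the budget.

    `chunks` is a list of (score, text) pairs (assumed shape).
    """
    picked, used = [], 0
    for score, text in sorted(chunks, key=lambda c: c[0], reverse=True):
        if used + len(text) > budget_chars:
            continue  # skip chunks that would overflow; smaller ones may still fit
        picked.append(text)
        used += len(text)
    return picked

chunks = [(0.9, "a" * 2000), (0.8, "b" * 1500), (0.5, "c" * 900)]
pack = pack_within_budget(chunks, budget_chars=3000)
```

With these inputs the 2000-char chunk is taken first, the 1500-char chunk is
skipped because it would overflow, and the 900-char chunk still fits.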

Configuration

Set via environment variables (prefix ATOCORE_):

Variable                 Default                                Description
ATOCORE_DEBUG            false                                  Enable debug logging
ATOCORE_PORT             8100                                   Server port
ATOCORE_CHUNK_MAX_SIZE   800                                    Max chunk size (chars)
ATOCORE_CONTEXT_BUDGET   3000                                   Context pack budget (chars)
ATOCORE_EMBEDDING_MODEL  paraphrase-multilingual-MiniLM-L12-v2  Embedding model
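
A minimal sketch of reading these variables with the documented defaults,
using only the standard library. The function name load_settings, the returned
dict keys, and the accepted truthy strings are assumptions; the project's own
settings loader may work differently.

```python
import os

def load_settings(env=None):
    """Read ATOCORE_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        # Accepted truthy spellings are an assumption of this sketch.
        "debug": env.get("ATOCORE_DEBUG", "false").strip().lower() in ("1", "true", "yes"),
        "port": int(env.get("ATOCORE_PORT", "8100")),
        "chunk_max_size": int(env.get("ATOCORE_CHUNK_MAX_SIZE", "800")),
        "context_budget": int(env.get("ATOCORE_CONTEXT_BUDGET", "3000")),
        "embedding_model": env.get(
            "ATOCORE_EMBEDDING_MODEL", "paraphrase-multilingual-MiniLM-L12-v2"
        ),
    }

# Passing a dict instead of os.environ keeps the example deterministic.
defaults = load_settings({})
overridden = load_settings({"ATOCORE_PORT": "9000", "ATOCORE_DEBUG": "TRUE"})
```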

Testing

pip install -e ".[dev]"
pytest

Operations

  • scripts/atocore_client.py provides a live API client for project refresh, project-state inspection, and retrieval-quality audits.
  • docs/operations.md captures the current operational priority order: retrieval quality, Wave 2 trusted-operational ingestion, AtoDrive scoping, and restore validation.

Architecture Notes

Implementation-facing architecture notes live under docs/architecture/.

Current additions:

  • docs/architecture/engineering-knowledge-hybrid-architecture.md — 5-layer hybrid model
  • docs/architecture/engineering-ontology-v1.md — V1 object and relationship inventory
  • docs/architecture/engineering-query-catalog.md — 20 v1-required queries
  • docs/architecture/memory-vs-entities.md — canonical home split
  • docs/architecture/promotion-rules.md — Layer 0 to Layer 2 pipeline
  • docs/architecture/conflict-model.md — contradictory facts detection and resolution
Description
ATODrive project repository