feat: implement AtoCore Phase 0 + Phase 0.5 (foundation + PoC)
Complete implementation of the personal context engine foundation:
- FastAPI server with 5 endpoints (ingest, query, context/build, health, debug)
- SQLite database with 5 tables (documents, chunks, memories, projects, interactions)
- Heading-aware markdown chunker (800 char max, recursive splitting)
- Multilingual embeddings via sentence-transformers (EN/FR)
- ChromaDB vector store with cosine similarity retrieval
- Context builder with project boosting, dedup, and budget enforcement
- CLI scripts for batch ingestion and test prompt evaluation
- 19 unit tests passing, 79% coverage
- Validated on 482 real project files (8383 chunks, 0 errors)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:21:27 -04:00
"""AtoCore configuration via environment variables."""

from pathlib import Path

from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    env: str = "development"
    debug: bool = False
    log_level: str = "INFO"
    data_dir: Path = Path("./data")
    db_dir: Path | None = None
    chroma_dir: Path | None = None
    cache_dir: Path | None = None
    tmp_dir: Path | None = None
    vault_source_dir: Path = Path("./sources/vault")
    drive_source_dir: Path = Path("./sources/drive")
    source_vault_enabled: bool = True
    source_drive_enabled: bool = True
    log_dir: Path = Path("./logs")
    backup_dir: Path = Path("./backups")
    run_dir: Path = Path("./run")
    project_registry_path: Path = Path("./config/project-registry.json")
    host: str = "127.0.0.1"
    port: int = 8100
    db_busy_timeout_ms: int = 5000

    # Embedding
    embedding_model: str = (
        "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
    )

    # Chunking
    chunk_max_size: int = 800
    chunk_overlap: int = 100
    chunk_min_size: int = 50

    # Context
    context_budget: int = 3000
    context_top_k: int = 15

    model_config = {"env_prefix": "ATOCORE_"}

    @property
    def db_path(self) -> Path:
        # Precedence: an explicitly configured db_dir wins, then an existing
        # legacy database directly under data_dir, then the default db directory.
        legacy_path = self.resolved_data_dir / "atocore.db"
        if self.db_dir is not None:
            return self.resolved_db_dir / "atocore.db"
        if legacy_path.exists():
            return legacy_path
        return self.resolved_db_dir / "atocore.db"

    @property
    def chroma_path(self) -> Path:
        return self._resolve_path(self.chroma_dir or (self.resolved_data_dir / "chroma"))

    @property
    def cache_path(self) -> Path:
        return self._resolve_path(self.cache_dir or (self.resolved_data_dir / "cache"))

    @property
    def tmp_path(self) -> Path:
        return self._resolve_path(self.tmp_dir or (self.resolved_data_dir / "tmp"))

    @property
    def resolved_data_dir(self) -> Path:
        return self._resolve_path(self.data_dir)

    @property
    def resolved_db_dir(self) -> Path:
        return self._resolve_path(self.db_dir or (self.resolved_data_dir / "db"))

    @property
    def resolved_vault_source_dir(self) -> Path:
        return self._resolve_path(self.vault_source_dir)

    @property
    def resolved_drive_source_dir(self) -> Path:
        return self._resolve_path(self.drive_source_dir)

    @property
    def resolved_log_dir(self) -> Path:
        return self._resolve_path(self.log_dir)

    @property
    def resolved_backup_dir(self) -> Path:
        return self._resolve_path(self.backup_dir)

    @property
    def resolved_run_dir(self) -> Path:
        # When run_dir is left at its default, place it next to the resolved
        # data directory instead of relative to the current working directory.
        if self.run_dir == Path("./run"):
            return self._resolve_path(self.resolved_data_dir.parent / "run")
        return self._resolve_path(self.run_dir)

    @property
    def resolved_project_registry_path(self) -> Path:
        return self._resolve_path(self.project_registry_path)

    @property
    def machine_dirs(self) -> list[Path]:
        return [
            self.db_path.parent,
            self.chroma_path,
            self.cache_path,
            self.tmp_path,
            self.resolved_log_dir,
            self.resolved_backup_dir,
            self.resolved_run_dir,
            self.resolved_project_registry_path.parent,
        ]

    @property
    def source_specs(self) -> list[dict[str, object]]:
        return [
            {
                "name": "vault",
                "enabled": self.source_vault_enabled,
                "path": self.resolved_vault_source_dir,
                "read_only": True,
            },
            {
                "name": "drive",
                "enabled": self.source_drive_enabled,
                "path": self.resolved_drive_source_dir,
                "read_only": True,
            },
        ]

    @property
    def source_dirs(self) -> list[Path]:
        return [spec["path"] for spec in self.source_specs if spec["enabled"]]

    def _resolve_path(self, path: Path) -> Path:
        return path.expanduser().resolve(strict=False)


settings = Settings()


def ensure_runtime_dirs() -> None:
    """Create writable runtime directories for machine state and logs.

    Source directories are intentionally excluded because they are treated as
    read-only ingestion inputs by convention.
    """
    for directory in settings.machine_dirs:
        directory.mkdir(parents=True, exist_ok=True)
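
The database path lookup in `Settings.db_path` is the least obvious part of this module, so here is a minimal standard-library sketch of the same precedence. The function name `resolve_db_path` and its two parameters are hypothetical and exist only for illustration; the real logic lives on the `Settings` class and reads its values from `ATOCORE_`-prefixed environment variables.

```python
from __future__ import annotations

from pathlib import Path


def resolve_db_path(data_dir: Path, db_dir: Path | None = None) -> Path:
    """Hypothetical standalone mirror of the Settings.db_path precedence."""
    # Same normalisation as Settings._resolve_path.
    resolved_data_dir = data_dir.expanduser().resolve(strict=False)
    legacy_path = resolved_data_dir / "atocore.db"

    # 1. An explicitly configured db_dir always wins.
    if db_dir is not None:
        return db_dir.expanduser().resolve(strict=False) / "atocore.db"

    # 2. Otherwise keep using a legacy database found directly under data_dir.
    if legacy_path.exists():
        return legacy_path

    # 3. Otherwise fall back to the default "db" subdirectory of data_dir.
    return resolved_data_dir / "db" / "atocore.db"
```

Under these assumptions, an installation that already has `data/atocore.db` keeps using it until `db_dir` is set explicitly, while a fresh installation gets `data/db/atocore.db`.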