feat: tunable ranking, refresh status, chroma backup + admin endpoints
Three small improvements that move the operational baseline forward
without changing the existing trust model.
1. Tunable retrieval ranking weights
- rank_project_match_boost, rank_query_token_step,
rank_query_token_cap, rank_path_high_signal_boost,
rank_path_low_signal_penalty are now Settings fields
- all overridable via ATOCORE_* env vars
- retriever no longer hard-codes 2.0 / 1.18 / 0.72 / 0.08 / 1.32
- lets ranking be tuned per environment as Wave 1 is exercised
without code changes
2. /projects/{name}/refresh status
- refresh_registered_project now returns an overall status field
("ingested", "partial", "nothing_to_ingest") plus roots_ingested
and roots_skipped counters
- ProjectRefreshResponse advertises the new fields so callers can
rely on them
- covers the case where every configured root is missing on disk
3. Chroma cold snapshot + admin backup endpoints
- create_runtime_backup now accepts include_chroma and writes a
cold directory copy of the chroma persistence path
- new list_runtime_backups() and validate_backup() helpers
- new endpoints:
- POST /admin/backup create snapshot (optional chroma)
- GET /admin/backup list snapshots
- GET /admin/backup/{stamp}/validate structural validation
- chroma snapshots are taken under exclusive_ingestion() so a refresh
or ingest cannot race with the cold copy
- backup metadata records what was actually included and how big
Tests:
- 8 new tests covering tunable weights, refresh status branches
(ingested / partial / nothing_to_ingest), chroma snapshot, list,
validate, and the API endpoints (including the lock-acquisition path)
- existing fake refresh stubs in test_api_storage.py updated for the
expanded ProjectRefreshResponse model
- full suite: 105 passing (was 97)
next-steps doc updated to reflect that the chroma snapshot + restore
validation gap from current-state.md is now closed in code; only the
operational retention policy remains.
This commit is contained in:
@@ -255,12 +255,23 @@ def get_registered_project(project_name: str) -> RegisteredProject | None:
|
||||
|
||||
|
||||
def refresh_registered_project(project_name: str, purge_deleted: bool = False) -> dict:
|
||||
"""Ingest all configured source roots for a registered project."""
|
||||
"""Ingest all configured source roots for a registered project.
|
||||
|
||||
The returned dict carries an overall ``status`` so callers can tell at a
|
||||
glance whether the refresh was fully successful, partial, or did nothing
|
||||
at all because every configured root was missing or not a directory:
|
||||
|
||||
- ``ingested``: every root was a real directory and was ingested
|
||||
- ``partial``: at least one root ingested and at least one was unusable
|
||||
- ``nothing_to_ingest``: no roots were usable
|
||||
"""
|
||||
project = get_registered_project(project_name)
|
||||
if project is None:
|
||||
raise ValueError(f"Unknown project: {project_name}")
|
||||
|
||||
roots = []
|
||||
ingested_count = 0
|
||||
skipped_count = 0
|
||||
for source_ref in project.ingest_roots:
|
||||
resolved = _resolve_ingest_root(source_ref)
|
||||
root_result = {
|
||||
@@ -271,9 +282,11 @@ def refresh_registered_project(project_name: str, purge_deleted: bool = False) -
|
||||
}
|
||||
if not resolved.exists():
|
||||
roots.append({**root_result, "status": "missing"})
|
||||
skipped_count += 1
|
||||
continue
|
||||
if not resolved.is_dir():
|
||||
roots.append({**root_result, "status": "not_directory"})
|
||||
skipped_count += 1
|
||||
continue
|
||||
|
||||
roots.append(
|
||||
@@ -283,12 +296,23 @@ def refresh_registered_project(project_name: str, purge_deleted: bool = False) -
|
||||
"results": ingest_folder(resolved, purge_deleted=purge_deleted),
|
||||
}
|
||||
)
|
||||
ingested_count += 1
|
||||
|
||||
if ingested_count == 0:
|
||||
overall_status = "nothing_to_ingest"
|
||||
elif skipped_count == 0:
|
||||
overall_status = "ingested"
|
||||
else:
|
||||
overall_status = "partial"
|
||||
|
||||
return {
|
||||
"project": project.project_id,
|
||||
"aliases": list(project.aliases),
|
||||
"description": project.description,
|
||||
"purge_deleted": purge_deleted,
|
||||
"status": overall_status,
|
||||
"roots_ingested": ingested_count,
|
||||
"roots_skipped": skipped_count,
|
||||
"roots": roots,
|
||||
}
|
||||
|
||||
|
||||
Reference in New Issue
Block a user