fix(retrieval): fail open on registry resolution errors
@@ -9,7 +9,7 @@
 - **live_sha** (Dalidou `/health` build_sha): `f44a211` (verified 2026-04-24T14:48:44Z post audit-improvements deploy; status=ok)
 - **last_updated**: 2026-04-24 by Codex (retrieval boundary deployed; project_id metadata branch started)
 - **main_tip**: `f44a211`
-- **test_count**: 565 on `codex/project-id-metadata-retrieval` (deployed main baseline: 553)
+- **test_count**: 567 on `codex/project-id-metadata-retrieval` (deployed main baseline: 553)
 - **harness**: `19/20 PASS` on live Dalidou, 0 blocking failures, 1 known content gap (`p04-constraints`)
 - **vectors**: 33,253
 - **active_memories**: 290 (`/admin/dashboard` 2026-04-24; note integrity panel reports a separate active_memory_count=951 and needs reconciliation)
@@ -174,6 +174,8 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
 
 - **2026-04-24 Codex (project_id audit response)** Applied independent-audit fixes on `codex/project-id-metadata-retrieval`. Closed the nightly `/ingest/sources` clobber risk by adding registry-level `derive_project_id_for_path()` and making unscoped `ingest_file()` derive ownership from registered ingest roots when possible; `refresh_registered_project()` still passes the canonical project id directly. Changed retrieval so empty `project_id` falls through to legacy path/tag ownership instead of short-circuiting as unowned. Hardened `scripts/backfill_chunk_project_ids.py`: `--apply` now requires `--chroma-snapshot-confirmed`, runs Chroma metadata updates before SQLite writes, batches updates, skips and reports missing vectors, skips and reports malformed metadata, reports already-tagged rows, and turns missing ingestion tables into a JSON `db_warning` instead of a traceback. Added tests for auto-derive ingestion, empty-project fallback, ingest-root overlap rejection, and backfill dry-run/apply/snapshot/missing-vector/malformed cases. Verified targeted suite (`test_backfill_chunk_project_ids.py`, `test_ingestion.py`, `test_project_registry.py`, `test_retrieval.py`): 45 passed. Verified full suite: 565 passed in 73.16s. Local dry-run on empty/default data returns 0 updates with `db_warning` rather than crashing. Branch still not merged/deployed.
 
+- **2026-04-24 Codex (project_id final hardening before merge)** Applied the final independent-review P2s on `codex/project-id-metadata-retrieval`: `ingest_file()` still fails open when project-id derivation fails, but now emits `project_id_derivation_failed` with file path and error; retrieval now catches registry failures both at project-scope resolution and the soft project-match boost path, logs warnings, and serves unscoped rather than raising. Added regression tests for both fail-open paths. Verified targeted suite (`test_ingestion.py`, `test_retrieval.py`, `test_backfill_chunk_project_ids.py`, `test_project_registry.py`): 47 passed. Verified full suite: 567 passed in 79.66s. Branch still not merged/deployed.
+
 - **2026-04-24 Codex (audit improvements foundation)** Started implementation of the audit recommendations on branch `codex/audit-improvements-foundation` from `origin/main@c53e61e`. First tranche: registry-aware project-scoped retrieval filtering (`ATOCORE_RANK_PROJECT_SCOPE_FILTER`, widened candidate pull before filtering), eval harness known-issue lane, two p05 project-bleed fixtures, `scripts/live_status.py`, README/current-state/master-plan status refresh. Verified `pytest -q`: 550 passed in 67.11s. Live retrieval harness against undeployed production: 20 fixtures, 18 pass, 1 known issue (`p04-constraints` Zerodur/1.2 content gap), 1 blocking guard (`p05-broad-status-no-atomizer`) still failing because production has not yet deployed the retrieval filter and currently pulls `P04-GigaBIT-M1-KB-design` into broad p05 status context. Live dashboard refresh: health ok, build `2b86543`, docs 1748, chunks/vectors 33253, interactions 948, active memories 289, candidates 0, project_state total 128. Noted count discrepancy: dashboard memories.active=289 while integrity active_memory_count=951; schedule reconciliation in a follow-up.
 
 - **2026-04-24 Codex (independent-audit hardening)** Applied the Opus independent audit's fast follow-ups before merge/deploy. Closed the two P1s by making project-scope ownership path/tag-based only, adding path-segment/tag-exact matching to avoid short-alias substring collisions, and keeping title/heading text out of provenance decisions. Added regression tests for title poisoning, substring collision, and unknown-project fallback. Added retrieval log fields `raw_results_count`, `post_filter_count`, `post_filter_dropped`, and `underfilled`. Added retrieval-eval run metadata (`generated_at`, `base_url`, `/health`) and `live_status.py` auth-token/status support. README now documents the ranking knobs and clarifies that the hard scope filter and soft project match boost are separate controls. Verified `pytest -q`: 553 passed in 66.07s. Live production remains expected-predeploy: 20 fixtures, 18 pass, 1 known content gap, 1 blocking p05 bleed guard. Latest live dashboard: build `2b86543`, docs 1748, chunks/vectors 33253, interactions 950, active memories 290, candidates 0, project_state total 128.
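The backfill hardening described in the audit-response entry (an `--apply` run refuses to proceed without an explicit snapshot confirmation) can be sketched as a small argparse gate. This is a minimal illustration, not the script's actual interface; the flag names come from the entry, and `main()` is a hypothetical entry point.

```python
import argparse


def main(argv=None):
    # Hypothetical sketch of the --apply safety gate described above.
    parser = argparse.ArgumentParser(prog="backfill_chunk_project_ids.py")
    parser.add_argument("--apply", action="store_true",
                        help="write changes instead of performing a dry run")
    parser.add_argument("--chroma-snapshot-confirmed", action="store_true",
                        help="operator confirms a Chroma snapshot exists")
    args = parser.parse_args(argv)

    # Refuse destructive writes unless the operator confirmed a snapshot.
    if args.apply and not args.chroma_snapshot_confirmed:
        parser.error("--apply requires --chroma-snapshot-confirmed")
    return "apply" if args.apply else "dry-run"
```

The same pattern extends to the entry's other guarantees (Chroma updates before SQLite writes, JSON `db_warning` instead of a traceback), which live behind this gate.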
@@ -2,7 +2,7 @@
 
 Update 2026-04-24: audit-improvements deployed as `f44a211`; live harness is
 19/20 with 0 blocking failures and 1 known content gap. Active follow-up branch
-`codex/project-id-metadata-retrieval` is at 565 passing tests.
+`codex/project-id-metadata-retrieval` is at 567 passing tests.
 
 Live deploy: `2b86543` · Dalidou health: ok · Harness: 18/20 with 1 known
 content gap and 1 current blocking project-bleed guard · Tests: 553 passing.
@@ -152,7 +152,7 @@ deferred from the shared client until their workflows are exercised.
 - query-relevance memory ranking with overlap-density scoring
 - retrieval eval harness: 20 fixtures; current live has 19 pass, 1 known
   content gap, and 0 blocking failures after the audit-improvements deploy
-- 565 tests passing on the active `codex/project-id-metadata-retrieval` branch
+- 567 tests passing on the active `codex/project-id-metadata-retrieval` branch
 - nightly pipeline: backup → cleanup → rsync → OpenClaw import → vault refresh → extract → triage → **auto-promote/expire** → weekly synth/lint → **retrieval harness** → **pipeline summary to project state**
 - Phase 10 operational: reinforcement-based auto-promotion (ref_count ≥ 3, confidence ≥ 0.7) + stale candidate expiry (14 days unreinforced)
 - pipeline health visible in dashboard: interaction totals by client, pipeline last_run, harness results, triage stats
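The Phase 10 promotion/expiry rule listed above can be expressed as a pure function. This is an illustrative sketch using the thresholds from the list (ref_count ≥ 3, confidence ≥ 0.7, 14-day expiry); the function name and signature are assumptions, not the pipeline's actual API.

```python
from datetime import datetime, timedelta

# Thresholds as stated in the feature list above.
PROMOTE_MIN_REFS = 3
PROMOTE_MIN_CONFIDENCE = 0.7
EXPIRY_DAYS = 14


def triage_candidate(ref_count, confidence, last_reinforced, now=None):
    # Hypothetical sketch: promote well-reinforced, high-confidence
    # candidates; expire candidates left unreinforced too long.
    now = now or datetime.now()
    if ref_count >= PROMOTE_MIN_REFS and confidence >= PROMOTE_MIN_CONFIDENCE:
        return "promote"
    if now - last_reinforced > timedelta(days=EXPIRY_DAYS):
        return "expire"
    return "keep"
```

Promotion is checked before expiry so a candidate that crosses both thresholds on its final reinforcement is promoted rather than expired.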
@@ -42,7 +42,12 @@ def ingest_file(file_path: Path, project_id: str = "") -> dict:
         from atocore.projects.registry import derive_project_id_for_path
 
         project_id = derive_project_id_for_path(file_path)
-    except Exception:
+    except Exception as exc:
+        log.warning(
+            "project_id_derivation_failed",
+            file_path=str(file_path),
+            error=str(exc),
+        )
         project_id = ""
 
     if not file_path.exists():
@@ -84,7 +84,15 @@ def retrieve(
     """Retrieve the most relevant chunks for a query."""
     top_k = top_k or _config.settings.context_top_k
     start = time.time()
-    scoped_project = get_registered_project(project_hint) if project_hint else None
+    try:
+        scoped_project = get_registered_project(project_hint) if project_hint else None
+    except Exception as exc:
+        log.warning(
+            "project_scope_resolution_failed",
+            project_hint=project_hint,
+            error=str(exc),
+        )
+        scoped_project = None
     scope_filter_enabled = bool(scoped_project and _config.settings.rank_project_scope_filter)
     registered_projects = None
     query_top_k = top_k
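The `scoped_project` resolved above feeds the hard scope filter described in the journal (widened candidate pull before filtering, plus the new `raw_results_count`/`post_filter_count`/`post_filter_dropped`/`underfilled` log fields). A minimal sketch of that shape, with an assumed helper name and a simplified ownership check:

```python
def scoped_filter(results, owned_ids, top_k, widen_factor=3):
    # Hypothetical sketch: pull a widened candidate set, drop chunks
    # owned by other projects, and report the counts that the new
    # retrieval log fields carry.
    raw = results[: top_k * widen_factor]
    kept = [r for r in raw if r["project_id"] in owned_ids]
    stats = {
        "raw_results_count": len(raw),
        "post_filter_count": len(kept),
        "post_filter_dropped": len(raw) - len(kept),
        "underfilled": len(kept) < top_k,  # filter ate too many candidates
    }
    return kept[:top_k], stats
```

The `underfilled` flag is what makes an over-aggressive filter visible in logs instead of silently returning short result lists.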
@@ -292,7 +300,15 @@ def _project_match_boost(project_hint: str, metadata: dict) -> float:
     if not hint_lower:
         return 1.0
 
-    project = get_registered_project(project_hint)
+    try:
+        project = get_registered_project(project_hint)
+    except Exception as exc:
+        log.warning(
+            "project_match_boost_resolution_failed",
+            project_hint=project_hint,
+            error=str(exc),
+        )
+        project = None
     candidate_names = _project_scope_terms(project) if project is not None else {hint_lower}
     for candidate in candidate_names:
         if _metadata_has_term(metadata, candidate):
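The `_metadata_has_term` call above is where the journal's path-segment/tag-exact matching lives: a term only counts when it equals a whole path segment or a tag, so short aliases cannot collide via substring matching. A standalone sketch of that idea, assuming the metadata layout seen in the tests (not the retriever's actual implementation):

```python
import json
from pathlib import PurePosixPath


def metadata_has_term(metadata, term):
    # Hypothetical sketch of exact-segment matching: "p04-gigabit"
    # matches the path segment "p04-gigabit", but a short alias like
    # "p04" does not substring-match into unrelated segments.
    term = term.lower()
    segments = [s.lower() for s in PurePosixPath(metadata.get("source_file", "")).parts]
    if term in segments:
        return True
    tags = [t.lower() for t in json.loads(metadata.get("tags", "[]"))]
    return term in tags
```

Exact equality against whole segments and tags is what closes the short-alias collision P1 described in the independent-audit hardening entry.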
@@ -163,6 +163,45 @@ def test_ingest_file_derives_project_id_from_registry_root(tmp_data_dir, tmp_pat
     assert all(meta["project_id"] == "p04-gigabit" for meta in fake_store.metadatas)
 
 
+def test_ingest_file_logs_and_fails_open_when_project_derivation_fails(
+    tmp_data_dir,
+    sample_markdown,
+    monkeypatch,
+):
+    """A broken registry should be visible but should not block ingestion."""
+    init_db()
+    warnings = []
+
+    class FakeVectorStore:
+        def __init__(self):
+            self.metadatas = []
+
+        def add(self, ids, documents, metadatas):
+            self.metadatas.extend(metadatas)
+
+        def delete(self, ids):
+            return None
+
+    fake_store = FakeVectorStore()
+    monkeypatch.setattr("atocore.ingestion.pipeline.get_vector_store", lambda: fake_store)
+    monkeypatch.setattr(
+        "atocore.projects.registry.derive_project_id_for_path",
+        lambda path: (_ for _ in ()).throw(ValueError("registry broken")),
+    )
+    monkeypatch.setattr(
+        "atocore.ingestion.pipeline.log.warning",
+        lambda event, **kwargs: warnings.append((event, kwargs)),
+    )
+
+    result = ingest_file(sample_markdown)
+
+    assert result["status"] == "ingested"
+    assert fake_store.metadatas
+    assert all(meta["project_id"] == "" for meta in fake_store.metadatas)
+    assert warnings[0][0] == "project_id_derivation_failed"
+    assert "registry broken" in warnings[0][1]["error"]
+
+
 def test_ingest_project_folder_passes_project_id_to_files(tmp_data_dir, sample_folder, monkeypatch):
     seen = []
 
@@ -566,6 +566,59 @@ def test_retrieve_unknown_project_hint_does_not_widen_or_filter(monkeypatch):
     assert [r.chunk_id for r in results] == ["chunk-a", "chunk-b"]
 
 
+def test_retrieve_fails_open_when_project_scope_resolution_fails(monkeypatch):
+    warnings = []
+
+    class FakeStore:
+        def query(self, query_embedding, top_k=10, where=None):
+            assert top_k == 2
+            return {
+                "ids": [["chunk-a", "chunk-b"]],
+                "documents": [["doc a", "doc b"]],
+                "metadatas": [[
+                    {
+                        "heading_path": "Overview",
+                        "source_file": "p04-gigabit/file.md",
+                        "tags": "[]",
+                        "title": "A",
+                        "document_id": "doc-a",
+                    },
+                    {
+                        "heading_path": "Overview",
+                        "source_file": "p05-interferometer/file.md",
+                        "tags": "[]",
+                        "title": "B",
+                        "document_id": "doc-b",
+                    },
+                ]],
+                "distances": [[0.2, 0.21]],
+            }
+
+    monkeypatch.setattr("atocore.retrieval.retriever.get_vector_store", lambda: FakeStore())
+    monkeypatch.setattr("atocore.retrieval.retriever.embed_query", lambda query: [0.0, 0.1])
+    monkeypatch.setattr(
+        "atocore.retrieval.retriever._existing_chunk_ids",
+        lambda chunk_ids: set(chunk_ids),
+    )
+    monkeypatch.setattr(
+        "atocore.retrieval.retriever.get_registered_project",
+        lambda project_name: (_ for _ in ()).throw(ValueError("registry overlap")),
+    )
+    monkeypatch.setattr(
+        "atocore.retrieval.retriever.log.warning",
+        lambda event, **kwargs: warnings.append((event, kwargs)),
+    )
+
+    results = retrieve("overview", top_k=2, project_hint="p04")
+
+    assert [r.chunk_id for r in results] == ["chunk-a", "chunk-b"]
+    assert {warning[0] for warning in warnings} == {
+        "project_scope_resolution_failed",
+        "project_match_boost_resolution_failed",
+    }
+    assert all("registry overlap" in warning[1]["error"] for warning in warnings)
+
+
 def test_retrieve_downranks_archive_noise_and_prefers_high_signal_paths(monkeypatch):
     class FakeStore:
         def query(self, query_embedding, top_k=10, where=None):