audit: record extraction pipeline findings
This commit is contained in:
@@ -7,8 +7,8 @@
|
|||||||
## Orientation
|
## Orientation
|
||||||
|
|
||||||
- **live_sha** (Dalidou `/health` build_sha): `39d73e9`
|
- **live_sha** (Dalidou `/health` build_sha): `39d73e9`
|
||||||
- **last_updated**: 2026-04-12 by Codex (audit branch `codex/audit-2026-04-12-final`)
|
- **last_updated**: 2026-04-12 by Codex (audit branch `codex/audit-2026-04-12-extraction`)
|
||||||
- **main_tip**: `e2895b5`
|
- **main_tip**: `ac7f77d`
|
||||||
- **test_count**: 280 passing
|
- **test_count**: 280 passing
|
||||||
- **harness**: `16/18 PASS` (p06-firmware-interface = R7 ranking tie; p06-tailscale = chunk bleed)
|
- **harness**: `16/18 PASS` (p06-firmware-interface = R7 ranking tie; p06-tailscale = chunk bleed)
|
||||||
- **active_memories**: 36 (p06-polisher 16, p05-interferometer 6, p04-gigabit 5, atocore 5, other 4)
|
- **active_memories**: 36 (p06-polisher 16, p05-interferometer 6, p04-gigabit 5, atocore 5, other 4)
|
||||||
@@ -121,16 +121,18 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
|
|||||||
|
|
||||||
| id | finder | severity | file:line | summary | status | owner | opened_at | resolved_by |
|
| id | finder | severity | file:line | summary | status | owner | opened_at | resolved_by |
|
||||||
|-----|--------|----------|------------------------------------|-------------------------------------------------------------------------|--------------|--------|------------|-------------|
|
|-----|--------|----------|------------------------------------|-------------------------------------------------------------------------|--------------|--------|------------|-------------|
|
||||||
| R1 | Codex | P1 | deploy/hooks/capture_stop.py:76-85 | Live Claude capture still omits `extract`, so "loop closed both sides" remains overstated in practice even though the API supports it | acknowledged | Claude | 2026-04-11 | |
|
| R1 | Codex | P1 | deploy/hooks/capture_stop.py:76-85 | Live Claude capture still omits `extract`, so "loop closed both sides" remains overstated in practice even though the API supports it | fixed | Claude | 2026-04-11 | c67bec0 |
|
||||||
| R2 | Codex | P1 | src/atocore/context/builder.py | Project memories excluded from pack | fixed | Claude | 2026-04-11 | 8ea53f4 |
|
| R2 | Codex | P1 | src/atocore/context/builder.py | Project memories excluded from pack | fixed | Claude | 2026-04-11 | 8ea53f4 |
|
||||||
| R3 | Claude | P2 | src/atocore/memory/extractor.py | Rule cues (`## Decision:`) never fire on conversational LLM text | open | Claude | 2026-04-11 | |
|
| R3 | Claude | P2 | src/atocore/memory/extractor.py | Rule cues (`## Decision:`) never fire on conversational LLM text | open | Claude | 2026-04-11 | |
|
||||||
| R4 | Codex | P2 | DEV-LEDGER.md:11 | Orientation `main_tip` was stale versus `HEAD` / `origin/main` | fixed | Codex | 2026-04-11 | 81307ce |
|
| R4 | Codex | P2 | DEV-LEDGER.md:11 | Orientation `main_tip` was stale versus `HEAD` / `origin/main` | fixed | Codex | 2026-04-11 | 81307ce |
|
||||||
| R5 | Codex | P1 | src/atocore/interactions/service.py:157-174 | The deployed extraction path still calls only the rule extractor; the new LLM extractor is eval/script-only, so Day 4 "gate cleared" is true as a benchmark result but not as an operational extraction path | acknowledged | Claude | 2026-04-12 | |
|
| R5 | Codex | P1 | src/atocore/interactions/service.py:157-174 | The deployed extraction path still calls only the rule extractor; the new LLM extractor is eval/script-only, so Day 4 "gate cleared" is true as a benchmark result but not as an operational extraction path | fixed | Claude | 2026-04-12 | c67bec0 |
|
||||||
| R6 | Codex | P1 | src/atocore/memory/extractor_llm.py:258-276 | LLM extraction accepts model-supplied `project` verbatim with no fallback to `interaction.project`; live triage promoted a clearly p06 memory (offline/network rule) as project=`""`, which explains the p06-offline-design harness miss and falsifies the current "all 3 failures are budget-contention" claim | fixed | Claude | 2026-04-12 | 39d73e9 |
|
| R6 | Codex | P1 | src/atocore/memory/extractor_llm.py:258-276 | LLM extraction accepts model-supplied `project` verbatim with no fallback to `interaction.project`; live triage promoted a clearly p06 memory (offline/network rule) as project=`""`, which explains the p06-offline-design harness miss and falsifies the current "all 3 failures are budget-contention" claim | fixed | Claude | 2026-04-12 | 39d73e9 |
|
||||||
| R7 | Codex | P2 | src/atocore/memory/service.py:448-459 | Query ranking is overlap-count only, so broad overview memories can tie exact low-confidence memories and win on confidence; p06-firmware-interface is not just budget pressure, it also exposes a weak lexical scorer | open | Claude | 2026-04-12 | |
|
| R7 | Codex | P2 | src/atocore/memory/service.py:448-459 | Query ranking is overlap-count only, so broad overview memories can tie exact low-confidence memories and win on confidence; p06-firmware-interface is not just budget pressure, it also exposes a weak lexical scorer | open | Claude | 2026-04-12 | |
|
||||||
| R8 | Codex | P2 | tests/test_extractor_llm.py:1-7 | LLM extractor tests stop at parser/failure contracts; there is no automated coverage for the script-only persistence/review path that produced the 16 promoted memories, including project-scope preservation | open | Claude | 2026-04-12 | |
|
| R8 | Codex | P2 | tests/test_extractor_llm.py:1-7 | LLM extractor tests stop at parser/failure contracts; there is no automated coverage for the script-only persistence/review path that produced the 16 promoted memories, including project-scope preservation | open | Claude | 2026-04-12 | |
|
||||||
| R9 | Codex | P2 | src/atocore/memory/extractor_llm.py:258-259 | The R6 fallback only repairs empty project output. A wrong non-empty model project still overrides the interaction's known scope, so project attribution is improved but not yet trust-preserving. | open | Claude | 2026-04-12 | |
|
| R9 | Codex | P2 | src/atocore/memory/extractor_llm.py:258-259 | The R6 fallback only repairs empty project output. A wrong non-empty model project still overrides the interaction's known scope, so project attribution is improved but not yet trust-preserving. | open | Claude | 2026-04-12 | |
|
||||||
| R10 | Codex | P2 | docs/master-plan-status.md:31-33 | "Phase 8 - OpenClaw Integration" is fair as a baseline milestone, but not as a "primary" integration claim. `t420-openclaw/atocore.py` currently covers a narrow read-oriented subset (13 request shapes vs 32 API routes) plus fail-open health, while memory/interactions/admin write paths remain out of surface. | open | Claude | 2026-04-12 | |
|
| R10 | Codex | P2 | docs/master-plan-status.md:31-33 | "Phase 8 - OpenClaw Integration" is fair as a baseline milestone, but not as a "primary" integration claim. `t420-openclaw/atocore.py` currently covers a narrow read-oriented subset (13 request shapes vs 32 API routes) plus fail-open health, while memory/interactions/admin write paths remain out of surface. | open | Claude | 2026-04-12 | |
|
||||||
|
| R11 | Codex | P2 | src/atocore/api/routes.py:773-845 | `POST /admin/extract-batch` still accepts `mode="llm"` inside the container and returns a successful 0-candidate result instead of surfacing that host-only LLM extraction is unavailable from this runtime. That is a misleading API contract for operators. | open | Claude | 2026-04-12 | |
|
||||||
|
| R12 | Codex | P2 | scripts/batch_llm_extract_live.py:39-190 | The host-side extractor duplicates the LLM system prompt and JSON parsing logic from `src/atocore/memory/extractor_llm.py`. It works today, but this is now a prompt/parser drift risk across the container and host implementations. | open | Claude | 2026-04-12 | |
|
||||||
|
|
||||||
## Recent Decisions
|
## Recent Decisions
|
||||||
|
|
||||||
@@ -148,6 +150,7 @@ One branch `codex/extractor-eval-loop` for Day 1-5, a second `codex/retrieval-ha
|
|||||||
|
|
||||||
## Session Log
|
## Session Log
|
||||||
|
|
||||||
|
- **2026-04-12 Codex (audit branch `codex/audit-2026-04-12-extraction`)** audited `54d84b5..ac7f77d` with live Dalidou verification. Confirmed the host-side LLM extraction pipeline is operational: nightly cron points at `deploy/dalidou/cron-backup.sh`, Step 4 calls `deploy/dalidou/batch-extract.sh`, the batch script exists/executable on Dalidou, and a manual host-side run produced candidates successfully. Updated R1 and R5 to **fixed** (`c67bec0`) because extraction now runs unattended off-container. Live state during audit: build `39d73e9`, active memories **36**, candidate queue **29** (16 existing + 13 added by manual verification run), and `last_extract_batch_run` populated in AtoCore project state. Added R11-R12 for the misleading container `mode=llm` no-op and host/container prompt-parser duplication. Security note: CLI positional prompt/response text is visible in process args while `claude -p` runs; acceptable on a single-user home host, but worth remembering if Dalidou's trust boundary changes.
|
||||||
- **2026-04-12 Codex (audit branch `codex/audit-2026-04-12-final`)** audited `c5bad99..e2895b5` against origin/main, live Dalidou, and the OpenClaw client script. Live state checked: build `39d73e9`, harness reproducible at **16/18 PASS**, active memories **36**, and `t420-openclaw/atocore.py health` fails open correctly with `fail_open=true`. Spot-checks of Wave 2 project-state entries matched their cited vault docs. Updated R5-R8 status reality (R6 fixed by `39d73e9`), added R9-R10, and corrected Orientation `main_tip` to `e2895b5` because the ledger had drifted behind origin/main. Note: live Dalidou is still on `39d73e9`, so branch-truth and deploy-truth are not the same yet.
|
- **2026-04-12 Codex (audit branch `codex/audit-2026-04-12-final`)** audited `c5bad99..e2895b5` against origin/main, live Dalidou, and the OpenClaw client script. Live state checked: build `39d73e9`, harness reproducible at **16/18 PASS**, active memories **36**, and `t420-openclaw/atocore.py health` fails open correctly with `fail_open=true`. Spot-checks of Wave 2 project-state entries matched their cited vault docs. Updated R5-R8 status reality (R6 fixed by `39d73e9`), added R9-R10, and corrected Orientation `main_tip` to `e2895b5` because the ledger had drifted behind origin/main. Note: live Dalidou is still on `39d73e9`, so branch-truth and deploy-truth are not the same yet.
|
||||||
- **2026-04-12 Claude** Wave 2 trusted operational ingestion + codex audit response. Read 6 vault docs, created 8 new Trusted Project State entries (p04 +2, p05 +3, p06 +3). Fixed R6 (project fallback in LLM extractor) per codex audit. Fixed misscoped p06 offline memory on live Dalidou. Merged codex/audit-2026-04-12. Switched default LLM model from haiku to sonnet. Harness 15/18 -> 16/18. Tests 278 -> 280. main_tip 146f2e4 -> 39d73e9.
|
- **2026-04-12 Claude** Wave 2 trusted operational ingestion + codex audit response. Read 6 vault docs, created 8 new Trusted Project State entries (p04 +2, p05 +3, p06 +3). Fixed R6 (project fallback in LLM extractor) per codex audit. Fixed misscoped p06 offline memory on live Dalidou. Merged codex/audit-2026-04-12. Switched default LLM model from haiku to sonnet. Harness 15/18 -> 16/18. Tests 278 -> 280. main_tip 146f2e4 -> 39d73e9.
|
||||||
|
|
||||||
|
|||||||
Reference in New Issue
Block a user