# OpenClaw x AtoCore Nightly Screener Runbook
## Purpose
The nightly screener is the V1 bridge between broad evidence capture and narrow trusted state.
Its job is to:
- gather raw evidence from approved V1 sources
- reduce noise
- produce reviewable candidate material
- prepare operator review work
- never silently create trusted truth
## Scope
The nightly screener is a screening and preparation job.
It is not a trusted-state writer.
It is not a registry operator.
It is not a hidden reviewer.
V1 active inputs are:
- Discord and Discrawl evidence
- OpenClaw interaction evidence
- PKM, repos, and KB references
- read-only AtoCore context for comparison and deduplication
## Explicit approval rule
If the screener output points at a mutating operator action, that action still requires:
- direct human instruction
- in the current thread or current session
- for that specific action
- with no inference from evidence or screener output alone
The screener may recommend review. It may not manufacture approval.
## Inputs
The screener may consume the following inputs when available.
### 1. Discord and Discrawl evidence
Examples:
- recent archived Discord messages
- thread excerpts relevant to known projects
- conversation clusters around decisions, requirements, constraints, or repeated questions
### 2. OpenClaw interaction evidence
Examples:
- captured interactions
- recent operator conversations relevant to projects
- already-logged evidence bundles
### 3. Read-only AtoCore context inputs
Examples:
- project registry lookup for project matching
- project_state read for comparison only
- memory or entity lookups for deduplication only
These reads may help the screener rank or classify candidates, but they must never trigger a write as a side effect.
### 4. Optional canonical-source references
Examples:
- PKM notes
- repo docs
- KB-export summaries
These may be consulted to decide whether a signal appears to duplicate or contradict already-canonical truth.
## Outputs
The screener should produce output in four buckets.
### 1. Nightly screener report
A compact report describing:
- inputs seen
- items skipped
- candidate counts
- project match confidence distribution
- failures or unavailable sources
- items requiring human review
### 2. Evidence bundle or manifest
A structured bundle of the source snippets that justified each candidate or unresolved item.
This is the reviewer's provenance package.
### 3. Candidate manifests
Separate candidate manifests for:
- memory candidates
- entity candidates (later)
- unresolved "needs canonical-source update first" items
### 4. Operator action queue
A short list of items needing explicit human action, such as:
- review these candidates
- decide whether to refresh project X
- decide whether to curate project_state
- decide whether a Discord-originated claim should first be reflected in PKM, repo, or KB
## Required non-output
The screener must not directly produce any of the following:
- active memories without review
- active entities without review
- project_state writes
- registry mutation
- refresh operations
- ingestion operations
- promote or reject decisions
## Nightly procedure
### Step 1 - load last-run checkpoint
Read the last successful screener checkpoint so the run knows:
- what time range to inspect
- what evidence was already processed
- which items were already dropped or bundled
If no checkpoint exists, use a conservative bounded time window and mark the run as bootstrap mode.
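
A minimal sketch of the checkpoint step, assuming the checkpoint is stored as JSON text. The field names `last_success_end` and `processed_ids`, the `RunWindow` type, and the 24-hour bootstrap window are hypothetical; the runbook only requires a conservative bounded window.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical bound; the runbook only says "conservative bounded time window".
BOOTSTRAP_WINDOW = timedelta(hours=24)


@dataclass(frozen=True)
class RunWindow:
    start: datetime
    end: datetime
    bootstrap: bool
    seen_ids: frozenset


def window_from_checkpoint(raw: Optional[str], now: datetime) -> RunWindow:
    """Derive the time range to inspect from raw checkpoint text.

    `raw` is the checkpoint file content, or None when no checkpoint exists.
    """
    if raw is not None:
        try:
            data = json.loads(raw)
            start = datetime.fromisoformat(data["last_success_end"])
            if start.tzinfo is None:
                # Assumption: naive checkpoint timestamps are UTC.
                start = start.replace(tzinfo=timezone.utc)
            return RunWindow(start, now, False, frozenset(data.get("processed_ids", [])))
        except (KeyError, TypeError, ValueError):
            pass  # unreadable checkpoint: fall through to bootstrap mode
    # No usable checkpoint: bounded window, explicitly flagged as bootstrap.
    return RunWindow(now - BOOTSTRAP_WINDOW, now, True, frozenset())
```

Treating an unreadable checkpoint like a missing one is a design choice of this sketch: the run stays bounded and the report can show that it ran in bootstrap mode.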
### Step 2 - gather evidence
Collect available evidence from each configured source.
Per-source rule:
- source unavailable -> note it, continue
- source empty -> note it, continue
- source noisy -> keep raw capture bounded and deduplicated
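
The per-source rule could look roughly like the sketch below. The `gather` function, the shape of `sources` (a name mapped to a zero-argument fetch callable), and the `MAX_ITEMS_PER_SOURCE` bound are assumptions made for illustration; deduplication is left to Step 3.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, List, Tuple

MAX_ITEMS_PER_SOURCE = 500  # hypothetical bound for noisy sources


def gather(sources: Dict[str, Callable[[], Iterable[dict]]]) -> Tuple[List[dict], Dict[str, str]]:
    """Collect evidence from every source; never let one source abort the run."""
    evidence: List[dict] = []
    notes: Dict[str, str] = {}
    for name, fetch in sources.items():
        try:
            # Read at most one item past the bound so truncation can be detected.
            items = list(islice(fetch(), MAX_ITEMS_PER_SOURCE + 1))
        except Exception as exc:
            notes[name] = f"unavailable: {exc}"  # note it, continue
            continue
        if not items:
            notes[name] = "empty"  # note it, continue
            continue
        if len(items) > MAX_ITEMS_PER_SOURCE:
            items = items[:MAX_ITEMS_PER_SOURCE]
            notes[name] = f"noisy: capture bounded to {MAX_ITEMS_PER_SOURCE} items"
        else:
            notes[name] = "ok"
        evidence.extend(items)
    return evidence, notes
```

The `notes` mapping is what would feed the "failures or unavailable sources" part of the nightly report.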
### Step 3 - normalize and deduplicate
For each collected item:
- normalize timestamps, source ids, and project hints
- remove exact duplicates
- group repeated or near-identical evidence when practical
- keep provenance pointers intact
The goal is to avoid flooding review with repeated copies of the same conversation.
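
One way to sketch this step, assuming each raw item is a dict with `source`, `id`, `timestamp` (ISO 8601), and `text` keys. Those key names are hypothetical, and only exact duplicates (after whitespace and case normalization) are grouped; near-duplicate grouping would need a separate similarity measure.

```python
import hashlib
from datetime import datetime, timezone
from typing import Dict, Iterable, List


def normalize(item: dict) -> dict:
    """Normalize the timestamp, source id, text, and project hint of one item."""
    ts = datetime.fromisoformat(item["timestamp"])
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive timestamps are UTC
    hint = (item.get("project_hint") or "").strip().lower()
    return {
        "source_id": f'{item["source"]}:{item["id"]}',
        "timestamp": ts.astimezone(timezone.utc).isoformat(),
        "text": " ".join(item["text"].split()),  # collapse whitespace
        "project_hint": hint or None,
    }


def deduplicate(items: Iterable[dict]) -> List[dict]:
    """Group items with identical normalized text, keeping every provenance pointer."""
    groups: Dict[str, dict] = {}
    for raw in items:
        norm = normalize(raw)
        key = hashlib.sha256(norm["text"].casefold().encode("utf-8")).hexdigest()
        if key not in groups:
            groups[key] = {**norm, "provenance": []}
        groups[key]["provenance"].append(norm["source_id"])
    return list(groups.values())
```

Each group keeps the fields of the first item seen and accumulates the source ids of all copies, so the reviewer sees one entry with its full provenance.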
### Step 4 - attempt project association
For each evidence item, try to associate it with:
- a registered project id, or
- `unassigned` if confidence is low
Rules:
- high confidence match -> attach project id
- low confidence match -> mark as uncertain
- no good match -> leave unassigned
Do not force a project assignment just to make the output tidier.
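
The association rules can be sketched as a simple threshold check. The `associate` function, the idea of a per-project match score in the range 0 to 1, and both thresholds are assumptions; the runbook does not specify how confidence is computed.

```python
from typing import Dict

# Hypothetical thresholds; tune against real review outcomes.
HIGH_CONFIDENCE = 0.8
LOW_CONFIDENCE = 0.4


def associate(score_by_project: Dict[str, float]) -> dict:
    """Map per-project match scores to matched / uncertain / unassigned."""
    if score_by_project:
        project_id, score = max(score_by_project.items(), key=lambda kv: kv[1])
        if score >= HIGH_CONFIDENCE:
            return {"project_id": project_id, "confidence": score, "status": "matched"}
        if score >= LOW_CONFIDENCE:
            # Keep the guess visible for the reviewer, but do not attach it.
            return {
                "project_id": "unassigned",
                "suggested_project": project_id,
                "confidence": score,
                "status": "uncertain",
            }
    return {"project_id": "unassigned", "confidence": 0.0, "status": "unassigned"}
```

An uncertain item keeps its best guess as a suggestion only, which matches the rule against forcing an assignment.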
### Step 5 - classify signal type
Classify each normalized item into one of these buckets:
- noise / ignore
- evidence only
- memory candidate
- entity candidate
- needs canonical-source update first
- needs explicit operator decision
If the classification is uncertain, choose the lower-trust bucket.
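
The lower-trust rule can be illustrated as picking the minimum over a trust ranking. The ranking below is an assumption of this sketch, not something the runbook defines, and the bucket identifiers are hypothetical spellings of the names listed above.

```python
from typing import Iterable

# Assumed trust ranking, lowest first. The runbook does not define this order.
TRUST_RANK = {
    "noise": 0,
    "evidence_only": 1,
    "needs_canonical_source_update": 2,
    "needs_operator_decision": 2,
    "memory_candidate": 3,
    "entity_candidate": 3,
}


def lower_trust_bucket(plausible: Iterable[str]) -> str:
    """Given every bucket the classifier finds plausible, pick the lowest-trust one."""
    options = list(plausible)
    if not options:
        raise ValueError("classifier must propose at least one bucket")
    # min() keeps the first option on ties, so caller order breaks ties.
    return min(options, key=lambda bucket: TRUST_RANK[bucket])
```

For example, an item that could be either a memory candidate or plain evidence resolves to `evidence_only` under this ranking.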
### Step 6 - compare against higher-trust layers
For non-noise items, compare against the current higher-trust landscape.
Check for:
- already-active equivalent memory
- already-active equivalent entity (later)
- existing project_state answer
- obvious duplication of canonical source truth
- obvious contradiction with canonical source truth
This comparison is read-only.
It is used only to rank and annotate output.
### Step 7 - build candidate bundles
For each candidate:
- include the candidate text or shape
- include provenance snippets
- include source type
- include project association confidence
- include reason for candidate classification
- include conflict or duplicate notes if found
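
A candidate bundle with these fields might be modeled as below. The `CandidateBundle` class, its field names, and the one-JSON-object-per-line manifest format are illustrative assumptions.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class CandidateBundle:
    candidate_text: str
    provenance_snippets: List[str]
    source_type: str
    project_id: str
    project_confidence: float
    classification_reason: str
    conflict_notes: List[str] = field(default_factory=list)
    status: str = "candidate"  # never "active": review decides promotion


def to_manifest_line(bundle: CandidateBundle) -> str:
    """Serialize one bundle as a JSON line; refuse bundles without provenance."""
    if not bundle.provenance_snippets:
        raise ValueError("candidate has no provenance snippets")
    return json.dumps(asdict(bundle), sort_keys=True)
```

Rejecting bundles that lack provenance keeps the reviewer's provenance package complete for every candidate that reaches the manifest.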
### Step 8 - build unresolved operator queue
Some items should not become candidates yet.
Examples:
- "This looks like current truth but should first be updated in PKM, repo, or KB."
- "This Discord-originated request asks for refresh or ingest."
- "This might be a decision, but confidence is too low."
These belong in a small operator queue, not in trusted state.
### Step 9 - persist report artifacts only
Persist only:
- screener report
- evidence manifests
- candidate manifests
- checkpoint metadata
If candidate persistence into AtoCore is enabled later, it still remains a candidate-only path and must not skip review.
### Step 10 - exit fail-open
If the screener could not reach AtoCore or some source system:
- write the failure or skip into the report
- keep the checkpoint conservative
- do not fake success
- do not silently mutate anything elsewhere
## Failure modes
### Failure mode 1 - AtoCore unavailable
Behavior:
- continue in fail-open mode if possible
- write a report that the run was evidence-only or degraded
- do not attempt write-side recovery actions
### Failure mode 2 - Discrawl unavailable or stale
Behavior:
- note Discord archive input unavailable or stale
- continue with other sources
- do not invent Discord evidence summaries
### Failure mode 3 - candidate explosion
Behavior:
- rank candidates
- keep only a bounded top set for review
- put the remainder into a dropped or deferred manifest
- do not overwhelm the reviewer queue
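
A small sketch of the bounding behavior, assuming each candidate carries a numeric `score` from the ranking step. The `score` key and the default limit of 20 are hypothetical.

```python
from typing import List, Tuple


def bound_candidates(candidates: List[dict], limit: int = 20) -> Tuple[List[dict], List[dict]]:
    """Rank candidates and split them into a review set and a deferred set."""
    ranked = sorted(candidates, key=lambda c: c["score"], reverse=True)
    # Nothing is discarded: the tail goes to the deferred manifest.
    return ranked[:limit], ranked[limit:]
```

The second list would be written to the dropped or deferred manifest so that nothing disappears silently.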
### Failure mode 4 - low-confidence project mapping
Behavior:
- leave items unassigned or uncertain
- do not force them into a project-specific truth lane
### Failure mode 5 - contradiction with trusted truth
Behavior:
- flag the contradiction in the report
- keep the evidence or candidate for review if useful
- do not overwrite project_state
### Failure mode 6 - direct operator-action request found in evidence
Examples:
- "register this project"
- "refresh this source"
- "promote this memory"
Behavior:
- place the item into the operator action queue
- require explicit human approval
- do not perform the mutation as part of the screener
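
As an illustration, requests like these could be detected with a keyword pattern and routed to the operator queue. The verb list, the `route_operator_request` function, and the queue entry fields are assumptions; a real detector would likely need more than keyword matching.

```python
import re
from typing import Optional

# Hypothetical verb list derived from the mutating actions named in this runbook.
MUTATING_REQUEST = re.compile(r"\b(register|refresh|ingest|promote|reject)\b", re.IGNORECASE)


def route_operator_request(item: dict) -> Optional[dict]:
    """Return an operator-queue entry if the text looks like a mutation request."""
    match = MUTATING_REQUEST.search(item["text"])
    if match is None:
        return None
    return {
        "queue": "operator_action",
        "source_id": item["source_id"],
        "requested_action": match.group(1).lower(),
        "requires_human_approval": True,
        "executed": False,  # the screener only records the request
    }
```

False positives from this pattern land in the operator queue rather than triggering anything, so the failure direction is the safe one.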
## Review handoff format
Each screener run should hand off a compact review package containing:
1. a run summary
2. candidate counts by type and project
3. top candidates with provenance
4. unresolved items needing explicit operator choice
5. unavailable-source notes
6. checkpoint status
The handoff should be short enough for a human to review without reading the entire raw archive.
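
The six handoff parts could be assembled as follows. The `build_handoff` function and the candidate keys `type`, `project_id`, `score`, `text`, and `provenance` are hypothetical names used only for this sketch.

```python
from collections import Counter
from typing import List


def build_handoff(
    run_summary: str,
    candidates: List[dict],
    unresolved: List[dict],
    unavailable_sources: List[str],
    checkpoint_status: str,
    top_n: int = 5,
) -> dict:
    """Assemble the compact review package in the order listed above."""
    counts = Counter((c["type"], c["project_id"]) for c in candidates)
    top = sorted(candidates, key=lambda c: c["score"], reverse=True)[:top_n]
    return {
        "run_summary": run_summary,
        "candidate_counts": {f"{t}/{p}": n for (t, p), n in sorted(counts.items())},
        "top_candidates": [{"text": c["text"], "provenance": c["provenance"]} for c in top],
        "unresolved_items": unresolved,
        "unavailable_sources": unavailable_sources,
        "checkpoint_status": checkpoint_status,
    }
```

Capping the top candidates at `top_n` is what keeps the package short; the full lists remain in the candidate manifests.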
## Safety rules
The screener must obey these rules every night.
1. No direct project_state writes.
2. No direct registry mutation.
3. No direct refresh or ingest.
4. No direct promote or reject.
5. No treating Discord or Discrawl as trusted truth.
6. No hiding source uncertainty.
7. No inventing missing integrations.
8. No bringing deferred sources into V1 through policy drift or hidden dependency.
## Minimum useful run
A useful screener run can still succeed even if it only does this:
- gathers available Discord and OpenClaw evidence
- filters obvious noise
- produces a small candidate manifest
- notes unavailable archive inputs if any
- leaves trusted state untouched
That is still a correct V1 run.
## Deferred from V1
Screenpipe is deferred from V1. It is not an active input, not a required dependency, and not part of the runtime behavior of this V1 screener.
## Bottom line
The nightly screener is not the brain of the system.
It is the filter.
Its purpose is to make human review easier while preserving the trust hierarchy:
- broad capture in
- narrow reviewed truth out
- no hidden mutations in the middle