diff --git a/README.md b/README.md
index 48b5148..ca60f24 100644
--- a/README.md
+++ b/README.md
@@ -37,11 +37,13 @@ The controller should be boring, deterministic, auditable, and conservative.
 
 ## Start here
 
-1. Read [`docs/00-start-here.md`](docs/00-start-here.md).
-2. Read [`docs/02-v1-scope.md`](docs/02-v1-scope.md).
-3. Review the state machine and safety rules in [`docs/03-architecture.md`](docs/03-architecture.md).
-4. Use [`docs/08-commissioning-checklist.md`](docs/08-commissioning-checklist.md) during hardware bring-up.
-5. Track implementation using [`ROADMAP.md`](ROADMAP.md).
+1. Read [`docs/LLM_CONTEXT.md`](docs/LLM_CONTEXT.md) if you are Cédric's LLM/coding assistant or need a compact project brief.
+2. Read [`docs/00-start-here.md`](docs/00-start-here.md).
+3. Read [`docs/02-v1-scope.md`](docs/02-v1-scope.md).
+4. Review the state machine and safety rules in [`docs/03-architecture.md`](docs/03-architecture.md).
+5. Use [`docs/11-feature-request-intake.md`](docs/11-feature-request-intake.md) before adding new features.
+6. Use [`docs/08-commissioning-checklist.md`](docs/08-commissioning-checklist.md) during hardware bring-up.
+7. Track implementation using [`ROADMAP.md`](ROADMAP.md).
 
 ## Repository map
 
diff --git a/docs/00-start-here.md b/docs/00-start-here.md
index e9a1331..71f1d3a 100644
--- a/docs/00-start-here.md
+++ b/docs/00-start-here.md
@@ -30,4 +30,9 @@ Neither side invents polishing strategy.
 - test checklists;
 - telemetry schema clarifications;
 - state-machine/safety edge cases;
-- concise implementation notes in `docs/nick-generated/`.
+- concise implementation notes in `docs/nick-generated/`;
+- feature-request classification using `docs/11-feature-request-intake.md`.
+
+## If you are using an LLM/coding assistant
+
+Start with [`LLM_CONTEXT.md`](LLM_CONTEXT.md), then load the narrower docs relevant to the change you are making. Do not let the LLM infer optical strategy or change safety/protocol/telemetry contracts without Antoine approval.
diff --git a/docs/11-feature-request-intake.md b/docs/11-feature-request-intake.md
new file mode 100644
index 0000000..29ac3cf
--- /dev/null
+++ b/docs/11-feature-request-intake.md
@@ -0,0 +1,160 @@
+---
+title: Polisher-Control Feature Request Intake
+status: draft
+requested_by: Antoine Letarte
+generated_by: Nick / Hermes
+project: P11-Polisher-Fullum
+repo: polisher-control
+source_truth: false
+created: 2026-06-02
+privacy: technical-only
+---
+
+# Feature Request Intake — Polisher-Control
+
+## Purpose
+
+Use this note when Antoine or Cédric proposes a new feature for `polisher-control`.
+
+The goal is to avoid coding a feature into the wrong layer. Some requests belong in the machine controller, some belong in `polisher-post`, and some belong in `polisher-sim`.
+
+## Fast classification
+
+Before implementation, classify the request.
+
+| Class | Belongs where | Examples |
+|---|---|---|
+| Execution feature | `polisher-control` | manual UI behavior, host state machine, Teensy loop, drive/sensor integration, telemetry logging, alarms, `/data/` artifacts |
+| Contract feature | `polisher-control` + `shared/schemas` + usually `polisher-post` | new telemetry channel, new protocol field, new log field, new machine-capability property |
+| Planning/intelligence feature | `polisher-sim` / `polisher-post` | choosing pass strategy, metrology interpretation, calibration updates, optimization, dwell planning |
+| Safety/scope feature | approval-gated | interlock behavior, E-stop response, force limits, fault reset policy, remote operation, powered manipulation modes |
+
+Rule: if the feature decides **what polishing should be done**, it is probably not a `polisher-control` feature. If it executes/logs/enforces what was already approved, it probably is.
+
+## Intake template
+
+Copy this block into an issue, PR, or `docs/nick-generated/YYYY-MM-DD-feature-name.md` when scoping a feature.
+
+```markdown
+---
+title: Feature Intake —
+status: draft
+requested_by: Antoine / Cédric
+generated_by: Nick / Hermes
+project: P11-Polisher-Fullum
+repo: polisher-control
+source_truth: false
+created: YYYY-MM-DD
+privacy: technical-only
+---
+
+# Feature Intake —
+
+## 1. Requested behavior
+
+- What should the operator/system be able to do?
+- What problem does this solve?
+- Who requested it?
+
+## 2. Classification
+
+- [ ] Execution feature in `polisher-control`
+- [ ] Contract feature touching schemas/protocol/telemetry/logs
+- [ ] Planning/intelligence feature that belongs upstream
+- [ ] Safety/scope feature requiring Antoine approval
+
+Rationale:
+
+## 3. Operator-visible behavior
+
+- UI controls / screens / prompts:
+- Normal workflow:
+- Fault/warning workflow:
+- What is logged as operator action:
+
+## 4. Affected layers
+
+- Host controller:
+- Teensy firmware:
+- Host ↔ Teensy protocol:
+- Telemetry channels:
+- Run/manual-session logs:
+- Machine capability profile:
+- Shared schemas:
+- Tests:
+- Docs/checklists:
+
+## 5. Safety implications
+
+- Does it affect force, motion, brakes, interlocks, E-stop, watchdog, or reset behavior?
+- Does it allow motion in a new condition?
+- Does it change any limit or threshold?
+- Does it require a new refusal/NACK reason?
+- Does it need a hardware interlock or physical confirmation?
+
+Safety decision:
+
+- [ ] No safety behavior change
+- [ ] Safety behavior change; Antoine approval required before implementation
+
+## 6. Contract implications
+
+- New/changed protocol messages:
+- New/changed telemetry channels:
+- New/changed log fields:
+- New/changed machine capability fields:
+- Backward compatibility / schema version impact:
+
+Contract decision:
+
+- [ ] No schema/protocol contract change
+- [ ] Contract change; update schemas/examples/docs/tests together
+
+## 7. Acceptance checks
+
+- [ ] Host-side unit test:
+- [ ] Firmware compile/build check:
+- [ ] Protocol encode/decode test:
+- [ ] Schema/example validation:
+- [ ] State-machine transition test:
+- [ ] Telemetry channel consistency check:
+- [ ] Safety refusal/fault test:
+- [ ] Commissioning checklist update:
+- [ ] Manual/operator workflow check:
+
+## 8. Open questions
+
+- For Antoine:
+- For Cédric:
+- For source-spec update:
+
+## 9. Implementation notes
+
+- Suggested branch:
+- Suggested first PR chunk:
+- Suggested follow-up chunk:
+```
+
+## Approval gates
+
+Ask Antoine before implementing when a request changes any of these:
+
+- v1 scope boundary;
+- safety/interlock behavior;
+- force limits, braking, fault/reset policy;
+- host ↔ Teensy protocol semantics;
+- telemetry channel names, units, rates, or meanings;
+- schema versions or required fields;
+- boundary between `polisher-control`, `polisher-post`, and `polisher-sim`;
+- any powered manipulation / Zero-G / remote-control behavior.
+
+## Implementation chunking recommendation
+
+For most features, keep PRs small:
+
+1. **Spec/doc PR:** feature intake, acceptance checks, schema/protocol sketch.
+2. **Host PR:** state machine / UI workflow / logs / tests.
+3. **Firmware PR:** message handling / loop behavior / telemetry / compile check.
+4. **Integration PR:** end-to-end dry-run, examples, commissioning checklist.
+
+Do not mix a safety contract change with broad unrelated refactors.
diff --git a/docs/LLM_CONTEXT.md b/docs/LLM_CONTEXT.md
new file mode 100644
index 0000000..9bb846d
--- /dev/null
+++ b/docs/LLM_CONTEXT.md
@@ -0,0 +1,440 @@
+---
+title: Polisher-Control LLM Context Pack
+status: draft
+requested_by: Antoine Letarte
+generated_by: Nick / Hermes
+project: P11-Polisher-Fullum
+repo: polisher-control
+source_truth: false
+created: 2026-06-02
+privacy: technical-only
+---
+
+# Polisher-Control LLM Context Pack
+
+## Purpose
+
+This file is the **first prompt/context file** to give to Cédric's LLM or coding assistant when working in this repository.
+
+It condenses the approved technical direction for `polisher-control` into one implementation-oriented context pack. It is an aid for coding and design discussions, not the authority itself.
+
+If this file conflicts with the source specs or with Antoine's explicit direction, stop and ask Antoine before changing architecture, telemetry contracts, safety behavior, or v1 scope.
+
+## Scope / non-scope
+
+### In scope for `polisher-control`
+
+`polisher-control` is the machine-side execution layer. It owns:
+
+- Teensy firmware and hard-real-time loop.
+- Raspberry Pi / host controller and touchscreen/manual workflow.
+- Host ↔ Teensy protocol.
+- Force setpoint execution and closed-loop Fz control.
+- Table/spindle/arm-drive command interfaces.
+- KWR75B-CAN force/torque acquisition.
+- ODrive S1 + M8325s spindle integration.
+- State machine, pause/resume/abort, faults, alarms, interlocks.
+- Telemetry, event logs, manual-session logs, run artifacts, and `/data/` layout.
+- Conservative machine capability descriptor.
+
+### Explicitly out of scope
+
+Do **not** implement or infer:
+
+- Optical figuring strategy.
+- Interferometer/metrology interpretation.
+- Dwell-map optimization or pass planning.
+- Preston coefficient calibration or learning logic.
+- Autonomous correction decisions.
+- Controller-side replanning.
+- Silent mutation of upstream setpoints.
+- Remote control of the machine without a physically present operator.
+
+Core rule: **the controller executes approved setpoints safely; it does not decide polishing strategy.**
+
+## Source references
+
+Use repo docs first for implementation, then the source specs if there is ambiguity.
+
+### Repo implementation docs
+
+- `README.md` — repository overview and v1 finish line.
+- `AGENTS.md` — repo rules for LLM/coding agents.
+- `ROADMAP.md` — implementation roadmap.
+- `docs/00-start-here.md` — Cédric build brief.
+- `docs/01-ecosystem-boundaries.md` — `polisher-sim` / `polisher-post` / `polisher-control` split.
+- `docs/02-v1-scope.md` — v1 delivery boundary.
+- `docs/03-architecture.md` — host/firmware architecture, state machine, safety posture.
+- `docs/04-host-teensy-protocol-v1.md` — protocol requirements and message set.
+- `docs/05-telemetry-channel-spec-v1.md` — telemetry channels and rates.
+- `docs/06-event-alarm-codes-v1.md` — event/alarm codes.
+- `docs/07-manual-mode-workflow.md` — touchscreen manual workflow.
+- `docs/08-commissioning-checklist.md` — bring-up checklist.
+- `docs/09-acceptance-checklist.md` — pass/fail criteria.
+- `docs/10-open-questions-for-cedric.md` — implementation questions to close.
+- `shared/schemas/*.schema.json` — contract schemas mirrored into this repo.
+- `shared/machine/fullum-alpha.capabilities.v1.json` — draft capability profile.
+
+### Upstream source-truth docs
+
+These live in Antoine's P11 project vault and outrank this generated context pack:
+
+- `/home/papa/obsidian-vault/2-Projects/P11-Polisher-Fullum/_curation/CONTEXT.md`
+- `/home/papa/obsidian-vault/2-Projects/P11-Polisher-Fullum/README.md`
+- `/home/papa/obsidian-vault/2-Projects/P11-Polisher-Fullum/software-suite/control/firmware/Fullum-Polisher-Machine-Control-Firmware-Spec-v1.md`
+- `/home/papa/obsidian-vault/2-Projects/P11-Polisher-Fullum/05-Implementation/Controller-Bridge-Digital-Twin-Architecture-Plan.md`
+- `/home/papa/obsidian-vault/2-Projects/P11-Polisher-Fullum/05-Implementation/Polisher-Control/00-Polisher-Control-System.md`
+- `/home/papa/obsidian-vault/2-Projects/P11-Polisher-Fullum/05-Implementation/Polisher-Control/01-Polisher-Control-Roadmap.md`
+
+## Architecture summary
+
+```text
+polisher-sim
+  planning / digital twin / metrology / calibration / process intelligence
+  ↓
+polisher-post
+  validation / machine-capability checks / controller-job packaging / log import
+  ↓
+polisher-control
+  safe deterministic machine execution / telemetry / operator workflow
+  ↓
+run logs + manual telemetry return upstream for analysis and calibration
+```
+
+### `polisher-sim`
+
+Owns the question: **what should be done next?**
+
+It may ingest interferometer maps, model removal, plan passes, estimate uncertainty, calibrate machine-specific behavior, and produce frozen job packages. It must not drive hardware directly.
+
+### `polisher-post`
+
+Owns the question: **how do we express the approved plan safely for this machine?**
+
+It validates schemas, checks machine capabilities, normalizes units, records translation losses, and emits controller-safe `controller-job.v1` packages.
+
+### `polisher-control`
+
+Owns the question: **how do we execute these approved setpoints safely now?**
+
+It runs the machine, enforces safety, records exact reality, and exports artifacts. It must be boring, predictable, auditable, and conservative.
+
+## v1 finish line
+
+The v1 finish line is:
+
+> Normand can operate the polisher manually from the touchscreen, producing clean synchronized telemetry, with safety enforced.
+
+Program execution exists as a contract scaffold and future path, but the v1 production surface is manual operation.
+
+### v1 priority order
+
+1. Safety/interlocks and deterministic state machine.
+2. Manual mode from touchscreen / host UI.
+3. Stable host ↔ Teensy setpoint and telemetry protocol.
+4. KWR75B-CAN force/torque acquisition and stale-frame detection.
+5. Table/arm encoder acquisition and synchronized timestamps.
+6. ODrive S1 + M8325s spindle command/telemetry path.
+7. Run/manual logs and `/data/` file layout.
+8. Controller-job intake dry-run for future program execution.
+
+### v1 must include
+
+- `MANUAL` state reachable from `IDLE` only.
+- Operator live controls for force, table RPM, spindle RPM, and optional force modulation.
+- Mandatory geometric gate before `MANUAL` or `RUNNING`.
+- Full telemetry at **≥100 Hz** with a single Teensy monotonic timestamp source.
+- Manual-session log emitted on exit.
+- Same safety/interlocks in manual mode as job mode.
+- KWR75B-CAN sensor status and stale-frame watchdog.
+- Tool-weight compensation using a configured `fz_tool_weight_offset_n`.
+- Mechanical safe-removal workflow for the tool/head.
+
+### v1 must not include unless Antoine explicitly reopens it
+
+- Powered Zero-G / admittance manipulation mode.
+- Controller-side optical strategy.
+- Controller-side calibration/learning.
+- Remote control of the machine.
+- Autonomous multi-pass orchestration.
+- Metrology import/interpretation in the controller.
+
+## Hardware basis to preserve
+
+These are current project facts unless Antoine/source specs supersede them:
+
+- Real-time controller: **Teensy 4.1**.
+- Host: **Raspberry Pi 4 + touchscreen**.
+- Primary process force sensor: **Kunwei KWR75B-CAN**, dedicated CAN link to Teensy through isolated transceiver.
+- Table encoder: **RLS Artos DHR 162 mm**, 20-bit absolute, SSI/RS422 basis.
+- Arm encoder: **RLS Orbis BR10**, 14-bit absolute, SSI/RS422 basis.
+- Spindle: **ODrive S1 + M8325s 100KV**, v1 velocity control, CAN 2.0B at 1 Mbps preferred.
+- Z/force actuator: counterweight-biased **AutomationDirect SV2L-210B** servo with **SV2A-2150** drive, torque/current mode to confirm.
+- Z brake: NC 24 VDC electromagnetic brake; engages on E-stop or power loss.
+- Safety relay: Pilz PNOZ X1 or Banner XS26-2 still needs final selection.
+- Powered Zero-G is **not** part of v1.
+
+## State machine context
+
+Required states:
+
+- `IDLE`
+- `JOB_LOADED`
+- `READY`
+- `RUNNING`
+- `PAUSED`
+- `ABORTING`
+- `COMPLETED`
+- `ABORTED`
+- `FAULTED`
+- `MANUAL`
+
+Rules:
+
+- `FAULTED` exits only through explicit operator reset.
+- Illegal transitions are rejected and logged.
+- Every transition emits an event.
+- `JOB_LOADED → READY` requires explicit operator acknowledge.
+- `IDLE → MANUAL` requires the geometric gate.
+- Cannot enter `MANUAL` with a job loaded.
+- Cannot load a job while in `MANUAL`.
+- A fault in `MANUAL` transitions to `FAULTED`.
+
+## Manual mode workflow
+
+Shop-floor sequence:
+
+1. Hardware HOA selector in **Auto** so Teensy/RPi are active.
+2. Operator mechanically sets arm amplitude and center on the machine.
+3. Operator opens **Manual Mode** on touchscreen.
+4. UI presents mandatory geometric gate for:
+   - `r_menante`
+   - `L_menee`
+   - `R_tool`
+   - `configured_arm_amplitude_deg`
+   - `configured_arm_center_deg`
+5. Operator confirms each value; no skip path.
+6. Operator enters initial force, table RPM, spindle RPM, optional modulation.
+7. Host sends `MANUAL_START`; Teensy ACK/NACKs.
+8. Telemetry begins at ≥100 Hz.
+9. Operator adjusts live setpoints; each change is logged as an event.
+10. Operator presses Stop; host sends `MANUAL_STOP`.
+11. Machine ramps down, returns to `IDLE`, and writes `manual-session-log.v1` + telemetry CSV.
+
+## Telemetry contract
+
+### Principles
+
+- One monotonic Teensy clock anchors all telemetry.
+- Raw values are logged; filtering/analysis happens upstream.
+- Commanded/setpoint and actual/measured values stay distinct.
+- Sensor faults are explicit: validity flags, NaN, sentinel, or event/alarm.
+- Missing samples must be detectable from timestamps.
+- Header/channel names must remain stable; future tools will key off them.
+
+### Core required channels at ≥100 Hz
+
+- `timestamp_us`
+- `table_angle_deg`
+- `arm_angle_deg`
+- `fz_n`
+- `mx`
+- `my`
+- `mz`
+- `spindle_rpm_actual`
+- `table_rpm_actual`
+- `arm_amplitude_deg_derived`
+- `arm_center_deg_derived`
+- `machine_state`
+
+### Strongly recommended channels
+
+- `fx_n`
+- `fy_n`
+- `ft_status`
+- `z_servo_iq_v`
+- `z_brake_engaged`
+- `spindle_drive_state`
+- `spindle_drive_error`
+- `spindle_bus_voltage_v`
+- `spindle_iq_a`
+- `spindle_motor_temp_c`
+- `arm_angle_linearized_deg`
+- `force_setpoint_n`
+- `table_rpm_setpoint`
+- `spindle_rpm_setpoint`
+- `force_actuator_cmd`
+- `estop_active`
+- `interlock_state`
+- `mode`
+
+## Host ↔ Teensy protocol context
+
+The protocol is a setpoint/telemetry protocol, not G-code.
+
+### Host → Teensy messages
+
+- `HEARTBEAT`
+- `SEGMENT_START`
+- `SETPOINT`
+- `PAUSE`
+- `RESUME`
+- `ABORT`
+- `MANUAL_START`
+- `MANUAL_STOP`
+- `ESTOP`
+
+### Teensy → Host messages
+
+- `ACK`
+- `NACK`
+- `TELEMETRY`
+- `EVENT`
+- `SEGMENT_DONE`
+- `ABORT_COMPLETE`
+
+### Protocol requirements
+
+- Version on every frame.
+- Robust framing; do not rely on timing gaps.
+- CRC-16 or CRC-32 on every frame.
+- Every host command receives ACK or NACK.
+- NACK reason is machine-readable, not free text only.
+- Deterministic parsing; no UI text parsing in the firmware protocol.
+- Safety messages bounded-latency and robust to single-frame loss.
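
As an illustration of the framing requirements above (version on every frame, CRC on every frame, no reliance on timing gaps), here is a minimal host-side sketch. The frame layout, the `SYNC` marker, the numeric message IDs, and the `MAX_PAYLOAD` bound are all assumptions made for this example; the actual wire format belongs to `docs/04-host-teensy-protocol-v1.md` and is not defined here.

```python
import struct
import zlib

# Everything below is a hypothetical layout for illustration only.
SYNC = b"\xA5\x5A"                 # assumed start-of-frame marker
PROTOCOL_VERSION = 1               # assumed version number
MAX_PAYLOAD = 256                  # assumed upper bound on payload size
HEADER = struct.Struct("<2sBBHH")  # sync, version, msg_type, seq, payload_len
CRC = struct.Struct("<I")          # CRC-32 over header + payload

# Message names come from the protocol list; the numeric IDs are made up.
MSG_HEARTBEAT = 0x01
MSG_MANUAL_START = 0x07


def encode_frame(msg_type: int, seq: int, payload: bytes = b"") -> bytes:
    """Build one frame: header, payload, then CRC-32 of everything before it."""
    header = HEADER.pack(SYNC, PROTOCOL_VERSION, msg_type, seq & 0xFFFF, len(payload))
    body = header + payload
    return body + CRC.pack(zlib.crc32(body))


def decode_frames(buffer: bytearray) -> list[tuple[int, int, bytes]]:
    """Pull every complete, valid frame out of buffer, consuming the bytes used.

    Invalid data is skipped one byte at a time so the parser resynchronizes
    on the next sync marker instead of depending on inter-frame timing.
    """
    frames = []
    while True:
        start = buffer.find(SYNC)
        if start < 0:
            # Keep the last byte: it may be the first half of a sync marker.
            del buffer[:max(0, len(buffer) - 1)]
            return frames
        del buffer[:start]
        if len(buffer) < HEADER.size:
            return frames  # header incomplete, wait for more bytes
        _, version, msg_type, seq, length = HEADER.unpack_from(buffer)
        if version != PROTOCOL_VERSION or length > MAX_PAYLOAD:
            del buffer[:1]  # implausible header: resync
            continue
        end = HEADER.size + length + CRC.size
        if len(buffer) < end:
            return frames  # frame incomplete, wait for more bytes
        (crc,) = CRC.unpack_from(buffer, HEADER.size + length)
        if crc == zlib.crc32(bytes(buffer[:end - CRC.size])):
            payload = bytes(buffer[HEADER.size:HEADER.size + length])
            frames.append((msg_type, seq, payload))
            del buffer[:end]
        else:
            del buffer[:1]  # CRC mismatch: drop one byte and resync
```

A caller would append received serial bytes to one `bytearray` and call `decode_frames` after each read; a corrupted frame is dropped without a result, which is where the ACK/NACK and retry rules of the real protocol would take over.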
+
+## Safety/interlock context
+
+Safety is layered:
+
+- Hardware safety: E-stop circuit, safety relay, brake, drive enables. Independent of software.
+- Teensy fast safety: force limits, encoder loss, F/T sensor stale/invalid, drive faults, watchdog response.
+- Host slow safety: state integrity, RPM deviations, segment/manual session orchestration, logging, UI gates.
+
+Hard-stop faults include:
+
+- `ESTOP_ACTIVATED`
+- `FORCE_OVER_LIMIT`
+- `ENCODER_LOST`
+- `DRIVE_FAULT`
+- `FT_SENSOR_INVALID`
+
+Warnings/recoverable pauses include:
+
+- `FORCE_UNDER_LIMIT`
+- `SPINDLE_RPM_DEVIATION`
+- `TABLE_RPM_DEVIATION`
+- `HOST_COMMS_TIMEOUT`
+
+Refused transition / gate reasons include:
+
+- `GEOMETRY_NOT_VALIDATED`
+- `ARM_HANDLING_INTERLOCK` if final lock/cam-nut feedback exists.
+
+## Force control and tool handling
+
+- Force PID closes on the compensated contact force path.
+- Store `fz_tool_weight_offset_n` for the installed tool/head configuration.
+- Keep raw force available for diagnostics as `fz_raw_n` where practical.
+- Compute compensated contact force as:
+
+```text
+fz_contact_n = fz_raw_n - fz_tool_weight_offset_n
+```
+
+- If only one canonical force channel is emitted, document that `fz_n` means compensated contact force and include the offset in session/run metadata.
+- Do not implement a powered Zero-G state in v1.
+
+Mechanical safe tool-removal workflow:
+
+1. Stop active manual/job operation.
+2. Command force to zero and stop table, spindle, and arm drive.
+3. Confirm compensated force is near zero.
+4. Confirm machine is in `IDLE` or stopped manual condition.
+5. Engage/verify swing arm or arc mechanical lock.
+6. Remove cam arm nut to free arm rotation.
+7. Move arm aside by hand.
+8. Remove tool from blank by hand.
+9. Reinstall/seat tool, restore mechanical configuration, and rerun the geometric gate before software-controlled motion.
+
+## Run artifacts and data layout
+
+Write run data to USB SSD, not the RPi SD card.
+
+Baseline layout:
+
+```text
+/data/
+├── runs/
+│   └── run-*/
+│       ├── run-log.v1.json
+│       ├── telemetry.csv
+│       ├── telemetry-derived.parquet
+│       ├── segment-stats.json
+│       ├── dwell-map.npz
+│       ├── work-map.npz
+│       ├── anomaly-windows/
+│       └── manifest.json
+├── manual/
+│   └── manual-*/
+│       ├── manual-session-log.v1.json
+│       ├── telemetry.csv
+│       ├── telemetry-derived.parquet
+│       ├── segment-stats.json
+│       ├── dwell-map.npz
+│       ├── work-map.npz
+│       └── manifest.json
+├── capabilities/
+│   └── machine-capabilities.v1.json
+└── status.json
+```
+
+For v1 hardware proving, raw full-rate telemetry, run/manual logs, stable filenames, and hashes are mandatory. Derived artifacts may begin as a simple post-run routine but should follow the contract from the start.
+
+## Feature request workflow for Antoine/Cédric
+
+When Antoine says he wants to add a feature to `polisher-control`, classify it before coding:
+
+1. **Execution feature** — belongs here if it changes host state machine, protocol, firmware, manual workflow, telemetry, safety, or run artifacts.
+2. **Contract feature** — may require parallel changes in `shared/schemas`, `polisher-post`, and possibly `polisher-sim`.
+3. **Planning/intelligence feature** — probably belongs upstream in `polisher-sim` or `polisher-post`, not here.
+4. **Safety/scope feature** — requires Antoine approval before implementation if it changes interlocks, limits, fault behavior, telemetry schema, protocol semantics, or v1 scope.
+
+For each proposed feature, capture:
+
+- requested behavior;
+- operator-visible behavior;
+- affected layers: host / Teensy / protocol / telemetry / schemas / docs / tests;
+- safety implications;
+- acceptance checks;
+- open questions for Antoine or Cédric.
+
+Use `docs/11-feature-request-intake.md` as the template.
+
+## Open implementation questions to keep visible
+
+Current control/firmware-facing questions include:
+
+- Confirm KBSI-240D / table drive command path or replacement.
+- Confirm table encoder transport and transceiver choice.
+- Define ODrive runtime command/telemetry path, scaling, fault reset, safe stop, and config export.
+- Confirm SV2A-2150 torque/current command mapping, enable/fault wiring, current/Iq monitor path.
+- Confirm Z limit-switch / hard-stop wiring into safety chain.
+- Confirm NC brake coil driver and brake-engaged diagnostic feedback.
+- Select safety relay model.
+- Confirm KWR75B-CAN frame map, byte order, scaling, status bits, and update rate.
+
+## Acceptance mindset
+
+A change is not done because code exists. It is done when the relevant acceptance checks pass:
+
+- state transition tests;
+- protocol encode/decode/ACK/NACK tests;
+- schema/example validation;
+- telemetry channel consistency checks;
+- safety gate refusal tests;
+- host-side log artifact tests;
+- firmware compile/build checks where hardware-specific code is touched;
+- commissioning checklist item if hardware integration is involved.
+
+Keep code boring. Prefer explicit rejection over silent assumptions.
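
To make "explicit rejection" and the state transition tests concrete, here is a small host-side sketch of a guarded transition table. It covers only the rules stated in this context pack and is not the full state machine from `docs/03-architecture.md`. The `Controller` class, the `GUARDS` table, the flag names, and every refusal string except `GEOMETRY_NOT_VALIDATED` are hypothetical, and the reset target of `FAULTED` is assumed to be `IDLE`.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    IDLE = "IDLE"
    JOB_LOADED = "JOB_LOADED"
    READY = "READY"
    RUNNING = "RUNNING"
    PAUSED = "PAUSED"
    ABORTING = "ABORTING"
    COMPLETED = "COMPLETED"
    ABORTED = "ABORTED"
    FAULTED = "FAULTED"
    MANUAL = "MANUAL"


# Partial table: (from, to) -> None for an unguarded transition, or
# (required_flag, refusal_reason). Anything not listed is illegal.
# Flag names and all reasons except GEOMETRY_NOT_VALIDATED are assumptions.
GUARDS = {
    (State.IDLE, State.MANUAL): ("geometry_validated", "GEOMETRY_NOT_VALIDATED"),
    (State.MANUAL, State.IDLE): None,
    (State.MANUAL, State.FAULTED): None,
    (State.IDLE, State.JOB_LOADED): None,
    (State.JOB_LOADED, State.READY): ("operator_ack", "OPERATOR_ACK_REQUIRED"),
    (State.FAULTED, State.IDLE): ("operator_reset", "OPERATOR_RESET_REQUIRED"),
}


@dataclass
class Controller:
    state: State = State.IDLE
    events: list = field(default_factory=list)

    def request(self, target: State, **conditions: bool) -> bool:
        """Attempt a transition; every outcome, accepted or refused, is logged."""
        key = (self.state, target)
        if key not in GUARDS:
            return self._refuse(target, "ILLEGAL_TRANSITION")
        guard = GUARDS[key]
        if guard is not None:
            flag, reason = guard
            if not conditions.get(flag, False):
                return self._refuse(target, reason)
        self.events.append(("TRANSITION", self.state.value, target.value))
        self.state = target
        return True

    def _refuse(self, target: State, reason: str) -> bool:
        # The state never changes on refusal; the reason is machine-readable.
        self.events.append(("REFUSED", self.state.value, target.value, reason))
        return False
```

A transition test then reads as a sequence of requests with expected outcomes, for example that `Controller().request(State.MANUAL)` is refused until it is called with `geometry_validated=True`, and that loading a job from `MANUAL` is refused as illegal.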