CAD-Documenter
One video → Complete engineering documentation.
Transform video walkthroughs of CAD models into comprehensive, structured documentation — ready for CDRs, FEA setups, and client deliverables.
Features
- Smart frame extraction — Scene detection captures key moments, not every second
- Whisper transcription — Local GPU transcription, no cloud dependency
- Hybrid workflow — Export for Clawdbot processing (no API costs!)
- Windows GUI — Easy project management with CustomTkinter
- Atomaste PDF — Professional reports with engineering branding
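As an illustration of what scene-based frame extraction can look like, the sketch below builds an FFmpeg command that keeps only frames whose scene-change score exceeds a threshold. It is not taken from CAD-Documenter's source: the function name, the 0.3 threshold, and the `frame_%04d.png` naming are assumptions made for this example, and the tool's own detection logic may differ.

```python
import shlex
from pathlib import Path


def scene_frames_command(video: str, out_dir: str, threshold: float = 0.3) -> list[str]:
    """Build an FFmpeg argv that writes one PNG per detected scene change.

    Hypothetical helper for illustration; not part of CAD-Documenter.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be between 0 and 1")
    pattern = str(Path(out_dir) / "frame_%04d.png")
    return [
        "ffmpeg", "-i", video,
        # Keep a frame only when FFmpeg's scene score is above the threshold.
        "-vf", f"select='gt(scene,{threshold})'",
        # Do not duplicate frames to fill the gaps left by dropped ones
        # (newer FFmpeg builds prefer "-fps_mode vfr").
        "-vsync", "vfr",
        pattern,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(scene_frames_command("recording.mp4", "frames")))
```

A lower threshold keeps more frames; a higher one keeps only large visual changes such as switching views or opening a new part.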
Quick Start (GUI)
# Clone and install
git clone http://192.168.86.50:3000/Antoine/CAD-Documenter.git
cd CAD-Documenter
uv sync
uv pip install customtkinter
# Launch GUI
uv run cad-doc-gui
Workflow Options
Option A: Hybrid with Clawdbot (Recommended - No API Costs)
Windows (GUI)                     Clawdbot
─────────────                     ────────
1. Create project
2. Add videos
3. Process --export-only
   → FFmpeg frames
   → Whisper transcription
   → clawdbot_export/
                      ─────────►
                                  4. "Process CAD report for X"
                                     → Vision analysis (free)
                                     → Vault markdown
                                     → Atomaste PDF
Export for Clawdbot:
uv run cad-doc project init ./my-project -n "My Project"
uv run cad-doc project add ./my-project recording.mp4
uv run cad-doc project process ./my-project --export-only
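The three export commands above can also be scripted when several recordings belong to one project. The following is a minimal sketch, assuming `uv` is on the PATH and using only the subcommands shown above; `export_commands` and `run_all` are hypothetical helper names, not part of the CLI.

```python
import subprocess


def export_commands(project_dir: str, name: str, videos: list[str]) -> list[list[str]]:
    """Return the init, add, and export-only process commands as argv lists."""
    base = ["uv", "run", "cad-doc", "project"]
    commands = [base + ["init", project_dir, "-n", name]]
    # One "add" call per recording.
    commands += [base + ["add", project_dir, video] for video in videos]
    commands.append(base + ["process", project_dir, "--export-only"])
    return commands


def run_all(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it in order."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_all(export_commands("./my-project", "My Project", ["recording.mp4"]))
```

The default is a dry run, so the script only prints the commands until `dry_run=False` is passed.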
Option B: Standalone with API
export OPENAI_API_KEY="sk-your-key"   # or ANTHROPIC_API_KEY
# PowerShell equivalent: $env:OPENAI_API_KEY = "sk-your-key"
uv run cad-doc video.mp4 --bom --atomizer-hints --pdf
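Before launching a standalone run, it can help to check which API key is actually set. The sketch below chooses a value for `--api-provider` from the environment and builds the command shown above. The mapping between provider names and environment variables is an assumption inferred from this README, and `pick_provider` and `standalone_command` are hypothetical helpers.

```python
import os
from typing import Mapping, Optional

# Assumed mapping between --api-provider values and environment variables.
PROVIDER_ENV = {"openai": "OPENAI_API_KEY", "anthropic": "ANTHROPIC_API_KEY"}


def pick_provider(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first provider whose API key is set and non-empty, else None."""
    env = os.environ if env is None else env
    for provider, variable in PROVIDER_ENV.items():
        if env.get(variable):
            return provider
    return None


def standalone_command(video: str, env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Build the single-video command, failing early if no key is available."""
    provider = pick_provider(env)
    if provider is None:
        raise RuntimeError("Set OPENAI_API_KEY or ANTHROPIC_API_KEY first")
    return ["uv", "run", "cad-doc", video, "--api-provider", provider,
            "--bom", "--atomizer-hints", "--pdf"]
```

Failing before the run starts avoids extracting frames and transcribing audio only to stop at the vision step.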
Installation
Requirements
- Python 3.12+
- FFmpeg
- CUDA GPU (recommended for Whisper)
# Windows (with Chocolatey)
choco install ffmpeg
# Or download from https://ffmpeg.org/download.html
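A quick way to confirm the first two requirements is a small preflight script. This is a sketch using only the standard library; `missing_requirements` is a hypothetical helper, and it does not check for a CUDA GPU, which is only recommended.

```python
import shutil
import sys


def missing_requirements(version_info=sys.version_info, which=shutil.which) -> list[str]:
    """Return the requirements from the list above that are not satisfied."""
    missing = []
    # Compare only (major, minor) against the documented minimum.
    if tuple(version_info[:2]) < (3, 12):
        missing.append("Python 3.12+")
    # shutil.which returns None when the executable is not on the PATH.
    if which("ffmpeg") is None:
        missing.append("FFmpeg")
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    print("All requirements found" if not problems else "Missing: " + ", ".join(problems))
```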
Install
git clone http://192.168.86.50:3000/Antoine/CAD-Documenter.git
cd CAD-Documenter
uv sync
# For GUI support
uv pip install customtkinter
CLI Reference
GUI
uv run cad-doc-gui
Project Management
# Create project
uv run cad-doc project init ./my-project -n "Project Name"
# Add videos
uv run cad-doc project add ./my-project video.mp4
# Process (export for Clawdbot)
uv run cad-doc project process ./my-project --export-only
# Process (with API)
uv run cad-doc project process ./my-project
# Check status
uv run cad-doc project status ./my-project
# Generate unified docs
uv run cad-doc project generate ./my-project
Single Video (API mode)
uv run cad-doc video.mp4 [options]
Options:
  -o, --output PATH        Output directory
  --frames-only            Only extract frames
  --skip-transcription     Skip audio transcription
  --atomizer-hints         Generate FEA optimization hints
  --bom                    Generate Bill of Materials
  --pdf                    Generate PDF output
  --api-provider TEXT      openai or anthropic
  --whisper-model TEXT     tiny/base/small/medium/large
Output
Clawdbot Export (clawdbot_export/)
<session>/
├── frames/ # Extracted keyframes
│ ├── 00-01-30.png
│ └── ...
├── transcript.json # Whisper output with timestamps
└── metadata.json # Session info
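For anyone consuming the export outside Clawdbot, the layout above can be read with a few lines of Python. The sketch assumes that frame names encode the capture time as HH-MM-SS (so `00-01-30.png` is 90 seconds into the video), which is inferred from the example name rather than documented. The contents of `transcript.json` and `metadata.json` are loaded as-is because their schema is not specified here. `frame_seconds` and `load_session` are hypothetical helpers.

```python
import json
from pathlib import Path


def frame_seconds(frame_name: str) -> int:
    """Convert a frame name such as '00-01-30.png' to seconds (assumes HH-MM-SS)."""
    hours, minutes, seconds = (int(part) for part in Path(frame_name).stem.split("-"))
    return hours * 3600 + minutes * 60 + seconds


def load_session(session_dir: str) -> dict:
    """Collect frame names with timestamps plus the two JSON files of a session."""
    root = Path(session_dir)
    frames = sorted(path.name for path in (root / "frames").glob("*.png"))
    return {
        "frames": [(name, frame_seconds(name)) for name in frames],
        # Loaded without interpretation; the schema is not documented here.
        "transcript": json.loads((root / "transcript.json").read_text(encoding="utf-8")),
        "metadata": json.loads((root / "metadata.json").read_text(encoding="utf-8")),
    }
```

Converting frame names to seconds makes it possible to line frames up with timestamped transcript entries, whatever form those entries take.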
Full Processing
- 📄 Markdown — Structured documentation
- 📊 BOM — Components, materials, functions
- 🎯 Atomizer hints — FEA objectives & constraints
- 📑 PDF — Professional Atomaste-branded report
Tips
- Narrate your recording — Audio narration = rich documentation
- Collapse UI panels — In NX: Ctrl+Shift+N to hide Assembly Navigator
- Use scene detection — Enabled by default, captures meaningful frames
Architecture
CAD-Documenter/
├── src/cad_documenter/
│ ├── cli.py # Main CLI
│ ├── cli_project.py # Project commands
│ ├── gui.py # Windows GUI
│ ├── pipeline.py # Processing orchestrator
│ ├── video_processor.py # Frame extraction
│ ├── audio_analyzer.py # Whisper transcription
│ ├── vision_analyzer.py # AI vision (API mode)
│ ├── incremental.py # Project processing
│ └── config.py # Configuration
├── prompts/ # AI prompts
├── templates/ # Output templates
└── tests/
License
MIT
Credits
Built by Atomaste for the engineering community.