Update README with hybrid workflow and GUI docs

Mario Lavoie
2026-01-28 11:52:16 +00:00
parent 5fbd744cca
commit e8cca0b9c5

282
README.md

@@ -7,213 +7,171 @@ Transform video walkthroughs of CAD models into comprehensive, structured docume
[![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
## The Problem
## Features
- **Documentation is tedious** — Engineers spend hours documenting CAD models manually
- **Knowledge lives in heads** — Verbal explanations during reviews aren't captured
- **CDR prep is painful** — Gathering images, writing descriptions, creating BOMs
- **FEA setup requires context** — Atomizer needs model understanding that's often verbal
- **Smart frame extraction** — Scene detection captures key moments, not every second
- **Whisper transcription** — Local GPU transcription, no cloud dependency
- **Hybrid workflow** — Export for Clawdbot processing (no API costs!)
- **Windows GUI** — Easy project management with CustomTkinter
- **Atomaste PDF** — Professional reports with engineering branding
## The Solution
## Quick Start (GUI)
Record yourself explaining your CAD model. CAD-Documenter:
```bash
# Clone and install
git clone http://192.168.86.50:3000/Antoine/CAD-Documenter.git
cd CAD-Documenter
uv sync
uv pip install customtkinter
1. **Extracts key frames** at scene changes (smart, not fixed intervals)
2. **Transcribes your explanation** via Whisper
3. **Analyzes visually** using GPT-4o or Claude vision
4. **Correlates visual + verbal** to identify components
5. **Generates documentation** with images, BOM, and Atomizer hints
# Launch GUI
uv run cad-doc-gui
```
### Output
## Workflow Options
- 📄 **Markdown documentation** — Structured, version-controlled
- 📊 **Bill of Materials** — CSV with components, materials, functions
- 🔧 **Component registry** — Detailed specs per component
- 🎯 **Atomizer hints** — FEA objectives, constraints, parameters
- 📑 **PDF** — Professional output via Atomaste Report Standard
### Option A: Hybrid with Clawdbot (Recommended - No API Costs)
```
Windows (GUI) Clawdbot
───────────── ────────
1. Create project
2. Add videos
3. Process --export-only
→ FFmpeg frames
→ Whisper transcription
→ clawdbot_export/
─────────►
4. "Process CAD report for X"
→ Vision analysis (free)
→ Vault markdown
→ Atomaste PDF
```
**Export for Clawdbot:**
```bash
uv run cad-doc project init ./my-project -n "My Project"
uv run cad-doc project add ./my-project recording.mp4
uv run cad-doc project process ./my-project --export-only
```
### Option B: Standalone with API
```bash
export OPENAI_API_KEY="sk-your-key" # or ANTHROPIC_API_KEY
uv run cad-doc video.mp4 --bom --atomizer-hints --pdf
```
## Installation
```bash
# Clone
git clone http://100.80.199.40:3000/Antoine/CAD-Documenter.git
cd CAD-Documenter
# Install with uv
uv sync
# Or with pip
pip install -e .
```
### Requirements
- Python 3.12+
- ffmpeg (for video/audio processing)
- OpenAI or Anthropic API key (for vision analysis)
- FFmpeg
- CUDA GPU (recommended for Whisper)
```bash
# macOS
brew install ffmpeg
# Ubuntu/Debian
sudo apt install ffmpeg
# Windows (with chocolatey)
choco install ffmpeg
# Or download from https://ffmpeg.org/download.html
```
## Quick Start
### Install
```bash
# Set API key
export OPENAI_API_KEY="sk-your-key"
git clone http://192.168.86.50:3000/Antoine/CAD-Documenter.git
cd CAD-Documenter
uv sync
# Run
uv run cad-doc walkthrough.mp4
# With all features
uv run cad-doc walkthrough.mp4 --bom --atomizer-hints --pdf
# For GUI support
uv pip install customtkinter
```
## Usage
## CLI Reference
### GUI
```bash
# Basic
cad-doc video.mp4
# Custom output directory
cad-doc video.mp4 --output ./my_docs/
# Full pipeline
cad-doc video.mp4 --bom --atomizer-hints --pdf
# Just extract frames
cad-doc video.mp4 --frames-only
# Use Anthropic instead of OpenAI
cad-doc video.mp4 --api-provider anthropic
# Better transcription (slower)
cad-doc video.mp4 --whisper-model medium
uv run cad-doc-gui
```
## Configuration
Create a config file:
### Project Management
```bash
cad-doc --init-config
# Creates ~/.cad-documenter.toml
# Create project
uv run cad-doc project init ./my-project -n "Project Name"
# Add videos
uv run cad-doc project add ./my-project video.mp4
# Process (export for Clawdbot)
uv run cad-doc project process ./my-project --export-only
# Process (with API)
uv run cad-doc project process ./my-project
# Check status
uv run cad-doc project status ./my-project
# Generate unified docs
uv run cad-doc project generate ./my-project
```
Or set environment variables:
### Single Video (API mode)
```bash
export OPENAI_API_KEY="sk-..." # Required for OpenAI
export ANTHROPIC_API_KEY="sk-..." # Required for Anthropic
export CAD_DOC_PROVIDER="anthropic" # Override default provider
uv run cad-doc video.mp4 [options]
Options:
-o, --output PATH Output directory
--frames-only Only extract frames
--skip-transcription Skip audio transcription
--atomizer-hints Generate FEA optimization hints
--bom Generate Bill of Materials
--pdf Generate PDF output
--api-provider TEXT openai or anthropic
--whisper-model TEXT tiny/base/small/medium/large
```
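The options listed above can also be assembled programmatically when scripting batch runs. The following is a minimal sketch, not part of CAD-Documenter: `build_cad_doc_command` is a hypothetical helper, and it only uses flags and values that appear in the option list above.

```python
from __future__ import annotations

# Values taken from the documented option list.
API_PROVIDERS = ("openai", "anthropic")
WHISPER_MODELS = ("tiny", "base", "small", "medium", "large")


def build_cad_doc_command(
    video: str,
    output: str | None = None,
    bom: bool = False,
    atomizer_hints: bool = False,
    pdf: bool = False,
    api_provider: str | None = None,
    whisper_model: str | None = None,
) -> list[str]:
    """Hypothetical helper: build an argv list for `uv run cad-doc`."""
    if api_provider is not None and api_provider not in API_PROVIDERS:
        raise ValueError(f"unsupported API provider: {api_provider!r}")
    if whisper_model is not None and whisper_model not in WHISPER_MODELS:
        raise ValueError(f"unsupported Whisper model: {whisper_model!r}")

    cmd = ["uv", "run", "cad-doc", video]
    if output:
        cmd += ["--output", output]
    if bom:
        cmd.append("--bom")
    if atomizer_hints:
        cmd.append("--atomizer-hints")
    if pdf:
        cmd.append("--pdf")
    if api_provider:
        cmd += ["--api-provider", api_provider]
    if whisper_model:
        cmd += ["--whisper-model", whisper_model]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Building a list rather than a shell string avoids quoting problems with paths that contain spaces.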
See [docs/USAGE.md](docs/USAGE.md) for full configuration options.
## Output
## Recording Tips
### Clawdbot Export (`clawdbot_export/`)
```
<session>/
├── frames/ # Extracted keyframes
│ ├── 00-01-30.png
│ └── ...
├── transcript.json # Whisper output with timestamps
└── metadata.json # Session info
```
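A consumer of the export directory might index the frames by time before matching them to the transcript. The sketch below assumes that frame file names encode `HH-MM-SS`, as the `00-01-30.png` example suggests. That reading is an assumption, and `frame_seconds` and `list_frames` are hypothetical names, not part of the tool.

```python
from __future__ import annotations

from pathlib import Path


def frame_seconds(frame_name: str) -> int:
    """Convert a frame name such as '00-01-30.png' to seconds.

    Assumes the stem follows an HH-MM-SS pattern (an assumption based on
    the example file name in the export layout).
    """
    hours, minutes, seconds = (int(part) for part in Path(frame_name).stem.split("-"))
    return hours * 3600 + minutes * 60 + seconds


def list_frames(session_dir: str) -> list[tuple[int, Path]]:
    """Return (seconds, path) pairs for every PNG in <session>/frames, in time order."""
    frames_dir = Path(session_dir) / "frames"
    return sorted((frame_seconds(p.name), p) for p in frames_dir.glob("*.png"))
```

Sorting on the parsed number of seconds rather than the raw file name keeps the order correct even if a different naming scheme drops the zero padding.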
For best results when recording:
### Full Processing
- 📄 **Markdown** — Structured documentation
- 📊 **BOM** — Components, materials, functions
- 🎯 **Atomizer hints** — FEA objectives & constraints
- 📑 **PDF** — Professional Atomaste-branded report
1. **Spin slowly** — Give the AI time to see each angle
2. **Name components** — "This is the main bracket..."
3. **Mention materials** — "Made of 6061 aluminum"
4. **Describe functions** — "This holds the motor"
5. **Note constraints** — "Must fit within 200mm"
6. **Point out features** — "These fillets reduce stress"
## Tips
1. **Narrate your recording** — Audio narration = rich documentation
2. **Collapse UI panels** — In NX: Ctrl+Shift+N to hide Assembly Navigator
3. **Use scene detection** — Enabled by default, captures meaningful frames
## Architecture
```
┌─────────────────────────────────────────────────────────────────────┐
│ CAD-DOCUMENTER │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Video │───►│ Frame │───►│ Vision │───►│ Struct │ │
│ │ Input │ │ Extract │ │ Analysis │ │ Output │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Audio │───►│ Whisper │───►│ Correlate│───►│ Generate │ │
│ │ Track │ │Transcribe│ │ Timeline │ │ Docs │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
└─────────────────────────────────────────────────────────────────────┘
```
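The "Correlate Timeline" stage in the diagram can be illustrated with a small sketch. It is not the project's implementation: `correlate` is a hypothetical function, and the segment shape (`start`, `end`, `text`, with times in seconds and segments sorted by `start`) is assumed from typical Whisper output.

```python
from __future__ import annotations

from bisect import bisect_right


def correlate(frame_times: list[float], segments: list[dict]) -> list[tuple[float, str | None]]:
    """Pair each frame timestamp with the text spoken at that moment.

    `segments` is assumed to be sorted by "start". Frames that fall
    outside every segment (silence) are paired with None.
    """
    starts = [seg["start"] for seg in segments]
    paired = []
    for t in frame_times:
        # Index of the last segment that starts at or before t.
        i = bisect_right(starts, t) - 1
        covered = i >= 0 and t <= segments[i]["end"]
        paired.append((t, segments[i]["text"] if covered else None))
    return paired


if __name__ == "__main__":
    demo_segments = [
        {"start": 0.0, "end": 5.0, "text": "This is the main bracket"},
        {"start": 5.0, "end": 10.0, "text": "Made of 6061 aluminum"},
    ]
    for time, text in correlate([2.0, 7.5, 12.0], demo_segments):
        print(time, text)
```

A binary search keeps the lookup cheap for long recordings. A real pipeline would likely also widen the window around each frame so that narration just before or after a scene change is not lost.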
## Atomizer Integration
CAD-Documenter generates FEA optimization hints for Atomizer:
```bash
cad-doc walkthrough.mp4 --atomizer-hints
```
Output `atomizer_hints.json`:
```json
{
"objectives": [{"name": "mass", "direction": "minimize"}],
"constraints": [{"type": "frequency", "value": ">100 Hz"}],
"parameters": ["thickness", "fillet_radius"],
"critical_regions": [{"feature": "fillet", "concern": "stress_concentration"}]
}
```
See [docs/ATOMIZER_INTEGRATION.md](docs/ATOMIZER_INTEGRATION.md) for details.
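Before handing the hints file to another tool, it can help to check that it has the expected shape. The sketch below is an assumption-laden example, not Atomizer's or CAD-Documenter's own validation. `load_hints` is a hypothetical helper, the four required keys come from the sample output above, and accepting `"maximize"` alongside `"minimize"` is a guess.

```python
import json

# Top-level keys seen in the sample atomizer_hints.json.
EXPECTED_KEYS = ("objectives", "constraints", "parameters", "critical_regions")
# "maximize" is assumed; only "minimize" appears in the sample.
DIRECTIONS = ("minimize", "maximize")


def load_hints(text: str) -> dict:
    """Parse hints JSON and raise ValueError if the basic shape is wrong."""
    hints = json.loads(text)
    missing = [key for key in EXPECTED_KEYS if key not in hints]
    if missing:
        raise ValueError(f"missing keys: {', '.join(missing)}")
    for objective in hints["objectives"]:
        if objective.get("direction") not in DIRECTIONS:
            raise ValueError(f"bad objective direction: {objective!r}")
    return hints
```

For a file on disk, `load_hints(Path("atomizer_hints.json").read_text())` would apply the same checks.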
## Python API
```python
from cad_documenter.pipeline import DocumentationPipeline
pipeline = DocumentationPipeline(
video_path="walkthrough.mp4",
output_dir="./docs"
)
results = pipeline.run_full_pipeline(
atomizer_hints=True,
bom=True,
pdf=True
)
```
## Project Structure
```
CAD-Documenter/
├── src/cad_documenter/
│ ├── cli.py # Command-line interface
│ ├── pipeline.py # Main orchestrator
│ ├── config.py # Configuration management
│ ├── cli.py # Main CLI
│ ├── cli_project.py # Project commands
│ ├── gui.py # Windows GUI
│ ├── pipeline.py # Processing orchestrator
│ ├── video_processor.py # Frame extraction
│ ├── audio_analyzer.py # Whisper transcription
│ ├── vision_analyzer.py # AI vision analysis
│ └── doc_generator.py # Output generation
│ ├── vision_analyzer.py # AI vision (API mode)
│ ├── incremental.py # Project processing
│ └── config.py # Configuration
├── prompts/ # AI prompts
├── templates/ # Jinja2 templates
├── tests/ # Test suite
├── docs/ # Documentation
└── examples/ # Example configs
├── templates/ # Output templates
└── tests/
```
## Roadmap
- [x] Core pipeline (frames, transcription, vision)
- [x] Configuration system
- [x] Atomizer hints extraction
- [x] BOM generation
- [ ] Part Manager integration (P/N lookup)
- [ ] Interactive review mode
- [ ] Gitea auto-publish
- [ ] SolidWorks add-in
## License
MIT