diff --git a/README.md b/README.md
index d9ca864..ab6f241 100644
--- a/README.md
+++ b/README.md
@@ -7,213 +7,171 @@ Transform video walkthroughs of CAD models into comprehensive, structured docume
 [![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 
-## The Problem
+## Features
 
-- **Documentation is tedious** — Engineers spend hours documenting CAD models manually
-- **Knowledge lives in heads** — Verbal explanations during reviews aren't captured
-- **CDR prep is painful** — Gathering images, writing descriptions, creating BOMs
-- **FEA setup requires context** — Atomizer needs model understanding that's often verbal
+- **Smart frame extraction** — Scene detection captures key moments, not every second
+- **Whisper transcription** — Local GPU transcription, no cloud dependency
+- **Hybrid workflow** — Export for Clawdbot processing (no API costs!)
+- **Windows GUI** — Easy project management with CustomTkinter
+- **Atomaste PDF** — Professional reports with engineering branding
 
-## The Solution
+## Quick Start (GUI)
 
-Record yourself explaining your CAD model. CAD-Documenter:
+```bash
+# Clone and install
+git clone http://192.168.86.50:3000/Antoine/CAD-Documenter.git
+cd CAD-Documenter
+uv sync
+uv pip install customtkinter
 
-1. **Extracts key frames** at scene changes (smart, not fixed intervals)
-2. **Transcribes your explanation** via Whisper
-3. **Analyzes visually** using GPT-4o or Claude vision
-4. **Correlates visual + verbal** to identify components
-5. **Generates documentation** with images, BOM, and Atomizer hints
+# Launch GUI
+uv run cad-doc-gui
+```
 
-### Output
+## Workflow Options
 
-- 📄 **Markdown documentation** — Structured, version-controlled
-- 📊 **Bill of Materials** — CSV with components, materials, functions
-- 🔧 **Component registry** — Detailed specs per component
-- 🎯 **Atomizer hints** — FEA objectives, constraints, parameters
-- 📑 **PDF** — Professional output via Atomaste Report Standard
+### Option A: Hybrid with Clawdbot (Recommended - No API Costs)
+
+```
+Windows (GUI)                         Clawdbot
+─────────────                         ────────
+1. Create project
+2. Add videos
+3. Process --export-only
+   → FFmpeg frames
+   → Whisper transcription
+   → clawdbot_export/
+                    ─────────►
+                                      4. "Process CAD report for X"
+                                         → Vision analysis (free)
+                                         → Vault markdown
+                                         → Atomaste PDF
+```
+
+**Export for Clawdbot:**
+```bash
+uv run cad-doc project init ./my-project -n "My Project"
+uv run cad-doc project add ./my-project recording.mp4
+uv run cad-doc project process ./my-project --export-only
+```
+
+### Option B: Standalone with API
+
+```bash
+export OPENAI_API_KEY="sk-your-key"  # or ANTHROPIC_API_KEY
+uv run cad-doc video.mp4 --bom --atomizer-hints --pdf
+```
 
 ## Installation
 
-```bash
-# Clone
-git clone http://100.80.199.40:3000/Antoine/CAD-Documenter.git
-cd CAD-Documenter
-
-# Install with uv
-uv sync
-
-# Or with pip
-pip install -e .
-```
-
 ### Requirements
-
 - Python 3.12+
-- ffmpeg (for video/audio processing)
-- OpenAI or Anthropic API key (for vision analysis)
+- FFmpeg
+- CUDA GPU (recommended for Whisper)
 
 ```bash
-# macOS
-brew install ffmpeg
-
-# Ubuntu/Debian
-sudo apt install ffmpeg
-
 # Windows (with chocolatey)
 choco install ffmpeg
+
+# Or download from https://ffmpeg.org/download.html
 ```
 
-## Quick Start
-
+### Install
 ```bash
-# Set API key
-export OPENAI_API_KEY="sk-your-key"
+git clone http://192.168.86.50:3000/Antoine/CAD-Documenter.git
+cd CAD-Documenter
+uv sync
 
-# Run
-uv run cad-doc walkthrough.mp4
-
-# With all features
-uv run cad-doc walkthrough.mp4 --bom --atomizer-hints --pdf
+# For GUI support
+uv pip install customtkinter
 ```
 
-## Usage
+## CLI Reference
 
+### GUI
 ```bash
-# Basic
-cad-doc video.mp4
-
-# Custom output directory
-cad-doc video.mp4 --output ./my_docs/
-
-# Full pipeline
-cad-doc video.mp4 --bom --atomizer-hints --pdf
-
-# Just extract frames
-cad-doc video.mp4 --frames-only
-
-# Use Anthropic instead of OpenAI
-cad-doc video.mp4 --api-provider anthropic
-
-# Better transcription (slower)
-cad-doc video.mp4 --whisper-model medium
+uv run cad-doc-gui
 ```
 
-## Configuration
-
-Create a config file:
-
+### Project Management
 ```bash
-cad-doc --init-config
-# Creates ~/.cad-documenter.toml
+# Create project
+uv run cad-doc project init ./my-project -n "Project Name"
+
+# Add videos
+uv run cad-doc project add ./my-project video.mp4
+
+# Process (export for Clawdbot)
+uv run cad-doc project process ./my-project --export-only
+
+# Process (with API)
+uv run cad-doc project process ./my-project
+
+# Check status
+uv run cad-doc project status ./my-project
+
+# Generate unified docs
+uv run cad-doc project generate ./my-project
 ```
 
-Or set environment variables:
-
+### Single Video (API mode)
 ```bash
-export OPENAI_API_KEY="sk-..."        # Required for OpenAI
-export ANTHROPIC_API_KEY="sk-..."     # Required for Anthropic
-export CAD_DOC_PROVIDER="anthropic"   # Override default provider
+uv run cad-doc video.mp4 [options]
+
+Options:
+  -o, --output PATH        Output directory
+  --frames-only            Only extract frames
+  --skip-transcription     Skip audio transcription
+  --atomizer-hints         Generate FEA optimization hints
+  --bom                    Generate Bill of Materials
+  --pdf                    Generate PDF output
+  --api-provider TEXT      openai or anthropic
+  --whisper-model TEXT     tiny/base/small/medium/large
 ```
 
-See [docs/USAGE.md](docs/USAGE.md) for full configuration options.
+## Output
 
-## Recording Tips
+### Clawdbot Export (`clawdbot_export/`)
+```
+/
+├── frames/              # Extracted keyframes
+│   ├── 00-01-30.png
+│   └── ...
+├── transcript.json      # Whisper output with timestamps
+└── metadata.json        # Session info
+```
 
-For best results when recording:
+### Full Processing
+- 📄 **Markdown** — Structured documentation
+- 📊 **BOM** — Components, materials, functions
+- 🎯 **Atomizer hints** — FEA objectives & constraints
+- 📑 **PDF** — Professional Atomaste-branded report
 
-1. **Spin slowly** — Give the AI time to see each angle
-2. **Name components** — "This is the main bracket..."
-3. **Mention materials** — "Made of 6061 aluminum"
-4. **Describe functions** — "This holds the motor"
-5. **Note constraints** — "Must fit within 200mm"
-6. **Point out features** — "These fillets reduce stress"
+## Tips
+
+1. **Narrate your recording** — Audio narration = rich documentation
+2. **Collapse UI panels** — In NX: Ctrl+Shift+N to hide Assembly Navigator
+3. **Use scene detection** — Enabled by default, captures meaningful frames
 
 ## Architecture
 
-```
-┌─────────────────────────────────────────────────────────────────────┐
-│                           CAD-DOCUMENTER                            │
-│                                                                     │
-│  ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐       │
-│  │  Video   │───►│  Frame   │───►│  Vision  │───►│  Struct  │       │
-│  │  Input   │    │ Extract  │    │ Analysis │    │  Output  │       │
-│  └──────────┘    └──────────┘    └──────────┘    └──────────┘       │
-│       │               │               │               │             │
-│       ▼               ▼               ▼               ▼             │
-│  ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐       │
-│  │  Audio   │───►│ Whisper  │───►│ Correlate│───►│ Generate │       │
-│  │  Track   │    │Transcribe│    │ Timeline │    │   Docs   │       │
-│  └──────────┘    └──────────┘    └──────────┘    └──────────┘       │
-└─────────────────────────────────────────────────────────────────────┘
-```
-
-## Atomizer Integration
-
-CAD-Documenter generates FEA optimization hints for Atomizer:
-
-```bash
-cad-doc walkthrough.mp4 --atomizer-hints
-```
-
-Output `atomizer_hints.json`:
-```json
-{
-  "objectives": [{"name": "mass", "direction": "minimize"}],
-  "constraints": [{"type": "frequency", "value": ">100 Hz"}],
-  "parameters": ["thickness", "fillet_radius"],
-  "critical_regions": [{"feature": "fillet", "concern": "stress_concentration"}]
-}
-```
-
-See [docs/ATOMIZER_INTEGRATION.md](docs/ATOMIZER_INTEGRATION.md) for details.
-
-## Python API
-
-```python
-from cad_documenter.pipeline import DocumentationPipeline
-
-pipeline = DocumentationPipeline(
-    video_path="walkthrough.mp4",
-    output_dir="./docs"
-)
-
-results = pipeline.run_full_pipeline(
-    atomizer_hints=True,
-    bom=True,
-    pdf=True
-)
-```
-
-## Project Structure
-
 ```
 CAD-Documenter/
 ├── src/cad_documenter/
-│   ├── cli.py              # Command-line interface
-│   ├── pipeline.py         # Main orchestrator
-│   ├── config.py           # Configuration management
+│   ├── cli.py              # Main CLI
+│   ├── cli_project.py      # Project commands
+│   ├── gui.py              # Windows GUI
+│   ├── pipeline.py         # Processing orchestrator
 │   ├── video_processor.py  # Frame extraction
 │   ├── audio_analyzer.py   # Whisper transcription
-│   ├── vision_analyzer.py  # AI vision analysis
-│   └── doc_generator.py    # Output generation
+│   ├── vision_analyzer.py  # AI vision (API mode)
+│   ├── incremental.py      # Project processing
+│   └── config.py           # Configuration
 ├── prompts/                # AI prompts
-├── templates/              # Jinja2 templates
-├── tests/                  # Test suite
-├── docs/                   # Documentation
-└── examples/               # Example configs
+├── templates/              # Output templates
+└── tests/
 ```
 
-## Roadmap
-
-- [x] Core pipeline (frames, transcription, vision)
-- [x] Configuration system
-- [x] Atomizer hints extraction
-- [x] BOM generation
-- [ ] Part Manager integration (P/N lookup)
-- [ ] Interactive review mode
-- [ ] Gitea auto-publish
-- [ ] SolidWorks add-in
-
 ## License
 
 MIT
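
The `clawdbot_export/` layout that this change documents (a `frames/` directory plus `transcript.json` and `metadata.json`) could be read back along the following lines. This is a minimal sketch and not code from the repository: `frame_seconds` and `load_export` are hypothetical helpers, the JSON schemas are left opaque because the diff does not describe them, and reading a frame name such as `00-01-30.png` as an HH-MM-SS timestamp is an assumption based only on the example filename.

```python
import json
from pathlib import Path


def frame_seconds(name: str) -> int:
    """Convert a frame filename to seconds.

    Assumption: names follow HH-MM-SS.png, as the example 00-01-30.png suggests.
    Any other naming scheme raises ValueError.
    """
    hours, minutes, seconds = (int(part) for part in Path(name).stem.split("-"))
    return hours * 3600 + minutes * 60 + seconds


def load_export(export_dir: str) -> dict:
    """Load one export directory (hypothetical helper, not part of cad-doc)."""
    root = Path(export_dir)
    # Order frames by the timestamp encoded in their names.
    frames = sorted(root.glob("frames/*.png"), key=lambda p: frame_seconds(p.name))
    # The JSON contents are returned as-is; their schema is not assumed here.
    transcript = json.loads((root / "transcript.json").read_text(encoding="utf-8"))
    metadata = json.loads((root / "metadata.json").read_text(encoding="utf-8"))
    return {"frames": frames, "transcript": transcript, "metadata": metadata}
```

Sorting on the parsed timestamp rather than on the raw filename keeps the frame order correct even if a different zero-padding were ever used, which is the only design choice this sketch makes.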