Anto01 659bc7fb2e Add Voice Recorder - Whisper transcription tool for Obsidian
Features:
- Audio recording with pause/resume and visual feedback
- Local Whisper transcription (tiny/base/small models)
- 7 note types: instructions, capture, meeting, idea, daily, review, journal
- Claude CLI integration for intelligent note processing
- PKM context integration (reads vault files for better processing)
- Auto-organization into type-specific folders
- Daily notes with yesterday's task carryover
- Language-adaptive responses (matches transcript language)
- Custom icon and Windows desktop shortcut helpers

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-27 19:51:53 -05:00

# Whisper Voice Memo Transcription for Obsidian
## Overview
A simple, free, local transcription setup that:
- Uses OpenAI Whisper (large-v3 model) for high-quality transcription
- Handles Canadian French and English audio
- Auto-detects the spoken language; results on mid-sentence language switches can vary, and a language hint can be forced if detection goes wrong
- Outputs formatted markdown notes to Obsidian
- Runs via conda environment `test_env`
---
## Configuration
| Setting | Value |
|---------|-------|
| Conda Environment | `test_env` |
| Output Directory | `C:\Users\antoi\antoine\My Libraries\Antoine Brain Extension\+\Transcripts` |
| Model | `openai/whisper-large-v3` |
| Supported Formats | mp3, m4a, wav, ogg, flac, webm |
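Before starting a long transcription, it can help to check that a file has one of the supported extensions. The following is a minimal sketch; `SUPPORTED_EXTENSIONS` and `is_supported_audio` are hypothetical names, and the check looks only at the file extension, not at the audio content.

```python
from pathlib import Path

# Extensions copied from the "Supported Formats" row of the table above.
SUPPORTED_EXTENSIONS = {".mp3", ".m4a", ".wav", ".ogg", ".flac", ".webm"}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension is one of the supported formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ("memo.M4A", "notes.txt"):
        print(name, is_supported_audio(name))
```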
---
## Installation
### Step 1: Activate conda environment
```bash
conda activate test_env
```
### Step 2: Install insanely-fast-whisper
```bash
pip install insanely-fast-whisper
```
### Step 3: Verify installation
```bash
insanely-fast-whisper --help
```
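The same check can be done from Python, which is handy when a script needs to fail early with a clear message. This sketch only confirms that the command is on the `PATH` of the current environment; `find_cli` is a hypothetical helper name.

```python
import shutil


def find_cli(name: str = "insanely-fast-whisper"):
    """Return the full path of the command if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    location = find_cli()
    if location:
        print(f"Found CLI at: {location}")
    else:
        print("CLI not found; is the conda environment activated?")
```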
---
## Batch Script: Transcribe.bat
Save this file to your Desktop or a convenient location.
**File:** `Transcribe.bat`
```batch
@echo off
setlocal enabledelayedexpansion
:: ============================================
:: CONFIGURATION - Edit these paths as needed
:: ============================================
set "OUTPUT_DIR=C:\Users\antoi\antoine\My Libraries\Antoine Brain Extension\+\Transcripts"
set "CONDA_ENV=test_env"
set "CONDA_PATH=C:\Users\antoi\anaconda3\Scripts\activate.bat"
:: ============================================
:: MAIN SCRIPT - No edits needed below
:: ============================================
:: Check if file was dragged onto script
if "%~1"=="" (
echo.
echo ========================================
echo Voice Memo Transcriber
echo ========================================
echo.
echo Drag an audio file onto this script!
echo Or paste the full path below:
echo.
set /p "AUDIO_FILE=File path: "
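) else (
set "AUDIO_FILE=%~1"
)
:: Strip quotes in case the path was pasted with them (e.g. via "Copy as path")
if defined AUDIO_FILE set "AUDIO_FILE=%AUDIO_FILE:"=%"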
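:: Generate a locale-independent timestamp for the filename
:: (parsing %date% and %time% gives different results depending on the regional date format)
for /f "usebackq delims=" %%t in (`powershell -NoProfile -Command "Get-Date -Format 'yyyy-MM-dd HH-mm'"`) do set "TIMESTAMP=%%t"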
set "NOTE_NAME=Voice Note %TIMESTAMP%.md"
set "TEMP_FILE=%TEMP%\whisper_output.txt"
echo.
echo ========================================
echo Transcribing: %AUDIO_FILE%
echo Output: %NOTE_NAME%
echo ========================================
echo.
echo This may take a few minutes for long recordings...
echo.
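:: Remove any stale output from a previous run so the success check below is reliable
del "%TEMP_FILE%" 2>nul
:: Activate conda environment and run whisper
call "%CONDA_PATH%" %CONDA_ENV%
insanely-fast-whisper --file-name "%AUDIO_FILE%" --transcript-path "%TEMP_FILE%" --model-name openai/whisper-large-v3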
:: Check if transcription succeeded
if not exist "%TEMP_FILE%" (
echo.
echo ERROR: Transcription failed!
echo Check that the audio file exists and is valid.
echo.
pause
exit /b 1
)
:: Create markdown note with YAML frontmatter
echo --- > "%OUTPUT_DIR%\%NOTE_NAME%"
echo created: %date% %time:~0,5% >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo type: voice-note >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo status: raw >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo tags: >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo - transcript >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo - voice-memo >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo --- >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo # Voice Note - %date% at %time:~0,5% >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo ## Metadata >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
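:: %~nx1 is empty when the path was typed at the prompt, so derive the name from AUDIO_FILE
for %%F in ("%AUDIO_FILE%") do echo - **Source file:** `%%~nxF` >> "%OUTPUT_DIR%\%NOTE_NAME%"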
echo - **Transcribed:** %date% %time:~0,5% >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo --- >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo ## Raw Transcript >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
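:: insanely-fast-whisper saves its result as JSON, so append only the "text" field
python -X utf8 -c "import json,sys; print(json.load(open(sys.argv[1], encoding='utf-8'))['text'].strip())" "%TEMP_FILE%" >> "%OUTPUT_DIR%\%NOTE_NAME%"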
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo --- >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo ## Notes distillees >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo ^<!-- Coller le transcript dans Claude pour organiser et distiller --^> >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
:: Cleanup temp file
del "%TEMP_FILE%" 2>nul
echo.
echo ========================================
echo DONE!
echo Created: %NOTE_NAME%
echo Location: %OUTPUT_DIR%
echo ========================================
echo.
pause
```
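The file written to `--transcript-path` is a JSON document rather than plain text, so the transcript has to be pulled out of it before it goes into the note. Here is a minimal sketch of that step, assuming the JSON object has a top-level `text` key; `extract_text` is a hypothetical helper, and it falls back to the raw file contents if the file is not JSON.

```python
import json
from pathlib import Path


def extract_text(transcript_path: Path) -> str:
    """Return the plain transcript from the CLI's output file.

    Assumes a JSON object with a top-level "text" key. If the file is not
    valid JSON, or has no such key, the raw contents are returned instead.
    """
    raw = transcript_path.read_text(encoding="utf-8")
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return raw.strip()
    if isinstance(data, dict) and "text" in data:
        return str(data["text"]).strip()
    return raw.strip()
```

Calling `extract_text(Path("whisper_output.txt"))` after a run would return only the spoken text, without timestamps or chunk metadata.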
---
## Usage
### Method 1: Drag and Drop
1. Record your voice memo (any app)
2. Drag the audio file onto `Transcribe.bat`
3. Wait for the transcription to finish (a few minutes for a 30-minute recording)
4. Find your note in Obsidian
### Method 2: Double-click and Paste Path
1. Double-click `Transcribe.bat`
2. Paste the full path to your audio file
3. Press Enter
4. Wait for transcription
---
## Output Format
Each transcription creates a markdown file like this:
```markdown
---
created: 2026-01-15 14:30
type: voice-note
status: raw
tags:
- transcript
- voice-memo
---
# Voice Note - 2026-01-15 at 14:30
## Metadata
- **Source file:** `recording.m4a`
- **Transcribed:** 2026-01-15 14:30
---
## Raw Transcript
[Your transcribed text appears here...]
---
## Notes distillees
<!-- Coller le transcript dans Claude pour organiser et distiller -->
```
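Because every note starts with the same front matter, a script can tell which notes are still unprocessed by looking for `status: raw`. The sketch below does this with plain string handling instead of a YAML library; `split_front_matter` and `is_raw` are hypothetical helper names, and the parsing handles only the simple layout shown above.

```python
def split_front_matter(note_text: str):
    """Split a note into (front_matter_lines, body).

    Returns ([], note_text) when the note does not start with a '---' block.
    """
    lines = note_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return [], note_text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            return lines[1:index], "\n".join(lines[index + 1:])
    return [], note_text  # opening '---' without a closing one


def is_raw(note_text: str) -> bool:
    """Return True if the front matter contains the line 'status: raw'."""
    front_matter, _ = split_front_matter(note_text)
    return any(line.strip() == "status: raw" for line in front_matter)
```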
---
## Processing with Claude
After transcription, use this prompt template to organize your notes:
```
Here is a transcript of voice notes in French and English.
Can you:
1. Fix obvious transcription errors
2. Organize by theme/topic
3. Extract the key points and action items
4. Reformat as structured notes
Keep the original content but make it more readable.
---
[PASTE THE TRANSCRIPT HERE]
```
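If you process many notes, filling in the template by hand gets tedious. A small sketch of the same idea in code: it joins the instruction text and the transcript with the `---` separator used in the template. `build_prompt` is a hypothetical helper, and sending the prompt to Claude is left out on purpose.

```python
def build_prompt(instructions: str, transcript: str) -> str:
    """Join the instructions and the transcript with a '---' separator line."""
    return f"{instructions.strip()}\n---\n{transcript.strip()}\n"


if __name__ == "__main__":
    # Stand-in instructions; in practice, paste the template text from above.
    print(build_prompt("Fix obvious errors and organize by topic.", "hello world"))
```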
---
## Troubleshooting
### "conda is not recognized"
- Verify that conda is on your PATH: `where conda`
- Update `CONDA_PATH` in the script so it points to the `activate.bat` of your own installation
### Transcription takes too long
- The `large-v3` model is accurate but slow, particularly without GPU acceleration
- For faster (less accurate) results, change the model to:
```
--model-name openai/whisper-medium
```
or
```
--model-name openai/whisper-small
```
### GPU acceleration
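If you have an NVIDIA GPU, install a CUDA-enabled PyTorch build. The example below targets CUDA 11.8 (`cu118`); use the index URL that matches your CUDA version: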
```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```
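After installing, you may want to confirm that PyTorch can actually see the GPU. A minimal sketch, assuming PyTorch may or may not be installed in the active environment; `cuda_available` is a hypothetical helper that returns `False` in both the "not installed" and "no GPU visible" cases.

```python
def cuda_available() -> bool:
    """Return True if PyTorch is installed and reports a usable CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    print("CUDA available:", cuda_available())
```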
### Wrong language detected
Add a language hint to the whisper command:
```bash
insanely-fast-whisper --file-name "audio.mp3" --transcript-path "output.txt" --model-name openai/whisper-large-v3 --language fr
```
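When the model or language changes from one recording to another, it can be easier to build the command from parameters than to edit the script each time. The sketch below only assembles the argument list, using the flags already shown in this document; `build_command` is a hypothetical helper, and the result is meant to be passed to `subprocess.run`.

```python
def build_command(audio_path, transcript_path,
                  model="openai/whisper-large-v3", language=None):
    """Return the insanely-fast-whisper argument list for one transcription.

    The --language flag is added only when a language hint is given.
    """
    command = [
        "insanely-fast-whisper",
        "--file-name", str(audio_path),
        "--transcript-path", str(transcript_path),
        "--model-name", model,
    ]
    if language:
        command += ["--language", language]
    return command


if __name__ == "__main__":
    print(build_command("audio.mp3", "output.txt",
                        model="openai/whisper-small", language="fr"))
```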
---
## Alternative: Python Script Version
For more control or integration with other tools:
**File:** `transcribe.py`
```python
import json
import subprocess
import sys
import tempfile
from datetime import datetime
from pathlib import Path

# Configuration
OUTPUT_DIR = Path(r"C:\Users\antoi\antoine\My Libraries\Antoine Brain Extension\+\Transcripts")
MODEL = "openai/whisper-large-v3"


def transcribe(audio_path: str) -> None:
    audio_file = Path(audio_path)
    now = datetime.now()  # single timestamp so the filename and note body agree
    stamp = now.strftime("%Y-%m-%d %H:%M")
    note_name = f"Voice Note {now.strftime('%Y-%m-%d %H-%M')}.md"
    temp_file = Path(tempfile.gettempdir()) / "whisper_output.json"

    print(f"\n🎙 Transcribing: {audio_file.name}")
    print(f"📝 Output: {note_name}\n")

    # Run whisper; check=True raises CalledProcessError if the CLI fails
    subprocess.run(
        [
            "insanely-fast-whisper",
            "--file-name", str(audio_file),
            "--transcript-path", str(temp_file),
            "--model-name", MODEL,
        ],
        check=True,
    )

    # The CLI saves its result as JSON; the transcript is under the "text" key
    result = json.loads(temp_file.read_text(encoding="utf-8"))
    transcript = result["text"].strip()
    temp_file.unlink()  # clean up the temp file, as the batch script does

    # Create markdown note
    note_content = f"""---
created: {stamp}
type: voice-note
status: raw
tags:
- transcript
- voice-memo
---
# Voice Note - {now.strftime("%Y-%m-%d")} at {now.strftime("%H:%M")}
## Metadata
- **Source file:** `{audio_file.name}`
- **Transcribed:** {stamp}
---
## Raw Transcript
{transcript}
---
## Notes distillees
<!-- Coller le transcript dans Claude pour organiser et distiller -->
"""
    output_path = OUTPUT_DIR / note_name
    output_path.write_text(note_content, encoding="utf-8")
    print(f"\n✅ Done! Created: {note_name}")
    print(f"📁 Location: {OUTPUT_DIR}")


if __name__ == "__main__":
    if len(sys.argv) > 1:
        transcribe(sys.argv[1])
    else:
        # Strip the quotes that Windows adds when a path is pasted
        audio = input("Enter audio file path: ").strip().strip('"')
        transcribe(audio)
```
Run with:
```bash
conda activate test_env
python transcribe.py "path/to/audio.mp3"
```
---
## Next Steps
- [ ] Install `insanely-fast-whisper` in `test_env`
- [ ] Save `Transcribe.bat` to Desktop
- [ ] Test with a short audio clip
- [ ] Pin to taskbar for quick access
- [ ] Set up Claude prompt template for processing
---
## Resources
- [insanely-fast-whisper GitHub](https://github.com/Vaibhavs10/insanely-fast-whisper)
- [OpenAI Whisper](https://github.com/openai/whisper)
- [Whisper model comparison](https://github.com/openai/whisper#available-models-and-languages)