# Whisper Voice Memo Transcription for Obsidian

## Overview

A simple, free, local transcription setup that:

- Uses OpenAI Whisper (large-v3 model) for high-quality transcription
- Handles French Canadian accent and English seamlessly
- Auto-detects language switches mid-sentence
- Outputs formatted markdown notes to Obsidian
- Runs via conda environment `test_env`

---

## Configuration

| Setting | Value |
|---------|-------|
| Conda Environment | `test_env` |
| Output Directory | `C:\Users\antoi\antoine\My Libraries\Antoine Brain Extension\+\Transcripts` |
| Model | `openai/whisper-large-v3` |
| Supported Formats | mp3, m4a, wav, ogg, flac, webm |

---

## Installation

### Step 1: Activate conda environment

```bash
conda activate test_env
```

### Step 2: Install insanely-fast-whisper

```bash
pip install insanely-fast-whisper
```

### Step 3: Verify installation

```bash
insanely-fast-whisper --help
```

---

## Batch Script: Transcribe.bat

Save this file to your Desktop or a convenient location.

**File:** `Transcribe.bat`

```batch
@echo off
setlocal enabledelayedexpansion

:: ============================================
:: CONFIGURATION - Edit these paths as needed
:: ============================================

set "OUTPUT_DIR=C:\Users\antoi\antoine\My Libraries\Antoine Brain Extension\+\Transcripts"
set "CONDA_ENV=test_env"
set "CONDA_PATH=C:\Users\antoi\anaconda3\Scripts\activate.bat"

:: ============================================
:: MAIN SCRIPT - No edits needed below
:: ============================================

:: Check if file was dragged onto script
if "%~1"=="" (
    echo.
    echo ========================================
    echo   Voice Memo Transcriber
    echo ========================================
    echo.
    echo Drag an audio file onto this script!
    echo Or paste the full path below:
    echo.
    set /p "AUDIO_FILE=File path: "
) else (
    set "AUDIO_FILE=%~1"
)

:: Stop if no path was given, then strip quotes from a pasted path
if not defined AUDIO_FILE (
    echo ERROR: No audio file specified.
    pause
    exit /b 1
)
set "AUDIO_FILE=%AUDIO_FILE:"=%"

:: Source file name for the note metadata (works for dragged and typed paths)
for %%F in ("%AUDIO_FILE%") do set "SOURCE_NAME=%%~nxF"

:: Generate timestamp for filename
:: NOTE: the token order depends on the Windows regional date format;
:: reorder the tokens below if the filename comes out scrambled.
for /f "tokens=1-5 delims=/:.- " %%a in ("%date% %time%") do (
    set "TIMESTAMP=%%c-%%a-%%b %%d-%%e"
)

set "NOTE_NAME=Voice Note %TIMESTAMP%.md"
set "NOTE_PATH=%OUTPUT_DIR%\%NOTE_NAME%"
set "TEMP_FILE=%TEMP%\whisper_output.txt"

echo.
echo ========================================
echo   Transcribing: %AUDIO_FILE%
echo   Output: %NOTE_NAME%
echo ========================================
echo.
echo This may take a few minutes for long recordings...
echo.

:: Remove any transcript left over from a previous run so the check below is reliable
if exist "%TEMP_FILE%" del "%TEMP_FILE%"

:: Activate conda environment and run whisper
call "%CONDA_PATH%" %CONDA_ENV%
insanely-fast-whisper --file-name "%AUDIO_FILE%" --transcript-path "%TEMP_FILE%" --model-name openai/whisper-large-v3

:: Check if transcription succeeded
if not exist "%TEMP_FILE%" (
    echo.
    echo ERROR: Transcription failed.
    echo Check that the audio file exists and is valid.
    echo.
    pause
    exit /b 1
)

:: Create markdown note with YAML frontmatter
echo --- > "%NOTE_PATH%"
echo created: %date% %time:~0,5% >> "%NOTE_PATH%"
echo type: voice-note >> "%NOTE_PATH%"
echo status: raw >> "%NOTE_PATH%"
echo tags: >> "%NOTE_PATH%"
echo   - transcript >> "%NOTE_PATH%"
echo   - voice-memo >> "%NOTE_PATH%"
echo --- >> "%NOTE_PATH%"
echo. >> "%NOTE_PATH%"
echo # Voice Note - %date% at %time:~0,5% >> "%NOTE_PATH%"
echo. >> "%NOTE_PATH%"
echo ## Metadata >> "%NOTE_PATH%"
echo. >> "%NOTE_PATH%"
echo - **Source file:** `%SOURCE_NAME%` >> "%NOTE_PATH%"
echo - **Transcribed:** %date% %time:~0,5% >> "%NOTE_PATH%"
echo. >> "%NOTE_PATH%"
echo --- >> "%NOTE_PATH%"
echo. >> "%NOTE_PATH%"
echo ## Raw Transcript >> "%NOTE_PATH%"
echo. >> "%NOTE_PATH%"

:: NOTE: insanely-fast-whisper typically saves its transcript as JSON.
:: If this section shows raw JSON, use the Python version further down,
:: which extracts the plain text.
type "%TEMP_FILE%" >> "%NOTE_PATH%"

echo. >> "%NOTE_PATH%"
echo. >> "%NOTE_PATH%"
echo --- >> "%NOTE_PATH%"
echo. >> "%NOTE_PATH%"
echo ## Notes distillees >> "%NOTE_PATH%"
echo. >> "%NOTE_PATH%"

:: Report the result and keep the window open when launched by drag-and-drop
echo.
echo Done. Created: %NOTE_NAME%
echo Location: %OUTPUT_DIR%
echo.
pause
```

---

## Processing with Claude

After transcription, use this prompt template to organize your notes:

```
Here is a transcript of voice notes in French/English.

Can you:
1. Correct obvious transcription errors
2. Organize it by theme/topic
3. Extract the key points and action items
4. Reformat it into structured notes

Keep the original content but make it more readable.

---
[PASTE THE TRANSCRIPT HERE]
```

---

## Troubleshooting

### "conda is not recognized"

- Verify the conda path: `where conda`
- Update `CONDA_PATH` in the script to match your installation

### Transcription takes too long

- The `large-v3` model is accurate but slow on CPU
- For faster (less accurate) results, change the model to:

  ```
  --model-name openai/whisper-medium
  ```

  or

  ```
  --model-name openai/whisper-small
  ```

### GPU acceleration

If you have an NVIDIA GPU, install a CUDA-enabled PyTorch build (this example targets CUDA 11.8):

```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```

### Wrong language detected

Add a language hint to the whisper command:

```bash
insanely-fast-whisper --file-name "audio.mp3" --transcript-path "output.txt" --model-name openai/whisper-large-v3 --language fr
```

---

## Alternative: Python Script Version

For more control or integration with other tools:

**File:** `transcribe.py`

```python
import json
import subprocess
import sys
import tempfile
from datetime import datetime
from pathlib import Path

# Configuration
OUTPUT_DIR = Path(r"C:\Users\antoi\antoine\My Libraries\Antoine Brain Extension\+\Transcripts")
MODEL = "openai/whisper-large-v3"


def transcribe(audio_path: str) -> Path:
    audio_file = Path(audio_path)
    if not audio_file.is_file():
        sys.exit(f"Audio file not found: {audio_file}")

    # Take one timestamp so the filename and the note body always agree
    now = datetime.now()
    stamp = now.strftime("%Y-%m-%d %H:%M")
    note_name = f"Voice Note {now.strftime('%Y-%m-%d %H-%M')}.md"

    temp_file = Path(tempfile.gettempdir()) / "whisper_output.txt"
    temp_file.unlink(missing_ok=True)  # drop any transcript left from a previous run

    print(f"\n🎙️ Transcribing: {audio_file.name}")
    print(f"📝 Output: {note_name}\n")

    # Run whisper; check=True raises CalledProcessError if the CLI exits with an error
    subprocess.run([
        "insanely-fast-whisper",
        "--file-name", str(audio_file),
        "--transcript-path", str(temp_file),
        "--model-name", MODEL,
    ], check=True)

    # Read transcript. insanely-fast-whisper typically saves JSON with a "text"
    # field; fall back to the raw file contents if the output is not JSON.
    raw = temp_file.read_text(encoding="utf-8")
    try:
        transcript = json.loads(raw)["text"].strip()
    except (json.JSONDecodeError, KeyError, TypeError, AttributeError):
        transcript = raw.strip()

    # Create markdown note
    note_content = f"""---
created: {stamp}
type: voice-note
status: raw
tags:
  - transcript
  - voice-memo
---

# Voice Note - {now.strftime("%Y-%m-%d")} at {now.strftime("%H:%M")}

## Metadata

- **Source file:** `{audio_file.name}`
- **Transcribed:** {stamp}

---

## Raw Transcript

{transcript}

---

## Notes distillees

"""

    OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
    output_path = OUTPUT_DIR / note_name
    output_path.write_text(note_content, encoding="utf-8")

    print(f"\n✅ Done! Created: {note_name}")
    print(f"📁 Location: {OUTPUT_DIR}")
    return output_path


if __name__ == "__main__":
    if len(sys.argv) > 1:
        transcribe(sys.argv[1])
    else:
        audio = input("Enter audio file path: ").strip().strip('"')
        transcribe(audio)
```

Run with:

```bash
conda activate test_env
python transcribe.py "path/to/audio.mp3"
```

---

## Next Steps

- [ ] Install `insanely-fast-whisper` in `test_env`
- [ ] Save `Transcribe.bat` to Desktop
- [ ] Test with a short audio clip
- [ ] Pin to taskbar for quick access
- [ ] Set up Claude prompt template for processing

---

## Resources

- [insanely-fast-whisper GitHub](https://github.com/Vaibhavs10/insanely-fast-whisper)
- [OpenAI Whisper](https://github.com/openai/whisper)
- [Whisper model comparison](https://github.com/openai/whisper#available-models-and-languages)
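
---

## Sketch: Building the Whisper Command

The Troubleshooting tips (switching to a smaller model, adding `--language`) and the supported-format list from the Configuration table could be folded into one small helper. The sketch below is only an illustration and not part of the scripts above: `SUPPORTED_FORMATS`, `is_supported`, and `build_command` are hypothetical names, and it uses only the CLI flags already shown in this document.

```python
from pathlib import Path
from typing import List, Optional

# Extensions taken from the "Supported Formats" row of the Configuration table.
SUPPORTED_FORMATS = {".mp3", ".m4a", ".wav", ".ogg", ".flac", ".webm"}


def is_supported(audio_path: str) -> bool:
    """Return True if the file extension is in the supported list (case-insensitive)."""
    return Path(audio_path).suffix.lower() in SUPPORTED_FORMATS


def build_command(
    audio_path: str,
    transcript_path: str,
    model: str = "openai/whisper-large-v3",
    language: Optional[str] = None,
) -> List[str]:
    """Assemble the insanely-fast-whisper argument list (hypothetical helper).

    `language` is optional; when given (e.g. "fr") it is passed as --language.
    """
    if not is_supported(audio_path):
        raise ValueError(f"Unsupported audio format: {audio_path}")

    cmd = [
        "insanely-fast-whisper",
        "--file-name", str(audio_path),
        "--transcript-path", str(transcript_path),
        "--model-name", model,
    ]
    if language:
        cmd += ["--language", language]
    return cmd


if __name__ == "__main__":
    # Example: a faster model with a French language hint
    print(build_command("memo.m4a", "whisper_output.txt",
                        model="openai/whisper-medium", language="fr"))
```

The returned list can be passed directly to `subprocess.run(...)`, which avoids quoting problems with paths that contain spaces.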