Whisper Voice Memo Transcription for Obsidian
Overview
A simple, free, local transcription setup that:
- Uses OpenAI Whisper (large-v3 model) for high-quality transcription
- Handles Canadian French accents and English in the same recording
- Auto-detects language switches mid-sentence
- Outputs formatted markdown notes to Obsidian
- Runs in the conda environment test_env
Configuration
| Setting | Value |
|---|---|
| Conda Environment | test_env |
| Output Directory | C:\Users\antoi\antoine\My Libraries\Antoine Brain Extension\+\Transcripts |
| Model | openai/whisper-large-v3 |
| Supported Formats | mp3, m4a, wav, ogg, flac, webm |
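The supported formats in the table can be checked before a long transcription run starts. The following is a minimal sketch, not part of the original scripts; the names SUPPORTED_FORMATS and is_supported are hypothetical.

```python
from pathlib import Path

# Extensions taken from the "Supported Formats" row of the table above.
SUPPORTED_FORMATS = {"mp3", "m4a", "wav", "ogg", "flac", "webm"}


def is_supported(audio_path: str) -> bool:
    """Return True if the file extension is one of the supported formats."""
    # Compare case-insensitively and without the leading dot.
    suffix = Path(audio_path).suffix.lower().lstrip(".")
    return suffix in SUPPORTED_FORMATS


if __name__ == "__main__":
    for name in ("memo.M4A", "memo.wav", "notes.docx"):
        print(name, is_supported(name))
```

Checking the extension only catches obvious mistakes; a file with a valid extension can still fail to decode.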
Installation
Step 1: Activate conda environment
conda activate test_env
Step 2: Install insanely-fast-whisper
pip install insanely-fast-whisper
Step 3: Verify installation
insanely-fast-whisper --help
Batch Script: Transcribe.bat
Save this file to your Desktop or a convenient location.
File: Transcribe.bat
@echo off
setlocal enabledelayedexpansion
:: ============================================
:: CONFIGURATION - Edit these paths as needed
:: ============================================
set "OUTPUT_DIR=C:\Users\antoi\antoine\My Libraries\Antoine Brain Extension\+\Transcripts"
set "CONDA_ENV=test_env"
set "CONDA_PATH=C:\Users\antoi\anaconda3\Scripts\activate.bat"
:: ============================================
:: MAIN SCRIPT - No edits needed below
:: ============================================
:: Check if file was dragged onto script
if "%~1"=="" (
echo.
echo ========================================
echo Voice Memo Transcriber
echo ========================================
echo.
echo Drag an audio file onto this script!
echo Or paste the full path below:
echo.
set /p "AUDIO_FILE=File path: "
) else (
set "AUDIO_FILE=%~1"
)
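:: Generate timestamp for filename
:: NOTE: the layout of the DATE and TIME variables depends on Windows regional
:: settings; adjust the token order below if the filename comes out scrambled.
for /f "tokens=1-5 delims=/:.- " %%a in ("%date% %time%") do (
set "TIMESTAMP=%%c-%%a-%%b %%d-%%e"
)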
set "NOTE_NAME=Voice Note %TIMESTAMP%.md"
set "TEMP_FILE=%TEMP%\whisper_output.txt"
echo.
echo ========================================
echo Transcribing: %AUDIO_FILE%
echo Output: %NOTE_NAME%
echo ========================================
echo.
echo This may take a few minutes for long recordings...
echo.
:: Activate conda environment and run whisper
call "%CONDA_PATH%" %CONDA_ENV%
insanely-fast-whisper --file-name "%AUDIO_FILE%" --transcript-path "%TEMP_FILE%" --model-name openai/whisper-large-v3
:: Check if transcription succeeded
if not exist "%TEMP_FILE%" (
echo.
echo ERROR: Transcription failed!
echo Check that the audio file exists and is valid.
echo.
pause
exit /b 1
)
:: Create markdown note with YAML frontmatter
echo --- > "%OUTPUT_DIR%\%NOTE_NAME%"
echo created: %date% %time:~0,5% >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo type: voice-note >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo status: raw >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo tags: >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo - transcript >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo - voice-memo >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo --- >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo # Voice Note - %date% at %time:~0,5% >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo ## Metadata >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
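:: Take the name from AUDIO_FILE so it is also filled in when the path was pasted
for %%F in ("%AUDIO_FILE%") do echo - **Source file:** `%%~nxF` >> "%OUTPUT_DIR%\%NOTE_NAME%"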
echo - **Transcribed:** %date% %time:~0,5% >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo --- >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo ## Raw Transcript >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
type "%TEMP_FILE%" >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo --- >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo ## Notes distillees >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo ^<!-- Coller le transcript dans Claude pour organiser et distiller --^> >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
:: Cleanup temp file
del "%TEMP_FILE%" 2>nul
echo.
echo ========================================
echo DONE!
echo Created: %NOTE_NAME%
echo Location: %OUTPUT_DIR%
echo ========================================
echo.
pause
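The timestamp step in the script parses the Windows date and time variables, whose layout depends on regional settings, so the token order may need adjusting on some machines. As a point of comparison, here is a minimal Python sketch that builds the same style of note name independent of locale; note_name is a hypothetical helper name.

```python
from datetime import datetime


def note_name(now: datetime) -> str:
    """Build a note filename like 'Voice Note 2026-01-15 14-30.md'."""
    # strftime with explicit fields does not depend on regional date settings.
    return f"Voice Note {now.strftime('%Y-%m-%d %H-%M')}.md"


if __name__ == "__main__":
    print(note_name(datetime.now()))
```

The hyphen in the time part is deliberate: Windows filenames cannot contain a colon.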
Usage
Method 1: Drag and Drop
- Record your voice memo (any app)
- Drag the audio file onto Transcribe.bat
- Wait for transcription (a few minutes for a 30-minute recording)
- Find your note in Obsidian
Method 2: Double-click and Paste Path
- Double-click Transcribe.bat
- Paste the full path to your audio file
- Press Enter
- Wait for transcription
Output Format
Each transcription creates a markdown file like this:
---
created: 2026-01-15 14:30
type: voice-note
status: raw
tags:
- transcript
- voice-memo
---
# Voice Note - 2026-01-15 at 14:30
## Metadata
- **Source file:** `recording.m4a`
- **Transcribed:** 2026-01-15 14:30
---
## Raw Transcript
[Your transcribed text appears here...]
---
## Notes distillees
<!-- Coller le transcript dans Claude pour organiser et distiller -->
Processing with Claude
After transcription, use this prompt template to organize your notes:
Here is a transcript of voice notes in French/English.
Can you:
1. Correct obvious transcription errors
2. Organize by theme/topic
3. Extract the key points and action items
4. Reformat into structured notes
Keep the original content but make it more readable.
---
[PASTE THE TRANSCRIPT HERE]
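Filling the template by hand works, but the step can also be scripted. The sketch below is only an illustration: PROMPT_TEMPLATE restates the template above in English, and build_prompt is a hypothetical helper, not part of the original setup.

```python
# English restatement of the prompt template (an assumption for illustration;
# substitute the exact template you actually use).
PROMPT_TEMPLATE = """Here is a transcript of voice notes in French/English.
Can you:
1. Correct obvious transcription errors
2. Organize by theme/topic
3. Extract the key points and action items
4. Reformat into structured notes
Keep the original content but make it more readable.
---
{transcript}"""


def build_prompt(transcript: str) -> str:
    """Insert the transcript into the prompt template."""
    # str.replace avoids str.format errors when the transcript contains braces.
    return PROMPT_TEMPLATE.replace("{transcript}", transcript.strip())


if __name__ == "__main__":
    print(build_prompt("Rappel: call the dentist tomorrow."))
```

The resulting string can be pasted into Claude as-is.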
Troubleshooting
"conda is not recognized"
- Verify conda path: where conda
- Update CONDA_PATH in the script to match your installation
Transcription takes too long
- The large-v3 model is accurate but slow on CPU
- For faster (less accurate) results, change the model to --model-name openai/whisper-medium or --model-name openai/whisper-small
GPU acceleration
If you have an NVIDIA GPU, install CUDA support:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
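After installing, it can help to confirm that PyTorch actually sees the GPU. This is a minimal sketch, assuming PyTorch is the backend in use; cuda_available is a hypothetical helper name, and it returns False when torch is not installed at all.

```python
def cuda_available() -> bool:
    """Report whether PyTorch is installed and can see a CUDA device."""
    try:
        import torch  # imported lazily so the check also works without PyTorch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    print("CUDA available:", cuda_available())
```

If this prints False on a machine with an NVIDIA GPU, the installed torch build is probably CPU-only or does not match the driver's CUDA version.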
Wrong language detected
Add language hint to the whisper command:
insanely-fast-whisper --file-name "audio.mp3" --transcript-path "output.txt" --model-name openai/whisper-large-v3 --language fr
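When the command is launched from Python, the language hint can be made optional. The sketch below assembles the argument list using only the flags already shown in this guide; build_command and its parameter names are hypothetical.

```python
from typing import List, Optional


def build_command(audio: str, transcript: str,
                  model: str = "openai/whisper-large-v3",
                  language: Optional[str] = None) -> List[str]:
    """Assemble the insanely-fast-whisper argument list used in this guide."""
    cmd = [
        "insanely-fast-whisper",
        "--file-name", audio,
        "--transcript-path", transcript,
        "--model-name", model,
    ]
    if language:
        # Only pass --language when a hint is given; otherwise the model
        # is left to detect the language on its own.
        cmd += ["--language", language]
    return cmd


if __name__ == "__main__":
    print(build_command("audio.mp3", "output.txt", language="fr"))
```

The returned list can be handed to subprocess.run(..., check=True), which avoids quoting problems with paths that contain spaces.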
Alternative: Python Script Version
For more control or integration with other tools:
File: transcribe.py
import subprocess
import sys
import tempfile
from datetime import datetime
from pathlib import Path

# Configuration
OUTPUT_DIR = Path(r"C:\Users\antoi\antoine\My Libraries\Antoine Brain Extension\+\Transcripts")
MODEL = "openai/whisper-large-v3"


def transcribe(audio_path: str):
    audio_file = Path(audio_path)
    # Take one timestamp so the filename and the note body always agree
    now = datetime.now()
    day = now.strftime("%Y-%m-%d")
    clock = now.strftime("%H:%M")
    note_name = f"Voice Note {day} {now.strftime('%H-%M')}.md"
    # Use the system temp directory instead of a hard-coded AppData path
    temp_file = Path(tempfile.gettempdir()) / "whisper_output.txt"

    print(f"\n🎙️ Transcribing: {audio_file.name}")
    print(f"📝 Output: {note_name}\n")

    # Run whisper; check=True raises CalledProcessError if the CLI fails
    subprocess.run([
        "insanely-fast-whisper",
        "--file-name", str(audio_file),
        "--transcript-path", str(temp_file),
        "--model-name", MODEL,
    ], check=True)

    # Read transcript, then clean up the temp file
    transcript = temp_file.read_text(encoding="utf-8")
    temp_file.unlink()

    # Create markdown note (blank lines mirror the batch script's output)
    note_content = f"""---
created: {day} {clock}
type: voice-note
status: raw
tags:
- transcript
- voice-memo
---

# Voice Note - {day} at {clock}

## Metadata

- **Source file:** `{audio_file.name}`
- **Transcribed:** {day} {clock}

---

## Raw Transcript

{transcript}

---

## Notes distillees

<!-- Coller le transcript dans Claude pour organiser et distiller -->
"""
    output_path = OUTPUT_DIR / note_name
    output_path.write_text(note_content, encoding="utf-8")

    print(f"\n✅ Done! Created: {note_name}")
    print(f"📁 Location: {OUTPUT_DIR}")


if __name__ == "__main__":
    if len(sys.argv) > 1:
        transcribe(sys.argv[1])
    else:
        # Strip whitespace and the quotes Windows adds when a path is pasted
        audio = input("Enter audio file path: ").strip().strip('"')
        transcribe(audio)
Run with:
conda activate test_env
python transcribe.py "path/to/audio.mp3"
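One thing to verify on your installation: the insanely-fast-whisper CLI may write its transcript as JSON rather than plain text, in which case the Raw Transcript section of the note would contain raw JSON. The sketch below assumes (this guide does not confirm it) a JSON object with a top-level "text" field, and falls back to the unmodified content otherwise; extract_text is a hypothetical helper.

```python
import json


def extract_text(raw: str) -> str:
    """Return the transcript text from CLI output that may be JSON or plain text."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return raw.strip()  # not JSON: treat the content as plain text
    if isinstance(data, dict) and isinstance(data.get("text"), str):
        return data["text"].strip()
    return raw.strip()  # JSON without a "text" field: keep it unchanged


if __name__ == "__main__":
    print(extract_text('{"text": " Bonjour, this is a test. ", "chunks": []}'))
```

In transcribe.py, the helper would wrap the result of temp_file.read_text(...) before the note is assembled.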
Next Steps
- Install insanely-fast-whisper in test_env
- Save Transcribe.bat to Desktop
- Test with a short audio clip
- Pin to taskbar for quick access
- Set up Claude prompt template for processing