CODEtomaste/Tool_Scripts/Whisper_Transcript/Whisper-Obsidian-Transcription-Setup.md
Anto01 659bc7fb2e Add Voice Recorder - Whisper transcription tool for Obsidian
Features:
- Audio recording with pause/resume and visual feedback
- Local Whisper transcription (tiny/base/small models)
- 7 note types: instructions, capture, meeting, idea, daily, review, journal
- Claude CLI integration for intelligent note processing
- PKM context integration (reads vault files for better processing)
- Auto-organization into type-specific folders
- Daily notes with yesterday's task carryover
- Language-adaptive responses (matches transcript language)
- Custom icon and Windows desktop shortcut helpers

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-27 19:51:53 -05:00

Whisper Voice Memo Transcription for Obsidian

Overview

A simple, free, local transcription setup that:

  • Uses OpenAI Whisper (large-v3 model) through the insanely-fast-whisper CLI for high-quality transcription
  • Handles both Canadian French and English speech
  • Auto-detects language switches mid-sentence
  • Outputs formatted markdown notes to Obsidian
  • Runs in the conda environment test_env

Configuration

| Setting | Value |
| --- | --- |
| Conda Environment | test_env |
| Output Directory | C:\Users\antoi\antoine\My Libraries\Antoine Brain Extension\+\Transcripts |
| Model | openai/whisper-large-v3 |
| Supported Formats | mp3, m4a, wav, ogg, flac, webm |
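
The supported formats listed above can be checked before a file is handed to Whisper. This is a minimal sketch that assumes the extension list from the configuration; `is_supported_audio` is a hypothetical helper name, not part of insanely-fast-whisper.

```python
from pathlib import Path

# Extensions taken from the "Supported Formats" row of the configuration
SUPPORTED_FORMATS = {".mp3", ".m4a", ".wav", ".ogg", ".flac", ".webm"}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension is one of the supported formats."""
    # Compare case-insensitively so "MEMO.M4A" is accepted too
    return Path(path).suffix.lower() in SUPPORTED_FORMATS
```

Calling `is_supported_audio("memo.m4a")` before transcription gives an early, readable error instead of a failure inside the transcription step.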

Installation

Step 1: Activate conda environment

conda activate test_env

Step 2: Install insanely-fast-whisper

pip install insanely-fast-whisper

Step 3: Verify installation

insanely-fast-whisper --help
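
The same check can be scripted. The sketch below only tests whether the executable is visible on PATH (which is what activating test_env should provide). It does not confirm that the model can be loaded, and `cli_available` is a hypothetical helper name.

```python
import shutil


def cli_available(name: str = "insanely-fast-whisper") -> bool:
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None
```

If `cli_available()` returns False, the conda environment is most likely not activated in the current shell.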

Batch Script: Transcribe.bat

Save this file to your Desktop or a convenient location.

File: Transcribe.bat

@echo off
setlocal enabledelayedexpansion

:: ============================================
:: CONFIGURATION - Edit these paths as needed
:: ============================================
set "OUTPUT_DIR=C:\Users\antoi\antoine\My Libraries\Antoine Brain Extension\+\Transcripts"
set "CONDA_ENV=test_env"
set "CONDA_PATH=C:\Users\antoi\anaconda3\Scripts\activate.bat"

:: ============================================
:: MAIN SCRIPT - No edits needed below
:: ============================================

:: Check if file was dragged onto script
if "%~1"=="" (
    echo.
    echo  ========================================
    echo   Voice Memo Transcriber
    echo  ========================================
    echo.
    echo   Drag an audio file onto this script!
    echo   Or paste the full path below:
    echo.
    set /p "AUDIO_FILE=File path: "
) else (
    set "AUDIO_FILE=%~1"
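)

:: Strip any quotes from the path (a pasted path often arrives wrapped in quotes)
if defined AUDIO_FILE set "AUDIO_FILE=%AUDIO_FILE:"=%"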

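:: Generate timestamp for filename
:: The built-in date and time variables follow the Windows regional settings,
:: so parsing them scrambles the fields on many systems; ask PowerShell for a
:: fixed yyyy-MM-dd HH-mm format instead
for /f "usebackq delims=" %%t in (`powershell -NoProfile -Command "Get-Date -Format 'yyyy-MM-dd HH-mm'"`) do (
    set "TIMESTAMP=%%t"
)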

set "NOTE_NAME=Voice Note %TIMESTAMP%.md"
set "TEMP_FILE=%TEMP%\whisper_output.txt"

echo.
echo  ========================================
echo  Transcribing: %AUDIO_FILE%
echo  Output: %NOTE_NAME%
echo  ========================================
echo.
echo  This may take a few minutes for long recordings...
echo.

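:: Remove any stale output so the success check below reflects this run only
del "%TEMP_FILE%" 2>nul

:: Activate conda environment and run whisper (path quoted in case it contains spaces)
call "%CONDA_PATH%" %CONDA_ENV%
insanely-fast-whisper --file-name "%AUDIO_FILE%" --transcript-path "%TEMP_FILE%" --model-name openai/whisper-large-v3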

:: Check if transcription succeeded
if not exist "%TEMP_FILE%" (
    echo.
    echo  ERROR: Transcription failed!
    echo  Check that the audio file exists and is valid.
    echo.
    pause
    exit /b 1
)

:: Create markdown note with YAML frontmatter
echo --- > "%OUTPUT_DIR%\%NOTE_NAME%"
echo created: %date% %time:~0,5% >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo type: voice-note >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo status: raw >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo tags: >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo   - transcript >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo   - voice-memo >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo --- >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo # Voice Note - %date% at %time:~0,5% >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo ## Metadata >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
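:: Derive the name from AUDIO_FILE so it is filled in even when the path was pasted
for %%F in ("%AUDIO_FILE%") do echo - **Source file:** `%%~nxF` >> "%OUTPUT_DIR%\%NOTE_NAME%"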
echo - **Transcribed:** %date% %time:~0,5% >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo --- >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo ## Raw Transcript >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
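:: Note: insanely-fast-whisper saves its result as JSON, so this appends the
:: JSON as-is; extract its "text" field first if you want plain prose in the note
type "%TEMP_FILE%" >> "%OUTPUT_DIR%\%NOTE_NAME%"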
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo --- >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo ## Notes distillees >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo ^<!-- Coller le transcript dans Claude pour organiser et distiller --^> >> "%OUTPUT_DIR%\%NOTE_NAME%"
echo. >> "%OUTPUT_DIR%\%NOTE_NAME%"

:: Cleanup temp file
del "%TEMP_FILE%" 2>nul

echo.
echo  ========================================
echo   DONE!
echo   Created: %NOTE_NAME%
echo   Location: %OUTPUT_DIR%
echo  ========================================
echo.
pause

Usage

Method 1: Drag and Drop

  1. Record your voice memo (any app)
  2. Drag the audio file onto Transcribe.bat
  3. Wait for transcription (a few minutes for a 30-minute recording)
  4. Find your note in Obsidian

Method 2: Double-click and Paste Path

  1. Double-click Transcribe.bat
  2. Paste the full path to your audio file
  3. Press Enter
  4. Wait for transcription

Output Format

Each transcription creates a markdown file like this:

---
created: 2026-01-15 14:30
type: voice-note
status: raw
tags:
  - transcript
  - voice-memo
---

# Voice Note - 2026-01-15 at 14:30

## Metadata

- **Source file:** `recording.m4a`
- **Transcribed:** 2026-01-15 14:30

---

## Raw Transcript

[Your transcribed text appears here...]

---

## Notes distillees

<!-- Coller le transcript dans Claude pour organiser et distiller -->
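
The layout above can be produced by a single pure function, which makes the format easy to test. This is a minimal sketch, not the author's implementation: `build_note` is a hypothetical name, and it takes one `now` value so every timestamp in the note agrees.

```python
from datetime import datetime


def build_note(transcript: str, source_name: str, now: datetime) -> str:
    """Render the note layout shown above for one transcript."""
    stamp = now.strftime("%Y-%m-%d %H:%M")
    lines = [
        "---",
        f"created: {stamp}",
        "type: voice-note",
        "status: raw",
        "tags:",
        "  - transcript",
        "  - voice-memo",
        "---",
        "",
        f"# Voice Note - {now:%Y-%m-%d} at {now:%H:%M}",
        "",
        "## Metadata",
        "",
        f"- **Source file:** `{source_name}`",
        f"- **Transcribed:** {stamp}",
        "",
        "---",
        "",
        "## Raw Transcript",
        "",
        transcript.strip(),
        "",
        "---",
        "",
        "## Notes distillees",
        "",
        "<!-- Coller le transcript dans Claude pour organiser et distiller -->",
        "",
    ]
    return "\n".join(lines)
```

Passing `datetime.now()` once and reusing it avoids the header and the metadata showing different minutes when a run crosses a minute boundary.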


Processing with Claude

After transcription, use this prompt template to organize your notes:

Here is a transcript of voice notes in French/English.
Can you:

1. Correct obvious transcription errors
2. Organize by theme/topic
3. Extract the key points and action items
4. Reformat as structured notes

Keep the original content but make it more readable.

---

[PASTE THE TRANSCRIPT HERE]
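
Filling the template can be automated instead of pasting by hand. The sketch below assumes the template is kept in a string and that the bracketed placeholder on its last line is passed in as `placeholder`. `fill_prompt` is a hypothetical helper, and sending the result to Claude is left out.

```python
def fill_prompt(template: str, transcript: str, placeholder: str) -> str:
    """Replace the placeholder line in the template with the transcript text."""
    if placeholder not in template:
        raise ValueError(f"placeholder {placeholder!r} not found in template")
    # Replace only the first occurrence so transcript text is never re-scanned
    return template.replace(placeholder, transcript.strip(), 1)
```

Raising on a missing placeholder guards against silently sending the prompt without any transcript attached.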

Troubleshooting

"conda is not recognized"

  • Find where conda is installed by running: where conda
  • Update CONDA_PATH in the script so it points to activate.bat in that installation

Transcription takes too long

  • The large-v3 model is accurate but slow on CPU
  • For faster (but less accurate) results, change the model to:
    --model-name openai/whisper-medium
    or
    --model-name openai/whisper-small

GPU acceleration

If you have an NVIDIA GPU, install a CUDA-enabled build of PyTorch (the cu118 index below targets CUDA 11.8):

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
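
After installing, it is worth confirming that PyTorch can actually see the GPU. This is a minimal sketch using PyTorch's `torch.cuda.is_available()`; the wrapper name `cuda_available` is hypothetical, and it returns False when PyTorch is not installed at all.

```python
def cuda_available() -> bool:
    """Return True if PyTorch is installed and reports a usable CUDA device."""
    try:
        import torch  # imported lazily so the check also runs without PyTorch
    except ImportError:
        return False
    return torch.cuda.is_available()
```

A False result with an NVIDIA card present usually points to a CPU-only PyTorch build or a driver mismatch.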

Wrong language detected

Add a language hint to the insanely-fast-whisper command:

insanely-fast-whisper --file-name "audio.mp3" --transcript-path "output.txt" --model-name openai/whisper-large-v3 --language fr
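
The model and language tweaks from this section can be gathered in one place when the command is built from a script. The sketch below uses only the flags already shown in this document. `build_command` is a hypothetical helper, and the defaults are assumptions taken from the configuration.

```python
from typing import List, Optional


def build_command(
    audio: str,
    transcript_path: str,
    model: str = "openai/whisper-large-v3",
    language: Optional[str] = None,
) -> List[str]:
    """Build the insanely-fast-whisper argument list for subprocess.run."""
    cmd = [
        "insanely-fast-whisper",
        "--file-name", audio,
        "--transcript-path", transcript_path,
        "--model-name", model,
    ]
    # Only pass a language hint when one is given; otherwise Whisper auto-detects
    if language:
        cmd += ["--language", language]
    return cmd
```

For example, `build_command("audio.mp3", "output.txt", model="openai/whisper-small", language="fr")` trades accuracy for speed and forces French.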

Alternative: Python Script Version

For more control or integration with other tools:

File: transcribe.py

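import json
import subprocess
import sys
import tempfile
from datetime import datetime
from pathlib import Path

# Configuration
OUTPUT_DIR = Path(r"C:\Users\antoi\antoine\My Libraries\Antoine Brain Extension\+\Transcripts")
MODEL = "openai/whisper-large-v3"

def transcribe(audio_path: str):
    audio_file = Path(audio_path)
    timestamp = datetime.now().strftime("%Y-%m-%d %H-%M")
    note_name = f"Voice Note {timestamp}.md"
    # Use the platform temp directory rather than a hard-coded AppData path
    temp_file = Path(tempfile.gettempdir()) / "whisper_output.txt"

    print(f"\n🎙️ Transcribing: {audio_file.name}")
    print(f"📝 Output: {note_name}\n")

    # Run whisper; check=True raises CalledProcessError if the CLI exits with an error
    subprocess.run([
        "insanely-fast-whisper",
        "--file-name", str(audio_file),
        "--transcript-path", str(temp_file),
        "--model-name", MODEL
    ], check=True)

    # Read transcript. Assumption: the CLI writes JSON with a top-level "text" key;
    # if the file is not JSON of that shape, fall back to its contents verbatim.
    raw = temp_file.read_text(encoding="utf-8")
    try:
        transcript = json.loads(raw)["text"]
    except (json.JSONDecodeError, KeyError, TypeError):
        transcript = raw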
    
    # Create markdown note
    note_content = f"""---
created: {datetime.now().strftime("%Y-%m-%d %H:%M")}
type: voice-note
status: raw
tags:
  - transcript
  - voice-memo
---

# Voice Note - {datetime.now().strftime("%Y-%m-%d")} at {datetime.now().strftime("%H:%M")}

## Metadata

- **Source file:** `{audio_file.name}`
- **Transcribed:** {datetime.now().strftime("%Y-%m-%d %H:%M")}

---

## Raw Transcript

{transcript}

---

## Notes distillees

<!-- Coller le transcript dans Claude pour organiser et distiller -->

"""
    
    output_path = OUTPUT_DIR / note_name
    output_path.write_text(note_content, encoding="utf-8")
    
    print(f"\n✅ Done! Created: {note_name}")
    print(f"📁 Location: {OUTPUT_DIR}")

if __name__ == "__main__":
    if len(sys.argv) > 1:
        transcribe(sys.argv[1])
    else:
        audio = input("Enter audio file path: ").strip('"')
        transcribe(audio)

Run with:

conda activate test_env
python transcribe.py "path/to/audio.mp3"

Next Steps

  • Install insanely-fast-whisper in test_env
  • Save Transcribe.bat to Desktop
  • Test with a short audio clip
  • Pin to taskbar for quick access
  • Set up Claude prompt template for processing

Resources