
Anto01 659bc7fb2e Add Voice Recorder - Whisper transcription tool for Obsidian

Features:
- Audio recording with pause/resume and visual feedback
- Local Whisper transcription (tiny/base/small models)
- 7 note types: instructions, capture, meeting, idea, daily, review, journal
- Claude CLI integration for intelligent note processing
- PKM context integration (reads vault files for better processing)
- Auto-organization into type-specific folders
- Daily notes with yesterday's task carryover
- Language-adaptive responses (matches transcript language)
- Custom icon and Windows desktop shortcut helpers

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-27 19:51:53 -05:00


Voice Recorder Android Port - Brainstorm

Current App Features to Port

Audio Recording - Record with pause/resume
Whisper Transcription - Local AI transcription
Note Type Selection - Meeting, Todo, Idea, Review, Journal
Obsidian Export - Markdown files with YAML frontmatter
Claude Processing - AI-powered note organization
Folder Organization - Auto-sort by note type
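
The export and folder-organization features above can be sketched in a few lines. This is a minimal illustration, not the desktop app's actual code: the folder names in `FOLDERS`, the `source` frontmatter field, and the `build_note` helper are all assumptions made for the example.

```python
from datetime import datetime
from pathlib import PurePosixPath

# Hypothetical folder names; the real vault layout is not given in this document.
FOLDERS = {
    "meeting": "Meetings",
    "todo": "Todos",
    "idea": "Ideas",
    "review": "Reviews",
    "journal": "Journal",
}


def build_note(note_type: str, transcript: str, created: datetime) -> tuple[str, str]:
    """Return (vault-relative path, Markdown text) for one transcribed note."""
    folder = FOLDERS[note_type]  # raises KeyError for an unknown note type
    stamp = created.strftime("%Y-%m-%d-%H%M%S")
    path = PurePosixPath(folder) / f"{stamp}-{note_type}.md"
    frontmatter = "\n".join([
        "---",
        f"type: {note_type}",
        f"created: {created.isoformat(timespec='seconds')}",
        "source: voice-recorder",  # hypothetical field, for illustration only
        "---",
    ])
    return str(path), f"{frontmatter}\n\n{transcript.strip()}\n"
```

Keeping the function pure (it returns a path and text instead of writing a file) means the same logic can be reused regardless of how the mobile app ends up reaching the vault.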

Approach Options

Option 1: React Native + Expo (Recommended)

Pros:

Cross-platform (iOS + Android)
Large ecosystem, good documentation
Hot reload for fast development
Can use expo-av for audio recording
Good integration with file systems

Cons:

Whisper would need cloud API or native module
Claude CLI not available, would need API

Stack:

- React Native / Expo
- expo-av (recording)
- expo-file-system (file management)
- OpenAI Whisper API or whisper.cpp native module
- Claude API (not CLI)
- Obsidian sync via shared folder or plugin

Effort: Medium (2-3 weeks for MVP)

Option 2: Flutter

Pros:

Single codebase for Android/iOS
Fast performance with native compilation
Good audio packages (record, just_audio)
Material Design 3 built-in

Cons:

Dart learning curve
Whisper integration more complex
Smaller ecosystem than React Native

Stack:

- Flutter / Dart
- record package (audio)
- whisper_flutter or cloud API
- Claude API
- path_provider for file storage

Effort: Medium-High (3-4 weeks for MVP)

Option 3: Native Kotlin (Android Only)

Pros:

Best performance
Full Android API access
Can integrate whisper.cpp directly
Better battery optimization
Works offline

Cons:

Android only (no iOS)
More code to maintain
Longer development time

Stack:

- Kotlin + Jetpack Compose
- MediaRecorder API
- whisper.cpp via JNI (local transcription)
- Claude API
- Storage Access Framework for Obsidian folder

Effort: High (4-6 weeks for MVP)

Option 4: PWA (Progressive Web App)

Pros:

Works on any device with browser
No app store needed
Shared codebase with potential web app
Easy updates

Cons:

Limited audio recording capabilities
No background processing
Can't access file system directly
Requires internet for Whisper

Stack:

- Vue.js or React
- MediaRecorder Web API
- Whisper API (cloud)
- Claude API
- Download files or sync via Obsidian plugin

Effort: Low-Medium (1-2 weeks for MVP)

Whisper Integration Options

Cloud-based (Easier)

OpenAI Whisper API - $0.006/min, reliable
Replicate - Pay per use, hosted models
Self-hosted - Run whisper on home server/NAS
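
To put the per-minute price in context, here is a rough cost estimate. The usage figure (30 minutes of dictation per day) is an assumption chosen for illustration; only the $0.006/min rate comes from the list above.

```python
WHISPER_USD_PER_MINUTE = 0.006  # rate quoted in the list above


def monthly_cost(minutes_per_day: float, days: int = 30) -> float:
    """Estimated monthly spend in USD for a given daily recording time."""
    return round(minutes_per_day * days * WHISPER_USD_PER_MINUTE, 2)


# Assumed usage: 30 minutes of recording per day for 30 days (900 minutes).
print(monthly_cost(30))
```

At that assumed usage the estimate comes to about $5.40 per month.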

On-device (Harder but offline)

whisper.cpp - C++ port, works on Android via JNI
whisper-android - Pre-built Android bindings
ONNX Runtime - Run whisper.onnx model

Recommendation: Start with OpenAI API, add offline later

Obsidian Sync Options

Option A: Direct File Access

Use Android's Storage Access Framework
User grants access to Obsidian vault folder
Write markdown files directly
Works with Obsidian Sync, Syncthing, etc.
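
One detail worth prototyping for direct file access is how the note gets written, since a sync tool could pick up a half-written file. The sketch below assumes plain filesystem paths (as on desktop or in Termux); the Storage Access Framework works with content URIs instead, so the Android version would need a different mechanism. `write_note` is a hypothetical helper name.

```python
import os
import tempfile
from pathlib import Path


def write_note(vault: Path, relative_path: str, text: str) -> Path:
    """Write a note into the vault without exposing a half-written file."""
    target = vault / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the target folder so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, prefix=".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        os.replace(tmp_name, target)  # atomic rename on POSIX filesystems
    except BaseException:
        os.unlink(tmp_name)  # clean up the temp file if anything failed
        raise
    return target
```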

Option B: Obsidian Plugin

Create companion plugin for Obsidian
App sends notes via local HTTP server
Plugin receives and saves notes
More complex but cleaner UX

Option C: Share Intent

Use Android share functionality
Share transcribed note to Obsidian
User manually saves
Simplest but requires user action

Recommendation: Option A (direct file access)

Recommended MVP Approach

Phase 1: Core Recording (Week 1)

React Native + Expo setup
Basic UI matching desktop app style
Audio recording with pause/resume
Timer display
Note type selection

Phase 2: Transcription (Week 2)

OpenAI Whisper API integration
Loading states and error handling
Transcript preview
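
The transcription call with basic error handling could look roughly like this. It is a sketch under assumptions: `client` is expected to be an `openai.OpenAI` instance from the official Python SDK (passed in so the function can be tested with a stand-in), and the retry count and backoff values are arbitrary. The mobile app would do the equivalent over HTTP.

```python
import time


def transcribe(client, audio_path: str, retries: int = 3, delay_s: float = 1.0) -> str:
    """Return the transcript text, retrying failed requests a few times."""
    last_error = None
    for attempt in range(retries):
        try:
            with open(audio_path, "rb") as audio_file:
                result = client.audio.transcriptions.create(
                    model="whisper-1", file=audio_file
                )
            return result.text
        except Exception as error:  # narrow to the SDK's error types in real code
            last_error = error
            if attempt < retries - 1:
                time.sleep(delay_s * (2 ** attempt))  # exponential backoff
    raise RuntimeError(f"transcription failed after {retries} attempts") from last_error
```

The final `RuntimeError` gives the UI a single failure point to turn into an error state, while the original exception stays attached for logging.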

Phase 3: Export & Processing (Week 3)

File system access setup
Markdown generation
Claude API integration
Folder organization

Phase 4: Polish (Week 4)

Offline queue for transcription
Settings screen
Obsidian folder picker
Widget for quick recording
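
The offline queue from Phase 4 can be prototyped as a small persistent list of pending recordings. This is a minimal sketch, assuming a JSON file as storage; the `TranscriptionQueue` class and its method names are hypothetical, and the `transcribe` argument is any callable that takes a file path and returns text.

```python
import json
from pathlib import Path


class TranscriptionQueue:
    """Persist pending recordings so they survive app restarts."""

    def __init__(self, queue_file: Path):
        self.queue_file = queue_file

    def _load(self) -> list[str]:
        if not self.queue_file.exists():
            return []
        return json.loads(self.queue_file.read_text(encoding="utf-8"))

    def _save(self, items: list[str]) -> None:
        self.queue_file.write_text(json.dumps(items), encoding="utf-8")

    def add(self, audio_path: str) -> None:
        """Queue a recording for later transcription (duplicates are ignored)."""
        items = self._load()
        if audio_path not in items:
            items.append(audio_path)
            self._save(items)

    def drain(self, transcribe) -> dict[str, str]:
        """Try each pending file; keep the ones that fail for the next run."""
        done, still_pending = {}, []
        for audio_path in self._load():
            try:
                done[audio_path] = transcribe(audio_path)
            except Exception:
                still_pending.append(audio_path)
        self._save(still_pending)
        return done
```

Calling `drain` whenever connectivity returns keeps failed items queued without blocking the ones that succeed.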

Technical Considerations for Pixel 7

Hardware Advantages

Tensor G2 chip - could run small whisper models
Good microphone array
Large battery

Android-Specific Features

Material You theming
Quick Settings tile
Home screen widget
Voice Assistant integration potential

Alternative: Termux + Python

For a quick hack without building a full app:

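# Install Termux from F-Droid, then inside Termux:
pkg install python ffmpeg   # openai-whisper uses ffmpeg to decode audio files
pip install openai-whisper sounddevice   # pulls in PyTorch; may need extra setup on Termux

# Run existing Python script (modified)
python voice_recorder_android.py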

Pros: Reuse existing code, fast to test

Cons: Requires Termux, not user-friendly

Decision Matrix

Criteria	React Native	Flutter	Kotlin	PWA
Dev Speed	Fast	Medium	Slow	Fastest
Performance	Good	Great	Best	OK
Offline	Possible	Possible	Yes	No
iOS Support	Yes	Yes	No	Yes
Learning Curve	Low	Medium	Medium	Low
Maintenance	Easy	Easy	More	Easy

Recommended Path

Start with React Native + Expo for fastest MVP
Use OpenAI Whisper API initially
Direct file access to Obsidian vault
Claude API (not CLI) for processing
Add offline whisper.cpp later if needed

This approach gets a working app fastest while leaving room for optimization.

Next Steps

Set up React Native + Expo project
Design mobile UI mockups
Get OpenAI API key for Whisper
Get Claude API key
Test file system access on Pixel 7
Create basic recording prototype
