CODEtomaste/Tool_Scripts/Whisper_Transcript/Android-Port-Brainstorm.md
Anto01 659bc7fb2e Add Voice Recorder - Whisper transcription tool for Obsidian
Features:
- Audio recording with pause/resume and visual feedback
- Local Whisper transcription (tiny/base/small models)
- 7 note types: instructions, capture, meeting, idea, daily, review, journal
- Claude CLI integration for intelligent note processing
- PKM context integration (reads vault files for better processing)
- Auto-organization into type-specific folders
- Daily notes with yesterday's task carryover
- Language-adaptive responses (matches transcript language)
- Custom icon and Windows desktop shortcut helpers

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-27 19:51:53 -05:00


Voice Recorder Android Port - Brainstorm

Current App Features to Port

  1. Audio Recording - Record with pause/resume
  2. Whisper Transcription - Local AI transcription
  3. Note Type Selection - Instructions, Capture, Meeting, Idea, Daily, Review, Journal
  4. Obsidian Export - Markdown files with YAML frontmatter
  5. Claude Processing - AI-powered note organization
  6. Folder Organization - Auto-sort by note type
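
As a rough illustration of the Obsidian export and folder organization items above, here is a minimal Python sketch of rendering a note with YAML frontmatter and picking a type-specific folder. The frontmatter keys (`type`, `created`, `source`), the folder names, and the `Inbox` fallback are assumptions for illustration, not the desktop app's actual schema.

```python
from datetime import datetime

# Hypothetical mapping from note type to vault subfolder (names are assumptions).
NOTE_FOLDERS = {
    "meeting": "Meetings",
    "idea": "Ideas",
    "review": "Reviews",
    "journal": "Journal",
}


def render_note(note_type: str, transcript: str, created: datetime) -> str:
    """Return Markdown text: a YAML frontmatter block followed by the transcript."""
    frontmatter = [
        "---",
        f"type: {note_type}",
        f"created: {created.isoformat(timespec='seconds')}",
        "source: voice-recorder",  # assumed key, not taken from the desktop app
        "---",
    ]
    return "\n".join(frontmatter) + "\n\n" + transcript.strip() + "\n"


def target_folder(note_type: str) -> str:
    """Pick the subfolder for a note type, falling back to an assumed Inbox."""
    return NOTE_FOLDERS.get(note_type, "Inbox")
```

A mobile port would likely reimplement this in the chosen stack, but the output format should stay identical so notes from both apps sort into the same folders.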

Approach Options

Option 1: React Native + Expo

Pros:

  • Cross-platform (iOS + Android)
  • Large ecosystem, good documentation
  • Hot reload for fast development
  • Can use expo-av for audio recording
  • Good integration with file systems

Cons:

  • Whisper would need cloud API or native module
  • Claude CLI not available, would need API

Stack:

- React Native / Expo
- expo-av (recording)
- expo-file-system (file management)
- OpenAI Whisper API or whisper.cpp native module
- Claude API (not CLI)
- Obsidian sync via shared folder or plugin

Effort: Medium (2-3 weeks for MVP)


Option 2: Flutter

Pros:

  • Single codebase for Android/iOS
  • Fast performance with native compilation
  • Good audio packages (record, just_audio)
  • Material Design 3 built-in

Cons:

  • Dart learning curve
  • Whisper integration more complex
  • Smaller ecosystem than React Native

Stack:

- Flutter / Dart
- record package (audio)
- whisper_flutter or cloud API
- Claude API
- path_provider for file storage

Effort: Medium-High (3-4 weeks for MVP)


Option 3: Native Kotlin (Android Only)

Pros:

  • Best performance
  • Full Android API access
  • Can integrate whisper.cpp directly
  • Better battery optimization
  • Works offline

Cons:

  • Android only (no iOS)
  • More code to maintain
  • Longer development time

Stack:

- Kotlin + Jetpack Compose
- MediaRecorder API
- whisper.cpp via JNI (local transcription)
- Claude API
- Storage Access Framework for Obsidian folder

Effort: High (4-6 weeks for MVP)


Option 4: PWA (Progressive Web App)

Pros:

  • Works on any device with browser
  • No app store needed
  • Shared codebase with potential web app
  • Easy updates

Cons:

  • Limited audio recording capabilities
  • No background processing
  • Can't access file system directly
  • Requires internet for Whisper

Stack:

- Vue.js or React
- MediaRecorder Web API
- Whisper API (cloud)
- Claude API
- Download files or sync via Obsidian plugin

Effort: Low-Medium (1-2 weeks for MVP)


Whisper Integration Options

Cloud-based (Easier)

  1. OpenAI Whisper API - $0.006/min, reliable
  2. Replicate - Pay per use, hosted models
  3. Self-hosted - Run whisper on home server/NAS
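
To put the $0.006/min figure in perspective, the following sketch does the arithmetic for a single recording and for a month of use. The usage numbers in the example are hypothetical, and the sketch assumes billing is simply proportional to audio duration.

```python
WHISPER_API_USD_PER_MINUTE = 0.006  # rate quoted in the list above


def transcription_cost(duration_seconds: float,
                       rate: float = WHISPER_API_USD_PER_MINUTE) -> float:
    """Estimated cost in USD for one recording, assuming proportional billing."""
    return round(duration_seconds / 60 * rate, 4)


def monthly_cost(minutes_per_day: float, days: int = 30,
                 rate: float = WHISPER_API_USD_PER_MINUTE) -> float:
    """Estimated monthly cost in USD for a steady daily amount of audio."""
    return round(minutes_per_day * days * rate, 2)
```

For example, a 10-minute recording comes to about $0.06, and 10 minutes of dictation every day comes to roughly $1.80 per month under these assumptions.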

On-device (Harder but offline)

  1. whisper.cpp - C++ port, works on Android via JNI
  2. whisper-android - Pre-built Android bindings
  3. ONNX Runtime - Run whisper.onnx model

Recommendation: Start with OpenAI API, add offline later


Obsidian Sync Options

Option A: Direct File Access

  • Use Android's Storage Access Framework
  • User grants access to Obsidian vault folder
  • Write markdown files directly
  • Works with Obsidian Sync, Syncthing, etc.
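
The direct-write idea can be sketched in Python as below. This assumes the vault is reachable as an ordinary filesystem path (as it would be on desktop or under Termux); on Android proper, the Storage Access Framework hands out document URIs instead, so the same logic would have to go through that API. The helper names `slugify` and `save_note` are hypothetical.

```python
import re
from pathlib import Path


def slugify(title: str) -> str:
    """Turn a note title into a filesystem-safe file stem."""
    slug = re.sub(r"[^A-Za-z0-9]+", "-", title).strip("-").lower()
    return slug or "note"


def save_note(vault_root: Path, folder: str, title: str, body: str) -> Path:
    """Write body to <vault_root>/<folder>/<slug>.md without overwriting existing notes."""
    target_dir = vault_root / folder
    target_dir.mkdir(parents=True, exist_ok=True)
    base = slugify(title)
    path = target_dir / f"{base}.md"
    counter = 2
    # Append -2, -3, ... until the name is free.
    while path.exists():
        path = target_dir / f"{base}-{counter}.md"
        counter += 1
    path.write_text(body, encoding="utf-8")
    return path
```

Because the note lands as a plain file inside the vault, whichever sync tool already watches that folder picks it up with no extra integration.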

Option B: Obsidian Plugin

  • Create companion plugin for Obsidian
  • App sends notes via local HTTP server
  • Plugin receives and saves notes
  • More complex but cleaner UX

Option C: Share Intent

  • Use Android share functionality
  • Share transcribed note to Obsidian
  • User manually saves
  • Simplest but requires user action

Recommendation: Option A (direct file access)


Implementation Phases

Phase 1: Core Recording (Week 1)

  • React Native + Expo setup
  • Basic UI matching desktop app style
  • Audio recording with pause/resume
  • Timer display
  • Note type selection

Phase 2: Transcription (Week 2)

  • OpenAI Whisper API integration
  • Loading states and error handling
  • Transcript preview
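
One possible shape for the error handling in this phase is a retry wrapper with exponential backoff around the transcription call. The sketch below is in Python for brevity; `TranscriptionError` and the `transcribe` callable are hypothetical stand-ins for whatever client the app ends up using, not a real API.

```python
import time
from typing import Callable


class TranscriptionError(Exception):
    """Hypothetical error raised when the transcription request fails."""


def transcribe_with_retry(transcribe: Callable[[str], str],
                          audio_path: str,
                          attempts: int = 3,
                          base_delay: float = 1.0,
                          sleep: Callable[[float], None] = time.sleep) -> str:
    """Call transcribe(audio_path), retrying failures with doubling delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return transcribe(audio_path)
        except TranscriptionError:
            if attempt == attempts - 1:
                raise  # out of retries: let the UI show the error state
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the wrapper easy to test and lets the UI layer substitute a non-blocking wait.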

Phase 3: Export & Processing (Week 3)

  • File system access setup
  • Markdown generation
  • Claude API integration
  • Folder organization

Phase 4: Polish (Week 4)

  • Offline queue for transcription
  • Settings screen
  • Obsidian folder picker
  • Widget for quick recording
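
The offline queue could be as simple as a persisted list of recordings waiting for a connection. Here is a minimal Python sketch under that assumption; the class name, the JSON file layout, and the choice to treat `OSError` (which includes `ConnectionError`) as "still offline" are all illustrative.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List


class TranscriptionQueue:
    """Recordings waiting for transcription, persisted as a JSON list of paths."""

    def __init__(self, store: Path):
        self.store = store
        self.pending: List[str] = (
            json.loads(store.read_text(encoding="utf-8")) if store.exists() else []
        )

    def _save(self) -> None:
        self.store.write_text(json.dumps(self.pending), encoding="utf-8")

    def enqueue(self, audio_path: str) -> None:
        """Remember a recording so it survives app restarts."""
        self.pending.append(audio_path)
        self._save()

    def drain(self, transcribe: Callable[[str], str]) -> Dict[str, str]:
        """Transcribe queued recordings in order; stop at the first network failure."""
        results: Dict[str, str] = {}
        remaining: List[str] = []
        for index, audio_path in enumerate(self.pending):
            try:
                results[audio_path] = transcribe(audio_path)
            except OSError:
                remaining = self.pending[index:]  # keep this item and everything after it
                break
        self.pending = remaining
        self._save()
        return results
```

Calling `drain` whenever connectivity returns processes the backlog in recording order and leaves unprocessed items in the queue for the next attempt.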

Technical Considerations for Pixel 7

Hardware Advantages

  • Tensor G2 chip - could run small whisper models
  • Good microphone array
  • Large battery

Android-Specific Features

  • Material You theming
  • Quick Settings tile
  • Home screen widget
  • Voice Assistant integration potential

Alternative: Termux + Python

For a quick hack without building a full app:

# Install Termux from F-Droid
pkg install python
pip install openai-whisper sounddevice

# Run existing Python script (modified)
python voice_recorder_android.py

Pros: Reuse existing code, fast to test

Cons: Requires Termux, not user-friendly


Decision Matrix

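| Criteria       | React Native | Flutter  | Kotlin | PWA     |
|----------------|--------------|----------|--------|---------|
| Dev Speed      | Fast         | Medium   | Slow   | Fastest |
| Performance    | Good         | Great    | Best   | OK      |
| Offline        | Possible     | Possible | Yes    | No      |
| iOS Support    | Yes          | Yes      | No     | Yes     |
| Learning Curve | Low          | Medium   | Medium | Low     |
| Maintenance    | Easy         | Easy     | More   | Easy    |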

Recommendation

  1. Start with React Native + Expo for fastest MVP
  2. Use OpenAI Whisper API initially
  3. Direct file access to Obsidian vault
  4. Claude API (not CLI) for processing
  5. Add offline whisper.cpp later if needed

This approach gets a working app fastest while leaving room for optimization.


Next Steps

  • Set up React Native + Expo project
  • Design mobile UI mockups
  • Get OpenAI API key for Whisper
  • Get Claude API key
  • Test file system access on Pixel 7
  • Create basic recording prototype