Features:
- Audio recording with pause/resume and visual feedback
- Local Whisper transcription (tiny/base/small models)
- 7 note types: instructions, capture, meeting, idea, daily, review, journal
- Claude CLI integration for intelligent note processing
- PKM context integration (reads vault files for better processing)
- Auto-organization into type-specific folders
- Daily notes with yesterday's task carryover
- Language-adaptive responses (matches transcript language)
- Custom icon and Windows desktop shortcut helpers

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Voice Recorder Android Port - Brainstorm
Current App Features to Port
- Audio Recording - Record with pause/resume
- Whisper Transcription - Local AI transcription
- Note Type Selection - Meeting, Todo, Idea, Review, Journal
- Obsidian Export - Markdown files with YAML frontmatter
- Claude Processing - AI-powered note organization
- Folder Organization - Auto-sort by note type
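As a rough illustration of the Obsidian export and folder organization features, the sketch below renders a transcript as Markdown with YAML frontmatter and picks a folder from the note type. The folder names, the frontmatter keys, and the helper names `render_note` and `note_relative_path` are assumptions made for this example; the desktop app's actual layout is not described here.

```python
from datetime import datetime

# Hypothetical mapping from note type to vault folder (not taken from the app).
TYPE_FOLDERS = {
    "meeting": "Meetings",
    "todo": "Todos",
    "idea": "Ideas",
    "review": "Reviews",
    "journal": "Journal",
}


def render_note(note_type: str, transcript: str, created: datetime) -> str:
    """Return Markdown text with a YAML frontmatter header."""
    frontmatter = [
        "---",
        f"type: {note_type}",
        f"created: {created.isoformat(timespec='seconds')}",
        "source: voice-recorder",  # assumed key, for illustration only
        "---",
    ]
    return "\n".join(frontmatter) + "\n\n" + transcript.strip() + "\n"


def note_relative_path(note_type: str, created: datetime) -> str:
    """Build a vault-relative path; unknown types fall back to an Inbox folder."""
    folder = TYPE_FOLDERS.get(note_type, "Inbox")
    filename = f"{created:%Y-%m-%d_%H%M%S}_{note_type}.md"
    return f"{folder}/{filename}"
```

Keeping rendering and path selection as pure functions means the same logic could be ported to whichever mobile stack is chosen without touching file I/O.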
Approach Options
Option 1: React Native + Expo (Recommended)
Pros:
- Cross-platform (iOS + Android)
- Large ecosystem, good documentation
- Hot reload for fast development
- Can use expo-av for audio recording
- Good integration with file systems
Cons:
- Whisper would need cloud API or native module
- Claude CLI not available, would need API
Stack:
- React Native / Expo
- expo-av (recording)
- expo-file-system (file management)
- OpenAI Whisper API or whisper.cpp native module
- Claude API (not CLI)
- Obsidian sync via shared folder or plugin
Effort: Medium (2-3 weeks for MVP)
Option 2: Flutter
Pros:
- Single codebase for Android/iOS
- Fast performance with native compilation
- Good audio packages (record, just_audio)
- Material Design 3 built-in
Cons:
- Dart learning curve
- Whisper integration more complex
- Smaller ecosystem than React Native
Stack:
- Flutter / Dart
- record package (audio)
- whisper_flutter or cloud API
- Claude API
- path_provider for file storage
Effort: Medium-High (3-4 weeks for MVP)
Option 3: Native Kotlin (Android Only)
Pros:
- Best performance
- Full Android API access
- Can integrate whisper.cpp directly
- Better battery optimization
- Works offline
Cons:
- Android only (no iOS)
- More code to maintain
- Longer development time
Stack:
- Kotlin + Jetpack Compose
- MediaRecorder API
- whisper.cpp via JNI (local transcription)
- Claude API
- Storage Access Framework for Obsidian folder
Effort: High (4-6 weeks for MVP)
Option 4: PWA (Progressive Web App)
Pros:
- Works on any device with browser
- No app store needed
- Shared codebase with potential web app
- Easy updates
Cons:
- Limited audio recording capabilities
- No background processing
- Can't access file system directly
- Requires internet for Whisper
Stack:
- Vue.js or React
- MediaRecorder Web API
- Whisper API (cloud)
- Claude API
- Download files or sync via Obsidian plugin
Effort: Low-Medium (1-2 weeks for MVP)
Whisper Integration Options
Cloud-based (Easier)
- OpenAI Whisper API - $0.006/min, reliable
- Replicate - Pay per use, hosted models
- Self-hosted - Run whisper on home server/NAS
On-device (Harder but offline)
- whisper.cpp - C++ port, works on Android via JNI
- whisper-android - Pre-built Android bindings
- ONNX Runtime - Run whisper.onnx model
Recommendation: Start with OpenAI API, add offline later
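To get a feel for what the cloud option might cost, here is a small estimate based on the $0.006/min rate quoted above. The usage figures (notes per day, minutes per note) are placeholders, and the function name is made up for this sketch.

```python
WHISPER_API_USD_PER_MINUTE = 0.006  # rate quoted in this document


def estimate_monthly_cost(notes_per_day: int,
                          avg_minutes_per_note: float,
                          days: int = 30) -> float:
    """Estimate the monthly transcription cost in USD, rounded to cents."""
    total_minutes = notes_per_day * avg_minutes_per_note * days
    return round(total_minutes * WHISPER_API_USD_PER_MINUTE, 2)
```

For example, ten two-minute notes a day comes to 600 minutes a month, or about $3.60, which suggests the API cost is unlikely to be the deciding factor for personal use.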
Obsidian Sync Options
Option A: Direct File Access
- Use Android's Storage Access Framework
- User grants access to Obsidian vault folder
- Write markdown files directly
- Works with Obsidian Sync, Syncthing, etc.
Option B: Obsidian Plugin
- Create companion plugin for Obsidian
- App sends notes via local HTTP server
- Plugin receives and saves notes
- More complex but cleaner UX
Option C: Share Intent
- Use Android share functionality
- Share transcribed note to Obsidian
- User manually saves
- Simplest but requires user action
Recommendation: Option A (direct file access)
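A minimal sketch of the direct-write logic behind Option A, using plain file paths. On Android the Storage Access Framework hands back content URIs rather than paths, so a real app would go through the platform's document APIs; this only illustrates creating the type folder and avoiding filename collisions. The function name `save_note` is hypothetical.

```python
from pathlib import Path


def save_note(vault_dir: Path, relative_folder: str,
              filename: str, content: str) -> Path:
    """Write a note into the vault without overwriting an existing file."""
    folder = vault_dir / relative_folder
    folder.mkdir(parents=True, exist_ok=True)

    target = folder / filename
    stem, suffix = target.stem, target.suffix
    counter = 1
    # If the name is taken, append -1, -2, ... until a free name is found.
    while target.exists():
        target = folder / f"{stem}-{counter}{suffix}"
        counter += 1

    target.write_text(content, encoding="utf-8")
    return target
```

Because the note is an ordinary file inside the vault, whatever sync tool the vault already uses picks it up with no extra integration.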
Recommended MVP Approach
Phase 1: Core Recording (Week 1)
- React Native + Expo setup
- Basic UI matching desktop app style
- Audio recording with pause/resume
- Timer display
- Note type selection
Phase 2: Transcription (Week 2)
- OpenAI Whisper API integration
- Loading states and error handling
- Transcript preview
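The error handling for Phase 2 could start as a retry wrapper with exponential backoff, sketched below. The `transcribe` argument stands in for whatever function ends up calling the Whisper API; it is a placeholder, not a real client call, and the retry counts and delays are arbitrary defaults.

```python
import time
from typing import Callable, Optional


def transcribe_with_retry(transcribe: Callable[[str], str],
                          audio_path: str,
                          attempts: int = 3,
                          base_delay: float = 1.0,
                          sleep: Callable[[float], None] = time.sleep) -> str:
    """Call transcribe(audio_path), retrying with exponential backoff."""
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            return transcribe(audio_path)
        except Exception as exc:  # a real app would catch narrower errors
            last_error = exc
            if attempt < attempts - 1:
                # Wait 1x, 2x, 4x, ... the base delay between attempts.
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(
        f"transcription failed after {attempts} attempts"
    ) from last_error
```

Passing `sleep` as a parameter keeps the wrapper testable without real delays; the UI can show its loading state while this call is in flight and surface the final `RuntimeError` to the user.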
Phase 3: Export & Processing (Week 3)
- File system access setup
- Markdown generation
- Claude API integration
- Folder organization
Phase 4: Polish (Week 4)
- Offline queue for transcription
- Settings screen
- Obsidian folder picker
- Widget for quick recording
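One possible shape for the offline transcription queue is a small JSON file of pending recordings that gets drained when connectivity returns. The class name, file format, and item fields below are assumptions for illustration only.

```python
import json
from pathlib import Path
from typing import Callable


class TranscriptionQueue:
    """Persist pending recordings in a JSON file until they can be processed."""

    def __init__(self, queue_file: Path):
        self.queue_file = queue_file

    def _load(self) -> list:
        if not self.queue_file.exists():
            return []
        return json.loads(self.queue_file.read_text(encoding="utf-8"))

    def _save(self, items: list) -> None:
        self.queue_file.write_text(json.dumps(items), encoding="utf-8")

    def enqueue(self, audio_path: str, note_type: str) -> None:
        items = self._load()
        items.append({"audio_path": audio_path, "note_type": note_type})
        self._save(items)

    def drain(self, process: Callable[[dict], None]) -> int:
        """Process items in order; stop at the first failure and keep the rest."""
        items = self._load()
        done = 0
        for item in items:
            try:
                process(item)
            except Exception:
                break  # likely still offline; retry on the next drain
            done += 1
        self._save(items[done:])
        return done
```

Stopping at the first failure preserves recording order and avoids hammering the API while the device is still offline.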
Technical Considerations for Pixel 7
Hardware Advantages
- Tensor G2 chip - could run small whisper models
- Good microphone array
- Large battery
Android-Specific Features
- Material You theming
- Quick Settings tile
- Home screen widget
- Voice Assistant integration potential
Alternative: Termux + Python
For a quick hack without building a full app:
# Install Termux from F-Droid
pkg install python
pip install openai-whisper sounddevice
# Run existing Python script (modified)
python voice_recorder_android.py
Pros: Reuse existing code, fast to test
Cons: Requires Termux, not user-friendly
Decision Matrix
| Criteria | React Native | Flutter | Kotlin | PWA |
|---|---|---|---|---|
| Dev Speed | Fast | Medium | Slow | Fastest |
| Performance | Good | Great | Best | OK |
| Offline | Possible | Possible | Yes | No |
| iOS Support | Yes | Yes | No | Yes |
| Learning Curve | Low | Medium | Medium | Low |
| Maintenance | Easy | Easy | More | Easy |
Recommended Path
- Start with React Native + Expo for fastest MVP
- Use OpenAI Whisper API initially
- Direct file access to Obsidian vault
- Claude API (not CLI) for processing
- Add offline whisper.cpp later if needed
This approach gets a working app fastest while leaving room for optimization.
Next Steps
- Set up React Native + Expo project
- Design mobile UI mockups
- Get OpenAI API key for Whisper
- Get Claude API key
- Test file system access on Pixel 7
- Create basic recording prototype