Local-first voice cloning, text-to-speech, PDF reader, and audiobook creator. Optimized for Apple Silicon with native Metal acceleration via MLX.
Supported TTS & Voice Cloning Engines
Each engine brings unique strengths. Use the right one for your task, or combine them for maximum flexibility.
Kokoro is an 82M-parameter model with sub-200ms latency. 21 British and American voices with speed control.
Qwen3-TTS clones any voice from just 3 seconds of audio. 9 premium preset speakers with style instructions.
Chatterbox offers multilingual voice cloning across 23 languages. Clone voices and speak in any supported language.
Upload a short audio clip and Qwen3-TTS will learn the voice characteristics. Use style instructions to control emotion, pace, and tone.
Kokoro is the fastest engine in the studio. Generate speech in under 200ms with fine-grained speed control and high naturalness for narration and dialogue.
Read PDFs aloud with sentence-by-sentence highlighting, or convert full PDFs into audiobooks with chapter markers. Audiobook generation uses Kokoro voices.
Integrate MimikaStudio into your workflow with a comprehensive REST API and Model Context Protocol server. Use it programmatically from Claude Code, scripts, or your own applications.
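As a rough illustration of scripted use, the sketch below builds a JSON request for the local API server using only the Python standard library. The base URL, the `/api/tts` path, and the `text`, `voice`, and `speed` fields are hypothetical placeholders chosen for this example; consult the MimikaStudio API documentation for the actual endpoints and schema.

```python
import json
from urllib.request import Request

# Assumption: the local API server listens on this address. The real host and
# port may differ; check the app's settings or API documentation.
BASE_URL = "http://127.0.0.1:8000"


def build_tts_request(text, voice, speed=1.0, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request for speech generation.

    The "/api/tts" path and the "text", "voice", and "speed" fields are
    hypothetical placeholders, not MimikaStudio's documented schema.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    payload = {"text": text, "voice": voice, "speed": speed}
    return Request(
        url=f"{base_url}/api/tts",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the server running, a request built this way could be sent with `urllib.request.urlopen(request)` and the response body written to an audio file, assuming the endpoint returns audio bytes.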
The built-in model manager lets you download and manage TTS models directly from the app. See model sizes, status, and switch between engines instantly.
Configure output folders, view app information, and manage your preferences. Everything you need to tailor MimikaStudio to your workflow.
Chatterbox brings multilingual voice cloning. Clone a voice in English and speak in Japanese, or any other supported language. Hebrew requires the Dicta model, which can be downloaded directly from the app.
MimikaStudio runs natively on macOS with MLX-Audio, a library built on Apple's MLX machine learning framework. Native Metal acceleration on M1, M2, M3, and M4 chips. Windows support coming soon.
Native Metal acceleration via Apple's MLX framework. Optimized neural inference on M1, M2, M3, and M4 chips.
The codebase supports Windows with CUDA. Pre-built Windows binaries will be available in a future release.
Access MimikaStudio from any browser. The same Flutter app runs as a web UI backed by the local API server.
Kokoro TTS generates speech almost instantly. Real-time performance for interactive use cases.
No cloud, no accounts, no data leaves your machine. All processing happens on-device with local storage.
Full command-line interface and MCP server for automation. Integrate into Claude Code or any workflow.
Start with a free trial, then upgrade to a lifetime license. No subscriptions, no recurring fees.
No credit card required
Lifetime access
Runs locally on macOS (Apple Silicon). No account needed. Windows support coming soon.