Local-first voice cloning, text-to-speech, PDF reader, and audiobook creator. Now with agentic MCP support for Codex and Claude Code. Optimized for Apple Silicon, with native Metal acceleration via MLX and an ONNX runtime for multilingual TTS.
Listen to complete chapters generated locally on Apple Silicon. Qwen3 Clone delivers studio-quality narration from your custom voice prompts.
Voice-cloned narration with natural prosody and emotional depth. "A Scandal in Bohemia" from Conan Doyle's Sherlock Holmes (public domain). Each audiobook runs roughly 43 to 46 minutes of continuous speech generated entirely on-device.
Sherlock Holmes · 43m 20s
"A Scandal in Bohemia" with Yelena's cloned voice. Full 20-page chapter with natural pacing and expression.
Qwen3
Sherlock Holmes · 45m 36s
"A Scandal in Bohemia" with Mikhail's cloned voice. Holmes and Watson with expressive male narration.
Qwen3
Sherlock Holmes · 43m 40s
"A Scandal in Bohemia" with Anastasia's cloned voice. Elegant narration with refined vocal character.
Qwen3
Sherlock Holmes · 43m 55s
"A Scandal in Bohemia" with Svetlana's cloned voice. Warm narration with distinctive timbre.
Qwen3
Built-in voices with quick generation. Marcus Aurelius' "Meditations" (public domain). Ideal for rapid prototyping and shorter texts.
British Female · 12m 28s
"Meditations" with Emma's crisp British accent. Good for quick audiobook drafts.
Kokoro
British Male · 13m 36s
"Meditations" with George's grounded tone. Suitable for essay-length exports.
Kokoro
Genesis 1:1 generated with shipped Mimika voices across every supported Qwen3 clone language, including Korean with Eleanor.
Voice: Alistair
In the beginning God created the heaven and the earth.
Qwen3-TTS
Voice: Anastasia
起初，神创造天地。
Qwen3-TTS
Voice: Beatrice
初めに、神は天と地を創造された。
Qwen3-TTS
Voice: Eleanor
태초에 하나님이 천지를 창조하시니라.
Qwen3-TTS
Voice: Harriet
Am Anfang schuf Gott Himmel und Erde.
Qwen3-TTS
Voice: Mikhail
Au commencement, Dieu créa les cieux et la terre.
Qwen3-TTS
Voice: Svetlana
В начале сотворил Бог небо и землю.
Qwen3-TTS
Voice: Yelena
No princípio, Deus criou os céus e a terra.
Qwen3-TTS
Voice: Alistair
En el principio creó Dios los cielos y la tierra.
Qwen3-TTS
Voice: Anastasia
Nel principio Dio creò i cieli e la terra.
Qwen3-TTS
Clone any voice from a 3-second sample. Qwen3-TTS learns timbre, pitch, and cadence while style prompts let you control emotion, pace, and tone.
Turn documents into full audiobooks with chapter markers and natural pacing. Follow along with sentence highlighting, or export hours of polished audio.
Generate narration for YouTube, explainers, and ads. Switch between four TTS engines to find the perfect voice, then fine-tune with style controls.
Everything runs on your Mac with native Metal acceleration. No cloud uploads, no API limits, no subscriptions. Your voices stay on your device.
Listen to samples generated by each TTS engine. For voice cloning demos, compare the reference voice with the generated output.
Voice cloning from a 3-second sample. Compare the reference voice with the generated output.
Genesis 4 Style
Generated speech keeps the reference timbre while applying Genesis 4 style control.
Genesis 4 Style
Reference and generated clips align on pitch contour with a more polished studio-like cadence.
Genesis 4 Preset Voice
Preset speaker profile with dependable pronunciation for general narration.
Qwen3-TTS
Multilingual Demo
Cross-language cloning maps the same vocal identity into multilingual output.
Expressive voice cloning with emotion control. Natural, emotive speech synthesis.
Emotional Speech Demo
Generated output adds emotional dynamics while preserving speaker identity.
Emotional Speech Demo
Emotion-aware rendering demonstrates richer contour and emphasis with natural prosody shifts.
Supported TTS & Voice Cloning Engines
Each engine brings unique strengths. Use the right one for your task, or combine them for maximum flexibility.
82M parameter model with sub-200ms latency. 21 British and American voices with speed control.
Clone any voice from just 3 seconds of audio. 9 premium preset speakers with style instructions.
Multilingual voice cloning across 23 languages. Clone voices and speak in any supported language.
ONNX-based multilingual synthesis with low-latency local generation across preset voices and multiple languages.
All models below appear in the in-app model manager with independent download and status tracking.
0.6B/1.7B Base + CustomVoice variants with optional 8-bit footprints and 10 clone languages. Mimika ships 8 custom-designed reference voices: Alistair, Anastasia, Beatrice, Eleanor, Harriet, Mikhail, Svetlana, and Yelena.
Built-in voices with natural prosody. Reading from the Book of Job.
Job 6:1-2 · British Female
Balanced pacing and crisp articulation for long-form reading with book-like clarity.
Kokoro
Job 14:7 · British Male
Lower-register delivery with stable rhythm suited for document narration and reports.
Kokoro
Job 42:5-6 · British Female
Warm and expressive tone with subtle emphasis on key phrases for natural spoken prose.
Kokoro
Fast multilingual ONNX synthesis. Instant playback from bundled pre-generated voices.
Genesis 4 Preview
Clear high-register narration tuned for scripture-style reading with quick response time.
Supertonic
Genesis 4 Preview
Lower-register biblical narration that keeps a grounded tone while preserving clear verse cadence.
Supertonic
The Voice Prompts tab is the staging area for cloning. Upload or import a prompt once, preview it, tag it, and then reuse it across Qwen3 Clone and Chatterbox without duplicating setup.
The Voice Prompt Management screen keeps the Voice Library and Import URL workflows together. Paste a YouTube URL, choose an optional start time, audition the extracted preview, then save it into the shared voice library with transcript and metadata.
The fastest engine in the studio. Generate speech in under 200ms with fine-grained speed control and high naturalness for narration and dialogue.
Read PDFs aloud with sentence-by-sentence highlighting, or convert full PDFs into audiobooks with chapter markers. Audiobook generation uses Kokoro voices.
The new Jobs tab shows all executed tasks across TTS, voice cloning, and audiobook exports. Review status instantly and replay outputs directly from the queue.
Integrate MimikaStudio into your workflow with a comprehensive REST API and the Mimika MCP (Model Context Protocol) server. The in-app Settings > MCP screen exposes the live tool and endpoint catalog for Claude Code, Codex, scripts, and your own applications.
Example prompts once your client is connected to Mimika MCP at http://127.0.0.1:8010:
Codex: Use Mimika MCP tool audiobook_generate_from_file with file_path=/absolute/path/to/document.pdf, voice=bf_emma, output_format=mp3. Then poll audiobook_status until completed.
Claude Code: Call Mimika MCP audiobook_generate_from_file for /absolute/path/to/document.pdf with voice bf_emma and output_format mp3, then track audiobook_status every 10 seconds.
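If you script the same flow yourself, the "poll audiobook_status until completed" step in the prompts above might look roughly like the sketch below. It is an illustration only: fetch_status is a hypothetical callable standing in for whatever call returns the audiobook_status result in your client, and the "status" field name and the "failed" state are assumptions, since this page only names "completed".

```python
import time
from typing import Callable, Dict

# Assumption: only "completed" is named on this page; "failed" is a guess
# at how an unsuccessful job might be reported.
TERMINAL_STATES = {"completed", "failed"}


def poll_until_done(
    fetch_status: Callable[[], Dict[str, str]],
    interval_s: float = 10.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Call fetch_status every interval_s seconds until a terminal state.

    fetch_status is hypothetical: wrap your own MCP or REST call in it.
    Raises TimeoutError once timeout_s seconds of waiting have elapsed.
    """
    waited = 0.0
    while True:
        status = fetch_status()
        if status.get("status") in TERMINAL_STATES:
            return status
        if waited >= timeout_s:
            raise TimeoutError(
                f"job still {status.get('status')!r} after {waited:.0f}s"
            )
        sleep(interval_s)
        waited += interval_s


if __name__ == "__main__":
    # Stand-in responses so the sketch runs without a server.
    fake = iter([{"status": "running"}, {"status": "completed"}])
    result = poll_until_done(lambda: next(fake), sleep=lambda s: None)
    print(result["status"])
```

Passing sleep as a parameter keeps the loop easy to test without real waiting; the 10-second default mirrors the interval used in the Claude Code prompt.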
The built-in model manager lets you download and manage TTS models directly from the app. See model sizes, status, and switch between engines instantly.
Configure output folders, view app information, and manage your preferences. Everything you need to tailor MimikaStudio to your needs.
Qwen3 voice cloning supports 10 languages, and Mimika ships 8 custom-designed reference voices for those demos: Alistair, Anastasia, Beatrice, Eleanor, Harriet, Mikhail, Svetlana, and Yelena. Chatterbox extends multilingual cloning to 23 languages. Hebrew requires the Dicta model, which can be downloaded directly from the app.
Chatterbox brings multilingual voice cloning across 23 languages. Clone a voice in English and speak in Japanese, Hebrew, or any other supported language.
MimikaStudio runs natively on macOS using MLX-Audio, which is built on MLX, Apple's machine learning framework. Native Metal acceleration on M1, M2, M3, and M4 chips. Windows support coming soon.
Native Metal acceleration via Apple's MLX framework. Optimized neural inference on M1, M2, M3, and M4 chips.
The codebase supports Windows with CUDA. Pre-built Windows binaries will be available in a future release.
Access MimikaStudio from any browser. The same Flutter app runs as a web UI backed by the local API server.
Kokoro TTS generates speech in under 200 ms, fast enough for real-time, interactive use cases.
No cloud, no accounts, no data leaves your machine. All processing happens on-device with local storage.
Full command-line interface and MCP server for automation. Integrate into Claude Code, Codex, or any workflow.
Runs locally on macOS (Apple Silicon). No account needed. Download directly from GitHub releases. Windows support coming soon.
This is an early alpha version intended for testing and development. Features may be incomplete, unstable, or change significantly before the stable release. Please report any issues on GitHub.
As of April 1, 2026, the MimikaStudio DMG is not yet signed/notarized by Apple. If macOS blocks launch, you must remove the quarantine attribute and approve it via Gatekeeper.
# If installed to /Applications (system-wide):
xattr -d com.apple.quarantine /Applications/MimikaStudio.app
# If installed to ~/Applications (user-only):
xattr -d com.apple.quarantine ~/Applications/MimikaStudio.app
Why? macOS attaches a quarantine attribute to apps downloaded from the internet. For unsigned apps, Gatekeeper may block launch while that attribute is present. This command removes the quarantine flag.