v2026.03.5 macOS (Apple Silicon) · MLX + ONNX · Windows coming soon

Clone any voice in seconds

Local-first voice cloning, text-to-speech, PDF reader, and audiobook creator. Now with agentic MCP support for Codex and Claude Code. Optimized for Apple Silicon with native Metal acceleration via MLX and ONNX runtimes for multilingual TTS.

macOS (Apple Silicon) · MLX-Audio · Open Source

MimikaStudio

Create studio-quality AI voices locally, with zero cloud dependency

Instant Voice Cloning

Clone any voice from a 3-second sample. Qwen3-TTS learns timbre, pitch, and cadence while style prompts let you control emotion, pace, and tone.

PDF to Audiobook

Turn documents into full audiobooks with chapter markers and natural pacing. Follow along with sentence highlighting, or export hours of polished audio.

Video Voiceovers

Generate narration for YouTube, explainers, and ads. Switch between five TTS engines to find the perfect voice, then fine-tune with style controls.

Privacy-First Processing

Everything runs on your Mac with native Metal acceleration. No cloud uploads, no API limits, no subscriptions. Your voices stay on your device.

Voice Cloning

Hear Mimika in action

Listen to samples generated by each TTS engine. For voice cloning demos, compare the reference voice with the generated output.

🎧 Qwen3-TTS

Voice cloning from a 3-second sample. Compare the reference voice with the generated output.

Natasha Clone

Genesis4 Style

Qwen3-TTS · Voice Clone · 23 Languages · Style Prompt

Generated speech keeps the reference timbre while applying Genesis4 style control.

Reference
Generated
Qwen3-TTS
Suzan Clone

Genesis4 Style

Qwen3-TTS · Voice Clone · 23 Languages · Style Prompt

The generated clip follows the pitch contour of the reference, with a more polished, studio-like cadence.

Reference
Generated
Qwen3-TTS
Ryan

Genesis4 Preset Voice

Qwen3-TTS · Preset Voice · English · Long-form

Preset speaker profile with dependable pronunciation for general narration.

Qwen3-TTS
Natasha (Hebrew)

Multilingual Demo

Qwen3-TTS · Voice Clone · Hebrew · Cross-language

Cross-language cloning maps the same vocal identity into multilingual output.

Reference
Generated
Qwen3-TTS

💬 Chatterbox

Expressive voice cloning with emotion control. Natural, emotive speech synthesis.

Natasha Clone

Emotional Speech Demo

Chatterbox · Voice Clone · Emotion Control · Expressive

Generated output adds emotional dynamics while preserving speaker identity.

Reference
Generated
Chatterbox
Suzan Clone

Emotional Speech Demo

Chatterbox · Voice Clone · Emotion Control · Expressive

Emotion-aware rendering demonstrates richer contour and emphasis with natural prosody shifts.

Reference
Generated
Chatterbox

Five engines, one studio

Each engine brings unique strengths. Use the right one for your task, or combine them for maximum flexibility.

Kokoro TTS

82M-parameter model with sub-200ms latency. 21 British and American voices with speed control.

Fast · 21 voices · Local

Qwen3-TTS

Clone any voice from just 3 seconds of audio. 9 premium preset speakers with style instructions.

Voice Clone · 3s reference · Styles

Chatterbox

Multilingual voice cloning across 23 languages. Clone voices and speak in any supported language.

23 languages · Multilingual

Supertonic

ONNX-based multilingual synthesis with low-latency local generation across preset voices and multiple languages.

ONNX Runtime · Fast · Multilingual

CosyVoice3

Standalone ONNX-based expressive speech synthesis with preset voices, style control, and emotional delivery for narration.

Expressive · ONNX Runtime · 10 languages

5 TTS Engines · 23 Languages · 30+ Built-in Voices · 60+ API Endpoints

Supported Models

All models below appear in the in-app model manager with independent download and status tracking.

Kokoro-82M

Fast local TTS with British and American voice presets.

MLX · 82M · English

Qwen3-TTS

0.6B and 1.7B models in Base and CustomVoice variants, with optional 8-bit builds for a smaller footprint.

MLX · Voice Clone · 8-bit options

Chatterbox Multilingual

Expressive multilingual voice cloning with shared voice-library support.

MLX · 23 languages · Clone

Supertonic-2

Dedicated ONNX multilingual TTS engine for fast preset-voice synthesis.

ONNX Runtime · Preset voices · 5 languages

CosyVoice3 ONNX

Standalone CosyVoice3 model package (separate from Supertonic) with expressive preset voices.

ONNX Runtime · Independent model · 10 languages

🎤 Kokoro

Built-in voices with natural prosody. These samples are readings from the Book of Job.

Emma

Job 6:1-2 · British Female

Kokoro · Local · English · Narration

Balanced pacing and crisp articulation for long-form reading with book-like clarity.

Kokoro
George

Job 14:7 · British Male

Kokoro · Local · English · Narration

Lower-register delivery with stable rhythm suited for document narration and reports.

Kokoro
Lily

Job 42:5-6 · British Female

Kokoro · Local · English · Narration

Warm and expressive tone with subtle emphasis on key phrases for natural spoken prose.

Kokoro

📖 Long-Form Audiobooks

Full audiobook excerpts generated with Kokoro TTS from H.G. Wells' "A Short History of the World" (roughly 17 to 19 minutes each), rendered at 0.95x speed for natural pacing.

Emma

British Female · 17+ minutes

Kokoro · Audiobook · British RP · 0.95x Speed

Full audiobook narration with Emma's crisp British accent. Perfect for long-form document reading and study material.

Kokoro
George

British Male · 19+ minutes

Kokoro · Audiobook · British RP · 0.95x Speed

Rich male narration with George's authoritative tone. Ideal for history and non-fiction audiobook exports.

Kokoro

CosyVoice3

Emotionally expressive standalone ONNX generation with preset voice styles and instant local playback.

F1

Genesis 4 Preview

CosyVoice3 · Preset Voice · Expressive · ONNX Runtime

Female preset tuned for expressive scripture narration with stable pacing and natural emphasis.

CosyVoice3
M2

Genesis 4 Preview

CosyVoice3 · Preset Voice · Expressive · Low-register

Male preset with grounded tone and clear phrasing for audiobook-style read-aloud playback.

CosyVoice3

Supertonic

Fast multilingual ONNX synthesis. Instant playback from bundled pre-generated voices.

F1

Genesis 4 Preview

Supertonic · Local · English · ONNX Runtime

Clear high-register narration tuned for scripture-style reading with quick response time.

Supertonic
M2

Genesis 4 Preview

Supertonic · Local · English · Scripture

Lower-register biblical narration that keeps a grounded tone while preserving clear verse cadence.

Supertonic

Build one shared voice library for every clone engine

The Voice Prompts tab is the staging area for cloning. Upload or import a prompt once, preview it, tag it, and then reuse it across Qwen3 Clone, Chatterbox, and IndexTTS-2 without duplicating setup.

  • Shared voice library across clone engines
  • Search, filter, preview, edit, and delete prompts from one screen
  • Default voices plus imported external references in the same table
  • Clone-ready prompts available immediately in generation screens
Voice prompt library with reusable clone voices

Capture a clean 20-second preview from YouTube

MimikaStudio can convert a YouTube clip into a reusable prompt locally. Paste the URL, choose an optional start time, listen to the extracted preview, then save it into the shared voice library with transcript and metadata.

  • Paste a YouTube URL and set an optional start offset
  • Download a short preview before committing anything to the library
  • Add transcript, gender, and language metadata when saving
  • Use the saved prompt for cloning in Qwen3, Chatterbox, and IndexTTS-2
Import a voice prompt from a YouTube preview clip

21 British & American voices

Kokoro is the fastest engine in the studio. It generates speech in under 200ms, with fine-grained speed control and natural-sounding output for narration and dialogue.

  • Sub-200ms generation on MPS and CUDA
  • British & American voice selection
  • Adjustable speech speed and style per project
Kokoro TTS voice synthesis

Turn PDFs into audiobooks

Read PDFs aloud with sentence-by-sentence highlighting, or convert full PDFs into audiobooks with chapter markers. Audiobook generation uses Kokoro voices.

  • Live sentence highlighting as it reads
  • Export as WAV, MP3, or M4B with chapters
  • ~60 chars/sec on M2 MacBook Pro
  • Use any voice, including your cloned voices
PDF Reader & Audiobook Creator
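
As a rough planning aid, the throughput figure above can be turned into a time estimate for a whole document. The sketch below is not part of MimikaStudio: the function names and the 1,800 characters-per-page figure are illustrative assumptions, and actual speed will vary with hardware, engine, and voice.

```python
import math

# Approximate throughput quoted above for an M2 MacBook Pro.
CHARS_PER_SECOND = 60.0


def estimate_generation_seconds(char_count: int,
                                chars_per_second: float = CHARS_PER_SECOND) -> float:
    """Return the estimated synthesis time in seconds for `char_count` characters."""
    if char_count < 0:
        raise ValueError("char_count must be non-negative")
    if chars_per_second <= 0:
        raise ValueError("chars_per_second must be positive")
    return char_count / chars_per_second


def format_duration(seconds: float) -> str:
    """Format a duration as 'Hh MMm SSs', rounding up to the next whole second."""
    total = math.ceil(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


# Assumed example: a 300-page PDF at about 1,800 characters per page.
book_chars = 300 * 1800
print(format_duration(estimate_generation_seconds(book_chars)))  # 2h 30m 00s
```

Under these assumptions a 540,000-character book takes about two and a half hours to render, so long exports are best left running as a background job.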

Track every generation job in one place

The new Jobs tab shows all executed tasks across TTS, voice cloning, and audiobook exports. Review status instantly and replay outputs directly from the queue.

  • Unified history for TTS, clone, and audiobook jobs
  • Completed/in-progress status visibility in one queue
  • Inline playback controls for generated audio
Jobs queue with completed TTS and audiobook tasks

60+ endpoints. 50+ MCP tools. Full control.

Integrate MimikaStudio into your workflow with a comprehensive REST API and the Mimika MCP (Model Context Protocol) server. Use it programmatically from Claude Code, Codex, scripts, or your own applications.

  • Full REST API with Swagger docs
  • Mimika MCP server for Claude Code and Codex
  • Voice management, audiobook, and TTS endpoints

Example prompts once your client is connected to Mimika MCP at http://127.0.0.1:8010:

Codex: Use Mimika MCP tool audiobook_generate_from_file with file_path=/absolute/path/to/document.pdf, voice=bf_emma, output_format=mp3. Then poll audiobook_status until completed.

Claude Code: Call Mimika MCP audiobook_generate_from_file for /absolute/path/to/document.pdf with voice bf_emma and output_format mp3, then track audiobook_status every 10 seconds.
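
For scripted use, the same two-step flow (start the job, then poll) can be sketched in Python. MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request, so the sketch only builds those request bodies and shows a generic polling loop. The tool names and the `file_path`, `voice`, and `output_format` arguments come from the prompts above; the `job_id` argument, the plain-string status values, and the helper names are assumptions made for illustration, and the transport details are left to whichever MCP client you use.

```python
import json
import time
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need unique ids


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 body for an MCP `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def poll_until_completed(fetch_status, interval_seconds: float = 10.0,
                         max_attempts: int = 360, sleep=time.sleep) -> int:
    """Call `fetch_status()` until it returns "completed".

    `fetch_status` is a caller-supplied function (hypothetical here) that
    sends the audiobook_status request and returns the status as a string.
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        if fetch_status() == "completed":
            return attempt
        sleep(interval_seconds)
    raise TimeoutError(f"job not completed after {max_attempts} attempts")


generate_request = build_tool_call(
    "audiobook_generate_from_file",
    {
        "file_path": "/absolute/path/to/document.pdf",
        "voice": "bf_emma",
        "output_format": "mp3",
    },
)

# "job_id" is an assumed argument name; check the tool schema your client reports.
status_request = build_tool_call("audiobook_status", {"job_id": "<job id>"})

print(json.dumps(generate_request, indent=2))
```

Passing `sleep` and `fetch_status` in as parameters keeps the loop easy to test without a running server.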

MCP & API Dashboard

Download models with one click

The built-in model manager lets you download and manage TTS models directly from the app. See model sizes, status, and switch between engines instantly.

  • One-click model downloads
  • Automatic model detection & status
  • Choose model size: 0.6B (fast) or 1.7B (quality)
Model Manager

Customize your workflow

Configure output folders, view app information, and manage your preferences. Everything you need to tailor MimikaStudio to your needs.

  • Custom output folder configuration
  • Version info and credits
  • Links to documentation and support
Settings and About

Speak in 23 languages

Chatterbox brings multilingual voice cloning. Clone a voice in English and speak in Japanese, or any other supported language. Hebrew requires the Dicta model, which can be downloaded directly from the app.

English · German · Spanish · French · Italian · Japanese · Korean · Portuguese · Russian · Chinese · Arabic · Danish · Hindi · Dutch · Norwegian · Polish · Swedish · Turkish · Filipino · Malay · Swahili · Hebrew (Dicta model) · Finnish

Optimized for Apple Silicon

MimikaStudio runs natively on macOS with MLX-Audio, which is built on MLX, Apple's machine learning framework. It uses native Metal acceleration on M1, M2, M3, and M4 chips. Windows support is coming soon.

macOS + MLX-Audio

Native Metal acceleration via Apple's MLX framework. Optimized neural inference on M1, M2, M3, and M4 chips.

Windows Coming Soon

The codebase supports Windows with CUDA. Pre-built Windows binaries will be available in a future release.

Flutter Web UI

Access MimikaStudio from any browser. The same Flutter app runs as a web UI backed by the local API server.

Sub-200ms Latency

Kokoro TTS generates speech almost instantly. Real-time performance for interactive use cases.

Local & Private

No cloud, no accounts, no data leaves your machine. All processing happens on-device with local storage.

CLI & MCP Server

Full command-line interface and MCP server for automation. Integrate into Claude Code, Codex, or any workflow.

Simple Pricing

Start with a free trial, then upgrade to a lifetime license. No subscriptions, no recurring fees.

7 Days Full Access

Free Trial

$0

No credit card required

  • All TTS Engines
  • Voice Cloning
  • PDF Audiobook Creator
  • All Languages
Start Free Trial

Best Value

One-time Purchase

Pro License

$39.99

Lifetime access

  • Everything in Trial
  • Priority Support
  • Auto-updates via Sparkle
  • Commercial Use
Buy with Polar · Buy with LemonSqueezy

Start cloning voices today

Runs locally on macOS (Apple Silicon). No account needed. Windows support coming soon.

macOS (Apple Silicon) · MLX-Audio · Open Source

Direct files: DMG v2026.03.5 · SHA256
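
If you want to check the download against the published SHA256 value, a minimal Python sketch follows. It uses only the standard library; the function names and the file path in the usage note are placeholders, and the expected hash must be copied from the SHA256 link above.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, published_hex: str) -> bool:
    """Compare a file's digest with a published value, ignoring case and whitespace."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

Usage would look like `matches_published_checksum("MimikaStudio.dmg", "<published sha256>")`, where both arguments are placeholders. A `False` result means the file should be downloaded again before it is opened.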

Alpha Release

This is an early alpha version intended for testing and development. Features may be incomplete, unstable, or change significantly before the stable release. Please report any issues on GitHub.

Unsigned DMG Notice (Apple Gatekeeper)

As of February 19, 2026, the MimikaStudio DMG is not yet signed/notarized by Apple. If macOS blocks launch, you must remove the quarantine attribute and approve it via Gatekeeper.

  1. Open the DMG and drag MimikaStudio to /Applications.
  2. Remove the quarantine attribute by running one of these commands in Terminal:
    # If installed to /Applications (system-wide):
    xattr -d com.apple.quarantine /Applications/MimikaStudio.app
    
    # If installed to ~/Applications (user-only):
    xattr -d com.apple.quarantine ~/Applications/MimikaStudio.app

    Why? macOS quarantines all downloaded apps. For unsigned apps, Gatekeeper may block execution. This command removes the quarantine flag.
  3. In Applications, right-click MimikaStudio and choose Open.
  4. Click Open again in the warning dialog.
  5. If still blocked, go to System Settings -> Privacy & Security -> Open Anyway for MimikaStudio.
  6. On first launch, wait for the backend to start; a startup log screen is shown for a few seconds, which is expected.
  7. On first use, download the required model from the in-app model card.
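
To confirm whether step 2 is still needed, the quarantine flag can be inspected from a script. The sketch below is an illustration, not part of MimikaStudio: it shells out to the macOS `xattr` tool, where `xattr -p` prints an attribute and exits with a non-zero status when the attribute is absent. The helper names are made up for this example, and the `run` parameter exists only so the check can be exercised without macOS.

```python
import subprocess

QUARANTINE_ATTR = "com.apple.quarantine"


def is_quarantined(app_path: str, run=subprocess.run) -> bool:
    """Return True if `xattr -p` finds the quarantine attribute on the app bundle."""
    result = run(
        ["xattr", "-p", QUARANTINE_ATTR, str(app_path)],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def removal_command(app_path: str) -> list:
    """Return the argument list for the removal command shown in step 2."""
    return ["xattr", "-d", QUARANTINE_ATTR, str(app_path)]


if __name__ == "__main__":
    app = "/Applications/MimikaStudio.app"
    try:
        if is_quarantined(app):
            print("Still quarantined. Run:", " ".join(removal_command(app)))
        else:
            print("No quarantine attribute found on", app)
    except FileNotFoundError:
        # The xattr tool is missing, so this is probably not macOS.
        print("xattr was not found on this system")
```

The script only reports the state and prints the command; it leaves the removal itself to you, in keeping with the manual steps above.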