v2026.02 · macOS (Apple Silicon) · MLX Native · Windows coming soon

Clone any voice in seconds

Local-first voice cloning, text-to-speech, PDF reader, and audiobook creator. Optimized for Apple Silicon with native Metal acceleration via MLX.

macOS (Apple Silicon) · MLX-Audio · Open Source

MimikaStudio - PDF Reader and Audiobook Creator

Three engines, one studio

Each engine brings unique strengths. Use the right one for your task, or combine them for maximum flexibility.

Kokoro TTS

82M-parameter model with sub-200ms latency. 21 British and American voices with speed control.

Fast · 21 voices · Local

Qwen3-TTS

Clone any voice from just 3 seconds of audio. 9 premium preset speakers with style instructions.

Voice Clone · 3s reference · Styles

Chatterbox

Multilingual voice cloning across 23 languages. Clone voices and speak in any supported language.

23 languages · Multilingual

3 TTS Engines · 23 Languages · 30+ Built-in Voices · 60+ API Endpoints

Clone any voice from a 3-second sample

Upload a short audio clip and Qwen3-TTS will learn the voice characteristics. Use style instructions to control emotion, pace, and tone.

  • 3-second minimum reference audio
  • 9 premium preset speakers (Ryan, Aiden, Vivian, Serena...)
  • Style instructions: "whisper softly", "excited", "formal"
  • Shared voice library across all engines
Voice Cloning Interface

21 British & American voices

The fastest engine in the studio. Generate speech in under 200ms with fine-grained speed control and high naturalness for narration and dialogue.

  • Sub-200ms generation on MPS and CUDA
  • British & American voice selection
  • Adjustable speech speed and style per project
Kokoro TTS voice synthesis

Turn PDFs into audiobooks

Read PDFs aloud with sentence-by-sentence highlighting, or convert full PDFs into audiobooks with chapter markers. Audiobook generation uses Kokoro voices.

  • Live sentence highlighting as it reads
  • Export as WAV, MP3, or M4B with chapters
  • ~60 chars/sec on M2 MacBook Pro
  • Use any voice, including your cloned voices
PDF Reader & Audiobook Creator
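
As a rough guide to what ~60 chars/sec means for a full book, the arithmetic can be sketched as follows. The function names and the 500,000-character book length are illustrative assumptions. Only the 60 chars/sec figure comes from the benchmark above, and real throughput will likely vary with engine, voice, and hardware.

```python
def estimate_generation_seconds(char_count: int, chars_per_sec: float = 60.0) -> float:
    """Estimate audiobook generation time from text length.

    chars_per_sec defaults to the ~60 chars/sec figure quoted for an
    M2 MacBook Pro; treat it as an approximation, not a guarantee.
    """
    if char_count < 0:
        raise ValueError("char_count must be non-negative")
    if chars_per_sec <= 0:
        raise ValueError("chars_per_sec must be positive")
    return char_count / chars_per_sec


def format_duration(seconds: float) -> str:
    """Render a duration in seconds as 'Hh MMm', rounded to the minute."""
    total_minutes = round(seconds / 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes:02d}m"


# Hypothetical 500,000-character book (an illustrative length only).
book_chars = 500_000
print(format_duration(estimate_generation_seconds(book_chars)))  # prints 2h 19m
```

At that rate, a book of this size would take a little over two hours to render, so long conversions are best left running in the background.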

60+ endpoints. 50+ MCP tools. Full control.

Integrate MimikaStudio into your workflow with a comprehensive REST API and Model Context Protocol server. Use it programmatically from Claude Code, scripts, or your own applications.

  • Full REST API with Swagger docs
  • MCP server for Claude Code integration
  • Voice management, audiobook, and TTS endpoints
MCP & API Dashboard
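
A script-side call to the local REST API might look roughly like the sketch below. The base URL, endpoint path, and JSON field names are hypothetical placeholders, not the documented API. The Swagger docs in the app are the place to find the real routes and parameters. The sketch only builds the request, so it runs without a server.

```python
import json
import urllib.request

# ASSUMPTION: the base URL, path, and field names below are hypothetical
# placeholders; check the app's Swagger docs for the real ones.
BASE_URL = "http://localhost:8000"
TTS_PATH = "/api/tts/generate"


def build_tts_request(text: str, voice: str, speed: float = 1.0,
                      base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for speech synthesis."""
    if not text.strip():
        raise ValueError("text must not be empty")
    payload = json.dumps({"text": text, "voice": voice, "speed": speed})
    return urllib.request.Request(
        url=base_url.rstrip("/") + TTS_PATH,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "example_voice" is a made-up voice identifier for illustration.
req = build_tts_request("Hello from MimikaStudio.", voice="example_voice")
print(req.get_method(), req.full_url)
```

With the local server running and the real route substituted in, the request could be sent with `urllib.request.urlopen(req)`.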

Download models with one click

The built-in model manager lets you download and manage TTS models directly from the app. See model sizes, status, and switch between engines instantly.

  • One-click model downloads
  • Automatic model detection & status
  • Choose model size: 0.6B (fast) or 1.7B (quality)
Model Manager

Customize your workflow

Configure output folders, view app information, and manage your preferences to tailor MimikaStudio to your workflow.

  • Custom output folder configuration
  • Version info and credits
  • Links to documentation and support
Settings and About

Speak in 23 languages

Chatterbox brings multilingual voice cloning. Clone a voice in English and speak in Japanese, or any other supported language. Hebrew requires the Dicta model, which can be downloaded directly from the app.

English, German, Spanish, French, Italian, Japanese, Korean, Portuguese, Russian, Chinese, Arabic, Danish, Hindi, Dutch, Norwegian, Polish, Swedish, Turkish, Filipino, Malay, Swahili, Hebrew (Dicta model), Finnish

Optimized for Apple Silicon

MimikaStudio runs natively on macOS with MLX-Audio, an audio library built on MLX, Apple's machine learning framework, with native Metal acceleration on M1, M2, M3, and M4 chips. Windows support coming soon.

macOS + MLX-Audio

Native Metal acceleration via Apple's MLX framework. Optimized neural inference on M1, M2, M3, and M4 chips.

Windows Coming Soon

The codebase supports Windows with CUDA. Pre-built Windows binaries will be available in a future release.

Flutter Web UI

Access MimikaStudio from any browser. The same Flutter app runs as a web UI backed by the local API server.

Sub-200ms Latency

Kokoro TTS generates speech in under 200ms, fast enough for real-time, interactive use cases.

Local & Private

No cloud, no accounts, no data leaves your machine. All processing happens on-device with local storage.

CLI & MCP Server

Full command-line interface and MCP server for automation. Integrate into Claude Code or any workflow.

Simple Pricing

Start with a free trial, then upgrade to a lifetime license. No subscriptions, no recurring fees.

7 Days Full Access

Free Trial

$0

No credit card required

  • All TTS Engines
  • Voice Cloning
  • PDF Audiobook Creator
  • All Languages
Start Free Trial

Best Value · One-time Purchase

Pro License

$39.99

Lifetime access

  • Everything in Trial
  • Priority Support
  • Auto-updates via Sparkle
  • Commercial Use
Buy with Polar · Buy with LemonSqueezy

Start cloning voices today

Runs locally on macOS (Apple Silicon). No account needed. Windows support coming soon.
