New macOS · Windows · Web · CUDA · Apple MPS

Clone any voice in seconds

Local-first voice cloning, text-to-speech, PDF reader, and audiobook creator. Runs on macOS (MPS), Windows (CUDA), and Web. Tested on RTX 4090 & 5090.

macOS (MPS) · Windows (CUDA) · Web UI · Free & Open Source

Supported TTS & Voice Cloning Engines

Kokoro

Qwen3-TTS

Chatterbox

IndexTTS-2

TTS Engines

Four engines, one studio

Each engine brings unique strengths. Use the right one for your task, or combine them for maximum flexibility.

⚡

Kokoro TTS

An 82M-parameter model with sub-200ms latency. 21 British and American voices with speed control.

Fast · 21 voices · IPA

🎧

Qwen3-TTS

Clone any voice from just 3 seconds of audio. 9 premium preset speakers with style instructions.

Voice Clone · 3s reference · Styles

🌎

Chatterbox

Multilingual voice cloning across 23 languages. Clone voices and speak in any supported language.

23 languages · Multilingual

🔮

IndexTTS-2

High-fidelity voice cloning with a large 24GB model for maximum quality and naturalness.

Hi-Fi · Large model

4 TTS Engines

23 Languages

30+ Built-in Voices

60+ API Endpoints

Voice Cloning

Clone any voice from a 3-second sample

Upload a short audio clip and Qwen3-TTS will learn the voice characteristics. Use style instructions to control emotion, pace, and tone.

3-second minimum reference audio
9 premium preset speakers (Ryan, Aiden, Vivian, Serena...)
Style instructions: "whisper softly", "excited", "formal"
Shared voice library across all engines

Kokoro TTS

21 voices with British IPA transcription

The fastest engine in the studio. Generate speech in under 200ms with fine-grained speed control. Includes Emma IPA for phonetic transcription powered by your choice of LLM.

Sub-200ms generation on MPS and CUDA
British & American voice selection
British phonetic (IPA) transcription via Claude or GPT

PDF Reader & Audiobooks

Turn any document into an audiobook

Read PDFs aloud with sentence-by-sentence highlighting, or convert entire documents to audiobooks with chapter markers. Supports PDF, EPUB, TXT, Markdown, and DOCX.

Live sentence highlighting as it reads
Export as WAV, MP3, or M4B with chapters
~60 chars/sec on M2 MacBook Pro
Use any voice, including your cloned voices
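
As a rough planning aid, the ~60 chars/sec figure above can be turned into an estimate of how long a conversion will take. The sketch below assumes throughput stays constant across the whole document, which real runs may not do. The function names and the default rate are illustrative and are not part of MimikaStudio.

```python
def estimate_seconds(num_chars: int, chars_per_sec: float = 60.0) -> float:
    """Estimate synthesis time, assuming a constant throughput."""
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    if chars_per_sec <= 0:
        raise ValueError("chars_per_sec must be positive")
    return num_chars / chars_per_sec


def format_duration(seconds: float) -> str:
    """Render a number of seconds as 'Hh MMm SSs'."""
    total = int(round(seconds))
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


# A 300,000-character manuscript at 60 chars/sec needs 5,000 seconds.
print(format_duration(estimate_seconds(300_000)))  # 1h 23m 20s
```

Different hardware or a different engine will change the rate, so pass a measured `chars_per_sec` where one is available.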

MCP & API

60+ endpoints. 50+ MCP tools. Full control.

Integrate MimikaStudio into your workflow with a comprehensive REST API and Model Context Protocol server. Use it programmatically from Claude Code, scripts, or your own applications.

Full REST API with Swagger docs
MCP server for Claude Code integration
Voice management, audiobook, and TTS endpoints
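
To give a sense of what calling a local REST API from a script can look like, here is a minimal sketch that builds a JSON POST request with Python's standard library. The host, port, path, and field names are all placeholders chosen for illustration, not MimikaStudio's documented API. The Swagger docs are the place to look up the real endpoints and parameters.

```python
import json
import urllib.request

# Assumptions: host, port, path, and field names below are hypothetical
# placeholders, not MimikaStudio's documented API.
BASE_URL = "http://localhost:8000"
TTS_PATH = "/api/tts/generate"


def build_tts_request(text: str, voice: str, speed: float = 1.0) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for speech synthesis."""
    payload = json.dumps({"text": text, "voice": voice, "speed": speed}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + TTS_PATH,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_tts_request("Hello from a script.", voice="example_voice")
print(req.get_method(), req.full_url)

# To send it against a running local server:
# with urllib.request.urlopen(req) as resp:
#     audio_bytes = resp.read()
```

Building the request separately from sending it keeps the example runnable without a server and makes the payload easy to inspect.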

Model Manager

Download models with one click

The built-in model manager lets you download and manage TTS models directly from the app. See model sizes, status, and switch between engines instantly.

One-click model downloads
Automatic model detection & status
Choose model size: 0.6B (fast) or 1.7B (quality)
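
When choosing between the two sizes, a back-of-the-envelope weight footprint can help. The sketch below multiplies parameter count by bytes per parameter. It assumes 16-bit weights (2 bytes each), which is an assumption about precision and not a published figure, and it ignores activations and runtime overhead. The function name is illustrative.

```python
def weight_footprint_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate size of the model weights in gigabytes (10**9 bytes)."""
    if num_params < 0 or bytes_per_param <= 0:
        raise ValueError("num_params must be >= 0 and bytes_per_param > 0")
    return num_params * bytes_per_param / 1e9


# The two sizes named above, at an assumed 2 bytes per parameter.
for label, params in [("0.6B", 600_000_000), ("1.7B", 1_700_000_000)]:
    print(f"{label}: ~{weight_footprint_gb(params):.1f} GB of weights")
```

Actual download sizes and memory use depend on the checkpoint format and precision, so the model manager's reported sizes take precedence over this estimate.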

Languages

Speak in 23 languages

Chatterbox brings multilingual voice cloning. Clone a voice in English and speak in Japanese, or any other supported language.

English German Spanish French Italian Japanese Korean Portuguese Russian Chinese Arabic Danish Hindi Dutch Norwegian Polish Swedish Turkish Filipino Malay Swahili Hebrew Finnish

Built for Performance

Runs on macOS, Windows & Web

MimikaStudio runs natively on macOS with MPS acceleration, on Windows with NVIDIA CUDA (tested on RTX 4090 & 5090), and in any browser via Flutter Web.

macOS + Apple MPS

Metal Performance Shaders acceleration for M1, M2, M3, and M4. Optimized neural inference on Apple Silicon and Intel.

Windows + NVIDIA CUDA

Full CUDA support on Windows. Tested on RTX 4090 and RTX 5090 for maximum inference throughput.
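
For readers curious how a backend might choose between CUDA, MPS, and CPU, the following is a minimal sketch of a common device-selection pattern. It assumes the inference stack is PyTorch, which the page does not state, and the function names are illustrative and do not come from MimikaStudio's code.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple MPS, then fall back to CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch if it is installed; otherwise report CPU."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # torch.backends.mps is absent in older PyTorch releases.
    mps = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps is not None and mps.is_available())
    return pick_device(torch.cuda.is_available(), mps_ok)


print(detect_device())
```

Keeping the preference order in a pure function makes it easy to test without any GPU present.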

Flutter Web UI

Access MimikaStudio from any browser. The same Flutter app runs as a web UI backed by the local API server.

Sub-200ms Latency

Kokoro TTS generates speech almost instantly. Real-time performance for interactive use cases.

Local & Private

No cloud, no accounts, no data leaves your machine. All processing happens on-device with local storage.

CLI & MCP Server

Full command-line interface and MCP server for automation. Integrate into Claude Code or any workflow.

Start cloning voices today

Free and open source. Runs locally on macOS, Windows, and Web. No account needed.

macOS (MPS) · Windows (CUDA) · Web · Tested on RTX 4090 & 5090
