New macOS · Windows · Web · CUDA · Apple MPS

Clone any voice in seconds

Local-first voice cloning, text-to-speech, PDF reader, and audiobook creator. Runs on macOS (MPS), Windows (CUDA), and Web. Tested on RTX 4090 & 5090.

macOS (MPS) · Windows (CUDA) · Web UI · Free & Open Source

MimikaStudio - Voice Cloning & TTS

Four engines, one studio

Each engine brings unique strengths. Use the right one for your task, or combine them for maximum flexibility.

Kokoro TTS

82M parameter model with sub-200ms latency. 21 British and American voices with speed control.

Fast · 21 voices · IPA

Qwen3-TTS

Clone any voice from just 3 seconds of audio. 9 premium preset speakers with style instructions.

Voice Clone · 3s reference · Styles

Chatterbox

Multilingual voice cloning across 23 languages. Clone voices and speak in any supported language.

23 languages · Multilingual

IndexTTS-2

High-fidelity voice cloning with a large 24GB model for maximum quality and naturalness.

Hi-Fi · Large model

4 TTS Engines · 23 Languages · 30+ Built-in Voices · 60+ API Endpoints

Clone any voice from a 3-second sample

Upload a short audio clip and Qwen3-TTS will learn the voice characteristics. Use style instructions to control emotion, pace, and tone.

  • 3-second minimum reference audio
  • 9 premium preset speakers (Ryan, Aiden, Vivian, Serena...)
  • Style instructions: "whisper softly", "excited", "formal"
  • Shared voice library across all engines
[Image: Voice Cloning Interface]

21 voices with British IPA transcription

The fastest engine in the studio. Generate speech in under 200ms with fine-grained speed control. Includes Emma IPA for phonetic transcription powered by your choice of LLM.

  • Sub-200ms generation on MPS and CUDA
  • British & American voice selection
  • British phonetic (IPA) transcription via Claude or GPT
[Image: Kokoro TTS with IPA Transcription]

Turn any document into an audiobook

Read PDFs aloud with sentence-by-sentence highlighting, or convert entire documents to audiobooks with chapter markers. Supports PDF, EPUB, TXT, Markdown, and DOCX.

  • Live sentence highlighting as it reads
  • Export as WAV, MP3, or M4B with chapters
  • ~60 chars/sec on M2 MacBook Pro
  • Use any voice, including your cloned voices
[Image: PDF Reader & Audiobook Creator]
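To give a feel for what the ~60 chars/sec figure means in practice, here is a rough back-of-the-envelope sketch. The function names and the 500,000-character book length are illustrative assumptions, not part of MimikaStudio, and real throughput will vary with the engine, voice, and hardware.

```python
def estimate_conversion_seconds(num_chars: int, chars_per_sec: float = 60.0) -> float:
    """Estimate conversion time from a character count and a throughput figure.

    The default of 60 chars/sec is the approximate M2 MacBook Pro figure
    quoted on this page; treat it as a ballpark, not a guarantee.
    """
    if chars_per_sec <= 0:
        raise ValueError("chars_per_sec must be positive")
    return num_chars / chars_per_sec


def format_duration(seconds: float) -> str:
    """Render a duration in seconds as 'Hh MMm SSs'."""
    total = round(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


# Illustrative input: a 500,000-character manuscript (an assumed length).
print(format_duration(estimate_conversion_seconds(500_000)))  # 2h 18m 53s
```

At that rate, a long book is a job of a couple of hours rather than a few minutes, so it is worth starting the export and letting it run in the background.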

60+ endpoints. 50+ MCP tools. Full control.

Integrate MimikaStudio into your workflow with a comprehensive REST API and Model Context Protocol server. Use it programmatically from Claude Code, scripts, or your own applications.

  • Full REST API with Swagger docs
  • MCP server for Claude Code integration
  • Voice management, audiobook, and TTS endpoints
[Image: MCP & API Dashboard]
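As a rough idea of what calling a local REST API from a script can look like, the sketch below builds a JSON POST request with Python's standard library. The base URL, route, field names, and voice name are all hypothetical placeholders, since this page does not list them; check the Swagger docs served by your local instance for the real ones.

```python
import json
import urllib.request

# Hypothetical values: this page does not state the server address or routes.
BASE_URL = "http://localhost:8000"
TTS_PATH = "/api/tts/generate"


def build_tts_request(text: str, voice: str, speed: float = 1.0) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for speech generation.

    The payload keys 'text', 'voice', and 'speed' are assumptions made
    for illustration only.
    """
    payload = {"text": text, "voice": voice, "speed": speed}
    return urllib.request.Request(
        BASE_URL + TTS_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str, voice: str, speed: float = 1.0) -> bytes:
    """Send the request and return the raw response body (assumed to be audio)."""
    with urllib.request.urlopen(build_tts_request(text, voice, speed)) as resp:
        return resp.read()
```

Keeping request construction separate from sending makes the payload easy to inspect or log before anything touches the server. With the server running, usage would be along the lines of `audio = synthesize("Hello there", "example_voice")`.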

Download models with one click

The built-in model manager lets you download and manage TTS models directly from the app. See model sizes, status, and switch between engines instantly.

  • One-click model downloads
  • Automatic model detection & status
  • Choose model size: 0.6B (fast) or 1.7B (quality)
[Image: Model Manager]

Speak in 23 languages

Chatterbox brings multilingual voice cloning. Clone a voice in English and speak in Japanese, or any other supported language.

English · German · Spanish · French · Italian · Japanese · Korean · Portuguese · Russian · Chinese · Arabic · Danish · Hindi · Dutch · Norwegian · Polish · Swedish · Turkish · Filipino · Malay · Swahili · Hebrew · Finnish

Runs on macOS, Windows & Web

MimikaStudio runs natively on macOS with MPS acceleration, on Windows with NVIDIA CUDA (tested on RTX 4090 & 5090), and in any browser via Flutter Web.
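For readers curious how an application typically chooses between CUDA, MPS, and CPU at startup, here is a minimal sketch. It assumes a PyTorch-based backend, which this page does not confirm, and the function names are illustrative rather than MimikaStudio's own.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple MPS, then fall back to CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Probe PyTorch for available accelerators (assumes PyTorch is the backend).

    Returns 'cpu' if PyTorch is not installed.
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    # torch.backends.mps is absent on older PyTorch builds, so probe defensively.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = mps_backend is not None and mps_backend.is_available()
    return pick_device(torch.cuda.is_available(), mps_ok)


print(detect_device())
```

Separating the preference order from the hardware probe keeps the decision logic testable on machines that have neither accelerator.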

macOS + Apple MPS

Metal Performance Shaders acceleration for M1, M2, M3, and M4. Optimized neural inference on Apple Silicon and Intel.

Windows + NVIDIA CUDA

Full CUDA support on Windows. Tested on RTX 4090 and RTX 5090 for maximum inference throughput.

Flutter Web UI

Access MimikaStudio from any browser. The same Flutter app runs as a web UI backed by the local API server.

Sub-200ms Latency

Kokoro TTS generates speech almost instantly. Real-time performance for interactive use cases.

Local & Private

No cloud, no accounts, no data leaves your machine. All processing happens on-device with local storage.

CLI & MCP Server

Full command-line interface and MCP server for automation. Integrate into Claude Code or any workflow.

Start cloning voices today

Free, open source, runs locally on macOS, Windows, and Web. No account needed.
