Mimic 3



Advanced Settings
More about SSML

Mimic 3 can process some Speech Synthesis Markup Language (SSML) tags in the input text.

Examples:
<break time="500ms" /> Insert pause
<prosody volume="50%">...</prosody> Change volume
<prosody rate="200%">...</prosody> Change speaking rate
<voice name="en_US/vctk_low#p239">...</voice> Change voice
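The tags listed above can be combined in a single piece of input text. The following is a minimal sketch, not a definitive reference: it assumes the tags are wrapped in a root <speak> element, as the SSML standard specifies, and the sample sentences and the helper name list_tags are made up for illustration. The Python code only checks that the markup is well-formed XML before it is pasted into the text box; it does not call Mimic 3.

```python
import xml.etree.ElementTree as ET
from typing import List

# Sample SSML using only the tags shown in the examples above.
# The <speak> root element and the sentences are assumptions for illustration.
SSML = """<speak>
  Hello.
  <break time="500ms" />
  <prosody volume="50%">This part has a different volume.</prosody>
  <prosody rate="200%">This part has a different speaking rate.</prosody>
  <voice name="en_US/vctk_low#p239">This part uses a different voice.</voice>
</speak>"""


def list_tags(ssml: str) -> List[str]:
    """Parse the SSML as XML and return its tag names in document order.

    Raises xml.etree.ElementTree.ParseError if the markup is not well-formed.
    """
    root = ET.fromstring(ssml)
    return [element.tag for element in root.iter()]


tags = list_tags(SSML)
print(tags)
```

A parse error here usually means an unclosed tag or an unquoted attribute value, which is worth fixing before sending the text for synthesis.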
More about speaking rate

Controls how fast the voice speaks the text. A value of 1 matches the speaking speed of the training dataset. Values below 1 are faster, and values above 1 are slower.
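To make the direction of this setting concrete, here is a rough sketch. It assumes the setting acts as a simple multiplier on the length of the spoken audio, which the text above suggests but does not state outright; the function name scaled_duration is hypothetical and is not part of Mimic 3.

```python
def scaled_duration(base_seconds: float, speaking_rate: float) -> float:
    """Estimate audio length for a given speaking rate setting.

    Assumption: the setting scales duration linearly, so 1 leaves the
    training-set speed unchanged, values below 1 shorten the audio
    (faster speech), and values above 1 lengthen it (slower speech).
    """
    if speaking_rate <= 0:
        raise ValueError("speaking_rate must be greater than 0")
    return base_seconds * speaking_rate


# A sentence that takes 10 seconds at the default setting of 1:
faster = scaled_duration(10.0, 0.5)   # shorter audio, faster speech
slower = scaled_duration(10.0, 2.0)   # longer audio, slower speech
print(faster, slower)
```

Note that this setting runs in the opposite direction to the SSML rate attribute, where a larger percentage is used to change the speaking rate.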

More about audio volatility

The amount of noise added to the generated audio (0-1). Can help mask audio artifacts from the voice model. Multi-speaker models tend to sound better with less noise than single-speaker models.

More about phoneme volatility

The amount of noise used to generate phoneme durations (0-1). Allows for a variable speaking cadence, with a value closer to 1 being more variable. Multi-speaker models tend to sound better with less phoneme variability than single-speaker models.

About the Beta

This website hosts a beta version of Mimic 3, Mycroft's newest text-to-speech system, developed for the Mark II. When released, Mimic 3 will be available to run locally on Linux systems like the Raspberry Pi 4.

We are interested in hearing your feedback, especially on the non-English language voices! We hope to improve the quality and accuracy of every voice over time 😀

Some notes on the performance of Mimic 3 and this website:



Privacy: this website does not store the text you send or the audio that is synthesized.