Mimic 3
Advanced Settings
More about SSML
Processes a subset of Speech Synthesis Markup Language (SSML) tags in the text above.
Examples:
<break time="500ms" /> | Insert pause
<prosody volume="50%">...</prosody> | Change volume
<prosody rate="200%">...</prosody> | Change speaking rate
<voice name="en_US/vctk_low#p239">...</voice> | Change voice
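The tags listed above can be combined in a single input. The following is a minimal sketch, not taken from the Mimic 3 documentation: it assumes the tags are wrapped in the standard SSML `<speak>` root element, and it uses Python's built-in XML parser only to check that the markup is well-formed before it is pasted into the text box. The sample sentences are placeholders.

```python
import xml.etree.ElementTree as ET

# Assumption: the input is wrapped in the standard SSML <speak> root element.
# The tag attributes below are copied from the examples table.
ssml = (
    "<speak>"
    "Hello."
    '<break time="500ms" />'
    '<prosody volume="50%">This part has a different volume.</prosody>'
    '<prosody rate="200%">This part has a different speaking rate.</prosody>'
    '<voice name="en_US/vctk_low#p239">This part uses another voice.</voice>'
    "</speak>"
)

# Parsing raises xml.etree.ElementTree.ParseError if a tag is left unclosed.
root = ET.fromstring(ssml)

# Collect the tag names in document order as a quick sanity check.
tags = [child.tag for child in root]
print(tags)
```

Checking well-formedness locally is only a convenience; it does not confirm which tags or attribute values the synthesizer actually supports.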
More about speaking rate
Controls how fast the voice speaks the text. A value of 1 matches the speed of the training dataset. Values below 1 make speech faster, and values above 1 make it slower.
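To make the direction of the setting concrete, here is a rough sketch. It assumes the value acts as a simple multiplier on the length of the spoken audio, which the description above implies but does not state. The function name `estimated_duration` and the 10-second baseline are hypothetical and exist only for illustration.

```python
def estimated_duration(base_seconds: float, rate_value: float) -> float:
    """Estimate audio length for a given speaking rate value.

    Assumption: the setting scales duration linearly, so 1 keeps the
    training-dataset speed, values below 1 shorten the audio (faster
    speech), and values above 1 lengthen it (slower speech).
    """
    if rate_value <= 0:
        raise ValueError("rate_value must be greater than 0")
    return base_seconds * rate_value


# Hypothetical utterance that takes 10 seconds at the default value of 1.
for value in (0.5, 1.0, 2.0):
    print(value, estimated_duration(10.0, value))
```

Under this assumption, halving the value halves the playback time, which is the opposite direction from a percentage-based rate such as the SSML `rate` attribute.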
More about audio volatility
The amount of noise added to the generated audio (0-1). Can help mask audio artifacts from the voice model. Multi-speaker models tend to sound better with a lower amount of noise than single-speaker models.
More about phoneme volatility
The amount of noise used to generate phoneme durations (0-1). Allows for a variable speaking cadence, with values closer to 1 producing more variation. Multi-speaker models tend to sound better with a lower amount of phoneme variability than single-speaker models.
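The ranges described above can be checked before a request is sent. This is a minimal sketch with hypothetical names (`VoiceSettings`, `validate`); it is not part of Mimic 3. The 0-1 ranges come from the descriptions above, while the default values and the requirement that the speaking rate be positive are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VoiceSettings:
    """Hypothetical container for the three advanced settings."""

    speaking_rate: float = 1.0       # 1 is the training-dataset speed
    audio_volatility: float = 0.5    # placeholder default, not Mimic 3's
    phoneme_volatility: float = 0.5  # placeholder default, not Mimic 3's

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the values fit."""
        problems = []
        # Assumption: a speaking rate of zero or less is not meaningful.
        if self.speaking_rate <= 0:
            problems.append("speaking_rate must be greater than 0")
        # Both volatility settings are documented as ranging from 0 to 1.
        if not 0.0 <= self.audio_volatility <= 1.0:
            problems.append("audio_volatility must be between 0 and 1")
        if not 0.0 <= self.phoneme_volatility <= 1.0:
            problems.append("phoneme_volatility must be between 0 and 1")
        return problems


print(VoiceSettings().validate())
print(VoiceSettings(audio_volatility=1.5).validate())
```

Returning a list of problems rather than raising on the first one lets a caller report every out-of-range value at once.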
About the Beta
This website hosts a beta version of Mimic 3, Mycroft's newest text-to-speech system, developed for the Mark II. When released, Mimic 3 will be available to run locally on Linux systems like the Raspberry Pi 4.
We are interested in hearing your feedback, especially on the non-English language voices! We hope to improve the quality and accuracy of every voice over time 😀
Some notes on the performance of Mimic 3 and this website:
- Mimic 3 is running without any GPUs (CPU only)
- This website is shared among all beta reviewers
- Caching is disabled, so each request is synthesized fresh
Privacy: this website does not store the text you send or the audio that is synthesized.