- Voice stability
- Voice similarity
- Speaking rate
1. Voice stability
Voice stability controls how monotone vs. expressive the voice sounds.
- More to the right: steadier, more formal, more consistent.
- More to the left: more emotional, friendlier, more dynamic.
2. Voice similarity
Voice similarity is a fine-tuning control for stability and closeness to the chosen base voice. If you want an emotional voice to keep that character more consistently, or stay closer to the original reference voice, move this slider further right.
3. Speaking rate
Speaking rate depends heavily on the selected voice.
- 1.0 is often a good baseline.
- Some voices are naturally slower or faster, so tune accordingly.
Recommended workflow
- Pick your voice first.
- Start with moderate default values.
- Change only one slider per test.
- Listen, compare, then continue.
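
If you script your tests rather than moving sliders by hand, the one-slider-per-test rule can be sketched roughly as below. This is only an illustration: the `VoiceSettings` class, its field names, and the 0.0 to 1.0 range for stability and similarity are assumptions, not the settings of any particular tool. Only the 1.0 speaking-rate baseline comes from the guide above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VoiceSettings:
    # Hypothetical fields; the 0.0-1.0 scale for the first two is assumed.
    stability: float = 0.5   # assumed moderate default
    similarity: float = 0.5  # assumed moderate default
    rate: float = 1.0        # baseline suggested in the guide


def one_at_a_time(baseline, candidates):
    """Yield (slider_name, settings) pairs that differ from the
    baseline in exactly one slider."""
    for name, values in candidates.items():
        for value in values:
            if value != getattr(baseline, name):
                yield name, replace(baseline, **{name: value})


baseline = VoiceSettings()
plan = list(one_at_a_time(baseline, {
    "stability": [0.3, 0.7],
    "rate": [0.9, 1.1],
}))

for name, settings in plan:
    print(f"changed {name}: {settings}")
```

Each entry in `plan` changes a single value relative to the baseline, so any difference you hear can be traced back to that one slider.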

