TTS Talking Speed (Words per minute)

Does anyone know where I could find information on the TTS models’ talking speeds in words per minute? Thank you!

Hey there!

So, you’re probably not going to find an official words-per-minute figure, because text is typically processed (streamed) in “chunks”: speech is returned after a certain amount of text has been processed. Doing it word by word would slow everything down to the point where it’s unusable.

What you could do is process the text, then speed up the resulting audio file by whatever factor you need.
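
As a rough sketch of that post-processing idea, the helpers below work out the speed factor needed to move from one speaking rate to another and build a matching ffmpeg audio filter string. The function names are made up for illustration, and the code assumes ffmpeg's `atempo` filter with each instance kept between 0.5 and 2.0 (the range older ffmpeg builds accept), so bigger changes are expressed as a chain.

```python
def speed_factor(baseline_wpm: float, target_wpm: float) -> float:
    """Playback-speed multiplier that turns baseline_wpm into target_wpm."""
    if baseline_wpm <= 0 or target_wpm <= 0:
        raise ValueError("rates must be positive")
    return target_wpm / baseline_wpm


def atempo_filter(factor: float) -> str:
    """Build an ffmpeg audio filter string for the given speed factor.

    Assumption: each atempo instance stays within 0.5..2.0, so factors
    outside that range are split into several chained instances.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    steps = []
    remaining = factor
    while remaining > 2.0:   # peel off 2x steps for large speed-ups
        steps.append(2.0)
        remaining /= 2.0
    while remaining < 0.5:   # peel off 0.5x steps for large slow-downs
        steps.append(0.5)
        remaining /= 0.5
    steps.append(remaining)
    return ",".join(f"atempo={s:g}" for s in steps)
```

The resulting string would go into a command along the lines of `ffmpeg -i in.mp3 -filter:a "atempo=1.5" out.mp3`, where the file names are placeholders.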


Interesting question!

I don’t think there is any official information about this that is publicly available.

It would be interesting to test: grab some standard bit of text, generate speech files for all the voices, and see if there is any appreciable difference.

So… I did just that.

Using two standard text passages, the “Grandfather Passage” and the “Rainbow Passage”, I tested all of the available voices at the default speed.

Voice    Passage      Time (min)  WPM
alloy    grandfather  0.7432      177.6103
alloy    rainbow      1.8536      177.4924
echo     grandfather  0.7388      178.6681
echo     rainbow      1.8508      177.7610
fable    grandfather  0.7452      177.1337
fable    rainbow      1.8716      175.7854
nova     grandfather  0.7384      178.7649
nova     rainbow      1.8424      178.5714
onyx     grandfather  0.7392      178.5714
onyx     rainbow      1.8524      177.6074
shimmer  grandfather  0.7440      177.4194
shimmer  rainbow      1.8660      176.3130

So, the answer seems to be that the voices are all pegged at about 178 WPM at the default speed, which is super fast.
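
For anyone who wants to repeat this kind of measurement, here is a minimal sketch of the arithmetic, not the exact script behind the numbers above. It assumes the generated speech has been saved as an uncompressed WAV file, counts words by splitting on whitespace, and uses helper names that are purely illustrative.

```python
import wave


def wav_duration_minutes(path: str) -> float:
    """Length of a WAV file in minutes (frame count / sample rate / 60)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate() / 60.0


def count_words(text: str) -> int:
    """Word count by whitespace splitting (an assumption; other
    counting rules would shift the result slightly)."""
    return len(text.split())


def words_per_minute(word_count: int, minutes: float) -> float:
    """Speaking rate: words divided by duration in minutes."""
    if minutes <= 0:
        raise ValueError("duration must be positive")
    return word_count / minutes
```

The table's figures are consistent with a time column in minutes: `words_per_minute(132, 0.7432)` comes out at roughly 177.61, matching the first row if the Grandfather Passage is counted as 132 words.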
