Getting metallic voice at slower speeds on speech API


First, I would like to thank OpenAI’s team for releasing the speech API. The quality is very good and the price is reasonable.

That said, when I choose a lower speed such as 0.7, the voices take on a metallic sound. I tried both normal and HD quality.

Then I tried a speed of 0.99, and it still sounded considerably different from 1.0, which leads me to think the effect is probably related to how the audio is post-processed.
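For anyone who wants to reproduce the comparison, here is a minimal sketch that holds everything constant except `speed`. It assumes the official `openai` Python package and a `client` created with `openai.OpenAI()`; the helper names `build_requests` and `synthesize`, the `tts-1` model and the `alloy` voice are just illustrative choices, not something from the original post.

```python
def build_requests(text, speeds=(0.7, 0.99, 1.0), model="tts-1", voice="alloy"):
    """Return one request dict per speed, with everything else held constant."""
    requests = []
    for speed in speeds:
        # The documented range for `speed` is 0.25 to 4.0 (default 1.0).
        if not 0.25 <= speed <= 4.0:
            raise ValueError(f"speed {speed} is outside 0.25-4.0")
        requests.append({"model": model, "voice": voice, "input": text, "speed": speed})
    return requests


def synthesize(client, request, path):
    """Send one request and save the audio.

    Assumption: `client` is an openai.OpenAI() instance with an API key configured.
    """
    response = client.audio.speech.create(**request)
    with open(path, "wb") as f:
        f.write(response.read())
```

Calling `synthesize(client, req, f"speed_{req['speed']}.mp3")` for each `req` in `build_requests("same sentence")` should give three files that differ only in speed, which makes the 0.99 vs 1.0 difference easy to hear side by side.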

Other TTS engines usually don’t have this effect: the quality goes down, but the pitch is maintained.

It is bearable, but I thought it was important to report so that it gets on the radar.

Again, thanks for all the new features! They will keep me busy for some time lol

I get the same metallic-sounding voice when raising the speed above 1 as well.

It’s pretty clear they are just performing pitch-shifting and time-stretching on the AI-generated voice.
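To illustrate what pitch-preserving time stretching looks like in general, here is a rough sketch of naive overlap-add (OLA) stretching using only the standard library. This is not OpenAI's actual pipeline, which isn't public; the function names and the 8 kHz test tone are made up for the example. The idea is that frames are read from the input at one hop size and written to the output at another, so the duration changes while each frame keeps its original pitch. Because neighbouring frames are overlapped without aligning their phases, this kind of processing is a plausible source of a hollow or metallic sound.

```python
import math


def sine(freq_hz, seconds, rate=8000):
    """Generate a mono sine wave as a list of floats in [-1, 1]."""
    n = int(seconds * rate)
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


def hann(n):
    """Periodic Hann window; copies spaced n/2 apart sum to 1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def ola_stretch(samples, speed, frame=256):
    """Naive overlap-add time stretch (speed < 1 slows down, > 1 speeds up)."""
    synth_hop = frame // 2                            # output frame spacing
    analysis_hop = max(1, round(synth_hop * speed))   # input frame spacing
    n_frames = (len(samples) - frame) // analysis_hop + 1
    if n_frames < 1:
        return []
    window = hann(frame)
    out = [0.0] * (synth_hop * (n_frames - 1) + frame)
    for k in range(n_frames):
        a = k * analysis_hop   # where this frame is read from
        s = k * synth_hop      # where this frame is written to
        for i in range(frame):
            # Frames are summed with no phase alignment, which is
            # where the audible artifacts come from.
            out[s + i] += samples[a + i] * window[i]
    return out
```

For example, `ola_stretch(sine(220.0, 1.0), 0.7)` returns a signal roughly 1/0.7 times as long as the input. Real implementations usually add phase alignment (WSOLA or a phase vocoder) to reduce the artifacts, but they rarely remove them completely.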

Python as easy as pip install for voice: