I was testing the new TTS models of GPT-4o Mini on the OpenAI FM platform and noticed that every time I pressed the play button, I got a different result with the same prompt and text. I checked the API docs but couldn’t find a temperature setting for these models.
My question is: how can I get the same result with the same text and prompt?
That is not a feature even when using the standard realtime audio model.
Some quirk of how the gpt-4o-based audio models work will make them go batty with garbage noise at low or zero temperature. They need the higher temperature, perhaps for pattern-breaking.
But just in case you were wondering: the latest API SDK validates its parameters and won’t let an unknown one like temperature be sent.
TypeError: Speech.create() got an unexpected keyword argument 'temperature'
If you make the request yourself, you can send just about any made-up parameter you want and it is silently dropped; sending temperature makes no audible difference.
The speed parameter also has no effect. It looks like they didn’t implement whatever post-processing sped up tts-1.
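For anyone who wants to repeat the raw-request experiment, here is a minimal sketch using only the Python standard library. The endpoint URL, the model name gpt-4o-mini-tts, the alloy voice, and the instructions field are my assumptions about the request shape (the thread does not spell them out), and build_speech_request and send are hypothetical helper names. Any extra keyword, such as temperature, is simply added to the JSON body.

```python
import json
import urllib.request

# Assumed endpoint for the speech API; not stated in the thread.
API_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(api_key, text, instructions, **extra):
    """Build (but do not send) a POST request for the speech endpoint.

    Extra keyword arguments are merged into the JSON body unchanged, so
    made-up fields like temperature=0 go out on the wire.
    """
    payload = {
        "model": "gpt-4o-mini-tts",  # assumed model name
        "voice": "alloy",            # assumed voice
        "input": text,
        "instructions": instructions,
    }
    payload.update(extra)
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, out_path):
    """Send the request and write the returned audio bytes to out_path."""
    with urllib.request.urlopen(request) as response:
        with open(out_path, "wb") as f:
            f.write(response.read())
```

To try it, build a request with build_speech_request(key, "Hello there.", "Speak calmly.", temperature=0) and pass it to send(request, "speech.mp3"). Going by the behavior described above, the extra field should be ignored rather than rejected, and repeated runs should still sound different.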
You’re saying there is no way to get the same result with the same text and the same prompt?
There’s no way to get the same result each time.
You get a fresh, creative telling each time, instead of being stuck with one unacceptable result.
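If what you really need is identical audio on every replay, rather than determinism from the model itself, one workaround is to generate once and reuse the file. This is a minimal sketch of that idea, not something the API offers: cached_speech is a hypothetical helper, and synthesize stands in for whatever function you use to call the API and return audio bytes.

```python
import hashlib
from pathlib import Path


def cached_speech(text, instructions, synthesize, cache_dir):
    """Return stored audio for (instructions, text), generating it only once.

    synthesize is a caller-supplied function (hypothetical here) that takes
    (text, instructions) and returns audio bytes.
    """
    # Hash both inputs so any change to text or prompt yields a new take.
    key = hashlib.sha256(
        f"{instructions}\x00{text}".encode("utf-8")
    ).hexdigest()
    path = Path(cache_dir) / f"{key}.mp3"

    if path.exists():
        return path.read_bytes()  # same bytes as the first generation

    audio = synthesize(text, instructions)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(audio)
    return audio
```

This also lets you regenerate until you get a take you like and then keep it, since deleting the cached file forces a new generation.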