TTS input contains a question, and the returned audio answers it

I'm using TTS via the API (gpt-4o-mini-tts), and my input text contains a question for the user. Most of the time, though not always, the model answers the question instead of just reading the text aloud.

The input was:
“Le los traje de la tienda.” Say if it is correct, what it means, and fix it if needed.

The output (transcribed) was:
Le los traje de la tienda. It is a correct sentence in Spanish. It means I brought them from the store. However, if you want to be extremely clear, you could say se los traje de la tienda. That version makes it even clearer that los is the direct object and de la tienda indicates where they were brought from.

I have tried giving it instructions such as "just read the text" and "never answer any questions", but it doesn't change anything.

I have been able to reproduce the same behavior in the Playground, and switching models doesn't help either.

Does anybody have any ideas?

bug

I just tried it, and this bug seems to be tied to gpt-4o-mini-tts-2025-12-15. It responds with all sorts of reactions, as if it were answering the text instead of speaking it.

The previous snapshot works correctly, though, so for now you can use gpt-4o-mini-tts-2025-03-20 as a workaround.
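
Here is a minimal sketch of pinning the older snapshot in code, assuming the official `openai` Python package is installed and `OPENAI_API_KEY` is set in the environment. The helper names, the `alloy` voice, and the output path are illustrative choices, not something from this thread.

```python
# Sketch only: pin the TTS request to the snapshot reported as working above.
# PINNED_MODEL comes from this thread; everything else is an assumption.
PINNED_MODEL = "gpt-4o-mini-tts-2025-03-20"


def build_speech_request(text, voice="alloy"):
    """Return the keyword arguments for a speech request (hypothetical helper)."""
    return {"model": PINNED_MODEL, "voice": voice, "input": text}


def synthesize(text, out_path="speech.mp3"):
    """Send the text to the TTS endpoint and write the audio to out_path."""
    # Imported lazily so the request builder can be used without the SDK.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    request = build_speech_request(text)
    with client.audio.speech.with_streaming_response.create(**request) as response:
        response.stream_to_file(out_path)
    return out_path
```

Calling `synthesize("...")` with the text from the original post should then use the pinned snapshot rather than whatever the `gpt-4o-mini-tts` alias currently points to.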

Sorry, I thought I had tried all of the models. Thanks, that works! Any advice on how to surface this so that it gets fixed?

Thanks again

There is no tracked feedback channel that I know of, but you can try filing a support ticket.

Sometimes staff does read the posts though, if we are lucky.
