TTS models returning blank audio and repetitions

fran.aubry · April 6, 2025, 9:51pm

I’ve been using gpt-4o-mini-tts to build a tool that reads documents aloud for me. However, the output is sometimes much longer than expected: it contains the input text several times, separated by long stretches of silence. For example, submitting a text like “Once upon a time there was a cat.” can generate a long audio like “Once upon a time there was a cat[BLANK for 20 seconds]Once upon a time there was a cat[BLANK for 17 seconds]Once upon a time there was a cat”.

Is anyone else experiencing the same problem? If so, is there a way around this?

I doubt this is useful since it’s pretty standard, but here’s the code making the request:

    const mp3 = await openai.audio.speech.create({
      model: "gpt-4o-mini-tts",
      voice,
      input: text,
      instructions,
    });

Thank you all in advance.

aprendendo.next · April 6, 2025, 10:23pm

Yeah, it just happened to me too. After a little less than 1k input characters the audio stopped, and I thought something had gone wrong and the output had been truncated, but then it continued after a very long period of silence.

I also noticed the audio tokens charged were a bit high. I don’t know if it was charging for the silence, and I didn’t have time to reproduce what happened, as it was in a compiled app that didn’t save logs. It charged about 15k audio tokens for about 2k tokens of text.
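For anyone who wants to detect these gaps automatically before playing or saving a result, here is a minimal sketch of a silence check. It is not a fix for the underlying problem, only a way to flag a suspicious response so it can be retried. It assumes the audio was requested with `response_format: "pcm"` (raw 24 kHz, 16-bit signed, little-endian samples, as far as I can tell from the docs) and that the bytes have already been converted to an `Int16Array`. The function name `findLongSilences` and the threshold values are hypothetical and would need tuning.

```javascript
// Assumption: samples come from a "pcm" response, i.e. raw 24 kHz,
// 16-bit signed mono samples already loaded into an Int16Array.
const SAMPLE_RATE = 24000;

// Hypothetical helper: returns [{ startSec, durationSec }] for every run of
// near-silent samples lasting at least minSilenceSec.
// `threshold` is an amplitude cutoff (0..32767) chosen arbitrarily here.
function findLongSilences(samples, minSilenceSec = 5, threshold = 200, sampleRate = SAMPLE_RATE) {
  const minRun = Math.floor(minSilenceSec * sampleRate);
  const gaps = [];
  let runStart = -1; // index where the current quiet run began, or -1 if none

  // Iterate one step past the end so a trailing quiet run is also closed.
  for (let i = 0; i <= samples.length; i++) {
    const quiet = i < samples.length && Math.abs(samples[i]) < threshold;
    if (quiet) {
      if (runStart < 0) runStart = i;
    } else if (runStart >= 0) {
      const runLength = i - runStart;
      if (runLength >= minRun) {
        gaps.push({ startSec: runStart / sampleRate, durationSec: runLength / sampleRate });
      }
      runStart = -1;
    }
  }
  return gaps;
}

// Synthetic check: 1 s of loud signal, 6 s of silence, 1 s of loud signal.
const demo = new Int16Array(8 * SAMPLE_RATE);
for (let i = 0; i < demo.length; i++) {
  const sec = i / SAMPLE_RATE;
  if (sec < 1 || sec >= 7) demo[i] = 10000;
}
const demoGaps = findLongSilences(demo);
console.log(demoGaps); // one gap: startSec 1, durationSec 6
```

If the returned list is non-empty, the request could be retried or the silent ranges trimmed out. Repeated text would still need a separate check, for example comparing the total duration against what the input length would plausibly produce.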
