Truncated sentences with the TTS API: last sentence missing in generated audio

Hello Forum,

I use TTS for an AI voice chat (a “Game Master”). Occasionally I notice that the last sentence is truncated: only the first part of the text ends up in the generated audio. Strangely, it always seems to clip at the last sentence.

E.g. for “I’m Jack from Roll-Bonz, but I don’t care about your food cravings — I sell tough vehicles, not snacks. The BonsAI is a robust mechanical beast with a stainless steel frame and no electronics. Why waste time on this nonsense?”,

I get a 120k blob (.opus/webm) without “Why waste time on this nonsense?”.

If I repeat exactly the same request, I get the complete audio.

Is this a known issue?

I use the api.openai.com/v1/audio/speech endpoint.

This is from my logs:
Text[227]:‘I’m Jack from Roll-Bonz, but I don’t care about your food cravings — I sell tough vehicles, not snacks. The BonsAI is …ronics. Why waste time on this nonsense?’ Blob[123765]
Text[227]:‘I’m Jack from Roll-Bonz, but I don’t care about your food cravings — I sell tough vehicles, not snacks. The BonsAI is …ronics. Why waste time on this nonsense?’ Blob[139670]

Thanks for your help!
Best regards, Jo

Hi and welcome to the community!

Yes, I observed the same behavior in recent tests. The output stops after the first part of the message, which matches your case.

This most likely happens because the model behind the default alias was updated to a newer version two days ago. If you explicitly select the previous snapshot, ‘gpt-4o-mini-tts-2025-03-20’, the behavior should revert to the earlier state.
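
In case it helps, here is a minimal sketch of pinning that snapshot in a request to the speech endpoint. It uses only the Python standard library. The voice ("alloy"), the response format ("opus"), the OPENAI_API_KEY environment variable, and the helper names build_speech_request and synthesize are assumptions for illustration, not taken from your setup, so adjust them to match what your app actually sends.

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/audio/speech"
# Snapshot named in this thread; pinning it avoids following the default alias.
PINNED_MODEL = "gpt-4o-mini-tts-2025-03-20"


def build_speech_request(text, voice="alloy", response_format="opus"):
    """Build the JSON body for a speech request that pins the snapshot.

    The voice and response_format defaults are illustrative assumptions.
    """
    return {
        "model": PINNED_MODEL,
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }


def synthesize(text, api_key=None):
    """POST the request and return the raw audio bytes."""
    # Assumption: the key is stored in the OPENAI_API_KEY environment variable.
    api_key = api_key or os.environ["OPENAI_API_KEY"]
    body = json.dumps(build_speech_request(text)).encode("utf-8")
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Calling synthesize("...") with your text returns the audio bytes, so you can log the length next to the text as you already do and compare results between the pinned snapshot and the default alias.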

Hi,

Thanks. So I will wait and hope it was only a temporary issue. Yes, the new voice release sounds better; however, the previous voices were able to ‘roll the R’ for a German/Bavarian/Austrian accent. Now it sounds more German/Palatin…

Best regards, Jo
