I’ve been using gpt-4o-mini-tts to build a tool that reads documents aloud for me. However, sometimes the output is much longer than expected: it contains the input text several times, separated by long stretches of silent audio. For example, submitting a text like “Once upon a time there was a cat.” generates a long audio clip like “Once upon a time there was a cat[BLANK for 20 seconds]Once upon a time there was a cat[BLANK for 17 seconds]Once upon a time there was a cat”.
Is anyone else experiencing the same problem? If so, is there a way around this?
I doubt this is useful since it’s pretty standard, but here’s the code making the request:
const mp3 = await openai.audio.speech.create({
  model: "gpt-4o-mini-tts",
  voice,
  input: text,
  instructions,
});
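In case it helps with reproducing or catching this, here is a rough sketch of how the bad outputs could be flagged: estimate how long the speech should be from the word count, then compare that with the actual audio duration. The helper name `looksTooLong`, the 150 words-per-minute rate, and the tolerance values are all my own assumptions, not anything from the OpenAI SDK, and measuring the duration of the returned audio is left out.

```javascript
// Hypothetical helper, not part of the OpenAI SDK.
// Flags audio that runs far longer than the text should take to read aloud.
function looksTooLong(text, durationSeconds, options = {}) {
  const wordsPerMinute = options.wordsPerMinute ?? 150; // assumed speaking rate
  const tolerance = options.tolerance ?? 2;             // assumed allowed ratio over the estimate
  const slackSeconds = options.slackSeconds ?? 2;       // assumed fixed padding for short inputs

  // Count whitespace-separated words; an empty string counts as zero words.
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  const expectedSeconds = (words / wordsPerMinute) * 60;

  return durationSeconds > expectedSeconds * tolerance + slackSeconds;
}
```

With a check like this, a request whose audio comes back flagged could be retried instead of played. The thresholds would probably need tuning per voice and per set of instructions.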
Thank you all in advance.