I need the OpenAI TTS implementation to return the generated audio all at once, when it is complete. At the moment, it seems to be streaming in. Is there any way to modify the API call to achieve this?

While the HTTPS response is “chunked” (if you dig into the protocol), that’s a pretty standard way to receive files and, depending on the implementation, to resume them from a given position.

In fact, OpenAI might use a method that could be parsed as a buffered stream, but what is sent for every format except AAC and RAW is a file with file headers.
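To illustrate what "files with file headers" means in practice, here is a rough sketch that looks at the first bytes of a fully received body and guesses the container from well-known signatures. The function name `sniff_container` is made up for this example, and whether the endpoint emits exactly these signatures for each format is an assumption I have not verified against real responses.

```python
def sniff_container(data: bytes) -> str:
    """Guess the audio container from leading magic bytes (hypothetical helper)."""
    # WAV: a RIFF chunk whose form type is "WAVE".
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    # FLAC streams begin with the marker "fLaC".
    if data[:4] == b"fLaC":
        return "flac"
    # Ogg pages (the usual wrapper for Opus) begin with "OggS".
    if data[:4] == b"OggS":
        return "ogg"
    # MP3 files often begin with an ID3v2 tag.
    if data[:3] == b"ID3":
        return "mp3"
    # MP3 without a tag: 11-bit frame sync plus a non-zero layer field.
    # ADTS AAC frames also start with 0xFFF but carry layer bits 00,
    # so they fall through to "unknown" here.
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        if (data[1] >> 1) & 0x03 != 0:
            return "mp3"
    return "unknown"
```

A headerless format such as raw PCM has nothing to recognise, so it would come back as "unknown" from this check.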

Perhaps you can clarify your concern with the endpoint. You don’t have to use the audio until it’s done being received if you don’t want to.
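If the goal is simply "hand me the audio once it is all there", one option is to buffer the chunks yourself and return a single bytes object at the end. Below is a minimal sketch using only the Python standard library. The helper names `join_chunks` and `fetch_speech` are made up for this example, and the `tts-1` model, `alloy` voice, and `mp3` format are example values I am assuming, not something taken from your code.

```python
import json
import urllib.request
from typing import Iterable


def join_chunks(chunks: Iterable[bytes]) -> bytes:
    """Buffer every received chunk and return the audio as one bytes object."""
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(chunk)
    return bytes(buffer)


def fetch_speech(text: str, api_key: str, model: str = "tts-1",
                 voice: str = "alloy", response_format: str = "mp3",
                 chunk_size: int = 4096) -> bytes:
    """Request speech and return only after the whole body has arrived.

    Hypothetical helper: the default model, voice, and format are assumed
    example values.
    """
    payload = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    request = urllib.request.Request(
        "https://api.openai.com/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        # read(chunk_size) returns b"" at end of body, which stops the iterator.
        return join_chunks(iter(lambda: response.read(chunk_size), b""))


# Example call (needs a real key and network access):
# audio = fetch_speech("Hello there", api_key="sk-...")
# with open("speech.mp3", "wb") as f:
#     f.write(audio)
```

Nothing downstream sees partial audio with this approach, since the function does not return until the last chunk has been read. The trade-off is that playback cannot start until the whole download has finished.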