Stream text to speech in chunks, but my chunks are coming too fast

acro · February 21, 2024, 9:43pm

I'm using openai.chat.completions with stream: true to get a streamed response from the chat. It outputs chunks, which I send to my textToSpeech function. This function does create audio, but the chunks are all jumbled.

In Python I built the same solution, but I used something like this to handle the chunks:

        if (text_chunk := chunk.choices[0].delta.content) is not None:
            yield text_chunk
            tempResponse.append(text_chunk)
    
    response = ' '.join(word for word in tempResponse if word)
    response = re.sub(r'\s+', ' ', response)

and then I took one chunk at a time and generated the audio.

But when I'm using stream in JavaScript, all the audio chunks seem to come at once.

How can I make them line up so that they play in order, one chunk at a time?

async function chat(response) {
  const completion = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: response },
    ],
    stream: true,
  });

  for await (const chunk of completion) {
    console.log(chunk.choices[0].delta.content);
    textToSpeech(chunk.choices[0].delta.content);
  }
}

// text to speech
async function textToSpeech(text) {
  if (!text) {
    console.error("Error: Text is undefined or empty.");
    return;
  }
  try {
    const response = await axios.post(
      "https://api.openai.com/v1/audio/speech",
      {
        model: "tts-1",
        input: text,
        voice: "alloy",
      },
      {
        headers: {
          Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
          "Content-Type": "application/json",
        },
        responseType: "blob", // Specify the response type as a Blob
      }
    );

    // Create a Blob object from the response data
    const blob = new Blob([response.data], { type: "audio/mpeg" });

    // Create a temporary URL for the Blob object
    const url = window.URL.createObjectURL(blob);

    // Create an audio element and set its source to the Blob URL
    const audio = new Audio(url);

    // Play the audio
    audio.play();

    // Cleanup: Revoke the temporary URL when audio playback ends
    audio.addEventListener("ended", () => {
      window.URL.revokeObjectURL(url);
    });

    console.log("Speech generated successfully.");
  } catch (error) {
    console.error("Error generating speech:", error);
  }
}

michael.simpson555 · February 22, 2024, 12:50am

Try adding a catch, and give the chunks an order so that the first plays first and the last plays last, instead of all of them at once. It just needs some fine-tuning and then it should be good.
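
One way to read the suggestion above, as a rough sketch rather than a tested fix: in the question's code, textToSpeech is never awaited, and it resolves as soon as audio.play() is called, so every chunk starts playing at nearly the same time. The sketch below collects the streamed deltas into whole sentences and speaks them one after another. The helper names (createSentenceBuffer, createSerialQueue, playToEnd, speakStream) are made up for illustration and are not part of the OpenAI SDK, and the sentence splitting is deliberately naive.

```javascript
// Hypothetical helper: collects streamed text deltas and returns any
// complete sentences (text ending in . ! or ? followed by whitespace).
function createSentenceBuffer() {
  let buffer = "";
  return {
    push(delta) {
      if (!delta) return []; // the first and last stream chunks may carry no text
      buffer += delta;
      const sentences = [];
      let match;
      while ((match = buffer.match(/^[\s\S]*?[.!?]+\s+/)) !== null) {
        sentences.push(match[0].trim());
        buffer = buffer.slice(match[0].length);
      }
      return sentences;
    },
    // Returns whatever text is left once the stream has ended.
    flush() {
      const rest = buffer.trim();
      buffer = "";
      return rest ? [rest] : [];
    },
  };
}

// Hypothetical helper: runs async tasks strictly one after another,
// in the order they were enqueued.
function createSerialQueue() {
  let tail = Promise.resolve();
  return {
    enqueue(task) {
      const run = tail.then(task);
      tail = run.catch(() => {}); // a failed task must not block later ones
      return run;
    },
    idle() {
      return tail;
    },
  };
}

// Hypothetical helper: resolves only when the audio element has finished
// playing, so the queue knows when it may start the next sentence.
function playToEnd(audio) {
  return new Promise((resolve, reject) => {
    audio.addEventListener("ended", resolve, { once: true });
    audio.addEventListener("error", reject, { once: true });
    Promise.resolve(audio.play()).catch(reject);
  });
}

// Reads the chat stream and speaks it sentence by sentence, in order.
// `speak` is assumed to return a promise that resolves when playback ends.
async function speakStream(completion, speak) {
  const sentences = createSentenceBuffer();
  const queue = createSerialQueue();
  for await (const chunk of completion) {
    const delta = chunk.choices[0]?.delta?.content;
    for (const sentence of sentences.push(delta)) {
      queue.enqueue(() => speak(sentence));
    }
  }
  for (const sentence of sentences.flush()) {
    queue.enqueue(() => speak(sentence));
  }
  await queue.idle();
}
```

To try this with the code from the question, textToSpeech would need to end with `await playToEnd(audio)` instead of a bare `audio.play()`, and chat would call `await speakStream(completion, textToSpeech)` in place of its for await loop. Sending whole sentences rather than single deltas also means far fewer requests to the speech endpoint, and each clip should sound less choppy than one made from a word fragment.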
