Implementing a 'Stop Generating' Function for OpenAI Streams

Hi everyone,

I’d like to revisit how to terminate OpenAI API streaming calls on demand, so we don’t pay for excess token generation. This topic was brought up previously (for some reason I can’t include links here, but the path is “/t/chatgpts-stop-generating-function-how-to-implement/235121/11”), yet a clear code solution is still needed.

The Challenge

When stream: true is passed to openai.chat.completions.create, the API keeps generating tokens until the stream closes on its own. How can we introduce a mechanism to stop generation on demand?
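As background on the JavaScript side of this: breaking out of a `for await` loop calls the iterator's `return()` method, which runs any `finally` block in the producing generator — that's the hook a streaming client can use to tear down its connection. Here's a minimal, SDK-free sketch of that mechanism (all names are illustrative):

```javascript
// Breaking out of `for await` invokes the iterator's return() method,
// which runs the generator's `finally` block -- the hook a streaming
// SDK can use to close the underlying HTTP connection.
async function* tokenStream(onClose) {
  try {
    for (let i = 0; i < 1000; i++) {
      yield `token-${i}`;
    }
  } finally {
    onClose(); // runs even when the consumer breaks early
  }
}

async function main() {
  let closed = false;
  let count = 0;
  for await (const token of tokenStream(() => { closed = true; })) {
    count++;
    if (count >= 5) break; // "stop generating" on demand
  }
  return { closed, count };
}

main().then(({ closed, count }) => {
  console.log(`cleanup ran: ${closed}, tokens consumed: ${count}`);
});
```

So the real question is whether the OpenAI SDK's stream object actually uses that cleanup hook to abort the HTTP request.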

Here’s the code provided in the API reference, which doesn’t show how to stop a stream:

import OpenAI from "openai";

const openai = new OpenAI();

async function main() {
  const completion = await openai.chat.completions.create({
    model: "gpt-4-1106-preview",
    messages: [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
    ],
    stream: true,
  });

  for await (const chunk of completion) {
    console.log(chunk.choices[0].delta.content);
  }
}

main();

I don’t think a simple “break” inside the for await loop is enough to close the connection, right?
Could you provide a code example demonstrating one or more solutions?
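For concreteness, here's the shape of the solution I have in mind (an untested sketch, not a verified implementation — `consumeWithStop` is my own hypothetical helper, and I'm assuming the object returned by a streaming create call exposes an AbortController as `stream.controller`, which I believe the Node SDK README hints at; please verify against your SDK version):

```javascript
// Hypothetical helper: consume a streamed completion, but stop after
// `maxChunks` chunks. Assumes `stream` is async-iterable and exposes an
// AbortController as `controller` (an assumption to verify against the
// openai package you're using).
async function consumeWithStop(stream, maxChunks) {
  let text = "";
  let count = 0;
  for await (const chunk of stream) {
    text += chunk.choices[0].delta.content ?? ""; // final chunks may have no content
    count += 1;
    if (count >= maxChunks) {
      stream.controller.abort(); // explicitly cancel the HTTP request
      break;                     // breaking alone may also trigger cleanup
    }
  }
  return text;
}

// Intended usage with the real SDK (untested):
//   const completion = await openai.chat.completions.create({ ...request, stream: true });
//   const text = await consumeWithStop(completion, 50);
```

Another option worth checking is whether the SDK accepts an AbortSignal via request options, so an external controller could cancel the call from outside the loop.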