ChatGPT's "Stop Generating" function - how to implement?

I need to implement the same functionality as ChatGPT’s “Stop Generating” button. I am using a Completion API call.
How can I request this via the API? The stop sequences in the initial API call are not what I am looking for.
Thanks, any help would be greatly appreciated.


Why doesn’t a stop sequence do what you need?

Can you use streaming and simply close the stream?


Thanks for looking at this.

There is no predetermined condition for stopping generation, i.e. stop sequences won’t apply. The reason to stop generating is that the user realises the answer is poor, due to a poorly formulated prompt.
I.e. I need exactly what “Stop generating” does in ChatGPT.

I am using streaming, but I don’t know what ‘close the stream’ means or how to implement it.

Your further advice would be greatly appreciated, thanks.


Sorry, without knowing a lot more about your code I can’t help much. Are you getting the stream in a callback? If the callback is running in a separate thread (most likely) and you are using Python, you can probably just call sys.exit(0) from that thread.
Python threads other than the main one just quietly terminate when you do that, because sys.exit() only raises SystemExit in the calling thread.
Don’t forget to import sys.

That’s what I do to force termination of streaming output; others may have other thoughts or criticisms of that approach.

You need to use the streaming API (which uses server-sent events).
Then you can reset/close/abort the connection when you believe you’ve gotten enough text and want to stop it.
There is no way to implement “stop generating” on top of the batch (non-streaming) API.
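
For what it's worth, a minimal sketch of what "closing the stream" could look like in JavaScript. It assumes the response body is a web ReadableStream of server-sent events, that each `data:` line carries JSON with the text in `choices[0].text`, and that the stream ends with `data: [DONE]`. The names `parseSseBuffer`, `streamCompletion`, `onText` and `shouldStop` are made up for illustration and are not part of any library.

```javascript
// Extract text fragments from the complete SSE lines in `buffer`.
// Returns the fragments plus any trailing partial line to keep for later.
function parseSseBuffer(buffer) {
  const lines = buffer.split("\n");
  const rest = lines.pop(); // last element is an incomplete line (or "")
  const texts = [];
  let done = false;
  for (const line of lines) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("data:")) continue; // skip blank lines and comments
    const payload = trimmed.slice(5).trim();
    if (payload === "[DONE]") {
      done = true;
      break;
    }
    const event = JSON.parse(payload);
    texts.push(event.choices[0].text); // assumed Completion-style payload
  }
  return { texts, rest, done };
}

// Read a streamed body and stop as soon as shouldStop() returns true.
async function streamCompletion(body, onText, shouldStop) {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    if (shouldStop()) {
      await reader.cancel(); // closes the stream; nothing further is read
      return "stopped";
    }
    const { value, done } = await reader.read();
    if (done) return "finished";
    buffer += decoder.decode(value, { stream: true });
    const parsed = parseSseBuffer(buffer);
    buffer = parsed.rest;
    parsed.texts.forEach(onText);
    if (parsed.done) return "finished";
  }
}
```

In a browser you would probably pass `response.body` from a fetch call as `body`, and have `shouldStop` return a flag that your "Stop generating" button sets.
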
There are a number of resources on the web about how to use the streaming API; for example this NodeJS module (which I haven’t used; no idea if it’s any good, it’s just a Google hit):


I generate an instance of AbortController with new AbortController(). This controller possesses a signal attribute, which represents an instance of AbortSignal. During the creation of the fetch request to the server, this signal is introduced as part of the fetch options, thus linking the fetch request with the abort controller.

In the application code I have a button that calls abort() on the controller; it only appears while a response is being streamed/generated. This action sends an abort signal to the fetch request, thereby initiating its immediate termination. When the fetch request gets aborted, its promise rejects with an error whose name attribute is 'AbortError'. This specific error can be trapped in a catch block, where it can be handled as required.

Also, I have tried everything server side as well for the abort, but OpenAI still sends the whole response no matter what. The AbortController method works really well for stopping generation client side.


@Humboldt Do you have any source code that you don’t mind sharing? I implemented it as you did, but it does not work. I mean, when I call abortController.abort(), controller.signal.aborted changes to “false”, but the stream does not stop.

Hi, sorry, I am very bad at explaining things, but here are the key parts of the source code I’m using to this day:

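let controller;
const abortBtn = document.getElementById('abortBtn');

// Stop button: abort the in-flight request, if there is one.
abortBtn.addEventListener('click', function () {
  if (controller) {
    controller.abort();
  }
});

// submitButton is defined elsewhere in the page script.
submitButton.addEventListener("click", async function () {
  // ...
  // Create a fresh controller for every request.
  controller = new AbortController();
  const signal = controller.signal;

  fetch("", {
    // ...
    signal, // links this request to the controller
  })
  .then(response => {
    // ...
    // processStream is a hypothetical stand-in for the elided stream-reading code
    return processStream(response)
      .catch(error => {
        if (error.name === 'AbortError') {
          console.log('Fetch aborted by user');
          // handle the abort
        } else {
          throw error;
        }
      });
  })
  .catch((error) => {
    if (error.name === 'AbortError') {
      console.log('Fetch aborted');
    } else {
      // handle other errors
    }
  });
});

  • An instance of AbortController is created (controller = new AbortController()). This instance has a signal property, which is passed to the fetch request.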

  • If the abort() method is called on the controller (controller.abort()), it sets the aborted flag on the signal to true. This happens when the user clicks the abort button.

  • Inside the fetch promise chain, you can catch an error with the name ‘AbortError’. The promise rejects with this error if the fetch request was aborted.

  • When ‘AbortError’ is caught, you can perform any cleanup necessary and let the user know the request was cancelled.

  • If you try to abort a fetch request, it stops the processing of the response on the client-side, but doesn’t actually stop the server from processing the request. This means the OpenAI API will continue to compute the whole response and send it to the client, but the client will just not process the remaining part of the response once it has been aborted.
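
The points above could be condensed into a small reusable helper, roughly like the sketch below. `startCancellableRequest` is a hypothetical name, and the `fetchImpl` parameter exists only so the helper can be tried without a real server; by default it uses the global `fetch`. It reads the body with `response.text()` for brevity, whereas a streaming UI would read `response.body` incrementally.

```javascript
// Start a request that can be cancelled later via the returned stop().
// `fetchImpl` defaults to the global fetch; it is injectable for testing only.
function startCancellableRequest(url, options = {}, fetchImpl = fetch) {
  const controller = new AbortController();
  const result = fetchImpl(url, { ...options, signal: controller.signal })
    .then((response) => response.text())
    .then((text) => ({ status: "completed", text }))
    .catch((error) => {
      if (error.name === "AbortError") {
        // The user pressed stop; treat it as a normal outcome, not a failure.
        return { status: "aborted" };
      }
      throw error; // anything else is a genuine error
    });
  return { result, stop: () => controller.abort() };
}
```

A stop button would then call `stop()` on the object returned for the current request, and the code awaiting `result` can check `status` to decide whether to tell the user the request was cancelled.
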

I’m curious, if I disconnect, will the AI inference behind ChatGPT stop immediately or not?

By AI inference I mean the process of the AI generating the completion.

Some AI services explain that once the AI service starts generating a completion, it cannot be stopped, so I’m not sure if ChatGPT is like that.


If you are using streaming mode, generation will stop within a few tens of tokens of the disconnect (7-30 in my testing, clustering toward the lower end). If you are not in streaming mode, the full reply will be generated regardless of whether you are there to receive it.


So if it’s in streaming mode, I understand that I only need to pay for the cost of the tokens I received, right?

Yes, plus some number of tokens that get sent before the disconnect is detected up the chain.
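
As a back-of-the-envelope illustration of that arithmetic, here is a small sketch. `estimateBilledCompletionTokens` is a hypothetical helper, and the default overshoot of 7 to 30 tokens is only the range reported from testing earlier in this thread, not a documented guarantee. It covers completion tokens only; prompt tokens are billed in any case.

```javascript
// Rough bounds on billed completion tokens after stopping a stream early.
// The default overshoot (7 to 30 tokens) is the range reported in this
// thread from one poster's testing; treat it as an assumption.
function estimateBilledCompletionTokens(receivedTokens, overshoot = { min: 7, max: 30 }) {
  if (!Number.isInteger(receivedTokens) || receivedTokens < 0) {
    throw new RangeError("receivedTokens must be a non-negative integer");
  }
  return {
    low: receivedTokens + overshoot.min,
    high: receivedTokens + overshoot.max,
  };
}

const estimate = estimateBilledCompletionTokens(120);
console.log(`Billed completion tokens: roughly ${estimate.low} to ${estimate.high}`);
// prints "Billed completion tokens: roughly 127 to 150"
```
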