When streaming, how can I determine whether GPT's stream failed or has already completed normally?

Currently we use Python to request a streamed response from ChatGPT and relay the streamed text to the frontend webpage. However, the returned text is sometimes cut off partway. I'm unsure whether adding logs would help identify whether the problem is ChatGPT failing to deliver the complete text, or the frontend refusing to receive it and so causing us to stop reading the stream. Below is my code for consuming the stream:

import logging
import openai

responses = openai.ChatCompletion.create(
    model=model, messages=messages, temperature=0, stream=True
)

for response in responses:
    try:
        # Each streamed chunk carries the next piece of text in delta["content"]
        last_response = response["choices"][0]["delta"]["content"]
    except KeyError:
        # Some chunks (e.g. the first and last) have no "content" key in the delta
        logging.error("incomplete reception")

Right now I can only guess that the stream has ended by checking the length of the text obtained from response["choices"][0]["delta"]["content"]: once it exceeds a certain threshold, I assume ChatGPT has stopped providing content and the stream is over. But if the stream actually terminated because of a bad response or a failure to deliver the content properly, how can I distinguish between these scenarios?

Here’s a Chat completion chunk object:

{
  "id": "chatcmpl-123",
  "object": "chat.completion.chunk",
  "created": 1677652288,
  "model": "gpt-3.5-turbo",
  "choices": [{
    "index": 0,
    "delta": {
      "content": "Hello",
    },
    "finish_reason": "stop"
  }]
}

Consider the property finish_reason. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, or function_call if the model called a function.

If it's none of these when your stream terminates, your stream was likely interrupted.
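As a minimal sketch (using the legacy openai.ChatCompletion call from your question; the model and messages values below are just illustrative), you could keep the finish_reason from the last chunk you receive and inspect it once the loop ends:

import logging

import openai

model = "gpt-3.5-turbo"  # illustrative, matching the chunk example above
messages = [{"role": "user", "content": "Hello"}]

finish_reason = None
collected_text = ""

responses = openai.ChatCompletion.create(
    model=model, messages=messages, temperature=0, stream=True
)

for response in responses:
    choice = response["choices"][0]
    # Intermediate chunks carry text in delta["content"]; the final chunk's delta is empty
    collected_text += choice["delta"].get("content", "")
    # finish_reason is null (None) on every chunk except the model's last one
    if choice.get("finish_reason") is not None:
        finish_reason = choice["finish_reason"]

if finish_reason in ("stop", "length", "function_call"):
    logging.info("stream completed normally: finish_reason=%s", finish_reason)
else:
    # The generator stopped without ever delivering a finish_reason,
    # so the stream was most likely cut off mid-response
    logging.error("stream interrupted before completion")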


You're right. If finish_reason is not stop or length when the stream terminates, it was indeed interrupted. But how can we differentiate between an interruption that happened while waiting for a response from ChatGPT and one caused by the webpage user terminating my program externally?

It's pretty simple: the logic to detect interruptions on the client end has to be implemented by you.
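For example, here's a minimal sketch assuming the backend streams to the page through a Flask generator (the /chat route and handler below are hypothetical, and the exact disconnect behavior depends on your WSGI server). When the browser drops the connection, the server closes the generator, which raises GeneratorExit at the current yield; catching it separates a client-side abort from a short or failed response on the OpenAI side:

import logging

import openai
from flask import Flask, Response

app = Flask(__name__)

@app.route("/chat")
def chat():
    def generate():
        responses = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": "Hello"}],
            temperature=0,
            stream=True,
        )
        try:
            for chunk in responses:
                # Forward each piece of text to the browser as it arrives
                yield chunk["choices"][0]["delta"].get("content", "")
        except GeneratorExit:
            # The server closed the generator: the browser/user disconnected mid-stream
            logging.warning("client disconnected before the stream finished")
            raise
        else:
            logging.info("stream delivered to the client in full")

    return Response(generate(), mimetype="text/plain")

If you log both this and the finish_reason check on the OpenAI side, an interruption with a missing finish_reason but no GeneratorExit points at the model/API, while a GeneratorExit points at the client.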