Completion Endpoint Randomly Freezes

Does anyone else run into the issue where “openai.Completion.create” just hangs and doesn’t return anything sometimes?

Not really. However, performance is model-dependent.

The completion API (in general) performs much faster than the new chat completion API.

Maybe you can post your code so we can see it?



Sure! I’m just using this. But sometimes (maybe 10% of the time) it just never returns a response and it times out.

 response = openai.ChatCompletion.create(
     model="gpt-3.5-turbo",
     messages=[
         {"role": "system",
             "content": "You are an intelligent, helpful, good teacher explaining to a student. Answer as concisely as possible."},
         {"role": "user", "content": f"{questionText}"},
     ],
 )

Well, we learned a lot from your code, @derek8bai.

You are talking about the chat completion endpoint, not the completion endpoint, and you are using the gpt-3.5-turbo model.

It is well known that these new chat completion models are under heavy load from new users pounding on them, so they are very slow.



I’m having the exact same issue as @derek8bai. I’m also using the ChatCompletions endpoint with gpt-3.5-turbo with it hanging randomly. I’ve implemented a retry with exponential backoff, and it still doesn’t want to work. This is impacting the public image of our service. What other solutions have people found?
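For reference, a minimal retry-with-exponential-backoff sketch of the kind mentioned above (the function name and delay values here are my own choices, not from this thread):

```python
import random
import time

def with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn(), retrying with exponential backoff plus jitter on failure."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # give up after the last attempt
            # Sleep base, 2*base, 4*base, ... plus a little random jitter.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))

# Usage (hypothetical): wrap the API call in a lambda.
# answer = with_backoff(lambda: openai.ChatCompletion.create(...))
```

Note that backoff only helps when the call eventually raises; a request that hangs forever without erroring will never trigger the retry, which may be why it “still doesn’t want to work” here.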

In my case, a time.sleep(2) solved the issue. I can’t say 100%, but I was running a loop over 100 prompts, and it would always hang at some point, very rarely reaching the 50th iteration. After adding the sleep call, the loop now finishes.
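A sketch of that workaround, assuming a list of prompts and a hypothetical ask() helper standing in for the actual API call:

```python
import time

def run_throttled(prompts, ask, delay=2.0):
    """Send prompts one at a time, pausing between requests."""
    results = []
    for prompt in prompts:
        results.append(ask(prompt))
        time.sleep(delay)  # crude throttle; the pause above avoided the hangs
    return results
```

This just spaces out the requests; it doesn’t explain why back-to-back calls hang, but it is a cheap thing to try.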


I have stumbled upon the same exact issue.
I’m trying to make calls concurrently, and it always hangs and never exits.
Has anyone found a solution to this?

Your issue is likely unrelated to the topic. A coding problem. Code you haven’t mentioned.

Big code example on running parallel tasks.

Are you using await asyncio.sleep() so the other tasks can actually run?
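To illustrate the point, a minimal sketch of running tasks concurrently with asyncio, using a dummy worker in place of the real API call (the names here are mine):

```python
import asyncio

async def worker(i):
    # Stand-in for an async API call. The await yields control to the
    # event loop so the other tasks can make progress concurrently.
    await asyncio.sleep(0.01)
    return f"result-{i}"

async def main(n):
    # Schedule all workers at once and collect their results in order.
    return await asyncio.gather(*(worker(i) for i in range(n)))

results = asyncio.run(main(3))
```

If a task never awaits (or awaits something that never completes), the loop can appear to hang exactly as described above.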


I have the same problem, and unlike @_j I don’t think it is a “coding problem”.
I made a simple translation prompt and tried increasing the number of text inputs to translate.
On the 46th request, the gpt-3.5-turbo chat completion hung for 10 minutes before timing out.

The OpenAI rate limit documentation states that I get 90,000 tokens per minute and 3,500 requests per minute.
My 46 requests, likely using only 1,000-2,000 tokens in total, should not come anywhere near those limits.
I think something else is going on.
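One defensive measure while the root cause is unclear: enforce a client-side timeout so a hung request fails fast instead of blocking for 10 minutes. A generic sketch (the call_with_timeout helper is my own, not part of the openai library):

```python
from concurrent.futures import ThreadPoolExecutor

def call_with_timeout(fn, timeout=30.0):
    """Run fn() in a worker thread; raise TimeoutError if it doesn't finish in time."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fn)
        return future.result(timeout=timeout)  # raises on timeout

# Usage (hypothetical):
# reply = call_with_timeout(lambda: openai.ChatCompletion.create(...), timeout=60)
```

Caveat: the worker thread keeps running after the timeout (the hung request itself isn’t cancelled), so this only unblocks the caller, letting you retry or log the failure.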


I have encountered the same problem: ChatCompletion just gets stuck partway through generating some content, and that’s it. And I have not exceeded the rate limits.
