I have updated my code from gpt-3.5-turbo-0613 to gpt-3.5-turbo-1106, but the code now seems to hang for no apparent reason. When I first tested it, it ran without issue and was actually faster than the older model.
But when I come back the next day, the code hangs for minutes. I am afraid this is causing us to lose clients, because it looks like our app is not working, and I do not want to revert to the old model.
This is a new problem with the new model; perhaps it hangs while looking for a cache. I am not sure, because once I manage to get one request through, everything works well after that with no hanging, and the calls are even faster. But that initial hang is a problem: I would rather have a slower model that works 95% of the time than a faster model that works 50% of the time.
Hi and welcome to the Developer Forum!
Are you using any kind of VPN or proxy? What kind of hosting infrastructure are you on: Azure, AWS, Google, a commercial VPS, home internet? Can you post a snippet of the API-calling code along with any setup it relies on, please?
I’m having the same issue. I’m running a simple for loop that calls gpt-3.5 for a simple translation task (with a very short text). I tried different versions of gpt-3.5 and got the same problem with all of them: it just hangs at some iterations. And I’m using a quite long sleep time of 2 s between calls…
For the users who end up here with the same problem: I was able to solve the issue with a timeout. This is my call:

from tenacity import retry, wait_random_exponential, stop_after_attempt

@retry(wait=wait_random_exponential(min=1, max=60), stop=stop_after_attempt(10))
def completion_with_backoff(messages, model="gpt-4-1106-preview"):
    return client.chat.completions.create(
        model=model, messages=messages,
        temperature=0.7, timeout=5,  # fail fast on a hung request; tenacity retries
    )

(openai version 1.1.1)
This is not a very good solution, because it means a request is made, for some reason it hangs, the timeout fires, and only on the second request do I get a response. I’m not sure whether I’m being charged for the first request…
Also note that I set a short timeout of 5 seconds because I’m working with very short tasks that should produce short completions.
This is a good solution, but I think I saw somewhere in the documentation that you can specify the timeout and max retries on the Client() when you initialize it:
client = OpenAI(--specify retry and timeout behaviour here----)
I just can’t remember where I saw it now.
A client-side retry for an apparently server-side problem is just a workaround; it should not be considered a long-term solution.
We’re also seeing a very high rate of timeouts, and because we’re quite time-sensitive, we aren’t able to upgrade to the new model.
I’m also having the same issue here…
The server infrastructure is under heavy load with all of the new users joining, hence the pause placed on new Plus memberships; this should get better over the coming days. Most of the missing responses and hangs at the moment are related to this.
I switched from gpt-3.5-turbo-1106 to gpt-4-1106-preview and that helped quite a bit with the hanging, but at 10X the price, I hope they resolve this issue soon so I can switch back to the cheaper model.
This is my temporary solution as well. Most gpt-3.5-turbo-1106 timeout errors happen in the afternoon (12 PM to 4 PM), so I have switched to GPT-4 Turbo during that window.
Outside the afternoon, though, gpt-3.5-turbo-1106’s response time is faster than that of gpt-3.5-turbo-0613.
Hopefully this gets resolved before they shut down gpt-3.5-turbo-0613.
We cannot use gpt4 turbo as it’s still too slow for our use-case (compared to gpt3.5).
@bertha.kgokong please remove the “solved” tag, because the provided solution is really not a solution.
I know I have suggested many solutions, but in case anyone lands here again: I have found a workaround with a timeout function. If the GPT-3.5 call hangs for over 20 s, I call GPT-4. So far I have been able to catch the hung calls and at least respond within 20 s, mixing the cheaper, faster GPT-3.5 Turbo with the more expensive GPT-4. This way I can even count how many times a day the function hangs.
Here is my workaround:
from func_timeout import func_timeout, FunctionTimedOut

def run_with_timeout(messages, chat_functions):
    try:
        return func_timeout(20, returnFunctionChatGPTCall, args=(messages, chat_functions))
    except FunctionTimedOut:
        # fall back to GPT-4 when the GPT-3.5 call hangs
        return returnFunctionChatGPT4Call(messages, chat_functions)
The first call goes to the GPT-3.5 Turbo model; if it times out, I then call the second function, returnFunctionChatGPT4Call. Of course it could be the same function with a different model passed in; you can build this however you like.
GPT-4 does not hang, but I cannot call it all the time due to the 10× price difference, so our default is 3.5, except when it hangs.
Why not send the request to the regular gpt-3.5-turbo instead, or the 16k variant if needed?
If the API is failing 1 in 10 times (as has been typical for gpt-3.5-turbo-1106, if not worse), you can dispatch two parallel calls to the model at the same time and still come out with lower expense, especially if you stream and close the connection on the one that is not first to respond.
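The racing idea above can be sketched with the standard library. This is a minimal sketch, not the poster’s actual code: `call_model` is a hypothetical stand-in for whatever function makes the API request, and slower duplicates are simply abandoned rather than having their connections closed.

```python
import concurrent.futures

def race(call_model, messages, n=2, timeout=20):
    # Dispatch n identical requests and return whichever finishes first.
    # call_model is any callable taking `messages` (hypothetical stand-in
    # for the real API call).
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=n)
    try:
        futures = [pool.submit(call_model, messages) for _ in range(n)]
        done, _ = concurrent.futures.wait(
            futures, timeout=timeout,
            return_when=concurrent.futures.FIRST_COMPLETED,
        )
        if not done:
            raise TimeoutError("all parallel requests hung")
        return done.pop().result()
    finally:
        # Don't block on the slower duplicates; let them finish or be
        # cancelled in the background (Python 3.9+ for cancel_futures).
        pool.shutdown(wait=False, cancel_futures=True)
```

With streaming you could go further and close the losing connection as soon as the winner produces its first token, which is where the cost saving comes from.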