I have updated my code from gpt-3.5-turbo-0613 to gpt-3.5-turbo-1106, but the code seems to hang for no reason. When I first tested it, it ran without issue and was actually faster than the older model.
But if I come back the next day, the code hangs for minutes. I am afraid this is causing us to lose clients, because it looks like our app is not working, and I do not want to revert to the old model.
This is a new problem with the new model; perhaps it hangs looking for a cache. I am not sure, because once I manage to get one request through, everything works well after that with no hanging, and the calls are even faster. But that initial hang is a problem. I would rather have a slower model that works 95% of the time than a faster model that works 50% of the time.
Hi and welcome to the Developer Forum!
Are you using any kind of VPN or proxy? What kind of hosting infrastructure are you using: Azure, AWS, Google, a commercial VPS, home internet? Can you post a snippet of the API-calling code along with any setup it relies on, please?
I’m having the same issue. I’m doing a simple for loop that calls gpt-3.5 for a simple translation task (with a very short text). I tried different versions of gpt-3.5 and got the same problem with all of them. It just hangs at some iterations, and I’m using quite a long sleep time of 2 s between calls…
For users who end up here with the same problem: I was able to work around the issue with a timeout. This is my call:
import openai
from tenacity import retry, wait_random_exponential, stop_after_attempt

client = openai.OpenAI()

# Retry with exponential backoff, up to 10 attempts.
@retry(wait=wait_random_exponential(min=1, max=60), stop=stop_after_attempt(10))
def completion_with_backoff(messages, model="gpt-4-1106-preview"):
    completion = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0.7,
        timeout=5,  # seconds; if the request hangs, tenacity retries it
    )
    return completion
(openai version 1.1.1)
This is not a very good solution, because it means a request is made, for some reason it hangs, the timeout fires, and only on the second request do I get a response. I am not sure whether I am being charged for the first request…
Also note that I set a short timeout of 5 seconds because I am working with very short tasks that should produce short completions.
This is a good solution, but I think I saw in the documentation that you can specify the timeout and max retries on the client when you instantiate it:
client = OpenAI(timeout=20.0, max_retries=3)
I just can’t remember where I saw it now.
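For reference, the v1.x openai-python client does accept timeout and retry settings at construction time, and they can be overridden per client copy with with_options. A minimal sketch (the specific values here are arbitrary examples, not recommendations):

```python
from openai import OpenAI

# Client-wide defaults: give up on a request after 20 s,
# retrying it up to 3 times before raising.
client = OpenAI(timeout=20.0, max_retries=3)

# Derived client with tighter limits for latency-sensitive calls.
fast_client = client.with_options(timeout=5.0, max_retries=1)
```

Setting this on the client avoids sprinkling timeout arguments across every call site.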
A client-side retry for an apparently server-side problem is just a workaround and should not be considered a long-term solution.
We’re also seeing very high rate of time outs, and because we’re quite time sensitive, we aren’t able to upgrade to the new model.
I’m also having the same issue here…
The server infrastructure is under heavy load with all of the new users joining, hence the pause on new Plus memberships. This should get better over the coming days; most of the missing responses and hangs at the moment are related to this.
I switched from gpt-3.5-turbo-1106 to gpt-4-1106-preview and that helped quite a bit with the hanging, but at 10X the price, I hope they resolve this issue soon so I can switch back to the cheaper model.
This is my temporary solution as well. Most of the gpt-3.5-turbo-1106 timeout errors happen in the afternoon (12 PM to 4 PM), so I have switched to GPT-4 Turbo during that window.
Outside the afternoon, though, gpt-3.5-turbo-1106’s response time is faster than that of gpt-3.5-turbo-0613.
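The time-based switch described above can be sketched as a small helper. The function name and the exact cutoffs are my own illustration of the idea, not a tested heuristic:

```python
from datetime import datetime

def pick_model(now: datetime) -> str:
    """Hypothetical helper: use the pricier but currently more reliable
    GPT-4 Turbo model during the congested 12 PM to 4 PM window, and the
    cheaper gpt-3.5-turbo-1106 the rest of the day."""
    if 12 <= now.hour < 16:
        return "gpt-4-1106-preview"
    return "gpt-3.5-turbo-1106"

# Resolve the model once per request, based on local server time.
model = pick_model(datetime.now())
```

The hour check uses the server's local time, so the window should be adjusted if your traffic or the API's peak load follows a different timezone.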