Just got access to the GPT-4 API; we'd been using gpt-3.5-turbo until now. Today we switched and it's nothing but 502 Bad Gateway errors. We managed to get maybe one good response in a whole afternoon. When we switch back to gpt-3.5-turbo, everything is fine again.
Browsing through the forums, seems this is not an isolated incident.
So given that this is pretty prevalent, what can we do to get around the issue?
Right now the timeout seems to be set to 300 seconds from what I can tell, and we get the error around that mark. Can we change the timeout for the call? Is that possible?
And yes, we are handling it in a try/except block, waiting and retrying, but nothing seems to work.
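For context, our retry logic looks roughly like the sketch below. It is a generic backoff helper, not our exact code; `make_request` would wrap the actual API call (with the pre-1.0 openai Python SDK, `openai.ChatCompletion.create` accepts a `request_timeout` keyword, if I understand the library correctly):

```python
import time

def call_with_retries(make_request, max_attempts=5, base_delay=2.0):
    """Retry a flaky API call with exponential backoff.

    `make_request` is any zero-argument callable that performs the
    request, e.g. a lambda wrapping openai.ChatCompletion.create(...,
    request_timeout=600). Names here are illustrative only.
    """
    for attempt in range(max_attempts):
        try:
            return make_request()
        except Exception as exc:  # narrow to the SDK's error types in real code
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            delay = base_delay * (2 ** attempt)
            print(f"attempt {attempt + 1} failed ({exc}); retrying in {delay:.0f}s")
            time.sleep(delay)
```

Even with this in place, every attempt against gpt-4 still ends in a 502.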
Any ideas welcome.
Hi and welcome to the developer forum!
502 is a gateway error, i.e. a server acting as a relay or proxy failed in some way. I'm assuming the exact issue is some variant of a timeout. This is not caused by OpenAI; it's a failure somewhere between the message being sent from OpenAI and it landing in your network socket.
What I would do is perform some basic sanity checks. Try a super short prompt, e.g. just "test", and wait for the reply; if that comes back reliably, then try increasing the prompt length. I don't know what your network environment looks like, so it's hard to tell exactly what's going on. Do you have command line access? Could you run a Python shell and try it from there?
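Something along these lines would sweep the prompt length and time each round trip, so you can see roughly where it starts failing. The helper is generic; `send_prompt` is a hypothetical stand-in you would wire to your actual API call:

```python
import time

def probe_lengths(send_prompt, lengths=(10, 100, 1000, 4000)):
    """Time one round trip at each prompt size to find where it breaks.

    `send_prompt` takes a prompt string and returns the model's reply;
    it is a placeholder for your real ChatCompletion call.
    """
    results = []
    for n in lengths:
        prompt = "test " * n  # crude way to scale input size
        start = time.monotonic()
        try:
            send_prompt(prompt)
            status = "ok"
        except Exception as exc:
            status = f"failed: {exc}"
        elapsed = round(time.monotonic() - start, 1)
        results.append((n, elapsed, status))
    return results
```

If short prompts succeed and only long ones fail around the 300-second mark, that points at a connection timeout rather than the model itself.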
We use Google Colab to run the Python scripts, meaning the scripts run on Google's servers, and they seem to work for everything else (including other GPT models like gpt-3.5-turbo) except gpt-4.
I saw a similar message in a previous thread stating that this is not an issue with OpenAI, but I think that if the API were struggling to return an answer, we would be getting a timeout or similar, and thus the error, right?
I mean if this were to work fine with a shorter text, what would that mean?
Well, it would mean that there is some timeout occurring in Google Colab; lots of open socket connections have 60-second and 300-second timeouts.
One possible solution is to use streaming, but I have not set that up in a google colab before, and I know the Google App Engine system will not handle server side events, so that might be an issue.
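As a sketch of what streaming would look like: with the pre-1.0 openai Python SDK you pass `stream=True` and receive chunks whose shape (as I recall it) is `{"choices": [{"delta": {"content": "..."}}]}`, so the connection carries data continuously instead of sitting idle for minutes. The assembly helper below is illustrative:

```python
def collect_stream(chunks):
    """Concatenate the content deltas from a streamed ChatCompletion.

    Each chunk is assumed to be shaped like the pre-1.0 openai SDK's
    stream events; the first and last chunks may carry no "content"
    key in their delta, hence the .get with a default.
    """
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)

# Hypothetical usage against the real API (names assumed, not verified):
# response = openai.ChatCompletion.create(
#     model="gpt-4", messages=messages, stream=True)
# text = collect_stream(response)
```

Because tokens arrive as they are generated, no single read on the socket should come anywhere near a 300-second idle timeout.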
We’ve tried with shorter prompts and they seem to work fine. It does struggle with 4,000+ tokens, though.
OK, so it seems like the Colab environment does not like keeping a connection active for over 300 seconds. That is a limitation you will have to work within, unless you can get streaming to work.
There may be a way to increase the Colab timeout values; perhaps someone else who has been through this knows.
I am facing the same issue with gpt-3.5-turbo-16k, and I am using a Jupyter notebook on my PC. It happens whenever the token count goes over around 3,000. Any help would of course be very welcome.