GPT-4 API Gateway timeout for long requests, but billed anyway

nathan.kent · May 9, 2023, 5:00pm

I did, but only yesterday. I’ll post here if I get a response.

nathan.kent · May 12, 2023, 6:57pm

No response yet. Replying to bump the thread for OpenAI staff to hopefully see.

martinshkreli · May 15, 2023, 1:24pm

I have been having the same problem. Unfortunately, it makes a new app I’m prototyping useless. For me, it only happens with large-context (~6k tokens) requests.

wolfgeppert · May 16, 2023, 4:31am

21 days after the first post here, I still have the same problem. For big content (500 pages, 100,000 words) it is a major problem. 1. The costs to pass on to customers are too high and unpredictable when errors like this come up. 2. If you get a timeout or something goes wrong, you can’t simply start again from the beginning. For now I have no clue whether this issue kills my business model or whether I can find a smart solution.

julius.jacobsohn · May 16, 2023, 9:39am

I just really wonder how they want to serve the 32k model if even 4k+ tokens cause intermittent timeouts. Maybe that’s why the roll-out is so slow.

No reply from the support yet, by the way.

wolfgeppert · May 16, 2023, 9:40am

They serve the low-hanging fruit first. And maybe not many people have that use case.

rlee · May 16, 2023, 11:38pm

I’m experiencing a similar issue where I receive a timeout error if the process exceeds 600 seconds. I’ve adjusted my request to be processed in smaller chunks, which works. However, selecting a chunk size is a guessing game that takes 10 minutes per try, and I’m out $4. At least don’t charge us for errors, rate limits, overloaded responses, and timeouts.
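
For anyone trying the same chunking workaround, here is a minimal sketch of splitting a long text under a fixed token budget. It is an illustration only: `approx_token_count` and `chunk_text` are hypothetical helper names, and the word-based counter is an assumption standing in for a real tokenizer (a real count will usually be higher than the word count, so leave some headroom).

```python
from typing import Callable, List


def approx_token_count(text: str) -> int:
    """Assumption: approximate tokens by whitespace-separated words.

    This is NOT how the API counts tokens; swap in a real tokenizer
    for anything billing-sensitive.
    """
    return len(text.split())


def chunk_text(
    text: str,
    max_tokens: int,
    count_tokens: Callable[[str], int] = approx_token_count,
) -> List[str]:
    """Greedily pack words into chunks of at most max_tokens each.

    A single word whose own count exceeds max_tokens still becomes
    its own (oversized) chunk rather than being dropped.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")

    chunks: List[str] = []
    current: List[str] = []
    used = 0
    for word in text.split():
        cost = count_tokens(word)
        # Close the current chunk if this word would push it over budget.
        if current and used + cost > max_tokens:
            chunks.append(" ".join(current))
            current, used = [], 0
        current.append(word)
        used += cost
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Starting with a conservative `max_tokens` and raising it only after a few successful calls may be cheaper than guessing high and paying for a timed-out request.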

sorn.denis · May 17, 2023, 12:08am

I don’t even have access to GPT-4, but on average I have to wait around 20 seconds for replies. Also, I only receive replies when I use the default GetResponseFromChatbotAsync method (from the unofficial C# lib I’m using). When I specify endpoint details like model, number of tokens, temperature, etc., I usually get timeout errors, even when I specify the fastest models and a small number of tokens. The GetResponseFromChatbotAsync method I mentioned never times out, but sometimes I have to wait around 30 seconds. I’ll have to investigate this further to figure out what’s different between these requests and why some fail while others succeed even after a longer wait. My guess is that it’s the default timeout period used by different methods or different endpoints, because I’m 100% not exceeding the token cap per unit of time.

julius.jacobsohn · May 17, 2023, 2:48pm

Sounds like you have multiple different, unrelated issues.
The 20 second wait sounds normal to me depending on your request length - the model is still generating the whole response token-by-token, so you need to wait for it to finish before you receive a response. The 3.5 turbo model in the api isn’t nearly as fast as the one in ChatGPT, in my experience.
Try out streaming, you should get a response faster. The lib you’re using has example code for streaming in its readme.

nicolasjarin · May 17, 2023, 3:47pm

Scale back GPT-4 on the $20 plan and take care of API customers who pay several hundred dollars!

david.zitney · May 18, 2023, 1:36pm

I am running into the same problem. The gpt-4 api is unusable for large requests that are well under the token limit. I am running my requests from the parallel processing script they supplied in the openai cookbook, chunking my doc with tiktoken and still get timeouts and I am getting billed. I reached out to support but after reading all of these comments I am skeptical about getting a reply.

Nathaniel · May 18, 2023, 3:48pm

Same issue for me. The GPT-4 API is not usable, and it is putting off my users.

pproviamo · May 18, 2023, 5:06pm

Same issue: 650 API calls (200 prompt tokens and 200 completion tokens each), run in parallel, are taking more than 2 hours… and I’m still waiting…

pproviamo · May 21, 2023, 5:19pm

The slowdown seems to be resolved for me. A call with 200 tokens took more than 30 seconds three days ago; now it takes less than 2 seconds.

Moreover, I completely redesigned my script yesterday (gave up on GPT-4 and focused on GPT-3.5 Turbo). The API was definitely “hanging” my script significantly, so I added a lot of error handling and backoff. Running 650 calls took me 3 hours three days ago and now takes around 8-15 minutes, so I think some of the API issues were resolved.
(I assume that, with it being the weekend now, there is less stress on the system.)
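
The error handling and backoff mentioned above could look roughly like the sketch below. This is not the poster's script: `TransientAPIError` and `call_with_backoff` are hypothetical names, and the delays and attempt count are assumed values chosen for illustration. The wrapped function is whatever performs the actual API request.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientAPIError(Exception):
    """Hypothetical error type for retryable failures (timeouts, 502/524, overload)."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on TransientAPIError with exponential backoff.

    The wait doubles after each failed attempt (capped at max_delay),
    plus up to 10% random jitter so parallel workers do not retry in lockstep.
    The last failure is re-raised once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except TransientAPIError:
            if attempt == max_attempts - 1:
                raise
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("max_attempts must be at least 1")
```

Note that, going by the reports in this thread, a retried request that times out may be billed each time, so capping `max_attempts` matters for cost as well as for runtime.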

damiance · May 25, 2023, 4:51pm

My calls are around 6k tokens and very rarely work; I need to make 4 or 5 calls to get one through. The worst part is that it charges me for failed responses. They should definitely not be charging us.

Imitech · May 26, 2023, 12:38pm

Same for me.
Failures on large requests don’t matter much, since we can start over, but please don’t charge for failures.

davinci · May 26, 2023, 12:53pm

Having the exact same issues, still being billed as well

itayekk1 · May 26, 2023, 1:36pm

I’m just here to say that I’m dealing with the same issues as everyone else, and the lack of transparency was killing me. As a result, a friend and I have collaborated on an open-source platform to address it, focusing on observability and troubleshooting features. Initially, we relied on Slack messages for error notifications, but we soon realized that didn’t scale, particularly when handling more than 100 requests per day. If you’re interested, check out Pezzo on GitHub.

nathan.kent · May 26, 2023, 10:18pm

This is still going on for me, almost 3 weeks after my initial post. No response from staff yet. Ouch.

jwr · May 31, 2023, 7:00am

I rarely can get even a single request through (mostly get 524 or 502 errors), and I’m being billed for everything, which is fraudulent.

This is very worrying, I’m developing a product and I’m not at all sure I’ll be able to ship anything, given that the API essentially doesn’t work.
