GPT-4 API Gateway timeout for long requests, but billed anyway

I did, but only yesterday. I’ll post here if I get a response.


No response yet. Replying to bump the thread for OpenAI staff to hopefully see.

I have been having the same problem. Unfortunately, it makes a new app I’m prototyping useless :frowning: For me, it only happens with large-context requests (~6k tokens).

21 days after the first post here, I still have the same problem. For big content (500 pages, 100,000 words) it is a major problem. 1. The costs to charge customers are too high and unpredictable if errors like this come up. 2. If you get a timeout or something goes wrong, you can’t start from the beginning. For now I have no clue whether this issue kills my business model or whether I can find a smart solution.

I just really wonder how they want to serve the 32k model if even 4k+ tokens cause intermittent timeouts. Maybe that’s why the roll-out is so slow.

No reply from support yet, by the way.

They serve the low-hanging fruit first. And maybe not many people have that use case.

I’m experiencing a similar issue where I receive a timeout error if the process exceeds 600 seconds. I’ve adjusted my request to be processed in smaller chunks, which works. However, selecting a chunk size is a guessing game that takes 10 minutes per try, and I’m out $4. At the very least, don’t charge us for errors, rate limits, overload responses, and timeouts.
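One way to take some of the guessing out of chunk sizing is to pack the input against an explicit token budget instead of trying sizes by hand. The sketch below is only an illustration: `chunk_by_token_budget` and its parameters are hypothetical names, and the default `count_tokens` is a crude whitespace word count standing in for a real tokenizer, so it will undercount real model tokens.

```python
from typing import Callable, List


def chunk_by_token_budget(
    text: str,
    max_tokens: int,
    count_tokens: Callable[[str], int] = lambda s: len(s.split()),
) -> List[str]:
    """Greedily pack paragraphs into chunks of at most max_tokens each.

    count_tokens defaults to a whitespace word count, which is only a rough
    stand-in (an assumption); pass a real token counter for accurate budgets.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")

    chunks: List[str] = []
    current: List[str] = []
    current_tokens = 0

    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        n = count_tokens(paragraph)
        # Close the current chunk if this paragraph would exceed the budget.
        if current and current_tokens + n > max_tokens:
            chunks.append("\n\n".join(current))
            current, current_tokens = [], 0
        # A single paragraph larger than the budget becomes its own chunk;
        # this sketch does not split inside a paragraph.
        current.append(paragraph)
        current_tokens += n

    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Picking a budget well below the size that has been timing out, and leaving room for the completion, means each failed attempt costs less than a full-size request would.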

I don’t even have access to GPT-4, but on average I have to wait around 20 seconds for replies. Also, I only receive replies when I use the default GetResponseFromChatbotAsync method (from the unofficial C# lib I’m using). When I specify endpoint details like model, number of tokens, temperature, etc., I usually get timeout errors, even when I specify the fastest models and a small number of tokens. The chatbot async method I mentioned never times out, but sometimes I have to wait around 30 seconds. I will have to investigate this further to figure out what’s different between these requests and why some fail and others don’t, even after a longer period of time. My guess is that different methods or endpoints use different default timeout periods, because I’m 100% sure I’m not exceeding the token cap per unit of time.

Sounds like you have multiple different, unrelated issues.
The 20 second wait sounds normal to me depending on your request length. The model is still generating the whole response token-by-token, so you need to wait for it to finish before you receive a response. The 3.5 Turbo model in the API isn’t nearly as fast as the one in ChatGPT, in my experience.
Try out streaming, you should get a response faster. The lib you’re using has example code for streaming in its readme.
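For anyone on Python rather than C#, here is a rough sketch of what consuming a streamed reply can look like. It assumes each streamed chunk is a dict shaped like the chat completions streaming format (`choices[0].delta` with an optional `content` field); `iter_stream_text` is a hypothetical helper name, and the API call itself is left out so the snippet runs without a network connection or API key.

```python
from typing import Any, Dict, Iterable, Iterator


def iter_stream_text(chunks: Iterable[Dict[str, Any]]) -> Iterator[str]:
    """Yield text pieces from streamed chat-completion chunks as they arrive.

    Assumes the chunk shape {"choices": [{"delta": {"content": "..."}}]}.
    Chunks without text (role-only or final chunks) are skipped.
    """
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if not choices:
            continue
        delta = choices[0].get("delta") or {}
        piece = delta.get("content")
        if piece:
            yield piece


# Usage idea (assumption: `stream` is whatever iterator your client library
# returns when streaming is enabled):
#
#     for piece in iter_stream_text(stream):
#         print(piece, end="", flush=True)
```

The total generation time stays about the same, but the first tokens show up almost immediately, and text received before a gateway timeout is not lost.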

Scale back GPT-4 on the $20 plan and take care of the API customers who pay several hundred dollars!

I am running into the same problem. The GPT-4 API is unusable for large requests that are well under the token limit. I am running my requests with the parallel processing script supplied in the OpenAI Cookbook and chunking my doc with tiktoken, and I still get timeouts and am still getting billed. I reached out to support, but after reading all of these comments I am skeptical about getting a reply.

Same issue for me. The GPT-4 API is not usable, and it is putting off my users :frowning:


Same issue: 650 API calls (200 prompt tokens and 200 completion tokens each), run in parallel, are taking more than 2 hours… and I’m still waiting…

The slowdown seems to be resolved for me. A call with 200 tokens took 30+ seconds 3 days ago; now it takes less than 2 seconds.

Moreover, I completely redesigned my script yesterday (I gave up on GPT-4 and am focusing on GPT-3.5 Turbo). The API was definitely “hanging” my script significantly, so I added a lot of error handling and backoff. Running 650 calls took me 3 hours 3 days ago and now takes around 8-15 minutes, so I think some of the API issues were resolved.
(I assume that, since it is the weekend now, there is less stress on the system.)
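For reference, the error handling and backoff mentioned above could look roughly like the following. This is a generic sketch, not the poster’s actual script: `call_with_backoff` and all of its parameters are hypothetical names, and the exception types to retry on are assumptions that depend on the client library in use.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError, ConnectionError),
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying on the given exceptions with exponential backoff.

    The wait before retry k is a random value between 0 and
    min(max_delay, base_delay * 2**(k-1)) ("full jitter"), so that many
    parallel workers do not all retry at the same moment.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == max_attempts:
                raise
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))

    raise AssertionError("unreachable")
```

Since several people here report being billed for requests that time out, it may be worth keeping `max_attempts` low for large requests so that retries do not multiply the cost.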

My calls are around 6k tokens and very rarely work; I need to make 4 or 5 calls to get one through. The worst part is that I’m charged for failed responses. They should definitely not be charging us for those.

Same for me.
Failures on large requests don’t matter much, since we can start over, but please don’t charge for failures.

Having the exact same issues, still being billed as well :frowning:


I’m just here to say that I’m dealing with the same issues as everyone else, and the lack of transparency was killing me. As a result, a friend and I have collaborated on an open-source platform to address it, focusing on observability and troubleshooting features. Initially, we relied on Slack messages for error notifications, but we soon realized that didn’t scale, particularly when handling more than 100 requests per day. If you’re interested, check out Pezzo on GitHub.

This is still going on for me, almost 3 weeks after my initial post. No response from staff yet :frowning: Ouch

I can rarely get even a single request through (I mostly get 524 or 502 errors), and I’m being billed for everything, which is fraudulent.

This is very worrying. I’m developing a product and I’m not at all sure I’ll be able to ship anything, given that the API essentially doesn’t work.
