GPT-4 API Gateway timeout for long requests, but billed anyway

When I send long requests (6000+ tokens in total) I receive a gateway timeout response after a long while (a couple of minutes). I am being billed anyway, but get nothing back. Note that I am using the chat completion api, but without streaming, so I get the full response back in one go.
For smaller requests or long requests for GPT 3.5 it works without issues.

This is the response I receive:
{"error":{"code":524,"message":"Gateway timeout.","param":null,"type":"cf_gateway_timeout"}}
The timeout seems to be exactly 10 minutes.

Update 16.05.2023:
I just tried out the streaming variant, and response tokens do arrive (around 1500), then the server closes the connection. More precisely, I’m getting an

IOException: “The response ended prematurely”

in C#.
So not even that request variant works. At this point, I’m not sure if it’s even possible to use the gpt-4 api for long requests. If this is working for anyone, please write a comment.


I am getting the same here, but even for shorter queries…
Whenever I send something that is longer than 1k tokens, I get random timeouts.

1 Like

I set my network timeout to 30 minutes for the requests I send, so I’m guessing it’s something I don’t have control over.

You can set up retries, but sometimes even that does not help. I am trying with 10 retries and pauses of 1, 2, 4, 8, etc seconds between each. Even that does not help.

What is a good practice for retries in general?

This is definitely not an option, since I am getting billed for each attempt. The pricing is already a limiting factor for my purpose, doubling or quadrupling it is not possible for me (and probably many others)


Hey, I used to get a similar error (although gateway timeout is rather vague, mine specifically said it was from Cloudflare), and the reason was too many requests at the same time, if I created a random delay between requests, I wouldn’t get the error.

Granted, my code would shoot something like 50-100 requests at the same time.

For me, no specifics are given, it just says 524 gateway error in the json response message object sent by OpenAI. Also, this happens with the first request I send. And only for long requests. I am definitely not rate-limited.

1 Like

Are you sure you are billed for bad gateway attempts? I think the server does not process the request at that time, so it can not charge you for tokens?

1 Like

Yes, I am absolutely sure. I haven’t had a single succesful request today or yesterday, yet there’s around 1.50€ in my billing overview. I haven’t made any other requests on the api aside from one test request with like 15 tokens total. I updated the original post with the response I’m getting.

1 Like

I defaulted to 3.5. GPT-4 was unusable today.

I would love to, but my request content is too long :expressionless:

I’ve experienced the same issue (gateway timeouts, charged anyway), and as usual received 0 response from the support team on it. Super frustrating (especially since I’d had queued jobs set up the first time, and didn’t realize I was getting charged for the non-completes).

Side-note - has anybody ever received an answer via chat? The only time I’ve had somebody respond to me was when I emailed re: an Acceptable Use Policy question, but I’ve never had a chat response from support.


Same problem here. GPT-4 API is non-functional for longer requests, and getting billed for them.

Error: Gateway timeout. {"error":{"code":524,"message":"Gateway timeout.","param":null,"type":"cf_gateway_timeout"}} 524 {'error': {'code': 524, 'message': 'Gateway timeout.', 'param': None, 'type': 'cf_gateway_timeout'}} {'Date': 'Thu, 04 May 2023 09:37:15 GMT', 'Content-Type': 'application/json', 'Content-Length': '92', 'Connection': 'keep-alive', 'X-Frame-Options': 'SAMEORIGIN', 'Referrer-Policy': 'same-origin', 'Cache-Control': 'private, max-age=0, no-store, no-cache, must-revalidate, post-check=0, pre-check=0', 'Expires': 'Thu, 01 Jan 1970 00:00:01 GMT', 'Server': 'cloudflare', 'CF-RAY': '7c1fb4efd836b951-AMS'}

Same problem today. Also get billed for each failed attempt.


The issue persists as of today for me.
And I can’t even publish this message because apparently it is incomplete.

1 Like

Same issue here. We are getting billed for requests that never arrive. Plus, we have to shorten the text in order to get less timeouts, which increases the overall prompt length, if that makes sense.

I am running into the same exact issue. My requests are 6.5k~ tokens at the moment, and I send them in a batch of 6 requests, with a delay of 2-15 seconds per request, and a 60 second delay after the batch is done before starting the next batch.

Pretty frustrating because I don’t think I’m doing anything wrong… anyone figure out any working solutions to this without dramatically decreasing my request lengths?

I am also getting this error for some of my GPT4 API requests that are ~3500 tokens or more. I tried increasing the request_timeout to 40 minutes but the request still sits there for 40 minutes and then times out.

I tried streaming my response to see where it breaks.

Input: 3,551 (tokens)
Ouput: 2,155 (tokens)

I tried the same request multiple times and it seems to crash around the same part of the output every time.

As a temporary workaround, I ask Chat GPT to continue and feed it the previous message + output. (I tested this in Playground, will see if it works with the APIs). I’ll have to pay for an extra 6K input tokens every call, but at least my app will be working.

Update: This method worked as a temporary workaround.

I have it stream the response so I can see when/where it crashes instead of it timing out. Then I make a new request with the same prompt and the response I received so far:
[{“role”: “user”, “content”: prompt}]
[{“role”: “assistant”, “content”: first_chunk_of_response}]

And it picks up where it left off.

This is also happening to my larger API requests when experimenting with GPT-4. When I use the prompt in the playground it seems to work fine. As some people here mentioned it, I checked the usage data and can also confirm that requests that are not fulfilled are billed! This could have gotten expensive if i would have let my queue workers running that are setup to retry api requests…

Has anyone else contacted support over this? It’s been 14 days without a reply.

1 Like