When will the response time/timeout issue be addressed?

lostinsauce · November 2, 2023, 7:30pm

I’m having significant issues with chat completions hitting the 600 second timeout threshold with the GPT-4 API. These completions are 5,000-6,000 tokens, so well within the 8,000 context range.

These issues coincide just about perfectly with the average completion times seen here. Any time the average gets near ~40 seconds or above, I can count on consistent errors. Even with error handling that retries the completion, it will just error out again on the 2nd or 3rd attempt as well.

Browsing this forum, I can see I’m far from the only one having this issue in the past months. I get that this is a newer product, but it doesn’t seem like their server speeds are anywhere close to what they need to be. Unless my math is off, you would need to average 256 tokens roughly every 20 seconds to finish an 8k completion in 600 seconds, but the average times are never anywhere close to that.

Again, I know the GPT-4 API has only been on limited release since July, but a ~15-20% failure rate in production is beyond frustrating. Does anyone know if this is being worked on or will be in the future? I can’t find anything about it in OpenAI’s patch notes or press releases.

_j · November 2, 2023, 7:40pm

If it’s being worked on, its by training the AI to curtail and deny long outputs.

Use streaming and you at least will get a partial answer generated for the the five minutes before error instead of paying and getting nothing.

You can also set a max_tokens below what will cause error, and then resubmit with a new assistant message of what it wrote so far.

Topic		Replies	Views
GPT-4 API slow response over 60sec API	6	2656	February 16, 2024
GPT-4 API to slow when you have to work with a 46 second time out API	11	2832	July 30, 2023
Managing timeout when waiting for the response from chat completions request API	1	2651	May 7, 2023
Is there an issue with GPT 3.5 turbo 16k? API	5	957	October 27, 2023
HTTP timeout error at the maximum token limit API	2	877	May 16, 2023

When will the response time/timeout issue be addressed?

Related topics