API response time is insane (60+ seconds)

I am sending API requests to gpt-3.5-turbo-1106 and gpt-4-0613 that are pretty small:

"prompt_tokens": 308,
"completion_tokens": 670,
"total_tokens": 978

It takes 60 to 120 seconds to respond to such a request.
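To put those numbers in perspective, here is a rough back-of-the-envelope calculation of the generation rate they imply. It assumes the whole wait is spent generating the 670 completion tokens, which ignores queueing and network time, and `tokens_per_second` is just a helper name made up for this sketch.

```python
def tokens_per_second(completion_tokens: int, elapsed_seconds: float) -> float:
    """Average generation rate for a single request (hypothetical helper)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completion_tokens / elapsed_seconds


# Completion token count taken from the usage figures above.
COMPLETION_TOKENS = 670

# The two ends of the observed response-time range, in seconds.
for elapsed in (60, 120):
    rate = tokens_per_second(COMPLETION_TOKENS, elapsed)
    print(f"{elapsed:>3} s -> {rate:.1f} tokens/s")
```

Under that assumption the rate works out to somewhere between roughly 5.6 and 11.2 tokens per second.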

There is just no way my users are going to wait even a minute to get a response. No amount of loading animations and funny quotes packed into my loading screen will make this even remotely feasible…

I am seeing a lot of similar complaints on the forum about the response time but no solutions. What can be done?


I think OpenAI’s news caused a flood of users who want to ride this wave, so I’m guessing that OpenAI will address the capacity issue.

Hey! It turns out there was a bug on our end that could result in timeouts in certain scenarios. We have since fixed the issue. Please let us know in a new thread if you end up seeing similar issues again. Thanks again for reporting this!

