I’ve been noticing a significant delay in API responses since yesterday (starting in the morning, US Central time): 1000 tokens now take at least 30 seconds, and some requests take more than 10 or 20 minutes to complete (it used to take 10 to 12 seconds). By contrast, the ChatGPT (free) web interface is responding quickly.
Anyone facing the same issue?
Edit: I decided to trust @nomuhyuna and bought $50 of credit on my company account. Result: no more issues, queries 3 times faster than when the API was working well, queries are more reliable, and with a better ChatGPT model. Crazy! It also raised my limit from $120 to $600.
However, the issue persists when I use my original API key.
Yes, same here for the last two days, with a lot of 503 errors too.
Your link is about Plugins; my issue is specific to the API. I haven’t had this issue in the last 3–4 months.
Experiencing the same problem (US Eastern time zone). I will contact support to see what’s going on.
No, no firewalls. The error description says “cf service_unavailable”.
I have the same problem (US Eastern Time Zone). Super slow, and lots of retries. Something is going on… @yingzhao58 Please let us know what you find.
I can confirm the same problems: multiple timeouts and increased response times.
Generally a 15-second average; currently around 60 seconds, if it doesn’t time out.
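For anyone hitting the intermittent 503s and timeouts, a retry wrapper with exponential backoff at least keeps a pipeline alive. A minimal sketch, assuming you wrap your own request function (the `flaky_api_call` name below is hypothetical, standing in for whatever actually makes the request):

```python
import random
import time


def with_retries(fn, max_attempts=5, base_delay=1.0,
                 transient_errors=(TimeoutError, ConnectionError)):
    """Call fn(); on a transient failure, back off exponentially and retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except transient_errors:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            # 1s, 2s, 4s, ... plus jitter so parallel clients don't retry in lockstep
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)
```

Usage would be something like `with_retries(lambda: flaky_api_call(prompt))`. It doesn’t make the API faster, but it turns a hard failure into a delayed success when the 503s are sporadic.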
I confirm the issue from Poland and Norway.
The -0301 model is making token rates it never could before, meaning they messed with it.
[stop] 292 words/374 chunks, chunk 1 in 0.470 seconds
374 tokens in 5.3 seconds (70.1 tokens/s @ rate: 76.9 tokens/s)
Today’s non-function-call gpt-3.5-turbo for me:
[length] 388 words/512 chunks, chunk 1 in 0.401 seconds
512 tokens in 12.1 seconds (42.4 tokens/s @ rate: 43.9 tokens/s)
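Numbers like those above are easy to collect yourself: time a streaming response, recording first-chunk latency and overall chunk rate. A sketch, assuming `stream` is any iterable of streamed chunks (with chat completions streaming, each chunk is roughly one token; the distinction between the two rates is that "rate" excludes the wait for the first chunk):

```python
import time


def measure_stream(stream):
    """Time a streamed completion: first-chunk latency and chunk (token) rates."""
    start = time.perf_counter()
    first_chunk_at = None
    chunks = 0
    for _ in stream:
        chunks += 1
        if first_chunk_at is None:
            first_chunk_at = time.perf_counter() - start
    total = time.perf_counter() - start
    # generation rate after the first chunk arrived (guard against a zero interval)
    rate = (chunks - 1) / max(total - first_chunk_at, 1e-9) if chunks > 1 else 0.0
    return {
        "chunks": chunks,
        "first_chunk_s": first_chunk_at,
        "total_s": total,
        "tokens_per_s": chunks / max(total, 1e-9),
        "rate": rate,
    }
```

Feeding it the chunk iterator from a streamed API call gives you the same first-chunk and tokens/s figures quoted in this thread, which makes before/after comparisons much less hand-wavy.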
I see new posters here. For prepay accounts (anyone who only added a payment method within the last month), a tier system was also rolled out. Basically, if you haven’t paid OpenAI $50 in credits, you are a candidate to be shuffled off to lower, slower AI models.
Below is a breakdown of the first 3 usage tiers.
| Tier | Qualification | Rate limits |
|---|---|---|
| Free | User must be in an allowed geography | 3 RPM, 200 RPD; 20K TPM (GPT-3.5), 4K TPM (GPT-4) |
| Tier 1 | | 500 RPM, 10K RPD; 40K TPM (GPT-3.5), 10K TPM (GPT-4) |
| Tier 2 | $50 paid and 7+ days since first successful payment | 80K TPM (GPT-3.5), 20K TPM (GPT-4) |
We plan to expose additional usage tiers over time and will adjust these accordingly in response to capacity and fraud activity. Our main goal with usage tiers is to automatically increase rate limits and spending limits for customers who are successfully paying their bill.
As your usage tier increases, we may also move your account onto lower latency models behind the scenes.
Look at the rate limits on your account and compare them against the tier guide to find where you stand.
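Besides the account page, every API response carries `x-ratelimit-*` headers (these header names are the ones OpenAI documents), so you can read your effective limits straight off a normal request. A small sketch that just pulls them out of a response’s headers mapping:

```python
def summarize_rate_limits(headers):
    """Extract the x-ratelimit-* values from an API response's headers."""
    keys = (
        "x-ratelimit-limit-requests",
        "x-ratelimit-remaining-requests",
        "x-ratelimit-limit-tokens",
        "x-ratelimit-remaining-tokens",
    )
    # headers can be any dict-like mapping, e.g. requests' resp.headers
    return {k: headers.get(k) for k in keys}
```

Pass it `resp.headers` after any completion call and compare the `limit` values to the tier table above; that tells you which bin your account is in without waiting on support.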
As a long-standing monthly-billed account (and forum regular?), I apparently have a tier-3 160K TPM limit for gpt-3.5-turbo (up from the default 90K since I last looked), but a peasant rate below tier 1 for gpt-4.
That’s interesting, but unfortunately I was not aware of it. I’ve used pay-as-you-go since November last year.
Perhaps I should add a credit, but if this is an indication of the higher-latency models we’ve been shuffled off to (“As your usage tier increases, we may also move your account onto lower latency models behind the scenes”), I’m not too keen to wait it out.
If you are seeing this problem, then just drop a “me too” reply.
As much as I personally dislike “me too” replies, they do work for getting problems resolved.
Another way, apparently, to skip the bin you’ve been put in is to pay 8x as much for your tokens, by using a fine-tuned model.
[length] 384 words/512 chunks, chunk 1 in 0.972 seconds
512 tokens in 6.7 seconds (76.2 tokens/s @ rate: 89.1 tokens/s)
Good idea, replying to bump the issue to the top. I had my first bad customer demo this morning because of this issue.
And it’s crazy that people paying for the API get slower responses than the free ChatGPT web app.
Awesome. Are you facing this issue too?
I see. It’s hard to present a demo showing how your product will save people time if the generation takes 20 minutes.
I can’t even add credit, since my account was created a long time ago.
Same boat once upon a time (maybe not 20 minutes, though). I was in a casual meeting back when the latest model was davinci. I had made a chatbot and wanted to show it off, but the model was extremely slow to respond because of some unannounced downtime (even though the status page reported OK).
I have resorted to caching responses for presentations now. That, or using Azure instead of OpenAI servers will most likely be your best bet.
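The caching approach is simple enough to sketch: key responses by the exact request parameters and replay them during the demo, so the live API is only hit for prompts you haven’t rehearsed. The `call_model` name here is hypothetical, standing in for your real API call:

```python
import json


def cached(call_model):
    """Wrap a model call so identical requests replay the stored response."""
    cache = {}

    def wrapper(**params):
        # a stable key: identical params always serialize identically
        key = json.dumps(params, sort_keys=True)
        if key not in cache:
            cache[key] = call_model(**params)
        return cache[key]

    return wrapper
```

During rehearsal you run the demo once against the real API to populate the cache; on stage, every repeated prompt returns instantly regardless of what the API is doing that morning.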
Side note: I am not experiencing any slow-downs. Not very helpful, I know.
This has also been happening to me these last few weeks. Some of my automated processes failed because of timeouts while the status page was showing a big green “OK” status…