I’ve been noticing a significant delay in API responses since yesterday (starting in the morning, US Central time): 1,000 tokens take at least 30 seconds, and some requests take more than 10 or 20 minutes to complete (responses usually took 10 to 12 seconds). In contrast, the ChatGPT (free) web interface is responding quickly.
Anyone facing the same issue?
Edit: I decided to trust @anon5861895 and bought $50 of credit on my company account. Result: no more issues, queries three times faster than when the API was working well, more reliable, and served by a better ChatGPT model. Crazy! It also raised my usage limit from $120 to $600.
However, the issue persists when I use my original API key.
I can confirm the same problems: multiple timeouts and increased response times.
Generally 15 seconds on average; currently around 60 seconds, if it doesn’t time out.
The -0301 model is hitting token rates it never could before, meaning they’ve messed with it:
[stop] 292 words/374 chunks, chunk 1 in 0.470 seconds
374 tokens in 5.3 seconds (70.1 tokens/s @ rate: 76.9 tokens/s)
Today’s non-function-call gpt-3.5-turbo for me:
[length] 388 words/512 chunks, chunk 1 in 0.401 seconds
512 tokens in 12.1 seconds (42.4 tokens/s @ rate: 43.9 tokens/s)
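For anyone who wants to reproduce numbers like these, here is a minimal sketch of the timing loop, assuming the pre-1.0 `openai` Python package; the prompt and `max_tokens` are placeholders, and with streaming each chunk is roughly one token:

```python
# Rough latency probe, assuming the pre-1.0 `openai` package
# (pip install "openai<1.0"). Streams a chat completion and reports
# time-to-first-chunk plus overall chunk (~token) rate.
import time
import openai

openai.api_key = "sk-..."  # placeholder

start = time.time()
first = None
chunks = 0

for chunk in openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write 300 words about anything."}],
    max_tokens=512,
    stream=True,
):
    if first is None:
        first = time.time()  # time to first streamed chunk
    chunks += 1

elapsed = time.time() - start
print(f"chunk 1 in {first - start:.3f} seconds")
print(f"{chunks} chunks in {elapsed:.1f} seconds ({chunks / elapsed:.1f} chunks/s)")
```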
I see new posters here. For prepaid accounts (anyone who only added a payment method within the last month), there is also a tier system that was rolled out. Basically, if you haven’t paid OpenAI $50 in credits, you are a candidate to be shuffled off to lower, slower AI models. The tier that lifts you out of that:
Qualification: $50 paid and 7+ days since first successful payment
Usage limit: $250
Rate limits: 5000 RPM; 80K TPM (GPT-3.5), 20K TPM (GPT-4)
We plan to expose additional usage tiers over time and will adjust these accordingly in response to capacity and fraud activity. Our main goal with usage tiers is to automatically increase rate limits and spending limits for customers who are successfully paying their bill.
As your usage tier increases, we may also move your account onto lower latency models behind the scenes.
Look at the rate limits on your account and compare against the tier guide to find where you stand.
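If you don’t want to dig through the dashboard, every API response also reports your current limits in `x-ratelimit-*` headers. A plain-`requests` sketch (assumes a valid key in the `OPENAI_API_KEY` environment variable):

```python
# Print the account's current rate limits from the x-ratelimit-* headers
# returned on any chat completion response.
import os
import requests

resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "ping"}],
        "max_tokens": 1,  # cheapest possible call; we only want headers
    },
    timeout=30,
)

for name in ("x-ratelimit-limit-requests", "x-ratelimit-limit-tokens"):
    print(f"{name}: {resp.headers.get(name)}")
```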
For me, being a long-standing monthly-billed account (and active on the forum?), I apparently have tier-3 160K TPM for gpt-3.5-turbo (up from the default 90K since I last looked), but a peasant rate below tier 1 for gpt-4.
That’s interesting, but unfortunate that I was not aware of it. I’ve used pay-as-you-go since November last year.
Perhaps I should add credit, but if “As your usage tier increases, we may also move your account onto lower latency models behind the scenes” is an indication of the higher-latency models we’ve been shuffled off to, I’m not too keen to wait it out.
Same boat once upon a time (maybe not 20 minutes, though). I was in a casual meeting back when the latest model was davinci. I had made a chatbot and wanted to show it off, and the model was painfully slow to respond because of some unannounced downtime (even though the status page reported OK).
I have resorted to caching responses for presentations now. That, or using Azure instead of OpenAI’s servers, will most likely be your best bet.
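Roughly what that cache looks like, as a sketch (pre-1.0 `openai` package assumed; `demo_cache/` and `cached_chat` are just names I picked): prompts are hashed to filenames, and a live call happens only on a cache miss, so a rehearsed demo never hangs.

```python
# Minimal disk cache for demo-proofing API calls.
import hashlib
import json
import os
import openai

CACHE_DIR = "demo_cache"  # arbitrary location

def cached_chat(prompt: str, model: str = "gpt-3.5-turbo") -> str:
    os.makedirs(CACHE_DIR, exist_ok=True)
    key = hashlib.sha256(f"{model}:{prompt}".encode()).hexdigest()
    path = os.path.join(CACHE_DIR, f"{key}.json")

    if os.path.exists(path):               # cache hit: no API call at all
        with open(path) as f:
            return json.load(f)["answer"]

    resp = openai.ChatCompletion.create(   # cache miss: call once and store
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    answer = resp["choices"][0]["message"]["content"]
    with open(path, "w") as f:
        json.dump({"answer": answer}, f)
    return answer
```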
Side note: I am not experiencing any slow-downs. Not very helpful, I know.
This has also been happening to me these last few weeks. Some of my automated processes failed because of timeouts while the status page was showing a big green “OK” status…
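The workaround that stopped my jobs from hanging outright: a hard request timeout plus retries with exponential backoff. A sketch assuming the pre-1.0 `openai` package, which accepts `request_timeout` and raises `openai.error.Timeout`:

```python
# Harden an automated job against slow or dead API responses.
import time
import openai

def robust_chat(messages, retries: int = 3):
    for attempt in range(retries):
        try:
            return openai.ChatCompletion.create(
                model="gpt-3.5-turbo",
                messages=messages,
                request_timeout=30,        # fail fast instead of hanging
            )
        except (openai.error.Timeout, openai.error.APIError):
            if attempt == retries - 1:
                raise                      # give up after the last try
            time.sleep(2 ** attempt)       # 1 s, 2 s, 4 s backoff
```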