In last 3 days, all of my requests are lasting forever or just throwing 503 Service Unavailable error. The prompts I used to get a response in 4-5 seconds are now lasting 40-50 seconds.
Problem exists both on my server and my PC. Tried proxies, tried different API keys, but result is same.
Legacy APIs like Completion works fine, only Chat API has this problem.
If your account has been targeted for the slowdown, there’s not much you can do.
One theory might be that if you are in Europe, you are routed to a different datacenter with different performance when using a particular account.
Nobody seems to have reported whether they get the same slow speed when moving to hosted US datacenter for their API calls.
You can see if you instead get fast performance when you fine tune a very basic model (10 benign questions, 1 epoch) and then use that ft model. Also, gpt-3.5-turbo-instruct may be fast for you if you can adapt to that model.
How can they target me for a slowdown without a proper reason or without telling anything? I pay for their service and this is the service we got served.
When I tell OpenAI help staff to check my account if there’re any speed limitations , they copy and paste the same text over and over. I’ll lose my mind.