We currently have 3 OpenAI organizations for DEV, PREPROD and PROD environments. We noticed unusual latency on the PROD environment since last October 13.
We initially thought it was related to our implementation but after isolated tests, it seems the issue comes from your service. I attached a table showing the completion time for 15 different prompts.
Itâs not your environments, it is your organization.
OpenAI found a bunch of old mining GPU machines on eBay, blew them out with a leaf blower, and set them up in the basement of the old bubble gum factory to run turbo models finally crushed enough to fit in 16GB. And then assigned them to those who have paid the least money upfront. Or so you would think.
The actual answer lies in your organization rate limit page. You will see a new tier system that your account has been assigned to. Especially for accounts with prepay, if you havenât purchased $50 of credits, youâll have been throttled, moved to slow servers, or whatever mechanism it is that they are using to provide lower satisfaction. So pay for credits.
If you can de-convolute the tangle of invited users from the strange organization system that you might be trying to use to limit internal rights or spending, you might try to consolidate so that you can overall be seen as a big spender, at least on that account needed for facing customers.
Here are the numbers for the last 12 days for that same test. 15 completions in parallel on the 3 environments. The y axis is the average completion time in ms.