I’d like to check with others, because it seems like something is seriously off. Perhaps it’s just my region or my country being treated this way, but: I’ve been trying to use the API for the last week or so, and it barely works. I get perhaps one response out of 8, after 3-7 minutes of waiting.
Most of my calls end with a “524 Gateway Timeout” response after 10 minutes, but I also have a collection of “502 Bad Gateway”, some 429s, and even a mildly amusing “404 Model not found”.
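For anyone seeing the same mix of status codes, here is a minimal sketch of a retry wrapper with exponential backoff and jitter. It is not taken from any OpenAI library: `call_with_backoff`, the `send` callable, and the `RETRYABLE` set are hypothetical names, and `send` stands in for whatever function actually performs the HTTP request and returns a status code and body. Note that if failed calls are billed, as described above, every retry may cost money too.

```python
import random
import time
from typing import Callable, Tuple

# Assumption: these are the status codes treated as transient here.
RETRYABLE = {429, 500, 502, 503, 504, 524}


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff capped at max_delay, with full jitter.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
    # All attempts failed: hand back the last response seen.
    return status, body
```

The `sleep` parameter is injectable only so the wrapper can be exercised without actually waiting.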
I use GPT-4, but I also tried gpt-3.5-turbo, which is slightly more responsive, but nowhere near reliable.
My calls are around 2k tokens in size, with around 1k tokens returned.
BTW, I get billed for everything, including when I get 5xx responses, which I think is fraudulent.
Is it just me, or is it like that for everyone? And if everybody is seeing the same thing, how do people build anything on top of OpenAI APIs?
We recently moved a project to Azure’s OpenAI service (dedicated deployment) for a client and saw a 75% decrease in latency. So I guess this is the way to go for performance: develop on the OpenAI endpoint, deploy on an Azure dedicated endpoint.
Thanks for the replies! It seems that something is off regionally, then — I wouldn’t even complain about processing time that much, if it wasn’t for the errors and timeouts.
An API that responds 1-2 times out of 10 and takes up to 10 minutes to not respond is a broken API. I can’t see any reasonable way of using it right now. And it doesn’t help that I have to pay for all the broken calls.
Now how do I get through the “support” moat and wall that OpenAI has set up and report this…
There’s no support moat. Plenty of people are already reporting their latency issues, and OpenAI is actively trying to add server capacity around the globe. If you still want to contact OpenAI, you can do that here:
There definitely is a moat. There is no way to contact support other than being forced through an obstacle course of “search our knowledge base”, talking to a bot, and going through a bunch of rather useless questions. The company is doing its best to lose us along the way, so as to get fewer support requests.
I would respectfully suggest that a better way to get fewer support requests is having a product that actually works.
Ok, so I’ve narrowed it down to request size. GPT-4 requests with fewer than 300 tokens get replies in about a minute or so. Requests with 500 tokens get replies in about two minutes. 2k requests time out pretty much every time, on the OpenAI side.
As for support, there is no excuse for making people go through an obstacle course. Either you run a service (paid API), or you don’t. Price the service realistically, treat your customers with respect and support them.
Almost a month later, and I still can’t use the API. Larger requests time out and I usually end up with 500 or 502 Bad Gateway responses. Meanwhile, the status page is happily green, indicating no API outages. Over the last month there were perhaps two or three days when the API did work.
Support responded after three weeks with a generated auto-response telling me to try again later. The “respond” link in their E-mail doesn’t even work, so I have no way to reply to their pseudo-response. There is no way to talk to OpenAI.
I can fight my way through the bot and file new support requests, but what’s the point: nobody seems to be listening anyway.
I pay for those “502 Bad Gateway” requests and with GPT-4 that’s accumulating quickly.
Before you say “it works for me”: things do work with small requests. But my GPT-4 requests are >3k tokens in size.
This is frustrating. I’ve naively assumed the API is a product and I’m frustrated by things being totally broken, paying for the brokenness, and being unable to contact anyone at OpenAI.
I filed two support requests, jumping through all the hoops as instructed. After a month, here’s where I am:
Request #1: an auto-generated reply, mostly saying that things will be improving and asking if that resolves my problem (it doesn’t). I couldn’t respond: the link was broken and there is no way to reply in the online interface.
Request #2: charging for requests that terminated with 4xx/5xx codes. No reply after 4 weeks.
I’d say there is a moat (you have to fight your way through the bots). And the support is pretty much non-existent — please note that I don’t even need “support” (I’m not asking about how to develop using the API!), I need customer service.
I’m sorry to hear that you’ve had a bad experience. Mine isn’t better, but if you want customer support you’ll have to go through help.openai.com
We unfortunately cannot help you in that regard, as none of us works for OpenAI.
I understand what you mean by “moat”. As I’ve said before, I don’t think this is intentional; I’m just saying that the queue is long.
Well, I understand that — but if help.openai.com doesn’t respond, what else do I have left, other than complain here?
FWIW, my GPT-4 responses have gotten better recently and the API became somewhat usable, even for larger queries. I’m still paying for 5xx/4xx requests, which is fraudulent, and I still haven’t heard from “support” about this.
I still disagree about the moat. Setting up your “support” so that customers have to jump through hoops, run an obstacle course through the FAQs and fight their way through bots (instead of just providing an E-mail address where I can send an E-mail) is a moat. The process is designed to discourage customers from contacting support, and this is intentional.
On a technical note, how would OpenAI securely implement a 400/500 error detection system?
I use a whole host of APIs from various providers, for everything from weather prediction to tidal flow rates, and if I get a 4xx or 5xx class response, that’s just part of the price of using a globally interconnected network. There is no way, that I know of, for the sending server not to spend the compute on your request before the error occurs. In this case, Cloudflare times out because the response takes longer than usual, and there is a solution to that in the form of streaming replies. So the calculations have been performed with good intent, and then a downstream interconnect fails. The only way to detect that would be for the client (you) to send back an “I did not get that” message or some error code. In a perfect world that would be great, but in the real world it would be hacked in 2 seconds and used to spoof free content.
What OpenAI is doing is industry standard practice and it is also well understood when developing beta products.
I think this problem has already been formalized in computer science. Here is a link for people who are interested:
I agree that this seems to be standard practice across the industry.
You always have the option of voting with your wallet: you can unsubscribe if you’re unsatisfied with the service. This is the community forum for developers. It’s intended as a place where developers, who are either building things that use the OpenAI API or plugins for ChatGPT, can share knowledge and help each other. It’s not intended as the overflow for OpenAI’s customer support.
If you need customer support your best bet is still:
If you have any issues with your code or getting GPT to respond in a particular way, we’re always more than happy to help.
Yup, I agree on the 2 generals point, but in this case, the compute is spent before the result is returned unless the user has opted for streaming, in which case only an approximate 10 additional tokens will be used if the connection is broken.
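To make the streaming point concrete, here is a rough sketch of consuming a streamed reply so that a dropped connection still leaves you with the text received so far. It deliberately does not call any real client library: `collect_stream` and `flaky_stream` are hypothetical names, and the iterable of text pieces stands in for whatever chunk iterator your HTTP client or SDK yields when streaming is enabled.

```python
from typing import Iterable, Iterator, Tuple


def collect_stream(chunks: Iterable[str]) -> Tuple[str, bool]:
    """Join streamed text pieces; return (text, completed).

    completed is False if the connection broke before the stream ended,
    in which case text holds whatever arrived before the failure.
    """
    parts = []
    try:
        for piece in chunks:
            parts.append(piece)
    except ConnectionError:
        return "".join(parts), False
    return "".join(parts), True


def flaky_stream() -> Iterator[str]:
    """Stand-in for a streamed response that dies partway through."""
    yield "Hello"
    yield ", wor"
    raise ConnectionError("connection dropped mid-stream")


text, completed = collect_stream(flaky_stream())
print(repr(text), completed)
```

Because tokens arrive as they are generated, the connection does not sit idle long enough to trip a gateway timeout, and a break partway through only loses what had not been sent yet.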
The current solution is not “industry standard practice”, it’s a hack. The comments about the “Two Generals Problem” are beside the point: I wasn’t complaining about requests that timed out between me and OpenAI. I was complaining about requests that timed out within OpenAI’s infrastructure (that includes Cloudflare, apparently). The API endpoint is what I transact with, and if that returns a 4xx/5xx without the data I paid for, I should not be charged. If you implement your billing based on the actual API endpoint that customers transact with, you don’t bill for stuff you didn’t deliver, so it’s not like it’s an unsolvable problem.
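As a purely hypothetical sketch of what billing at the endpoint could look like (this says nothing about how OpenAI's billing actually works; `CallRecord` and `billable_tokens` are made-up names), usage would be counted only for calls where the customer-facing endpoint returned a 2xx status:

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class CallRecord:
    """One request as seen at the customer-facing endpoint (hypothetical)."""
    status: int             # HTTP status returned to the customer
    prompt_tokens: int
    completion_tokens: int


def billable_tokens(records: Iterable[CallRecord]) -> int:
    """Sum tokens only for calls the endpoint answered with a 2xx status."""
    return sum(
        r.prompt_tokens + r.completion_tokens
        for r in records
        if 200 <= r.status < 300
    )


log = [
    CallRecord(status=200, prompt_tokens=2000, completion_tokens=1000),
    CallRecord(status=502, prompt_tokens=2000, completion_tokens=0),
    CallRecord(status=524, prompt_tokens=2000, completion_tokens=0),
]
print(billable_tokens(log))
```

With the sample log above, only the first call counts, so the two gateway errors add nothing to the bill.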
But I do understand @N2U’s suggestions that I should shut up, because that’s just how Things Are, and I guess I’ll proceed to do just that.