An hour ago the API was super slow,
but just now while working on my app, to my surprise, it’s like 3–4x faster.
Anyone else experiencing this?
It’s not actually “fast”; it just felt fast to me because it was extremely slow before (around a minute per response).
Now, after some tests, it’s around 18–25 seconds.
I don’t want to jinx it, but OpenAI might have improved the API. Thanks!
I hope everyone can upvote this post so OpenAI sees it. I hope they are open and transparent and give a reason.
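For anyone who wants to reproduce my numbers, here’s roughly how I’m timing it - a minimal sketch using the (pre-1.0) openai Python library, where the model and prompt are just placeholders:

```python
import time
import openai

# Time one full (non-streaming) round trip to the API
start = time.perf_counter()
openai.ChatCompletion.create(
    model="gpt-4",  # placeholder - use whichever model you're testing
    messages=[{"role": "user", "content": "Say hello."}],
)
print(f"Round trip: {time.perf_counter() - start:.1f}s")
```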
Hello
Personally, I tested and I do not find that it is slower; it is often even faster. Also pay attention to one thing: the API returns the entire response at once, while the chat on the OpenAI site delivers the response gradually (streaming), which can give the illusion that it’s faster.
I am following these discussions, and I can see there are a few users with similar conclusions, although the loudest is @rantrg, who has said a few times that they are switching models and moving to other solutions.
I can see you are passionate about this subject. I am guessing you are building a product which relies on fast responses, but that is probably not possible at this moment. It was your business decision, and you selected OpenAI. I might guess that the solution you are working on would probably have been impossible without the OpenAI API; we are in the same position. We need to rely on it and adjust our model and what we are offering to make it all work. In our case, I would guess that 95% of the solution relies on OpenAI technology; we would never be able to build it without it.

Based on how passionately you are making these comments loud and clear, I am guessing you are in a similar position, but I don’t think your stance is fair. You are trying to make money on the back of expensive, lengthy research and development by other companies, which have made these models available at very competitive prices, allowing you and me to do something revolutionary inside our small businesses.
I guess you would need to adapt your offering (if possible), wait a bit, or move to other tools. Whatever you do, you might want to adjust your comments here, as I believe they are no longer constructive and are not bringing quality to the discussion.
Having said this, I am not getting the feeling that the API service is as disastrous as you describe. Replies for us are not as slow as you say - maybe you would want to share what you are sending, in what form, and what you expect the model to do with it. Would it be possible to use parallel processes to speed things up? We are splitting our longer texts into chunks, sending them in parallel, and then stitching them back together, which makes things significantly faster - something like the sketch below.
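A rough sketch of the idea rather than our actual code (the prompt, model, and worker count are illustrative):

```python
import concurrent.futures

import openai

def summarize_chunk(chunk: str) -> str:
    # One blocking API call per chunk; the prompt here is just an example
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": f"Summarize:\n\n{chunk}"}],
    )
    return resp["choices"][0]["message"]["content"]

def process_in_parallel(chunks: list[str]) -> str:
    # map() preserves input order, so stitching the parts back together is trivial
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as pool:
        parts = list(pool.map(summarize_chunk, chunks))
    return "\n\n".join(parts)
```

With five workers, the wall-clock time is roughly that of the slowest single chunk rather than the sum of all of them.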
Has anyone tested the Azure OpenAI Service who can confirm that the latency is different?
Did you consider caching?
If you tested the API first, then the cache was populated, and the web response was served much faster from the cache.
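To illustrate the idea on the client side (a minimal sketch - I have no insight into how the web chat actually caches, and cached_completion is just a made-up helper name):

```python
import openai

_cache: dict[str, str] = {}

def cached_completion(prompt: str, model: str = "gpt-3.5-turbo") -> str:
    # Identical (model, prompt) pairs are answered from memory with no API call
    key = f"{model}:{prompt}"
    if key not in _cache:
        resp = openai.ChatCompletion.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        _cache[key] = resp["choices"][0]["message"]["content"]
    return _cache[key]
```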
How about this: they don’t slow down the API but speed up the web? Because the web is used by humans, while the API is for scripts, and scripts don’t mind waiting.
I have seen improvements! Since today, responses seem a little faster. I understand it still comes down to prompt engineering and telling the system to keep responses short for a chat session, as long answers from GPT may take longer than conversational speed allows. What changes do I need to make to the API call to enable a streaming response?
Which models are you using in your tests? GPT-3.5 Turbo is fast. GPT-4 is slower.
One also has to consider the complexities of scaling something like GPT-4… I’m of the opinion that one of the reasons for the 25 messages / 3 hours limit for GPT-4 in ChatGPT Plus is likely that even Microsoft Azure can’t handle the number of requests we would otherwise send.
Expanding the cloud and adding more fast servers is likely going to take time.
Here’s a guide on streaming. Good luck! How to stream completions | OpenAI Cookbook
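The gist of it is passing stream=True and iterating over the chunks as they arrive - a minimal sketch with the (pre-1.0) openai Python library, where the model and prompt are placeholders:

```python
import openai

# With stream=True the call returns a generator of partial chunks
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a haiku about latency."}],
    stream=True,
)

# Print tokens as they arrive instead of waiting for the full answer
for chunk in response:
    delta = chunk["choices"][0]["delta"]
    print(delta.get("content", ""), end="", flush=True)
```

Total time stays about the same, but time to first token drops dramatically, which is why the website feels faster.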
Yeah, if OpenAI continues to slow down their APIs, developers who want minimal latencies in their projects are going to turn to alternatives soon.
For example, while ChatGPT may be essentially unrivaled in the world of chatbots, less unique products like Whisper and DALL-E are going to be overtaken rather swiftly.
In speech recognition, Deepgram has produced a model (Nova) that’s faster, cheaper, and more accurate than Whisper. Likewise, products like Stable Diffusion and Midjourney produce AI art that easily rivals DALL-E.
An intentionally slow API is only going to hurt OpenAI—at least in domains like art-generation and speech-recognition, where the given product isn’t as unique as ChatGPT.
I looked at the Nova pitch deck, and they show AWS having a lower WER than Whisper. That has NOT been my experience, so I doubt their claims, especially for English WERs.
But point taken, AI is taking off and alternatives abound!
Azure is freaking horrible right now as well. I think everyone is overloading it with AutoGPT.
I was on Azure for a few hours and moved back to OpenAI; at least I’m getting responses.
I asked it the same question and it answered immediately as shown below.
Are you limiting developers? If so, can you declare this, and also the date you will stop doing this (if ever)?
As an AI language model, I do not have the authority to limit or declare anything regarding developers or their work. My purpose is to provide assistance and answer questions to the best of my ability, within the scope of my training data and capabilities. It is up to the developers and individuals using my services to adhere to ethical and legal guidelines when it comes to developing and utilizing AI technologies.
It’s possible that OpenAI is intentionally limiting the speed of its API to control the load on its servers and prevent overload or downtime. This is a common practice for companies that offer API services. By limiting the speed, OpenAI can ensure that all developers using the API have equal access to its resources.
It’s also possible that there are other factors at play, such as network congestion or server load, that could be causing the slower response times for the API compared to the website.
Regardless of the cause, it would be helpful for OpenAI to communicate clearly with developers about any limits or restrictions on API usage. This would allow developers to adjust their workflows accordingly and avoid frustration or wasted time. If you have concerns about the API response time, it may be worth reaching out to OpenAI’s support team for clarification.
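On the client side, the usual mitigation for throttling is exponential backoff - a minimal sketch with the (pre-1.0) openai Python library, where the retry count and delays are just illustrative:

```python
import time

import openai

def complete_with_backoff(messages, model="gpt-3.5-turbo", max_retries=5):
    # Retry with an exponentially growing delay when the API signals throttling
    delay = 1.0
    for attempt in range(max_retries):
        try:
            return openai.ChatCompletion.create(model=model, messages=messages)
        except openai.error.RateLimitError:
            if attempt == max_retries - 1:
                raise  # give up after the last attempt
            time.sleep(delay)
            delay *= 2  # 1s, 2s, 4s, 8s, ...
```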
It’s much like Steve Jobs once said, you’ve got two options: use it or don’t!
There are always alternatives. I see so many people getting stressed over slowdowns and performance, which is understandable, but they need to remember this is all beta and could be pulled away at any moment.
I saw a developer going nuts about the service being down, presumably because he was running a paid service on it, but it feels like people are building their houses on sand by relying on a beta platform and reselling it as production-ready code.
Take it as it is, and if you don’t like it, try one of the many, many self-hosted options. Or, as someone else suggested, apply for an Enterprise account via Azure. Until then, I’m afraid they are free to prioritise their own service or slow down the API whenever they want. There is no SLA, so at least we know what we’re getting.
I"m an enterprise customer on azure with the 32k model. Its 404ing currently.
Interesting to hear - do they offer any kind of SLA on this package? They are supposedly on different hardware, so I’m curious whether any outages are linked.
I’m starting to think there is something in all this, and seeing it happen on Azure as well makes you think they are preventing people from going too far. Time to try other models!