GPT-3.5-turbo API not working

Hi everyone! I have ChatGPT Plus, and I made a shortcut in Apple Shortcuts that makes an API request using my API key. The problem is that I have already set up billing details and rate limits in my OpenAI account, but I can’t use the gpt-3.5-turbo model. Every time I use the API key, the answer is not as fast as it should be. That makes me think my API requests don’t use the turbo model for some reason…

Are you getting an error from the API, or is it just slow? The API is not instant even on good days (don’t let the “turbo” name fool you), and there have been several performance issues today.

It’s working, but slowly. Over the last month I tested it several times (for example, “write a 200 word essay”, both online with the turbo model and through my API), and the difference in response speed was huge. I know there are sometimes technical issues, etc., but it looks like there is something more.

Is it a real difference or merely perceived?

Are you streaming the result back from the API or waiting for the whole result?

To better quantify what you’re experiencing, you should send the same prompts to both the web interface and the API, time them, and count the token usage.

Do that several times.

Then, for each method, add up all the tokens used and all the times, divide the former by the latter, and you’ve got the tokens per second (t/s).

You can also at that point compute the t/s for each individual prompt through each interface, which lets you estimate the distribution of t/s rates for each interface.
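That arithmetic is easy to script. A minimal sketch in Python — the token counts and timings below are made-up placeholders, not real measurements:

```python
def tokens_per_second(samples):
    """Aggregate t/s: total tokens divided by total elapsed seconds.

    `samples` is a list of (tokens, seconds) pairs, one per prompt run.
    """
    total_tokens = sum(t for t, _ in samples)
    total_seconds = sum(s for _, s in samples)
    return total_tokens / total_seconds


def per_prompt_rates(samples):
    """Per-run t/s values, useful for estimating the distribution."""
    return [t / s for t, s in samples]


# Hypothetical measurements: (total_tokens, elapsed_seconds) per run.
api_runs = [(260, 21.0), (248, 24.5), (255, 22.3)]
web_runs = [(260, 9.8), (251, 7.2), (258, 11.0)]

print(f"API: {tokens_per_second(api_runs):.1f} t/s")
print(f"Web: {tokens_per_second(web_runs):.1f} t/s")
```

The API response includes a `usage` field with the token counts, so you don’t need to count tokens yourself for the API side.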

Then you can do a simple difference of means test to see if there is a statistically significant difference.
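For the difference-of-means test, one option is Welch’s t statistic (which doesn’t assume equal variances) computed over the two samples of per-prompt t/s rates. A sketch using only the standard library — the rates here are illustrative, and in practice you’d compare the statistic against a t table (or use a stats package) to get significance:

```python
import math
from statistics import mean, stdev


def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    va, vb = stdev(a) ** 2, stdev(b) ** 2
    return (mean(a) - mean(b)) / math.sqrt(va / len(a) + vb / len(b))


# Hypothetical per-prompt t/s rates from repeated runs of the same prompt.
api_rates = [12.4, 10.1, 11.4, 12.9, 10.8]
web_rates = [26.5, 34.9, 23.5, 28.1, 30.2]

t = welch_t(api_rates, web_rates)
print(f"t = {t:.2f}")  # a large |t| suggests a real difference in means
```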

At that point you’ll have something quantifiable to talk about.

Until then, it’s impossible for anyone to say anything with any authority about the problem you’re describing.

The service is also experiencing a lot of attention and growth. Some days it does certainly seem slower than others, so I’m not surprised if you thought it was faster 1 month ago. Bottom line though is nothing can really be done. Use streaming, limit your prompt size if you can, and hope that OpenAI adds more scale soon.
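For reference, streaming doesn’t make generation faster, but it shows tokens as they arrive instead of making you wait for the whole reply. A rough sketch assuming the 2023-era 0.x `openai` Python package, with the API key read from an environment variable and a placeholder prompt — the network call is skipped entirely if no key is set:

```python
import os


def collect(fragments):
    """Join streamed text deltas into the complete reply."""
    return "".join(fragments)


# The call below only runs if an API key is configured in the environment.
if os.environ.get("OPENAI_API_KEY"):
    import openai  # pip install openai (0.x-era client assumed)

    openai.api_key = os.environ["OPENAI_API_KEY"]
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Write a 200 word essay."}],
        stream=True,  # receive partial deltas instead of one final payload
    )
    pieces = []
    for chunk in response:
        text = chunk["choices"][0]["delta"].get("content", "")
        print(text, end="", flush=True)  # display tokens as they arrive
        pieces.append(text)
    print()
    reply = collect(pieces)
```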

We’ve definitely been noticing a huge, and quantifiable, slowdown today (and over the last few days)

When making an API request (3.5 model) asking for 1 result, it takes 10 seconds.

The same prompt, when entered in ChatGPT website (Plus subscription, using 3.5 model), completes streaming in 3 seconds.

If we ask for several results (n = 3+), the whole request often times out (we wait 30s, then give up).
We’ve tried this dozens of times, and are seeing the same slowness repeatedly.

Would be helpful to get official guidance as to whether API calls have a different priority than ChatGPT Plus.

I don’t know that it’s a difference in priority so much as they’re probably just on different server clusters and experiencing different loads.

I know there’s also been times when the web interface has been inaccessible while the API continued working.

I have done it several times during the last month (with both long and short messages), and the speed (tokens/sec) was much slower than the online model. It looks like my API key only gets the default 3.5 model, not the faster one.


I just did some more speed comparisons with this prompt, running the prompt 5 times in each interface.

  1. API - 20-25s
  2. ChatGPT Plus - 7-15s
  3. ChatGPT (Free) - 20-30s

Definitely looks like API calls run at the same speed as the free ChatGPT web interface and don’t get the ChatGPT Plus speed boost :disappointed: Which is a shame, as both are paid services.

Granted, they’re getting $20 per month for GPT Plus, and the same volume of calls through the API would probably cost < $1.

I imagine this is because API calls open them up to huge volumes through automation, whereas the ChatGPT Plus web interface can only consume input at human speeds.

I’d love to see a priority/faster API access level - it’s so cheap, we’re happy to pay double to get GPT Plus speeds.