Hi!
New here. I’m developing an application based on the OpenAI API for my company. In my application I handle the easy tasks with GPT-3.5 and call GPT-4 for the harder generations. GPT-4 is a marvel and handles very complicated tasks flawlessly, but unfortunately it costs too much, and I can’t market the application if it only calls GPT-4.
Now I’m having problems with the API calls to GPT-3.5. I’m calling chat.completions.create with the standard openai Python client, but I’m seeing wildly variable response times. For the exact same prompt, I sometimes get the answer almost immediately and sometimes only after something like 3 minutes. If I kill the process and restart it, responses usually go back to being almost instantaneous.
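For reference, my call looks roughly like the sketch below (the model name, timeout and retry values are just placeholders, not my real settings). I’ve been wondering whether setting an explicit timeout and retry count like this would at least cap the worst case while I figure out the root cause:

```python
from openai import OpenAI

# Client-level defaults: cap how long a single request may take and
# let the SDK retry transient failures (values are examples only).
client = OpenAI(timeout=30.0, max_retries=2)

def ask_gpt35(prompt: str) -> str:
    # A per-request override is also possible via client.with_options(timeout=...).
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(ask_gpt35("Say hello in one short sentence."))
```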
This variability is a real problem for me, and I would like to know if you experience it too. Did you find a solution?
GPT-4, by contrast, is way faster and never hangs for me.
Thanks!