Response times of GPT3.5 models

Hi!

New here, I’m developing an application based on OpenAI for my company. In my application, I do some easy stuff using GPT3.5 and for the hard generation I invoke GPT4. GPT4 is a marvel, doing very complicated tasks flawlessly, but it unfortunately costs too much and I cannot market the application only calling GPT4.

Now, I’m having problems with the API calls to GPT3.5. I’m calling chat.completions.create in the standard openai client for Python, but I’m experiencing wildly variable response times. For the exact same prompt, I sometimes get the answer immediately and sometimes only after something like 3 minutes. If I kill the process and restart it, it usually reverts to being almost instantaneous.

This variability is quite horrible, and I would like to know if you also experience this. Did you find a solution?

GPT4 is way faster, and never hangs for me.

Thanks!
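
For the variable response times described above, one common mitigation is to put a client-side timeout on each call and retry when it is exceeded, while logging how long every attempt took. Below is a minimal sketch of that idea, not a confirmed fix for this particular problem. The helper name timed_call_with_retry, the fake_request stand-in, and the 20-second timeout are all assumptions made for illustration. The commented-out line shows how the real call might be plugged in, assuming the v1 openai Python client, which accepts a per-request timeout argument.

```python
import time
from typing import Callable, List, Optional, Tuple, TypeVar

T = TypeVar("T")


def timed_call_with_retry(
    request: Callable[[float], T],
    timeout_s: float = 20.0,
    max_attempts: int = 3,
) -> Tuple[T, List[float]]:
    """Call request(timeout_s) up to max_attempts times.

    Returns the first successful result together with the wall-clock
    duration of every attempt, so slow or hung calls show up in the timings.
    """
    durations: List[float] = []
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        start = time.perf_counter()
        try:
            result = request(timeout_s)
            durations.append(time.perf_counter() - start)
            return result, durations
        except Exception as exc:  # e.g. a timeout raised by the client
            durations.append(time.perf_counter() - start)
            last_error = exc
    raise RuntimeError(f"all {max_attempts} attempts failed") from last_error


# Hypothetical stand-in for the real API call: fails once, then succeeds.
# With the openai client the request could instead be something like:
#   lambda t: client.chat.completions.create(
#       model="gpt-3.5-turbo", messages=messages, timeout=t)
attempts = {"n": 0}


def fake_request(timeout_s: float) -> str:
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise TimeoutError("simulated hang")
    return "ok"


result, durations = timed_call_with_retry(fake_request, timeout_s=20.0)
print(result, [f"{d:.4f}s" for d in durations])
```

Keeping the per-attempt durations makes it easier to tell whether the slowness comes from a single hung request or from every request being slow, which is useful information to include when reporting the issue.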

I have the same problem on my desktop, but not on my phone. Not sure why, but it’s an easy adjustment: do the work on your phone and it can be pulled up on your desktop. I know it’s a “hack” but it works for me.

Well, you wouldn’t say that if you saw the phone that I’m using! That thing is SLOW.

Jokes aside, I don’t understand your solution. I’m talking about the API calls, not ChatGPT: that one is always very fast for me on my computer (I don’t touch ChatGPT with my phone). Are you saying you experience a difference when doing API calls from the phone?

I do. It’s often much slower on my laptop.