Completion vs. chat performance

Hi Guys

I am having a problem with the speed and reliability of the chat API.
I started my app with completion/davinci, which went reasonably fast.
I then wanted to switch to chat and gpt-3.5-turbo for cost and speed reasons. However, gpt-3.5-turbo is painfully slow, and I frequently get a 500 error due to overload.
I am now trying gpt-4. It was a bit faster at first, but now I am getting timeouts again.
I am running the latest python lib openai-0.27.7

At this time, the chat API is practically unusable for me.
Any advice? What am I doing wrong?

I am not using Python, but I think what you are experiencing is normal.

  1. Currently, Davinci is faster than GPT-3.5-turbo, and GPT-3.5-turbo is faster than GPT-4.
    You can check this website for real-time results (not OpenAI official website):
    OpenAI API response time tracker

  2. Unfortunately, overload happens, and we cannot control it when many people are using the service. However, if you are consistently experiencing this issue, it might be an actual error. In that case, you can check the status page:

  3. At the very least, if you turn on “stream”, you can see the response as it is generated. So even if it takes 40 seconds to fully respond, you won’t feel stuck. Check the OpenAI documentation and example here:
    OpenAI API
    openai-cookbook/How_to_stream_completions.ipynb at main · openai/openai-cookbook · GitHub
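To sketch point 3: with openai 0.27.x, passing `stream=True` to `ChatCompletion.create` yields chunks whose partial text lives in a `delta` field. A small helper to assemble those deltas might look like this (the function name `collect_stream` is my own, not part of the library):

```python
def collect_stream(chunks):
    """Concatenate the text deltas from a streamed ChatCompletion response."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:  # role-only and final chunks carry no content
            parts.append(delta["content"])
    return "".join(parts)

# With the real API it would be driven roughly like this (needs an API key):
# import openai
# response = openai.ChatCompletion.create(
#     model="gpt-3.5-turbo",
#     messages=[{"role": "user", "content": "Hello"}],
#     stream=True,
# )
# print(collect_stream(response))
```

In an interactive app you would print each delta as it arrives instead of collecting them, which is what makes the 40-second wait feel responsive.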
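On point 2, the usual workaround for intermittent 500/overload errors is to retry with exponential backoff. A minimal sketch (the wrapper and its parameters are illustrative, not part of the openai library; in practice you would retry only on rate-limit and server errors, not on everything):

```python
import random
import time

def with_retries(call, max_attempts=5, base_delay=1.0):
    """Retry a zero-argument callable with exponential backoff and jitter.

    `call` could be e.g. a lambda wrapping openai.ChatCompletion.create(...).
    For simplicity this sketch retries on any exception.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # 1s, 2s, 4s, ... scaled by random jitter to avoid thundering herd
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```

This won’t make the model faster, but it turns sporadic 500s into a delay instead of a failure.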

Thanks for the answer. I only generate a handful of characters with each call, so streaming won’t help in my case. I’ll check the sites you suggested to get a better feel for the availability.
Thanks!