ChatGPT API responses are very slow

ChatGPT API responses are very slow; even short API calls of 200-400 tokens take 20-30 seconds. Is there any way to make the responses faster?


Hi @nandha

Yes, things are slow based on the demand. I just checked for you by sending 300 words of lorem ipsum text to the chat completion API and got these results:

gpt-3.5-turbo-0301

Total Tokens: 826, Completion API Time: 16.17 seconds
Total Tokens: 866, Completion API Time: 14.434 seconds
Total Tokens: 1313, Completion API Time: 38.629 seconds

I don’t think there is much you can do at the moment, as the issue is with the performance of the turbo model(s). You could switch to another model, which I have tested to be faster than turbo these days.
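If you want to reproduce timings like these yourself, a minimal sketch along these lines should work (pre-1.0 openai Python SDK; the key placeholder, prompt text, and max_tokens value are illustrative):

import time
import openai  # pre-1.0 SDK, matching the openai.error usage later in this thread

openai.api_key = "sk-..."  # your API key

def timed_chat(prompt):
    # Time a gpt-3.5-turbo chat completion and report token usage.
    start = time.time()
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    elapsed = time.time() - start
    total = resp["usage"]["total_tokens"]
    print(f"Total Tokens: {total}, Completion API Time: {elapsed:.3f} seconds")
    return resp

def timed_davinci(prompt):
    # Same prompt against text-davinci-003 (completions endpoint) for comparison.
    start = time.time()
    resp = openai.Completion.create(
        model="text-davinci-003", prompt=prompt, max_tokens=500
    )
    print(f"text-davinci-003 Time: {time.time() - start:.3f} seconds")
    return resp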

HTH

:slight_smile:

Appendix: Example Completion


Hi ruby_coder,
I’m using the API with the gpt-3.5-turbo model too, but the responses are very slow. I’m calling the API from Python.

Yeah, it is slow for sure right now. I just tested again for you; completion time was nearly 22 seconds.

My advice is to relax and do something less frustrating until the issue on the OpenAI infrastructure side improves, if you can.

HTH

:slight_smile:


Yes, maybe “turbo” is a slightly pretentious adjective for this model :slight_smile:
I’m using cURL with PHP with max_tokens capped at 500, and the answers take around 30-50 seconds to arrive.

Mine, too… I’m also having connection errors like this.

openai.error.APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

Same slowness here, plus occasional 502 Bad Gateway responses after a long wait.

Sadly, the API is throttled for normal paying users, and at the moment we are also getting a lot of errors. It is not very usable in its current state, and we hope OpenAI will find a solution soon.

Is there a way to avoid this error?
I had a loop that broke today after 5 minutes, and I didn’t even notice when it did.

The best way to adjust, I think, is to rework your solution to avoid invoking the API where you can, or to batch your requests so you call it fewer times and cut the total wait, as in the sketch below.
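For example, something along these lines pays the per-call latency once instead of once per item (a hypothetical sketch with the pre-1.0 openai SDK; the item texts and the system prompt are made up):

import openai  # pre-1.0 SDK

# Several items handled in one call instead of one call each;
# the items and the classification task here are illustrative.
items = ["first text ...", "second text ...", "third text ..."]
numbered = "\n".join(f"{i + 1}. {text}" for i, text in enumerate(items))

resp = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system",
         "content": "Classify each numbered item as positive or negative. "
                    "Answer with one label per line."},
        {"role": "user", "content": numbered},
    ],
)
labels = resp["choices"][0]["message"]["content"].splitlines()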

I came here looking to see if other people were encountering this. I guess it is reassuring that it’s not just me. But it’s also unfortunate, because I’m hoping to launch my app in a few weeks, so I hope this improves.

Was gonna try using another model, but for this feature I need the chat API to keep context. Guess I’ll just have to wait it out like everyone else.
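For what it’s worth, the chat API only keeps context if you resend the earlier turns yourself; a minimal sketch, again with the pre-1.0 SDK and an invented system prompt:

import openai  # pre-1.0 SDK

# Context lives entirely in this list; the API itself is stateless.
history = [{"role": "system", "content": "You are a helpful assistant."}]

def ask(user_text):
    history.append({"role": "user", "content": user_text})
    resp = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=history)
    reply = resp["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    return reply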

I’m using @backoff.on_exception(backoff.expo, openai.error.RateLimitError) from the backoff library. Trying:

for i in rlist:
    try:
        pass  # mycode (the API call for this item)
    except TimeoutError:
        print("error")
        continue

but it still breaks…
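One guess as to why it still breaks: backoff here only retries RateLimitError, and the try/except only catches the built-in TimeoutError, so openai.error.APIConnectionError (the "Connection reset by peer" quoted above) slips through. A sketch that retries the connection errors too (the max_tries value is arbitrary):

import backoff
import openai

# Retry transient failures, not just rate limits; the tuple below covers
# the error types reported earlier in this thread.
@backoff.on_exception(
    backoff.expo,
    (
        openai.error.RateLimitError,
        openai.error.APIConnectionError,
        openai.error.Timeout,
        openai.error.ServiceUnavailableError,
    ),
    max_tries=5,
)
def complete(messages):
    return openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)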