Anyone facing gpt-3.5-turbo API delays?

Instead of increasing the timeout, you may want to consider resetting it to the default (30 seconds is more than enough) and enabling streaming.

  1. With a long timeout, if there is some sort of connection issue you are now left hanging for potentially 2 minutes. That’s 2 minutes of waiting before you can make a decision.

  2. You’ve now noticed that setting an arbitrarily large timeout still doesn’t solve the underlying issue. So do you keep increasing it, or handle it differently?

  3. With a 30 second timeout you know quickly that there’s a connection issue and can retry with backoff. At the very least you know whether it’s an issue with connecting or an issue with the token generation.

  4. In most cases the token output is very slow but still flowing. With streaming you can monitor tokens/second and surface that to end users as a notice. And if the server sends back 5 tokens and then crashes, you can just re-send the payload with those partial tokens appended.
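Points 3 and 4 could be sketched roughly like this. This is a minimal sketch, not the OpenAI SDK: `with_backoff`, `consume_stream`, and the `flaky_call` stand-in are all hypothetical names; in real code `call_fn` would wrap your actual streaming API request.

```python
import time
import random

# Point 3: a short timeout plus retry with exponential backoff.
# `call_fn` is a stand-in for your actual API call (hypothetical).
def with_backoff(call_fn, max_retries=3, base_delay=1.0, timeout=30):
    last_err = None
    for attempt in range(max_retries):
        try:
            return call_fn(timeout=timeout)
        except TimeoutError as err:
            last_err = err
            # Exponential backoff with a little jitter between attempts.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.25))
    raise last_err

# Point 4: consume a streamed response while tracking tokens/second,
# so slow generation can be surfaced to end users.
def consume_stream(token_iter):
    tokens = []
    start = time.time()
    for tok in token_iter:
        tokens.append(tok)
    elapsed = max(time.time() - start, 1e-9)
    return "".join(tokens), len(tokens) / elapsed

# Demo with a flaky stand-in that times out twice, then streams tokens.
attempts = {"n": 0}
def flaky_call(timeout):
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TimeoutError("connection stalled")
    return iter(["Hello", ",", " world"])

stream = with_backoff(flaky_call, max_retries=3, base_delay=0.01)
text, rate = consume_stream(stream)
print(text)  # -> Hello, world
```

The key design point is that the short timeout turns a silent stall into a fast, retryable error, while the tokens/second figure tells you whether the model is generating slowly or the connection has died.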
