Unstable speed of gpt-3.5-turbo-16k

Hi all,

We are intermittently getting very slow responses from gpt-3.5-turbo-16k.
A request that used to take 18 seconds went up to more than 3 minutes about a month ago. It then dropped back to 20 seconds, rose to 1.8-2.8 minutes last week, fell to 1 minute, then 30 seconds, and yesterday it was 45 seconds.

OpenAI support has not been helpful; their context window is less than 2K tokens, I think :frowning:

Is there any explanation for this inconsistency?
Is there a way to get consistently fast responses?
If not, is there a production-ready alternative with a large context window?

We are on tier 3, as we are still testing and demoing to our customers and prospects.
The use case is structuring data as JSON (which is reflected in a UI), with the ability to modify the result through conversation.
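
Roughly, each request looks like the following simplified sketch (the prompts, fields, and follow-up turn are placeholders, not our real data):

```python
import json
from openai import OpenAI  # assumes openai-python >= 1.0

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Conversation history is kept so the user can refine the structured result.
messages = [
    {"role": "system", "content": "Extract the entities from the user's text "
                                  "and return them as a single JSON object."},
    {"role": "user", "content": "ACME Corp ordered 40 units of SKU-123 on 2024-01-05."},
]

response = client.chat.completions.create(
    model="gpt-3.5-turbo-16k",
    messages=messages,
    temperature=0,
)

# The JSON is parsed and rendered in the UI.
data = json.loads(response.choices[0].message.content)
print(data)

# A follow-up turn lets the user modify the result conversationally.
messages.append({"role": "assistant", "content": response.choices[0].message.content})
messages.append({"role": "user", "content": "Change the quantity to 45."})
```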

Thanks

Some variation is normal, I’m afraid. You could consider accessing the API via the Azure OpenAI Service. I have only been using it for GPT-4-turbo in select use cases so far, but I have found its performance to be fairly stable.
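
Switching is mostly a matter of pointing the client at your Azure resource, roughly like this (the endpoint, key, API version, and deployment name below are placeholders for your own values):

```python
from openai import AzureOpenAI  # openai-python >= 1.0

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",  # placeholder
    api_key="YOUR_AZURE_OPENAI_KEY",                          # placeholder
    api_version="2023-12-01-preview",  # use a version your resource supports
)

# On Azure you pass your *deployment* name, not the raw model name.
response = client.chat.completions.create(
    model="my-gpt-35-turbo-16k-deployment",  # placeholder deployment name
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```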

3 minutes of response time?! How big are your prompt and expected completion, and what’s the use case?

I have been getting response times exceeding 5 minutes when outputting 6.5k tokens with 3.5-turbo-16k. This model is a godsend, but its latency can be insane sometimes.
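
When it gets that slow I time the call and work out the throughput from the usage numbers, roughly like this (the prompt and max_tokens are placeholders, not my actual workload):

```python
import time
from openai import OpenAI

client = OpenAI()

start = time.perf_counter()
response = client.chat.completions.create(
    model="gpt-3.5-turbo-16k",
    messages=[{"role": "user", "content": "Generate a long code sample ..."}],  # placeholder
    max_tokens=6500,
)
elapsed = time.perf_counter() - start

# Tokens per second makes it easy to compare a "good" run against a slow one.
completion_tokens = response.usage.completion_tokens
print(f"{elapsed:.1f}s total, {completion_tokens} completion tokens, "
      f"{completion_tokens / elapsed:.1f} tokens/s")
```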

Is it just today? I am having some issues with speed today as well (for the 1106 models).

It’s not just today; I have sporadically experienced extremely bad latency with 3.5-16k. My use case is code generation. In my testing I haven’t found an alternative to this model; even gpt-4-1106 has a maximum of only 4096 output tokens.

Are you seeing the slow speeds from your own computer or from a server?

What I noticed is that, for me, the API is much slower from my computer than from my hosting server (I live in Indonesia, and even though the internet here shows good metrics on Speedtest, it can sometimes be very poor for certain services).
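
One way I try to separate the two is to stream the response and compare the time to the first token (roughly request overhead and queueing) with the total time (mostly generation). A rough sketch I run from both my laptop and the hosting server (model and prompt are just placeholders):

```python
import time
from openai import OpenAI

client = OpenAI()

start = time.perf_counter()
first_token_at = None
chunks = 0

stream = client.chat.completions.create(
    model="gpt-3.5-turbo-16k",
    messages=[{"role": "user", "content": "Write a short paragraph about latency."}],  # placeholder
    stream=True,
)

for chunk in stream:
    # Some chunks carry no content delta, so guard before counting.
    if chunk.choices and chunk.choices[0].delta.content:
        if first_token_at is None:
            first_token_at = time.perf_counter() - start
        chunks += 1

total = time.perf_counter() - start
print(f"first token after {first_token_at:.2f}s, finished in {total:.2f}s, {chunks} chunks")
```

If the time to first token is similar in both places but the total time is much worse locally, the bottleneck is more likely the connection than the model.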