Night-and-day difference in Assistants API latency between gpt-3.5 and gpt-4-turbo

I just switched my Assistant from gpt-4-turbo-preview to gpt-3.5-turbo-1106, and response times dropped by 60-70%. I originally went with gpt-4-turbo because the documentation used that specific model, but after seeing how slow it was, I almost gave up on the Assistants API entirely.

Many people have been complaining on the forums here about the slowness of the Assistants API, and it's likely they, too, are using GPT-4. The tutorials should consider switching to gpt-3.5; it would make for a much better developer experience for the vast majority of people.
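For anyone who wants to reproduce the comparison, the model is a single parameter when creating an assistant. A rough sketch, assuming the official `openai` Python SDK (v1.x); the assistant name, prompt, and the small `time_call` timing helper are all my own additions, not anything from the docs:

```python
import os
import time


def time_call(fn):
    """Time a single call; returns (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


# The live comparison only runs when an API key is present.
if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI

    client = OpenAI()

    def run_once(model: str) -> None:
        assistant = client.beta.assistants.create(
            name="latency-probe",         # hypothetical name
            instructions="Answer in one sentence.",
            model=model,                  # the only line that changes
        )
        thread = client.beta.threads.create(
            messages=[{"role": "user", "content": "What is 2 + 2?"}]
        )
        run = client.beta.threads.runs.create(
            thread_id=thread.id, assistant_id=assistant.id
        )
        # Assistants runs are asynchronous: poll until the run finishes.
        while run.status in ("queued", "in_progress"):
            time.sleep(0.5)
            run = client.beta.threads.runs.retrieve(
                thread_id=thread.id, run_id=run.id
            )

    for model in ("gpt-3.5-turbo-1106", "gpt-4-turbo-preview"):
        _, elapsed = time_call(lambda: run_once(model))
        print(f"{model}: {elapsed:.1f}s")
```

One sample each is a crude measurement; averaging several runs per model would give a fairer picture.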

That’s a nice idea.
I’ll move this to the feedback category.


Facing the same issue. gpt-4-turbo-preview is much better at working with JSON responses, but the performance really sucks. gpt-3.5-turbo-1106 is faster, but it doesn't reliably return results in JSON even when I emphasize it in the prompts; it works sometimes but not always. Inconsistent behavior.
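One workaround worth noting: on the Chat Completions endpoint, gpt-3.5-turbo-1106 supports JSON mode via `response_format`, which constrains the model to emit valid JSON (your prompt still has to mention JSON, or the call errors). A minimal sketch, assuming the official `openai` Python SDK (v1.x); the `parse_json_or_none` defensive parser is my own addition for cases, like the Assistants API, where you can't enforce the format:

```python
import json
import os


def parse_json_or_none(text: str):
    """Defensive parse: return a dict on success, None on malformed output."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


# The live call only runs when an API key is present.
if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI

    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo-1106",
        response_format={"type": "json_object"},  # JSON mode
        messages=[
            # JSON mode requires the word "JSON" somewhere in the messages.
            {"role": "system", "content": "Return the answer as a JSON object."},
            {"role": "user", "content": 'List three primes as {"primes": [...]}'},
        ],
    )
    data = parse_json_or_none(resp.choices[0].message.content)
    print(data)
```

Even with JSON mode on, validating the parsed structure before using it is cheap insurance against the occasional malformed reply.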