Fine-tuned model with same seed and data is 7x slower now vs 6 months ago

Hi,
Roughly 6 months ago (June 2024) I successfully created a fine-tuned model based on gpt-3.5-turbo-0125 that is very fast.

Recently (December 2024), all of my fine-tuned models seem very slow.

So I ran an experiment: I re-ran the fine-tune job from June 2024 with the same seed and the same training data. The result is a model that behaves the same but is 7-8x slower.
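
A minimal way to reproduce this kind of timing comparison with the openai Python SDK (>= 1.0) is sketched below; the two fine-tuned model IDs are placeholders, not real ones:

```python
# Minimal sketch of a side-by-side latency check with the openai Python SDK.
# The fine-tuned model IDs below are placeholders, not real ones.
import time
from openai import OpenAI

client = OpenAI()

MODELS = [
    "ft:gpt-3.5-turbo-0125:my-org::june-run",      # placeholder ID
    "ft:gpt-3.5-turbo-0125:my-org::december-run",  # placeholder ID
]
PROMPT = [{"role": "user", "content": "Say hello in five words."}]

for model in MODELS:
    start = time.perf_counter()
    first_token = None
    # Stream the response so time-to-first-token and total generation
    # time can be measured separately.
    stream = client.chat.completions.create(
        model=model, messages=PROMPT, stream=True
    )
    for chunk in stream:
        if first_token is None:
            first_token = time.perf_counter() - start
    total = time.perf_counter() - start
    print(f"{model}: first token {first_token:.2f}s, total {total:.2f}s")
```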

Is this expected? What changed?

Thanks!


Correction/clarification: I built the first model in June 2024. I re-ran the fine-tune job that created that model in December 2024, and the result of that re-run is 7-8x slower than the first model.

Sorry, I don’t have anything official to say, but I have some performance tests that I’ve been running roughly daily, and I’ve noticed a slow increase in latency. Here’s my armchair hand-waving: 4o was quite fast while they were testing out the realtime stuff, then when they released a separate endpoint for it the latency increased significantly, and since then it has been slowly creeping up. It’s not intolerable yet, but I hope it doesn’t continue to creep. Latency and cost are the two factors that would make it unusable for us (frankly, the 4o models are doing our tasks well enough for now; I don’t need them to be smarter, just fast and cheap!)
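
A minimal version of that kind of daily latency probe might look like the sketch below (assuming the openai Python SDK; the model ID and log path are placeholders):

```python
# Rough sketch of a daily latency probe: time one tiny completion and
# append the result to a CSV so the trend is visible over weeks.
# The model ID and log path are placeholders.
import csv
import time
from datetime import datetime, timezone
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o"              # placeholder
LOG_PATH = "latency_log.csv"  # placeholder

start = time.perf_counter()
client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": "ping"}],
    max_tokens=1,
)
elapsed = time.perf_counter() - start

with open(LOG_PATH, "a", newline="") as f:
    csv.writer(f).writerow(
        [datetime.now(timezone.utc).isoformat(), MODEL, f"{elapsed:.3f}"]
    )
```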