Fine-tuned model with same seed and data is 7x slower now vs 6 months ago

Hi,
Roughly 6 months ago (June 2024) I successfully created a fine-tuned model based on gpt-3.5-turbo-0125 that is very fast.

Recently (December 2024), all of my fine-tuned models seem very slow.

So I ran an experiment: I re-ran the fine-tune job from June 2024 with the same seed and the same training data. The result is a model that behaves the same but is 7-8x slower.
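
A minimal way to reproduce this kind of timing comparison with the openai Python SDK (>= 1.0) is sketched below; the two fine-tuned model IDs are placeholders, not real ones:

```python
# Minimal sketch of a side-by-side latency check with the openai Python SDK.
# The fine-tuned model IDs below are placeholders, not real ones.
import time
from openai import OpenAI

client = OpenAI()

MODELS = [
    "ft:gpt-3.5-turbo-0125:my-org::june-run",      # placeholder ID
    "ft:gpt-3.5-turbo-0125:my-org::december-run",  # placeholder ID
]
PROMPT = [{"role": "user", "content": "Say hello in five words."}]

for model in MODELS:
    start = time.perf_counter()
    first_token = None
    # Stream the response so time-to-first-token and total generation
    # time can be measured separately.
    stream = client.chat.completions.create(
        model=model, messages=PROMPT, stream=True
    )
    for chunk in stream:
        if first_token is None:
            first_token = time.perf_counter() - start
    total = time.perf_counter() - start
    print(f"{model}: first token {first_token:.2f}s, total {total:.2f}s")
```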

Is this expected? What changed?

Thanks!


Correction/clarification: I built the first model in June 2024. I re-ran the fine-tune job that created that model in December 2024, and the result of that re-run is 7-8x slower than the first model.

Sorry, I don’t have anything official to say, but I have some performance tests that I’ve been running roughly daily, and I’ve noticed a slow increase in latency. Here’s my armchair hand-waving: 4o was quite fast while they were testing out the realtime stuff, then when they released a separate endpoint for it the latency increased significantly, and since then it has been slowly creeping up. It’s not intolerable yet, but I hope it doesn’t continue to creep. Latency and cost are the two factors that would make it unusable for us (frankly, the 4o models are doing our tasks well enough for now; I don’t need them to be smarter, just fast and cheap!)
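
A minimal version of that kind of daily latency probe might look like the sketch below (assuming the openai Python SDK; the model ID and log path are placeholders):

```python
# Rough sketch of a daily latency probe: time one tiny completion and
# append the result to a CSV so the trend is visible over weeks.
# The model ID and log path are placeholders.
import csv
import time
from datetime import datetime, timezone
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o"              # placeholder
LOG_PATH = "latency_log.csv"  # placeholder

start = time.perf_counter()
client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": "ping"}],
    max_tokens=1,
)
elapsed = time.perf_counter() - start

with open(LOG_PATH, "a", newline="") as f:
    csv.writer(f).writerow(
        [datetime.now(timezone.utc).isoformat(), MODEL, f"{elapsed:.3f}"]
    )
```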