GPT-4o vs. gpt-4-turbo-2024-04-09, gpt-4o loses

Then you’d need to quantify how often, if ever, the model can produce worse outputs.
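To make "quantify" concrete, here is a minimal sketch of one way to measure a regression rate between two models on a shared prompt set. The `score_output` judge is a hypothetical placeholder (a human rating or an automated eval), not anything OpenAI actually exposes:

```python
# Sketch: fraction of prompts where the new model scores worse than the old one.
# `score_output(prompt, output)` is a hypothetical quality judge.

def regression_rate(prompts, outputs_old, outputs_new, score_output):
    """Return the share of prompts on which the new model regresses."""
    worse = 0
    for prompt, old, new in zip(prompts, outputs_old, outputs_new):
        if score_output(prompt, new) < score_output(prompt, old):
            worse += 1
    return worse / len(prompts)
```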

Imagine we can strictly quantify a model’s strength.

Then say they have a new model that is ten times better than the old one at everything except writing salamander-themed haikus, where it’s only half as good.

Should they not release the new model because it produces worse outputs for that narrow use case?

I would argue they should release such a model.
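A back-of-the-envelope calculation shows why, using made-up usage shares and quality multipliers purely for illustration:

```python
# Illustrative (invented) numbers: how much each use case is used,
# and the new model's quality relative to the old one (1.0 = unchanged).
usage_share = {"general tasks": 0.999, "salamander haikus": 0.001}
relative_quality = {"general tasks": 10.0, "salamander haikus": 0.5}

# Usage-weighted quality of the new model relative to the old one.
weighted = sum(usage_share[t] * relative_quality[t] for t in usage_share)
print(weighted)  # ~9.99: a large net win despite the narrow regression
```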

OpenAI, with very few exceptions, knows everything their GPT models have been and are prompted for. They are building models which they hope are generally better overall, but especially better for most of the things most of their users want to do most of the time.

Unfortunately, that sometimes means if you really need or want to do something not very many people are doing, a newer model might not be as strong as an older one for that particular thing you want to do…

Sometimes, the goal might be a model that meets most people’s needs but is much smaller and more efficient, so it can do 90% of what they need for half the cost.

The end-goal is better models for everyone, but there are many competing needs and motivations, so the path there is unlikely to be strictly increasing all the time for everyone.
