With 2x the speed, half the cost of Turbo, and higher scores on synthetic benchmarks and the Chatbot Arena Elo rankings, it would seem appropriate to stop using Turbo altogether and just migrate to ‘o’. What are the expectations for developers?
I think it would be great if OpenAI produced a guide on which model to select. We still have gpt-4, 3.5-turbo, 4-turbo, 4o (and 4-32k for some). Without such a guide, every developer is left in the dark over an ever-increasing number of variants.
Shall we migrate to ‘o’ vs Turbo, ASAP?
Is there ANY reason for Turbo to exist, at all?
Are there use cases or guidance on which model to select?
I migrated everything to gpt-4o ASAP. In the Playground I noticed that the larger the conversation grew with images, the slower it got. So, if you use long conversations, MAYBE gpt-4-turbo is still worth it? Not sure, I haven’t scrutinized it enough to give you a proper answer.
On the simple-evals repository, gpt-4-turbo seems to outperform gpt-4o in the DROP (F1, 3-shot) category.
They usually leave older models around for a while before they fully deprecate them.
I think the expectation is going to be to switch to ‘o’ ASAP. However, there might be legacy dependencies in people’s code that may or may not make this more difficult (I’m thinking about DALL-E vs. whatever GPT-4o would eventually produce), so the choice is yours. But as you already point out, there are many reasons to migrate.
Which model is best is something you’ll have to test on your own application. I highly recommend creating a benchmark and using it to compare the various models for your specific use case, along the lines of the sketch below.
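Something like this minimal sketch is usually enough for a rough latency/quality comparison. It assumes the openai Python SDK (v1.x) and an API key in the environment; the model list and test prompts here are just placeholders for your own data:

```python
# Minimal per-application benchmark sketch (assumes openai SDK >= 1.x).
import time
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

MODELS = ["gpt-4o", "gpt-4-turbo", "gpt-3.5-turbo"]

# Replace with prompts (and ideally expected answers) from your own use case.
TEST_PROMPTS = [
    "Summarize the key differences between TCP and UDP in two sentences.",
    "Extract the invoice number from: 'Invoice #A-1042, due 2024-06-01'.",
]

for model in MODELS:
    total_time = 0.0
    for prompt in TEST_PROMPTS:
        start = time.time()
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        total_time += time.time() - start
        # Score response.choices[0].message.content against your expected
        # output here (exact match, rubric, another model as judge, etc.).
    print(f"{model}: avg latency {total_time / len(TEST_PROMPTS):.2f}s")
```

The scoring step is the part that matters most: whatever metric reflects quality for your application is a better guide than any public leaderboard.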
I still need to do more testing, but I found that GPT-4o is “casual smart” and GPT-4 Turbo is “technical smart”.
I migrated most automations to GPT-4o and kept technical workflows on GPT-4 Turbo and Claude Opus.
What are other people’s experiences?
FINAL UPDATE: I had the model name misspelled, as was pointed out to me. The correct model name is “gpt-4o”.
You can’t use gpt4-o for batch processing if that matters to you. Just found out today, after spending a boatload of time coding up batch processing code.
UPDATE: You can’t use gpt4 either. I had to downgrade to one of the gpt-3.5 models. I guess they haven’t set up batch processing for gpt4 yet.
UPDATE-2: I wasn’t crazy; the docs clearly say that you can use the gpt-4 series:
As you can see from my screenshot, when I tried to use “gpt4-o” for batch processing, the Dashboard clearly failed the job. The same happened when I tried “gpt4”. Are you saying there is a way to fix this, and if so, how?
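In case it helps: the failures above are consistent with the misspelled model ids. Once the id is spelled exactly “gpt-4o” (or “gpt-4”), the Batch API flow should accept it, assuming the model is actually enabled for batch on your account. A minimal sketch with the openai Python SDK (v1.x):

```python
# Minimal Batch API sketch using the correctly spelled model id "gpt-4o"
# (not "gpt4-o"). Assumes openai SDK >= 1.x and batch access on the account.
import json
from openai import OpenAI

client = OpenAI()

# Each line of the input JSONL file is one request; note the "model" field.
requests = [
    {
        "custom_id": "request-1",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o",  # a misspelling here is what makes the job fail
            "messages": [{"role": "user", "content": "Say hello."}],
        },
    }
]
with open("batch_input.jsonl", "w") as f:
    for r in requests:
        f.write(json.dumps(r) + "\n")

# Upload the file and create the batch job.
batch_file = client.files.create(
    file=open("batch_input.jsonl", "rb"),
    purpose="batch",
)
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
print(batch.id, batch.status)
```

If the job still fails with the corrected id, the error message on the batch object (or in the Dashboard) should say whether it’s a model-availability issue rather than a spelling one.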