Some initial tests gave mixed results. Will try further. Question: did you try fine-tuning?
You tried 3.5? Not sure what you mean by the part you highlighted.
Yes, I tried. I gave some examples in the prompt and asked for a similar output. Got good results, but sometimes not so good. It was an initial test; I need to test more for a proper evaluation.
Finetuning: https://platform.openai.com/docs/guides/fine-tuning
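If you do go the fine-tuning route, the flow via the official Python SDK looks roughly like the sketch below (the file name and base model here are just placeholders, not something from this thread):

```python
# Minimal fine-tuning sketch with the official OpenAI Python SDK (v1+).
# Assumes training_data.jsonl exists and follows the chat fine-tuning format
# ({"messages": [{"role": ..., "content": ...}, ...]} per line).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload the training file
training_file = client.files.create(
    file=open("training_data.jsonl", "rb"),
    purpose="fine-tune",
)

# Start the fine-tuning job
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",  # placeholder base model; check the docs for what is currently supported
)

print(job.id, job.status)
```

Once the job finishes, you get a fine-tuned model name you can pass to the normal chat completions endpoint.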
This could explain the lower cost and faster performance compared to 4-turbo.
The gpt-4o model produces worse outputs across the board and is hilariously stupider than Claude. I have Claude autocoding itself into an ABM, and GPT-4o can't figure out how to get past its own tool template.
So don’t use it. I don’t see what the problem is.
Welp. Dunno what to tell ya.
Interesting… I took an excerpt of English text from the OpenAI page just now and opened three tabs with ChatGPT, each with a different model.
To my surprise, the tab with the 4o model produced exactly the same translation as the tab with the 3.5 model, and only the 4t model produced a different translation (and a better one than the other two).
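A rough sketch of how the same comparison could be scripted against the API instead of browser tabs (the model names and prompt are illustrative, not exactly what I used):

```python
# Run the same translation prompt against several models and print each output side by side.
# Model names are examples; swap in whatever is available on your account.
from openai import OpenAI

client = OpenAI()
prompt = "Translate the following English text into <target language>:\n\n<your excerpt here>"

for model in ["gpt-3.5-turbo", "gpt-4-turbo", "gpt-4o"]:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # reduce run-to-run variance for a fairer comparison
    )
    print(f"--- {model} ---")
    print(response.choices[0].message.content)
```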
More like a gpt-3.5o, then? One test or a few observations wouldn't prove anything, but if true, the implications are profound. It wouldn't be half-price; it would effectively be 10x more expensive at the same speed. More importantly, OpenAI would be being dishonest, which means humanity is in danger (and, being an OpenAI fan, I really hope that's not the case).
@ 2:15
Personally, I've gotten pretty good results with gpt-4o vision, and in another scenario where it had to produce a text report from data. The previous model, gpt-4, didn't pay attention to details in the request and also added some dumb jokes. I don't get that with gpt-4o, so I'm sticking with it.
Indeed, all the demos look amazing. That's just not my experience so far (though I must admit I have interacted with gpt-4o only via the API and Replit's AI chat engine).
You have described a situation I myself have experienced with both 4o and 4, i.e., the same prompt is understood by one model and massively misunderstood by the other model. I did not experience such issues before the launch of 4o.
4o is very weak on complex issues. It just cannot combine simpler concepts together. We reverted to 4-turbo, which feels like THE new version after using 4o.
You get what you pay for. It’s not cheaper for no reason.
I have to agree with the OP, gpt-4o struggles where gpt-4-turbo succeeds.
It's completely failing on a completions-based categorisation task for me, where turbo is very good.
(yes, I’m aware that’s a non-standard approach to categorisation, but I actually find embeddings frustratingly poor at some types of categorisation problems)
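To give an idea of what I mean by a completions-based approach, here's a minimal sketch (the labels, prompt wording, and helper function are made up for illustration, not my actual task):

```python
# Classify a piece of text by asking the chat model to pick one label from a fixed set,
# instead of using embeddings + nearest neighbour. Labels here are illustrative only.
from openai import OpenAI

client = OpenAI()

LABELS = ["billing", "technical issue", "feature request", "other"]

def categorise(text: str, model: str = "gpt-4-turbo") -> str:
    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a classifier. Reply with exactly one of these labels "
                    "and nothing else: " + ", ".join(LABELS)
                ),
            },
            {"role": "user", "content": text},
        ],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

print(categorise("I was charged twice for my subscription last month."))
```

With this kind of setup, turbo reliably returns one of the allowed labels for me, while 4o does not.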
Shame!
Look at the text evaluation graph on the OpenAI website: https://openai.com/index/hello-gpt-4o/
Scroll down the page to find the benchmark results.
This proves that GPT-4o is the best.
Real-world experience is not a benchmark.
I think you’re responding to a troll.