GPT-4o and o1-pro > other models?

Is it just me, or are gpt-4o and o1-pro still much better than o4-mini-high, o3, and gpt-4.1?

In my experience, yes, but I wonder what everyone else thinks about this.

I don’t agree, actually. What makes you say that 4o and o1-pro are better than the newer models?


Well, I think the new models are better overall, with image reasoning and larger input contexts being the differentiators of this generation.

But I am still adapting; it feels like the way prompts behave has changed significantly.

For example, sometimes they are excessively talkative and keep asking for confirmation of what I want instead of just doing it first and letting me complain later.


Higher success rate on solving a problem that isn’t part of the benchmarks they are measured against.

For example, o1-pro is able to generate longer outputs while o3 doesn’t seem to be. So if I give it a text and ask it to improve it without removing anything, o1-pro can do it, while o3 shrinks the text.

Same for gpt-4o vs o4-mini-high: after 3 attempts, o4-mini-high wasn’t able to solve it, while gpt-4o one-shot it.

I was quite surprised by this.
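If anyone wants to check the length claim for themselves, here’s a rough Python sketch using the OpenAI SDK. The prompt and sample text are placeholders, and as far as I know o1-pro is only exposed through the Responses endpoint, so I use that for both models:

```python
# Rough sketch: send the same "improve, don't shorten" prompt to two models
# and compare output lengths. TEXT is a placeholder; swap in your own passage.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

TEXT = "...paste the passage you want improved here..."
PROMPT = (
    "Improve the following text for clarity and flow. "
    "Do not remove or summarize anything; keep every point.\n\n" + TEXT
)

for model in ("o1-pro", "o3"):
    resp = client.responses.create(model=model, input=PROMPT)
    out = resp.output_text  # SDK helper that concatenates the text output
    print(f"{model}: input {len(TEXT)} chars -> output {len(out)} chars")
```

Character counts are a crude proxy, but they make the shrinking behavior easy to spot.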

True, it might just be an adjustment thing on my end, I don’t know.

Unfortunately I don’t have access to o1-pro, but wouldn’t it be unfair to compare it with o3?
I think only o3-pro (when it launches) would be able to match it.

Yeah, the little gpt-4o has been updated and got smarter; many of the things I ask don’t need reasoning models. That’s a particular aspect I like about having a myriad of models to choose from.

That’s a good point! You might be right; I’m sure o3 is better than o1.

Well, I think the whole “comparing different model ‘versions’” framing settles this topic. I shouldn’t compare gpt-4o to o3 or o4-mini, as those are different kinds of models; I should compare it to gpt-5o when it comes out.

I still find gpt-4o better than the reasoning models. The o models (o1, o3) are better but take longer, and the mini reasoning models still perform worse in my opinion, even on tasks that would require reasoning. But maybe I’m doing the reasoning for them, so that might be why.


I totally agree with you when it comes to coding.
The other models are faster, chattier, and tend to generate enormous blocks of code you did not ask for, even with clear instructions (just like Grok, btw).

o1-pro is to the point and only gives you what you request. It is very expensive but delivers way more value to me.

Hence why paying $200 a month is worth it; it would cost far more to use via the API. But sometimes, especially when automating certain functionality, using the API is the only way.
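For anyone curious about the math, here’s a quick back-of-envelope sketch. The per-token prices and the usage profile below are my assumptions, not official figures; check the current pricing page and plug in your own numbers:

```python
# Back-of-envelope cost for heavy o1-pro use via the API vs a $200 flat plan.
# Prices and usage are assumptions for illustration only.
PRICE_IN = 150.00   # assumed USD per 1M input tokens
PRICE_OUT = 600.00  # assumed USD per 1M output tokens

# Hypothetical month: 40 coding sessions, ~20k tokens in / ~5k out each.
sessions = 40
tok_in, tok_out = 20_000, 5_000

monthly = sessions * (tok_in * PRICE_IN + tok_out * PRICE_OUT) / 1_000_000
print(f"Estimated API cost: ${monthly:,.2f}/month vs $200 flat")
# -> about $240 under these assumptions, and it scales with usage
```

At that kind of usage the flat subscription comes out ahead, which is the point.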

Also, the Codex CLI is just not there yet. It reminds me of Sora, Operator, and gpt-4.5. Might as well add o3 and o4-mini to the under-deliveries when comparing against the current gpt-4o, o1-pro, and the new image model (I think it’s ‘gpt-4o-image’?), which are in my humble opinion SOTA models.