I wonder if that explains all the slowness?!
I think you’ll also be happy with o3-pro pricing for the performance.
Definitely looking forward to the release of that one, especially when it is possible to fire off a few requests without breaking the bank.
Let’s hope for the release very soon!
Still not giving that kid my ID.
Any word on o3-mini price drop?
Update:
I just wish there were more information about when. Should I have a cup of coffee and stay up, or not?
Slightly skeptical that o3 is suddenly reduced by 80% and o3-pro is announced at the same time. Is this not just … basically o3-mini vs. o3?
Is this the same concept as naming regular chips “Family Size”, and then reducing the regular chip size?
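For scale, here’s a minimal sketch of what an 80% cut works out to per request, assuming pre-cut o3 rates of $10 per 1M input tokens and $40 per 1M output tokens (those exact rates are my assumption, not stated in the thread):

```python
# Assumed pre-cut o3 prices in USD per 1M tokens (not quoted in the thread).
OLD_INPUT, OLD_OUTPUT = 10.00, 40.00
REMAINING = 0.20  # an 80% cut leaves 20% of the old price

new_input = OLD_INPUT * REMAINING    # 2.0 USD per 1M input tokens
new_output = OLD_OUTPUT * REMAINING  # 8.0 USD per 1M output tokens

def request_cost(tokens_in: int, tokens_out: int) -> float:
    """Cost of a single request at the post-cut rates."""
    return tokens_in / 1e6 * new_input + tokens_out / 1e6 * new_output

# A medium-sized request: 50k input tokens, 10k output tokens.
print(round(request_cost(50_000, 10_000), 4))  # 0.18
```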
Any news on an upcoming full o4?
Right now o3-mini and o4-mini are priced the same; if a full o4 comes out at the current o3 prices, that would be game-changing.
Actually, I’m expecting GPT-5 to be released fairly soon, and I have to admit that from that point on I stopped thinking about o4.
Also, I was expecting GPT-5, but I need a thermos of coffee.
BTW, they posted a release note:
OpenAI o3-pro, available now for Pro users in ChatGPT and in our API.
https://help.openai.com/en/articles/6825453-chatgpt-release-notes#h_6c98aab643
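For anyone wanting to try it from the API, here’s a minimal sketch of a request body. The model id `o3-pro` and the `/v1/responses` endpoint are assumptions based on the note above; check the linked release notes for the authoritative details:

```python
import json

# Hypothetical request body for the Responses API; fields beyond
# "model" and "input" are omitted on purpose.
payload = {
    "model": "o3-pro",
    "input": "Summarize the o3-pro release note in one sentence.",
}

body = json.dumps(payload)
print(body)
# POST this to https://api.openai.com/v1/responses with an
# "Authorization: Bearer $OPENAI_API_KEY" header.
```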
So I had to look back at the o3 announcement page.
Notice that o3 scored 2706 and o4-mini scored 2719.
Now o3 scores 2517 and o3-pro 2748?
And on the 4/4 reliability benchmark, o3 is reduced to 2301.
At this point, o4-mini is most likely going to be better and cheaper than o3, as it so far looks to be a distilled model. Could that be why they only compared o3-pro to o1-pro and the now-reduced o3?
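To put those Elo gaps in perspective, here’s a sketch using the standard Elo expected-score formula (an illustration only; Codeforces-style ratings aren’t strictly head-to-head Elo):

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

# Ratings quoted above: the new o3 at 2517 vs o3-pro at 2748.
p = expected_score(2517, 2748)
print(round(p, 2))  # roughly 0.21: a 231-point gap is a large edge
```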
Probably just one more “A model version we’ll never release to the public” benchmark.
I’m actually sad about this release. I imagine this new o3 will replace the current one for Plus users.
Dark patterns FTW
There is a comment regarding the updated Codeforces eval. Apparently the numbers can’t be compared 1:1.
- The Codeforces evals for o3 and o3-pro were run using an updated set of Codeforces questions with more difficult tasks, as the previous version (used for o1-pro) was close to saturation.
Still expensive but much more reasonable, o1-pro was totally out of reach for me.
I honestly picked the first one I saw.
This pattern is consistent throughout the shown benchmarks:
o3 Announcement page
New o3-pro page
The same pattern: o4-mini outperforms or is competitive with o3 at almost half the price.
- The Codeforces evals for o3 and o3-pro were run using an updated set of Codeforces questions with more difficult tasks, as the previous version (used for o1-pro) was close to saturation.
Where did you find this?
Ah, thank you.
It would’ve been nice to have o4-mini on the graph. At best, it’s highly ambiguous which o3 model they used.