I’ve been using both “o3-mini-high” and the new “o4-mini-high” models, and I have to be honest — in my experience, “o4-mini-high” has been underwhelming compared to its predecessor.
Where “o3-mini-high” excelled:
- More coherent and natural writing
- Better reasoning, especially on complex prompts
- More stable outputs across sessions
Where “o4-mini-high” falls short (in my use):
- Responses feel flatter or less insightful
- Occasional regressions in nuance or creativity
- Doesn’t seem like a clear upgrade — at times, it’s worse
If this model was optimized for speed or cost, that’s fair, but it would be good to have that stated openly. As a user, I’d rather stick with what works best for quality, even if it’s a slightly older model.
Would love to hear if others are seeing the same, and whether OpenAI could clarify the goals behind “o4-mini-high” compared to “o3-mini-high”.
TL;DR: o4-mini-high sucks compared to o3-mini-high. Very disappointed.