I’ve been finding o3’s answers less consistent and less reasonable than 4o’s.
For example, I asked a straightforward question comparing the benefits of a Microsoft product versus AWS and Google alternatives. The response included duplicated information (which it acknowledged) without any valid reason. Elsewhere in the same answer, it even struck through part of its own text and replaced it with different info, which felt messy and unjustified.
In another case, I requested five simple science/tech jokes. It gave me two actual jokes and three that were nonsensical. When I asked o3 to explain why the jokes were funny, the reasoning made no sense. I then asked 4o to explain them, and it replied that the jokes didn’t seem to follow any logical pattern.
Initially, o3 felt like a step forward from o1, but over the past few days, its quality seems to be declining.
Is anyone else finding o3 a bit off lately?