Despite everything being written about GPT4o as a step forward towards better interaction with humans, I find it the most annoying model so far: it sacrifices attention to quality in order to be faster, and its speed is not even compatible with its own interface. Let me explain.
It is obviously an experiment aimed at voice-to-voice interaction, which is not my thing. I don't understand why we should strive for speed when most of us use these models to produce quality work.
It writes more content than GPT4, which is good, but it compulsively writes more content even when it doesn't have to, and even when it is specifically asked not to. It's like asking your mother for directions to a place: it takes her longer to explain the route than it takes you to actually get there.
It is at its most annoying when writing code: in every reply it rewrites the same code again and again, with changes that do not follow the instructions I give it in very precise language.
The model is biased towards speed and towards producing drafts that lack quality. GPT4 is also biased towards drafting content and code, but through conversation that can be corrected. GPT4o insists on correcting things its own way, generating huge amounts of worthless content that I cannot work with or steer towards a quality final result.
Finally, the interface is INCOMPATIBLE with such a fast model. I don't understand how such big companies fail to handle these inconsistencies. I should probably write a completely different post about this, because these issues apply to all the models, but GPT4o amplifies them.
The experience of interacting with GPT4o goes like this:
- The discussion happens and directions are outlined,
- the model responds with a very long, very fast reply, rendered with that annoying, hard-to-follow real-time animation that mimics human typing,
- the output is so fast that nobody can actually read it in real time, which makes the animation pointless and straining on the eyes,
- the interface gets stuck in every browser on older machines (probably all the developers at OpenAI have the latest MacBooks and never see this, but the animation, besides being the opposite of human-intuitive, halts the system with every message),
- when writing code, the view ends up scrolled to the bottom of the output (as mentioned, once it finishes writing the page has scrolled to the end of it), while the “copy” button sits at the top of the code block, so EEEVERY time we have to scroll back up to copy the code and to read the reply (absolutely NO POINT in having the “real-time writing” animation),
- the result is a quite capable model that sacrifices quality for the sake of speed, and ends up not properly solving the problems we give it, but instead circling around fixing errors in a way that momentarily makes the code run without errors, with no concern for whether the code actually does what it was originally meant to do.
This is obviously the result of optimizing for speed and for ranking high in benchmarks. Is that a wise approach? Chasing new customers while not taking care to retain the existing ones.