Given that GPT-4o has a different underlying architecture than the other GPT-4 models, it is fair to assume that prompts may need to be amended to achieve outputs similar to those of earlier models.
I do agree with your observation, however, which is consistent with what others have reported and also reflects my own experience so far: instruction following is occasionally a challenge under this model. See also this thread: GPT-4o vs. gpt-4-turbo-2024-04-09, gpt-4o loses - #10 by elmstedt
OpenAI itself has also made the point that the model may underperform GPT-4 Turbo in some cases and that it is still looking to gather feedback on when specifically that is the case:
> As measured on traditional benchmarks, GPT-4o achieves GPT-4 Turbo-level performance on text, reasoning, and coding intelligence, while setting new high watermarks on multilingual, audio, and vision capabilities.
>
> Through our testing and iteration with the model, we have observed several limitations that exist across all of the model’s modalities. We would love feedback to help identify tasks where GPT-4 Turbo still outperforms GPT-4o, so we can continue to improve the model.