Is gpt-4o-2024-08-06 performing worse than gpt-4o in complex reasoning tasks?

I'm seeing that gpt-4o-2024-08-06 sometimes performs worse (or, let's say, gives less comprehensive answers) than gpt-4o. Is that by design, or just an anecdotal observation?

2 Likes

When you simply specify “gpt-4o” as the model name, it points to “gpt-4o-2024-05-13”.

Since “gpt-4o-2024-05-13” and “gpt-4o-2024-08-06” are slightly different models, the differences you observe may be due to this.

https://platform.openai.com/docs/models/gpt-4o
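If you want to rule out the alias as the cause, you can pin both snapshots explicitly and run the same request against each. A minimal sketch with the Python SDK (the prompt and client setup here are just placeholders, not from the original posts):

```python
from openai import OpenAI

client = OpenAI()

# "gpt-4o" is an alias that resolves to a dated snapshot
# (gpt-4o-2024-05-13 at the time of this thread), so pin both
# snapshots explicitly to compare them under identical conditions.
for model in ("gpt-4o-2024-05-13", "gpt-4o-2024-08-06"):
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "user", "content": "Explain the trade-offs of eager vs. lazy evaluation."}
        ],
        temperature=0,
    )
    print(model, "->", response.choices[0].message.content)
```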

1 Like

Yeah, I understand that. I was just curious whether enabling structured outputs and the larger output limits in gpt-4o-2024-08-06 has taken away some of the reasoning power of gpt-4o / gpt-4o-2024-05-13. Has anyone else seen that in dev or production, or is this just an anecdotal observation?

2 Likes

Yes, I am encountering an issue when asking the model to select a category for an item from a predefined list of approximately 50 categories, which includes a “miscellaneous” option. Some items are difficult to match, so I have added instructions for the model to prefer broad matches or, at the very least, choose the “miscellaneous” category instead of returning an error. However, gpt-4o-2024-08-06 consistently fails to follow this instruction, even with a wide range of temperatures (including 0), while gpt-4o-2024-05-13 performs well under the same conditions.

After reimplementing the request to pass the JSON specification as a structured output schema, rather than describing it in the prompt, the results seem to have improved. I also noticed a typo in the miscellaneous category's name in the category list (the prompt itself had it spelled correctly), which might have had some impact. Overall, after making these two changes, I'm no longer sure that gpt-4o-2024-08-06 performs worse.
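For anyone hitting the same issue, the reimplementation looks roughly like this. This is a minimal sketch using the Python SDK's structured-outputs support with a hypothetical, trimmed-down category enum (the real list has ~50 entries) and an invented example item:

```python
from enum import Enum

from openai import OpenAI
from pydantic import BaseModel


# Hypothetical, trimmed-down category list; the real one has ~50 entries.
class Category(str, Enum):
    electronics = "Electronics"
    groceries = "Groceries"
    miscellaneous = "Miscellaneous"


class ItemClassification(BaseModel):
    category: Category


client = OpenAI()

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {
            "role": "system",
            "content": (
                "Classify the item into one category. Prefer a broad match; "
                "if nothing fits, choose Miscellaneous rather than refusing."
            ),
        },
        {"role": "user", "content": "solar-powered garden gnome"},
    ],
    # The schema (including the allowed category names) is enforced by
    # structured outputs, so it no longer lives only in the prompt text.
    response_format=ItemClassification,
)

print(completion.choices[0].message.parsed.category)
```

Moving the category list out of the prompt and into the schema also removes one place where a typo like the one above can creep in, since the model can only return values the enum actually contains.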