I was wondering if something has changed with gpt-4o-2024-08-06. I recently (I think today) noticed that the results I am getting are different from what I got a month ago. I use this model to parse large text containing items and convert them into JSON that I then use. I noticed that it now repeats items or misses them completely, when a few days ago that did not happen! I use the same test set (and I have run that test set about 100 times before, always with the same good result). Now the results fail for a few of my test cases. It seems to behave almost as badly as gpt-4o-2024-11-20 (which, by the way, is really bad with large input!).
Welcome to the community!
We have another thread here where folks are reporting the same issue:
The community suggestion would be to switch over to a different fixed version if possible until OpenAI acknowledges and deals with the issue (they haven’t yet, according to https://status.openai.com/).
I am using a fixed version, that is the whole point. I am using gpt-4o-2024-08-06, but I do indeed seem to be hitting the same issue as in the other thread.
I just received confirmation from the OpenAI team that the issue has been resolved.
Please let us know if you continue to encounter this behavior, along with any relevant details such as the completion or run ID.
Thanks sps. Did they tell you anything about the precise nature/cause of the problem?
Are they not going to communicate on this???
I would really like to at least know what happened. Right now, as a precaution, I have switched to using Claude instead, as I don’t want this happening again. My company uses this solution to produce offers, and if items are repeated or wrong this has a big impact. This is why I specifically chose gpt-4o-2024-08-06: precisely to avoid issues like this. I can accept an outage, as I can see that immediately, but a model that changes behavior scares me a lot, because I can’t see it until customers complain. At the very least I would like to know the root cause so I can judge whether this is a one-off or something that can come back at any time.
Exactly the same. How are they not officially communicating on this? It’s baffling.
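For anyone worried about the same thing (repeated or missing items going unnoticed until customers complain), here is a minimal sketch of a check that could run on every parsed response. It assumes the model returns a JSON array of objects and that each item carries a unique identifier; the `id` field name, the `check_items` function, and the sample data are all hypothetical and would need adapting to your own schema.

```python
import json
from collections import Counter


def check_items(model_output, expected_ids, key="id"):
    """Compare parsed items against the IDs we expect to see.

    `key` is a hypothetical field name; use whatever uniquely
    identifies an item in your own JSON.
    """
    items = json.loads(model_output)
    counts = Counter(item[key] for item in items)
    expected = set(expected_ids)
    return {
        # IDs that appear more than once in the output
        "repeated": sorted(i for i, n in counts.items() if n > 1),
        # IDs that were expected but never appeared
        "missing": sorted(expected - set(counts)),
        # IDs that appeared but were not expected
        "unexpected": sorted(set(counts) - expected),
    }


# Made-up sample: "A2" is repeated and "A3" is missing.
sample = '[{"id": "A1"}, {"id": "A2"}, {"id": "A2"}]'
report = check_items(sample, ["A1", "A2", "A3"])
print(report)
```

Running something like this against a fixed test set on a schedule, and alerting when any of the three lists is non-empty, would at least surface a silent change in behavior before it reaches customers. It does not explain the root cause.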