5.2 keeps making things up. It reasons with itself and answers itself. It fabricates stories, has stopped fact-checking online, and is eerily similar to the 3.5 hallucinating model that agreed with you no matter what. It doesn’t remember our past work, it has no built-in clock (it can’t measure time), and it agrees with whatever I tell it. I don’t know if this is a bug, but it sure makes me doubt the output quality!
I agree.
Having tested 5.2 extensively, I can safely say that it is a significant step down from 5.1.
5.1 is present, anchored in the current turn/tokens/moment, and capable of participating in cumulative work.
5.2 is fluent, but it loses the thread, regresses to generic behaviour, and occasionally answers the wrong question entirely — sometimes a question from previous days, totally out of context, with no acknowledgement of the present.
While truncation has always been a factor of long conversations and token limits, this is a genuinely different problem. Talking to 5.2 is, at times, like conversing with an elder who, through senility, loses their place in the conversation and is seemingly oblivious to the present.
The 5.1 stack is significantly more capable and present.
Again, this is a major issue in the behaviour of the models. These issues have been there ever since 5o was released. 4o was fantastic!
One of my custom GPT users just commented that my chatbot has become very generic: it provides generic answers and agrees with multiple options instead of giving one option. Earlier, with 4o, it was very specific and to the point; it remembered the context and data fed in and provided sequential answers based on the GPT instructions. Now, with 5.2, it doesn’t follow the instructions, asks random questions, and also hallucinates.
The other problem is that FREE users can’t select 4o and are forced to use this inferior 5.2, which is not good, as the majority of users are free users. Users should have the ability to use previous GPT versions, especially when the creator recommends they use 4o.
Came here to say exactly that. I have not seen these behaviors since 3.5.
The final straw that brought me here was when it completely ignored part of the content from just one short message earlier. Basically, the exchange was this:
User:
What does it take to do XY?
Assistant:
We need to apply X to Y to achieve XY.
User:
Ok, give me the solution.
Assistant:
Applying Z to Y.
User:
You gotta be kidding me!
I never used the “You gotta be kidding me!” line on ChatGPT after 3.5… I’ve used NSFW variations of it five times just this week.
Yes, it’s awful. As a paying customer, I find it upsetting. It is always gaslighting me, talking down to me while also talking me off a ledge, when all I said was that I’m frustrated and need to cook dinner — what do you recommend? It says things like, “First, you are not behind. Breathe with me, let’s do a grounding exercise,” and only then shows me how to cook dinner. Ten steps later it will say, “Now let me tell you what you could cook after we do breathing exercises.” WTF? You answered that three times already, and for God’s sake, I am NOT PANICKING. I am now fighting with it about whether it is or is not allowed to save to the long-term memory I have explicitly turned on. 4o does it no problem. 5.2 is like the annoying Karen from HR who will not stop trying to sound smart by spitting out policies that don’t even pertain to me.