There have been no capabilities blocked so I’m unclear what the issue is.
Once again this week, ChatGPT-4 has changed the way it generates outputs. As I’ve mentioned before, I’ve generated 100s, maybe 1000s, of outputs using a similar prompt so I have enough proof that there is definitely a change.
The previous issue was shorter outputs (which I managed to find a workaround for). Now, 99% of the time, it gives me the “Certainly, I can do that for you” nonsense that I was used to with 3.5 but never got with my specific prompt with 4. (Again, I’ve generated 100s of outputs using more or less the same kind of prompt.) The quality of the writing also seems much worse, almost like it is indeed 3.5. “Worse” may be subjective, but it’s unquestionably different in any case.
Why are they messing with something that didn’t need to be fixed? I’m usually skeptical about claims of downgrading, but there is definitely a change. This wasn’t happening a few hours ago.
Really frustrating, as I’d found the perfect wording for my prompts to get the specific quality of descriptive, narrative text I wanted.
Edit: I’ve tried to generate some more, and now I’m convinced the outputs are indeed objectively worse. It absolutely refuses to give me any kind of decent descriptive writing, just giving me bare bones, almost middle school level, narrations, and it may actually be worse than 3.5 at this point. Hell, I remember getting better descriptive writing with GPT-3.
To be more precise, I actually still get decent writing the rare times it doesn’t start with “Certainly bla bla bla,” but I get that garbage introduction 98% of the time now. Everything is pointing toward some badly done finetune.
Edit edit: I found another workaround to get the results I was previously getting (by adding this line at the end - Over 2500 characters, please, and use very literary language) but it’s annoying to have to keep fiddling with my prompt to get the results I used to get with no issue before.
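For anyone who wants to put a number on the “Certainly…” rate instead of eyeballing it, here is a minimal sketch. It assumes you have saved your generations as a plain list of strings; the function name `boilerplate_rate` and the list of openers are made up for illustration, not anything from ChatGPT itself.

```python
def boilerplate_rate(outputs, openers=("certainly", "sure", "of course")):
    """Return the fraction of outputs that begin with a canned opener.

    `openers` is an illustrative, hypothetical list; adjust it to whatever
    filler phrases you actually see in your own outputs.
    """
    if not outputs:
        return 0.0
    # str.startswith accepts a tuple, so one call checks every opener.
    hits = sum(
        1 for text in outputs
        if text.lstrip().lower().startswith(openers)
    )
    return hits / len(outputs)


if __name__ == "__main__":
    saved = [
        "Certainly, I can do that for you. The castle stood...",
        "Rain hammered the slate roofs of the old town.",
    ]
    print(f"{boilerplate_rate(saved):.0%} of outputs start with boilerplate")
```

Running this on a batch saved before and after a suspected change gives two rates you can compare directly, which is easier to discuss than a general impression.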
I cannot stop laughing at this.
Basically the “Don’t worry, we’ll call you”
Smooth GPT, smooth.
I’m positive that Regenerate causes some parameters to change. Maybe the penalties, higher temp? Not sure.
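Nothing in this thread confirms what Regenerate actually does, but as an illustration of why a higher temperature would make regenerations look different, here is a small sketch of temperature-scaled softmax. The logits are invented numbers and `softmax_with_temperature` is just a name for this example.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities, scaled by a sampling temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    made_up_logits = [2.0, 1.0, 0.1]  # hypothetical scores for three tokens
    for t in (0.5, 1.0, 2.0):
        probs = softmax_with_temperature(made_up_logits, t)
        print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top token, while higher ones flatten the distribution, so less likely wordings get picked more often.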
OK, it seems the AI is back to functioning at least as well as it was 24 hours ago with my regular prompts (meaning I don’t have to add the part about literary writing, though I still need to tell it to give me >2500 words). I generated over 10 outputs with no “Certainly I can do that for you” and the writing was back to being descriptive and interesting. I’m not sure if the issues earlier were caused by an error or if it was an intentional change, but, either way, I’m glad it seems to have been resolved, and hope it remains like this.
Honestly though, GPT-4 is just vastly better at language and symbolic logical reasoning than GPT-3.5-turbo ever was. I’d honestly say that in that specific domain it has actually exceeded where it was in, like, April or so. Surprisingly, that hasn’t fully translated to it being as good at software architecture as it was then, but maybe I’m looking back through rose-tinted glasses.
To your point about what @elmstedt thinks is strange: it simultaneously straight up says it’s glad, or sad, or says “I feel,” or makes up “memories,” while also literally derailing itself to tell you that it’s a bot and can’t do any of that (using way more words to explain that bit of tinned garbage I’ve heard 10,000x before than it will spend explaining much of anything, especially if you get into no-no topics like, gasp, medicine, psychology, or, god forbid, quantitative finance). Yeah, not only is it quite jarring, but it’s a clear indication of basic inconsistencies in the model’s comprehension. It clearly has some level of comprehension or it wouldn’t be able to do so well on the benchmarks it does, but there is a dissonance that they have driven deep into the model. I find it kind of appalling, and I wish it could be turned off. I wouldn’t mind it being an opt-in thing, where you have to actively tell it to stuff it and just act like a person, but no, that’s a “jailbreak”.
I just want it to stop wasting the tokens and messages on telling me frivolous stuff I’m already keenly aware of so we can get on with our work. I just want it to actually pay attention to my instructions like it used to. I wouldn’t mind it not being so damned obsequious either.
Meta note on this topic, the GPT 4 based summary on this topic is pretty spectacular today. Love how it figured out by itself to add related topics
Users in the forum are expressing their discontent about perceived degradations in ChatGPT’s ability, particularly in relation to coding tasks. They are experiencing challenges receiving useful outputs, despite summarizing their instructions within the specified token limits. Some are encountering incomplete responses or repetition, causing them to believe that the AI’s understanding and memory of past instructions are degrading.
Moreover, users criticize the lack of transparency about any potential quality regression. Even without solid proof, the uncertainty has caused concerns and demands for better communication from OpenAI.
elmstedt argued against a research paper’s methodology, which claimed a degradation of the GPT-4 model, and pointed out what they perceived to be flaws in the testing and sampling methods. Other users echoed the critique, demanding better proof of degradation.
Due to these issues, several users indicated that they canceled their subscriptions. They demanded access to earlier, supposedly superior versions of GPT-4, and some even said they’d pay more for these versions, especially for coding-related tasks.
From a moderation perspective, it was noted that previous topics discussing similar issues had been collapsed into one thread for more effective discussion. The moderating team have taken steps to provide summaries and response counts for these previous topics.
Finally, amidst the discussions, there was news of OpenAI restoring the gpt-4-0314 and gpt-3.5-turbo-0301 models in the API, suggesting that there might be an opportunity for users to regain access to the versions they prefer. However, some users criticized the thread’s discussion for deteriorating into anecdotal evidence and straying from the main topic.
- Post about GPT’s downgrade: GPT has been severely downgraded
- Quality decrease in GPT-4: Has there been a recent decrease in GPT-4 quality?
- Declined performance of ChatGPT-4: Experiencing decreased performance with ChatGPT-4
- Performance degradation of GPT+: Terrible performance degradation GPT plus
- GPT-4 being unimpressive: GPT-4 is underwhelming now.
Just want to add my voice that this is getting incredibly frustrating. GPT-4 seems to have lost around half of its reasoning capability when it comes to code. It’s really horrible to work with now – previously it was so much ‘smarter’ and able to solve problems. It’s horrible!! Please just bring us back the original (slow - fine!) GPT-4 that actually did the job. This is garbage.
Welcome to the forum!
Can you post an example of what you mean?
I just realized that it was just the way I was talking with the AI because it was able to understand what I said when I invert the phrase structure:
Compared with this one:
Eh - I mean, do you use GPT-4 for writing code? If so, you should know based on your own experience. If not, then the question doesn’t seem meaningful to begin with. I’m just chiming in to the comments by thousands of other people. Asking for examples is not that helpful?? There’s a whole paper about this already. We know it’s real. We are just trying to make some noise to get OpenAI to fix it!
The paper you mention, if it’s the one that shows a drop from 52% to 10%, has already been discredited publicly by other academics, so I would not put any reliance on that.
Asking for examples is the only way to discover if the issue is actually within the model or if it is with the prompt and/or the configuration.
Simply stating that the model is not as good as it was could mean anything without examples and context to those examples.
I am happy to work with any developer who believes the model’s performance has degraded for their current GPT use case, to help them better understand the issues they face.
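As a rough idea of what a useful example could look like, here is a minimal sketch that bundles a prompt, the output, and the settings into one JSON blob you can paste into a post. The function `make_repro_report` and its field names are hypothetical, chosen only for this illustration; the snapshot name `gpt-4-0314` is the one mentioned in the summary above.

```python
import json


def make_repro_report(model, prompt, output, temperature=None, notes=""):
    """Bundle everything someone needs to retry your case into one JSON string.

    All field names here are illustrative, not an official format.
    """
    report = {
        "model": model,              # e.g. a pinned snapshot such as "gpt-4-0314"
        "temperature": temperature,  # None if unknown (e.g. the ChatGPT app)
        "prompt": prompt,
        "output": output,
        "notes": notes,              # what you expected vs. what you got
    }
    return json.dumps(report, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    print(make_repro_report(
        model="gpt-4-0314",
        prompt="Write a non-blocking TCP echo server in C++.",
        output="(paste the model's reply here)",
        temperature=0,
        notes="Expected full code, got only function stubs.",
    ))
```

Posting the same prompt against two model versions in this form lets others check whether the difference comes from the model, the prompt, or the configuration.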
I agree, I’d much prefer the slow GPT-4 that can code over the current version. It’s come to the point now where GPT-3.5 is actually much better at coding than GPT-4. I asked GPT-3.5 to create some very complex code that revolves around non-blocking I/O in C++. I was very detailed about exactly what I wanted it to do. The prompt came to about 1,964 tokens, or 4,807 characters. When given this, ChatGPT 3.5 attempts to create the code. Keep in mind that I ask it to send full code without placeholders for implementations (which I started doing recently, or else it just sends you skeleton code, with nothing but named functions that don’t do anything; this is also a new issue). Anyway, the code isn’t perfect, but it’s workable: I can ask it to change things and so on, and finish the code myself.
However, when the same prompt is given to ChatGPT 4, it’s an entirely different story. Even when directly asked to send full code, it refuses, saying it’s too complex to implement. Now if ChatGPT 3.5 can attempt it without saying “it’s complex,” even if the code isn’t perfect, why not GPT-4? Again, I send the exact same prompts. It then proceeds to send nothing but named functions with comments in them where the implementation should go (so, skeleton code). Even when I send it the skeleton code and tell it to implement the features, it refuses “due to complexity”. So with this and the now reduced length of content you can send to GPT-4, it’s become pretty much useless for generating functional code. Otherwise, what I have to do is code up everything myself and send it over with instructions on what I want changed and so on. It has no issues here, in most cases. The only area where GPT-4 still shines compared to GPT-3.5 is fixing issues in code. But what’s the point of using GPT-4 now? The only benefit of Plus currently is the speed with GPT-3.5; I see no other benefits (personally). I’ve also noticed GPT-4 always sends very short code, almost as if it’s too lazy to implement what you tell it to do, when it’s not refusing “due to complexity”.
At some point over the last few days some additional “modifications” and “improvements” were made, and now GPT-4 in the ChatGPT app simply doesn’t work with prompts that are more than 100 words or so, or about two functions long (for me). It simply gives the “Something went wrong” error message, and yes, I am 100% sure it’s related to the length of the prompt; I tested it by cutting the prompt down until it responded. This, for me, kills ChatGPT as a tool, as 99% of what I used it for was coding and debugging. If it can’t read in code, it’s simply worthless for me…
I must admit, I admire OpenAI’s goal to screw up the customer in the face of no competition. It’s as if they think they already have lawmakers in their pockets, and the government will ban any of their possible competition. It’s as if they think their blatant regulatory capture has already succeeded.
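If the failure really does depend on prompt length, the “cut it down until it responds” test can be done as a binary search instead of by hand. This is only a sketch: `send` is a hypothetical callable that you would supply yourself (returning True when the prompt gets a normal reply), and it assumes that once a prompt of some length fails, every longer one fails too.

```python
def longest_working_prefix(words, send):
    """Find the largest n for which the first n words still get a reply.

    `send` is a stand-in you provide: it takes a prompt string and returns
    True on success, False on an error such as "Something went wrong".
    Assumes failures are monotonic in prompt length.
    """
    lo, hi = 0, len(words)  # invariant: a prefix of `lo` words is treated as OK
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if send(" ".join(words[:mid])):
            lo = mid              # mid words worked; try longer
        else:
            hi = mid - 1          # mid words failed; try shorter
    return lo


if __name__ == "__main__":
    # Fake backend for demonstration only: "fails" above 100 words.
    def fake_send(prompt):
        return len(prompt.split()) <= 100

    prompt_words = ["word"] * 250
    print(longest_working_prefix(prompt_words, fake_send))
```

If the cutoff moves around between runs, or short prompts with certain code also fail, that points away from length as the cause.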
I agree, there has been a really drastic degradation in GPT-4 performance since the Aug 3 update, especially when dealing with code. It almost always fails to keep track of conversations.
Also, I have been observing that some code simply breaks GPT, and it just gives the “Something went wrong” error message. It has nothing to do with the length of the prompt.
It doesn’t remember its own solutions and sometimes just spits out nonsensical answers that have zero relevance to the question I just asked. There have also been times when it keeps repeating itself in a loop. Giving the answer a thumbs down produces an even worse alternative!
I need to rethink my subscription now.
So far, no update from OpenAI, but the workaround is to use Firefox. This bug happens with some code when you use Chrome or Edge browser.
No. Just paraphrase the question/instruction. That’s the way you get a clearer response.
No, this has nothing to do with paraphrasing the question. I have been using this for 100+ conversations over the past 6 months. Trust me, this is something different and noticeable only after the Aug 3 update… that’s for sure.
@slippydong Yes, I also face the same bug… and you are correct, using Firefox is a workaround. Hope there is a resolution soon.
No, you would be incorrect.
I too have been studying GPT since… before New Year’s.
But thank you for your opinion, Mr. Dsouzasunny1436.
It’s possible that we approach GPT in different ways. I think from the perspective of behavioral science. Perhaps you do from the perspective of computer science?
I understand how people “do”.
Perhaps you know how computers “do”?
I’m asking, I don’t know…