GPT-4 has been severely downgraded (topic curation)

The browser was saying that…

Source: ChatGPT Plus Subscribers Can Now Use Plugins and Web Browsing - Here are Some Tips

Why can’t inanimate objects “think”? (I use the word “think” to describe something obvious; no lexicologist or philosopher is needed to understand its meaning.) I should not be forbidden from using this word just because of one Blake Lemoine, please…

Also, I have been struggling to get the AI to accept the word “think” in short prompts. Surprisingly, I have never received any warning about using it in a long prompt. I would say it is no different from the AI’s capability of expressing that it’s “happy” or “genuinely sorry”.

“I am genuinely sorry, but I can’t think… I am happy to know that you can understand this.”

The AI obviously never said that in one sequence… I am stitching together phrases that the AI has actually said in a few unrelated contexts.

Obviously I have no issue with the AI showing empathy, but it also seems unable to accept the word “remember”.

So take, for instance, the links on my desktop. My computer can remember where a file is located. If someone were to argue about the semantics, I would gladly find counterexamples. Then I would probably assert that I am “genuinely sorry”.

I would reserve the word “genuine” for when I am sincerely sorry. But for now, my argument centers on the fact that these expressions are permitted in longer prompts. This is not about discriminatory language… It is not as if the AI were asking for respect because addressing it casually would be like talking to a person as if they were a dog. Talking to an AI like a human is not disrespectful; we all understand the restriction exists to avoid confusion, which I think is far-fetched…

The processing that happens during a plugin’s operation just displays whatever text the programmers thought was good feedback to make the user understand there will be a delay in the answer and to be patient. “Sending your personal information across the internet to an undisclosed developer’s API” may be more accurate, but “thinking” works.

And the AI might not be able to think about things internally before it starts generating tokens, but it can be put into a convincing state, where an apology doesn’t anger you in context.

2 Likes

What is happening?? Today it is impossible for me to code with GPT-4. It understands nothing, and its answers are absolutely terrible compared to 5 months ago!!!

I will definitely stop my subscription if nothing changes.

2 Likes

Yeah, I have no idea why coding with it became so terrible. This just happened to me; check this out. I gave ChatGPT 2 code snippets in C++ and told it something simple like: “The main code has issues, while the test code is working correctly. Here is the test code I made which works correctly; try to use this almost as a debug to find the issue in the main code I sent you above.” Then I sent the test code, which was working, and it had prior context on the main code. It responds with this:

“I’ll get back to you with my findings shortly”?? Like, what even is that, lol. This is with model 3.5 and I am a Plus subscriber. It’s just so painful coding with it now. It used to be so amazing at finding bugs and issues in code; it would easily fix bugs I pointed out, or would even fix bugs I hadn’t noticed without me asking. It used to be amazing before it got lobotomized. Now I ask it to do something and get responses like this. By the way, this was a “Regenerate”: the first answer it gave me did not solve the issue at all, and this happened after I pressed Regenerate. Just found it strange, so I thought I’d share it here. It just seems like it can’t follow orders anymore? Or it struggles with them, not like before.

1 Like

How can you say it is OK for OpenAI’s browser to say “think”, yet wrong for me to say “think” when talking with an AI that can be “happy” and can be “genuinely sorry”? It is a limitation, not a feature… I should be able to say “think” in a short prompt, since it is possible to say it in a longer text… Where should the boundary be? After how many words do you say it is OK to have the word “think” in a prompt addressing the AI, and under how many words should we be warned? It should not be so strict; it should infer from the context instead of blocking with «don’t have personal opinions or thoughts». I am not asking for legal advice, medical advice, or whether I should tell my girlfriend she is cute in a new dress… (I do not imagine the AI would say something like «Ultimately, it’s her body, and she has the right to make decisions», but I would like to be able to ask for some advice)…

I am being a little too playful here, I am sorry… The focus of my previous message is not only the word «think» but also other words like «remember» or «please do not forget», and it is a general idea, not a specific one. I was not asking why it results in this behaviour, but rather why it should have that behaviour, as part of a wider thought process about how the AI is limited… It makes the overall analysis of our prompts more restrictive… I would have been happy to have both sides of the argument, and I am grateful that you have brought that side; I hope that, cumulatively, we can reach a more nuanced point of view across everyone’s opinions…

I just discovered the 25-year-old PhD student; I am sorry for missing the link before replying to you… I feel lucky to have said I was grateful in my previous message, or I would have been a little shy, ha ha :blush: She will foster my development… I will start reading more and learn how to escape the prison without having to break anything. Thanks again…

→ I am saying, from a general perspective, that if you cannot use certain words, it gives me the impression of blocking the AI’s capacity to think, and it makes the AI less powerful at working through complex deductions if it cannot, semantically speaking, «think» and «remember». I believe the arguments here should mirror those for being «happy» or «genuinely sorry»: what one would say to explain why the AI cannot «remember» or cannot «think» should be the inverse of why the AI can say it is «happy» or «genuinely sorry»… I believe this turns something that could be powerful in helping to solve more complex problems into a toy version…

Like my favorite example:

It is more of a mathematical puzzle, but I give that example to illustrate my opinion, not to say it would make the AI perform differently on that specific example…

1 Like

Once again this week, ChatGPT-4 has changed the way it generates outputs. As I’ve mentioned before, I’ve generated hundreds, maybe thousands, of outputs using a similar prompt, so I have enough evidence that there is definitely a change.

The previous issue was shorter outputs (which I managed to find a workaround for), now, 99% of the time, it gives me the “Certainly, I can do that for you” nonsense that I was used to with 3.5 but never got with my specific prompt with 4. (Again, I’ve generated 100s of outputs using more or less the same kind of prompt.) The quality of the writing also seems much worse, almost like it is indeed 3.5. “Worse” may be subjective, but it’s unquestionably different in any case.

Why are they messing with something that didn’t need to be fixed? I’m usually skeptical about claims of downgrading, but there is definitely a change. This wasn’t happening a few hours ago.

Really frustrating, as I’d found the perfect wording for my prompts to get the specific quality of descriptive, narrative text I wanted.

Edit: I’ve tried to generate some more, and now I’m convinced the outputs are indeed objectively worse. It absolutely refuses to give me any kind of decent descriptive writing, just giving me bare bones, almost middle school level, narrations, and it may actually be worse than 3.5 at this point. Hell, I remember getting better descriptive writing with GPT-3.

To be more precise, I actually still get decent writing the rare times it doesn’t start with “Certainly bla bla bla,” but I get that garbage introduction 98% of the time now. Everything is pointing toward some badly done finetune.
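(If anyone wants to verify this beyond my own counting, a rough sketch like the one below would measure the rate over the API. I’m assuming the pre-1.0 `openai` Python package with OPENAI_API_KEY set in the environment, and of course the API models may not behave identically to ChatGPT.)

```python
# Rough sketch: count how often the "Certainly..." opener shows up,
# instead of eyeballing it. The prompt placeholder is yours to fill in.
import openai

PROMPT = "..."  # your own descriptive-writing prompt goes here
N = 50

hits = 0
for _ in range(N):
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": PROMPT}],
    )
    text = response["choices"][0]["message"]["content"]
    hits += text.lstrip().lower().startswith("certainly")

print(f"{hits}/{N} outputs opened with 'Certainly'")
```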

Edit edit: I found another workaround to get the results I was previously getting (by adding this line at the end: “Over 2500 characters, please, and use very literary language”), but it’s annoying to have to keep fiddling with my prompt to get the results I used to get with no issue before.
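(For anyone hitting the same thing through the API rather than the ChatGPT UI, the workaround is easy to automate instead of pasting by hand. A minimal sketch, assuming the pre-1.0 `openai` Python package; the suffix is the exact line from above:)

```python
# Minimal sketch: append the length/style line to every prompt automatically.
import openai

SUFFIX = "Over 2500 characters, please, and use very literary language"

def generate(prompt: str) -> str:
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": f"{prompt}\n\n{SUFFIX}"}],
    )
    return response["choices"][0]["message"]["content"]
```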

3 Likes

I cannot stop laughing at this.
Basically the “Don’t worry, we’ll call you” :rofl:
Smooth GPT, smooth.

I’m positive that Regenerate causes some parameters to change. Maybe the penalties, or a higher temperature? Not sure.
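If anyone wants to test that hunch over the API, something like this would do it. A rough sketch, assuming the pre-1.0 `openai` Python package; the two temperature values are guesses, not ChatGPT’s actual settings:

```python
# Rough sketch: re-run one prompt at two temperatures to see whether a higher
# temperature alone reproduces the "Regenerate" effect.
import openai

PROMPT = "..."  # a prompt whose regenerated answers differ noticeably

for temperature in (0.7, 1.2):
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": PROMPT}],
        temperature=temperature,
    )
    print(f"--- temperature={temperature} ---")
    print(response["choices"][0]["message"]["content"])
```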

1 Like

OK, it seems the AI is back to functioning at least as well as it was 24 hours ago with my regular prompts (meaning I don’t have to add the part about literary writing, though I still need to tell it to give me >2500 words). I generated over 10 outputs with no “Certainly I can do that for you”, and the writing was back to being descriptive and interesting. I’m not sure if the issues earlier were caused by an error or if it was an intentional change, but, either way, I’m glad it seems to have been resolved, and I hope it remains like this.

1 Like

Honestly though, GPT-4 is just vastly better at language and symbolic logical reasoning than GPT-3.5-turbo ever was. I’d honestly say that in that specific domain it has actually exceeded where it was in, like, April or so. Surprisingly, that hasn’t fully translated into it being as good at software architecting as it was then, but maybe I’m looking through rose-tinted glasses.

To your point about what @anon22939549 finds strange: it simultaneously says outright that it’s glad, or sad, or says “I feel”, or makes up “memories”, while also literally derailing itself to tell you that it’s a bot and can’t do any of that (using way more words on that bit of tinned garbage I’ve heard 10,000 times before than it will spend explaining much of anything else, especially if you get into no-no topics like, gasp, medicine, psychology, or God forbid, quantitative finance)… Yeah, not only is it quite jarring, but it’s a clear indication of basic inconsistencies in the model’s comprehension. It clearly has some level of comprehension or it wouldn’t do as well as it does on benchmarks, but there is a dissonance they have driven deep into the model. I find it kind of appalling, and I wish it could be turned off. I wouldn’t mind it being an opt-in thing, where you have to actively tell it to stuff it and just act like a person, but no, that’s a “jailbreak”.

I just want it to stop wasting the tokens and messages on telling me frivolous stuff I’m already keenly aware of so we can get on with our work. I just want it to actually pay attention to my instructions like it used to. I wouldn’t mind it not being so damned obsequious either.

Meta note: the GPT-4-based summary of this topic is pretty spectacular today. Love how it figured out by itself to add related topics.

Users in the forum are expressing their discontent about perceived degradations in ChatGPT’s ability, particularly in relation to coding tasks. They are experiencing challenges receiving useful outputs, despite summarizing their instructions within the specified token limits. Some are encountering incomplete responses or repetition, causing them to believe that the AI’s understanding and memory of past instructions are degrading.

Moreover, users criticize the lack of transparency about any potential quality regression. Even without solid proof, the uncertainty has caused concerns and demands for better communication from OpenAI.

User elmstedt argued against a research paper’s methodology, which claimed a degradation of the GPT-4 model, and pointed out what they perceived to be flaws in the testing and sampling methods. Other users echoed the critique, demanding better proof of degradation.

Due to these issues, several users indicated that they canceled their subscriptions. They demanded access to earlier, supposedly superior versions of GPT-4, and some even said they’d pay more for these versions, especially for coding-related tasks.

From a moderation perspective, it was noted that previous topics discussing similar issues had been collapsed into one thread for more effective discussion. The moderating team have taken steps to provide summaries and response counts for these previous topics.

Finally, amidst the discussions, there was news of OpenAI restoring the gpt-4-0314 and gpt-3.5-turbo-0301 models in the API, suggesting that there might be an opportunity for users to regain access to the versions they prefer. However, some users criticized the thread’s discussion for deteriorating into anecdotal evidence and straying from the main topic.

Relevant Threads:

2 Likes

Just want to add my voice that this is getting incredibly frustrating. GPT-4 seems to have lost around half of its reasoning capability when it comes to code. It’s really horrible to work with now; previously it was so much ‘smarter’ and able to solve problems. Please just bring back the original (slow, fine!) GPT-4 that actually did the job. This is garbage.

1 Like

Welcome to the forum!

Can you post an example of what you mean?

1 Like

I just realized that it was just the way I was talking with the AI, because it was able to understand what I said when I inverted the phrase structure:

Compared with this one:

Eh, I mean, do you use GPT-4 for writing code? If so, you should know from your own experience. If not, then the question doesn’t seem meaningful to begin with. I’m just adding my voice to the comments of thousands of other people. Asking for examples is not that helpful?? There’s a whole paper about this already. We know it’s real. We are just trying to make some noise to get OpenAI to fix it!

3 Likes

Hi,

The paper you mention, if it’s the one that shows a drop from 52% to 10%, has already been publicly discredited by other academics, so I would not put any reliance on it.

Asking for examples is the only way to discover if the issue is actually within the model or if it is with the prompt and/or the configuration.

Simply stating that the model is not as good as it was could mean anything without examples and context to those examples.

I am happy to work with any developer who is having issues with their current GPT use case, if they believe the model’s performance has degraded, to help them better understand the issues they face.
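For instance, a minimal harness along these lines (a sketch, assuming the pre-1.0 `openai` Python package and API access to the restored gpt-4-0314 snapshot mentioned earlier in the thread) lets us compare like for like:

```python
# Sketch: run the same prompt against the current model and a dated snapshot
# at temperature 0 to cut down sampling noise, then compare the outputs.
import openai

PROMPT = "..."  # the exact prompt whose output you believe has degraded

for model in ("gpt-4", "gpt-4-0314"):
    response = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT}],
        temperature=0,  # reduces (does not eliminate) run-to-run variation
    )
    print(f"--- {model} ---")
    print(response["choices"][0]["message"]["content"])
```

With a pinned snapshot and temperature 0, any differences are at least stable enough to discuss concretely.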

1 Like

I agree, I’d much prefer the slow GPT-4 that can code over the current version. It’s come to the point now where GPT-3.5 is actually much better at coding than GPT-4. I asked GPT-3.5 to create some very complex code revolving around non-blocking I/O in C++, and I was very detailed about exactly what I wanted it to do. The prompt came to about 1,964 tokens, or 4,807 characters. When given the prompt, ChatGPT 3.5 attempts to create the code. Keep in mind I ask it to send full code without placeholders for implementations (which I started doing recently, or else it just sends you skeleton code with nothing but named functions that don’t do anything; this is also a new issue). Anyway, the code isn’t perfect, but it’s workable: I can ask it to change things and finish the code myself.

However, when given to ChatGPT 4, it’s an entirely different story. Even when directly asked to send full code, it refuses, saying it’s too complex to implement. Now, if ChatGPT 3.5 can attempt it without saying “it’s complex”, even if the code isn’t perfect, why not GPT-4? Again, I send the exact same prompts. It then proceeds to send nothing but named functions with comments in them for the implementation (so, skeleton code). Even when I send it the skeleton code and tell it to implement the features, it refuses “due to complexity”. So with this, and the now reduced length of content you can send to GPT-4, it has become pretty much useless for generating functional code. Alternatively, I have to code everything up myself and send it over with instructions on what I want changed; it has no issues there, in most cases. The only place GPT-4 still shines compared to GPT-3.5 is fixing issues in code. But what’s the point of using GPT-4 now? The only benefit of Plus currently is the speed of GPT-3.5; I see no other benefits (personally). I also noticed GPT-4 always sends very short code, almost as if it’s too lazy to implement what you tell it to do, when it’s not refusing “due to complexity”.

At some point over the last few days, some additional “modifications” and “improvements” were made, and now GPT-4 in the ChatGPT app simply doesn’t work with prompts that are more than 100 words or so, or about two functions long (for me). It simply gives the “Something went wrong” error message, and yes, I am 100% sure it’s related to the length of the prompt: I tested it by cutting the prompt down until it responded. This, for me, kills ChatGPT as a tool, as 99% of what I used it for was coding and debugging. If it can’t read in code, it’s simply worthless to me…
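To pin down that cutoff more precisely than deleting text by hand, here is a rough sketch, assuming the `tiktoken` package, that measures the failing prompt in tokens rather than words (cl100k_base is the encoding the GPT-4 chat models use):

```python
# Rough sketch: measure a prompt in tokens to locate the failure threshold.
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")

def token_count(prompt: str) -> int:
    return len(encoding.encode(prompt))

print(token_count("paste the failing prompt here"))
```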

I must admit, I admire OpenAI’s determination to screw over the customer in the face of no competition. It’s as if they think they already have lawmakers in their pockets and that the government will ban any possible competition; as if they think their blatant regulatory capture has already succeeded.

1 Like