GPT-4 has been severely downgraded (topic curation)

I agree, there has been a real, drastic degradation in GPT-4 performance since the Aug 3 update, especially when dealing with code. It almost always fails to keep track of the conversation.
Also, I have observed that some code simply breaks GPT, and it just gives the “Something went wrong” error message. It has nothing to do with the length of the prompt.
It doesn't remember its own solutions and sometimes just spits out nonsensical answers that have zero relevance to the question I just asked. There have also been times when it keeps repeating itself in a loop. Giving the answer a thumbs down produces an even worse alternative!

I need to rethink my subscription now.

This is a bug for which I’ve opened another thread. It’s not just with JavaScript, by the way:
Some javascripts cause an error in ChatGPT web after AUG 3 update - ChatGPT / Bugs - OpenAI Developer Forum

So far, no update from OpenAI, but the workaround is to use Firefox. This bug happens with some code when you use Chrome or Edge browser.

No. Just paraphrase the question/instruction. That's the way you get a clearer response.

No, this has nothing to do with paraphrasing the question. I have used this for 100+ conversations over the past 6 months. Trust me, this is something different, and noticeable only after the Aug 3 update… that's for sure.

@slippydong Yes, I also face the same bug… and you are correct, using Firefox is a workaround. Hope there is a resolution soon.

No, you would be incorrect.
I too have been studying GPT since… before New Year's.
But thank you for your opinion, Mr. Dsouzasunny1436.

It's possible that we approach GPT in different ways. I think from the perspective of behavioral science. Perhaps you do from the perspective of computer science?
I understand how people “do”.
Perhaps you know how computers “do”?

I'm asking, I don't know…

Could be. I am talking purely from a coding perspective. I have seen a clear degradation. I don't use GPT for any purpose other than helping me speed up scripting for my personal projects.
I am not sure if you have tested from a scripting perspective since the Aug 3 update.


For those asking for specific examples of prompts with worse results now than before, here is a coding problem that I’ve been using as a benchmark since May.


Let ‘cfg’ be a python dictionary.

Write a memoizing decorator that can be used with functions that return numpy arrays, tensorflow tensors, or tuples containing numpy arrays and/or tensorflow tensors.

The decorator does a lookup based on a hash that depends on all the values stored under the following keys of ‘cfg’:

‘offset’, ‘N’, ‘bigoffset’, ‘sim_nphotons’, ‘nimgs_train’, ‘nimgs_test’, ‘data_source’, ‘gridsize’, ‘big_gridsize’

If a match is not found, the memoizing decorator calls the underlying function and stores the output both on disk and in memory.

If a match is found, the decorator reads the memoized function output from disk or memory.

Chatgpt-4 output, early May:
pastebin com/2Hy2jxfD

Chatgpt-4 output, August 6:
pastebin com/Zh5pjYNW

Chat links:
chat openai com/c/8273412b-f3fb-405c-a7a4-c0466bb43b04
chat openai com/c/45262c73-1e55-4e2f-a1b6-9e5822fa9bbf

The May version works perfectly out of the box and does exactly what I asked for. The August version is completely wrong, both conceptually (what chatGPT is trying to do vs. what I asked for) and in the code implementation.
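For readers who cannot open the links, here is a minimal sketch of the behavior the prompt describes. It is not either ChatGPT output. The names `CFG_KEYS`, `cfg_hash`, `memoize_on_cfg` and `cache_dir` are hypothetical, passing `cfg` to the decorator is an assumption, and so is the choice of `pickle` for disk storage. Objects exposing a `.numpy()` method (as eager TensorFlow tensors do) are converted before storing, so cached results come back as numpy arrays rather than tensors.

```python
import functools
import hashlib
import json
import os
import pickle

# Keys listed in the prompt; only these influence the cache key.
CFG_KEYS = ('offset', 'N', 'bigoffset', 'sim_nphotons', 'nimgs_train',
            'nimgs_test', 'data_source', 'gridsize', 'big_gridsize')


def cfg_hash(cfg):
    """Hash the values stored under CFG_KEYS (hypothetical helper)."""
    selected = {k: cfg[k] for k in CFG_KEYS}
    payload = json.dumps(selected, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode('utf-8')).hexdigest()


def _to_storable(obj):
    """Convert tensors (anything with .numpy()) so the result can be pickled."""
    if isinstance(obj, tuple):
        return tuple(_to_storable(x) for x in obj)
    if hasattr(obj, 'numpy'):
        return obj.numpy()
    return obj


def memoize_on_cfg(cfg, cache_dir='memo_cache'):
    """Memoize a function on the selected cfg values, in memory and on disk."""
    def decorator(func):
        memory = {}

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # cfg is hashed at call time, so later changes to cfg are seen.
            key = f"{func.__name__}_{cfg_hash(cfg)}"
            if key in memory:
                return memory[key]
            path = os.path.join(cache_dir, key + '.pkl')
            if os.path.exists(path):
                # Match found on disk: read the stored output.
                with open(path, 'rb') as fh:
                    result = pickle.load(fh)
            else:
                # No match: call the function and store the output on disk.
                result = _to_storable(func(*args, **kwargs))
                os.makedirs(cache_dir, exist_ok=True)
                with open(path, 'wb') as fh:
                    pickle.dump(result, fh)
            memory[key] = result
            return result
        return wrapper
    return decorator
```

Note that, as in the prompt, the lookup depends only on `cfg` and not on the function's arguments.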

Would love to see a response from OpenAI that doesn’t involve gaslighting its users.


After a couple of days of normal functioning, it looks like it’s back to the poor quality responses with the “Certainly!” garbage again. It’s like I’m getting a different model every few days, and if it continues like this I’m probably going to have to stop paying. If they just want ChatGPT to be good enough for birthday messages and corporate letters, I can just use the free version or Bing, since they do those adequately enough.


Just a ChatGPT problem. Most of these problems can be fixed by just using the API. But then they would have to deal with token management and see how difficult it can be to maintain context.
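To illustrate what "token management" involves when you keep context yourself, here is a rough sketch that trims a chat history to a token budget. The function names are hypothetical, and the 4-characters-per-token estimate is a crude assumption, not a real tokenizer. The messages are assumed to be dicts with `role` and `content` keys.

```python
def estimate_tokens(text):
    # Crude assumption: roughly 4 characters per token.
    return max(1, len(text) // 4)


def trim_history(messages, budget):
    """Keep system messages plus the most recent messages that fit the budget."""
    system = [m for m in messages if m['role'] == 'system']
    rest = [m for m in messages if m['role'] != 'system']
    used = sum(estimate_tokens(m['content']) for m in system)
    kept = []
    # Walk backwards from the newest message and stop once the budget is hit.
    for m in reversed(rest):
        cost = estimate_tokens(m['content'])
        if used + cost > budget:
            break
        kept.append(m)
        used += cost
    return system + list(reversed(kept))
```

Anything dropped this way is simply forgotten by the model, which is one reason maintaining context over a long coding conversation is hard.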

I appreciate that it is difficult to maintain context, but that is not justification for worsening product quality.

Also, this paper was tested against the API…

You’re right. I’m trying to say that these difficulties are hard to resolve. So many people use GPT for so many different reasons in so many different ways.

I have no doubt that they’re doing their best to evolve GPT. Day by day may be weird but month by month has been incredible.

That paper is whack.

Most of my experience has been with ChatGPT (although I have used the API some). Anecdotally, the responses I am getting from GPT-4 (accessed through ChatGPT) have gotten worse over the past 2 months or so (despite the rollout of some cool new tangential features like the code interpreter and custom instructions).

For reference, I have been using ChatGPT for >6 months on an almost daily basis and have had probably 100s of conversations with ChatGPT.

Also, what is your justification that the paper is “whack”?

Can you share references or citations to where the paper has been discredited?

To me the paper is nitpicking on issues that exist in all LLMs with proper alignment and doesn’t reflect anything that they’re good at, or common use cases.

The fact is that you feel like OpenAI is gaslighting. Besides being ridiculous, why would they purposely deteriorate their product?

If you have problems with ChatGPT then use GPT in the API. I don’t think there’s anything else to say.

What issues have you noticed?

Why would they purposely deteriorate their product? Because most businesses would be looking to find an optimal balance between how much money they can get for the model and how much money it costs to run the model. Perhaps they made a business decision that they make more money by providing a lower quality model? Or perhaps some new changes to the GPT-4 model have caused regressions?

Why do I feel like they are gaslighting? Because they are not substantially responding to the criticisms leveled by the paper and anecdotal observations by users. In Peter Welinder’s tweet, he seems to be starting from a baseline assumption that “the users must be wrong” rather than digging into the criticisms further.

Also, I think it is tone-deaf to tell users who spend $20 per month on ChatGPT Plus (and are within their messaging limits) to use the API instead of the service they are already paying for. API usage should primarily be for research and integrating GPT-4 into applications.

Issues I have anecdotally observed include:

  • decreased code quality
  • decreased quality of problem solving

I hear that. Business is business and it usually sucks.

God. Yes. I completely agree. That tweet is so dismissive it’s frustrating. I think there is some truth in that people notice the cracks the more they use a product, but to me there are some serious issues with ChatGPT, mainly with context. It was “throw neurons at the problem”, and now it’s “throw tokens at the problem”.

If you want consistency then use the API. I know it sounds tone deaf but it’s the truth. I’m also very frustrated by these constant updates and lack of true change logs. It’s almost insulting. It’s obvious that they are doing much more than they say.

Hmmm… This is hard. Personally, I have found GPT-4 to be improving in both of these qualities. I have noticed that it fails more often when it comes to remembering things, but I usually do single conversation pairs and stick with simple functions.

Yes, I have the same experience. The 7/20 release was very poor quality. Then for a few days before 8/3 release, it was much better. Now it is back to being basically unusable. It would not be that hard to let the users choose the model they want to use. They already do this with 3.5/4. Just add the ability to select different versions of 4 and put different usage limits on. Give me back the lower usage limits and a good model that can actually write/comprehend code!
