Has There Been A Recent Decrease In GPT-4 Quality?

I’ve been using ChatGPT for quite a while now, and I’ve been a ChatGPT Plus subscriber since GPT-4 was released.

Over the past few days, it seems like GPT-4 is struggling to do things it did well previously.

I use GPT-4 to augment long-form content analysis and creation, and in the past, it seemed to understand my requests well. Now it seems to be losing track of information, giving me wrong information (drawn from the material I gave it), and misunderstanding what I’m asking far more often.

When I use the “thumbs down” button, the new answer it gives me is the kind of answer I was looking for at least half the time (which is great, I love that feature), but it just seems like there was a sudden drop, and GPT-4 currently feels a lot more like GPT-3.5.

Has anyone else noticed this, or is it just chance that I’m running into this issue with what I’m doing right now?


I feel the same!!! It’s been quite annoying. I’ve been noticing this for a few weeks now. I wonder what the cause is.


Agreed. Have a look at what I wrote about the quality in my post yesterday. Pretty much the same experience. I use it for coding, and I can really see the difference.


Hi, good evening.
I’m running into a similar problem when I need the tool to proofread texts, for example. Even though there are spelling and grammatical errors, the tool doesn’t correct them, because it considers them correct.


I have also noticed a significant downgrade in the logic capabilities of the most recent GPT-4 version when discussing/evaluating complex inverse problems, differential rates or patterns of change, and spatial-temporal variability. I only rarely received erroneous replies before the update, but now I have to double-check all output (e.g., double-negative conditions sometimes don’t get appropriately translated into a positive condition). I see these errors as more GPT-3.5-like than prior GPT-4 levels of reasoning. Very frustrating.


Same thing. I used GPT-4 to write/correct spreadsheet formulas, but even though I noted in the original prompt that I was using WPS Office (which uses semicolons as argument separators), GPT-4 gave me several wrong solutions in a row, and I had to ask it to correct each one. Even if GPT-4 didn’t notice the use of semicolons from the first prompt, it should be aware of it after the first error/correction and not make the same mistake again, I think. Am I right?
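For anyone unfamiliar with the issue: WPS Office (like many non-US Excel locales) expects semicolons between function arguments, so a comma-separated formula that is valid elsewhere is rejected there. A minimal illustration (the formula itself is just a made-up example):

```
Comma-separated (standard US Excel locale): =IF(A1>10, "high", "low")
Semicolon-separated (WPS Office):           =IF(A1>10; "high"; "low")
```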


I think we all agree that the May 12 version was a major downgrade from May 3. I thought it couldn’t get any worse… well, that was before I tried the latest “May 24 Version” which appears to actually be a further step in the wrong direction!

I am observing an even greater loss of reasoning between the May 12 and May 24 versions! Hard to overstate how frustrating this is! The May 24 version in many respects appears to follow instructions more poorly than GPT-3.5 at this point and consistently fails to solve basic logic problems.

We should really come together to push for the ability to select previous versions (i.e., a toggle for using the May 3 version vs. the current release). I started a thread on this a couple of days ago, but no replies.


Yes, obviously, obscurely and not very honestly…

Please follow through here in this thread. Let’s speak with one voice, because I don’t think anyone is even listening, and they are going from bad to worse with every release they make nowadays!

I feel the same. More API request denials, poorer results, and what not… the current version feels very different from the one at launch. I guess they’re compromising on quality in order to accommodate more customers!


Yes, very noticeably. I wish OpenAI would say something, as saying nothing makes us think they’re up to no good. The current GPT-4 is disappointing; it’s like driving a Ferrari for a month and then suddenly it turns into a beaten-up old pickup. I’m not sure I want to pay for it over GPT-3.5.


As OpenAI does not respond, I asked their AI :innocent:

User: Between the March 2023 version and the May 2023 version, the features of chatGPT-4 have drastically evolved. The model is noticeably faster, but it makes gross reasoning errors and quickly loses track of the conversation. It regularly forgets instructions and its understanding is less subtle. At the same time, the number of subscribers has significantly increased, straining the system. The limit on the number of messages has been reduced. What could be the causes of these developments? Give me 5 reasoned propositions.


  1. Modification of the learning algorithm: The researchers may have modified the learning algorithm between these two versions in order to improve the speed of the model. This could have led to undesired side effects such as reasoning errors and less subtle understanding. For instance, they could have reduced the number of layers in the model or the number of neurons per layer, which could make the model faster, but less accurate and less capable of maintaining the conversation’s context.
  2. Infrastructure issues: The increase in the number of subscribers might have strained the hardware and software infrastructures that host and execute the model. This could lead to reductions in performance, memory management problems, and thus a reduction in the number of possible messages.
  3. Changes in the training data: If the training data used for the model changed between March and May, this could also explain some of the observed modifications. For instance, if the training data was noisier, less varied, or contained more errors, this could be reflected in the behavior of the model.
  4. Optimization for specific metrics: The developers might have optimized the model for specific metrics that favor speed over accuracy or consistency. For example, they could have prioritized minimizing response time over the accuracy of the responses.
  5. Modifications to the model’s architecture: Developers might have modified the architecture of GPT-4 between these two versions. For instance, they could have changed how attention is calculated, or modified the type of layers used. These modifications could improve certain aspects of the model (like speed), but degrade other aspects (like accuracy or the ability to follow the context of a conversation).

User: Now give me 5 reasons why these modifications were made.


  1. Increase operational efficiency: Modifications that make the model faster can reduce operating costs. This could be particularly relevant if the number of users of the model has increased, as you indicated.
  2. Meet user demand: Users might request faster responses, even if this might lead to a slight decrease in the quality of responses. Indeed, in many use cases, response speed can be more important than accuracy.
  3. Experimentation and learning: Developers might make modifications to experiment and learn how they affect the model. For instance, they may want to understand how modifications in the learning algorithm or the model’s architecture can affect speed and accuracy.
  4. Improve scalability: With an increase in the number of subscribers, it’s important that the model can be deployed at a large scale efficiently. The modifications might have been made in order to improve the system’s scalability.
  5. Respond to resource constraints: Hardware constraints, like storage capacity or available bandwidth, may require modifications to the model. For example, if the model generates fewer texts, this could reduce the load on the servers and allow more users to use it simultaneously.

User: At the same time, there has been no communication about what has been done and why, the quality of responses has drastically reduced, but the price of the subscription is still the same. What to think about this situation?


  1. Transparency: Transparency is an important principle in the field of AI. If significant modifications are made to a service, especially those that directly affect the user experience, it is generally good to proactively communicate them. This can include an explanation about what was done and why.
  2. Quality of service: If the quality of the service has decreased, that is a legitimate concern. Users generally expect the service to be maintained at a certain quality level, if not improved, especially if the cost remains the same.
  3. Value for the user: Even if the price remains the same, if the quality of the service has decreased, it could mean that users are getting less value for their money. This can raise concerns and possibly lead users to look for alternatives.
  4. Expectation management: If the changes have led to a drop in model performance without prior communication, it can create disappointment and loss of trust. It’s important to manage user expectations to maintain a good relationship with them.
  5. Opportunity for feedback: If users are not satisfied, they should have the opportunity to provide feedback to the company. This feedback can help the company understand user concerns and make improvements.

User: Why has there been no communication on this subject?


The lack of communication could be due to a number of factors:

  1. Prioritization: The company might be focusing on internal operations and modifications and might not have prioritized communicating about these changes to its user base.
  2. Lack of clarity: The company might be unsure about the impact or permanence of these changes, and therefore hesitant to communicate about them until there is more certainty.
  3. Oversight: It could be a simple oversight or error, where the importance of communication wasn’t fully recognized or was underestimated.
  4. Strategic decision: It might be a strategic decision where the company feels that discussing these changes might lead to more confusion or dissatisfaction among users.
  5. Concerns about perception: The company might be concerned about how this news would be received, particularly if the changes are perceived as negative.

These are just some possible explanations, and without more information, it’s difficult to know the exact reasons for the lack of communication.


Nice! I’m surprised it didn’t respond “As an AI language model, I… sorry, what was the question again?”


Ergh… now GPT-4 is just straight up forgetting to add {} braces in basic if/then PHP code! :frowning:

Please revert whatever changes you’ve made, OpenAI; you’ve lobotomized it.

Pure speculation, but this could be a result of them aligning ChatGPT to function better with plugins. Or just alignment in general.

One of the most ridiculous technical factors in plugins is the sheer size of the YAML files, which presumably need to be attached in full with each message. Seriously, I think anyone who has spent time trying to use GPT for database look-ups eventually realizes that full database schemas are not the way to go, and that it needs to be more granular.

I would love to see a new GPT model specifically for database communication. I have been tracking Gorilla for this reason. If OpenAI released a store for fine-tuned models, I bet the community could make a very powerful, domain-specific API model. ChatGPT used to be so much fun: very creative, and a joy to speak to. Now it seems like it needs to be put into a retirement home.

I always cringe sending JSON to GPT and seeing how much of the token count is eaten by whitespace and \n.
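For what it’s worth, a lot of that overhead is avoidable before the payload ever reaches the model. A minimal Python sketch (the payload here is just a made-up example) that compacts pretty-printed JSON:

```python
import json

# A pretty-printed payload, as you might get from a file or another API
pretty = json.dumps({"user": "alice", "scores": [1, 2, 3]}, indent=4)

# Compact form: no spaces after separators, no newlines
compact = json.dumps(json.loads(pretty), separators=(",", ":"))

print(len(pretty), len(compact))  # the compact form is noticeably shorter
```

The `separators=(",", ":")` argument to `json.dumps` is what drops the default space after each comma and colon; round-tripping through `json.loads` also removes the indentation and newlines.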

I believe using GPT-4 via the API should return better results. My understanding is that the ChatGPT version is a fine-tuned variant of GPT-4, but I’m not sure.