Right, and sometimes it gives the correct answer with a bad explanation. It seems to help when you ask it to make a minute-by-minute board. That wasn’t necessary in March.

There’s another interesting case: the quality of the GPT-4 answer is about the same as before, but GPT-3.5 has improved (the March version was unable to simulate the right expectation):

GPT-4

GPT-3.5

I have been experiencing the same. It’s very frustrating, especially after having had the chance to work with GPT-4 and get excellent results. Now the answers lack context, are significantly shorter, and even the grammar and the way of replying seem less human-like and more bot-like. I’m paying for the Plus version and it still isn’t working as it used to. Furthermore, the message cap is ridiculous when you now need to explain to ChatGPT 3 or 4 times what needs to be done, so those 25 messages run out very quickly because of ChatGPT’s own lack of understanding or its inability to stick to what you tell it to do.

8 Likes

Agreed - the degradation in intelligence has been VERY noticeable compared to what it used to be. I first noticed it almost a month ago; however, more recently it’s become much worse. At the time, others told me they think it varies depending on load, but I don’t know.

And with the 25-message cap, having to repeat questions so often as it keeps making mistakes really uses up the cap quickly. I often find myself using GPT-3 70% of the time, wondering what I’m paying for.

7 Likes

Hi @logankilpatrick
Appreciate you and the team looking at this.
It’s really pretty obvious if you’ve used both models; some samples are shared in the links above.

Thank you!

1 Like

GPT-4 is in good shape today. I compiled a few logic questions to test its capabilities; GPT-4 got 11/11 correct, while GPT-3.5 got 5/11 correct. I will try to keep these records for a small long-term follow-up :wink:
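In case anyone wants to run the same kind of follow-up, here’s a loose sketch of the record-keeping I mean, assuming the openai Python package (pre-1.0 ChatCompletion API). The sample question, the substring “grading”, and the file name are all just illustrative:

```python
# Loose sketch: run a fixed set of logic questions against both models
# and append the date and score to a CSV, so regressions show up over time.
import csv
import datetime

import openai  # pre-1.0 API; reads OPENAI_API_KEY from the environment

QUESTIONS = [
    ("A bat and a ball cost $1.10 in total. The bat costs $1.00 more than "
     "the ball. How many cents does the ball cost? Answer with a number.", "5"),
    # ... the rest of the question set would go here
]

def score(model: str) -> int:
    correct = 0
    for question, expected in QUESTIONS:
        resp = openai.ChatCompletion.create(
            model=model,
            messages=[{"role": "user", "content": question}],
            temperature=0,
        )
        answer = resp["choices"][0]["message"]["content"]
        if expected in answer:  # naive check; fine for a personal log
            correct += 1
    return correct

with open("model_scores.csv", "a", newline="") as f:
    writer = csv.writer(f)
    for model in ("gpt-4", "gpt-3.5-turbo"):
        writer.writerow([datetime.date.today(), model, score(model), len(QUESTIONS)])
```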

1 Like

Not for me unfortunately :frowning:
It really makes the simplest mistakes and doesn’t follow the rules I set, or you have to spend prompt after prompt to get it to do what you want.
The same sample chat I used before is still exactly applicable, and 3.5 looks better to me; and when I say better, I mean in terms of following instructions and following through on what it says.

The GPT-4 model says lots of things but basically doesn’t follow my instructions fully, or even the plan that it itself laid out.

I literally just ran out of capacity using the same solar-system orbit example; a complete waste for me at this point.

It works well for simple tasks, but once you want it to do something meaningful, I don’t think it’s able to follow through at all.

I couldn’t get it to improve using any prompting technique :frowning:. I also don’t think they’ve done anything to the model, since we’re still on that May 24th version, unless model updates aren’t reflected in the web interface.

But net net, it’s good for hobby use and poor at real tasks. And I’m not talking about AutoGPT stuff; I’m talking about a well-thought-out conversation between me and the AI where I try to steer it to do what I want.
My thinking is really to give up on this thing and wait until it matures or a different model is released, but with all the hype being made, I’ve basically come crashing back to earth now, unfortunately :frowning:

4 Likes

I agree. I noticed a stark difference today as well, enough to create an account and reply to you. We talked about it on the Discord server using the apple test; even if you don’t know what that is, what matters is that we saw a stark difference: we consistently got 6/10 instead of the 10/10 it started at. The API is the only workaround, apparently; it’s just infuriating that they don’t announce this. We need to be more vocal. So much for being OpenAI: they don’t announce making it worse for Plus users.

6 Likes

Lol, you made my day with your comment, and I can relate to what you’re saying:
I also looked into this on Discord.

That also happens to me pretty frequently: I tell it not to do something, and it acknowledges it back; then I ask it to go ahead as per the plan and it does exactly the opposite. :rofl:

At one point, before sharing became available, I gave it some simple Python code to improve, and it came up with 5 simple but creative improvements, tbh. Then I asked it to go ahead and write them; it re-wrote the whole thing, but instead of adding the code, it just added new functions with single-line comments saying that we need to implement this later :rofl:

5 Likes

No noticeable performance decrease for me in GPT-4 Plus or in the API.

1 Like

What kind of tasks do you use it for?
If you use it for simple stuff, this thing is a Ferrari.
If you use it for serious stuff, it’s an ’80s-model Lada.
It would be great if you could share a chat, if possible.

Please note that when I say serious stuff, I mostly mean coding in Python, one of the easiest languages in the world, and only small functions or simple scripts; nothing out of the ordinary.

2 Likes

I’ve been trying to use ChatGPT-4 today, but it won’t even generate responses for me.

1 Like

I find this to be the case: the longer the chat (or the more tokens), the worse the experience.

For personal use I am always creating new threads. However, that’s not as easy for API/plugin use.

1 Like

Thanks, I totally get that.
And that was the case before the Plugins model went to Beta: it used to work fantastically well until the conversation grew, and then you would re-create a new one, exactly as you describe. But it was really doing an incredible job, and I don’t mind that; in fact, I connected it to a vector DB and can save and restore the gist of any previous interaction with it (rough sketch below). So we’re good on that.
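For the curious, the vector-DB setup is nothing fancy. Here’s a minimal sketch of the idea, assuming the openai Python package (pre-1.0 API), the text-embedding-ada-002 model, and numpy; the helper names and in-memory “DB” are just stand-ins for a real vector store:

```python
# Minimal sketch: embed the gist of each finished conversation, then
# retrieve the closest gists to seed a fresh chat. The in-memory list
# stands in for a real vector DB; helper names are made up.
import numpy as np
import openai  # pre-1.0 API; reads OPENAI_API_KEY from the environment

memory: list[tuple[str, np.ndarray]] = []

def embed(text: str) -> np.ndarray:
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=text)
    return np.array(resp["data"][0]["embedding"])

def save_gist(summary: str) -> None:
    """Store the gist (a short summary) of a conversation."""
    memory.append((summary, embed(summary)))

def restore_gists(topic: str, top_k: int = 3) -> list[str]:
    """Return the stored gists most similar (by cosine) to the new topic."""
    q = embed(topic)
    scored = sorted(
        memory,
        key=lambda item: float(np.dot(q, item[1]))
        / (np.linalg.norm(q) * np.linalg.norm(item[1])),
        reverse=True,
    )
    return [text for text, _ in scored[:top_k]]
```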

What we’re talking about here is a degradation from the get-go, for simple tasks and at the very beginning of the conversation. Please take a peek at the chat snippets I shared earlier in this thread.

2 Likes

I find it quite concerning that we haven’t heard back from “OpenAI” regarding these issues.

7 Likes

Yeah, like I said, it’s working fine for me.

1 Like

I have also done the “apple test” and other similar ones, and the result is indeed distressing. The most troubling part of this story is OpenAI’s silence. API users don’t seem concerned about the problem; with their payment model being per token, that’s understandable. ChatGPT users are the ones being harmed: the flat-rate payment model encourages reducing computational power for economies of scale. API users are also probably a more “professional” population that shouldn’t be upset, while ChatGPT users are perhaps more individual consumers being treated like cash cows. The fact that there has been a modification to the model is impossible to deny (those who do are either dishonest or blind): the response speed is significantly faster. But at what compromise?

3 Likes

I also noticed the same thing for a couple of weeks now. At times, it feels just like I am using GPT-3.5. And there are a lot of other little things that keep getting downgraded, like the character limit (or word count) per prompt for GPT-4 now being lower than for GPT-3.5, which is mind-boggling considering we are paying for GPT-4. I’ve decided to just cancel my subscription because it’s no longer worth it and the plugins have been lackluster. Maybe I will buy it back in the future when there are features that actually justify the cost.

4 Likes

Another funny one; I call it the “caveman test”, inspired by the apple one :wink:

3 Likes

So if I can add a couple of things here:

  1. These are stochastic models, so even with a temp of 0 there’s a certain amount of randomness at play. You could call the model 10 times with the same prompt and get similar answers 9 of the 10 times, but 1 of them will be just wrong (see the sketch after this list).
  2. They’re retraining these models on a daily basis to incorporate feedback from a mix of humans and GPT itself. The goal of this feedback is to improve the model, but it means that for some answers or tasks they might regress, which is what you’re seeing.
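To see point 1 for yourself, here’s a rough sketch, assuming the openai Python package (pre-1.0 ChatCompletion API); the prompt is just a toy example:

```python
# Rough sketch: send the same prompt 10 times at temperature 0 and count
# the distinct answers. Even at temp 0, the tally is not guaranteed to
# come back 10/10 identical.
from collections import Counter

import openai  # pre-1.0 API; reads OPENAI_API_KEY from the environment

PROMPT = "If I have 3 apples and eat 2, how many are left? Answer with a number."

answers = []
for _ in range(10):
    resp = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": PROMPT}],
        temperature=0,
    )
    answers.append(resp["choices"][0]["message"]["content"].strip())

print(Counter(answers))
```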

I personally think they need to introduce checkpoint versions of the models that aren’t being retrained, so that people can build actual applications on top of this stuff (it’s too much of a moving target right now), but that’s just my opinion.

3 Likes

There’s a big difference. It’s just not as good, and definitely not as thorough. There’s a Chrome extension, “Superpower ChatGPT”, which allows you to use multiple ChatGPT models. I compared an older one called chatgpt-4, with a 32k token limit, against the current chatgpt-4. - Link to text file

1 Like