Has anyone noticed a GPT-4o quality drop in the last few days?

I've noticed a degradation of gpt-4o within the last several days, especially when writing Python code. It was doing much better before; now it makes quite silly mistakes, can't find errors in the code, etc.
I don't think it's related to peaks in request volume or a lack of resources, as I don't notice any degradation speed-wise.
Is there a way to tell whether there were any updates to the model, perhaps a new sub-version, etc.?

1 Like

Right now it's at 3.5 level or below… awful. Absolutely poor and not worth the subscription atm: producing pseudocode, misleading with false statements even on basic TS requests. Do something, devs - I'll check out other solutions in the meantime.

1 Like

Are you using the API or a Plus account? I have a very similar feeling. I am using a Plus account, but the quality seems OK on Poe. Very strange.

1 Like

Plus account. I'm aware that its primary purpose is to skip the high-demand waiting time, but the overall quality of the answers in my daily workflow has decreased to a level where it's not even remotely helpful to use at all. For example, in Vue.js it keeps switching between the Composition and Options APIs for no reason, producing pseudo TS interfaces and randomly changing class names and props.

1 Like

Yes, my account had around 3 days of low quality on 4o, but the quality seems to be coming back now.

1 Like

I've noticed it too, and it's not just slower performance but how it responds. Lately it's been acting less intelligent than what I've seen before and making more mistakes for some reason. I reverted to using gpt-4o mini instead and got better results, though it varies. Next time you start seeing that behavior, try the mini.

1 Like

Oh, me too 😭 The GPT-4o responses are short, and it's less smart, like I'm back on 3.5 lol 😮‍💨

1 Like

Guess I've been posting in the wrong thread. I've added a huge amount of feedback about this very issue on this one here:

It has become almost unusable.

1 Like

Tbh I'm still stuck on gpt-4-turbo for a very critical categorisation task that GPT-4o has never been capable of doing.

That’s a pain because it is relatively expensive.

1 Like

Have you tried a fine-tuned gpt-4o yet? Pricing-wise, that would still be much cheaper than gpt-4-turbo.

2 Likes

No, I'm lazy. Haha. But that's not a bad idea. Feed it gpt-4-turbo-generated outcomes a few times?

1 Like

Yeah, exactly. That would do the trick. I don't know how complex your classification task is - this will affect the number of examples needed. But you can start with 30-50 examples to get a feel for it, then increase from there.

The fine-tuning for gpt-4o (i.e. the training, not the consumption of the fine-tuned model) is still free until the end of October - so other than a bit of a time commitment to pull the training file together, you don't have anything to lose, really :slight_smile:
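
In case it saves you some typing, here's a minimal sketch of that flow with the Python SDK: label your inputs with gpt-4-turbo, write them into a chat-format JSONL file, then start a gpt-4o fine-tune. The classification task, labels, and file name are made up for illustration; gpt-4o-2024-08-06 is the fine-tunable snapshot.

```python
# Minimal sketch: label inputs with gpt-4-turbo, write a chat-format JSONL
# training file, and start a gpt-4o fine-tune. The task, labels, and file
# name below are made up for illustration.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM = "Classify the support ticket as one of: billing, bug, feature."  # hypothetical task

def label_with_turbo(text: str) -> str:
    """Use gpt-4-turbo (the model that already works) to produce the target label."""
    resp = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": text},
        ],
    )
    return resp.choices[0].message.content

# 30-50 unlabeled inputs from your own data
inputs = ["I was charged twice this month", "The export button crashes the app"]

with open("train.jsonl", "w") as f:
    for text in inputs:
        example = {"messages": [
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": text},
            {"role": "assistant", "content": label_with_turbo(text)},
        ]}
        f.write(json.dumps(example) + "\n")

# upload the training file and kick off the fine-tune
training_file = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-2024-08-06",  # the fine-tunable gpt-4o snapshot
)
print(job.id)  # poll client.fine_tuning.jobs.retrieve(job.id) for status
```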

2 Likes

Brilliant suggestion.

Yes, time being money is a factor.

Turbo might not be cheap, but it's probably 100x cheaper than me doing the task it's currently doing almost perfectly.

(Interestingly, it beats an embedding strategy.)
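
For context, the embedding baseline was roughly nearest-centroid classification. This is just an illustrative sketch with made-up categories, not my actual setup:

```python
# Rough sketch of a nearest-centroid embedding classifier: embed a few
# labeled examples per category, average them into centroids, then assign
# each new text to the closest centroid. Categories here are hypothetical.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

seed = {  # a handful of labeled examples per category (made up)
    "billing": ["I was charged twice", "please refund my order"],
    "bug": ["the app crashes on export", "login button does nothing"],
}

# one centroid per category; OpenAI embeddings come back unit-normalized,
# so a dot product behaves like cosine similarity
centroids = {label: embed(texts).mean(axis=0) for label, texts in seed.items()}

def classify(text: str) -> str:
    v = embed([text])[0]
    return max(centroids, key=lambda label: float(centroids[label] @ v))

print(classify("why was my card billed twice?"))  # -> "billing"
```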

2 Likes

I have noticed a drop in the quality, and even more alarmingly, as of two hours ago no version of ChatGPT that I run will search the Internet live. Has anyone else lost this ability in their apps today?

1 Like

You might look in custom instructions, where there are checkboxes at the bottom of the form that someone like me might forget about before asking questions where the need for new knowledge via internet search would be obvious…

o1 models cannot search or use the other tools.

1 Like

Ironically, I have this image I took for a separate thread about code analysis. @_j, you should make your ghost say boo :rabbit: Boo!
IMG_3520

Not only this.
GPT is going haywire with a bang. Several things I've noticed so far in responses to my prompts:

  • Does not follow instructions (requires solid-to-massive explanations, which it forgets right after the next one).
  • Not sure when this was implemented (but it's a huge problem): it will browse your other conversations and assume stuff (for example, I asked it to show me an example of a tool, and after several points of argument from me… it started posting old stuff from months back, from a similar request on a totally different topic).
  • Going in random loops where it keeps repeating its own code over and over again… inside or outside the code preview window (to the point where you just simply wanna hit the X button and never return! :smiley:)
  • Its recommendations are no longer reliable… no structure, no argumentation, it doesn't accept correction requests… (sticks to the same issue, just hurrying to spit out non-functional "fixes").

I could literally point out stuff all night, so I'll stop here, but yeah, GPT is starting to lack muscle big time! If even one competitor I'm exploring gives us unlimited prompts like GPT, that will be the end of it, for me at least :slight_smile:

1 Like

Welcome to the community!

While I agree that OpenAI is going through a bit of a goldfishification phase, I think they’ve shown a lot of progress on the hallucination front. So it seems like it’s a tradeoff.

One thing that might help you is to keep your prompts/conversations as short as possible, and delete extraneous information as it becomes irrelevant (if you use the API) or simply start a new conversation when you see this happening.

It’s not super ergonomic, but curating the context gets you quite far.
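
If you're on the API, here's a minimal sketch of that kind of curation. The keep-last-N-turns policy and the budget of 6 are arbitrary choices for illustration; tune them for your workload:

```python
# Rough sketch of manual context curation over the chat completions API:
# always keep the system prompt, but drop older turns once the history
# grows past a budget. MAX_TURNS is an arbitrary number.
from openai import OpenAI

client = OpenAI()

MAX_TURNS = 6  # user/assistant message pairs to keep

def trimmed(history: list[dict]) -> list[dict]:
    """Keep the system message plus only the most recent turns."""
    system = [m for m in history if m["role"] == "system"]
    rest = [m for m in history if m["role"] != "system"]
    return system + rest[-MAX_TURNS * 2:]

history = [{"role": "system", "content": "You are a helpful coding assistant."}]

def ask(question: str) -> str:
    history.append({"role": "user", "content": question})
    resp = client.chat.completions.create(model="gpt-4o", messages=trimmed(history))
    answer = resp.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    return answer
```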

2 Likes

Thanks for the recommendations. However, this is not the issue.
As I mentioned, it's not one single problem; it's a major complex mix of them that started happening all at once.
Other AIs (which are sadly limited to a few prompts per day for free users, and 5x that for paid ones :smiley: hilarious!) aren't having even a fraction of the problems that hit GPT recently.

Don't get me wrong, I still use it and I like it… but! It's like we no longer know each other the way we did a year ago :smiley:

2 Likes

Yeah it keeps changing all the time, and that really sucks. The Azure API seems to be a little bit more stable, at least with some of the older models.

1 Like