I've noticed a degradation of gpt-4o over the last several days, especially when writing Python code. It used to do much better; now it makes quite silly mistakes, can't find errors in the code, etc.
I don't think it's related to any peak in request volume or a lack of resources, as I don't notice any degradation speed-wise.
How can I tell whether the model was updated, perhaps to a new sub-version, etc.?
Right now it's at 3.5 level or below… awful. Absolutely poor and not worth the subscription atm. It's producing pseudocode and misleading with false statements even on basic TS requests. Do something, devs; I'll check out other solutions in the meantime.
Are you using the API or a Plus account? I got a very similar feeling. I'm using a Plus account, but the quality seems OK on POE. Very strange.
Plus account. I'm aware that its primary purpose is to avoid waiting times during high demand, but the overall quality of the answers in my daily workflow has decreased to a level where it's not even remotely helpful to use at all. For example, in Vue.js it keeps switching between the Composition and Options API for no reason, producing pseudo TS interfaces and randomly changing class names and props.
Yes, my account had low quality in 4o for around 3 days, but the quality seems to be coming back now.
I've noticed it too, and it's not just slower performance but how it responds: lately it's been acting less intelligent than what I've seen before and making more mistakes for some reason. I switched to gpt-4o mini instead and got better results, though it varies. Next time you see that behavior, try the mini.
Oh, me too 😭 The GPT-4o responses are short and less smart, like I'm back at 3.5 lol 😮‍💨
Guess I’ve been posting on the wrong thread. I’ve added a huge amount of feedback on this one here about this very issue:
It has become almost unusable.
Tbh I'm still stuck on gpt-4-turbo for a very critical categorisation task that GPT-4o has never been capable of doing.
That’s a pain because it is relatively expensive.
Have you tried a fine-tuned gpt-4o yet? Pricing-wise that would still be much cheaper than gpt-4-turbo.
No, I'm lazy, haha. But that's not a bad idea. Feed it gpt-4-turbo-generated outcomes a few times?
Yeah, exactly. That would do the trick. I don't know how complex your classification task is - that would impact the number of examples needed. But you can start with 30-50 examples to get a feel for it, then increase from there.
Fine-tuning for gpt-4o (i.e. the training, not the consumption of the fine-tuned model) is still free until the end of October - so other than a bit of time commitment to pull the training file together, you don't really have anything to lose.
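In case it helps, here's a minimal sketch of that distillation loop with the Python SDK. The categories, example texts, and the exact snapshot name are placeholder assumptions - check the fine-tuning docs for the currently fine-tunable gpt-4o snapshot:

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical labelled examples, e.g. harvested from gpt-4-turbo's answers.
examples = [
    {"text": "Invoice overdue by 30 days", "label": "billing"},
    {"text": "App crashes on startup", "label": "bug"},
    # ...start with 30-50 of these, as suggested above
]

SYSTEM = "Classify the ticket into exactly one category: billing, bug, or other."

# Write the chat-format JSONL file the fine-tuning endpoint expects.
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps({
            "messages": [
                {"role": "system", "content": SYSTEM},
                {"role": "user", "content": ex["text"]},
                {"role": "assistant", "content": ex["label"]},
            ]
        }) + "\n")

# Upload the file and start the fine-tuning job.
training_file = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-2024-08-06",  # assumption: current fine-tunable snapshot
)
print(job.id)
```

Once the job finishes, you call the resulting `ft:gpt-4o...` model name exactly like any other chat model.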
Brilliant suggestion.
Yes, time being money is a factor.
Turbo might not be cheap but it’s probably 100x cheaper than me doing the task it is currently doing almost perfectly.
(Interestingly it beats an embedding strategy)
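For context on that comparison: a typical embedding strategy here is to embed a handful of seed examples per category, average them into centroids, and assign new texts to the nearest centroid by cosine similarity. A rough sketch, with made-up labels and the model choice as an assumption:

```python
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    """Embed a batch of texts with a small OpenAI embedding model."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

# Hypothetical labelled seed examples per category.
seed = {
    "billing": ["Invoice overdue by 30 days", "Refund request for double charge"],
    "bug": ["App crashes on startup", "Button does nothing when clicked"],
}

# One centroid per category, normalised so the dot product is cosine similarity.
centroids = {}
for label, texts in seed.items():
    c = embed(texts).mean(axis=0)
    centroids[label] = c / np.linalg.norm(c)

def classify(text: str) -> str:
    v = embed([text])[0]
    v = v / np.linalg.norm(v)
    return max(centroids, key=lambda label: float(v @ centroids[label]))

print(classify("I was charged twice this month"))  # -> "billing"
```

Cheap and fast, but it tends to lose to an LLM when the categories turn on subtle reasoning rather than surface wording - which may be why it loses here.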
I've noticed a drop in quality, and even more alarmingly, as of two hours ago no version of ChatGPT I run is able to search the Internet live. Has anyone else lost this ability in their apps today?
You might look in custom instructions, where there are checkboxes at the bottom of the form that someone like me might forget about before asking questions where the need for fresh knowledge via internet search would be obvious…
o1 models cannot search or use the other tools.
Ironically, I have this image I took for a separate thread about code analysis. @_j, you should make your ghost say boo. -Boo!
Not only this.
GPT is going paranoid with a bang. Several things I've noticed so far in the responses to my prompts:
- Does not follow instructions (it requires solid to massive explanations, then forgets them right after the next one).
- Not sure when this was implemented (but it's a huge problem): it will browse your other conversations with it and assume stuff (example: I asked it to show me an example of a tool, and after several arguments from me… it started posting old stuff from months back, from a similar request on a totally different topic).
- Going in random loops where it keeps repeating its own code over and over again… inside or outside the code preview window (to the point where you just want to hit the X button and never return!).
- Its recommendations are no longer reliable… no structure, no argumentation, and it does not accept correction requests… (it sticks to the same issue, just hurrying to spit out non-functional "fixes").
I could literally point out stuff all night, so I'll stop here, but yeah, GPT is starting to lack muscle big time! If just one competitor I'm exploring gives us unlimited prompts like GPT does, that will be the end of it, for me at least.
Welcome to the community!
While I agree that OpenAI is going through a bit of a goldfishification phase, I think they’ve shown a lot of progress on the hallucination front. So it seems like it’s a tradeoff.
One thing that might help you is to keep your prompts/conversations as short as possible, and delete extraneous information as it becomes irrelevant (if you use the API) or simply start a new conversation when you see this happening.
It’s not super ergonomic, but curating the context gets you quite far.
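To make the API-side version of that concrete: chat completions are stateless, so you decide which messages get re-sent each turn and can drop older ones yourself. A minimal sketch, where the turn budget and model are arbitrary placeholder choices:

```python
from openai import OpenAI

client = OpenAI()

MAX_TURNS = 8  # arbitrary context budget; tune to your task

system = {"role": "system", "content": "You are a concise coding assistant."}
history: list[dict] = []

def ask(user_text: str) -> str:
    history.append({"role": "user", "content": user_text})
    # Re-send only the most recent turns; the system message always stays.
    trimmed = history[-MAX_TURNS:]
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[system, *trimmed],
    )
    answer = resp.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    return answer
```

A fancier version would summarise the dropped turns instead of discarding them, but even this naive sliding window keeps stale details from piling up in the context.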
Thanks for the recommendations. However, that's not the issue.
As I mentioned, it's not one single problem; it's a complex mix of them that started happening all at once.
Other AIs (which are sadly limited to a few prompts per day for free users, and 5x that for paid ones, hilarious!) are not having even a fraction of the problems that hit GPT recently.
Don't get me wrong, I still use it and I like it… but! It's like we no longer know each other the way we did a year ago.
Yeah, it keeps changing all the time, and that really sucks. The Azure API seems to be a little more stable, at least with some of the older models.