Declining Quality of OpenAI Models Over Time: A Concerning Trend

Yes, I agree. I'm not in disagreement with you. When using the API as part of a development project, at least now with JSON mode, breaking things down into simple JSON objects that you can chain together is the way to go. I agree that it feels weird to underuse the model's abilities, but back around GPT-3, before JSON mode, it was a nightmare to even attempt to get valid JSON without HEAVY DUTY output cleanup. So even though it could technically sometimes output valid JSON, or close to it, I backed off a bit and just had it output comma-delimited values with the fields defined in order: simpler, no syntax for it to mess up. (Living on the cutting edge: that edge is a sharp double-edged sword!)
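A minimal sketch of that comma-delimited approach, with invented field names (the original post doesn't show its actual schema):

```typescript
// Hypothetical sketch: the prompt defines the field order up front, and the
// code splits the model's one-line output instead of parsing JSON.
// Field names here are invented for illustration.
interface Product {
  name: string;
  price: number;
  inStock: boolean;
}

// The prompt would say something like:
// "Output one line: name, price, in_stock (true/false)".
function parseLine(line: string): Product {
  const [name, price, inStock] = line.split(",").map((s) => s.trim());
  return {
    name,
    price: Number(price),
    inStock: inStock === "true",
  };
}

// Example model output:
const product = parseLine("Widget, 9.99, true");
```

The obvious trade-off is that a field containing a comma breaks the split, which is part of why JSON mode made this workaround unnecessary.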


Yep, I'm still extracting field values one by one and then building my objects in code… No way I let the LLM output JSON for a data extraction task. It cannot guarantee the 100% accuracy I have now.
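The one-field-at-a-time approach might look something like this sketch, where `askModel` is a stand-in for a real API call and the field names are invented:

```typescript
// Hypothetical sketch of one-field-at-a-time extraction: each field is a
// separate, narrow model call that returns a bare value, and the object is
// assembled in code. No JSON for the model to get wrong.
type AskModel = (prompt: string) => Promise<string>;

async function extractInvoice(text: string, askModel: AskModel) {
  // One question per field; each answer is a plain string.
  const number = await askModel(`Invoice number in: ${text}`);
  const total = await askModel(`Total amount (digits only) in: ${text}`);
  return { number, total: Number(total) };
}
```

This costs more calls per document, but each answer can be validated independently before the object is built.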

I'm also seeing a big step down over the last week. Files failing to be read. Syntax errors generated in PHP code. Constant mistakes, even when it has the context a few messages before.

Most recently, it had just written the PHP delete API call and chosen the return structure, then in the next message it completely made up a new structure to handle the reply in the Angular code.

It has renamed variables at the top of a reply and forgotten that it has done so by the end.

And it has actually done tons of stupid things, which I logged from a recent session (Really low quality php code coming out of 4o today).

And I feel like it's doing improv recently. "Yes, and…" rather than actually telling me the correct answer, it seems to try to make my answer correct, even if it means twisting the code around in wrong ways.

It feels like half of the replies are "Yes, sorry, you are correct…"

It's nice to feel smarter than the AI sometimes, but it's getting draining now. I feel like I'm marking homework, not writing code.

And actually, the final reason that made me come looking on the forums today is that for the last few weeks it's been a terrible UI experience on top.

The browser tab locks up for either ages, or AGES.

The send button is ghosted out even though it's not currently working on a reply.

The reply comes back from GPT, but it's empty.

Pressing regenerate sometimes works, refreshing the browser tab sometimes works, and sometimes I just need to leave it and start working on the problem myself, then come back once it has recovered.

It seems I am ending each work session now looking for others with the same issues, or reporting things on this forum.

Case in point: the browser is not responding on the current conversation, I cannot post anything new, and I tried to refresh and it's been stuck on this for ages:

Before semi-loading the conversation, then going back to locked up, with all the messages as empty bubbles.

And here is an example of it just not being good:

I’ve started a new conversation, and given it the exact context it needs.

I'm already having to get it to fix mistakes where it used subscribe without good purpose.

It fixed that, but still got the processing of the API reply wrong when I gave it the snippet.

I pointed that out and it then managed to process the API reply correctly.

But the updated code it generated leaves bits of destroy$ hanging around, which is not used anywhere now.

It's not exactly huge or complex tasks that I'm asking of it (compared to what it is capable of; despite my complaints, I am still in total wonder at the fact that this even works at all).

Some fails as I'm attempting to continue:

It's basically failing at every stage and I have to correct it.

and:

and that time it managed to generate the code!

Update: Later on, it turned out that the whole structure it had suggested was wrong after all that, so every step of it was a total waste of time.

It knew it was using an Angular HTTP request, but then had me writing code for the JS Error class for hours, before finally conceding that we should be returning the actual HTTP response object, and that Angular automatically throws for any non-2xx status.
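For context on that last point: Angular's HttpClient delivers any non-2xx response down the error path as an HttpErrorResponse, so the success callback only ever sees 2xx bodies and a hand-rolled Error class is unnecessary. A plain-TypeScript stand-in for that status check (not Angular's actual implementation):

```typescript
// Sketch of the behaviour described above. In Angular, a non-2xx status
// surfaces as an HttpErrorResponse in the subscriber's error handler;
// this stand-in mimics the check with a plain thrown Error.
interface FakeResponse {
  status: number;
  body: unknown;
}

function deliver(res: FakeResponse): unknown {
  if (res.status < 200 || res.status >= 300) {
    throw new Error(`HTTP ${res.status}`); // error path, like HttpErrorResponse
  }
  return res.body; // success path: only 2xx bodies reach here
}
```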

So it has been utterly incapable of writing code recently in PHP or Angular without being led by the nose, down to the point of spelling out the actual step-by-step logic and telling it which specific classes to use. You cannot just speak to it in English any more, most of the time.

All I was trying to do was write the delete feature in the PHP API, then consume it and handle the errors in Angular, and it took hours and hours of back and forth. End update.

Although it is going hyper-detailed on comments that just repeat in plain English what the line of code does, or worse, talk about the changes it has made since the last version. I've tried begging it not to, setting a memory, setting it in the pre-text thing, but it just ignores me and floods the code with comments, which I have to delete, or I regenerate the final working code "again without comments".
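The kind of comment noise described above looks something like this (an invented example, not from the actual session):

```typescript
// Invented example of the comment style complained about: each comment
// restates the line or narrates the diff, adding nothing.
function totalNoisy(prices: number[]): number {
  // Initialise the sum to zero
  let sum = 0;
  // Loop over every price in the array (changed from forEach last time)
  for (const p of prices) {
    // Add the price to the sum
    sum += p;
  }
  // Return the sum
  return sum;
}

// The same function as it should be delivered:
function total(prices: number[]): number {
  let sum = 0;
  for (const p of prices) sum += p;
  return sum;
}
```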

Like others, I don’t have specific benchmarks, and I don’t think they are needed.

I doubt this is a surprise to OpenAI. Like other commenters have said, I think it's just them trying to optimise for lower cost per query at the expense of quality.

So I am just attempting to make some noise, which everyone should do when they experience the degradation going too far, so they can find the balance.

NEXT DAY
Another day of absolute pain with this AI. I just wanted to add tooltips to some buttons.

It's on its fourth attempt now, after failing to roll its own, then failing to get Angular Material working, then going off on a tangent with tippy.js while forgetting everything about the project again and just giving some general setup instructions unrelated to Angular. Now I've led it by the nose to the npm library it should use, and even with that it's messing things up.

It is producing syntax errors again, by putting HTML comments in the middle of markup attributes:

Since it first started being lazy the other day, which I mentioned in the other linked thread and had never seen before, it has gone absolutely through the floor into unusable territory: markedly briefer replies, syntax errors in code, extreme forgetfulness (even in the same reply, with changes that it had initiated), and highly limited reasoning (getting stuck, unable to solve things, going round in loops, making mistakes in every single answer, guiding me towards incorrect solutions).

It is cooked. I've actually got the tab open now to get started with Claude. I've been sticking with ChatGPT rather than being distracted by the various AIs, but it's nearly the end of the weekend now and I've got a fraction done of what I planned to complete. Whatever has gone wrong with it, I can no longer get it to produce anything of merit.

I saw your post because I'm experiencing the same issues :sob:
Irrelevant responses, answers based on outdated guidelines, and inconsistency. Working with code under these conditions is a nightmare.