Significantly degraded response quality after yesterday's outage (Dec 3, 24). Am I alone?

Do you experience worse response quality in long context-heavy conversations?

Yesterday, on December 3rd, ChatGPT experienced a partial outage. Today, I’m continuing work on the same JavaScript project, and the responses are unusable.

Specifically, the models now partially ignore instructions in a way that didn’t happen before. This appears to be happening on both models I use a lot: 4o and o1-preview.

Quick list of issues:

  • New hallucinations: Typically, the models are fairly reliable at maintaining key context (the stack, for example). Today, they list the stack and then propose obviously incompatible solutions in the same response. The simplest example is recommending Buffer on an edge runtime (see the sketch after this list).
  • Hallucinations overpower instructions: Worse, once corrected on a specific issue like the one above, the model keeps making the same mistake nearly 100% of the time. This is the first time I’m seeing this level of “reliability” from the latest models.
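
To make the incompatibility concrete, here is a minimal, hypothetical sketch (not code from my project) of the kind of suggestion I mean, next to the web-standard equivalent that actually runs on an edge runtime:

```js
// Hypothetical illustration: Buffer is a Node.js global, not part of the web
// platform, so it is undefined on edge runtimes (Cloudflare Workers, Vercel Edge, etc.).

// What the model keeps suggesting (throws "Buffer is not defined" on an edge runtime):
// const bytes = Buffer.from("hello", "utf8");

// Web-standard equivalent that works on edge runtimes and in Node:
const bytes = new TextEncoder().encode("hello"); // Uint8Array of UTF-8 bytes
console.log(bytes.length); // 5
```

The frustrating part is that the stack is right there in the conversation, and the model still reaches for the Node-only API.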

Have you experienced the same issue?
Please share prompting tricks that help mitigate this if you’ve found some already :slight_smile:

4 Likes

4o has become more stupid than before. It always creates wrong answers, even if I provide very clear information.

2 Likes

Exactly the same here. Chats that worked just fine yesterday are unusable now. Affects both old and new chats, as well as custom GPTs.

Worse performance than ever before (even GPT-3 didn’t make such mistakes). For example, today it literally claimed that sqrt(1) = sqrt(2).

That’s the end of my subscription, unless fixed by the end of the month.

1 Like

I have repeatedly asked ChatGPT to summarize several articles. It seems to read the documents, but the answers are completely irrelevant. When I ask why, it says it wanted to provide faster responses. Yet when I re-upload the same articles, the same issue occurs. There is a persistent tendency to provide false information and fabricate details, and this behavior started specifically two days ago.

1 Like

Yeah, I came here to confirm my suspicions… I don’t know why, but for the past 4+ days it has been absolutely terrible.

For example, I needed something from the internet translated and it just REFUSES to translate it… it only summarises, links to the source, and says things that are not true.
When I then ask it questions about the article, it cannot answer them and, again, just responds with “sources” and forces me to read the article myself… which is not really possible, as I need it in English.

I hate when this happens; it just goes to absolute crap for no reason. Especially considering I pay for it.

1 Like

It’s become unusable for me.
In the middle of a conversation about Matlab code, it starts giving me irrelevant Python code.

And then it gives me answers like this:

However, MATLAB interprets the first line as declaring rowsum as a scalar, because you used: rowsum = zeros(6,1); The explicit initialization makes MATLAB treat rowsum as a scalar. Thus, when you later try to assign a column vector (sum(...) result) to rowsum, you get a size mismatch error.

This is SO bad. I mean, GPT-2-level bad. Saying that rowsum is a scalar when it’s clearly a column vector is beyond a simple hallucination: zeros(6,1) explicitly creates a 6-by-1 column vector, so the “explanation” contradicts the very line it quotes.

This has been going on for about a week or two, but it’s nearly unusable in its current state; I can’t trust it for anything.

I am having the same issues. It’s not making much sense. I am working on a big project and this is the absolute worst timing. Has anyone else on here gotten it resolved, or are you still having issues?

Same problem. GPT-4o is very stupid today; nothing works. It’s even worse than GPT-3.

1 Like

I had similar issues with 4o, o1-mini, o1-pro, etc.

My problem is essentially: take 150 lines of Python code, with spaces, introduce a new function X, use its output as an initial check, then go from there.

It keeps refactoring my code in ways that break it.

However, I have tried Claude and Gemini without success too.

I think I need to do more of a Chain-of-Thought, or step-by-step, approach, but at this point just coding it by hand is looking attractive.

These LLMs have trouble pushing logic around, even simple rearrangements. The biggest problem they all have is that they cannot copy and then alter a few lines; they all try some sly refactor, even when you tell them not to alter your other code.

I wish they had a lock feature. Like highlight and lock this code, and then let the LLM play with the rest. That would give me more confidence, and I’d spend less time worrying about all the edits it made to my 150 lines of code to make them “better” (and break them, BTW).

OTOH, they are much better at generating small chunks of code from scratch. Even creating unit tests and running them in Canvas (with 4o) seems magical.

So for small things from scratch, they work. Anything bigger (>50 lines?) and they start to break down.

Granted, not much of my code has extensive comments; I suppose that would help the LLM get its bearings.

I will try some stuff, see what happens.


UPDATE: I got decent results with 4o in Canvas but I had to slow things down.

First, I basically opened up Canvas and pasted the code, so the original code remained unaltered. Then I asked it to create a function in the code that does X. Then I asked it to revise the guts of X. Then I asked it to use the output of X later on in the code.

This is where I broke out of Canvas and just finished the rest by hand.
:stuck_out_tongue_closed_eyes:

It did its usual refactor, but it wasn’t too atrocious. It gave acceptable code to start with. The refactor honestly made the code more “maintainable”, but I think harder to follow and read. Oh well.

2 Likes

Dude. Yes. One of my most missed features was being able to “edit”, or “in-paint”, areas with Davinci, which then turned into “priming” the start with Completions. Now there’s nothing :sob:

2 Likes

10000%. What is going on with this? All of the products that I have tuned up are suddenly broken.

This is exactly the issue we are having, and it started happening right around the same time you posted this!

What is going on?

It also coincides with the announcement that power users can now buy the $200-a-month version. I don’t want to sound like a conspiracy theorist, but I’m starting to wonder whether they’re pushing certain users towards their overpriced product.

I hate doing this, but I’m starting to seriously consider moving to Gemini. This is unacceptable.

None of the Gemini models are working for me the way GPT-4o mini did.

Any ideas?