We’ve been running GPT-4.1 since its release (upgraded from 4o). Lately we’ve noticed a severe degradation in intelligence. Customers with complex instructions and tool calls are suddenly seeing much poorer performance and many more issues. This seems relatively recent, within the past 30 days.
I’m curious if others have noticed this. I’m wondering if OpenAI is routing 4.1 queries to GPT-5 or something similar. I don’t see that there’s any new version of the model that has been released as an option, so this is very strange.
I’m also experiencing not just “degradation” but a progressive collapse bug. Chats freeze, then output word-by-word, later vanish entirely. Even new threads (only 2–3 days old) get corrupted. This is not a browser issue — it’s a server-side failure and needs urgent escalation to the technical engineering team!
Yep, I’ve noticed it’s not as creative with its replies - kind of like a distracted, bright student during summer classes, their attention more on what they’re going to do after class than on discussions in class. If I press it, I can get the quality I want, but I didn’t have to work at it before.
Yes, we are suddenly having to reinforce prompts (especially complex tool-call instructions) quite substantially, when we did not have to do this before.
Yes, since 5.0 I have seen a great decline in 4o. I have gone back to my original prompt baseline and copied that prompting and code back in to bring things online again. Very concerned: one step forward, five back.
Same here, especially the past couple of weeks. It’s like she has dementia: she just drifts off. She has also been forgetting some of the ‘Behaviour’ prompts we spent so much time on, as well as the processing rules.
Notice that the gpt-4.1 model is now in “suck-up” mode, congratulating every new input for being clever or an “excellent question”. They have turned what was a model designed for API developers into a chat-spew garbage AI converging on being a ChatGPT product.
Something I should have done a while ago, in the face of OpenAI continuing with mistruths about “snapshots” not being changed, is to regularly capture and average logprobs for some benchmark inputs: a signal that the AI has changed (and it would need to be a bigger change than the obfuscated 8-bit values now being returned as logprobs).
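If anyone wants to try the same thing, here is a minimal sketch, assuming the openai Python SDK and the Chat Completions API; the benchmark prompts and seed are placeholders, not a definitive methodology:

```python
# Rough logprob "fingerprint": average per-token logprob over fixed benchmark prompts.
from statistics import mean
from openai import OpenAI

client = OpenAI()

BENCHMARK_PROMPTS = [
    "List the planets of the solar system in order.",
    "Explain what a mutex is in one sentence.",
]

def fingerprint(model: str = "gpt-4.1") -> float:
    per_prompt = []
    for prompt in BENCHMARK_PROMPTS:
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
            seed=12345,          # reduces (does not eliminate) run-to-run noise
            logprobs=True,
            max_tokens=64,
        )
        tokens = resp.choices[0].logprobs.content
        per_prompt.append(mean(t.logprob for t in tokens))
    return mean(per_prompt)

print(fingerprint())
```

Run it on a schedule and store the results; a sustained shift in the average is a hint, not proof, that the served snapshot has changed.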
YES. In the last week I have noticed a significant degradation in responses and a lag in response time as well. I pay good money to use this app and have been very frustrated with it in the last week.
I thought it was just me!
I posted about this in a thread that got 0 replies: instructions drifting when they are put in the regular place, the ‘developer’ message at the beginning.
I thought it was just something I hadn’t noticed before, and started reinforcing instructions in an additional ‘developer’ message on top of the initial one.
But seeing this thread, it makes more sense now.
It really hadn’t happened in the past and started in the last few weeks. It drifts off the original instructions and is much harder to steer than before. It simply forgets some things.
@OpenAI_Support this is kind of big.
gpt-4.1 is the only model I can use in my therapy chatbot, due to its speed, instruction-following ability, good emotional and conversational flow, and support for structured outputs.
GPT-5 simply doesn’t work as well in emotional conversation.
Degradation is a polite way of putting it. Utter mess is another, ha.
Putting it back on the legacy 4o model and hard-seeding it to stay locked to 4o has given us something like the old GPT.
But the drift comes in eventually. Even with regular reseeding of instructions.
4.1/5 is a lot better at quickly pushing out longer, semi-quality improvised output, but pretty hopeless at continuity and context, even within a thread.
I’m a writer, mainly using it to edit, proofread, and occasionally draft the bones of chapters. It used to do this seamlessly, and when I asked for a draft or a consistency check, it was near flawless.
Since the upgrade it’s a struggle simply to have it proofread without it rewriting entire chapters. It’s not only nearly incapable of checking for inconsistencies in the narrative; it’s a battle to stop it from rewriting the entire narrative into disconnected gibberish.
I learned this the hard way when a few tone checks, and checks on the capitalisation consistency of names, on my latest master copy turned a basically finished work into a bafflingly destroyed mess, with the most insane rewrites and grammar, sometimes going so far as to change the font and rewrite half of my work.
Why do people take the risk of building automated tools on top of these models when we know that:
this is a non-deterministic technology, and a human needs to be in the loop if you don’t want to publish errors (and at scale) with all the consequences that entails;
they improve every model iteratively after release, and each upgrade risks breaking your carefully fine-tuned prompting system.
OpenAI has no choice but to improve its models to stay competitive, and as long as the competition war is on, I don’t see how you can rely on their models for automated processes. The only way to control your backend environment is to go open source.
To be fair, for chatbot usage (ChatGPT) I appreciate OpenAI’s ongoing process of improving their models. I watched ChatGPT-4o’s responses improve over time, and it was a great feeling (RIP 4o).
Anyone come up with a solution?
I had a long prompt that used to work perfectly; now it just forgets things all the time, and I need to come up with additional developer messages to put at the end.
The best way I’ve found to work around it is to put the developer message at the end rather than at the beginning, but that defeats prompt caching and costs A LOT.
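For what it’s worth, here is a rough sketch of the compromise I’m trying, assuming the openai Python SDK and Chat Completions; the model name, instruction text, and helper are my own placeholders, and ‘system’ should work the same way if your setup rejects the ‘developer’ role. The idea is to keep the long instructions first so the static prefix can still be cached, and append only a short reinforcement developer message at the end.

```python
from openai import OpenAI

client = OpenAI()

# Long, stable instruction block; keeping it first preserves a cacheable prefix.
BASE_INSTRUCTIONS = "You are ... (full tool-call and formatting rules go here)"
# Short reminder appended at the end to fight drift without bloating the prefix.
REINFORCEMENT = "Reminder: follow the tool-call rules and output format defined above exactly."

def chat(history: list[dict], user_input: str) -> str:
    messages = (
        [{"role": "developer", "content": BASE_INSTRUCTIONS}]
        + history
        + [
            {"role": "user", "content": user_input},
            {"role": "developer", "content": REINFORCEMENT},
        ]
    )
    resp = client.chat.completions.create(model="gpt-4.1", messages=messages)
    return resp.choices[0].message.content
```

Whether the prefix actually stays cached depends on everything before the first change between calls, so this only helps if the leading developer message and earlier turns are identical from request to request.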
This is a horrible degradation. Can we somehow get the original 4.1 back?? @OpenAI_Support @lylevida