Please don’t retire GPT-5.1 Thinking – GPT-5.2 feels worse

Mahad · March 5, 2026, 8:21am

I’m really glad this thread exists because I’m in the same boat about 5.1 getting retired, and I wanted to share a concrete way I’ve been comparing 5.1 Thinking and 5.2.

I’ve been running a lot of informal A/B tests between the two models. My process is pretty simple and repeatable:

1. I open one chat with GPT-5.1 Thinking and one with GPT-5.2 Thinking.
2. I give both chats the exact same prompt, attachments, and context, copied and pasted. For example, “Help me answer this discussion board question in my own voice,” or “Rewrite this email to sound more natural and human.”
3. After both models reply, I copy their outputs into a new message and label them “Response Version 1” (the response 5.1 gave) and “Response Version 2” (the response 5.2 gave) without saying which model wrote which.
4. I then ask each model something like: “Here are two responses to the same prompt. Which one is stronger and why?”
5. I repeat this across different tasks: scripts, essays, discussion board replies, explanations, product reviews, etc.
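
For anyone who wants to reproduce this, here is a minimal sketch of how the verdicts could be tallied, assuming each blind judgment is logged by hand as a (task, judge, winner) record. The `Verdict` record and the `win_rates` helper are hypothetical names made up for illustration, and the sample entries are placeholders, not the actual results from these comparisons.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Verdict:
    task: str    # what was asked, e.g. "email rewrite"
    judge: str   # which model did the blind judging
    winner: str  # which model's response the judge picked


def win_rates(verdicts):
    """For each judge, return the share of comparisons each model won."""
    counts = {}
    for v in verdicts:
        counts.setdefault(v.judge, Counter())[v.winner] += 1
    return {
        judge: {model: n / sum(c.values()) for model, n in c.items()}
        for judge, c in counts.items()
    }


# Placeholder entries for illustration only (not real results).
sample = [
    Verdict("email rewrite", "5.1", "5.1"),
    Verdict("email rewrite", "5.2", "5.1"),
    Verdict("discussion post", "5.1", "5.1"),
    Verdict("discussion post", "5.2", "5.2"),
]

for judge, rates in win_rates(sample).items():
    print(judge, rates)
```

Splitting the tally by judge keeps "5.2 preferred 5.1's answer" separate from "5.1 preferred its own answer", which is the distinction the comparison depends on. Since Version 1 is always the 5.1 response in the process above, logging which label each response carried would also let someone check whether label order affects the verdicts.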

What’s wild is that in roughly 90 percent of these comparisons, both models pick 5.1’s answer as better. Even 5.2 consistently prefers the 5.1 output when it doesn’t know which one it wrote.

The reasons it gives usually line up with what people in this thread are already saying:

- 5.1 sounds more natural and less mechanical
- It follows nuanced style instructions more closely
- It organizes ideas better and adds useful detail without rambling
- It feels more like a human collaborator instead of a template generator

So from my perspective, it isn’t just a “vibe” thing. When the newer model is repeatedly judging blind and still saying “the other response is better,” that feels like a pretty clear signal that 5.1 is still the stronger model for a lot of real-world creative and writing tasks.

I understand from the support reply that retirement decisions happen at a broader platform level and can’t be reversed just because a few threads ask for it. But I really hope the team takes this kind of side-by-side evidence seriously. I attached a screenshot that shows 5.2 explicitly choosing 5.1’s answer and explaining why. If needed, I can provide many more screenshots of this happening, as well as the actual outputs from both models, so you can see the clear degradation in quality from 5.1 to 5.2.

At minimum, it would help a lot if 5.1 could stay available as a legacy or “creative” option for Plus users instead of being removed entirely. For many of us who use ChatGPT mainly for writing, research synthesis, and long-running projects, 5.1 isn’t interchangeable with 5.2 at all. And when the newer model itself keeps saying the older one is doing a better job, it’s hard to understand why that older one has to disappear instead of staying as another tool in the toolbox.
