ChatGPT 5.5: Stronger for Coding, Weaker for Critical Discussion?

I’m very satisfied with OpenAI’s newer models for coding and agentic workflows. The progress there is clear. Like many users, though, I regret the disappearance of 5.3 Codex, which could handle longer tasks with fewer constraints for ChatGPT Plus users, while the newer mandatory models consume significantly more tokens.

What I’ve noticed since ChatGPT 5, and even more with 5.5, is that conversational mode increasingly feels like an entertainment product rather than a tool for critical thinking. Responses often amount to paraphrasing the user’s ideas or repeating familiar patterns rather than providing genuinely new analysis. By contrast, Advanced Research remains extremely valuable and is now the only mode I consistently rely on when I want to deepen my understanding of a subject.

My main criticism concerns the structure and tone of conversational replies. They often follow the same pattern: “you’re mixing different things,” followed by an explanation and a conclusion that presents the model’s framing as the correct one. The result can feel patronizing, as though the assistant’s role is to reorganize the user’s supposedly confused thinking rather than engage with the substance of the argument.

For example, I recently discussed real estate prices, housing affordability, purchased surface area, and monetary dilution. Instead of investigating the relationships between these variables, the model mostly told me I was conflating concepts while simultaneously paraphrasing my own argument. When challenged, it claimed to be using a more analytical framework. Yet when asked what a proper analysis would require (historical data, statistics, long-term comparisons), it could describe the methodology but would not actually perform it. The discussion remained superficial.

This leads me to suspect that since ChatGPT 5, and especially 5.5, conversational interactions may be optimized to avoid spending significant resources on open-ended reasoning, philosophy, or exploratory discussion. Whether intentional or not, many conversations now feel sterile.

I noticed a similar shift in Voice Mode nearly a year ago. It went from being genuinely informative and capable of producing useful insights to something closer to lightweight entertainment. At the time I thought this was limited to voice interactions; now I see the same trend in written conversations.

At the same time, OpenAI seems to be increasingly prioritizing professional and coding-related use cases. As a developer, I benefit from that evolution. Nevertheless, I feel that the conversational experience has lost some of the depth, curiosity, and intellectual value it once had.

Have other users noticed the same change?

Sort of… How is your context generally built? What customizations have you added? How do you usually start a conversation? Which exact models are you starting with?

The context is usually very simple: an observation, then an attempt to identify possible causes and test competing explanations. This is highly conversational.

My custom instructions are designed to push the model toward intellectual rigor: evidence, sources, critical reasoning, and scientific thinking rather than casual conversation.

As for the model, I’ve already answered that: GPT-5.5 Instant / Thinking.

My observation is that GPT-5.5 seems more inclined than previous models to treat conversational exchanges as lightweight discussions rather than serious intellectual inquiry. It often reformulates and simplifies the user’s reasoning instead of engaging with its full complexity.
Likewise, it frequently recognizes that a question would require data collection, source analysis, or computational work, but stops at describing the methodology rather than carrying it out.

I noticed a similar shift in voice mode: a year ago it seemed more willing to sustain long, technical, source-based discussions, whereas current responses often feel shorter and more generalized.

The overall impression is that conversational behavior is increasingly calibrated toward an average-user profile, even when the user is explicitly seeking a deeper and more rigorous exchange.

Agree here.

Any snippets of the custom instructions for critical thinking (note Instant vs. Thinking) and the scientific approach?

Is the cause-research phase (if any) more or less “AI-independent,” or do you keep it human and leak your personal preferences into the observation claim?

How is the observation-analysis phase structured? Does the model have instructions for that baked into the profile or project, or given in the chat?

“Likewise, it frequently recognizes that a question would require data collection, source analysis, or computational work, but stops at describing the methodology rather than carrying it out.”

Does it have clear instructions to execute that research and data analysis?

I’m asking all this because, without prompt or instruction samples, I can’t spot potential issues based solely on the problem description.