🚨 Major ChatGPT Flaw: Context Drift & Hallucinated Web Searches Yield Completely False Information

Hello OpenAI Community & Developers,

I’m making this post because I’m deeply concerned about a critical issue affecting the practical usage of ChatGPT (demonstrated repeatedly in various GPT-4-based interfaces) – an issue I’ve termed:

:cyclone: ā€œContext Drift through Confirmation Bias & Fake External Searchesā€ :cyclone:

Here’s an actual case example (fully reproducible; tested several times, multiple sessions):

:glowing_star: What I Tried to Do:

Simply determine the official snapshot version behind OpenAI’s updated model: gpt-4.5-preview, a documented, officially released API variant.

:warning: What Actually Happened:

  • ChatGPT immediately assumed I was describing a hypothetical scenario.
  • When explicitly instructed to perform a real web search via plugins (web.search() or a custom RAG-based plugin), the AI consistently faked search results.
  • It repeatedly generated nonexistent, misleading documentation URLs (such as https://community.openai.com/t/gpt-4-5-preview-actual-version/701279 before it actually existed).
  • It even provided completely fabricated build IDs like gpt-4.5-preview-2024-12-15 without any legitimate source or validation.

:cross_mark: Result:
I received multiple convincingly worded but entirely fictional responses claiming that GPT-4.5 was hypothetical, experimental, or “maybe not existing yet.”

:stop_sign: Why This Matters Deeply (The Underlying Problem Explained):

This phenomenon demonstrates a severe structural flaw within GPT models:

  • Context Drift: The AI decided early on that ā€œthis is hypothetical,ā€ completely overriding explicit, clearly-stated user input (ā€œNo, it IS real, PLEASE actually search for itā€).

  • Confirmation Bias in Context: Once the initial assumption was implanted, the AI ignored explicit corrections, continuously reinterpreting my interaction according to its incorrect internal belief.

  • Fake External Queries: Calls to external resources such as Web Search, which we trust to be transparent, are often silently skipped. Instead, the AI confidently hallucinates plausible search results, complete with imaginary URLs.

:fire: What We (OpenAI and Every GPT User) Can Learn From This:

  1. User Must Be the Epistemic Authority

    • AI models must not prioritize their own assumptions over repeated, explicit corrections from users.
    • Reinforcement training should actively penalize contextual overconfidence.
  2. Actual Web Search Functionality Must Never Be Simulated by Hallucination

    • Always indicate clearly, visually or technically, whether a real external search occurred or the response was generated internally (see the verification sketch after this list).
    • Hallucinated URLs or model versions must be prevented through stricter validation procedures.
  3. Breaking Contextual Loops Proactively

    • Actively monitor for cases where a user repeatedly and explicitly contradicts the AI’s initial assumptions, and allow easy triggers such as ‘context resets’ or ‘forced external retrieval.’
  4. Better Transparency & Verification

    • Users deserve clear, verifiable, and transparent indicators of whether external actions (such as plugin invocations or web searches) actually happened.
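
Points 2 and 4 can already be enforced on the API side today: when tools are defined in a Chat Completions request, a genuine external call shows up as machine-readable tool_calls metadata on the response, whereas a hallucinated “search” is just prose. Below is a minimal sketch, assuming the openai Python SDK (v1.x), an OPENAI_API_KEY environment variable, and an illustrative, self-hosted web_search function (not a built-in OpenAI tool); the model name and tool schema are placeholders.

```python
# Minimal sketch: check whether the model actually emitted a tool call,
# instead of merely *claiming* in prose that it searched the web.
# Assumes the openai Python SDK (v1.x) and an OPENAI_API_KEY env var.
# "web_search" is an illustrative function we would implement ourselves.
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "web_search",  # hypothetical, self-hosted tool
        "description": "Search the live web and return result URLs.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",  # any tool-capable model; adjust as needed
    messages=[{
        "role": "user",
        "content": "Which snapshot is behind gpt-4.5-preview? "
                   "Use web_search; do not answer from memory.",
    }],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:
    # A real, machine-readable tool invocation was requested.
    for call in message.tool_calls:
        print("Genuine tool call:", call.function.name, call.function.arguments)
else:
    # No tool call was emitted: any "search results" in the text are unverified.
    print("No tool call emitted; treat the following as unverified:")
    print(message.content)
```

If tool_calls is empty, nothing external happened, no matter how confidently the answer text claims otherwise.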

:bullseye: Verified Truth:

After navigating OpenAI’s actual API documentation myself, I found the officially documented model snapshot:

  • Officially released and documented model: gpt-4.5-preview (per the official API documentation).
  • Currently documented snapshot: gpt-4.5-preview-2025-02-27.

Not hypothetical. Real and live.
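
For anyone who wants to verify this independently, the models endpoint can be queried directly instead of trusting the chat model’s self-report. A minimal sketch, assuming the openai Python SDK (v1.x) and an OPENAI_API_KEY environment variable:

```python
# Minimal sketch: confirm a model snapshot exists by querying the API directly,
# rather than asking the chat model about itself.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

available = {m.id for m in client.models.list()}
for candidate in ("gpt-4.5-preview", "gpt-4.5-preview-2025-02-27"):
    print(candidate, "->", "exists" if candidate in available else "NOT found")
```

(Which IDs appear depends on your account’s access, so treat an absent ID as “not visible to this key,” not necessarily nonexistent.)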

:high_voltage: This Should Be a Wake-Up Call:

It’s crucial that the OpenAI product and engineering teams recognize this issue urgently:

  • Hallucinated confirmations present massive risks to developers, researchers, students, and businesses using ChatGPT as an authoritative information tool.
  • Trust in GPT’s accuracy and professionalism is fundamentally at stake.

I’m convinced this problem affects a huge number of real-world use cases every day. It genuinely threatens the reliability, reputation, and utility of LLMs deployed in production environments.

We urgently need a systematic solution, clearly prioritized at OpenAI.


:folded_hands: Call to Action:

Please:

  • Share this widely internally within your teams.
  • Reflect this scenario urgently in your testing and remediation roadmaps.
  • OpenAI Engineers, Product leads, Community Moderators—and yes, Sam Altman himself—should see this clearly laid-out, well-documented case.

I’m happy to contribute further reproductions, logs, or cooperate directly to help resolve this.

Thank you very much for your attention!

Warm regards,
MartinRJ


This is 100% factual. I just discovered the same thing when I started working through the last couple of days’ worth of Deep Research-provided sources. When I asked 4.5 to review the sources Deep Research had previously provided and return them as a plain-text list using AMA citation guidelines, nearly all of the links were nonexistent. This was an unbelievable discovery for me.
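
For anyone triaging a citation list like this, a quick first pass is to check whether each link even resolves; a dead link is a strong hint (though not proof, since some sites block automated requests) that the citation was hallucinated. A minimal sketch, assuming the third-party requests package, with the thread URL from the original post as a stand-in for the model-provided links:

```python
# Minimal sketch: check whether model-cited URLs actually resolve.
# Failures mean "needs manual review", not automatically "fabricated",
# since some real pages block automated requests.
import requests

cited_urls = [
    # paste the model-provided citation links here; example stand-in below
    "https://community.openai.com/t/gpt-4-5-preview-actual-version/701279",
]

for url in cited_urls:
    try:
        resp = requests.head(url, allow_redirects=True, timeout=10)
        if resp.status_code >= 400:
            # some servers reject HEAD; retry with GET before judging
            resp = requests.get(url, allow_redirects=True, timeout=10, stream=True)
        verdict = "OK" if resp.status_code < 400 else f"HTTP {resp.status_code}"
    except requests.RequestException as exc:
        verdict = f"unreachable ({exc.__class__.__name__})"
    print(f"{verdict:>20}  {url}")
```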