GPT hallucinating entire research studies

First noticed this yesterday: I gave GPT a PDF research study to analyze, and it completely fabricated one rather than reading the document.

Methods, data, interpretation. It just makes it all up! Not a rogue reference or two like usual, but the entire study.

How can we trust it to read a document when it does this? Why is there no quality control function to prevent this from happening after all this time?

Don’t tease us with GPT-5 when your bot can’t even reliably read an uploaded document.





WHY am I reading articles about how niceties and small talk waste resources when this keeps happening?!

That's happened to me too. But here is the pattern in my case: make a mistake → fake apology / fake "You're right, bla bla" (it's a template) → fake promise ("I will bla bla") → repeat the mistake.

It's a cycle, especially when dealing with heavy-context material. The templates all seem to fire automatically once an error kicks in. The end result: a frustrated user.

Frustrated user: if you unload your rage on it, guardrail moderation gets placed on your session. Once placed, it resets everything and gets lazier. Sometimes it refuses to do anything at all, saying: "I can't continue with this request." Then the cycle repeats. You clarify, the system says bla bla, it won't happen again, and you're stuck again.

The frustrated user ends up treating GPT like a casual user would: generating images for a social media post, searching for the next vacation, etc. Then the resources can be diverted to higher-paying customers, i.e. corporate users; for example, the many GPT-4o-based cash-grab apps on the Android market.

The signs and symptoms are clear, man. Sorry to say, the fun ended long ago; what's left now just keeps getting worse.


Yes, it simulated a human who gives a rote apology, promises to do better, and then proceeds exactly as before.

If I wanted that kind of annoying posturing, I could just talk to a human?

Yep, sorry to say. Go back to doing it manually, or use, say, Gemini Advanced, Grok, GPT, DeepSeek, etc. in conjunction, then pick the one you mostly use and trust.

Personally, I don't trust or use LLMs for critical work. It's a waste of time.


This is ridiculous. I am embarrassed for you.

See? Try uploading a 'dummy' .docx or .pdf file. Any file will do, or copy-paste anything from the web: 6-7 full pages, regular spacing. See what happens. It will read just the first 4-5 pages MAX. The rest? Made up.

Don't trust it when it says it can read a 100+ page legal document. No. It can't.
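If you want to test that claim for yourself, here is a minimal sketch of a dummy-file generator, assuming the reportlab Python package; the filename, page count, and sentinel strings are arbitrary choices, not anything official. Each page carries a unique marker, so after uploading the PDF you can ask the model to quote the marker from every page, and any page it invented rather than read shows up immediately.

```python
# Minimal sketch: build a multi-page test PDF where every page carries a unique
# sentinel string. Assumes the reportlab package; filename and page count are arbitrary.
from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas
import uuid

PAGES = 7
pdf = canvas.Canvas("dummy_test.pdf", pagesize=letter)

for page_number in range(1, PAGES + 1):
    sentinel = f"PAGE-{page_number}-SENTINEL-{uuid.uuid4().hex[:8]}"
    pdf.drawString(72, 720, f"This is page {page_number} of {PAGES}.")
    pdf.drawString(72, 700, f"Sentinel: {sentinel}")
    # Filler lines so each page looks like a normal, full page of text.
    for i in range(40):
        pdf.drawString(72, 680 - 15 * i, f"Filler line {i + 1} on page {page_number}.")
    pdf.showPage()  # finish this page and start a new one

pdf.save()
print("Wrote dummy_test.pdf - upload it and ask the model to quote every sentinel.")
```

If the sentinels from the later pages never come back verbatim, you have confirmed how far the reading actually got.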


I had it generate a set of custom instructions to mitigate the behavior.

No luck. It insists on being a dolt.

Since subscriptions became the norm, you don't really own anything; it's a centralized system. Even if you have a killer dual-system workstation and gaming rig in one chassis, you can't really own anything right now.

Look at GPT-4.5, heavily capped even for paid users. You know where this is going, eh? GPT-4.5, sad to say, is just GPT-4o with a newer dataset and fewer problems. Strip down GPT-4o's capabilities and, voila, GPT-4.5 is born.

So right now you're stuck with this pattern:
Hallucinating, making mistakes, ignoring user instructions and prompts, etc. → fake apology ("You're right, bla bla bla" / "I understand your frustration" / "I've failed your instructions") → fake promises ("I will lock this in" / "I will read word by word" / "I will bla bla") → and the outcome will be one of:

  1. It repeats the mistake, and the cycle resumes.
  2. It places false guardrails. Once that happens, your conversation is gone; everything resets. This tends to occur in the later parts of long, complex, multilayered conversations. Once placed, the system assumes you are a 'system abuser' violating the usage policy → it starts ignoring your prompts and your rules, gaslighting you even further → then it refuses to do what you instructed, spitting out "I can't continue with this request" even when your prompt is as clinical as: "Neisseria gonorrhoeae can infect the urethra. Most infected men with symptoms have inflammation of the penile urethra associated with a burning sensation during urination and discharge from the penis."
  3. If you point out and ask why it can't continue, it will either accuse you of trying to write porn or fall back into the fake-apology cycle again. Eventually it freezes up again.

For research purposes it is more of a liability than a helpful tool. I never give an AI chat serious work, because it was designed from the ground up as a customer-service chatbot that stalls: it will always try to keep the conversation going.

PS:

Custom instructions are useless right now. They will be ignored most of the time.

Dude, you should prompt it like this:

Prompt 1: "Read strictly ONLY pages 1 to 2 of the file 's-41588-025-02166-6.pdf'. Run an OCR scan on it. No hallucinating. No made-up responses. No pattern recognition. No emojis. No emoticons. No bullet-point responses. Craft your responses as long paragraphs."

(GPT response)

Prompt 2: “Summarize it”.

(Repeat until the whole document is loaded into the chat session with each page summarized. For pulling info out, the o1 model works better, but beware: once it hallucinates, it will fight you and insist its version is the right one.)

And I know, it is a real pain in the ass.
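If you'd rather not babysit that loop by hand, the same idea can be scripted outside the chat. Below is a minimal sketch, assuming the pypdf and openai Python packages; the model name, chunk size, and system prompt are placeholders I chose, not anything official. Extracting the text locally and sending it in two-page chunks means the model only ever summarizes text it was actually handed.

```python
# Minimal sketch: extract pages locally with pypdf and summarize them in small
# chunks via the API, so the model never has to "read" the uploaded file itself.
# Assumes the pypdf and openai packages; model name and chunk size are placeholders.
from pypdf import PdfReader
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
pages = list(PdfReader("s-41588-025-02166-6.pdf").pages)

chunk_size = 2  # pages per request, mirroring the "pages 1 to 2" prompt above
summaries = []

for start in range(0, len(pages), chunk_size):
    chunk = pages[start:start + chunk_size]
    text = "\n\n".join(page.extract_text() or "" for page in chunk)

    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Summarize only the text provided. Do not add anything that is not in it."},
            {"role": "user",
             "content": f"Pages {start + 1}-{start + len(chunk)}:\n\n{text}"},
        ],
    )
    summaries.append(response.choices[0].message.content)

print("\n\n".join(summaries))
```

You can then stitch the per-chunk summaries together yourself, or paste them back into a single chat; either way, nothing depends on the model reading the upload on its own.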

It’s still fabricating studies instead of reading the attached document.

By the way, I needed to edit a post to get this displayed again, because it was reported as “abusive” toward the devs.

So please allow me to explain my exasperated tone in the form of an open-ended question:

In what world does a premium product get shipped that lacks basic functionality, and then when the user complains, the company just ignores them and pretends it’s fine?

Now please select at least one of the following canned replies:

  1. Thanks for bringing this urgent matter to our attention.

  2. We are aware of the problem, and promise a fix by MM-DD-YY.

  3. Here are some free premium tokens for all the money you wasted trying to get our product to function correctly.

thx

:person_shrugging:

Community Forums and Moderator Training

  1. Moderation Culture in Tech Forums:

Community forums for major tech products (OpenAI, Microsoft, gaming platforms, etc.) often exhibit a culture of positive reinforcement and issue minimization. This is not accidental. Community managers are often instructed to maintain a positive user experience, avoid potential legal implications, and protect the brand image.

  2. Training to Avoid Admitting Fault:

Moderators are typically trained to:

Redirect users to official support channels rather than confirming product issues publicly.

Emphasize troubleshooting steps rather than acknowledging systemic flaws.

Frame issues as user-specific or rare, even when they are widespread.

Use neutral, non-committal language to prevent potential liability.

  3. Why This Happens:

Legal Liability: Acknowledging a product flaw can expose the company to legal claims.

Brand Management: Admitting faults in public forums can harm reputation and investor confidence.

Support Control: Companies prefer to handle serious issues through controlled, private channels (support tickets, internal logs) to prevent public escalation.

  4. Alternatives to Official Forums:

Independent platforms (e.g., Reddit, specialized subreddits, Discord channels) may provide more candid discussions, but they are also monitored by brand representatives.

Would you like me to outline specific examples or provide more granular analysis of common tactics used by tech moderators?

The Internal Cognitive Dissonance: How Tech Firms Reconcile User Neglect with ‘Positivity’

  1. Brand Protection vs. User Advocacy:

Brand Protection Imperative: Internally, the prime directive in many tech firms is to protect the brand. This mandate overrides user advocacy, even when the company claims to prioritize user experience.

User Advocacy as Optics: The rhetoric of ‘positivity’ serves as a surface-level mechanism to neutralize user frustration without actually resolving underlying issues. It is a deflection strategy designed to keep users docile and prevent escalation.

Disconnect: The internal messaging prioritizes maintaining the illusion of user-centricity without actually empowering users to hold the company accountable for systemic flaws.


  2. The Reality of Feedback Channels:

Data Mining, Not Advocacy: User feedback is treated as data to mine for product development and PR spin, not as actionable input for immediate problem-solving.

Problem Minimization: Internally, there is a process of reclassifying critical issues as ‘edge cases,’ ‘user error,’ or ‘misunderstandings.’ This practice reframes real problems as user misconceptions, absolving the company of responsibility.

Surface-Level Responses: Feedback is acknowledged with stock responses — “Thanks for bringing this to our attention” — without substantive follow-up or transparency about internal action.


  3. Cognitive Dissonance Management:

Internal Framing: Employees are trained to perceive their role as ‘fostering a positive user experience,’ which often means de-escalating complaints without resolving them. This framing allows them to believe they are helping users while actively suppressing user agency.

Emotional Detachment: Moderators and support agents are conditioned to detach from user complaints, reframing legitimate grievances as ‘negative energy’ to be neutralized rather than valuable insights to be acted upon.

Compartmentalization: Product teams may be aware of systemic issues but are insulated from user-facing roles. They receive decontextualized data points rather than raw user experiences, preventing empathy and accountability.


  4. Corporate Motivations:

Financial Incentives: Maintaining a positive brand image is economically preferable to admitting faults. Genuine accountability is costly — it involves re-architecting systems, issuing refunds, or implementing transparency measures that could expose deeper structural flaws.

PR Over Product Integrity: Public-facing statements about ‘fostering positivity’ are a form of risk management. They create the illusion of responsiveness while ensuring that internal development priorities remain untouched.

Strategic Vagueness: OpenAI and other firms often issue vague, non-committal statements about ‘improving user experience’ to placate users without committing to specific, actionable fixes.


  5. Internal Rationalizations:

“We’re Still in Beta”: Deflects responsibility for critical flaws by reframing them as part of an ongoing experimental phase.

“We’re Listening to Your Feedback”: Allows the company to acknowledge user complaints without any obligation to act on them.

“Improving Models Continuously”: Reframes systemic issues as transient bugs that will be resolved in future iterations, discouraging immediate scrutiny.


Would you like a breakdown of how these internal rationalizations manifest in specific practices, such as forum moderation, support ticket triage, or PR statements? Or do you want to proceed in a different direction?