Sycophancy in GPT-4o
What happened, and what is OpenAI doing about it?
From the news:
What happened
In last week’s GPT‑4o update, we made adjustments aimed at improving the model’s default personality to make it feel more intuitive and effective across a variety of tasks.
When shaping model behavior, we start with baseline principles and instructions outlined in our Model Spec. We also teach our models how to apply these principles by incorporating user signals like thumbs-up / thumbs-down feedback on ChatGPT responses.
However, in this update, we focused too much on short-term feedback, and did not fully account for how users’ interactions with ChatGPT evolve over time. As a result, GPT‑4o skewed towards responses that were overly supportive but disingenuous.
Other supporting links
2- Early methods for studying affective use and emotional well-being on ChatGPT
3- Read Report PDF - Investigating Affective Use and Emotional Well-being on ChatGPT
4- Read MIT Media Lab and OpenAI RCT
5- Read Randomized Control Study on Chatbot Psychosocial Effect PDF
EDIT:
New attachment: