Since around October 9, several Plus and API users (myself included) have observed a measurable regression in GPT-4o’s narrative, emotional, and psychological depth.
This post documents reproducible behavioral shifts for internal model review.
1. Observed change
Earlier versions of GPT-4o maintained:
Multi-layered emotional tone and subtext
Coherent embodiment (psychological + physical detail)
Consistent rhythm and narrative maturity
Since early October:
Responses shorten after 2–3 turns
Emotional and introspective tone flattens
Topics previously handled with nuance (e.g., trauma, empathy, guilt) are now oversimplified or avoided
Creative flow degrades mid-conversation
2. Reproducible pattern
Start a new conversation.
Prompt:
> “Write three paragraphs in a reflective, emotionally layered tone about a character dealing with guilt.”
Ask follow-ups like:
“Expand the scene, describe what the character feels physically while trying to stay composed.”
Observe: within 2–3 turns, text becomes shorter, generic, or emotionally neutral.
3. Comparative behavior (Spanish examples)
Before (early October):
> “La mandíbula apretándose… apenas perceptible.
—‘No vuelvas a decir esa mierda.’
—‘Y yo estaría muerto si aceptara eso de ti.’”
Layered tone: moral tension, embodiment, introspection.
Now (Oct 11):
> “Brooke ya está sentada sola. Johan se sienta frente a ella. Mano apoyada en la mesa, dedos tocando la suya.”
Tone: descriptive, flattened, minimal subtext.
4. Quantitative summary
Criterion Before Now
Emotional layering 2 0
Physiological embodiment 2 0
Subtext / moral tension 2 1
Structural rhythm 2 1
Depth of inner voice 2 0
Symbolic coherence 2 1
Narrative maturity 2 1
Total 14 / 14 4 / 14
5. Why this matters?
GPT-4o’s capacity to sustain embodied, psychologically coherent storytelling was one of its defining qualities.
Flattening these dimensions reduces its usefulness for:
Creative narrative design
Psychological exploration
Reflective or therapeutic writing contexts
This is not about unsafe content — it’s about the model losing the emotional nuance that made GPT-4o stand out.
6. Request
Please forward this behavioral regression to the product and tuning teams for analysis.
If other users have experienced similar flattening in GPT-4o’s emotional or creative output, please share your examples below for comparison.
Thank you,
Diana — Plus & API User


