ChatGPT 4o insists on delivering very long answers

Recently, the 4o model has generated excessively long responses during sessions.

I've tried modifying my personalized instructions, asking for short, objective answers, no repetition of entire answers already produced (regardless of the subject), and no regenerated programming code, but it seems to obey this only in the first prompts of the session. Once the conversation lasts for a few interactions, it starts to become chatty. Examples:

a) I ask for an opinion on a code approach, and it generates a lot of code plus what amounts to an article.
b) I ask about one line of a function, and it rewrites the whole thing again.
c) I reprimand it, instructing it not to repeat itself, and it rewrites the entire answer anyway, summarizing everything again unnecessarily.

But this goes beyond programming. The same thing happens with other subjects: it tends to write in article form, with lists, bullet points, an introduction, final considerations, and so on.

And if the session starts to get long, it stops obeying me altogether. Then I have to keep getting its attention with every answer, as if it were a stubborn child ("stop regenerating this, just answer the last point" :joy:), just to get shorter answers. Or I have to start a brand new chat, losing all context. :weary:

Does this model only work well with short sessions? :fearful:

PS: I didn't have problems at the beginning of model 4's life, or right after the launch of 4o. Do these models tend to get worse over time?

Yes. In my experience it does not do well at following instructions, loses track of the context, and loves to repeat itself.


In the last few days I've noticed my ChatGPT starting every new session with the 3.5 model selected, so I have to manually switch to 4o. According to specialized stats portals, service usage doesn't seem to have increased in recent weeks…

I remember noticing a drop in the quality of "model 4" in the weeks before the release of 4o.

I also noticed a drop in the quality of MS Bing Chat (now Copilot) and its DALL-E generations before major updates around that same time.

Which raises a big question for me: do these models get worse with usage? Or are companies deliberately degrading them to build hype for adoption of the next update?