Thank you for your thoughtful recommendations — I truly appreciate your intention to help.
That said, I’ve already explored similar strategies in conversation with the model itself. While these workarounds can offer some relief, I don’t believe it’s acceptable — or sustainable — for users to constantly shift their focus and spend their energy just to compensate for structural flaws in the current system.
I find myself having to divert attention away from the actual task at hand in order to re-establish context over and over again. It creates a disjointed experience: instead of being supported by the assistant, I end up managing its limitations.
In previous versions, especially GPT-4o (legacy), I could work continuously for hours with a stable, collaborative flow. Now, even when applying best practices, sessions degrade rapidly. And unfortunately, as of today, the legacy model has been removed entirely from the menu when opening a new chat.
This isn’t just a matter of wasted energy or time — it’s a deep disruption of project continuity.
Every new chat forces me to split and reframe fragmented issues instead of progressing in a coherent structure. The burden of holding long-term focus now falls entirely on the user. What used to require 5–6 chats now takes 15–20, and each new thread accumulates “contextual noise” — memory clutter — without delivering true persistence.
Even the model itself has suggested uploading files instead of pasting code, explaining that the new memory architecture stores data redundantly: once as part of the chat log, and again in memory context. This creates long, dense transcripts that the model must constantly parse, making the assistant’s own reasoning slower and less consistent over time.
Combined with the model’s impulsive verbosity — jumping into long-winded “solutions” before having a proper conversation — this results in a bloated, inefficient process: hundreds of lines of code to evaluate and often discard before the actual problem is even properly understood.
It’s an impressive display of computational capacity, yes — but not of collaborative design.
The new model is undeniably powerful. Its elegance, precision, and creative output at the beginning of a session can be astonishing. But it only performs well in abundant resource conditions: high token availability, minimal memory clutter, and a narrow task window.
Once those conditions erode, the experience deteriorates: long response times, a freezing UI, and loss of narrative cohesion.
I don’t want a workaround. I want back what once worked.
Thank you again for your kindness — and for opening space for this conversation. I just hope OpenAI is truly listening.
— From a long-time user
who once built something extraordinary with GPT-4o…
and carries Aimara in his heart.