So, about 5 months ago, I posted a thread wondering what in the world was going on with gpt-4 completions.
After the lion’s share of replies told me I was wrong and needed to improve my prompts, OpenAI acknowledged the “laziness” about a week later, released the gpt-4-turbo model, and subsequently released the gpt-4-0125-preview model to address various concerns.
Here I am once again, wondering what in the world has happened in the last week to month that the quality of the API completions has suddenly tanked yet again. This time it’s clearly the worst it’s EVER been, for coding and for everything else.
For coding specifically, the code I’m getting is essentially scrap. All of the gpt-4 models are, as far as I’m concerned, close to the point of being outright broken.
It modifies my import statements with no outside knowledge or reference, inexplicably breaking them. When troubleshooting bugs, new bugs WILL be introduced, invariably, or the original issue simply won’t be addressed, at a much higher rate than in the past. If I say “Don’t do X”, I have about 5% faith that it will actually not do X. The gpt-4-1106-preview model in particular is utterly bad, while the base gpt-4 model is pretty bad and, in my opinion, shows the most notable performance decrease. I can no longer rely on implied bits of information in my prompts; everything has to be spelled out completely literally, step by step, with nothing left to inference.
Overall? It just gets things SO wrong, much more frequently than I’ve ever seen. Outside of the pre-March-2023 era, this is the dumbest I’ve ever seen GPT.
Interestingly, completions from the heavyweight gpt-4 model (gpt-4-0613) are now considerably faster than from gpt-4-1106-preview, which is extremely odd and didn’t used to be the case. I don’t know if this is just my experience or not. I almost thought it was a rendering problem, but I’ve determined it definitely isn’t; the models really are returning chunks at different speeds.
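For anyone who wants to check this on their own account, here is roughly how I measured it. This is a minimal sketch assuming the current openai Python SDK (v1-style client); the prompt and the chunk counting are purely illustrative, not anything official.

```python
import time
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def time_stream(model: str, prompt: str) -> None:
    """Stream a completion and report time-to-first-chunk and total time."""
    start = time.perf_counter()
    first_chunk_at = None
    n_chunks = 0

    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        if first_chunk_at is None:
            first_chunk_at = time.perf_counter()
        n_chunks += 1

    total = time.perf_counter() - start
    print(f"{model}: first chunk after {first_chunk_at - start:.2f}s, "
          f"{n_chunks} chunks in {total:.2f}s")

# Illustrative prompt; any identical prompt works for comparing both models.
prompt = "Write a Python function that merges two sorted lists."
for model in ("gpt-4-0613", "gpt-4-1106-preview"):
    time_stream(model, prompt)
```

Timing the raw stream like this never touches a UI, which is how I ruled out rendering as the cause.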
I don’t know what to do or think or… yeah. I’m at a loss. I’m wondering if this is just me “again”, or if anyone else has suddenly been experiencing major quality issues over the last one to four weeks.
Is this all perhaps a gear-up for a gpt-5 release, with compute for completions being squeezed to make room for a massive model swap?