There are already topics about GPT-4 becoming worse, but the massive decrease in reply quality over the last year has led me to think there might be more to it than some “laziness”. By now I get the impression that GPT-4 is actively being lobotomized by more and more restrictions and caps, which OpenAI seems to want to hide under the guise of “laziness”. And there are good reasons for that: server load, AND maybe GPT-4 was simply TOO good at release for proper monetization. My guess is we will see a GPT-5 or GPT-4.5 announcement soon that will involve higher prices, and this version will magically gain back all the stuff GPT-4 could do perfectly fine last year but can’t do now.

Don’t get me wrong, I don’t want to spread conspiracies here. I am just absolutely confused by the massive decrease in GPT-4’s quality since last autumn, and OpenAI has not yet explained what is happening, except for their comment on “laziness”. I am, or was, an OpenAI evangelist, always telling everyone how awesome GPT is and how to use it. But by now I don’t recommend it anymore, because it has become really hard to use efficiently. What is happening there, and why?
I want to emphasize that I am giving GPT-4 almost exactly the same tasks as a year ago. I would even argue that the tasks I gave it this year are easier to solve, since I was working on a complex project last year. Meaning: neither my input nor the tasks / code itself has changed in any way, but the results have. A lot. And I’d go as far as saying it is no surprise that people have been posting the same “recently I’ve experienced a drop in output quality” complaints ever since last summer, because the quality of GPT-4 has indeed become worse with each consecutive month. Those user reports were true last summer, and they are true today: worse answers to the same questions asked before. Which is why I get the feeling this is being done on purpose by now.
When I was trying out GPT-4 Turbo in the API, half of my prompts completely stopped working, and “I’m unable to fulfill this request.” was the only answer I got. Asking why leads to the same answer. It seems some prompts include something that gets interpreted as offensive - and my prompts are only for work, and only for frontend development. So the restrictions seem to be so harsh that even the slightest hint of anything possibly offensive gets hard-blocked. Which would be fine if it weren’t triggered by things so minuscule that it’s almost impossible to find out what was “wrong” in a long prompt.
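For anyone who wants to see what I mean: here is a minimal sketch of the kind of call I am talking about, using the official openai Python client (v1.x). The model alias and the prompt are placeholders, not one of my actual work prompts, so treat this as an illustration of the pattern rather than an exact repro.

```python
# Minimal sketch of a plain frontend-development prompt via the API.
# Model alias and prompt text are placeholders for illustration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-turbo-preview",  # the 4-turbo variant I was testing
    messages=[
        {
            "role": "user",
            "content": "Refactor this CSS grid layout so the sidebar "
                       "collapses below 768px.",
        }
    ],
)

print(response.choices[0].message.content)
# Expected: an actual refactoring suggestion.
# What I got for about half my prompts:
# "I'm unable to fulfill this request."
```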
Oh, and it gets worse: all GPT-4 models, no matter which, still massively suffer from:

a) the laziness problem introduced last autumn, but also
b) the apparently “new” approach by OpenAI to limit server load, resulting in placeholders, omissions, and straight-up dementia, where after only 2 messages clear and simple instructions are forgotten and ignored, AND also
c) the absolutely contradictory behaviour where GPT-4 goes on explaining every minuscule detail of how it is going to approach the task while not actually DOING the task. Which is then followed by b) if you ask it to do what it just unnecessarily explained to you.
Then there is the absolutely useless wall of text GPT-4 now answers to every simple question, wasting many, many tokens. GPT-4 now spends at least half, sometimes most, of its tokens on unnecessarily:

1. repeating everything I said,
2. telling me that it is now going to think about a solution for my problem,
3. telling me how it will approach finding a solution for my problem,

and only THEN does it MAYBE get to 4. and start actually solving my problem. Most of the time it just stops after 3., having wasted a whole lot of tokens.
One could even get the impression OpenAI is trying to make people use as many tokens as possible while also reducing the usefulness of the elongated answers, so you have to ask over and over again to get your result - using up even more tokens, of course. Oh, how I wonder what the rea$oning behind such methods could be …