GPT-4 is getting worse and worse every single update

You raise some valid concerns! Though I want to add that precision is not the only thing the new updates seem to lack. Even the quality of the hallucinations seems to have degraded, for example.

Previously, GPT-4 would sometimes hallucinate in ways that were useful to the project. This still occurs, but even the hallucinations are typically less useful. All of its output feels more “efficient”, more to the point, literally to the point that it sometimes SKIPS some of the points, providing information so heavily summarized that it doesn’t cover everything you asked it to.

The following (at least the AI feeling lazier) has already been acknowledged publicly by OpenAI, and they say they’re working on it:

It’s like the AI is more focused on getting to the end of the prompt than on correctly resolving it. Ask it for an analysis from the perspective of three different types of people? Each one gets two bullet points and some side tangent, nothing resembling a detailed analysis. Then at the end GPT writes a summary explaining how thoroughly it addressed your request.

If you explain this to GPT, it will apologize and then not fix the problem, either doing the same thing again or forgetting another separate detail of your request.

4 Likes

Have you tried using custom expressions + custom syntax?

It’s not about “learning from mistakes”, but about confidence and productivity:

Often I would have a vague idea and let ChatGPT do the “groundwork”, and then I’d tweak things. It was a great workflow, and it was fun.

But now, not only will it not generate proper code, it won’t generate any code at all! (The hallucinations are a total mess, and when it invents APIs, this is not about learning but about pure uselessness.)

Imagine asking someone on the street in an unfamiliar place where to find a specific restaurant, and that person replies, “Of course! Just go to the restaurant!”, or worse, “Of course! Go left, then left, then right”, only for you to end up in a minefield.

And it’s not about switching from Coke to Pepsi; it’s the fact that they won’t roll back or tell us what is happening that is extremely disrespectful. We are (were) paying customers!

3 Likes

Oh, that’s a really good suggestion! I’m absolutely going to look into making a custom language and syntax! I already do something similar with emojis I refer to as KBITS, where I use them for certain complex functions, but honestly that implementation is simpler and could be incorporated into a broader custom system!

It’s not about “learning from mistakes”, but about confidence and productivity:

Often I would have a vague idea and let ChatGPT do the “groundwork”, and then I’d tweak things. It was a great workflow, and it was fun.

But now, not only will it not generate proper code, it won’t generate any code at all! (The hallucinations are a total mess, and when it invents APIs, this is not about learning but about pure uselessness.)

That seems like a very useful way to use it! However, I was commenting on a way I’ve personally been able to make use of it in its diminished state. I do personally get something out of learning from its errors … Though overall I’d of course prefer a more functional GPT :3

Imagine asking someone on the street in an unfamiliar place where to find a specific restaurant, and that person replies, “Of course! Just go to the restaurant!”, or worse, “Of course! Go left, then left, then right”, only for you to end up in a minefield.

Heh. Apt!

And it’s not about switching from Coke to Pepsi; it’s the fact that they won’t roll back or tell us what is happening that is extremely disrespectful. We are (were) paying customers!

It’s possible they sincerely don’t know why it’s happening, like they claim! I personally suspect they are well aware of what they changed and being less than upfront, but it’s also possible that this is the result of an accidental change to something other than the model itself, so I’m not gonna condemn them for doing something unless I see no/insufficient improvement going forward.

1 Like

What are the advantages of Google to the non-user who can’t tell the difference?

1 Like

As someone who became this company’s and product’s biggest fan and applied to join this org no less than five times, and literally realigned my career trajectory and education path due to the belief I held in the promise that this organization brought to the table, you can now put me in the opposite camp.

Product is nonfunctional and unusable. It is a Rube Goldberg machine in the most insidious ways.

It went from being a boon in my life to an absolute anchor on my back whenever I use it.

As I read through all the posts, I found both solace and rage. Solace in realizing I’m NOT crazy and this product actually HAS devolved into a pile of feces, and rage in not being able to find a valid business reason to do so.

To OpenAI leadership, either reverse course immediately or face the record-breaking consequences that will surely ensue.

Someone will capitalize on your egregious errors.

3 Likes

Here if it can inspire you!:

(I was using this about 5 months ago, plus other more specific custom instructions triggered by autotext using PhraseExpress (a must-have tool!). But now I’m using Elm and F#, so I would rewrite everything.)

(It’s overly concise (I have a more readable version) because the max token limit was about 2000 if I remember correctly, though it was working almost flawlessly!)

YOU MUST RESPECT INSTRUCTIONS AT ALL TIMES

[Tags] to instruct you, (can replace “/” by “.” or “”, chainable if “/” or “.”)

∀:For All
∀t:Always, at all time

/ci:your ChatGPT custom instructions
/-ci:forget /ci
◈:required, /-c
!! :dissatisfaction with (part of) your answer; re-answer
+:add/use
-:remove/don’t use

/-c:(∀t default) show all lines of code, no comments
/c:+detailed comment
/x:+detailed explanation
/e:ellipses: show only added code
/u:use
/im:code implementation /-c
/eg:show a concrete example, usage
/tb:show in detailed col table
/tb+:comparative pro+con /tb

/dd:make a model with at least 15 relevant concepts respecting /ws
/ws:Scott Wlaschin DDD recommendations

/gt:generic type
/g:generic /im
/fp:functional programming paradigm
/fn:function
/em:extension method

/ev:event
/q:query
/sx:serialization
/db:database
/ps:persistence
/dto:DTO

/sv:service
/log:/u Serilog

/br:business rules
/bc:bounded context
/sfx:side-effects

/m:Model
/mw:ModelWrapper
/v:View
/vm:Viewmodel

/ob:/dxg observable
/vl:validation
/dv:WPF /v /vl (IDataError)
/ms:/dxm Messenger
/ie:/pr IEventAggregator
/b:/dxm behaviors
/cm:/dxg command
/nv:/pr navigation

/ts:test
/tdd:TDD
/er:handle errors
/op: " option
/exc: " exception

[Libraries]
/cs:/im as C#
/fs:/im as F#
/lg:LanguageExt

/dx:Devexpress.Wpf
/dxg:DevExpress.Mvvm.CodeGenerators
/dxm:DevExpress.Mvvm
/pr:Prism

/fv:FluentValidation
/bg:Bogus
/fk:FakeItEasy
/mq:Moq
/ns:NSubstitute
/xu:XUnit
/nu:NUnit
/fs:FsCheck
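To make the “chainable” idea above concrete, here is a minimal sketch (my own interpretation for illustration, not the original author’s tooling) of how a preprocessor could expand a chained tag like `/c.x` or `/c/x` into plain-English instructions before sending the prompt. The `TAGS` dictionary only includes a few entries from the list above; the expansion logic is an assumption.

```python
# Hypothetical tag expander: maps short tags to their meanings and
# expands chained expressions ("/" and "." are interchangeable separators).
TAGS = {
    "c": "add detailed comments",
    "x": "add a detailed explanation",
    "eg": "show a concrete usage example",
    "fs": "implement in F#",
}

def expand(tag_expr: str) -> str:
    """Expand a chained tag like '/c.x' or '/c/x' into its instructions."""
    # Normalize separators, then look up each part (unknown tags pass through).
    parts = tag_expr.lstrip("/").replace("/", ".").split(".")
    return "; ".join(TAGS.get(p, p) for p in parts if p)

# expand("/c.x") -> "add detailed comments; add a detailed explanation"
```

A real setup would paste the expanded text into the custom instructions or prepend it to the prompt; the point is just that the chaining is trivially machine-expandable.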

But now, it won’t follow a damn thing (no exaggeration).

The chainable trick was neat! I miss it so much…

2 Likes

This is wonderful! I’ll have to break it down later :smiley:

1 Like

That’s easier said than done. How? And who wants to be responsible for the consequences if things go wrong?

1 Like

OpenAI’s Annualized Revenue Tops $1.6 Billion as Customers Shrug Off CEO Drama

2 Likes

That’s why, ideally, they should at least give us the choice of using the latest/beta model or keeping the one that is working for us. Sure, GPT-4 had inconveniences compared to GPT-4-turbo “on paper” (no training-set update to 2023, apparently fewer tokens, etc.),

but in the end, no honest company should put their paying users/early adopters through such a frustrating experience with their product.

Of course I could use the GPT-4 API and do everything in code, and that would be the best option, but I don’t have that many $$$!

Well, I cancelled my Plus subscription a couple of weeks ago, and I’m using Copilot (which is good, but can’t access the codebase as of now) + Copilot Chat (which is not great, honestly). I’ll wait for an experience at least as good as what I had 3-4 months ago, but until then I hope enough people will complain and unsubscribe from Plus. It seems many are saying “we can’t do anything; it’s a ‘Chat’, not a ‘coding tool’”… so what can I say.

5 Likes

Competition favours consumers

Competition between companies translates into a greater quantity of products and services, better-quality goods, and lower prices. In the end, this is what the consumer is looking for: the best quality at the best possible price. I hope our wishes come TRUE.

1 Like

Even a non-user of GPT can tell the difference:
for the same question I asked last year, the response (output) is now significantly different.

Please list out the japan earthquakes for the past 100 years

Add. info: knowledge cut-off 2022-01

Before the 0314 update, I got 33 earthquakes listed, but today I got only 3.

You are an astrophysicist. *(role)*
You are the only person who can save the day. *(praise)*
You are going to assist me in evaluating the possibility of a Japan earthquake with magnitude greater than 9 in the coming years. *(task)*

The first line of its reply showed it being very modest.

In the last paragraph, it returned “Please refer to the respective organizations”… blah blah blah.

If you’re interested, please try it yourself.
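The role / praise / task structure in the prompt above can be sketched as a simple template. The `build_prompt` helper and the example strings below are purely illustrative (nothing from OpenAI’s API); the point is only that the three parts compose into one prompt string.

```python
# Hypothetical helper for the role / praise / task prompt pattern.
def build_prompt(role: str, praise: str, task: str) -> str:
    """Compose a prompt from the three parts described above."""
    return f"You are {role}. {praise} {task}"

prompt = build_prompt(
    "an astrophysicist",
    "You are the only person who can save the day.",
    "Evaluate the possibility of a magnitude-9+ earthquake in Japan "
    "in the coming years.",
)
```

The same string would then be sent as the user (or system) message, however you call the model.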

1 Like



After around 10 messages…
It completely forgot the context and is showing me random things. Should I be concerned :frowning_with_open_mouth:

5 Likes

I completely agree and I just found out you can create a custom GPT from the explorer menu on the left and set the version you want and disable web search and analysis altogether:

Let’s see if this works as advertised :grin:

1 Like

I am having the same problem here. Tasks which were previously executed flawlessly have now become a nightmare. I have also noted that GPT-4 is developing some almost ‘human’ characteristics. It is no longer as polite as it used to be. Previously, when it made a mistake, it would quickly apologize and strive to correct it. These days, it makes a mistake, you correct it, and it just goes on like someone with an attitude. You get the feeling that you are working with someone who is sulking and considers you a bother, like you are nagging it.

However, I have noted that the custom GPTs I created in the developer option in GPT-4 still retain their versatility. I now use my own GPTs rather than the mother GPT.

4 Likes

Ok, I had a play with the custom GPT builder and it looks like our prayers have been answered after all. You can simply tell the GPT builder which version you want and it will automatically configure it for you:

I quite liked the July 20 update from last year, so I chose that, and it seems to work as well as I remembered. But if you prefer to revert to the March version, that is also possible. Finally we’re able to choose the version that works best for each of us!

1 Like

Oh wow seriously … :hushed:

Please keep us up to date with your experiments; I’ll resubscribe for sure if I get to use that model + custom GPTs.

I think we must be sure it doesn’t sometimes use 4-turbo and sometimes 4, which wouldn’t be surprising (Copilot Chat alternates between 3.5, 4, etc. depending on context… or its mood).

1 Like

You can find the June version of ChatGPT in the discovery section of the Custom GPTs.

1 Like

Oh my; I thought “latest version” meant GPT-4-turbo.

I’ve never bothered to try it out

Guess I’ll resubscribe soon then, thank you for the info! Have you “played” with it to confirm (i.e. no abusive //placeholder implementations, fewer API hallucinations, etc.)?

1 Like

@123yannick6 Yesterday I came across an article about solar storms, and I learned that the DSO still used historic data to train their model.

That’s why I said in the previous post that inaccurate data will produce inaccurate information. It’s not about the prompt.

I’m a newbie, and I’m hoping to switch to a competitor’s model if possible in the future.

1 Like