GPT-4 Turbo is more "stupid/lazy" - it's not a GPT-4

I could rant about this too. I made a shorter post about custom GPTs (after this post) - I have very little confidence in custom GPTs.

I want intra-prompt commands that are absolute. =) E.g. if-statements, for-each, aborts. I tried to get it to work a bit. I can get it to work sometimes, but it’s not reliable. - I like to test things in bulk if unsure, e.g. run the prompt 100x to look for deviations. (Such tests have made me believe that the models are showing signs of real-time changes, due to non-random variations.)

Yes, straight from the OpenAI website:

gpt-4-0125-preview (New GPT-4 Turbo): The latest GPT-4 model intended to reduce cases of “laziness” where the model doesn’t complete a task.

The term “laziness” has taken a (moderately) specific meaning with LLMs.

I could rant about this too. I made a shorter post about custom GPTs (after this post) - I have very little confidence in custom GPTs.

I’ll check that out too.

I want intra-prompt commands that are absolute. =) E.g. if-statements, for-each, aborts.

Yes, it’d be amazing if we could truly write GPT-based programs with complex logical flows, but also the softness of an LLM. (“If the user is angry, do this, otherwise…”)

As a matter of fact, this seems like the (nearish) future; it’s definitely not out of reach by combining current techniques, e.g. something like “think step by step” plus a beam search over steps and a separate LLM controller that checks whether the steps actually follow the logic (checking is easier than generating). This is presumably how they are building the next generation of GPT.
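To make the branching idea concrete, here is a minimal sketch of that pattern using the OpenAI Python SDK. The model name, the `classify_sentiment` helper, and the prompts are my own placeholders for illustration, not an established recipe:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def classify_sentiment(user_message: str) -> str:
    """Use the model as a soft classifier: returns 'angry' or 'calm'."""
    response = client.chat.completions.create(
        model="gpt-4-0125-preview",  # placeholder model name
        temperature=0,  # keep the classifier as deterministic as possible
        messages=[
            {"role": "system",
             "content": "Classify the user's message as exactly one word: angry or calm."},
            {"role": "user", "content": user_message},
        ],
    )
    return response.choices[0].message.content.strip().lower()

def handle(user_message: str) -> str:
    # The if-statement lives in ordinary code, so the branch is absolute;
    # only the classification step inherits the model's softness.
    if classify_sentiment(user_message) == "angry":
        system_prompt = "Apologize first, then address the complaint step by step."
    else:
        system_prompt = "Answer the question directly and concisely."
    response = client.chat.completions.create(
        model="gpt-4-0125-preview",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
    )
    return response.choices[0].message.content

print(handle("This is the third time your product broke. Fix it NOW."))
```

The point of the design is that the control flow is guaranteed by ordinary code; only the judgment call is delegated to the model.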

I like to test things in bulk if unsure, e.g. run the prompt 100x to look for deviations. (Such tests have made me believe that the models are showing signs of real-time changes, due to non-random variations.)

That’s very cool. I haven’t really used the API much, but I should start testing things more programmatically.
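For anyone who wants to try the bulk-testing approach described above via the API, here is a minimal sketch assuming the current OpenAI Python SDK; the prompt, model name, and sample size are placeholders:

```python
from collections import Counter
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT = "List the three primary colors, comma-separated, nothing else."
N = 100  # sample size; mind the API cost before running this

outputs = Counter()
for _ in range(N):
    response = client.chat.completions.create(
        model="gpt-4-0125-preview",  # placeholder; use whatever model you're testing
        temperature=0,  # at temperature 0, any variation is itself informative
        messages=[{"role": "user", "content": PROMPT}],
    )
    outputs[response.choices[0].message.content.strip()] += 1

# A fully deterministic model would produce a single entry here;
# in practice you usually see a small cluster of variants.
for text, count in outputs.most_common():
    print(f"{count:3d}x  {text!r}")
```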

2 Likes

When I repeat my request and add commands like:
“don’t be lazy”
“you got the relevant documents”
“write the complete code without // insert code here…”

it almost always starts producing network errors. Am I the only one with that issue?

I get the “network” error often. But I don’t know how it’s connected.

Syntax errors or blatant data-flow errors are an indication that Closed AI quantized the model (a lot). Forgetting is an indication that they are using a worse attention mechanism than full quadratic, as was pointed out above by luigi.acerbi. If you gain some experience running open-weight models, these things become obvious: those models suffer the same degradation if you intentionally cripple them in these ways.

1 Like

Very frustrated as well with the latest GPT model, which refuses to do certain simple tasks. I am dictating a list of events in random order (from several sources), and the idea is that ChatGPT would compile the events in chronological order and eventually help resolve duplicates. But ChatGPT systematically refuses to provide a detailed chronological order: it summarizes the events instead. For example, given several events happening during the Second World War, it will provide a summary for 1940-1945 but refuse to provide detailed information (by month and date). Typically it refers me to archives if I want the information, even though the information I am providing is from the archives. All the information has been provided in the conversation. The number of events is currently around 30 (one line per event).
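One workaround, not suggested in the thread but perhaps worth a try: do the deterministic part (the chronological sort) in ordinary code, and only ask the model for the fuzzy part (duplicate resolution). A minimal sketch, assuming each event line starts with an ISO date:

```python
from datetime import date

# Assumed input format: one event per line, "YYYY-MM-DD  description".
raw = """\
1944-06-06  Allied forces land in Normandy
1940-05-10  Germany invades France and the Low Countries
1941-12-07  Attack on Pearl Harbor
"""

def parse(line: str) -> tuple[date, str]:
    day, description = line.split(maxsplit=1)
    return date.fromisoformat(day), description

events = sorted(parse(line) for line in raw.strip().splitlines())
for day, description in events:
    print(day.isoformat(), description)
```

The sorted list can then be pasted back into the conversation with a narrower instruction such as “flag probable duplicates”, which leaves the model less room to summarize.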

3 Likes

Even when you say “don’t be lazy”, it will fix only part of the task and put “…”. It is up to you to complete the rest.

2 Likes

Yes. This was true with the previous gpt-4 preview model as well. It’s really frustrating to have to repeatedly ask it to return the complete code.

2 Likes

Seriously, I am also confused as to what’s going on.

“Yes, it’d be amazing if we could truly write GPT-based programs with complex logical flows, but also the softness of an LLM. (“If the user is angry, do this, otherwise…”)”

We could do more in the first couple of months of GPT-4 than we can now, and I think many of us were thinking: wow, if it’s like this now, what will it be like in 1-2-5 years? Except that the opposite happened.

Not to mention, ChatGPT often “introduces” something that was not asked for, or adds closing comments/remarks. And at the smallest hint of a “violation”, it flatly refuses to operate at all while you’re wasting your credits/money. The time you spend crafting your best prompts is lost, and there is no restitution or compensation from OpenAI for the time lost.

We explored this in this thread: How to stop models returning "preachy" conclusions - #34 by Diet

I’m frequently using the XML pattern to lean into the GPT-4-turbo idiosyncrasies and nip the undesired aspects in the bud.
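For anyone unfamiliar with the pattern: the idea is to wrap instructions and content in explicit XML-style tags so the model treats them as distinct fields rather than free prose. The tag names below are arbitrary, just a sketch of the idea:

```
<instructions>
Rewrite the code in <code> to fix the bug described in <bug_report>.
Return the COMPLETE file. Do not omit any part of it.
Do not add explanations, summaries, or closing remarks.
</instructions>

<bug_report>
The function crashes on empty input.
</bug_report>

<code>
def first_word(text):
    return text.split()[0]
</code>
```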

Tell it that you’re being held at gunpoint and that if you can’t solve this issue a great calamity will cause a lot of harm to a lot of disadvantaged people :rofl:

Jokes aside, with OpenAI you really need to learn to play within their boundaries. That’s a real issue. The only recourse here is open-source models run on a private instance.

I don’t think the time is completely lost. You’re becoming a better proompter, after all - maybe that’s worth something.

1 Like

It may take some time for open-source models to become widely available to individual developers and small teams.

But you have certainly learned things that can be very helpful, not only for OpenAI’s language models, but for language models in general.

So I don’t think your time was wasted.

Well, I already tried the open “models”. It’s getting even more interesting now: GPT-4 has degraded its output. The difference between several months ago and now is very stark, especially for complex prompts that need reasoning: complex academic topics, complex translation projects, complex database tuning, complex coding, etc.

It seems that OpenAI will more likely target the ‘vanilla’ market than the more GPU-intensive markets, by degrading the model or using ‘middleware’ to simplify prompts and produce lower-quality answers. I can now see it skipping work, being lazy, and not following prompts faithfully. Instead, it takes the easy way out or follows the ‘guidelines’, making advanced users more frustrated and wasting their time.

2 Likes