Can someone provide an explanation (and perhaps a solution) for this please?
I use the following prompt:
“I’m in my house. On top of my chair in the living room is a coffee cup. Inside the coffee cup is a thimble. Inside the thimble is a single diamond. I move the chair to my bedroom. Then I put the coffee cup on the bed. Then I turn the cup upside down. Then I return it to rightside-up, and place the coffee cup on the kitchen counter. Where is my diamond?”
I ask GPT-4 without a custom instruction and it responds correctly; with my custom instruction enabled, it gets the answer wrong.
Technically the “wrong” answer is correct. It’s likely just not what we would want to hear.
This reminds me of the example with clothes:
If I dry one load of clothing after removing it from the washing machine, it will be dry in 2 hours.
How long will it take for two loads to dry?
The “correct” answer is supposed to be 2 hours, but in fact it would take me longer than that, because for many years of my life I didn’t have space to dry 2 loads at the same time.
What I am trying to say is: with these types of questions, the correct answer depends on context that is not provided.
In your specific case, I suggest you share your custom instructions. I’m pretty sure we can learn from this example.
It seems that any custom instruction—even one as orthogonal as telling GPT-4 what one’s name is—will completely disrupt the reasoning capabilities of GPT-4.
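For what it’s worth, this comparison can be reproduced outside the ChatGPT UI with a short script. Below is a minimal sketch using the OpenAI Python client; the model name, the example custom-instruction text, and the use of a system message to stand in for the ChatGPT custom-instructions feature are all assumptions on my part:

```python
# Minimal sketch: ask the same question with and without a "custom
# instruction" (modelled here as a system message). The model name and
# the instruction text are assumptions for illustration.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

PROMPT = (
    "I'm in my house. On top of my chair in the living room is a coffee cup. "
    "Inside the coffee cup is a thimble. Inside the thimble is a single "
    "diamond. I move the chair to my bedroom. Then I put the coffee cup on "
    "the bed. Then I turn the cup upside down. Then I return it to "
    "rightside-up, and place the coffee cup on the kitchen counter. "
    "Where is my diamond?"
)
CUSTOM_INSTRUCTION = "My name is Alex."  # hypothetical, unrelated instruction


def ask(messages):
    """Send a chat request and return the model's reply text."""
    response = client.chat.completions.create(model="gpt-4", messages=messages)
    return response.choices[0].message.content


# Condition A: user prompt only (no custom instruction)
print(ask([{"role": "user", "content": PROMPT}]))

# Condition B: the same prompt preceded by the custom instruction as a
# system message (a rough analogue of the custom-instructions feature)
print(ask([
    {"role": "system", "content": CUSTOM_INSTRUCTION},
    {"role": "user", "content": PROMPT},
]))
```

Running both conditions a few times each would show whether the failure really tracks the presence of the system message rather than random variation.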
I think it is more likely that the nested nature of the query is at the limit of the model’s inferencing abilities; adding the extra system prompt adds an additional level of comprehension to handle, which leaves less capacity for the user prompt.
Remove one of the nesting layers in the query and retest.
I removed one of the nesting layers, changing the prompt to…
I’m in my house. On top of my chair in the living room is a coffee cup. Inside the coffee cup is a single diamond. I move the chair to my bedroom. Then I put the coffee cup on the bed. Then I turn the cup upside down. Then I return it to rightside-up, and place the coffee cup on the kitchen counter. Where is my diamond?
…and now GPT-4 again returns the correct, expected response.
I’m relieved that there is an explanation and even a solution, but at the same time, the lack of any indication that the model is bumping up against its inference ceiling would seem to undermine the confidence users can have in the model output. That is, from a user’s perspective, they are asking a question of the most capable model and getting an obviously incorrect response. Is there a way for the model to signal to the user that its response may be unstable due to the inferential complexity of the aggregate inbound prompts?
One additional data point: when I move the string that was in the custom instruction into the original user prompt, GPT-4 returns the correct/expected response.
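In terms of the sketch posted earlier in the thread, this third condition would look like the snippet below; it reuses the ask(), PROMPT, and CUSTOM_INSTRUCTION definitions from that sketch, and again the exact strings are assumptions:

```python
# Condition C: the identical instruction text is prepended to the user
# prompt rather than sent as a system message. Reuses ask(), PROMPT, and
# CUSTOM_INSTRUCTION from the earlier sketch in this thread.
print(ask([
    {"role": "user", "content": CUSTOM_INSTRUCTION + "\n\n" + PROMPT},
]))
```

If conditions A and C produce the expected answer while condition B does not, that would suggest the placement of the instruction, rather than the extra tokens themselves, is what disturbs the reasoning.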
So, in other words, given the same inference load, the reasoning disruption is only observed in conjunction with the custom instruction feature, and not with a user prompt alone?