In scenario 3, I’ve often seen the model confuse itself:
After the follow-up prompt, the model might say "Apologies for the confusion in my previous response."
The model will then sometimes rehash the entire process from the start.
When it comes to concluding, you might get completely unreliable results, because the model may start mixing and matching incongruent points from two different reasoning processes. This happens when you have similar sequences in a long context.
This can also happen in scenario 2, but there it’s easier to get the model to accept the reasoning at face value.
Found an example of how including the line of reasoning in the prompt increases output quality.
Note that the models have improved over time, and some of the examples would today be answered correctly without additional help. But the general conclusion still holds:
instructing the model exactly how to reason about a specific type of question will improve the output for that use case.
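To illustrate the idea (with hypothetical wording, using the classic bat-and-ball question as a stand-in for the kind of question that used to trip models up):

```python
# A minimal sketch: the same question asked plainly vs. with explicit
# instructions on exactly how to reason about it. The question and the
# reasoning recipe are illustrative, not from the example I found.
question = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more "
    "than the ball. How much does the ball cost?"
)

plain_prompt = question

guided_prompt = (
    question
    + "\n\nReason as follows before answering:\n"
    + "1. Write the two statements as equations.\n"
    + "2. Solve the equations step by step.\n"
    + "3. Only then state the final answer.\n"
)

print(guided_prompt)
```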
Also, all of this is happening inside a single prompt-reply conversational turn.
I can’t really tell you about the source of the reasoning in this scenario, but I’ll still try to shed some light on the motivation for my question:
Within my domain, I have access to this mysterious, cheap way of acquiring gold-standard reasoning r* for any p. However, I don’t have a direct source for the corresponding output, o.
My goal is to fine-tune a model that can efficiently transition from p to o, utilizing this r*.
If the output quality of ‘p r → o’ and ‘p → r o’ is similar, I can train my model to take r* as part of its input.
Otherwise, I would have to train my model to approximate r* in its output and then generate o. That would yield both lower-quality r (and therefore lower-quality o) and higher costs, because r would then be output tokens rather than input tokens, which are cheaper.
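For concreteness, here is a minimal sketch of the two training layouts, assuming an OpenAI-style chat-format JSONL; the placeholder strings and message layout are illustrative, not my actual data:

```python
import json

# Placeholders for my actual domain data.
p = "example prompt"
r_star = "gold-standard reasoning acquired cheaply"
o = "desired output"

# Layout A, 'p r* -> o': the reasoning is supplied as input tokens,
# so the model only has to learn to produce o.
example_a = {"messages": [
    {"role": "user", "content": f"{p}\n\nReasoning:\n{r_star}"},
    {"role": "assistant", "content": o},
]}

# Layout B, 'p -> r o': the model must approximate r* as output tokens
# before emitting o; any drift in r degrades o, and the r tokens are
# billed at output rates.
example_b = {"messages": [
    {"role": "user", "content": p},
    {"role": "assistant", "content": f"{r_star}\n\n{o}"},
]}

print(json.dumps(example_a))
print(json.dumps(example_b))
```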
–
My scenario is most analogous to case 2 in your diagram, except that the reasoning comes from a different source. Since I am fine-tuning a model, I should be able to prevent it from rehashing the entire process!