When processing a text: prompt before it or after it?


When you’re processing a text (making an excerpt, summary or whatnot, or making changes in program code) you could add the actual prompt that tells ChatGPT what to do before the text, like

Please make a summary of the following text:

or after the text:

Please make a summary of this text.

Is there any reason to prefer one over the other? Both of them mostly seem to work, but there might be differences in performance which I couldn’t spot without serious research, or other more subtle differences. I’m interested in both the gpt-3.5-turbo and gpt-4 models.
Thanks so much!



Interesting point, I think the first one more rely on the instruction and the second one more on the context. I need to run a test and evaluation to figure this out.

I run a small experiment the first one will work better. (3.5-turbo) 50 sampling, evaluation with 3.5-trubo itself.


Here is what GPT-3.5 has to say:

Both placing the prompt before or after the text can be effective in guiding ChatGPT to perform the desired task. However, there are some subtle differences to consider when choosing the prompt placement.

For GPT-3.5-turbo and GPT-4 models, here are a few factors to consider:

  1. Contextual understanding: GPT models consider the context provided in the prompt to generate relevant responses. Placing the prompt before the text allows the model to immediately understand the task and frame its response accordingly. This can help ensure that the model’s understanding of the task is clear from the beginning.
  2. Prompt length: GPT models have a maximum token limit, and longer prompts may get truncated or require additional tokens to fit within the model’s capacity. Placing the prompt after the text could potentially lead to longer overall input, especially if the input text is lengthy. In such cases, placing the prompt before the text may be advantageous as it allows for a shorter and more concise prompt.
  3. Task clarity: Placing the prompt before the text provides a clear instruction to the model about the intended task. It sets the context and expectation for the response. On the other hand, placing the prompt after the text might require the model to make more assumptions about the desired task, which could potentially introduce ambiguity.

It’s worth noting that the GPT models are trained on a mixture of supervised and unsupervised learning, and their performance can vary based on the specific use case. While both prompt placements can work, the best approach may depend on the particular task, input length, and desired outcome. Experimenting with different prompt placements is recommended to determine the optimal configuration for your specific requirements.

this is not a solution but a suggestion.

Unless your instruction prompt is too long, there should not be any difference in both scenarios.
However, if incase its too long, i think putting it in the end will help get better results (proper instruction following). This is just based on the fact that, as the input is read sequentially, the text read recently will have more effect/impact.

1 Like

@shatzakis Was that the output of ChatGPT itself, not some external information about it? I’m wondering how much we can trust that - especially with the knowledge cutoff being before the current versions actually existed. :smile: It sounds like they trained it later with some information, though. But 2. sounds very strange and confusing to me, and 3. is also hard to fit into my intuition.

But, right, I guess the general answer to my question is “it depends” and “try it out on your task”, even if that’s not too satisfying. Nice that @kevin6 did some experiments!

I’m wondering about one thing, tough. It is obvious that the output is rendered sequentially, but what about the input? @aman.rai said in his reply that the input is read sequentially, but I’m not sure about that. I have to admit I didn’t really dive in to understand the transformer architecture yet, but I’d rather expect that the whole input (context) is given at the same time to the network. As far as I know there is this attention mechanism that can focus on various aspects during the computation of the next answer token(s). Andrei Karpathy said in his highly interesting and recommendable talk “State of ChatGPT” that the context is the working memory which it can directly access instantaneously anytime - which sounds like that to me. Did anybody really study how that works? If my mental image from that is correct, it would rather depend on what the network has learned whether it’s better to put the prompt at the beginning or end (or even split it into both parts). Whether that matches what a human would do might be very questionable.

Great points, as yes that was the output of Chat-GPT and its knowledge is limited to cutoff data. Regarding sequence, I am not sure, I thought with parallel processing it is read all at the same time?

@hpstoerr i think what you said about attention and parallel input makes more sense

Best to add the prompt at the beginning, followed by examples, and then your task.


How an expert summarizes a (insert a description of the type of text to summarize here)

Example 1:
(insert worked example here inc summary)

Example 2:
(insert worked example here inc summary)

Example 3:
Text to summarize:(Insert your text here)


note: This strategy ensures you get exactly what you want.

Not the most technical answer but based on working with the API as a part of the pipeline, the instructions that I provided for the generation seem to permeate better when the text is provided after the the set of instructions.

This has also lead to a decrease in the amount of hallucinations for the generated text.

This is for gpt-3.5-turbo though.

I usually find the first technique rather useful with large prompts

It is not read sequentially. That is the most important feature of transformer architectures like GPT

I’m not sure it has anything to do with the “possibility” of how the model is used to generate the next response.