Location of RAG context within system prompt

I think you’re on the right-ish track.

I’ve taken to putting a system message as the last message after the user query. I don’t put much into it, apart from schema instructions.

I see it like this: the very last handful of tokens dictate what the model “focuses” on, and what information the model should pull out of its context. And when you steer that focus, you can pull pretty much anything out of anywhere in the context as long as it doesn’t conflict with the model’s training data.
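To make the layout concrete, here's a minimal sketch of the ordering I'm describing, assuming an OpenAI-style chat-completions message list (the role names and fields are illustrative, not tied to any particular SDK; `build_messages` is just a hypothetical helper):

```python
def build_messages(bot_role: str, rag_context: str, user_query: str,
                   schema_instructions: str) -> list[dict]:
    """Assemble messages so behavior-initiating instructions come last."""
    return [
        # Soft info (persona, tone) at the top: rarely recalled verbatim.
        {"role": "system", "content": bot_role},
        # Retrieved context can sit anywhere; here it precedes the query.
        {"role": "system", "content": f"Reference material:\n{rag_context}"},
        {"role": "user", "content": user_query},
        # Schema/behavior instructions go last, right before generation.
        {"role": "system", "content": schema_instructions},
    ]

messages = build_messages(
    bot_role="You are a support assistant.",
    rag_context="(retrieved passages here)",
    user_query="How do I reset my password?",
    schema_instructions='Answer as JSON: {"answer": string}.',
)
```

The point of the ordering is just that the final system message is the last thing the model reads before it starts generating.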

You may have seen these haystack experiments:

https://arxiv.org/html/2404.08865v1

Anecdotally, I’d say it’s more important to ensure that the context is short, clean, and relevant. The positioning of reference information isn’t that important, unless you’re trying to break the model’s training (e.g. getting it to stop using markdown or to stop acting like a chatbot).

So yes: I think putting soft information at the top (bot role and the like) is a good idea, because that information is less likely to need to be purposefully recalled. But I wouldn’t waste that real estate on contextual information.

Language that initiates behavior should be the last thing the bot sees. This is also critical real estate.

But contextual information, especially if it’s unlikely to be overridden by training data, can probably be put anywhere.
