Location of RAG context within system prompt

I think you’re on the right-ish track.

I’ve taken to putting a system message as the last message after the user query. I don’t put much into it, apart from schema instructions.

I see it like this: the very last handful of tokens dictate what the model “focuses” on, and what information the model should pull out of its context. And when you steer that focus, you can pull pretty much anything out of anywhere in the context as long as it doesn’t conflict with the model’s training data.
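To make the layout concrete, here's a minimal sketch of the ordering I'm describing, assuming an OpenAI-style chat-completions message list (the role names and fields are illustrative, not tied to any particular SDK; `build_messages` is just a hypothetical helper):

```python
def build_messages(bot_role: str, rag_context: str, user_query: str,
                   schema_instructions: str) -> list[dict]:
    """Assemble messages so behavior-initiating instructions come last."""
    return [
        # Soft info (persona, tone) at the top: rarely recalled verbatim.
        {"role": "system", "content": bot_role},
        # Retrieved context can sit anywhere; here it precedes the query.
        {"role": "system", "content": f"Reference material:\n{rag_context}"},
        {"role": "user", "content": user_query},
        # Schema/behavior instructions go last, right before generation.
        {"role": "system", "content": schema_instructions},
    ]

messages = build_messages(
    bot_role="You are a support assistant.",
    rag_context="(retrieved passages here)",
    user_query="How do I reset my password?",
    schema_instructions='Answer as JSON: {"answer": string}.',
)
```

The point of the ordering is just that the final system message is the last thing the model reads before it starts generating.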

You may have seen these haystack experiments:

https://arxiv.org/html/2404.08865v1

Anecdotally, I’d say it’s more important to ensure that the context is short, clean, and relevant. The positioning of reference information isn’t that important, unless you’re trying to break the model’s training (e.g. getting it to stop using markdown or to stop acting like a chatbot).

So yes: I think putting soft information at the top (bot role and the like) is a good idea, because that information is less likely to need to be purposefully recalled. But I wouldn’t waste that real estate on contextual information.

Language that initiates behavior should be the last thing the bot sees. This is also critical real estate.

But contextual information, especially if it’s unlikely to be overridden by training data, can probably be put anywhere.
